Postgraduate taught 

Data Science MSc

Big Data: Systems, Programming, and Management (H) COMPSCI4064

  • Academic Session: 2023-24
  • School: School of Computing Science
  • Credits: 10
  • Level: Level 4 (SCQF level 10)
  • Typically Offered: Semester 2
  • Available to Visiting Students: Yes

Short Description

Big data now arises in a very large number of environments and application fields, including education, entertainment, health, public governance, and enterprise. The course gives students an understanding of the new challenges big data introduces and of the currently available solutions. Topics include (i) the challenges of modelling, accessing, and storing big data, (ii) the fundamentals of systems designed to store and access big data, and (iii) programming paradigms for efficient, scalable access to big data.

Timetable

3 hours contact time per week

Excluded Courses

Big Data (M)

Co-requisites

None 

Assessment

Examination 75%, Coursework 25%.

Main Assessment In: April/May

Are reassessment opportunities available for all summative assessments? No

Reassessments are normally available for all courses, except those which contribute to the Honours classification. For non-Honours courses, students are offered reassessment in all or any of the components of assessment if the satisfactory (threshold) grade for the overall course is not achieved at the first attempt. This is normally grade D3 for undergraduate students and grade C3 for postgraduate students. Exceptionally, it may not be possible to offer reassessment of some coursework items, in which case the mark achieved at the first attempt will be counted towards the final course grade. Any such exceptions for this course are described below.

This coursework is done in groups and therefore cannot be reassessed.

Course Aims

The course aims to endow students with:

An understanding of the new challenges posed by the advent of big data, as they relate to its modelling, storage, and access, with particular emphasis on the impact of the desiderata of scalability and efficiency in big data infrastructures.

Exposure to a number of different cloud data stores and their design and implementation details, showing how they can achieve efficiency and scalability, while also addressing design trade-offs and their impacts.

Familiarity with modern programming paradigms (e.g., MapReduce, RDDs), so as to enable them to write programs that can execute on massively parallel infrastructures in the cloud.

The ability to understand the internals of (NoSQL) cloud data storage systems and the ability to enrich these systems with additional functionality.

Intended Learning Outcomes of Course

By the end of this course students will be able to:

1. Design, employ and evaluate programs to access big data repositories in a massively parallel manner;

2. Describe and contrast the internals of the design and implementation of current cloud data storage and processing systems;

3. Identify and discuss issues related to the scalability and efficiency challenges when processing complex queries/algorithms against big data systems, and propose ways of addressing said challenges;

4. Demonstrate that they have mastered the required background knowledge to pursue graduate studies in the fields of cloud systems and big data.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.