Please note: there may be some adjustments to the teaching arrangements published in the course catalogue for 2020-21. Given current circumstances related to the Covid-19 pandemic it is anticipated that some usual arrangements for teaching on campus will be modified to ensure the safety and wellbeing of students and staff on campus; further adjustments may also be necessary, or beneficial, during the course of the academic year as national requirements relating to management of the pandemic are revised.

Data Science Foundations (ODL) STATS5095

  • Academic Session: 2021-22
  • School: School of Mathematics and Statistics
  • Credits: 10
  • Level: Level 5 (SCQF level 11)
  • Typically Offered: Semester 2
  • Available to Visiting Students: No
  • Available to Erasmus Students: No
  • Taught Wholly by Distance Learning: Yes

Short Description

This course introduces students to data analytics and data science, covers different approaches to learning from data, and provides an introduction to statistical model-based inference.


The course mostly consists of asynchronous teaching material.

Requirements of Entry

The course is only available to online-distance learning students on the PGCert/PGDip/MSc in Data Analytics for Government.

Excluded Courses

Inference 3

Statistics 3I: Inference

Statistical Inference (Level M)

Learning from Data - Data Science Foundations (ODL)

Assessment

100% Continuous Assessment

The continuous assessment will typically be made up of one class test, a report, and three homework exercises, including online quizzes. Full details are provided in the programme handbook.

Main Assessment In: April/May

Course Aims

The aims of this course are:

■ to introduce students to different types of data and different approaches to learning from data;

■ to introduce students to data visualisation;

■ to present the fundamental principles of likelihood-based inference, interval estimation and hypothesis testing;

■ to introduce Bayesian inference;

■ to show students how to implement these statistical methods using R.

Intended Learning Outcomes of Course

By the end of this course students will be able to:


■ explain different types of data and data structures and discuss advantages and challenges of using data of different types in a given context;

■ describe different ways of collecting data and discuss advantages and challenges of using data obtained from different sources in a given context;

■ describe and visualise structured and unstructured data of different types using suitable summaries and plots;

■ explain different approaches to learning from data and discuss their advantages and disadvantages in a given context;

■ define and contrast population and sample, parameter and estimate;

■ write down and justify criteria required of 'good' point estimators, and check whether or not a proposed estimator within a stated statistical model satisfies these criteria;

■ apply the principle of maximum likelihood to obtain point and interval estimates of parameters in statistical models, making appropriate use of numerical methods for optimisation;

■ formulate and carry out hypothesis tests in Normal models, as well as general likelihood-based models, correctly using the terms null hypothesis, alternative hypothesis, test statistic, rejection region, significance level, power, p-value;

■ describe the rules for updating prior distributions in the presence of data, and for calculating posterior predictive distributions;

■ implement these statistical methods using the R computer package;

■ frame statistical conclusions from interval estimates and hypothesis tests clearly.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.