Predictive Modelling (online) STATS5076

  • Academic Session: 2018-19
  • School: School of Mathematics and Statistics
  • Credits: 10
  • Level: Level 5 (SCQF level 11)
  • Typically Offered: Semester 2
  • Available to Visiting Students: No
  • Available to Erasmus Students: No
  • Taught Wholly by Distance Learning: Yes

Short Description

This course introduces students to predictive models for regression and classification.

Timetable

The course consists of short online lessons (each at most 30 minutes long), totalling around 15-20 hours. Embedded in these lessons are formative quizzes and assessment tasks (not included in the above duration). These are flexible and can be taken (and re-taken) at any time. There are also 6-10 hours of tutorials and computer-based labs.

Requirements of Entry

The course is only available to students on the online MSc in Data Analytics.

Excluded Courses

Linear Models 3

Statistics 3L: Linear Models

Regression Models (Level M)

Co-requisites

-/-

Assessment

30% Continuous Assessment

70% Final exam (can be taken at test centres)

Main Assessment In: April/May

Course Aims

The aims of this course are:

■ to introduce students to predictive modelling using multiple linear regression as a showcase;

■ to present some of the distributional theory underpinning normal linear models and the associated methods for testing and interval estimation;

■ to explain how the design matrix of a linear model can be constructed to accommodate categorical covariates or, through basis expansions, non-linear effects;

■ to introduce students to logistic regression as an example of a discriminative method for classification;

■ to introduce students to linear discriminant analysis as an example of a generative method for classification;

■ to describe and contrast several common methods for model assessment as well as variable and model selection;

■ to show students how to implement these statistical methods using the R computer package.

Intended Learning Outcomes of Course

By the end of this course students will be able to:

■ formulate normal linear models in vector-matrix notation and apply general results to derive ordinary least squares estimators in particular contexts;

■ construct a design matrix incorporating categorical covariates or covariates with a nonlinear effect;

■ derive, evaluate and interpret point and interval estimates of model parameters;

■ conduct and interpret hypothesis tests in the context of the normal linear model;

■ derive, evaluate and interpret confidence and prediction intervals for the response at particular values of the explanatory variables;

■ assess the assumptions of a normal linear model using residual plots and diagnostics;

■ contrast the discriminative approach to classification with the generative one;

■ explain the models underlying logistic regression and linear discriminant analysis, and make use of both methods;

■ make use of and critique different methods for assessing the performance of a predictive model, such as R² or AIC/BIC, and use these for model or variable selection;

■ explain and interpret ROC curves and performance measures such as AUC;

■ implement these statistical methods using the R computer package;

■ frame statistical conclusions clearly.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.