Data Mining and Machine Learning II: Big Data and Unstructured Data (ODL) STATS5081

  • Academic Session: 2019-20
  • School: School of Mathematics and Statistics
  • Credits: 10
  • Level: Level 5 (SCQF level 11)
  • Typically Offered: Summer
  • Available to Visiting Students: No
  • Available to Erasmus Students: No
  • Taught Wholly by Distance Learning: Yes

Short Description

This course introduces data mining and machine learning methods used in big data scenarios, together with methods for analysing networks and unstructured data.

Timetable

The course consists mostly of asynchronous teaching material.

Requirements of Entry

The course is available only to online distance learning students on the PGCert/PGDip/MSc programmes in Data Analytics and Data Analytics for Government.

Excluded Courses

Big Data Analytics

Big Data Analytics (Level M)

Co-requisites

-/-

Assessment

100% Continuous Assessment

This will typically be made up of a project (40%), two oral assessments (40%) and one homework exercise / online quiz (20%). Full details are provided in the programme handbook.

Course Aims

The aims of this course are:

■ to introduce students to Gaussian processes;

■ to introduce students to big data methods commonly applied in machine learning, notably regularised regression;

■ to illustrate the role of sparsity when analysing high-dimensional data;

■ to introduce students to graphical models and how they can be used for structural inference in high-dimensional data;

■ to introduce students to informal and formal methods for social network analysis and quantitative text analysis.

Intended Learning Outcomes of Course

By the end of this course students will be able to:

■ fit Gaussian process models;

■ describe the challenges of the analysis of high-dimensional data and discuss, in a particular context, strategies for tackling big data problems;

■ formulate and fit regularised linear models, such as ridge regression, the LASSO and partial least squares;

■ infer statements about (conditional) independence from graphical models and factorisations of the joint distribution;

■ describe methods for structural inference in graphical models and apply them in a given context;

■ make appropriate use of informal and formal methods for social network analysis and quantitative text analysis.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.