Postgraduate taught 

Computational Geoscience MSc

Machine Learning Applications for Earth Systems Problems EARTH5014

  • Academic Session: 2023-24
  • School: School of Geographical and Earth Sciences
  • Credits: 10
  • Level: Level 5 (SCQF level 11)
  • Typically Offered: Semester 2
  • Available to Visiting Students: Yes

Short Description

State-of-the-art research methodologies and instrumentation confront geoscience researchers with a wide range of complex, high-dimensional and/or large datasets. Modern data science tools, such as machine learning, extend the insights available from statistics in new and exciting ways. Students are exposed to core concepts in data science and machine learning, including supervised and unsupervised machine learning, data wrangling and data exploration. These are built up from initial topics in multivariate statistics. We use the open-source programming language Python and Jupyter notebooks as a flexible platform for learning these concepts while developing basic coding skills.


5 weeks: 2 hours of lectures and 2 hours of practicals per week, plus a further 1 hour of supervised workshop per week in which groups develop and work on their projects with instructor feedback.

Excluded Courses



GEOG5123 Introduction to Environmental Statistics

GEOG51XX - Spatial data analytics

GEOG5008 Geospatial Fundamentals or EARTH50XX - Numerical Foundations of Geodynamics


Assessment consists of a "bring your own data" project (report, 1000-1200 words - 70%), which will be designed by the student to reflect their ability to use the tools presented to solve a relevant problem.

The remaining assessment is a short video presentation (30%, equivalent to 500 words).


If students do not have their own data, open data sets (such as the SMID, EarthChem or other open data sources) can be made available.

Course Aims

This course will introduce students to the fundamentals of data science via problems in the geosciences, using the Python programming language.

Intended Learning Outcomes of Course

By the end of this course students will be able to:

■ Demonstrate how to use the data structures, functions, and visualisation tools in Python (and several dependent libraries) to explore and analyse multivariate data.

■ Produce summary statistics for exploratory data analysis and multivariate statistical tests.

■ Employ supervised machine learning algorithms to perform classification and prediction tasks on data sets.

■ Apply unsupervised machine learning to perform dimensionality reduction, data clustering, and categorisation on unlabelled high-dimensional data.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.