New tool gives AI a ‘sense of place’

A new development in data science has given one popular machine learning tool an improved sense of place, which enables it to make more accurate predictions based on data linked to locations.

Researchers from the University of Glasgow and Florida State University have found a way to overcome a key limitation of TabPFN, one of a class of AI tools known as foundation models.

Foundation models are large AI systems which are trained once on vast amounts of data and which can then be applied to a wide range of tasks. While models like ChatGPT are designed to handle text and data, TabPFN is trained to analyse and predict the outcomes of tabulated data – information stored in spreadsheets or databases as rows and columns.

In a new paper published in the International Journal of Geographical Information Science, the team describe how they tested TabPFN’s ability to process and analyse geospatial data. Geospatial data spreadsheets contain socially and environmentally important information, where each row might represent a house, a neighbourhood, a monitoring station, or a local area.

A key distinction between geospatial data and other tabulated information is that each data point is related to the others as representations of physical spaces in the real world. Analysis of the data can enable research and decision-making on important issues including air pollution, housing prices, public health and public services, elections, demographic change, and climate change.

The researchers’ analysis of TabPFN showed that while it performs strongly on many tasks related to geospatial data, it becomes less reliable on large datasets, or when the relationships between nearby places are especially localised.

To overcome those barriers, they developed a new framework for TabPFN to use based around a methodology they named Geospatial Sparse Attention (GSA). Their modified, open-source tool, called TabPFN-GSA, gives the model a practical ‘sense of place’ by directing more of its attention towards geographically relevant observations while still drawing on selected information from farther away.
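As a rough illustration of that idea (not the authors’ implementation), the Python sketch below assembles a context for one query location from its nearest rows plus a small random sample of more distant rows. The function name `select_context`, the parameters `k_near` and `k_far`, and the use of straight-line distance and random sampling are all assumptions made for this example.

```python
import math
import random


def select_context(query, points, k_near=3, k_far=2, seed=0):
    """Pick the k_near closest rows plus k_far randomly sampled distant rows.

    Hypothetical helper for illustration only; it is not part of TabPFN
    or TabPFN-GSA.
    """
    # Rank every candidate row by straight-line distance to the query location.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    near = order[:k_near]
    rest = order[k_near:]
    # Keep a little information from farther away by sampling the remainder.
    rng = random.Random(seed)
    far = sorted(rng.sample(rest, min(k_far, len(rest))))
    return near, far


# Toy coordinates: three rows close to the origin, four far from it.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (9, 9), (10, 0), (0, 10)]
near, far = select_context((0.2, 0.1), points)
print("nearby rows:", near)
print("sampled distant rows:", far)
```

Restricting the context in this way means the model sees far fewer rows per prediction, which is one plausible reason a sparse approach could ease memory pressure on large tables.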

Dr Mingshu Wang, of the University of Glasgow’s School of Geographical & Earth Sciences, is one of the paper’s authors. He said: “The first law of geography is that ‘everything is related to everything else, but near things are more related than distant things’. In geospatial data, that means that we can scrutinise how closely data points are related to each other in space in order to find connections and draw conclusions.

“General-purpose tabular models can be very powerful, but they are trained to treat rows as independent observations – they don’t automatically understand the principles of geospatial data. That’s why we set out to expand TabPFN’s ability to make the connections between tabulated geospatial data instead of trying to build and train a new model from scratch.”

The team worked to determine how TabPFN reaches its predictions, and developed GSA to intervene at the point of inference, where it makes its predictions based on its understanding of the data. By analysing the model’s internal attention patterns, they found that its focus became increasingly concentrated on a small number of observations as it worked, and on geographically closer ones.

University of Glasgow PhD student Rui Deng is the paper’s first author. He said: “In geospatial data, each row of the table has its own locational information like map coordinates. In our Geospatial Sparse Attention model, we divide the whole region covered by the table into a grid, so we know the relative distance between all the data points. Then, we guide the model to attend more to nearer points rather than distant ones, focusing it on the local context. We didn’t modify TabPFN itself; instead, we provided it with a better context to improve the model’s performance.”
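The grid idea Deng describes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the team’s code: the helper names `grid_cell`, `cell_distance` and `locality_weights`, the square grid, the ring-count (Chebyshev) distance between cells, and the exponential decay are all choices made for this example.

```python
import math


def grid_cell(x, y, bounds, n_cells):
    """Map a coordinate to the (col, row) cell of an n_cells x n_cells grid."""
    min_x, min_y, max_x, max_y = bounds
    # Clamp so that points on the upper boundary fall in the last cell.
    col = min(int((x - min_x) / (max_x - min_x) * n_cells), n_cells - 1)
    row = min(int((y - min_y) / (max_y - min_y) * n_cells), n_cells - 1)
    return col, row


def cell_distance(a, b):
    """Number of grid rings separating two cells (Chebyshev distance)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def locality_weights(query_cell, cells, decay=1.0):
    """Normalised weights that shrink as grid distance from the query grows."""
    raw = [math.exp(-decay * cell_distance(query_cell, c)) for c in cells]
    total = sum(raw)
    return [w / total for w in raw]


# Toy region covering 0..10 in both directions, split into a 5 x 5 grid.
bounds = (0.0, 0.0, 10.0, 10.0)
coords = [(1.0, 1.0), (3.0, 1.0), (9.9, 9.9)]
cells = [grid_cell(x, y, bounds, 5) for x, y in coords]
weights = locality_weights(grid_cell(0.5, 0.5, bounds, 5), cells)
print(cells)
print([round(w, 3) for w in weights])
```

Working with grid cells rather than raw coordinates means distances only need to be compared between a modest number of cells, not between every pair of rows.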

The researchers evaluated TabPFN-GSA’s performance on 30 synthetic datasets representing a range of geographical processes. Then, they set it to work on four real-world datasets spanning environmental and socioeconomic topics: air-pollution readings, county-level results from the 2020 US presidential election, housing prices, and neighbourhood-level poverty across the continental United States. The datasets span a wide range of scales, from just over a thousand records to roughly 70,000.

The team chose these datasets, which have been used as benchmarks in other geospatial data research projects, to help determine the effectiveness of their Geospatial Sparse Attention model.

They found that TabPFN-GSA generally produced more accurate and robust predictions than the standard model, and reduced the memory failures that prevented the original from running on the largest datasets. Notably, it was able to complete predictions on the 70,000-row poverty dataset, which the unmodified model could not handle.

The researchers expect that TabPFN-GSA, which is freely available as open-source software, will be useful to data science researchers in a wide range of contexts, from academia to local councils, national agencies and data-analytics companies. Since TabPFN-GSA can be used offline on local computers, it could help ensure that sensitive data can be processed without the security concerns associated with online AI models.

Dr Ziqi Li of Florida State University is a co-author of the paper. He said: “Foundation models are designed to generalise across many datasets, but geographical data contain distinctive structures that general-purpose models may overlook. This study shows that established geographical principles can be incorporated into a pretrained foundation model in a lightweight and practical way, improving both its spatial awareness and its ability to handle larger datasets.”

The development of TabPFN-GSA follows the same team’s work on a tool called GeoAggregator, released last year, which used a different method to enable distance-based analysis of geospatial data.

The two projects represent complementary methods of making AI more spatially aware, with GeoAggregator being built from scratch and TabPFN-GSA modifying an existing model. The team are exploring whether it may be possible to combine them in future, with a united framework capable of deciding which approach best suits a particular task.

The team’s paper, titled ‘Do Foundation Models Work for Geospatial Tabular Data? An Investigation of TabPFN and a Proposed Enhancement based on Geospatial Sparse Attention’, is published in the International Journal of Geographical Information Science. The research was supported by funding from NVIDIA’s Academic Grant Programme and the Google Cloud Research Credits programme.

The data and code that support the findings of this study are available in the figshare repository. TabPFN-GSA is available in a public https://github.com/ruid7181/Python repository.

First published: 26 June 2026
