A word in your ear

What’s your favourite word? Peerie? Scunner? Glaikit? Perhaps you prefer braw or bonny?

As far as Professor John Corbett is concerned, words speak louder than actions. Working with a team from the School of English and Scottish Language and Literature, the Professor of Applied Language Studies has recently completed the second stage of the SCOTS project (Scottish Corpus of Texts & Speech), creating a digital collection of writing and speech from contemporary Scotland that contains more than 4 million words.

The online corpus - the first large-scale project of its kind for Scotland - is searchable by word, author, gender and region and can offer up written and spoken examples of a host of ‘Scottish’ and not-so-‘Scottish’ words at the touch of a button. According to Professor Corbett, it seeks to revolutionise our understanding of language use in Scotland.

‘We are particularly proud of the speech element of it,’ he says. ‘We’ve got everything from spontaneous parent-child discourse and conversations between people of different ages and backgrounds to lectures and interviews in Scots and English with people like Ian Rankin.’

‘When I was rewriting a textbook that is used by the undergraduates at the University I was constantly surprised that things I thought were fairly obvious were actually not true. Until you can actually sit down with a lot of recordings of people talking like this and do a search, it’s very difficult to get hard statistical data.’
Professor John Corbett

Putting in a good word

SCOTS began as an initiative shared with the University of Edinburgh, with finance from the Engineering and Physical Sciences Research Council. In its latter stages it has been a Glasgow University project funded by the Arts and Humanities Research Council. The website recorded around a million hits in August 2007.

Mr David Beavan, Computing Manager for the corpus, explains: ‘Although our target audience is linguists, the project has been immensely popular with the public because it gives access to all varieties of language in one place. I think that perhaps we have broken down some walls in terms of people seeing this kind of project as purely the preserve of academics. People look up their favourite words, and with four million online, there’s a good chance they can find them.’

A sibling project, the Corpus of Modern Scottish Writing, will extend coverage back to 1700 and look at writing up to 1945, the point at which SCOTS begins. This will allow users to trace changes in language between 1700 and the present day.