Sentiment Analysis of Online Media
Brendan Murphy (University College Dublin)
Friday 3rd February, 2012 15:00-16:00 Maths 203
We consider the analysis of sentiment in an Irish online media data set comprising online news articles and non-expert user annotations of these articles as having a negative, positive or irrelevant impact on the Irish economy. These users may exhibit annotator bias, that is, a tendency to view articles in an overly positive or negative way.
A joint model for annotation bias and document classification is presented in the context of an online media sentiment analysis problem. The joint model combines a statistical model for user annotation bias with a naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles.
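To make the idea of a joint EM fit concrete, here is a minimal sketch in Python. It is not the speakers' implementation: the abstract does not specify the form of the bias model, so the sketch assumes one confusion matrix per annotator (in the style of Dawid and Skene) and a multinomial naive Bayes model for term counts. The function name `joint_em`, its arguments and the smoothing constant `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np

def joint_em(X, annotations, n_users, n_classes=3, n_iter=50, alpha=1.0):
    """Illustrative joint EM (assumed model, not the authors' code).

    X           : (n_docs, n_terms) array of term counts
    annotations : list of (doc, user, label) integer triples
    Returns responsibilities r, class prior pi, term probabilities phi,
    and per-user confusion matrices theta[u, true_class, given_label].
    """
    X = np.asarray(X, dtype=float)
    n_docs = X.shape[0]
    docs, users, labels = (np.array(c) for c in zip(*annotations))

    # Initialise responsibilities from lightly smoothed vote counts.
    r = np.full((n_docs, n_classes), 0.1)
    np.add.at(r, (docs, labels), 1.0)
    r /= r.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: update class prior, naive Bayes term probabilities
        # and annotator confusion matrices from the responsibilities.
        pi = r.sum(axis=0) + alpha
        pi /= pi.sum()
        phi = r.T @ X + alpha
        phi /= phi.sum(axis=1, keepdims=True)
        theta = np.full((n_users, n_classes, n_classes), alpha)
        for k in range(n_classes):
            # theta[u, k, l] accumulates r[i, k] over annotations (i, u, l).
            np.add.at(theta[:, k, :], (users, labels), r[docs, k])
        theta /= theta.sum(axis=2, keepdims=True)

        # E-step: combine evidence from the terms and from the annotations.
        log_r = np.log(pi) + X @ np.log(phi).T
        np.add.at(log_r, docs, np.log(theta[users, :, labels]))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

    return r, pi, phi, theta
```

In this sketch the document terms and the annotations both feed the same E-step, which is the sense in which the fit is joint. A two-stage alternative would first estimate `theta` from the annotations alone and only then fit the classifier.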
The joint modeling of both the user biases and the classifier is demonstrated to be superior to a two-stage approach of estimating the annotator bias followed by the estimation of the classifier parameters.
Recent extensions of this work to cluster annotators that exhibit biases will also be discussed.
This work has been completed in conjunction with Michael Salter-Townshend.