Facebook for Sentiment Analysis: Baseline Models to Predict Facebook Reactions of Sinhala Posts

Vihanga Jayawickrama
Gihan Weeraprameshwara
Nisansa de Silva
Yudhanjaya Wijeratne

Abstract

Research on natural language processing in most regional languages is hindered by resource poverty. One possible solution is to use social media data in research. For example, Facebook allows its users to record their reactions to text via a typology of emotions. This network, taken at scale, is therefore a prime source of annotated sentiment data. This paper uses millions of such reactions, derived from a decade's worth of Facebook post data centred around a Sri Lankan context, to model an eye-of-the-beholder approach to sentiment detection for online Sinhala textual content. Three sentiment analysis models are built: one that takes into account a limited subset of reactions, one that takes into account all reactions, and one that derives a positive/negative star rating value. The efficacy of these models in capturing the reactions of the observers is then computed and discussed. The analysis reveals that, for Sinhala content, the Star Rating Model is significantly more accurate (0.82) than the other approaches. The inclusion of the like reaction is found to hinder the accurate prediction of other reactions. Furthermore, this study provides evidence for the applicability of social media data in eradicating the resource poverty surrounding languages such as Sinhala.
