Exploring Fine-Grained Emotion Detection in Microblog Text

Endowing computers with the ability to recognize emotions in text has important applications in information science. Over a decade of research in sentiment analysis on Twitter, a popular microblogging site, has allowed large amounts of tweets (i.e., Twitter posts) to be harnessed to predict stock market trends [1], measure the population’s level of happiness [2], detect clinical depression [3], and augment our ability to understand emotions expressed in text. However, existing automatic emotion detectors either analyze text at a coarse-grained level (positive or negative) or detect only a small set of emotions (happiness, sadness, fear, anger, disgust and surprise). I argue that there is a richer range of emotion expressed in microblog text that computers can be trained to detect. By training automatic emotion detectors to recognize a greater number of emotions, we can create more emotion-sensitive systems that enhance our interaction with computers as well as our understanding of human emotions in online interactions.

In my dissertation [4], I set out to uncover a set of emotion categories representative of the emotions expressed and described in tweets. An important starting point for building supervised machine learning models that can recognize emotions in tweets is the identification of a suitable set of emotion categories. Realistically, these should be emotion categories that humans can reliably detect in tweets. What emotions can humans detect in microblog text? How do current machine learning techniques perform on more fine-grained categories of emotion?

To address these questions, I first applied content analysis to uncover a set of fine-grained emotions humans can detect in microblog text. A corpus (EmoTweet-28) of 15,553 tweets [5] was annotated for emotion in two phases. In Phase 1, grounded theory guided a group of trained annotators who were not supplied with a predefined set of emotion category labels; instead, they were instructed to suggest the emotion tags best suited to describe the emotion expressed in each tweet. A total of 246 emotion tags were proposed, which were then systematically reduced to 28 emotion categories. In Phase 2, the 28 emotion categories that emerged from Phase 1 were further tested through large-scale content analysis using Amazon Mechanical Turk to determine how well they represented the range of emotions expressed on Twitter. Phase 3 focused on running a series of machine learning experiments to assess machine performance in detecting the 28 emotion categories.
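To make the classification setup concrete, here is a minimal sketch (not the dissertation’s actual models or data) of fine-grained emotion detection framed as one-vs-rest classification: one binary bag-of-words Naive Bayes classifier per emotion category. The example tweets and the two category names are invented for illustration:

```python
# Illustrative one-vs-rest emotion classification sketch.
# All tweets and labels below are toy data, not from EmoTweet-28.
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_binary_nb(docs, labels):
    """Train a binary Naive Bayes model with add-one smoothing."""
    counts = {True: Counter(), False: Counter()}
    priors = Counter(labels)
    for doc, lab in zip(docs, labels):
        counts[lab].update(tokenize(doc))
    vocab = set(counts[True]) | set(counts[False])
    return counts, priors, vocab

def predict_binary_nb(model, text):
    """Return True if the positive class scores higher than the negative."""
    counts, priors, vocab = model
    total = sum(priors.values())
    scores = {}
    for lab in (True, False):
        score = math.log(priors[lab] / total)
        denom = sum(counts[lab].values()) + len(vocab)
        for w in tokenize(text):
            score += math.log((counts[lab][w] + 1) / denom)
        scores[lab] = score
    return scores[True] > scores[False]

# One-vs-rest: a separate binary model per emotion category.
tweets = ["thank you so much for the kind words",
          "so grateful for all your support",
          "i am furious about this delay",
          "this makes me so angry"]
gold = {"gratitude": [True, True, False, False],
        "anger":     [False, False, True, True]}
models = {cat: train_binary_nb(tweets, labels) for cat, labels in gold.items()}

print(predict_binary_nb(models["gratitude"], "thank you for your support"))  # True
print(predict_binary_nb(models["anger"], "i am so angry about this"))        # True
```

In a full-scale setting, the same one-vs-rest framing simply repeats over all 28 categories, with stronger features and classifiers substituted for the toy Naive Bayes above.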

Figure 1: Human performance (Kappa) versus machine performance (F1)

Human performance, measured by class distinctiveness (inter-annotator reliability in Kappa per class), and machine performance, measured by F1 (the harmonic mean of precision and recall for each class), are shown in Figure 1. Both vary across the 28 emotion categories. With the exception of “exhaustion”, the machine learning models achieved higher performance than human reliability for all other emotion categories. Machine and human performance differ less for the emotion categories on the far left of Figure 1 (e.g., “anger” and “indifference”), while the categories on the far right (e.g., “sympathy”, “pride” and “inspiration”) show markedly better machine performance than human performance. One emotion category, “gratitude”, achieved consistently high machine and human performance.
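As a rough illustration of the two measures compared in Figure 1 (not the dissertation’s evaluation code), Cohen’s kappa and F1 can be computed in a few lines of Python; the binary annotation sequences below are toy data:

```python
# Cohen's kappa: agreement between two annotators, corrected for chance.
def cohens_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    # expected chance agreement, from each annotator's marginals
    p_e = (sum(a) / n) * (sum(b) / n) + ((n - sum(a)) / n) * ((n - sum(b)) / n)
    return (p_o - p_e) / (1 - p_e)

# F1: harmonic mean of precision and recall against gold labels.
def f1_score(gold, pred):
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: two annotators labeling eight tweets for one category.
a1 = [1, 1, 0, 0, 1, 0, 1, 0]
a2 = [1, 0, 0, 0, 1, 0, 1, 1]
print(cohens_kappa(a1, a2))  # 0.5
print(f1_score(a1, a2))      # 0.75
```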

I have demonstrated that it is feasible to extend machine learning classification to fine-grained emotion detection (i.e., as many as 28 emotion categories). The real strength of the machine learning classifiers lies in their capability to reliably detect some of the emotion categories that are difficult for humans to recognize (e.g., “sympathy”, “hate” and “hope”). I see my dissertation as a first step toward the development of more emotion-sensitive systems. I will continue to test the applicability of the emotion categories in other types of text and to improve the performance of machine learning models for fine-grained emotion detection in other domains and applications.

Dr. Jasy Liew Suet Yan

Jasy Liew Suet Yan is a Senior Lecturer at the School of Computer Sciences, University of Science Malaysia. She graduated with a Ph.D. from the School of Information Studies, Syracuse University in 2016. Her research focuses on using natural language processing (NLP) techniques to detect expressions of emotion in text; her broader research interests include sentiment analysis, computational linguistics and affective computing. She was runner-up for the 2017 iSchools Doctoral Dissertation Award.