Can we generalize from one social media platform to another?


One of the things my dissertation, Emotion in Social Media, highlights is the importance of the comparative perspective in social media research. I looked at three big questions in the field, proposing the same hypotheses for both Facebook and Twitter — but ended up drawing some conclusions that were pretty divergent and unique to each service.

My dissertation examines (1) the emotions we express in social media (i.e. the emotional profile of the status update), (2) what we can infer about someone’s emotional life in general based on what they say in their status updates (and possible limitations on those inferences), and (3) the emotional experience of browsing social media (Do we get riled up? Envy our friends’ lives?). I searched far and wide in my literature review — and took note of a handful of related studies that analyzed more than one social media service — but saw little to support the idea that Facebook and Twitter were fundamentally different in terms of emotional expression, emotional inference, or emotional experience. Indeed, many studies drew conclusions about ‘social media’ based on an analysis of just one service.

It’s true that if you read my dissertation, you might walk away with the impression that Facebook and Twitter share some important things in common. Status updates on both services are characterized by elevated arousal, with higher levels of more wound-up emotions like amusement, anger, surprise and awe. You’ll find that status updates on both services appear to provide something of a window into our emotional lives, though the association is not especially tight, is moderated by factors like how emotionally stable we are, and disappears entirely when the popular sentiment analysis program Linguistic Inquiry and Word Count (LIWC) is used to analyze the emotional contents of status updates. You’ll also find that the most robust effect of browsing both services appears to be that people tend to wind down (i.e. feel more relaxed, sleepy, bored, etc.), not wind up, as the stereotype goes.

That’s a lot in common. But in synthesizing the literature review for my dissertation, I noted a broad chain of reasoning that seemed to link the literature together, even if it was never fully articulated by any one researcher. This “overarching hypothesis” about emotion in social media goes something like this: Status updates are overly positive, reflecting a concern for self-presentation, which in turn limits how valid status updates are for inferring our day-to-day emotional lives, and which ultimately causes us to feel envy while we browse social media.

In my results, this entire chain of reasoning receives at least some support, but for Facebook only. Facebook posts are more positive than day-to-day emotional life, self-presentation concerns do seem to moderate the association between Facebook posts and emotional life, and browsing Facebook is characterized by some elevation in envy. There are limits to each link in the chain — self-presentation concerns do not eliminate the association between Facebook posts and emotional life, for example — but every link is nonetheless supported.

Interestingly, however, the overarching hypothesis receives little support for Twitter. Tweets are more negative than day-to-day emotional life, self-presentation concerns largely do not moderate the association between tweets and emotional experience, and envy might actually be alleviated while we browse Twitter. A lesson of this dissertation, therefore, may be about the importance of comparative perspectives in social media research. Key social psychological dynamics that characterize one service, like Facebook, may not generalize to another, like Twitter, even when they share the same core design, i.e. feeds of status updates.

Next time you hear someone talk about “social media” as though all services have uniform, monolithic implications for behavior, you might nudge them to consider that different services can create different, unique contexts — with potentially divergent implications for behavior.

Galen Panger received his Ph.D. from Berkeley in 2017, focusing on social media behavior, happiness and well-being, and behavioral economics. He is currently a user experience researcher at Google. Panger will be honored at iConference 2018 as the 2018 winner of the iSchools Doctoral Dissertation Award.


Exploring Fine-Grained Emotion Detection in Microblog Text


Endowing computers with the ability to recognize emotions in text has important applications in the field of information science. Over a decade of research in sentiment analysis on Twitter, a popular microblogging site, has allowed large numbers of tweets (i.e., Twitter posts) to be harnessed to predict stock market trends [1], measure the population’s level of happiness [2], detect clinical depression [3] and augment our ability to understand emotions expressed in text. However, existing automatic emotion detectors analyze text at a coarse-grained level (positive or negative) or detect only a small set of emotions (happiness, sadness, fear, anger, disgust and surprise). I argue that there is a richer range of emotion expressed in microblog text that computers can be trained to detect. By training automatic emotion detectors to recognize a greater number of emotions, it is possible to create more emotion-sensitive systems that can enhance our interaction with computers as well as our understanding of human emotions in online interactions.
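To make the contrast concrete, here is a minimal sketch of the kind of coarse-grained analysis most existing tools provide, using NLTK’s off-the-shelf VADER sentiment analyzer (an illustrative stand-in, not a tool from the dissertation): each tweet is reduced to positive, negative and neutral scores, with no way to distinguish, say, gratitude from amusement.

```python
# Coarse-grained sentiment scoring with NLTK's VADER analyzer (illustrative
# only; not the dissertation's method). Emotion collapses to pos/neg/neu.
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
tweets = [
    "Thank you all for the birthday wishes, I'm so touched!",
    "Stuck in traffic for two hours. Unbelievable.",
]
for tweet in tweets:
    scores = analyzer.polarity_scores(tweet)
    # 'compound' ranges from -1 (most negative) to +1 (most positive)
    print(f"{scores['compound']:+.2f}  {tweet}")
```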

I embarked on a journey through my dissertation [4] to uncover a set of emotion categories that are representative of the emotions expressed and described in tweets. An important starting point for building supervised machine learning models that can recognize the emotions represented in tweets is the identification of a suitable set of emotion categories. Realistically, these should be categories that humans can reliably detect from tweets. What emotions can humans detect in microblog text? How do current machine learning techniques perform on more fine-grained categories of emotion?

To address these questions, I first applied content analysis to uncover a set of fine-grained emotions humans can detect in microblog text. A corpus (EmoTweet-28) of 15,553 tweets [5] was annotated for emotion in two phases. In Phase 1, a grounded theory approach was used with a group of trained annotators, who were not supplied with a predefined set of emotion category labels; instead, they were instructed to suggest the emotion tags best suited to describe the emotion expressed in each tweet. A total of 246 emotion tags were proposed, which were then systematically reduced to 28 emotion categories. In Phase 2, the 28 emotion categories that emerged in Phase 1 were further tested through large-scale content analysis using Amazon Mechanical Turk to determine how representative they were of the range of emotions expressed on Twitter. Phase 3 then focused on running a series of machine learning experiments to assess machine performance in detecting the 28 emotion categories.
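As a rough illustration of what a Phase 3-style experiment involves (not the dissertation’s actual feature sets or models), the sketch below trains a one-vs-rest linear SVM over TF-IDF features, treating the task as multi-label on the assumption that a tweet can carry more than one emotion category; the file name and column layout are hypothetical placeholders rather than the real EmoTweet-28 format.

```python
# Illustrative sketch of a fine-grained emotion classification experiment.
# Assumes a CSV with columns "text" and "labels" (comma-separated emotion
# categories per tweet) -- the file name and columns are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

df = pd.read_csv("emotweet_sample.csv")            # hypothetical file
X = df["text"]
mlb = MultiLabelBinarizer()                         # one column per emotion category
y = mlb.fit_transform(df["labels"].str.split(","))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # unigram + bigram features
    OneVsRestClassifier(LinearSVC()),               # one binary SVM per category
)
clf.fit(X_train, y_train)

# Per-category precision, recall and F1 on held-out tweets
print(classification_report(y_test, clf.predict(X_test),
                            target_names=list(mlb.classes_)))
```

Per-category F1 from a report like this is the machine-performance measure plotted in Figure 1.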

Figure 1: Human performance (Kappa) versus machine performance (F1)

Human performance, measured by class distinctiveness (inter-annotator reliability in Kappa per class), and machine performance, measured by F1 (the harmonic mean of precision and recall for each class), are shown in Figure 1. Human performance and machine performance vary across the 28 emotion categories. With the exception of “exhaustion”, the machine learning models achieved higher performance than human reliability for all other emotion categories. Machine performance and human performance differ least for the emotion categories toward the left of Figure 1 (e.g., “anger” and “indifference”), while the categories toward the right of Figure 1 (e.g., “sympathy”, “pride” and “inspiration”) show markedly better machine performance than human performance. One emotion category, “gratitude”, achieved consistently high machine and human performance.
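For readers unfamiliar with the two measures, the toy snippet below shows how per-class kappa (here, Cohen’s kappa between two annotators, which may differ from the dissertation’s exact reliability computation) and per-class F1 can be computed with scikit-learn; the labels are invented for illustration and are not drawn from EmoTweet-28.

```python
# Toy illustration of the two measures compared in Figure 1.
from sklearn.metrics import cohen_kappa_score, f1_score

# Binary judgments for one emotion category (e.g., "gratitude") on ten tweets.
annotator_a = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]   # invented labels
annotator_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]   # invented labels
print("Human reliability (Cohen's kappa):",
      round(cohen_kappa_score(annotator_a, annotator_b), 2))

# Machine predictions for the same category, scored against a gold standard.
gold      = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]     # invented labels
predicted = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]     # invented labels
print("Machine performance (F1):",
      round(f1_score(gold, predicted), 2))
```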

I have demonstrated that it is feasible to extend machine learning classification to fine-grained emotion detection (i.e., as many as 28 emotion categories). The real strength of the machine learning classifiers lies in their capability to reliably detect some of the emotion categories that are difficult for humans to recognize (e.g., “sympathy”, “hate” and “hope”). The way I see it, my dissertation is a first step toward the development of more emotion-sensitive systems. I will continue my quest to test the applicability of the emotion categories in other types of text and to improve the performance of the machine learning models for fine-grained emotion detection in other domains and applications.


Dr. Jasy Liew Suet Yan

Jasy Liew Suet Yan is a Senior Lecturer at the School of Computer Sciences, University of Science Malaysia. She graduated with a Ph.D. from the School of Information Studies, Syracuse University in 2016. Her research focuses on using natural language processing (NLP) techniques to detect expressions of emotion in text; her broader research interests include sentiment analysis, computational linguistics and affective computing. She was runner-up for the 2017 iSchools Doctoral Dissertation Award.