How can social media messages be used as a source for measuring sentiment in society? This has been studied by Statistics Netherlands (CBS) at the Center for Big Data Statistics, resulting in a first version of the social tension indicator. The indicator very specifically measures tensions or unrest within Dutch society, unlike more general measurements of positive or negative sentiments through social media. To define the indicator, qualitative research was conducted to compile a validated glossary containing words that are specifically related to (un)safety.
The indicator identifies those parts of daily messages on Twitter which discuss unrest or unsafety. These say something about tensions, perceptions of terrorist threats and feelings of unease in society. Social media are an excellent tool for measuring the general sentiment in the Netherlands. People use them to share their opinion on social issues, express their feelings and give their view on how things are going in the Netherlands.
High degree of social tension after Dam Screamer incident and terrorist attacks
This visualisation shows peaks in the social tension indicator. More messages were posted on or just after days on which incidents took place that created feelings of unsafety and unrest. The big spike in 2010, for example, is related to the disruption of the national Remembrance Day commemorations by the ‘Dam Screamer’ at Dam Square in Amsterdam on 4 May. People’s responses to terrorist attacks are also reflected in the social tension indicator: the terrorist attacks in Paris (13 November 2015) and Brussels (22 March 2016) caused a peak in tensions in the Netherlands. In addition, the MH17 disaster (17 July 2014) resulted in strong feelings of unsafety and unrest. Other types of events, such as the election of Donald Trump in the United States on 9 November 2016, caused social tensions as well.
Social tensions slightly down in recent years
According to the indicator, overall social tension increased slightly between 2010 and 2013 and then decreased slightly. This trend can be compared with the results of the annual Safety Monitor, in which a representative group of citizens is asked by questionnaire whether they experience feelings of unsafety, among other things. The monitor shows that the number of citizens who feel unsafe at times has declined since 2010. However, the social tension indicator maps a phenomenon which is different from the perceptions of unsafety described in the Safety Monitor. Although terrorist attacks do influence perceptions of unsafety, the social impact of such events cannot directly be derived from the results of the Safety Monitor.
All public tweets and retweets posted on Twitter by Dutch users were included in the analysis. The number of messages expressing unrest and feelings of unsafety was set against the total number of messages per day, which makes this a relative indicator.
The glossary was compiled in several steps. CBS used questions from the Safety Monitor to prepare an interview instruction with a set of open questions. Based on the in-depth interviews, words were identified which people use to describe situations of unsafety. This list was supplemented with synonyms to arrive at a total of around 350 words which describe safety or unsafety, and was then checked by a group of experts at CBS. Subsequently, the words relating to unsafety that are used most frequently on social media were used to compile the definitive glossary. Next, messages relating to sports events and politics were filtered out of the message selection, as their often large amount of negative content had a distorting effect.
Finally, the peaks in the indicator were validated by checking which events occurred on or just before a certain date. A list of the most frequently used words from the messages on those dates was then checked manually to see whether the selected messages indeed dealt with these events. The peaks can therefore be linked directly to the events shown. In addition, researchers identified the types of events that were not covered by the indicator; these were mainly local events, or events that would not impair collective security.
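The calculation described above can be illustrated with a minimal sketch. It is not the CBS implementation: the word lists, the function names (`tokenize`, `is_tension_message`, `daily_indicator`, `top_words`) and the sample messages are all hypothetical, and the sketch assumes that a message counts towards the indicator when it contains at least one glossary word and no sports or politics term, with all messages of the day as the denominator. The actual glossary is in Dutch and is not published in this article.

```python
from collections import Counter
from datetime import date

# Hypothetical stand-ins for the validated glossary and the topic filter.
GLOSSARY = {"unsafe", "afraid", "attack", "threat"}
EXCLUDED_TOPICS = {"football", "match", "election", "parliament"}


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    return [w.strip(".,!?:;\"'()#@").lower() for w in text.split()]


def is_tension_message(words):
    """Assumed rule: a glossary word is present and no excluded topic word."""
    unique = set(words)
    return bool(unique & GLOSSARY) and not (unique & EXCLUDED_TOPICS)


def daily_indicator(messages):
    """Return {day: share of that day's messages selected as tension messages}."""
    totals = Counter()
    matches = Counter()
    for day, text in messages:
        totals[day] += 1
        if is_tension_message(tokenize(text)):
            matches[day] += 1
    return {day: matches[day] / totals[day] for day in totals}


def top_words(messages, day, n=5):
    """Most frequent words in the selected messages of one day (for manual checks)."""
    counts = Counter()
    for msg_day, text in messages:
        words = tokenize(text)
        if msg_day == day and is_tension_message(words):
            counts.update(w for w in words if w)
    return counts.most_common(n)


# Invented sample messages, for illustration only.
sample = [
    (date(2016, 3, 22), "Attack in Brussels, I feel unsafe"),
    (date(2016, 3, 22), "Nice weather today"),
    (date(2016, 3, 22), "What an attack by the home team in this football match!"),
    (date(2016, 3, 22), "Threat level raised"),
    (date(2016, 3, 23), "Coffee time"),
]

indicator = daily_indicator(sample)
print(indicator[date(2016, 3, 22)])  # 2 of 4 messages selected: 0.5
print(top_words(sample, date(2016, 3, 22)))
```

Because the result is a share rather than a count, a day with many messages overall does not automatically score higher. The `top_words` helper mirrors the manual validation step: the most frequent words on a peak day can be read to check whether the selected messages really concern the event in question.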
This indicator uses public Twitter messages. These messages are not linked to users, so CBS does not know who has sent them. Only the total number of messages is presented. No selection of users has been made and both private and business users are included. Furthermore, the indicator does not show any messages and the exact content of the tweets cannot be traced.
This indicator is based on daily data. In the future, it can be developed further into a (nearly) real-time indicator to measure the level of social tension in the Netherlands by the minute. Policymakers will then be able to take into account the sentiment in society as they respond to recent events and take the appropriate measures. Social media messages can also provide insight into which important issues play a role in Dutch society and whether these issues vary per region. In addition, this indicator can be compared to other relevant developments, e.g. the general feeling of safety or trust in other people. This will be the focus of future research using cointegration analyses of time series. Other research will have to show whether the indicator is useful in terms of predicting certain developments, e.g. in consumer spending or producer confidence.
Please let us know your opinion on the new social tension indicator. Is there anything that can be improved? What other applications are interesting for research using social media data?