My dissertation, still in its very early stages, seeks to understand how protests spread across countries, with a focus on the Arab Spring. One source of data I am exploring is Phil Schrodt and Kalev Leetaru’s Global Database of Events, Language, and Tone (GDELT), a machine-coded events dataset. (GDELT is an amazing resource; to read more about it, check out its website.)
Inspired by John Beieler’s protest maps and wanting to understand GDELT better, I set out to visualize different events during the Arab Spring. With the help of CartoDB, GitHub, and Andrew Thompson from Azavea Labs, and by forking John’s maps, I made five different maps: three of protests, one of violent events, and one of police actions.
Each map selects events from November 1st, 2010 through July 31st, 2011. I chose this timespan to cover the major events of the Arab Spring while keeping the data under CartoDB’s 50 megabyte cap. Each map also takes events from Algeria, Bahrain, Egypt, Iraq, Jordan, Kuwait, Lebanon, Libya, Morocco, Oman, Qatar, Saudi Arabia, Syria, Tunisia, United Arab Emirates, and Yemen. Israel is not included.
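The date and country selection can be sketched as a simple filter. This is only a minimal sketch under stated assumptions: it treats each event as a dict with GDELT's `SQLDATE` (YYYYMMDD) and `ActionGeo_CountryCode` fields, assumes countries are identified by FIPS 10-4 codes, and uses illustrative names (`ARAB_SPRING_FIPS`, `parse_sqldate`, `in_sample`) and made-up rows. The code list should be checked against the GDELT documentation before use.

```python
from datetime import date

# FIPS 10-4 codes for the 16 countries listed above (an assumption; verify
# against the GDELT documentation). Israel ("IS") is deliberately absent.
ARAB_SPRING_FIPS = {
    "AG", "BA", "EG", "IZ", "JO", "KU", "LE", "LY",
    "MO", "MU", "QA", "SA", "SY", "TS", "AE", "YM",
}
START, END = date(2010, 11, 1), date(2011, 7, 31)


def parse_sqldate(value):
    """Convert a YYYYMMDD value such as 20110125 into a date."""
    s = str(value)
    return date(int(s[:4]), int(s[4:6]), int(s[6:8]))


def in_sample(event):
    """True if the event falls inside the time window and country list."""
    day = parse_sqldate(event["SQLDATE"])
    return START <= day <= END and event["ActionGeo_CountryCode"] in ARAB_SPRING_FIPS


# Made-up rows, for illustration only.
events = [
    {"SQLDATE": 20101217, "ActionGeo_CountryCode": "TS"},
    {"SQLDATE": 20110125, "ActionGeo_CountryCode": "EG"},
    {"SQLDATE": 20110125, "ActionGeo_CountryCode": "IS"},  # not in the list
    {"SQLDATE": 20110815, "ActionGeo_CountryCode": "SY"},  # after the window
]
sample = [e for e in events if in_sample(e)]
```

Only the first two rows survive the filter; the third is outside the country list and the fourth is outside the time window.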
The protest maps then use all events with an EventRootCode of 14 that occurred in one of those countries. The first map shows all 18,563 events that match these criteria. The visualization immediately shows a few things. First, GDELT picks up many events that are probably not part of the Arab Spring, which is fine given the coarseness of the term “protest”. This is most evident in the activity in Egypt and Iraq at the end of 2010. Second, GDELT does a good job recording the Arab Spring. Notice how events in Tunisia start appearing in December 2010 and persist, perhaps even increasing, through January. Egypt similarly lights up towards the middle of January. Then Libya starts to see more events, soon followed by Yemen, Syria, and Bahrain. Algeria is pretty quiet, as are Morocco and Saudi Arabia. While more investigation is required, the broad movement of events clearly follows the contours of the Arab Spring. Third, GDELT sometimes cannot locate events more precisely than the country level. Others have noted this behavior, and it explains the appearance of events in the middle of Egypt, Saudi Arabia, Libya, etc.
It turns out that most protest events in my sample have very few articles written about them. (See this histogram of protests and the number of articles about them; sorry for the link, WordPress makes it annoying to embed .pdfs in a post.) While the correlation between news coverage and event size is interesting in its own right, I wanted to filter out many of these sparsely covered events, since the most important protests are presumably also those most likely to receive news coverage. To this end, I created two other maps, one with only protests mentioned in 10 or more articles, another with those in 20 or more. See all protests mentioned in at least 10 articles here and those with at least 20 here. I find these maps cleaner than the original one, making it easier to see the spread of protests and the differential experience of countries.
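The two thresholds amount to a filter on the article count. A minimal sketch, assuming the count lives in GDELT's `NumArticles` field and that `EventRootCode` is stored as a string; the helper names and the rows are made up for illustration.

```python
def is_protest(event):
    """True for events whose root code is 14 (protest)."""
    return str(event["EventRootCode"]) == "14"


def protests_with_min_articles(events, min_articles):
    """Keep protest events mentioned in at least `min_articles` articles."""
    return [
        e for e in events
        if is_protest(e) and int(e["NumArticles"]) >= min_articles
    ]


# Made-up rows, for illustration only.
events = [
    {"EventRootCode": "14", "NumArticles": 3},
    {"EventRootCode": "14", "NumArticles": 12},
    {"EventRootCode": "14", "NumArticles": 27},
    {"EventRootCode": "18", "NumArticles": 40},  # not a protest
]
protests_10 = protests_with_min_articles(events, 10)
protests_20 = protests_with_min_articles(events, 20)
```

The 20-article set is always a subset of the 10-article set, which is why the third protest map looks like a sparser version of the second.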
Finding John Beieler’s map of protests after President Morsi’s overthrow intriguing, I pulled the same events for my time period. This map of violent events shows all events that have an EventRootCode of 18 (‘assault’) and whose target is a civilian (Actor2Type1, Actor2Type2, or Actor2Type3 contains ‘CVL’). Interestingly, Iraq stands out as particularly violent, and the conflicts in Libya and Syria are very clear. There is also a burst of activity in Bahrain, which makes sense, while the other countries are relatively quiet. Once again, we see that GDELT successfully captures the contours of the Arab Spring.
The final map, of police actions, shows events with EventBaseCode equal to 151 (‘Increase police alert status’) or 153 (‘Mobilize or increase police power’). This map appears to mirror the protest maps.
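Both selections are simple predicates on an event's codes. A minimal sketch, assuming the actor-type fields appear in the data as `Actor2Type1Code`, `Actor2Type2Code`, and `Actor2Type3Code` (the GDELT column names as I understand them; adjust if your export differs) and that event codes are stored as strings. The function names and rows are made up for illustration.

```python
# Assumed column names for the three Actor2 type fields.
ACTOR2_TYPE_FIELDS = ("Actor2Type1Code", "Actor2Type2Code", "Actor2Type3Code")
POLICE_BASE_CODES = {"151", "153"}


def is_violence_against_civilians(event):
    """Root code 18 (assault) with 'CVL' in any Actor2 type field."""
    if str(event.get("EventRootCode")) != "18":
        return False
    return any("CVL" in (event.get(f) or "") for f in ACTOR2_TYPE_FIELDS)


def is_police_action(event):
    """Base code 151 or 153 (police alert or mobilization)."""
    return str(event.get("EventBaseCode")) in POLICE_BASE_CODES


# Made-up rows, for illustration only.
events = [
    {"EventRootCode": "18", "EventBaseCode": "182", "Actor2Type1Code": "CVL"},
    {"EventRootCode": "18", "EventBaseCode": "183", "Actor2Type1Code": "GOV",
     "Actor2Type2Code": "CVL"},
    {"EventRootCode": "18", "EventBaseCode": "180", "Actor2Type1Code": "MIL"},
    {"EventRootCode": "15", "EventBaseCode": "153", "Actor2Type1Code": None},
]
violent = [e for e in events if is_violence_against_civilians(e)]
police = [e for e in events if is_police_action(e)]
```

Checking all three type fields matters because a target coded primarily as something else can still carry a civilian code in its second or third slot, as in the second row.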
Overall, this process was illuminating and fun (when I wasn’t spending time learning GitHub). GDELT is a very rich dataset, and its potential really shines. Even throwing out most of the reported events, which I did for two of the protest maps, leaves plenty of events to work with. My next step is to explore using these events as dependent variables and to see what we can learn from events mentioned in fewer than 10 articles.