Formatting CAMEO Event Codes in ICEWS

UPDATE: Thanks to @icews for helping me figure this out.  It turns out that the CAMEO Code field is saved as a string, but Pandas interprets that column as integers and drops the leading zero.  To read that column correctly, use the following line:

data = pd.read_csv('/Data/ICEWS/events.2010.20150313084533.tab', sep='\t', dtype={'CAMEO Code': object})
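
To see the effect without the full ICEWS file, here is a minimal sketch on a tiny in-memory tab-separated sample. The sample rows and the names `sample`, `as_int`, and `as_str` are made up for illustration; only the `CAMEO Code` column name and the `dtype` argument come from the line above.

```python
import io

import pandas as pd

# Hypothetical sample standing in for an ICEWS .tab file (not real ICEWS rows).
sample = (
    "Event Text\tCAMEO Code\n"
    "Consider policy option\t014\n"
    "Make a visit\t042\n"
    "Demonstrate or rally\t141\n"
)

# Default parsing: pandas infers integers, so '014' becomes 14.
as_int = pd.read_csv(io.StringIO(sample), sep='\t')

# Forcing the column to object keeps the codes as strings, zeroes included.
as_str = pd.read_csv(io.StringIO(sample), sep='\t', dtype={'CAMEO Code': object})

print(as_int['CAMEO Code'].tolist())  # [14, 42, 141]
print(as_str['CAMEO Code'].tolist())  # ['014', '042', '141']
```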

——————————————————————————————————————————————————

Wanting to see how ICEWS represents Arab Spring protests, I pulled all protests from November 1st, 2010 through December 31st, 2011 in 16 countries in the Middle East and North Africa. My research is about protests, so I am very familiar with CAMEO’s Event Code 14. I was therefore surprised to find some events coded 14 in ICEWS whose event text said “Consider policy option”. (“Event text” is the CAMEO description of the Event Code.)

I noticed other anomalies. For example, a CAMEO Code of 42 has the Event Text “Make a visit”; to me, a code of 42 looks like it should belong to an event category that starts with 42, but CAMEO’s largest event code starts with 20. After some more investigating, it appears that ICEWS does not include leading or trailing zeroes. In other words, 14 should actually be 014, and 42 should be 042. This way, protest codes are the ones of the form 14XX.

The fix can be made with one line of Python code:

data['CAMEO_Code_Modified'] = ['%03d' % item if len(str(item)) <= 2 else str(item) for item in data['CAMEO_Code']]

This code adds leading zeroes to any CAMEO code that is shorter than length 3. While this could misclassify an event whose code is not in fact missing a leading zero, I have not seen a case where the zero does not appear to be missing. Note that, following David Masad, I replaced all spaces in the variable names with underscores.
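
As a rough check of the padding step, here is a small sketch on made-up codes. It uses pandas' `str.zfill` as a vectorized alternative to the list comprehension; the sample values are hypothetical and assume the column was read as integers.

```python
import pandas as pd

# Hypothetical codes as pandas reads them by default (leading zero already lost).
data = pd.DataFrame({'CAMEO_Code': [14, 42, 141, 1411]})

# Convert to strings, then left-pad with zeroes to a minimum width of 3.
# Codes that are already 3 or 4 characters long are left unchanged.
data['CAMEO_Code_Modified'] = data['CAMEO_Code'].astype(str).str.zfill(3)

print(data['CAMEO_Code_Modified'].tolist())  # ['014', '042', '141', '1411']
```

Storing every code as a string keeps the column a single type, which makes later prefix filters such as `.str.startswith('14')` behave predictably.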

Note as well that CAMEO codes have 3 levels, i.e. they are 4 digits. ICEWS leaves off trailing zeroes as well, but I have not noticed a situation in which the lack of a trailing zero leads to ambiguity, so I did not correct that. It would be easy to do, though: just multiply by 10 any CAMEO code that does not have 4 digits.
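
If the trailing zero is wanted, one way to sketch it on string codes is to right-pad to 4 characters, the string counterpart of multiplying by 10. This assumes the leading zero has already been restored and the codes are stored as strings; the sample values and the names `codes` and `four_digit` are hypothetical.

```python
import pandas as pd

# Hypothetical codes that already have their leading zero restored.
codes = pd.Series(['014', '042', '141', '1411'])

# Right-pad with zeroes to 4 characters; 4-character codes are unchanged.
four_digit = codes.str.ljust(4, '0')

print(four_digit.tolist())  # ['0140', '0420', '1410', '1411']
```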
