Batch zip files

One of my hard drives is down to its final terabyte of 8, so it's time for me to compress some files. Since I have thousands of files on that drive, it would be inefficient to select them one by one. It turns out it's easy to pass a bunch of files to gzip. I […]
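
As a rough illustration of the idea (not the command from the post itself, which is cut off in this excerpt), here is a minimal sketch that batch-compresses files with Python's standard `gzip` module rather than the `gzip` command line. The function name `gzip_files`, the `*.txt` default pattern, and the `remove_original` flag are all hypothetical choices made for this example.

```python
import gzip
import shutil
from pathlib import Path


def gzip_files(directory, pattern="*.txt", remove_original=False):
    """Compress every file under `directory` matching `pattern` to `<name>.gz`.

    Hypothetical helper for illustration; returns the list of archives written.
    """
    compressed = []
    for path in sorted(Path(directory).rglob(pattern)):
        # Skip directories and anything that is already a gzip archive.
        if not path.is_file() or path.suffix == ".gz":
            continue
        target = path.with_name(path.name + ".gz")
        # Stream the bytes so large files never have to fit in memory.
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        if remove_original:
            # Assumption: drop the original only after its archive is written.
            path.unlink()
        compressed.append(target)
    return compressed
```

Calling `gzip_files("/path/to/drive", pattern="*.json")` would walk the directory tree and write a `.gz` file next to each match. Leaving `remove_original` off keeps the originals in place, which is the safer default while checking that the archives are readable.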

Understanding Subnational Variation in Tweets

My primary source of data is tweets I get from Twitter’s POST statuses/filter endpoint, which I believe was called the “Streaming Endpoint” when I started working with Twitter data eons ago.  While it has always been straightforward to use a bounding box to get tweets with geographic information, exactly what Twitter reports and how it […]

Crawling Followers with Intelligent Stopping

Like almost every other academic, I have started a Covid-19 project.  I think my team has a unique angle because of the kind of data I collect.  One dynamic we are interested in is patterns of following, and being able to analyze that across enough accounts required me to work with Twitter endpoints I have […]

My Ongoing Twitter Collections

I recently spent a lot of time reviewing my Twitter data collection infrastructure in order to start some more collections.  In that process, I discovered some tokens and streams I forgot about.  The purpose of this post is to document what data I am collecting as of 04.29.2020 so that I have an easy reference […]

Cell Phone or Geolocation Datasets

I want cell phone data.  Location tracking or call detail records: both have their users and can answer research questions.  Basically nothing is publicly available, but lots of companies exist and some say they will work with academics.  This post is for me to keep track of those companies. SafeGraph – Focused on places, not […]

What I Read, 2019 Version

Starting with the 19th book review (Mao biography), I have decided to add a grade and, to counterbalance what can often seem like negative reviews, one interesting fact learned from each book.  I aim for a B- average.  The Sellout by Paul Beatty – Wow, what a novel.  My wife bought it for me for […]