Compendium of Evidence for Not Using Excel for Data Analysis

I always tell my students not to use Excel for data analysis, which, I think, amounts to telling them never to use it. Confused eyes stare back at me, understandably, since Microsoft is like oxygen: imagining life without it sounds deadly. So this post documents examples where using Excel for data analysis has led to major mistakes with significant policy implications.

The common theme through all these examples is that without a script (.R, .do, .py, etc.) that does the analysis, you cannot see where mistakes happen. Excel is like conducting politics in smoke-filled backrooms; Stata, R, Python, etc. are like a deliberative democracy.
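To make that concrete, here is a minimal sketch in Python of what a scripted analysis looks like. The country labels and growth figures are made up for illustration, and `average_growth` is a hypothetical helper, not anyone's actual method. The point is only that the script states which rows go into the average and how many there were, so a reader can check it.

```python
import csv
import io
from statistics import mean

# Hypothetical data: labels and growth rates are invented for illustration.
RAW = """country,growth
A,2.0
B,-1.0
C,3.0
D,4.0
"""

def average_growth(csv_text):
    """Return (mean growth, number of rows used) so the inputs are auditable."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row["growth"]) for row in rows]
    return mean(values), len(values)

avg, n_used = average_growth(RAW)
print(f"mean growth = {avg:.2f} over {n_used} rows")
```

Anyone who doubts the number can read the dozen lines that produced it, rerun them, and see exactly how many rows were included.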

  1. Excel causes austerity. At the height of the Great Recession, Serious People could defend the use of austerity measures because two famous economists, Carmen Reinhart and Kenneth Rogoff, had published research that said so. It turns out that they used Excel for their analysis and did not apply a formula consistently: it did not cover every row it should have. This inconsistency caused their analysis to suggest governments should not engage in deficit spending, but the results would have been different had the formula been extended correctly. Nobody could catch it because there was no script. Note as well that a junior scholar could never ever ever never at all ever ever use Excel and be taken seriously. Here is Paul Krugman’s take.
  2. Excel causes JP Morgan to lose $6 billion. A risk model built in a spreadsheet divided by a sum where it should have divided by an average, which understated the risk, and the analysts did not notice.
  3. Excel will cause Covid-19 to kill you. Excel worksheets have a hard row limit: about 1.05 million rows in the current .xlsx format and only 65,536 in the older .xls format. Public health officials in England received test results and put them into Excel files. They did not notice that Excel stops adding rows once a file reaches that limit, so test results went missing. If some of these people tested positive, their contacts could not be traced, and they could have infected others.
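
All three failures above are easy to see, and to guard against, once the analysis lives in a script. The sketch below is illustrative only: the numbers are made up, and the function names (`mean_of_range`, `relative_change`, `check_not_truncated`) are hypothetical helpers written for this post, not the code anyone in these stories actually used. The two row limits are Excel's documented worksheet limits.

```python
from statistics import mean

XLS_MAX_ROWS = 65_536       # row limit of the legacy .xls worksheet format
XLSX_MAX_ROWS = 1_048_576   # row limit of the current .xlsx worksheet format

def mean_of_range(values, start, stop):
    """Average values[start:stop]; the range is written out where a reviewer can see it."""
    return mean(values[start:stop])

def relative_change(old, new, divisor="average"):
    """Change between two rates, scaled by their average (intended) or their sum (the slip)."""
    scale = (old + new) / 2 if divisor == "average" else old + new
    return (new - old) / scale

def check_not_truncated(n_rows, limit):
    """Raise if a table has hit the sheet's row limit, a sign rows may have been dropped."""
    if n_rows >= limit:
        raise ValueError(f"{n_rows} rows reaches the limit of {limit}; data may be missing")
    return n_rows

# 1. A formula that stops short of the last rows changes the answer (made-up numbers).
growth = [2.0, -1.0, 3.0, 4.0, 2.0]
full = mean_of_range(growth, 0, len(growth))
short = mean_of_range(growth, 0, 3)
print("all rows:", full, "| truncated range:", short)

# 2. Dividing by the sum instead of the average halves the measured change.
intended = relative_change(1.0, 3.0, divisor="average")
slipped = relative_change(1.0, 3.0, divisor="sum")
print("intended:", intended, "| with the slip:", slipped)

# 3. A file that sits exactly at the row limit should stop the pipeline, not pass quietly.
try:
    check_not_truncated(XLS_MAX_ROWS, XLS_MAX_ROWS)
except ValueError as err:
    print("refusing to continue:", err)
```

None of this is clever. It is just written down, which is the whole point: a reviewer can read the range, the divisor, and the row check, and object to any of them.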

