“If you’re expecting talk-radio and television shout fests to talk about how awesome your statistical validity is, you’re an idiot.”
Matt Waite has an excellent post on Source today that tells the story of early data journalism in Florida during the 2000 presidential election. He delves into issues of race and identity, and explains how easily journalists with good sense can mix things up — and miss big stories — because of how quickly numbers can obfuscate reality.
Race and ethnicity are tricky topics with loads of nuance and definitional difficulties. But they aren’t the only places these issues come up. Anytime you’re comparing data across agencies and across geographies, be on high alert for mismatches. Crime is a huge issue—jurisdictions have different definitions of what constitutes a big theft versus a little one, for instance. Driving laws are another—what constitutes reckless driving changes state to state. Budgets are another nightmare—what dollar figure requires a bid or not changes from city to city.
Getting the metadata, getting someone on the phone and running basic descriptive statistics will help you avoid traps and hopefully let you avoid getting your butt kicked like I did.
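
To make the "basic descriptive statistics" advice concrete, here is a minimal sketch of the kind of check that can surface a definitional mismatch before it bites. Everything in it is an assumption for illustration: the jurisdictions, the "grand theft" label, the dollar values and the `summarize` helper are invented, not taken from Waite's post or from any real dataset.

```python
from statistics import median

# Hypothetical records: (jurisdiction, charge label, dollar value of the theft).
# All names and numbers are invented for illustration only.
RECORDS = [
    ("City A", "grand theft", 350),
    ("City A", "grand theft", 900),
    ("City A", "grand theft", 4200),
    ("City B", "grand theft", 1100),
    ("City B", "grand theft", 2500),
    ("City B", "grand theft", 8000),
]


def summarize(records, label):
    """Return count, min, median and max per jurisdiction for one charge label."""
    by_place = {}
    for place, charge, value in records:
        if charge == label:
            by_place.setdefault(place, []).append(value)
    return {
        place: {
            "count": len(values),
            "min": min(values),
            "median": median(values),
            "max": max(values),
        }
        for place, values in by_place.items()
    }


summary = summarize(RECORDS, "grand theft")
for place, stats in summary.items():
    print(place, stats)
```

If the smallest "grand theft" in one city is a few hundred dollars and in another it is over a thousand, the two agencies may be using the same label for different thresholds. The numbers alone won't say why, which is where the metadata and the phone call come in.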