Rules for Data Analysis


I’m 9 days from my due date.  At this point, my life has become one big game of “will I be able to get out of this chair without assistance.”

I have some great guest posts teed up for you all while I’m on maternity leave.  I can’t wait for the ROND Report readers to get a break from my ramblings and hear from the guest writers.  Thanks again to everyone who has written one.

Now, to the newsletter:

Topic of the Week: Rules for Data Analysis

We all have our secret professional pleasures.  They are the little things about our jobs that make us feel like a kid on Christmas morning, running downstairs to open our presents.

Some people live for the leftover food that folks bring into work.  Others enjoy hoarding “the good” office supplies in their desks.

For me, I love getting a huge package of messy data to analyze.  It’s just so exciting.

Sadly, I spend very little time analyzing data these days.  It’s one of the unfortunate consequences of being in charge and needing to delegate tasks.

That being said, I do still spend a fair amount of time reviewing analyses and teaching others how to analyze information.

Today, I thought I would share a few rules that I live by when it comes to data analysis:

Rule #1: Start with the most detailed raw data you can get your hands on

Summary data can be misleading and biased in support of a particular conclusion.

Raw data is raw data.  There should be no ingrained biases in it.

Sure, raw data is messy and takes more time to clean and analyze, but it allows you to form strong conclusions and understand key drivers of a business.

For example:  Each day, the Facebook app tells me I have an urgent notification via a little red number in the upper right corner of my app.  After opening the app and discovering my aunt has shared her third cat video of the week, I then exit the app.

In Facebook’s summary metrics, I’m counted as a daily active user for that day.  Am I using the app as intended?  No.  Am I helping drive revenue? No.  But by Facebook’s definition of Daily Active User, I’m “engaged.”

Only the raw data would tell the full story.
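To make the gap concrete, here is a small sketch in Python.  The session log, the field layout, and the 60-second / one-action thresholds are all made up for illustration; this is not Facebook's actual data or its definition of engagement.  It simply shows how the same raw rows can support a generous "opened the app" count and a stricter "actually used it" count.

```python
from collections import defaultdict

# Hypothetical raw session log: (user_id, day, seconds_in_app, actions_taken)
RAW_SESSIONS = [
    ("ana", "2024-03-01", 8, 0),     # opened, saw a notification, left
    ("ana", "2024-03-01", 5, 0),
    ("ben", "2024-03-01", 640, 12),
    ("cho", "2024-03-01", 310, 4),
    ("dee", "2024-03-01", 6, 0),
]

def daily_active_users(sessions, day):
    """Summary-style metric: anyone who opened the app at all that day."""
    return {user for user, d, _, _ in sessions if d == day}

def engaged_users(sessions, day, min_seconds=60, min_actions=1):
    """Stricter view built from raw rows (thresholds are assumptions)."""
    seconds = defaultdict(int)
    actions = defaultdict(int)
    for user, d, secs, acts in sessions:
        if d == day:
            seconds[user] += secs
            actions[user] += acts
    return {
        user for user in seconds
        if seconds[user] >= min_seconds and actions[user] >= min_actions
    }

dau = daily_active_users(RAW_SESSIONS, "2024-03-01")
engaged = engaged_users(RAW_SESSIONS, "2024-03-01")
print(f"Daily active users: {len(dau)}, engaged by the stricter test: {len(engaged)}")
```

With only the headline count in hand, you could never run the second calculation.  With the raw rows, you can test any definition you like.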

Rule #2: Document and Link

Documenting how you manipulated data for analysis purposes is important for two reasons:

First, it’s rare that an analysis is done once and then forgotten forever.  It’s important to document the steps you took to manipulate the data so you can replicate it in the future with updated information.

Second, the person you present your analysis to may want to know how the analysis was conducted or dive into the detailed information supporting your summary conclusions.  Never hardcode your conclusions; instead, link to the detailed information.
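One way to do both in a script, sketched in Python: keep the manipulation steps as a named, ordered list (so the list itself is the documentation) and compute the summary from the detail rows rather than typing the numbers in.  The sample rows, the step names, and the helper functions below are all hypothetical.

```python
# Hypothetical detail rows: (region, amount); None marks a missing amount.
DETAIL = [
    ("East", 120.0),
    ("West", 80.0),
    ("East", None),
    ("West", 40.0),
    ("East", -15.0),
]

def drop_missing(rows):
    return [row for row in rows if row[1] is not None]

def drop_refunds(rows):
    return [row for row in rows if row[1] >= 0]

# The ordered steps double as a record of how the data was manipulated.
STEPS = [
    ("drop rows with no amount", drop_missing),
    ("drop negative amounts (treated as refunds)", drop_refunds),
]

def run_steps(rows, steps):
    """Apply each step in order and log how many rows survive it."""
    log = [("start", len(rows))]
    for description, step in steps:
        rows = step(rows)
        log.append((description, len(rows)))
    return rows, log

def summarize(rows):
    """Summary totals are derived from the detail, never hardcoded."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals

clean, log = run_steps(DETAIL, STEPS)
summary = summarize(clean)

for description, row_count in log:
    print(f"{description}: {row_count} rows")
print(summary)
```

When next month's file arrives, you swap in the new detail rows and rerun.  The steps and the link between detail and summary stay intact.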

Rule #3: Always Use Sanity Checks

The process of manipulating data is susceptible to human error.  I can’t tell you the number of times I was in a hurry and rushed through an analysis only to discover my summary conclusions didn’t add up to the right numbers.

Always take the time to add checks to your summary charts / analyses that tie back to the original file totals.
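Here is a minimal sketch of such a check in Python.  The sample amounts and function names are hypothetical; the idea is just that the summary total must tie back to the raw file total, and the script should refuse to continue when it does not.

```python
import math

# Hypothetical raw rows: (region, amount)
RAW_AMOUNTS = [
    ("East", 120.0),
    ("West", 80.0),
    ("East", 35.5),
    ("West", 40.0),
]

def summarize_by_region(rows):
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals

def check_ties_to_source(rows, summary, tolerance=1e-9):
    """Raise if the summary total does not match the raw total."""
    raw_total = math.fsum(amount for _, amount in rows)
    summary_total = math.fsum(summary.values())
    if not math.isclose(raw_total, summary_total, abs_tol=tolerance):
        raise ValueError(
            f"Summary total {summary_total} does not tie to raw total {raw_total}"
        )
    return raw_total

summary = summarize_by_region(RAW_AMOUNTS)
checked_total = check_ties_to_source(RAW_AMOUNTS, summary)
print(f"Summary ties to source total: {checked_total}")
```

If a region is accidentally dropped from the summary, the check raises an error instead of letting the wrong number reach a chart.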

Rule #4: Keep it Simple

Even the smartest folks will have trouble following a summary analysis with 1,000 different numbers on a chart.  Start high level, focus on key conclusions, and use additional supporting analyses to peel back the onion.

Have a great weekend everyone!

About the author

Danielle O'Rourke

Recovering Investor. Mom. Wife.
