If you have a beautiful clean house, and dirty windows – what does the passerby think of your cleaning habits? The interior of the house may sparkle – giving clear proof of skills with a mop that even Martha Stewart would envy. But from the outside, you’re an unrepentant Oscar Madison.
A simple observation supported by data, but incomplete at best.
We’ve previously discussed (in The Neutral Zone) the risk of bias in data collection: who’s measuring, and why? The first task of the analyst is to understand and correct for that bias.
The second major task for the analyst is navigating the data set. We can dismiss the classical database of rows and columns. Not only is it falling out of use from a technical perspective – it doesn’t accurately reflect the semi-structured format of most information.
You may read about “Data Lakes”, “Big Data”, and other concepts. They all amount to the same thing – a bunch of related stuff. We’ll go into this another time – but for now let’s continue to visualize a data set as a house.
Your house has millions of data points. Not just each object with its attributes, but the relationships between objects. For example, you have a chandelier that you love. You can talk about its size, number of lamps, electricity used, lumens produced. Pretty simple. What else does it impact? The size and height will impact airflow – maybe changing how your thermostat registers. Its position in the room drives other furniture placement: the dining room table, then the chairs, the china cabinet, and so on. Two data points, the size of the chandelier and the diameter of the table, have an impact on the rest of the house. Light reflected off the chandelier may cause a window shade to be pulled down, and so a reading lamp to be turned on in the next room. Getting more complicated now.
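To make the "objects plus relationships" idea concrete, here is a minimal sketch in Python. The object names follow the chandelier example, but the attribute values, the `influences` map, and the `downstream` helper are all made up for illustration; a real data set would be far larger and messier.

```python
from collections import deque

# Attributes of a single object. The values are invented placeholders.
attributes = {
    "chandelier": {"lamps": 12, "diameter_cm": 90},
    "dining_table": {"diameter_cm": 150},
}

# Relationships: which object has an effect on which others (hypothetical map).
influences = {
    "chandelier": ["thermostat", "dining_table", "window_shade"],
    "dining_table": ["chairs", "china_cabinet"],
    "window_shade": ["reading_lamp"],
}

def downstream(start, edges):
    """Return every object reachable from `start`, in breadth-first order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order

affected = downstream("chandelier", influences)
print(affected)
```

Two objects carry attributes here, yet following the relationships from the chandelier reaches six other objects, including the reading lamp in the next room. That is the sense in which the relationships, not just the attributes, make up the data set.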
Now think about it as a restaurant owner. You probably have lots of different systems for tracking data like invoices, inventory, and daily receipts. What’s the relationship between these systems? Do you pull in community information – like school calendars? You have the data – but what does it mean? Per-ticket revenue, a typical key metric for restaurants, goes up. Why? What can you do to control the other variables, or at least understand when you need to react? Large chain restaurants manage some of these variables, but are also willing to allow more waste. Small business owners can’t afford to do that. You need to understand when you’re looking at dirty windows, and when you can see the workings of the entire house.
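As a rough illustration of connecting two of those systems, the sketch below computes per-ticket revenue for each day and lines it up against a school calendar. Every value in it is invented, and the names `receipts`, `school_days_off`, and `per_ticket_revenue` are placeholders rather than anything from a real point-of-sale system.

```python
from collections import defaultdict

# Made-up daily receipts: (date, ticket total in dollars).
receipts = [
    ("2016-03-07", 20.0),
    ("2016-03-07", 30.0),
    ("2016-03-08", 45.0),
    ("2016-03-08", 55.0),
]

# Made-up community calendar: dates when local schools are closed.
school_days_off = {"2016-03-08"}

def per_ticket_revenue(rows):
    """Average ticket total per day: revenue divided by number of tickets."""
    totals = defaultdict(lambda: [0.0, 0])  # date -> [revenue, ticket count]
    for day, amount in rows:
        totals[day][0] += amount
        totals[day][1] += 1
    return {day: revenue / count for day, (revenue, count) in totals.items()}

averages = per_ticket_revenue(receipts)
for day in sorted(averages):
    label = "school closed" if day in school_days_off else "school open"
    print(day, averages[day], label)
```

On its own, the metric only says that per-ticket revenue moved. Placed next to the calendar, it at least suggests a question worth asking, though two days of invented numbers prove nothing about cause.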
I like to think of data as a Choose Your Own Adventure book. The data set is the same for every reader. The choices that you make drive different outcomes. Each time you read the book, your understanding of the choices (the data set) gets better. Our goal is to help you make the best choices.
If you have suggestions for additional topics, please comment on this article, or contact us.
The entirety of this site is protected by copyright © 2016 Bright Beach Consulting LLC