(This is the second article in my series, “The 00001010 Commandments of Data”. You can read the introduction to this series here.)
Rules always offend someone, especially when they’re as unilateral as this one:
Never give anyone access to raw data.
Even people who agree with me on this rule have a hard time enforcing it. That’s why I like calling it a “commandment”; if you want to achieve the goals of a data strategy, you’ll need to stick to your guns on this one.
I’m here to help.
Raw data is simply copies of the tables from your company business systems. That’s my technical definition. But for most people who spend their time hunting and gathering data, it’s simply any data that they haven’t modified yet. People who consume data often don’t know the real origins (“lineage”) of the data they use for decisions; they only know it’s not ready for use without their own changes. It’s raw.
Here’s why you need this commandment:
Raw data lacks context.
Raw data creates chaos for business logic.
Raw data disconnects lineage.
Raw data performs badly.
Raw data doesn’t protect confidentiality.
Raw data kills self-service.
Need more reasons? You’ll never really improve your business decision-making if you expose everyone to raw data.
Self-service? Or Chaos?
I’ve seen enough data solutions to know that companies grant access to raw data all the time. How did we get here?
As analytics grew in importance over the last twenty years, business managers and analysts demanded more access to data. IT leaders didn’t see this coming, and without a plan to manage decision data, they responded by giving everyone access to the raw data instead.
Then they applied a trendy name to it, “self-service,” which sounded appealing but really led to chaos.
Analysts loved it, however, because they learned new technical skills, like using visualization software to create their own dashboards from the raw data. What nobody realized was that the work of organizing data had shifted from IT to every analyst, multiplying the volume of data preparation work in companies as analysts repeated (and repeated, and repeated) those tasks on their own desktops.
Many companies generated thousands of dashboards but didn’t align their decision-making at all.
Making Data Easy to Find
Good self-service requires better structure, not less structure. A grocery store, for example, makes it easier to find the ingredients you need for dinner, all by yourself. It does this by organizing food into sections, aisles, and shelves. The store managers don’t know what you plan to eat for dinner tonight, but they do know the common ingredients that most people use for cooking.
Grocery stores also offer you service, like making sure your eggs aren’t broken or helping you to the car. Even though they tried to take that service away with self-checkout (which hasn’t worked out well), nobody is suggesting they send all the shoppers to the farm or the loading dock.
But that’s exactly what sending decision-makers to the raw data is like, and it doesn’t make decisions about your business (or your dinner) any faster, easier, or better.
Granting access to raw data kills self-service, because everyone independently creates their own business logic to make sense of the data. That’s a formula for frustration when your CEO gets conflicting answers to the same question. If you’ve already crossed this red line, don’t give up hope. Just offer a good plan to organize the data.
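To make the conflicting-answers problem concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the order rows, the status values, the two analysts, and the rule that only shipped orders count as revenue. The point is only that the same raw rows give different totals depending on whose logic touches them.

```python
# Hypothetical raw "orders" rows, as they might be copied from a business system.
RAW_ORDERS = [
    {"order_id": 1, "amount": 100.0, "status": "shipped"},
    {"order_id": 2, "amount": 250.0, "status": "cancelled"},
    {"order_id": 3, "amount": 80.0, "status": "shipped"},
    {"order_id": 4, "amount": 40.0, "status": "returned"},
]


def revenue_analyst_a(rows):
    # Analyst A sums every order, whatever its status.
    return sum(r["amount"] for r in rows)


def revenue_analyst_b(rows):
    # Analyst B drops cancelled orders but still counts returns.
    return sum(r["amount"] for r in rows if r["status"] != "cancelled")


def curated_revenue(rows):
    # One shared rule, defined once (assumed here: only shipped orders count).
    return sum(r["amount"] for r in rows if r["status"] == "shipped")


if __name__ == "__main__":
    print(revenue_analyst_a(RAW_ORDERS))  # 470.0
    print(revenue_analyst_b(RAW_ORDERS))  # 220.0
    print(curated_revenue(RAW_ORDERS))    # 180.0
```

Neither analyst made a coding mistake. Each simply answered “what is revenue?” alone. When the rule lives in one shared place, everyone who asks the question gets the same number.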
And whatever you do, don’t just give access to raw data. Instead, put the “service” back into self-service.
Thanks for reading my newsletter!