When I was in high school, I went to camp a couple of hours outside of Dallas/Fort Worth. I had never been to Texas before, and I remember looking out the windows of the van from the airport and seeing oil wells everywhere. They were in fields, in parking lots, in people’s front yards. It felt very foreign to a kid from suburban Connecticut. But people were willing to put oil wells everywhere because what they were bringing up out of the ground was so valuable.
What they were bringing out of the ground wasn’t inherently useful, though. Crude oil can’t really do anything. You can’t pull it straight out of the ground and put it into your car. You can’t build stuff with basic crude. Pumping crude oil out of the ground is only the beginning of a process that ends up in everything from shoelaces and raincoats to lunch boxes, perfume, and denture adhesives, not to mention the fuel that powers most of our vehicles.
So what does all of this have to do with data?
I think that data works in much the same way. Data is like crude oil. In and of itself it isn’t valuable. Only when you do something with it does it create real value. Only when it goes through a process is value created from this raw resource.
The problem is that when we talk about “data” we often confuse the thing (a raw resource that is not inherently valuable) and the value chain (“the power of big data”). This confusion can lead organizations to under-invest in the value chain and thus not see the results they hoped for.
The Data Value Chain
At a high level, the data value chain contains four links: collection, storage, analysis, and implementation.
Collection is focused on how to capture thoughts, feelings, ideas, behaviors, actions, locations, and more in such a way that they can be used down the value chain. This is not just about survey design or click-through rates or donor profiles. Someone who really gets into collection is thinking about what level of granularity is needed, how frequently data should be collected, and whether data can be reliably self-reported and/or automatically collected. They think creatively and use a myriad of tools to get the information they know will be valuable down the chain.
Storage takes the data that has been collected and develops ways to structure and preserve it for later use. What types of databases work best for particular types and uses of data? How should tables be structured? How should updates and changes to data be incorporated? These highly detail-oriented and focused questions are asked by those ensuring that processes are in place to accurately transmit data from one stage to the next.
Analysis is concerned with making meaning from data. Michelangelo said, “Inside of every stone is a statue, and the job of the sculptor is to release it.” The same is true for the analyst. They approach data knowing that a valuable insight resides inside, and it is their job to discover and uncover it. What is the right question? How can data be used to answer this question? What tools and methods are appropriate in this situation? These incredibly curious individuals are the ones crafting meaning from data.
Implementation takes the insights found during analysis and seeks to change business practices accordingly. People who love implementation are primarily concerned with making sure their organization is run using evidence. They are asking how data should be presented to spur action, what evidence can be gathered to support or refute a given decision, and what biases people might have that undermine the use of data. They are driven by a desire to create more effective and efficient organizations.
The Management Problem
Most organizations aren’t appropriately investing in the entire value chain and instead over-focus on one link. What portion of the value chain you focus on often has a lot to do with your training. When you ask a techie about data, she or he tends to focus on the storage link, talking about databases and SQL queries. If you talk to a researcher about data, they’ll talk more about the analysis link and the tools and methods they use to derive meaning and insight from data. Program staff tend to focus on data collection (read: surveys and more surveys).
I see a lot of organizations looking for a “data person.” This mysterious person can tactically execute the entire value chain with equal ease. They understand the nuances of data collection, have the technical acumen to work seamlessly between Hadoop and MySQL, have the statistical and computing background to jump between logistic regression analysis and random forest machine learning techniques, and can intuitively grasp business needs and apply data to solve them.
This person doesn’t exist.
Organizations should stop looking to hire a lone data wizard and instead develop capacity around the data value chain. This can mean reallocating talent you already have to parts of the value chain and augmenting that with additional talent only when needed. If you are looking for one person to “own” data within your organization, they need to recognize that they are owning not a single link but the entire value chain.
Getting the Most Value
The organizations that I see getting the most value from their data are the ones that realize data is not inherently valuable. They recognize that data is a raw resource that must go through a process to realize value. They have hired the right people, made the right investments at every link in the chain, and are committed to changing organizational practices based upon evidence.
Data can create real value for your organization, but it isn’t pixie dust you can just sprinkle down for magical results. It’s a raw resource that needs to be processed to create something valuable. If you invest in developing a value chain, you will see incredible results in your organization.