it is so obvious no one bothers saying it.
Data collection is an essential part of any data strategy
After all: without data collection, there is no data. Without data, there is no data value chain: no reporting, no analysis, no data science, no data-driven decision making.
It is not just that people in data don’t remark on the importance of data collection. They do not talk about data collection at all. To take just one example, let’s review FirstMark’s Big Data Landscape:
Roughly 15% of the landscape is given over to the ‘Data Sources and API’ providers. However, none of the providers listed, either in that section or in the rest of the map, specialize in enabling companies to collect their own data. The Big Data Landscape, then, is full of vendors that will help you do things with your data or provide you with data of their own. But all of those providers assume that you already have your own data to work with, and so have data collection sorted.
The awkward truth is that although most companies do have some of their own data, it is often not good data, because it is not being collected properly. And most choose to invest in the rest of their data/analytics stack without first putting in place proper processes and systems to collect and store good data. They might as well build houses without foundations.
In this post, I’m going to explore: