The different approaches to data collection
Understanding the different data collection solutions for companies serious about building out their own data asset.
Setting the scene
More and more companies are investing in developing their own data asset. Typically they’ll want to:
Invest in a data warehouse and/or data lake
Centralise all data sources in a data warehouse and/or data lake
Use their event-level data for advanced analytical techniques like machine learning
Access and process data in real-time to power e.g. fraud detection or marketing automations
Snowplow Insights is an event data collection platform that enables companies to do just that.
However, it is not the only way that companies can collect event-level data into their data warehouse.
Other technologies, including some digital analytics tools, CDPs and ETL-as-a-service vendors, offer alternative ways to collect data from sources such as websites and mobile apps.
Below, we explore the differences between using them and using Snowplow Insights to build your data asset.
How we compare
Packaged analytics providers
Packaged analytics providers like Google and Adobe provide powerful tools for analysts and marketers to understand the behavior of users on websites and mobile applications. Increasingly, these platforms also make the underlying data that powers their UIs available to you in your data warehouse. However, for these providers the underlying data is a byproduct of the UIs, and hence their business logic bleeds into the data. Unlike with Snowplow Insights, you’re stuck with their one-size-fits-all data structures and data models, which may not be appropriate for your business. Lastly, these vendors do not support evolving your data collection alongside your web and mobile apps, making them brittle sources of data. Learn more about how Snowplow differs from packaged analytics providers.
Customer data platforms (CDPs)
Customer Data Platforms (CDPs) are a new category of tech geared towards marketers who want to run unified campaigns to users across different channels. Some providers, like Segment, also support collecting event data to power those campaigns and loading that data into your data warehouse. However, because of their focus on campaigns and on sending data to multiple destinations, they’re not as powerful as Snowplow Insights when it comes to delivering that data to your data warehouse. You’ll see fewer events out of the box and fewer data points per event, and you’ll be stuck with unstructured data that must be structured before your data scientists can work with it and that is unpredictable for data engineers who want to consume it to power real-time applications. Learn more about how Snowplow differs from CDPs.
ETL-as-a-service vendors
ETL-as-a-service vendors provide easy ways for companies to move data from third-party systems (e.g. Facebook, Adwords, Salesforce) and operational databases (e.g. Postgres, SQL Server) into their data warehouse. Snowplow Insights customers will typically run an ETL-as-a-service vendor like xPlenty or Stitch alongside Snowplow Insights: using Snowplow Insights for the real-time collection of event-level data wherever possible and an ETL-as-a-service provider for those sources that do not support streaming of event-level data.
Build your own data pipeline
Some companies, serious about building their own data asset, invest heavily in developing their own data collection technology. However, companies consistently underestimate the resources required to build, maintain and evolve their data collection infrastructure. When they build it themselves, time to value can be weeks, if not months, and business logic typically bleeds into the code, making solutions brittle and very hard to evolve over time. With Snowplow, your pipeline can be running in minutes and will evolve with your business. Learn more about how Snowplow differs from building your own pipeline.