It is software that you can install in your cloud environment (AWS or GCP) to collect rich, high-quality data.
You can use one of Snowplow’s trackers in your website, app or server, or a Snowplow webhook to capture third-party data. The Snowplow pipeline then delivers the data to a data warehouse of your choice. From there, you are free to use the data as you wish.
Why would I use Snowplow?
At risk of sounding like a clickbait article, here are 10 reasons you would use Snowplow:
All your data lives in your own environment, not Snowplow’s, so you have unlimited access to all your event level data
Snowplow events are extremely rich: each one is collected with at least 130 properties (where available), allowing for a very robust understanding of your users
Data is available in real time, which means fresh data for your reports in seconds
Data from all platforms (web/mobile/server side/emails/ad impressions) is structured in the same way and stored in the same place
Tracking design is fully customizable, so the data you capture is tailored to your business and use cases
Often significantly cheaper than GA360 or Adobe Analytics
Data can be enriched with 3rd party sources
Data is very well structured and of exceptionally high quality thanks to the pipeline’s validation step, making it a perfect input for machine learning models (events that fail validation are also stored, but not in the warehouse, so the pipeline is non-lossy)
Tracking is versioned, so your data strategy can evolve: the complexity and detail can grow with the complexity of the questions that you want to answer
Lots of tooling around privacy to allow for GDPR compliance, including data hashing at collection and scrubbing of stored data
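To make the "non-lossy" point in the list above concrete, here is a minimal Python sketch of a validation step that routes every event to one of two places. It is an illustration only, not Snowplow's implementation: the schema format, the field names (`search_term`, `result_count`) and the `validate` and `route` helpers are all assumptions made up for this example.

```python
from typing import Any

# Hypothetical schema: field name -> required Python type.
# This is NOT Snowplow's schema format, just a stand-in for the idea.
SEARCH_SCHEMA: dict[str, type] = {"search_term": str, "result_count": int}


def validate(event: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the event is valid."""
    problems = []
    for field, expected in schema.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            problems.append(f"wrong type for {field}")
    return problems


def route(events: list[dict[str, Any]], schema: dict[str, type]):
    """Split events into good (warehouse-bound) and bad (kept elsewhere)."""
    good, bad = [], []
    for event in events:
        problems = validate(event, schema)
        if problems:
            # Failed events are kept with their errors, never dropped.
            bad.append({"event": event, "errors": problems})
        else:
            good.append(event)
    return good, bad


events = [
    {"search_term": "wheels", "result_count": 2},
    {"search_term": "plow"},  # invalid: result_count is missing
]
good, bad = route(events, SEARCH_SCHEMA)
```

Every input event ends up in exactly one of `good` or `bad`, which is what "non-lossy" means here: a failed event can still be inspected and fixed later.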
How do I use Snowplow?
With Snowplow Insights, we set up and maintain your pipeline. All you need to do is:
Decide what you want to track
Build custom models to start acting on the data
The only resource you need to begin using Snowplow is some time from a front-end developer who can paste our tracker code in your website, app and server as necessary. For out-of-the-box tracking, this should only take an hour, but for custom tracking it can take up to a day.
To really derive value from Snowplow data, you will need data analysts who can use SQL to build data models.
What do I track?
With Snowplow, you can track entities as well as events. The next post in this series covers this topic in more depth, including an explanation of entities.
For now, let’s assume you’ve worked with our Implementation Engineering team to set up tracking following our best practices and look at what the data would look like.
What will the data actually look like?
Let’s take an example of someone looking for parts to refurbish their snow plow before winter comes.
This shows a very simplified user journey as it would appear in your data warehouse. Data from all trackers is loaded into one table; the image above shows what a subset of your BigQuery columns may look like.
Note: remember that only 3 of the 130 out-of-the-box properties are shown here. Each event can also come with timestamps, weather, location, device, cookies, marketing campaign and much more. In addition, each custom event and entity can have many more properties; only a subset is shown here. Warning: the name is shown as an example field to make the blog post more readable. Always be cautious when collecting PII.
These are the actions that correspond with the data in this table:
Someone goes on their laptop and visits the site. We know what site they visited and on which browser
A few minutes later they register and we know their name is Joe. We know they are the same user as they have the same cookie
They then search for “wheels”. We know that they were served only 2 results, “wheel_set” and “wheel”, along with their respective prices
Joe clicked on the wheel_set
After scrolling the page (scroll measured with pings, not shown in table), Joe adds the wheel_set to their basket
A few days later, Joe opens an email on the laptop (we know it’s Joe because the 3rd party cookie is the same) with a Black Friday coupon. We can capture which campaign it was part of
Joe was busy earlier, so later that day Joe logs in on the iOS app and decides to check out using the Black Friday coupon
Joe checks out with two items, the wheel_set and a plow (previously added to basket), and a transaction ID is logged
An event from the server is also logged with the same transaction ID, showing the discounted price with 30% off from the coupon: ($499.99 + $3499.99) x 0.7 = $2799.99
Sadly, a week later, Joe is unhappy with the plow and returns it. This event is logged on the server with the same transaction ID
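As a rough illustration of how the steps above tie together, the Python sketch below stitches a handful of simplified rows into one journey using a shared cookie, a user ID and a transaction ID, then reproduces the coupon arithmetic. The row layout and every field name (`cookie_id`, `user_id`, `transaction_id`) are assumptions invented for this example; real Snowplow columns differ, and in practice this kind of stitching would more likely be done in SQL in your warehouse.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical, heavily simplified rows in time order.
events = [
    {"event": "page_view", "platform": "web", "cookie_id": "abc"},
    {"event": "register", "platform": "web", "cookie_id": "abc", "user_id": "joe"},
    {"event": "search", "platform": "web", "cookie_id": "abc"},
    {"event": "add_to_basket", "platform": "web", "cookie_id": "abc"},
    {"event": "email_open", "platform": "email", "cookie_id": "abc"},
    {"event": "checkout", "platform": "ios", "user_id": "joe", "transaction_id": "t1"},
    {"event": "transaction", "platform": "server", "transaction_id": "t1"},
    {"event": "return", "platform": "server", "transaction_id": "t1"},
]

# Step 1: learn which cookies and transactions belong to which user.
cookie_to_user = {
    e["cookie_id"]: e["user_id"]
    for e in events
    if e.get("cookie_id") and e.get("user_id")
}
txn_to_user = {
    e["transaction_id"]: e["user_id"]
    for e in events
    if e.get("transaction_id") and e.get("user_id")
}


def resolve_user(e):
    """Prefer an explicit user ID, then fall back to cookie, then transaction."""
    return (
        e.get("user_id")
        or cookie_to_user.get(e.get("cookie_id"))
        or txn_to_user.get(e.get("transaction_id"))
    )


# Step 2: group every event under the user it resolves to.
journey = defaultdict(list)
for e in events:
    journey[resolve_user(e)].append(e["event"])

# Step 3: the coupon arithmetic from the server event (30% off both items).
list_prices = [Decimal("499.99"), Decimal("3499.99")]
discounted = (sum(list_prices) * Decimal("0.7")).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)
```

Here `journey["joe"]` holds all eight events across web, email, iOS and server, and `discounted` comes out at 2799.99, matching the price in the example. `Decimal` is used rather than floats so the rounding to cents is explicit.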
Hopefully you can now begin to see what Snowplow can do. The tool collects and delivers great raw data. What you do next is in the hands of your data team.
What can we do with the data?
Use this wealth of data to drive ROI by:
Reducing marketing spend
Increasing conversion rates
Minimizing revenue loss due to bugs
Reducing costs due to fraud
Driving revenue by optimizing what you sell
What you do with the data depends on how developed your data team is. Let’s take a look at 3 degrees of data team maturity. Follow the links to read a full post on how a team of each size in the retail sector could consume Snowplow data (note that each post assumes you have read the previous ones):