Introducing Snowplow Micro: Validate your tracker setup in your automated test suite


Jennifer was startled. The numbers did not make sense. Whilst transaction volumes and values looked stable over the last two weeks, the number of add-to-basket events recorded had declined dramatically six days ago. Jennifer quickly dove into the data to look at the transaction item level: for all the items that had been bought, could she identify corresponding add-to-basket events? After all, it was only possible to purchase an item that had previously been added to basket. The data told a different story: she identified a high proportion of sessions where items had been bought but no preceding add-to-basket event had been recorded. A further look at the user-level rather than session-level data confirmed that these users had not purchased items that had been added to basket (and saved) in previous sessions either. The add-to-basket events were missing. Jennifer’s job was now to identify:

The above scenario, where tracking that has been running successfully breaks for some reason, is sadly not uncommon. Tracking SDKs are typically put live by companies that want to collect data in a range of different places: on their web front ends, on their server-side systems, in their mobile apps. All of the applications where tracking has been enabled will be constantly evolving as new versions are rolled out. With each new version there is a risk that an unintended consequence, or side effect, of one of those updates will be to break one aspect of the tracking setup.

In the example above, Jennifer is actually pretty lucky: she spotted the issue after just six days. If the missing data is in the bad rows, she will be able to recover it. If the error meant the data was never sent, she’s “only” got a six-day gap: sometimes companies discover breaks in data collection weeks or even months late. These issues can be a lot worse. If they cause the business to lose value in the data, all the painstaking work performed by the different members of the data team to build that data set and use it to drive insight and action will come to nought. The stakes couldn’t be higher.

The purpose of Snowplow Micro, an experimental new service which we have just released, is to prevent the above situation from ever arising.




Validating your tracker setup with automated testing

Modern, digitally savvy businesses are constantly evolving their websites, mobile apps and services. With every new release of a website, app or service, there is a risk that a new bug will be introduced. That bug might affect anything, and data collection is just one area where it might emerge.

Engineers have a very good solution to prevent this from happening: automated test suites. Whenever a new piece of functionality is built, a corresponding set of tests is written to validate that the new functionality works as expected. These tests are configured to run automatically every time a new version of the application is deployed, and if any of them fail, the deployment process is halted until the issue is resolved. This means that by writing comprehensive sets of tests, engineers can release updates to digital products frequently, with confidence that none of the updates have inadvertently broken anything.

Unfortunately, up until now it has not really been possible to write tests that validate that data collection has been set up properly. That is because to check data collection, it is necessary to fire a set of events, process them and see if the output of that processing is as expected, including for example that:

With Snowplow Micro, all of that is now possible. You can make sure that new versions of your digital products do not break data collection.
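To make the "fire events, process them, check the output" loop more concrete, here is a minimal sketch of the kind of check a test suite might run once the application under test has sent its events to a local Micro instance. It is an illustration under stated assumptions rather than official usage: the base URL, the `/micro/all` path and the shape of the counts summary are not taken from this post and should be checked against the Snowplow Micro documentation, and the helper names `fetch_counts` and `check_counts` are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed address of a locally running Snowplow Micro container (hypothetical).
MICRO_URL = "http://localhost:9090"


def fetch_counts(base_url=MICRO_URL):
    """Fetch a summary of the events Micro has received.

    The /micro/all path is an assumption; verify it against the Micro docs.
    """
    with urlopen(f"{base_url}/micro/all") as resp:
        return json.load(resp)


def check_counts(counts, expected_good):
    """Return a list of human-readable problems found in a counts summary.

    `counts` is assumed to look like {"total": 3, "good": 3, "bad": 0}.
    An empty list means the tracking behaved as the test expected.
    """
    problems = []
    bad = counts.get("bad", 0)
    good = counts.get("good", 0)
    if bad > 0:
        problems.append(f"{bad} event(s) failed validation")
    if good != expected_good:
        problems.append(f"expected {expected_good} good event(s), got {good}")
    return problems
```

In a test, after driving the application through a scenario that should emit three events, the check could be as simple as `assert check_counts(fetch_counts(), expected_good=3) == []`. Keeping the comparison in a pure function separate from the HTTP call means the logic can be unit-tested without a running container.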

What is Snowplow Micro?

Snowplow Micro is a very small Snowplow pipeline, with a few added extras:

What does this mean?

How can I get started?

Snowplow Micro is available from Docker Hub.

We are in the process of putting together some tutorials to show how to embed it in different test suites.

In the meantime, we’d love to see what you, our users, do with Snowplow Micro! Check it out today and let us know your feedback.
