Configure your Snowplow pipeline to collect complete and accurate data

In recent years, major browsers have released multiple changes that affect how cookies and event-tracking on the web work.

This post explains the potential impact of these changes and the configuration options available for your Snowplow pipeline that will help you maximize the accuracy and completeness of your web tracking.

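Examples of such changes include Safari’s Intelligent Tracking Prevention (ITP), Firefox’s Enhanced Tracking Protection (ETP) and the cookie-attribute rules introduced by Chromium-based browsers.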

These changes, alongside the increasing use of ad blockers, mean the data you collect about your users’ behavior on your site may be incomplete or inaccurate.

How might my tracking be affected?

If you are collecting data through event tracking on the web, there are a number of ways that these changes could affect your data collection:

Default instrumentations of Snowplow could be blocked by ad blockers and tracking prevention

The Snowplow pipeline makes an endpoint available that your data is streamed into. We know that this endpoint, along with other aspects of default Snowplow tracking instrumentations, is targeted by some ad-blocking providers, including the disconnect.me list that Firefox ETP uses to block tracking.

Ad blockers primarily try to target third-party tracking, e.g. Facebook spying on users across other sites, rather than legitimate tracking by companies that want to learn from data and improve the experience they offer. However, data collection for both purposes relies on similar technologies, so both get blocked.

The usefulness of client-side cookies is limited

With increasingly short expiration windows on third-party and first-party cookies set on the client side, analysing user journeys across any significant time window becomes challenging. Data collected with client-side cookies alone offers limited insight.

Cookies that are set on the server side, in a first-party context, aren’t governed by these short expiration windows. They allow a much longer timeframe in which to stitch user journeys together and better understand behaviour.

For example, suppose a user first visited your site from a social campaign 55 days ago, then went through a series of interactions with your brand before finally purchasing. This kind of analysis is impossible using client-side cookies.
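To make the effect of short cookie lifetimes concrete, here is a minimal sketch that counts how many separate "users" a single visitor would appear to be, given the days on which they visited and the lifetime of the identifying cookie. The model is deliberately simplified and is an assumption for illustration: the cookie is refreshed on every visit, and a new identifier is issued whenever the previous cookie has expired. The 7-day lifetime mirrors the cap Safari's ITP applies to cookies set from JavaScript; the 365-day lifetime and the function name count_apparent_users are hypothetical.

```python
def count_apparent_users(visit_days, cookie_lifetime_days):
    """Count how many distinct identifiers one real visitor ends up with.

    Simplified model (an assumption for illustration): the cookie is
    re-set on each visit and expires cookie_lifetime_days later. If a
    visit happens after expiry, a fresh identifier is issued.
    """
    apparent_users = 0
    expires_on = None  # day on which the current cookie expires
    for day in sorted(visit_days):
        if expires_on is None or day > expires_on:
            apparent_users += 1  # no valid cookie: looks like a new user
        expires_on = day + cookie_lifetime_days  # refresh on every visit
    return apparent_users


# One visitor: first visit, a return visit, then a purchase 55 days later.
journey = [0, 20, 55]
short_lived = count_apparent_users(journey, cookie_lifetime_days=7)
long_lived = count_apparent_users(journey, cookie_lifetime_days=365)
print(short_lived, long_lived)
```

With the short lifetime, each visit in this journey falls after the previous cookie has expired, so the purchase cannot be tied back to the original campaign visit. With the long lifetime, all three visits share one identifier.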

The Snowplow collector sets a server-side cookie with a set of associated attributes, which can give you the benefits of first-party, server-side tracking if your pipeline is correctly configured.

Browsers are introducing rules about how they treat cookies based on the attributes of those cookies, with Chromium-based browsers leading this effort.

This means that the values of attributes like sameSite, httpOnly and secure have to be set correctly to ensure that cookies are set properly on different browsers: an incorrect setting may result in cookies not being set on one or more browsers, resulting in misleading data. For example, two actions performed by the same user will be reported as if they were performed by two separate users.

We go into detail on what each attribute means in a previous post.
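As a rough illustration of how these attributes appear in practice, the sketch below builds a Set-Cookie header value using Python's standard http.cookies module. It is not the Snowplow collector's implementation. The function name build_set_cookie, the cookie name sp, the cookie value and the domain acme.com are all assumptions made for the example. The one rule it enforces is that Chromium-based browsers reject cookies marked SameSite=None unless they also carry the Secure attribute.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, domain, max_age_days,
                     same_site="None", secure=True, http_only=True):
    """Return a Set-Cookie header value with the given attributes.

    Hypothetical helper for illustration only. Requires Python 3.8+,
    where http.cookies gained support for the SameSite attribute.
    """
    if same_site == "None" and not secure:
        # Chromium-based browsers drop SameSite=None cookies without Secure.
        raise ValueError("SameSite=None requires the Secure attribute")

    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["domain"] = domain
    morsel["path"] = "/"
    morsel["max-age"] = max_age_days * 24 * 60 * 60  # lifetime in seconds
    morsel["samesite"] = same_site
    if secure:
        morsel["secure"] = True
    if http_only:
        morsel["httponly"] = True
    return morsel.OutputString()


# Example values are assumptions, not settings taken from a real pipeline.
header_value = build_set_cookie("sp", "abc123", "acme.com", max_age_days=365)
print("Set-Cookie:", header_value)
```

Rejecting the invalid combination up front is a design choice for the sketch: a misconfigured cookie fails loudly here, whereas in a browser it would simply not be stored.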

How can I collect more accurate and complete data?

Snowplow has configuration options that help you to collect complete and accurate behavioral data.

Firstly, here’s how you can check your current configuration

Insights Console gives you a quick and easy way to review how your pipeline is configured. Log in to the console, find the pipeline whose configuration you’d like to check and navigate to Pipeline configuration.

Configure custom collector endpoints to track first-party

We recommend setting your collector to track from a first-party domain so it can set a first-party server-side cookie; these cookies are unaffected by prevention methods such as ITP and ETP.

Ideally, you want your collector endpoint to be on the same root domain as the site you are tracking from. For example, if you are tracking on acme.com, a good endpoint would be spc.acme.com.

For first-party tracking to work, you’ll also need to set a cookie domain for the primary domain of your collector.
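The two conditions above (collector on the same root domain as the site, and a cookie domain covering both) can be sketched as a simple check. This is a naive illustration: it compares hostnames by suffix against a root domain you pass in explicitly, and does not consult the Public Suffix List as browsers do. The function names and the hostname collector.tracker-vendor.net are hypothetical.

```python
def is_under_domain(host, domain):
    """True if host equals domain or is a subdomain of it."""
    host = host.lower().strip(".")
    domain = domain.lower().strip(".")
    return host == domain or host.endswith("." + domain)


def is_first_party(site_host, collector_host, cookie_domain):
    """True if both the site and the collector sit under the cookie domain.

    Naive sketch: the cookie domain is taken on trust as the root domain.
    """
    return (is_under_domain(site_host, cookie_domain)
            and is_under_domain(collector_host, cookie_domain))


# spc.acme.com shares the acme.com root with the site: first-party.
print(is_first_party("www.acme.com", "spc.acme.com", "acme.com"))
# A collector on an unrelated domain (hypothetical name): third-party.
print(is_first_party("www.acme.com", "collector.tracker-vendor.net", "acme.com"))
```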

You can check how your domains are configured by visiting Pipeline Configuration in Insights Console and checking the settings for Domains.

If you need to set up a new collector domain, you will need to:

Configure custom collector paths to avoid tracking being blocked

You’ll want to set custom request paths to avoid your tracking being blocked by tracking prevention measures and ad-blockers.

You can check whether you have set custom POST paths by visiting Pipeline configuration and checking the settings for Tracking request paths.

If you need to set up a new request path, you will need to:

If you are using the Iglu webhook to track events, you can also set a custom path for this tracking.
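The idea behind custom request paths can be sketched as a lookup that maps a path of your choosing onto the collector's well-known default path. This is an illustration of the concept, not the collector's actual routing code. The custom paths /com.acme/track and /com.acme/iglu and the function name resolve_path are hypothetical; the two default paths are the standard Snowplow POST and Iglu webhook paths.

```python
# Default paths that block lists can match on.
DEFAULT_POST_PATH = "/com.snowplowanalytics.snowplow/tp2"
DEFAULT_IGLU_PATH = "/com.snowplowanalytics.iglu/v1"

# Hypothetical custom paths, each mapped to the default it stands in for.
CUSTOM_PATHS = {
    "/com.acme/track": DEFAULT_POST_PATH,
    "/com.acme/iglu": DEFAULT_IGLU_PATH,
}


def resolve_path(request_path, custom_paths=CUSTOM_PATHS):
    """Translate a custom request path to its default; pass others through."""
    return custom_paths.get(request_path, request_path)


# The tracker posts to the custom path; the server handles it as the default.
print(resolve_path("/com.acme/track"))
```

Because the browser only ever sees the custom path, a block rule written against the default path no longer matches the request.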

To ensure cookies are set properly on different browsers you’ll want to set the right attributes against your cookies.

You can check how your cookie attributes are currently set by visiting Pipeline configuration and checking the settings for Cookie attributes.

If you need to change your cookie attributes, you will need to:

In conclusion…

For companies that want to track first-party behaviour on their own websites, reliable tracking is possible, but it requires more work to set up your Snowplow infrastructure, especially with respect to cookies.

This post helps Snowplow users understand the options and the implications of different settings, so you can set your pipeline up to collect the most accurate and complete data set from the web.

If you are already using Snowplow Insights, check your configuration in the Insights Console and speak to your Customer Service Manager if you have any questions.

If you are not yet using Snowplow and are interested in finding out more about how you can increase the completeness and quality of the data you collect from your digital products, reach out today.

