This is an 8-part series
Chapter 1 The state of web analytics in 2021
Chapter 2 Privacy updates, ad blockers, and the need for 1st-party tracking
Chapter 3 Building a web analytics stack: packaged vs modular
Chapter 4 The best-in-class tools for web analytics
Chapter 5 Redefining web analytics metrics
Chapter 6 Data modeling for web analytics
Chapter 7 Snowplow for web analytics
Chapter 8 How Welcome to the Jungle took ownership of their web data with Snowplow
The proliferation of privacy tools and ad-blocking software has made the job of web analysts increasingly difficult. This technology obscures many of the actions and behaviors that constitute the rich behavioral data visitors generate on your website. Features like private browsing modes and ad blockers were implemented to help protect users from the most egregious intrusions into their browsing. Now, many of these privacy measures are baked into browsers or enabled by default.
If your recommendation engine or marketing attribution model relies on detailed behavioral data, and you rely on a packaged analytics solution for collection, the unfortunate reality is that you’re missing data (as much as 20%). But “tracking” has become a dirty word: what looks to some like well-intentioned data collection to improve site navigation may seem like unwelcome surveillance to others. As web analysts, we have to bridge the gap between our users (whose privacy deserves respect) and our organizations (which require accurate data).
The Browser Wars
After what seems like a never-ending stream of high-profile data breaches, coupled with increasingly strict privacy laws around the world, web browsers are locking down their user data and placing restrictions on who can access that data and how. None of this is without reason. Regulators are seizing any opportunity to question organizations over their tracking practices, and users are becoming more educated about how their data is being used. The result is that visitors to your website expect the same high-quality experience while retaining full control over what data they do or do not wish to share.
Enhanced browser privacy and built-in ad blocking make collecting meaningful behavioral web data difficult. This is the result of masking information about the user and their browsing history, controlling activity logs, and restricting cookies. Pre-packaged analytics platforms are often blocked by default, a result of the context in which the tracking event occurs (more on this below). While out-of-the-box tools can be helpful for organizations early in their analytics journeys, the increasing focus on user privacy that restricts third-party tracking makes a compelling case for first-party tracking.
Web browsers are implementing clever features designed both to protect users and to prevent websites from tracking them. Behind the scenes, browsers are removing tracking parameters from URLs, stripping or spoofing referral IDs, and setting strict limits on how websites can interact with a user’s browser storage via cookies.
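To make the first of those behaviors concrete, here is a minimal Python sketch of removing tracking parameters from a URL. The parameter list is an illustrative assumption, not the list any particular browser ships, and the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative assumption: a few commonly seen tracking parameters.
# Real browsers maintain their own lists and matching rules.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}


def strip_tracking_params(url: str) -> str:
    """Return the URL with any query parameter in TRACKING_PARAMS removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking_params(
    "https://example.com/shoes?color=red&utm_source=news&fbclid=abc"
))
# https://example.com/shoes?color=red
```

If your attribution depends on those parameters arriving intact, this is the kind of silent loss to look for.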
As a refresher, cookies are small pieces of data that a website stores in the browser to maintain state as a visitor navigates from one page to the next. Cookies make sure your visitors stay logged in and keep their items in the shopping cart as they browse your website. Cookies are often referred to as first or third party, but it’s more accurate to describe their context: the circumstances under which the cookie was written to a visitor’s browser. From Cookie Status:
Here, “eTLD+1” refers to the effective top-level domain plus one part. For example, snowplowanalytics.com is the eTLD+1 for blog.snowplowanalytics.com: the effective top-level domain (com) plus one more label. Cookies with a first-party context (“first-party” cookies) occur between pages that share an eTLD+1, e.g. navigating from blog.snowplowanalytics.com/post_1 to blog.snowplowanalytics.com/post_2. Third-party context occurs between pages that don’t share an eTLD+1, like your email service provider’s subscription form popping up in an iframe or a restaurant’s menu PDF being served directly from S3 via s3.amazonaws.com.
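The eTLD+1 comparison can be sketched in a few lines of Python. This is a simplified illustration: browsers consult the full Public Suffix List, whereas the three-entry `PUBLIC_SUFFIXES` set and both function names below are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Assumption: a tiny stand-in for the Public Suffix List.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def etld_plus_one(hostname: str) -> str:
    """Return the effective top-level domain plus one label."""
    labels = hostname.lower().split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{hostname!r} is itself a public suffix")
            return ".".join(labels[i - 1:])
    raise ValueError(f"no known public suffix for {hostname!r}")


def is_first_party_context(page_url: str, resource_url: str) -> bool:
    """True when both URLs share the same eTLD+1."""
    page_host = urlsplit(page_url).hostname
    resource_host = urlsplit(resource_url).hostname
    return etld_plus_one(page_host) == etld_plus_one(resource_host)


print(etld_plus_one("blog.snowplowanalytics.com"))
# snowplowanalytics.com
print(is_first_party_context(
    "https://blog.snowplowanalytics.com/post_1",
    "https://blog.snowplowanalytics.com/post_2",
))
# True
```

A production check would load the real Public Suffix List rather than hard-coding suffixes, since entries such as co.uk change where the "plus one" label falls.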
Despite such an innocuous name, cookies wield a remarkable amount of power to inform and alter a user’s web browsing experience, so it’s no surprise they’re at the center of some of the most robust privacy initiatives in modern web development.
The way the (third-party) cookie crumbled
The two most significant browser privacy initiatives (in intent if not in scope) currently impacting web analysts are Apple’s Intelligent Tracking Prevention and Mozilla’s Enhanced Tracking Protection.
Intelligent Tracking Prevention (ITP)
Apple introduced ITP in Safari to curb intrusive cross-site tracking by ad tech companies. Ad tech companies responded by moving to first-party cookies set client-side, which happens to be the same mechanism that many packaged analytics tools use to identify site visitors. ITP’s restrictions eventually impacted cookies set by analytics providers as well: version 2.1 of ITP capped cookies set client-side in Safari at seven days, and version 2.2 cut that cap to just one day when the visitor arrives through a decorated link from a domain classified as a known tracker.
This means if someone visits your website to browse your products and comes back ten days later and makes a purchase, that second visit looks like a new person to your analytics.
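The effect is easy to reproduce with a small simulation. The sketch below is illustrative only: the function name is hypothetical, and it assumes each visit rewrites the identifying cookie with a fresh expiry.

```python
from datetime import datetime, timedelta


def count_apparent_visitors(visit_times, cookie_lifetime):
    """Count how many distinct visitors one real person appears to be.

    Assumption: every visit rewrites the cookie, restarting its lifetime.
    """
    apparent_visitors = 0
    cookie_expires_at = None
    for visit in sorted(visit_times):
        if cookie_expires_at is None or visit > cookie_expires_at:
            # No cookie, or it has expired: a new visitor ID is issued.
            apparent_visitors += 1
        cookie_expires_at = visit + cookie_lifetime
    return apparent_visitors


first_visit = datetime(2021, 1, 1)
visits = [first_visit, first_visit + timedelta(days=10)]

print(count_apparent_visitors(visits, timedelta(days=7)))    # 2
print(count_apparent_visitors(visits, timedelta(days=365)))  # 1
```

With a seven-day cap, the browse on day 0 and the purchase on day 10 are attributed to two different people; with a longer-lived cookie they are stitched together.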
Enhanced Tracking Protection (ETP)
Mozilla introduced Enhanced Tracking Protection into its Firefox browser in 2018 and enabled the privacy-focused suite of features by default in 2019. Similar to ITP, ETP blocks third-party tracking cookies. As of version 2.0, Firefox deletes tracking cookies every 24 hours, as opposed to Apple’s generous seven days. ETP extends a grace period for websites you visit frequently, like search engines or social media, storing those first-party cookies for 45 days (or indefinitely, depending on how often you visit the site).
Privacy in Google Chrome and Microsoft Edge
Apple and Mozilla are not alone in developing advanced privacy tools for their browsers. While Google Chrome doesn’t currently offer as many options as other browsers, the company announced a new privacy initiative in January 2020 intended to give users greater control over their data, calling it “a path to making third-party cookies obsolete.” Microsoft’s Edge browser offers tracking protection by default, with the recommended settings functioning similarly to ITP and ETP but without an expiration period.
Even if you don’t advertise, ad blockers can have a significant impact on your web analytics. Ad blockers function like other tracking prevention: as a page loads, they check each script against a list of domains to block. Depending on the implementation and the ad blocker, tracking scripts from Google Analytics or other on-page analytics platforms can be caught by the filters.
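A rough Python sketch of that domain check follows. Real ad blockers use much richer filter syntax, so treat the `BLOCKLIST` contents, the function name, and the parent-domain matching rule as assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumption: a two-entry blocklist standing in for a real filter list.
BLOCKLIST = {"google-analytics.com", "tracker.example"}


def is_blocked(script_url: str, blocklist=BLOCKLIST) -> bool:
    """True when the script's host, or any parent domain, is on the list."""
    host = urlsplit(script_url).hostname or ""
    labels = host.split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


print(is_blocked("https://www.google-analytics.com/analytics.js"))  # True
print(is_blocked("https://collector.mysite.example/sp.js"))         # False
```

The second URL is a hypothetical collector served from the site's own domain, which is why a domain-based list does not match it.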
The impact on third-party tracking
All of the signs are clear: Google is not alone on the path to making the third-party cookie obsolete. By global market share as of January 2021, the browsers discussed above account for over 80% of web users.
Browser-based privacy features and the prevalence of ad blockers (over 40% of internet users employ some form of ad blocking) mean that organizations relying on many analytics platforms will have to rethink their data analytics, digital marketing, marketing attribution, and personalization strategies. Put another way: companies have found that Safari users spend more on average than Chrome users, so if your analytics solution can’t track Safari, you’re losing 25% of your most lucrative visitors.
Collect complete behavioral data with first-party tracking
Your analytics will have distortions and gaps if you rely on cookies set in a third-party context or by many known tracking and analytics services. First-party data collection platforms like Snowplow use cookies set server-side (first-party context), leaving them unaffected by ITP, ETP, or most other tracking prevention. In an experiment run by Moz, measuring the traffic obscured by ad blockers or browser tracking prevention revealed a discrepancy in volume of anywhere from 5% to 30%.
Because they are not limited by browser-imposed expiration caps, cookies set server-side provide a source of rich, detailed behavioral data that businesses can use to make more informed decisions. Just as important, setting cookies this way preserves user privacy. Cookies set server-side are currently the most reliable way to track anonymous visitors to your website.
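For readers who have not set a cookie from the server before, the sketch below builds the `Set-Cookie` header value a first-party collector might send. It uses Python’s standard `http.cookies` module; the cookie name `visitor_id`, the one-year lifetime, and the chosen attributes are assumptions for illustration, not Snowplow’s actual configuration.

```python
import uuid
from http.cookies import SimpleCookie

SECONDS_PER_DAY = 24 * 60 * 60


def build_visitor_cookie_header(domain, visitor_id=None, max_age_days=365):
    """Return a Set-Cookie header value for a server-set visitor ID.

    Hypothetical names and defaults; adjust to your own pipeline.
    """
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id or str(uuid.uuid4())
    morsel = cookie["visitor_id"]
    morsel["domain"] = domain          # your own site's domain (first-party)
    morsel["path"] = "/"
    morsel["max-age"] = max_age_days * SECONDS_PER_DAY
    morsel["secure"] = True            # only sent over HTTPS
    morsel["httponly"] = True          # not readable from page JavaScript
    morsel["samesite"] = "Lax"         # requires Python 3.8+
    return morsel.OutputString()


print(build_visitor_cookie_header("example.com", visitor_id="abc123"))
```

The server attaches this value to its HTTP response as a `Set-Cookie` header, so the cookie is written by your own domain rather than by a script running in the page.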
First-party, server-side tracking is a win for users and businesses
First-party tracking meets the privacy standards Apple, Mozilla, Google, and others set to protect their users while preserving behavioral data integrity. Because server-side tracking occurs in a first-party context, cookies set this way are unaffected by modern browser privacy measures. Controlling your data pipeline end to end also significantly reduces the likelihood of a third-party breach exposing any of your visitor data.
Organizations that use intentionally designed first-party tracking solutions, as Welcome to the Jungle does with Snowplow, collect high-quality behavioral data to deliver the best experiences for their customers. When you do data collection and analysis right, your users’ positive experience should be as rewarding to them as their data is to you, the collector.
Because first-party tracking can collect behavioral data without unnecessary personally identifiable information or information otherwise locked behind ad blockers or ITP/ETP, you can still benefit from rich, behavioral data, while your visitors maintain their privacy.
This post was written with the help of Freelance Content Writer, Anthony Mandelli.