We are delighted to announce a new release of the Snowplow AMP Tracker. Version 1.0 of the tracker introduces a host of new functionality to Snowplow tracking on the AMP platform:
- Page ping tracking
- Custom events and entities
- The AMP web page standard entity
- The AMP ID standard entity for user identification
- Linking users across AMP and non-AMP pages
It also overhauls some existing functionality to resolve issues caused by misalignment between tracking on the AMP platform and tracking on a traditional web page.
This release introduces breaking changes to how data is tracked - users are encouraged to migrate to the latest version of the AMP tracker as soon as possible.
Read on below the fold for:
- Aggregating Page Views
- Mapping User Journeys
- Custom Events
- Custom Entities
- Changes to existing behaviour
1. Aggregating Page Views
1.1 Page Ping Tracking
Once enabled, page pings are sent as AMP-specific page ping events, validated against the AMP page ping schema. Each event contains the following fields, as defined in the AMP documentation for variable substitutions: scrollLeft, scrollWidth, viewportWidth, scrollTop, scrollHeight, viewportHeight, totalEngagedTime.
Scroll percentage and engaged time metrics can be calculated by aggregating on these values. Additionally, this method can be used in combination with the new AMP web page entity to aggregate to a page view level.
Documentation can be found in the page ping section of the AMP tracker docs.
1.2 Web Page Entity
The AMP web page entity has been introduced and is attached to every event by default, validated against the AMP web page schema. It contains the ampPageViewId.
Note that the value provided is AMP's Page View ID 64, which the AMP documentation describes as follows:
"Provides a string that is intended to be random with a high entropy and likely to be unique per URL, user and day."
1.3 Modeling Considerations
The scrollLeft and scrollTop fields provide the amount of scroll from the leftmost and topmost point of the page, in pixels. The scrollWidth/Height and viewportWidth/Height fields provide the size of the page, and size of the viewport, again in pixels. These values can therefore be aggregated together to calculate page scroll ratios in a similar but slightly different way to how this is done traditionally.
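As a minimal sketch of one way to turn these fields into a vertical scroll ratio, the Python below assumes that scroll depth is defined as the bottom edge of the viewport (scrollTop + viewportHeight) divided by scrollHeight, and that the deepest value across all pings is the one of interest. That definition, the AmpPagePing class and the function name are illustrative assumptions, not part of the tracker.

```python
from dataclasses import dataclass


@dataclass
class AmpPagePing:
    """Vertical fields of one AMP page ping, in pixels (hypothetical container)."""
    scroll_top: int
    scroll_height: int
    viewport_height: int


def vertical_scroll_ratio(pings):
    """Return the deepest fraction of the page seen across the pings (0.0 to 1.0)."""
    deepest = 0.0
    for ping in pings:
        if ping.scroll_height <= 0:
            continue  # skip pings without a usable page height
        # Bottom edge of the viewport, relative to the full page height.
        seen = (ping.scroll_top + ping.viewport_height) / ping.scroll_height
        deepest = max(deepest, min(seen, 1.0))
    return deepest


pings = [
    AmpPagePing(scroll_top=0, scroll_height=4000, viewport_height=800),
    AmpPagePing(scroll_top=1200, scroll_height=4000, viewport_height=800),
]
print(vertical_scroll_ratio(pings))  # 0.5
```

The same calculation applies horizontally with scrollLeft, viewportWidth and scrollWidth.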
The AMP Page View ID value provided is not unique to an instance of a page view, but rather is unique when combined with the URL, date, and AMP client ID, so data should be aggregated by the concatenation of these values, rather than the ID alone.
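The aggregation key described above might be built as in the following sketch. The field names of the input dictionaries, the "|" separator and the choice of which event date to use are assumptions made for illustration. The sketch also assumes that totalEngagedTime is cumulative within a page view, so the largest value per key is kept; if that does not hold for your data, a different aggregate would be needed.

```python
from collections import defaultdict
from datetime import date


def page_view_key(url, event_date, amp_client_id, amp_page_view_id):
    """Concatenate the fields that together identify one AMP page view."""
    return "|".join([url, event_date.isoformat(), amp_client_id, amp_page_view_id])


def engaged_time_by_page_view(pings):
    """Keep the largest totalEngagedTime seen for each composite key."""
    engaged = defaultdict(int)
    for ping in pings:
        key = page_view_key(
            ping["url"], ping["date"], ping["ampClientId"], ping["ampPageViewId"]
        )
        engaged[key] = max(engaged[key], ping["totalEngagedTime"])
    return dict(engaged)


# Hypothetical pings: two users share the same ampPageViewId on the same day.
pings = [
    {"url": "https://example.com/a", "date": date(2020, 1, 1),
     "ampClientId": "amp-user-1", "ampPageViewId": "1234", "totalEngagedTime": 10},
    {"url": "https://example.com/a", "date": date(2020, 1, 1),
     "ampClientId": "amp-user-1", "ampPageViewId": "1234", "totalEngagedTime": 25},
    {"url": "https://example.com/a", "date": date(2020, 1, 1),
     "ampClientId": "amp-user-2", "ampPageViewId": "1234", "totalEngagedTime": 5},
]
result = engaged_time_by_page_view(pings)
print(len(result))  # 2
```

Grouping on ampPageViewId alone would have merged the two users into a single page view.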
2. Mapping User Journeys
User identification using AMP is now significantly improved, and functionality has been introduced to allow easier identification of users across AMP and traditional web pages. The AMP ID entity facilitates identifying users who move from traditional pages to AMP, the AMP linker is now used to map users from AMP to traditional pages, and the AMP client ID is now attached to all events as distinct from other user ids.
2.1 AMP ID Entity
The AMP ID entity is a new feature to Snowplow tracking, with a specific purpose of making it easier to identify users on the AMP platform, and across AMP and non-AMP pages. By default, the AMP ID entity will be attached to all events, against the AMP ID schema.
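This will contain the ampClientId (the consistent identifier for users on the AMP platform), the user_id if set, and the domain_userid if it can be found. When a user moves from a traditional web page to an AMP page, the link should attach the domain_userid to the querystring, and the AMP tracker will automatically retrieve it and attempt to retain it across pages.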
Note that while the AMP tracker attempts to retain the domain userid across pages and sessions, the AMP platform does not offer any means to guarantee that this can be done in all cases - so the user identification strategy here is based on having the value attached to at least one event, rather than all events.
2.2 AMP Linker
The AMP tracker now offers the ability to attach the AMP client ID to the querystring, in order to identify users moving from an AMP page to a non-AMP page. This is enabled by ensuring that the AMP linker is enabled for any destination domains required:
This will add a querystring parameter ‘linker=’ to the destination url, which contains the amp_id value, base-64 encoded.
2.3 Modeling Considerations
The value attached to the querystring by the AMP linker will look something like this:
?linker=1*1c1wx43*amp_id*amp-a1b23cDEfGhIjkl4mnoPqr
To extract the AMP ID, this must be parsed, and the value immediately following the amp_id* string must be extracted.
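A minimal sketch of that parsing step, using only the Python standard library, could look like this. The function name is hypothetical, and the sketch assumes the linker value is a "*"-separated list in which the AMP ID directly follows the amp_id token, as in the example above.

```python
from urllib.parse import parse_qs, urlsplit


def extract_amp_id(page_url):
    """Return the value that follows 'amp_id' in the linker parameter, or None."""
    query = parse_qs(urlsplit(page_url).query)
    linker_values = query.get("linker")
    if not linker_values:
        return None  # no linker parameter on this URL
    parts = linker_values[0].split("*")
    # Walk the '*'-separated tokens and return the one right after 'amp_id'.
    for index, part in enumerate(parts[:-1]):
        if part == "amp_id":
            return parts[index + 1]
    return None


url = "https://example.com/page?linker=1*1c1wx43*amp_id*amp-a1b23cDEfGhIjkl4mnoPqr"
print(extract_amp_id(url))  # amp-a1b23cDEfGhIjkl4mnoPqr
```

If the extracted value turns out to be base-64 encoded, it would need decoding as a further step, which is not shown here.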
The linker value will only be present for the first page the user lands on after leaving the AMP page. Likewise, even when the domain_userid is found by the AMP tracker, it is not guaranteed to be attached to every event. Therefore, a good strategy for modeling user identification on both sides is to create a mapping table of domain_userid to amp-id, and join this to the rest of the data to attribute users.
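The mapping-table strategy can be sketched in Python as follows. The event dictionaries, their field names and both function names are assumptions for illustration; in practice this would more likely be a join in the data warehouse. When an AMP ID is seen with more than one domain_userid, this sketch simply keeps the first.

```python
def build_user_mapping(events):
    """Map each AMP client ID to a domain_userid seen on at least one of its events."""
    mapping = {}
    for event in events:
        amp_id = event.get("ampClientId")
        domain_userid = event.get("domain_userid")
        if amp_id and domain_userid:
            mapping.setdefault(amp_id, domain_userid)  # keep the first pairing seen
    return mapping


def attribute_user(event, mapping):
    """Resolve an event to a domain_userid where possible, else fall back to the AMP ID."""
    amp_id = event.get("ampClientId")
    return event.get("domain_userid") or mapping.get(amp_id) or amp_id


# Hypothetical events: only one of the first user's events carries a domain_userid.
events = [
    {"ampClientId": "amp-abc", "domain_userid": None},
    {"ampClientId": "amp-abc", "domain_userid": "duid-1"},
    {"ampClientId": "amp-abc", "domain_userid": None},
    {"ampClientId": "amp-xyz", "domain_userid": None},
]
mapping = build_user_mapping(events)
attributed = [attribute_user(event, mapping) for event in events]
print(attributed)  # ['duid-1', 'duid-1', 'duid-1', 'amp-xyz']
```

One event carrying both identifiers is enough to attribute every other event with the same AMP ID.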
3. Custom Events
Custom events and entities can now be tracked using the AMP tracker - read more about the general topic of custom tracking with Snowplow in the documentation.
Custom events are sent by instrumenting the selfDescribingEvent request in a trigger, and passing customEventSchemaVendor, customEventSchemaName, customEventSchemaVersion, customEventSchemaData to it as variables - where customEventSchemaData is an escaped JSON string, as follows:
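As a minimal sketch of the escaping step only, the Python below serializes a payload into a string and then serializes the set of variables, which produces the escaped form needed when the value is placed inside a JSON configuration. The vendor, name, version and payload values are hypothetical, and the sketch does not show where these variables sit within the trigger configuration.

```python
import json

# Hypothetical event payload; it must conform to the custom schema in use.
event_data = {"label": "sign-up", "value": 3}

variables = {
    "customEventSchemaVendor": "com.example",   # hypothetical vendor
    "customEventSchemaName": "button_click",    # hypothetical schema name
    "customEventSchemaVersion": "1-0-0",        # hypothetical version
    # First serialization: the payload becomes a plain JSON string.
    "customEventSchemaData": json.dumps(event_data),
}

# Second serialization: quotes inside the payload string are escaped (\").
config_fragment = json.dumps(variables, indent=2)
print(config_fragment)
```

Reading the fragment back with json.loads, and then parsing customEventSchemaData with json.loads again, returns the original payload.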
Documentation can be found in the custom event section of the AMP tracker docs.
4. Custom Entities
Custom entities can be attached to any event by assigning a full self-describing JSON, as an escaped JSON string, to a variable named customContexts. A single entity may be passed, or several may be passed if separated by commas. For example:
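A minimal sketch of building such a value in Python is shown below. The two schemas and their data are hypothetical, and the sketch covers only the construction of the customContexts string, not where the variable is placed in the tracking configuration.

```python
import json

# Hypothetical self-describing JSONs; the schema URIs are placeholders.
entities = [
    {"schema": "iglu:com.example/article/jsonschema/1-0-0",
     "data": {"category": "news"}},
    {"schema": "iglu:com.example/author/jsonschema/1-0-0",
     "data": {"name": "A. Writer"}},
]

# Serialize each entity and join them with commas (no surrounding brackets).
custom_contexts = ",".join(json.dumps(entity) for entity in entities)

# Serializing the variable escapes the inner quotes for use in a JSON config.
config_fragment = json.dumps({"customContexts": custom_contexts}, indent=2)
print(config_fragment)
```

Wrapping the decoded string in square brackets and parsing it as JSON gives back the original list of entities, which is a quick way to check the value is well formed.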
Note that custom entities may be assigned globally (i.e. for the entire tracking configuration rather than per trigger). However, once one is assigned globally, more may not be added individually per trigger.
Documentation can be found in the custom entities section of the AMP tracker docs.
5. Changes to Existing Behaviour
The following design decisions have been made in this tracker, to resolve issues with the design of the previous version:
- The domain userid field is no longer populated.
- Page URLs are now full URLs
The previous instrumentation used the canonical URL, which didn’t contain the querystring, and didn’t denote the domain from which the page is served. v1.0 of the AMP tracker uses the full ampdoc URL.
- Device Created timestamp is now set instead of Device Sent timestamp
AMP offers only one means of recording a timestamp, which is not likely to conform to the time of sending events. The device created timestamp is now used instead of device sent.