As you can see, in this excerpt we have a variety of different referers - some internal pages and some search pages (Google, Google Images, Bing Images and AOL).
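To make the distinction between internal and search referers concrete, here is a minimal Python sketch of how a referer URL might be classified. It is not Snowplow's actual referer-parsing logic: the `SEARCH_HOSTS` table, the hostnames in it, and the `classify_referer` function are all hypothetical and exist only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical lookup table for illustration only; a real referer
# database would cover far more hosts and search engines.
SEARCH_HOSTS = {
    "www.google.com": "Google",
    "images.google.com": "Google Images",
    "search.aol.com": "AOL",
}


def classify_referer(referer_url, page_host):
    """Return a (medium, source) tuple for a referer URL.

    medium is "internal" when the referer is on the same host as the
    page being viewed, "search" when the host is in SEARCH_HOSTS, and
    "unknown" otherwise.
    """
    host = urlparse(referer_url).hostname
    if host is None:
        return ("unknown", None)
    if host == page_host:
        return ("internal", None)
    if host in SEARCH_HOSTS:
        return ("search", SEARCH_HOSTS[host])
    return ("unknown", None)
```

For example, `classify_referer("http://www.google.com/search?q=snowplow", "www.example.com")` yields a search referer attributed to Google, while a referer on `www.example.com` itself is reported as internal.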
As with the 0.8.0 release, this new release assumes that you are running the Hadoop (Scalding) ETL and feeding your data into Redshift.
To upgrade to 0.8.1 from 0.8.0, follow these steps:
If you are using EmrEtlRunner, you need to update your configuration file, config.yml, to the latest version of the Hadoop ETL:
:snowplow:
  :hadoop_etl_version: 0.2.0 # Version of the Hadoop ETL
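If you want to double-check the setting before kicking off a run, a small script along these lines could read the version back out of the file. This is only a sketch: the `hadoop_etl_version` helper is hypothetical and not part of EmrEtlRunner, and it uses a simple regular expression rather than a full YAML parser.

```python
import re


def hadoop_etl_version(config_text):
    """Return the value of :hadoop_etl_version: in the config text, or None."""
    match = re.search(
        r"^\s*:hadoop_etl_version:\s*([^\s#]+)",
        config_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


# Sample config fragment matching the snippet above.
sample = ":snowplow:\n  :hadoop_etl_version: 0.2.0 # Version of the Hadoop ETL\n"

if hadoop_etl_version(sample) != "0.2.0":
    raise SystemExit("config.yml does not point at Hadoop ETL 0.2.0")
```

To run it against your own file, replace `sample` with the contents of your config.yml, for instance via `open("config.yml").read()`.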
We have updated the Redshift table definition; you can find the latest version in the GitHub repository here.
If you already have your Snowplow data in the previous version of the Redshift events table, then we have written a migration script to handle the upgrade. Please review this script carefully before running and check that you are happy with how it handles the upgrade.
Please also note that we have had to remove the "raw" referrer_url field from our Redshift events table for space reasons. This means that the historical data in your events table will lose all referer information unless you run a re-computation (see below).
If you would like to see referer details for historical Snowplow events (i.e. events already in your Snowplow events table in Redshift), then we recommend re-running your Snowplow ETL process across all of your historical raw data.
This is also advisable given that we have removed the raw referrer_url field from our Redshift table definition for space reasons.
To re-run your Snowplow ETL process across all your historical data, please see our answer to "I want to recompute my Snowplow events, how?" on the Troubleshooting wiki page.
And that’s it! Once you have made these changes, you should have Snowplow populating the referer details for all new events.
As always, if you do run into any issues or don’t understand any of the above changes, please raise an issue or get in touch with us via the usual channels.
You can see the full list of issues delivered in Snowplow 0.8.1 on GitHub.