There is a strong body of research suggesting that the weather is a major influence on the behavior of your end-users; for an example, see the paper The Effect of Weather on Consumer Spending (Murray, Di Muro, Finn, Leszczyc, 2010). To perform these kinds of analyses, it’s critical to attach the correct weather to each event before storing and analyzing those events in Redshift, Spark or similar.
Note that this release only adds this enrichment for the Snowplow Hadoop pipeline; we will be adding this to the Kinesis pipeline in the next release of that pipeline.
To use the new Weather Enrichment functionality you need to:
The example configuration JSON for this enrichment is as follows:
To go through each of these settings in turn:
apiKey is the key you need to obtain from OpenWeatherMap.org
cacheSize is the number of requests the underlying Scala Weather client should store. The number of requests allowed by your plan, plus 1% for errors, should work well
timeout is the time in seconds after which a request should be considered failed. Note that a failed weather enrichment will cause your whole enriched event to end up in the bad bucket
apiHost is set to one of several available API hosts; for most cases history.openweathermap.org should be fine
geoPrecision is the fraction of one degree to which geo coordinates will be rounded for storing in the cache. Setting this to 1 gives you ~60km inaccuracy (worst case); the most precise value of 10 gives you ~6km inaccuracy (worst case)
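To make the geoPrecision and cacheSize settings more concrete, here is a minimal Python sketch of the arithmetic they imply. It is not the Scala Weather client's code: the function names, the cache key shape and the 111.2 km-per-degree figure are assumptions made for illustration, and the client's actual rounding may differ in detail.

```python
import math

# Assumption: approximate length of one degree of latitude, in kilometres.
KM_PER_DEGREE_LATITUDE = 111.2


def round_coordinate(value: float, geo_precision: int) -> float:
    """Round a coordinate to the nearest 1/geo_precision of a degree."""
    return round(value * geo_precision) / geo_precision


def cache_key(lat: float, lon: float, geo_precision: int) -> tuple:
    """Hypothetical cache key: nearby events collapse to one rounded position."""
    return (
        round_coordinate(lat, geo_precision),
        round_coordinate(lon, geo_precision),
    )


def worst_case_latitude_error_km(geo_precision: int) -> float:
    """Rounding shifts a point by at most half a grid step along latitude."""
    return 0.5 / geo_precision * KM_PER_DEGREE_LATITUDE


def suggested_cache_size(plan_requests: int) -> int:
    """Plan request count plus 1% headroom for errors, rounded up."""
    return plan_requests + math.ceil(plan_requests / 100)


if __name__ == "__main__":
    for precision in (1, 10):
        print(precision, cache_key(48.8566, 2.3522, precision),
              worst_case_latitude_error_km(precision))
    print(suggested_cache_size(5000))
```

Under these assumptions, a geoPrecision of 1 moves a position by up to roughly 56 km along latitude alone, and a value of 10 by roughly 5.6 km, which is the same order of magnitude as the worst-case figures quoted above. A coarser precision means more events share a cached lookup, at the cost of attaching weather from a position further from the event.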
To take advantage of this new enrichment, update the “hadoop_enrich” jar version in the “emr” section of your configuration YAML:
Make sure to add a weather_enrichment_config.json configured as above into your enrichments folder too.
Finally, if you are using Snowplow with Amazon Redshift, you will need to deploy the following table into your database:
For more details on this release, please check out the R74 European Honey Buzzard release notes on GitHub. Specific documentation on the new enrichment is available here:
If you have any questions or run into any problems, please raise an issue or get in touch with us through the usual channels.
By popular demand, we are adding a section to our release blog posts to trail upcoming Snowplow releases. Note that these releases are always subject to change between now and the actual release date.
Upcoming releases are: