We are happy to announce the release of Snowplow 68, Turquoise Jay. This is a small release which adapts the EmrEtlRunner to use the new Elastic MapReduce API.
Table of contents:
- Updates to the Elastic MapReduce API
- Multiple “in” buckets
- Backwards compatibility with old Hadoop Enrich versions
- Upgrading
- Getting help
1. Updates to the Elastic MapReduce API
The Snowplow EmrEtlRunner uses Rob Slifka’s Elasticity Ruby library to interact with the Elastic MapReduce API. AWS recently altered this API for new AWS users so that it is now based on clusters rather than job flows, breaking the API calls used by Elasticity to check the status of an EMR job.
Rob moved very quickly to put out a new Elasticity release (version 6.0.2) that uses the all-new EMR API. Thanks a lot, Rob!
For more information about Elasticity, check out Rob’s guest post from back in 2013.
2. Multiple “in” buckets
The EmrEtlRunner is no longer limited to a single “in” bucket. You can now specify an array of “in” buckets in the configuration YAML, and raw event files from all of them will be moved to the processing bucket. This is helpful when upgrading your collector version: you can process events from your old and new collectors in tandem until all event traffic has moved to the new collector.
See the repository for an example configuration file.
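To make the behaviour concrete, here is a minimal Python sketch of the idea. It is not the EmrEtlRunner's actual code, which is written in Ruby. It assumes the configured value may be either a single “in” bucket or an array of them, normalizes it to a list, and plans one move per raw file into the processing bucket. The function names, bucket URIs and file names are all hypothetical.

```python
from typing import Dict, List, Tuple, Union


def normalize_in_buckets(value: Union[str, List[str]]) -> List[str]:
    """Accept either a single bucket URI or a list of bucket URIs."""
    if isinstance(value, str):
        return [value]
    return list(value)


def join_s3(bucket: str, key: str) -> str:
    """Join a bucket URI and a file name with exactly one slash."""
    return bucket.rstrip("/") + "/" + key


def plan_moves(
    in_buckets: Union[str, List[str]],
    processing_bucket: str,
    files_by_bucket: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (source, destination) pairs for every raw file in every in bucket."""
    moves = []
    for bucket in normalize_in_buckets(in_buckets):
        for key in files_by_bucket.get(bucket, []):
            moves.append((join_s3(bucket, key), join_s3(processing_bucket, key)))
    return moves


if __name__ == "__main__":
    # Hypothetical listing of raw files for an old and a new collector.
    listing = {
        "s3://old-collector-logs": ["2015-07-01.gz"],
        "s3://new-collector-logs": ["2015-07-02.gz"],
    }
    buckets = ["s3://old-collector-logs", "s3://new-collector-logs"]
    for src, dst in plan_moves(buckets, "s3://my-processing", listing):
        print(src, "->", dst)
```

The sketch ignores the case where two “in” buckets contain files with the same name; a real implementation would have to decide how to handle that.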
3. Backwards compatibility with old Hadoop Enrich versions
More recent versions of Scala Hadoop Enrich (1.0.0 and later) are stored in a different S3 bucket from previous versions. Unfortunately, our previous EmrEtlRunner release (0.15.0 in Release 66 Oriental Skylark) always looked in the new location, no matter what version of Hadoop Enrich was specified.
The new version of EmrEtlRunner decides where to look for the jar based on the jar’s version; this means that you can use the latest EmrEtlRunner version with earlier versions of Hadoop Enrich.
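The version-based lookup can be sketched in a few lines of Python. This is an illustration only, not the EmrEtlRunner's own Ruby implementation. The two location constants are hypothetical placeholders, since this post does not give the real S3 paths; the only fact taken from the text is that versions 1.0.0 and later live in the new location.

```python
from typing import Tuple

# Hypothetical placeholders: the real S3 locations are not stated in this post.
OLD_JAR_LOCATION = "s3://example-assets/old-enrich-location"
NEW_JAR_LOCATION = "s3://example-assets/new-enrich-location"

# From the text: Hadoop Enrich 1.0.0 and later are stored in the new location.
FIRST_VERSION_IN_NEW_LOCATION = (1, 0, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '0.14.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def jar_location(hadoop_enrich_version: str) -> str:
    """Pick the jar location from the configured Hadoop Enrich version."""
    if parse_version(hadoop_enrich_version) >= FIRST_VERSION_IN_NEW_LOCATION:
        return NEW_JAR_LOCATION
    return OLD_JAR_LOCATION


if __name__ == "__main__":
    for version in ("0.14.1", "1.0.0"):
        print(version, "->", jar_location(version))
```

Comparing tuples of integers rather than raw strings avoids misordering versions such as 0.10.0 and 0.9.0.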
4. Upgrading
You need to update EmrEtlRunner to the latest version (0.16.0), which is available on GitHub.
5. Getting help
For more details on this release, please check out the r68 Turquoise Jay release on GitHub.