Snowplow Scala Tracker 0.2.0 released
We are pleased to release version 0.2.0 of the Snowplow Scala Tracker. This release introduces a new custom context with EC2 instance metadata, a batch-based emitter, new tracking methods and one breaking API change.
In the rest of this post we will cover:
- EC2 custom context
- Batch emitter
- New track methods
- Device sent timestamp
- Other updates
- Bug fixes
- Getting help
1. EC2 custom context
On any AWS EC2 instance, you can access basic information about the instance, such as its region, IP address and instance type, by requesting a special Amazon-provided URI. With this release, the Scala Tracker can automatically add this information to your events as a custom context.
To add EC2 instance metadata to your events, invoke the enableEc2Context method on your Tracker after it has been initialized.
enableEc2Context will prevent all subsequent events from being sent until the context has been fetched or the request to the AWS metadata URI times out after 10 seconds, although it usually takes much less than a second. This also means that your events will be buffered for 10 seconds if you try to enable the EC2 context on a non-EC2 box.
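The "fetch or time out" behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the tracker's own code: Ec2ContextSketch, contextOrNone and fetchFromAws are hypothetical names, and the only thing taken from AWS is the instance identity document endpoint.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical helper, not part of the Scala Tracker API.
object Ec2ContextSketch {
  // AWS instance identity document endpoint, reachable only from inside EC2.
  val InstanceIdentityUri =
    "http://169.254.169.254/latest/dynamic/instance-identity/document"

  // Runs `fetch` on another thread and waits at most `timeout` for it.
  // None means "no context": the fetch failed or did not finish in time.
  def contextOrNone(fetch: () => String,
                    timeout: FiniteDuration = 10.seconds): Option[String] =
    Try(Await.result(Future(fetch()), timeout)).toOption

  // Assumed fetcher: a plain HTTP GET against the metadata endpoint.
  def fetchFromAws(): String = {
    val source = scala.io.Source.fromURL(InstanceIdentityUri)
    try source.mkString finally source.close()
  }
}
```

Calling Ec2ContextSketch.contextOrNone(() => Ec2ContextSketch.fetchFromAws()) on a non-EC2 box would be expected to return None once the timeout has elapsed, which mirrors the 10-second wait described above.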
Sometimes you may want to access the context directly, so that you can decide for yourself when to send it and when not to.
In that case, call the blocking Ec2Metadata.getInstanceContextBlocking when your app initializes to get a self-describing JSON with the context, and then pass it manually with the events you choose.
2. Batch emitter
The initial release of the Scala Tracker had only basic emitters such as AsyncEmitter, which make one HTTP GET request per event.
Now you can also use AsyncBatchEmitter to enqueue events and send them in batches.
For example, if a tracker is configured with both a per-event emitter pointing at a CloudFront collector and a batchEmitter with a buffer size of 32, each event is sent immediately with a GET request to the CloudFront collector and is also buffered in batchEmitter, which sends all buffered events at once as soon as 32 events have accumulated.
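The buffering behaviour can be sketched as follows. This is a minimal, self-contained illustration rather than the AsyncBatchEmitter implementation: BatchBuffer is a hypothetical class, and the send callback stands in for the HTTP request that delivers a batch.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model of a size-triggered batch buffer.
object BatchBufferSketch {
  type Event = Map[String, String]

  // Queues events and hands them to `send` once `bufferSize` are waiting.
  final class BatchBuffer(bufferSize: Int, send: Seq[Event] => Unit) {
    private val buffer = ArrayBuffer.empty[Event]

    def enqueue(event: Event): Unit = synchronized {
      buffer += event
      if (buffer.size >= bufferSize) {
        val batch = buffer.toList // copy before clearing
        buffer.clear()
        send(batch)
      }
    }

    // Number of events still waiting for the next flush.
    def pending: Int = synchronized(buffer.size)
  }
}
```

With a buffer size of 32, the first 31 calls to enqueue only accumulate events; the 32nd call triggers a single send with all 32 events and leaves the buffer empty.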
3. New track methods
The Scala Tracker v0.1.0 could track only custom unstructured events.
We have now also implemented trackStructEvent, which could be useful if, for instance, you are developing a web app with the Play framework. For example, a Play Action can track a Snowplow page view.
Note that in a real-world project, you may want to use the Play 2.4 dependency injection approach.
A structured event is tracked in the same way, with trackStructEvent.
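To make the difference between the event types concrete, here is a rough sketch of the payload fields that a page view and a structured event carry. The field names (e, url, page, refr, se_ca, se_ac, se_la, se_pr, se_va) come from the Snowplow Tracker Protocol; the PayloadSketch object and its method signatures are hypothetical and are not the tracker's own API.

```scala
// Hypothetical payload builders; only the field names are from the protocol.
object PayloadSketch {
  // Page view: e=pv plus the URL, and optionally page title and referrer.
  def pageView(url: String,
               title: Option[String] = None,
               referrer: Option[String] = None): Map[String, String] =
    Map("e" -> "pv", "url" -> url) ++
      title.map("page" -> _) ++
      referrer.map("refr" -> _)

  // Structured event: category and action are required, the rest optional.
  def structEvent(category: String,
                  action: String,
                  label: Option[String] = None,
                  property: Option[String] = None,
                  value: Option[Double] = None): Map[String, String] =
    Map("e" -> "se", "se_ca" -> category, "se_ac" -> action) ++
      label.map("se_la" -> _) ++
      property.map("se_pr" -> _) ++
      value.map(v => "se_va" -> v.toString)
}
```

For instance, PayloadSketch.structEvent("shop", "add-to-basket", label = Some("sku-1")) yields a map with e, se_ca, se_ac and se_la set, and no entries for the omitted optional fields.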
4. Device sent timestamp
Because of batch emitting and temporarily inaccessible collectors, the time between creating an event and finally sending it to a collector can be significant.
To make it easier for Snowplow to determine exactly when an event occurred, the Scala Tracker now automatically sends a device sent timestamp (stm in our Tracker Protocol) with every event, to record exactly when the event left the tracker.
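The distinction between the two timestamps can be sketched as follows. The dtm and stm field names are from the Snowplow Tracker Protocol; SentTimestampSketch and its methods are hypothetical helpers for illustration, not the tracker's internals.

```scala
// Hypothetical model of when each timestamp is attached to an event.
object SentTimestampSketch {
  type Payload = Map[String, String]

  // dtm is set once, at the moment the event is created.
  def create(fields: Payload, nowMillis: Long): Payload =
    fields + ("dtm" -> nowMillis.toString)

  // stm is set (or overwritten) each time the event is about to be sent,
  // so a retried or batched event records its latest send time.
  def stampOnSend(payload: Payload, nowMillis: Long): Payload =
    payload + ("stm" -> nowMillis.toString)
}
```

The difference between stm and dtm is then the time the event spent waiting inside the tracker, for example in a batch buffer or while a collector was unreachable.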
You can read more about all these timestamps and Snowplow’s treatment of event time in our recent blog post.
5. Other updates
5.1 Time units changed
Breaking change: in this release, all track* methods take temporal arguments in milliseconds rather than in seconds, as they did before.
This change aligns the Scala Tracker with the other Snowplow trackers and provides more control over events' timestamps.
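If your 0.1.0 code passed timestamps in seconds, those values need converting before they are handed to the track* methods. A minimal sketch, in which TimestampMigrationSketch and secondsToMillis are hypothetical names used only for illustration:

```scala
import java.util.concurrent.TimeUnit

// Hypothetical migration helper for callers upgrading from 0.1.0.
object TimestampMigrationSketch {
  // Converts a Unix timestamp in seconds to milliseconds.
  def secondsToMillis(seconds: Long): Long =
    TimeUnit.SECONDS.toMillis(seconds)
}
```

A value already obtained from System.currentTimeMillis() is in milliseconds and needs no conversion.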
5.2 Delay in sending the first event
The Scala Tracker uses spray-client and Akka under the hood. This setup involves a relatively heavy actor system initialization, so your first event may be sent with a delay of a few seconds.
We are working on eliminating this delay (see #22 for details).
6. Bug fixes
The most important bug fix is #24, where we improved our handling of unavailable or broken event collectors.
In version 0.1.0, if the collector was unavailable or responded with a bad HTTP response, the request was re-sent continuously until it succeeded (or indeed the app shut down). In this release, we extend the backoff period after each failed request, and the tracker gives up after the 10th try.
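The new retry behaviour can be sketched roughly as follows. The limit of 10 tries is from the description above; the doubling delay and the one-second base are assumptions made for illustration, and RetrySketch is a hypothetical helper rather than the tracker's actual implementation.

```scala
// Hypothetical model of bounded retries with a growing backoff.
object RetrySketch {
  // Delay before the next try after `attempt` failures (1-based).
  // Doubling from one second is an assumed policy.
  def backoffMillis(attempt: Int, baseMillis: Long = 1000L): Long =
    baseMillis << (attempt - 1)

  // Calls `send` until it succeeds or `maxAttempts` tries have been made.
  // Returns true on success, false once the tracker gives up.
  @annotation.tailrec
  def sendWithRetry(send: () => Boolean,
                    maxAttempts: Int = 10,
                    attempt: Int = 1,
                    sleep: Long => Unit = (ms: Long) => Thread.sleep(ms)): Boolean =
    if (send()) true
    else if (attempt >= maxAttempts) false
    else {
      sleep(backoffMillis(attempt))
      sendWithRetry(send, maxAttempts, attempt + 1, sleep)
    }
}
```

Passing the sleep function as a parameter keeps the sketch testable without real waiting; in this model an event that fails 10 times in a row is dropped rather than retried forever.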
7. Upgrading
You can find out more about installing and upgrading this tracker on the Scala Setup Guide on our wiki. We have also added an installation guide for Maven and Gradle users.
The Redshift table definition for the EC2 instance metadata is available on GitHub as instance_identity_document_1.sql.
8. Getting help
You can find the Scala Tracker usage manual on our wiki.
The full release notes are on GitHub as Snowplow Scala Tracker v0.2.0 release.