We are pleased to release version 0.8.0 of the Snowplow Java Tracker. This release introduces several performance upgrades and a complete rework of the API. Many thanks to David Stendardi from Viadeo for his contributions!
In the rest of this post we will cover:
- API updates
- Emitter changes
- Changing the Subject
- Other improvements
- Getting help
1. API updates
This release introduces a host of API changes to make the Tracker more modular and easier to use. Primary amongst these is the introduction of the builder pattern for almost every object in the Tracker. This pattern lets us:
- Set default values for almost everything without the need for overloaded functions
- Add features without breaking the API in the future
- Add new events for Tracking without changing the API
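To make the first two points concrete, here is a minimal, self-contained sketch of the builder pattern. The class `PageViewBuilderSketch` and its fields are hypothetical stand-ins invented for illustration, not the Tracker's actual classes; the technical documentation remains the reference for the real API.

```java
// Hypothetical stand-in class, NOT part of the Snowplow Java Tracker.
final class PageViewBuilderSketch {
    private final String pageUrl;
    private final String pageTitle;
    private final String referrer;

    private PageViewBuilderSketch(Builder b) {
        this.pageUrl = b.pageUrl;
        this.pageTitle = b.pageTitle;
        this.referrer = b.referrer;
    }

    static Builder builder() { return new Builder(); }

    String pageUrl() { return pageUrl; }
    String pageTitle() { return pageTitle; }
    String referrer() { return referrer; }

    static final class Builder {
        private String pageUrl;         // required, no default
        private String pageTitle = "";  // optional, defaulted here instead of via overloads
        private String referrer = "";   // optional, defaulted here instead of via overloads

        Builder pageUrl(String value) { this.pageUrl = value; return this; }
        Builder pageTitle(String value) { this.pageTitle = value; return this; }
        Builder referrer(String value) { this.referrer = value; return this; }

        PageViewBuilderSketch build() {
            if (pageUrl == null || pageUrl.isEmpty()) {
                throw new IllegalStateException("pageUrl is required");
            }
            return new PageViewBuilderSketch(this);
        }
    }
}
```

A caller sets only the fields it cares about, for example `PageViewBuilderSketch.builder().pageUrl("https://example.com").build()`. Adding another optional field later means adding one more builder method, so existing callers keep compiling.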
Please read the technical documentation for notes on setting up the Tracker.
To set up a basic Tracker under the new API:
Event tracking: old approach
We have also updated how you track events. In place of many different types of trackXXX functions, we now have a single track function which can take different types of Events as its argument. These events are also built using the builder pattern.
Let’s look at how we were tracking a page view event before, in version 0.7.0:
For events like an Ecommerce Transaction it quickly becomes difficult to understand:
Event tracking: new approach
By contrast, here is a page view in version 0.8.0:
And here is the ecommerce event:
The new builder pattern is slightly more verbose, but the readability is greatly improved. You also no longer have to pass in null entries for fields that you don’t want to populate.
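The idea of one track function accepting many event types can be sketched as follows. Every name here (`EventSketch`, `PageViewSketchEvent`, `TransactionSketchEvent`, `SingleTrackSketch`) and every payload key is a hypothetical stand-in chosen for illustration; this is not the Tracker's own code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins, NOT the Snowplow Java Tracker classes.
interface EventSketch {
    Map<String, String> toPayload();
}

final class PageViewSketchEvent implements EventSketch {
    private final String pageUrl;

    PageViewSketchEvent(String pageUrl) { this.pageUrl = pageUrl; }

    @Override
    public Map<String, String> toPayload() {
        Map<String, String> payload = new LinkedHashMap<String, String>();
        payload.put("type", "page_view");  // illustrative key names only
        payload.put("url", pageUrl);
        return payload;
    }
}

final class TransactionSketchEvent implements EventSketch {
    private final String orderId;
    private final String total;

    TransactionSketchEvent(String orderId, String total) {
        this.orderId = orderId;
        this.total = total;
    }

    @Override
    public Map<String, String> toPayload() {
        Map<String, String> payload = new LinkedHashMap<String, String>();
        payload.put("type", "transaction");
        payload.put("orderId", orderId);
        payload.put("total", total);
        return payload;
    }
}

final class SingleTrackSketch {
    private final List<Map<String, String>> recorded = new ArrayList<Map<String, String>>();

    // One entry point for every event type: new events need no new track method.
    void track(EventSketch event) {
        recorded.add(event.toPayload());
    }

    List<Map<String, String>> recorded() { return recorded; }
}
```

Because `track` depends only on the `EventSketch` interface, a new event type is just a new class, which is the property the single track function is meant to provide.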
2. Emitter changes
The Emitter has also undergone a major overhaul in this release to allow for greater modularity and asynchronous capability.
Firstly, we have removed the need to define whether you would like to send your events via GET or POST by introducing two different types of Emitters instead. You now use the SimpleEmitter for GET requests and the BatchEmitter for POST requests.
You can build the emitters like so:
Builder functions explained:
- httpClientAdapter sets the HttpClientAdapter object for the emitter to use
- threadCount sets the size of the Thread Pool which can be used for sending events
- requestCallback is an optional callback function which is run after each sending attempt; it will return failed event Payloads for further processing
- bufferSize is only available for the BatchEmitter; it allows you to set how many events go into a single request
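To illustrate what a buffer size means in practice, here is a minimal sketch of buffering events and flushing them in batches. `BufferSketch` is a hypothetical class written for this post's explanation, and it records batches in memory where a real emitter would issue a request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the Tracker's BatchEmitter.
final class BufferSketch {
    private final int bufferSize;
    private final List<String> buffer = new ArrayList<String>();
    private final List<List<String>> flushedBatches = new ArrayList<List<String>>();

    BufferSketch(int bufferSize) {
        if (bufferSize < 1) {
            throw new IllegalArgumentException("bufferSize must be at least 1");
        }
        this.bufferSize = bufferSize;
    }

    // Adds one event payload and flushes once the buffer is full.
    void add(String payload) {
        buffer.add(payload);
        if (buffer.size() >= bufferSize) {
            flush();
        }
    }

    // Moves whatever is buffered into one batch (a stand-in for one request).
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushedBatches.add(new ArrayList<String>(buffer));
        buffer.clear();
    }

    List<List<String>> flushedBatches() { return flushedBatches; }
}
```

With a buffer size of 10, adding 25 events produces two full batches automatically, and a final explicit `flush()` sends the remaining 5.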
Secondly, we now offer more than one HttpClient for sending events. On top of the ApacheHttpClient we have now added an OkHttpClient. The following objects are what we would embed in the httpClientAdapter( ... ) builder functions above:
Thus you now have control over the actual client used for sending and can define your own custom settings for it.
Builder functions explained:
- url is the collector URL where events are going to be sent
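The benefit of a pluggable client can be sketched with a small adapter interface. `HttpAdapterSketch` and `PluggableEmitterSketch` are hypothetical names for illustration only; the real HttpClientAdapter classes wrap actual HTTP clients and may expose different methods.

```java
// Hypothetical stand-ins, NOT the Tracker's HttpClientAdapter types.
interface HttpAdapterSketch {
    // Sends the body to the URL and returns an HTTP-style status code.
    int post(String url, String body);
}

final class PluggableEmitterSketch {
    private final HttpAdapterSketch adapter;
    private final String url;

    PluggableEmitterSketch(HttpAdapterSketch adapter, String url) {
        if (adapter == null || url == null) {
            throw new IllegalArgumentException("adapter and url are required");
        }
        this.adapter = adapter;
        this.url = url;
    }

    // The emitter only knows the interface, so any client can sit behind it.
    boolean send(String body) {
        int status = adapter.post(url, body);
        return status >= 200 && status < 300;
    }
}
```

Swapping one client for another then means passing a different `HttpAdapterSketch` implementation to the constructor, with no change to the sending logic.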
Many thanks to David Stendardi from Viadeo for this contribution in making the Tracker so modular!
This release also fixes a major performance issue experienced around sending events. The Tracker was, up until now, sending all events using a synchronous blocking model. To fix this we are now sending all of our events using a pool of background threads; the pool size is configurable in the emitter creation step. As a result:
- All event sending is now non-blocking and fully asynchronous
- You control the number of events that can be sent asynchronously, which directly controls the load on your tracker’s host system
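The general technique of handing sends to a fixed pool of background threads can be sketched as below. `AsyncEmitterSketch` and its nested `Sender` interface are hypothetical and simplified; the Tracker's own emitters will differ in detail, for instance in how failures are reported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in, NOT the Tracker's emitter implementation.
final class AsyncEmitterSketch {
    interface Sender {
        boolean send(String payload);  // true on success
    }

    private final ExecutorService pool;
    private final Sender sender;
    private final List<String> failed = new CopyOnWriteArrayList<String>();

    AsyncEmitterSketch(int threadCount, Sender sender) {
        this.pool = Executors.newFixedThreadPool(threadCount);  // pool size is configurable
        this.sender = sender;
    }

    // Queues the send and returns immediately; the caller is never blocked on I/O.
    void emit(final String payload) {
        pool.execute(new Runnable() {
            @Override
            public void run() {
                if (!sender.send(payload)) {
                    failed.add(payload);  // kept for further processing
                }
            }
        });
    }

    // Stops accepting work, waits for queued sends, and returns the failed payloads.
    List<String> shutdownAndGetFailures(long timeoutSeconds) {
        pool.shutdown();
        try {
            pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<String>(failed);
    }
}
```

A larger `threadCount` lets more sends run at once at the cost of more load on the host, which is the trade-off the pool size exposes.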
To emphasise the speed changes we performed some stress testing on the Tracker with the previous model and the new model:
- PageView events were sent into the Tracker
- Request type was POST
- Buffer size was 10
- Version 0.7.0 took ~40 seconds to finish sending, blocking execution
- Version 0.8.0 took ~2-3 seconds to finish sending, non-blocking execution
That makes version 0.8.0 roughly 13 to 20 times faster at sending! This increase could potentially get even bigger when running the Tracker on more powerful systems and increasing the Thread Pool size accordingly.
We also spent some time exploring the most efficient buffer size for the Tracker on our system. To test this we sent 10k events from the Tracker and recorded the time taken to successfully send all of them. As you would imagine, the larger the buffer size, the lower the latency in getting the events to the collector.
If you are expecting large event volumes, do adjust your buffer size and thread count to allow the Tracker to handle this. However, please be aware of the 52000 byte limit per request: if you set the buffer too high, it is likely you won’t be able to successfully send anything!
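A quick way to sanity-check a buffer size against the 52000 byte limit is to estimate the batch size from your typical event payload. The helper below is a hypothetical sketch: it assumes the request size is simply the sum of the UTF-8 payload sizes and ignores any envelope or header overhead, so treat its result as an upper bound on a safe buffer size.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, NOT part of the Tracker.
final class BatchSizeSketch {
    static final int MAX_REQUEST_BYTES = 52000;  // limit quoted in this post

    // Total UTF-8 size of the payloads (assumption: no envelope overhead).
    static int batchBytes(List<String> payloads) {
        int total = 0;
        for (String payload : payloads) {
            total += payload.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    static boolean fitsInOneRequest(List<String> payloads) {
        return batchBytes(payloads) <= MAX_REQUEST_BYTES;
    }

    // Largest buffer size that stays under the limit for a given average event size.
    static int largestSafeBufferSize(int averageEventBytes) {
        if (averageEventBytes <= 0) {
            throw new IllegalArgumentException("averageEventBytes must be positive");
        }
        return MAX_REQUEST_BYTES / averageEventBytes;
    }
}
```

For example, with events averaging 500 bytes, `largestSafeBufferSize(500)` gives 104, so a buffer size well below that leaves room for the overhead this sketch ignores.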
4. Changing the Subject
In an environment where many different Subjects are involved (e.g. a web server or a RabbitMQ bridge), having a single Subject associated with a Tracker is very restrictive.
This release lets you pass a Subject along with your event, to be used in place of the Tracker’s Subject. In this way, you can rapidly switch Subject information between different events:
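The override rule itself is simple, and a minimal sketch of it might look like this. `SubjectSketch` and `SubjectAwareTrackerSketch` are hypothetical stand-ins with a single `userId` field; the real Subject carries more information and the real track call has a different shape.

```java
// Hypothetical stand-ins, NOT the Tracker's Subject or Tracker classes.
final class SubjectSketch {
    private final String userId;

    SubjectSketch(String userId) { this.userId = userId; }

    String userId() { return userId; }
}

final class SubjectAwareTrackerSketch {
    private final SubjectSketch trackerSubject;

    SubjectAwareTrackerSketch(SubjectSketch trackerSubject) {
        if (trackerSubject == null) {
            throw new IllegalArgumentException("trackerSubject is required");
        }
        this.trackerSubject = trackerSubject;
    }

    // An event-level Subject, when present, is used in place of the Tracker's Subject.
    String track(String eventName, SubjectSketch eventSubject) {
        SubjectSketch effective = (eventSubject != null) ? eventSubject : trackerSubject;
        return eventName + " as " + effective.userId();
    }
}
```

A web server handling many users can then keep one tracker and pass a per-request subject with each event, falling back to the tracker-level subject when none is supplied.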
5. Other improvements
Other changes worth highlighting:
- Added several new key-value pairs to the Subject class with new setXXX functions (#125, #124, #88, #87)
- Made the TrackerPayload much more typesafe by only allowing String values (#127)
- Added a fail-fast check for an invalid collector URL (#131)
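As an illustration of what a fail-fast URL check can look like, here is a hedged sketch using only `java.net.URI`. The class `CollectorUrlCheckSketch` and its exact rules (http or https scheme, non-empty host) are assumptions made for this example; the Tracker's actual check may be stricter or looser.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, NOT the Tracker's own validation code.
final class CollectorUrlCheckSketch {
    // Throws immediately on a bad URL instead of failing later at send time.
    static URI requireValidCollectorUrl(String url) {
        if (url == null) {
            throw new IllegalArgumentException("collector URL is required");
        }
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            boolean httpScheme = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            if (!httpScheme || uri.getHost() == null) {
                throw new IllegalArgumentException("invalid collector URL: " + url);
            }
            return uri;
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("invalid collector URL: " + url, e);
        }
    }
}
```

Running the check when the emitter is constructed surfaces a misconfigured URL at start-up, before any events are queued.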
The new version of the Snowplow Java Tracker is 0.8.0. The Java Setup Guide on our wiki has been updated to the latest version.
Please note this release breaks compatibility with Java 6; from now on we will only be supporting Java 7+.
You can find the updated Java Tracker usage manual on our wiki.
You can find the full release notes on GitHub as Snowplow Java Tracker v0.8.0 release.
8. Getting help
Despite its version number the Java Tracker is still relatively immature and we will be working hard with the community to improve it over the coming weeks and months; in the meantime, do please share any user feedback, feature requests or possible bugs.