Snowplow 94 Hill of Tara released

10 October 2017  •  Ben Fradet

We are pleased to announce the urgent release of Snowplow 94 Hill of Tara, named after the archaeological complex in Ireland.

We take data loss extremely seriously at Snowplow. Shortly after the Snowplow 93 Virunum release, routine load testing of another component (the Elasticsearch Loader) detected an active data loss scenario in our core Stream Enrich app, introduced in R93. The data loss occurs around auto-scaling of the Stream Enrich component and of the Kinesis stream it is writing to.

On discovering this, we immediately prioritised an urgent Snowplow release to fix this specific issue, pushing back the other Snowplow releases we are working on.

Please read on after the fold for:

  1. Fixing the Stream Enrich data loss issue
  2. Upgrading
  3. Roadmap
  4. Getting help


1. Fixing the Stream Enrich data loss issue

Prior to R93, Stream Enrich would crash unnecessarily when the Kinesis stream it was writing to was resharding while Stream Enrich itself was undergoing auto-scaling.

R93 addressed this by having Stream Enrich hold off instantiating the Kinesis sink until the stream had finished resharding. Unfortunately, R93’s Stream Enrich would continue to read raw events and checkpoint those reads in the meantime, so the corresponding enriched events were never written and went missing.

In fact, it is completely fine to write to a stream in the process of resharding (#3452), so this behavior has been corrected in R94, fixing the underlying bug.
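To make the failure mode concrete, here is a minimal toy model in Python. It is not Stream Enrich code and does not use the Kinesis API: the `run` function, its parameters and the event names are all hypothetical, chosen only to show why checkpointing a read whose write never happened loses data.

```python
from typing import List, Tuple


def run(raw_events: List[str], sink_ready_from: int,
        checkpoint_without_write: bool) -> Tuple[List[str], int]:
    """Toy consumer loop (hypothetical, for illustration only).

    sink_ready_from: index of the first read at which the stand-in sink
        accepts writes; before that, nothing is written.
    checkpoint_without_write: if True, advance the checkpoint even when
        the write did not happen (the failure mode described above).
    Returns (events written, number of events checkpointed).
    """
    written: List[str] = []
    checkpointed = 0
    for i, event in enumerate(raw_events):
        wrote = i >= sink_ready_from
        if wrote:
            written.append(event)
        if wrote or checkpoint_without_write:
            checkpointed = i + 1
        else:
            # Stop without checkpointing; a restart would re-read this event.
            break
    return written, checkpointed


events = ["e0", "e1", "e2", "e3"]

# Checkpointing regardless of the write: early events are marked as done
# although they were never written.
lossy_written, lossy_checkpointed = run(events, 2, checkpoint_without_write=True)
lost = [e for e in events[:lossy_checkpointed] if e not in lossy_written]

# Checkpointing only after a write: nothing is marked as done, so nothing is lost.
safe_written, safe_checkpointed = run(events, 2, checkpoint_without_write=False)
```

In this sketch, `lost` contains the events that were checkpointed while the sink was unavailable, which mirrors the missing enriched events described above. The second call is only a contrast case and is not a description of how R94 is implemented: per the above, R94 fixes the problem by writing to the stream while it is resharding.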

There is a comprehensive guide to this issue on Discourse, in case you have been affected by it or would like to discuss it further.

2. Upgrading

The latest version of Stream Enrich is available from our Bintray.

3. Roadmap

Upcoming Snowplow releases will include:

  • R95 [BAT] Ellora, enhancing our Redshift event storage with ZSTD encoding, plus various bug fixes for the batch pipeline
  • R96 [STR] Zeugma, which will add support for NSQ to the stream processing pipeline, ready for adoption in Snowplow Mini
  • R9x [STR] Priority fixes, removing the potential for data loss in the stream processing pipeline
  • R9x [BAT] 4 webhooks, which will add support for 4 new webhooks (Mailgun, Olark, Unbounce, StatusGator)

4. Getting help

For more details on this release, please check out the release notes on GitHub.

If you have any questions or run into any problems, please visit our Discourse forum.