Elasticsearch Loader 0.9.0 released

21 July 2017  •  Ben Fradet
We are thrilled to announce version 0.9.0 of Elasticsearch Loader, our component that lets you sink your Kinesis streams of Snowplow enriched events to Elasticsearch. This release adds support for Elasticsearch 5, along with other important features such as the ability to use SSL with the Elasticsearch REST API and to sign requests when using Amazon Elasticsearch Service. In this post, we will cover: Support for Elasticsearch 5 Security features Bug...

Loading and analyzing Snowplow event data in Neo4j

17 July 2017  •  Dilyan Damyanov
Back in 2014 we published a series of blog posts on using Snowplow event data in the graph database Neo4j. Three years on, they’re still among our most popular blog posts. (See below for links to the original posts.) A lot has changed since then. Neo4j has strengthened its position as a leading graph database solution. Its query language, Cypher, has grown with the platform. It has changed to the point where some of the...

Kinesis S3 0.5.0 released

07 July 2017  •  Ben Fradet
We are proud to be releasing version 0.5.0 of Kinesis S3, our project dedicated to sinking Kinesis streams, including Snowplow raw and enriched event streams, to S3. This release revolves around community-driven improvements as well as the modernization of the project. This post will cover: Fix silent suppression of failures Community contributions Project modernization Roadmap Contributing 1. Fix silent suppression of failures We’ve uncovered a situation where failures prior to the serialization of the records...

Snowplow .NET Analytics SDK 0.1.0 released

15 June 2017  •  Devesh Shetty
Following in the footsteps of the Snowplow Scala Analytics SDK and Snowplow Python Analytics SDK, we are happy to announce the release of the Snowplow .NET Analytics SDK. This SDK makes your Snowplow enriched events easier to work with from Azure Data Lake Analytics, Azure Functions, AWS Lambda, Microsoft Orleans and other .NET-compatible data processing frameworks. This SDK has been developed as a first step towards our RFC, Porting Snowplow to Microsoft Azure. Over time,...

Snowplow 89 Plain of Jars released, porting Snowplow to Spark

12 June 2017  •  Ben Fradet
We are tremendously excited to announce the release of Snowplow 89 Plain of Jars. This release centers around the port of our batch pipeline from Twitter Scalding to Apache Spark, a direct implementation of our most popular RFC, Migrating the Snowplow batch jobs from Scalding to Spark. Read on for more information on R89 Plain of Jars, named after an archeological site in Laos: Thanks Why Spark? Spark Enrich and RDB Shredder Under the hood...

Dataflow Runner 0.3.0 released

30 May 2017  •  Ben Fradet
We are pleased to announce version 0.3.0 of Dataflow Runner, our cloud-agnostic tool to create clusters and run jobflows. This release is centered around new features and usability improvements. In this post, we will cover: Preventing overlapping job runs through locks Tagging playbooks New template functions Other updates Roadmap Contributing 1. Preventing overlapping job runs through locks This release introduces a mechanism to prevent two jobs from running at the same time. This is great...

Snowplow Scala Analytics SDK 0.2.0 released

24 May 2017  •  Anton Parkhomenko
We are pleased to announce the 0.2.0 release of the Snowplow Scala Analytics SDK, a library providing tools to process and analyze Snowplow enriched events in Scala-compatible data processing frameworks such as Apache Spark, AWS Lambda, Apache Flink and Scalding, as well as other JVM-compatible data processing frameworks. This release adds run manifest functionality, removes the Scalaz dependency and adds SDK artifacts to Maven Central, along with many other internal changes. In the rest of this...

Snowplow JavaScript Tracker 2.8.0 released

18 May 2017  •  Ben Fradet
We are pleased to announce a new release of the Snowplow JavaScript Tracker. Version 2.8.0 gives you much more flexibility and control in the area of in-browser user privacy, as well as adding new integrations for Parrable and OptimizelyX. Read on below the fold for: State storage strategy Opt-out cookie Better form tracking for passwords New OptimizelyX and Parrable contexts Extracting valuable metadata from the tracker Improved page activity handling Upgrading Documentation and help 1....

Snowplow Meetup Amsterdam #3 was all about personalisation across the customer journey

08 May 2017  •  Idan Ben-Yaacov
We were delighted to be running our third Snowplow Meetup in Amsterdam on April 5th and lucky to have speakers from de Bijenkorf and Greenhouse Group alongside our co-founder Alex Dean. Such a compelling ensemble of speakers resulted in a great turnout and lots of interesting questions from the audience. It was great to connect with the Amsterdam community of analytics practitioners, digital agencies and data scientists. It’s always exciting to connect with our community...

Insights from the first Snowplow meetup in Brazil

03 May 2017  •  Bernardo Srulzon
This is a guest blog post by Bernardo Srulzon, Business Intelligence lead at GetNinjas and a Snowplow enthusiast since 2015. In this post, Bernardo shares his insights from our first Snowplow meetup in São Paulo, which took place on April 19th. Many thanks to Bernardo for sharing his thoughts with this post and to GetNinjas for hosting our meetup! If you have a story to share, feel free to get in touch. It was a...

Introducing Factotum Server

28 April 2017  •  Nicholas Ung
We are pleased to announce the release of Factotum Server, a new open-source system for scheduling and executing Factotum jobs. In previous posts, we have talked about how our pipeline orchestration journey started with cron and make, before moving on to release Factotum. Until now, the only way to interact with Factotum was through the CLI, but now we have Factotum Server. Where Factotum fills the gap of our previous make-based solution, Factotum Server replaces...

Snowplow 88 Angkor Wat released

27 April 2017  •  Anton Parkhomenko
We are pleased to announce the release of Snowplow 88 Angkor Wat. This release introduces event de-duplication across different pipeline runs, powered by DynamoDB, along with an important refactoring of the batch pipeline configuration. Read on for more information on R88 Angkor Wat, named after the largest religious monument in the world: New storage targets configuration Cross-batch natural deduplication Upgrading Roadmap Getting help 1. New storage targets configuration Historically storage targets for the Snowplow batch...

Snowplow at GDC: why gaming companies don’t need to build their own event data pipeline

18 April 2017  •  Yali Sassoon
We at Snowplow were very excited to be invited by AWS to this year’s Games Developer Conference (GDC) in San Francisco. We presented at the AWS Developer Day and demoed Snowplow at the AWS stand. Snowplow presentation at GDC Alex Dean, my cofounder, and I were delighted to speak at the AWS Developer Day. You can view our presentation, “Open Source Game Analytics Powered by AWS”, below. And the slides by themselves: Snowplow: open...

How JustWatch uses Snowplow data to build a differentiated service for advertising movies and drive spectacular growth

13 April 2017  •  Giuseppe Gaviani
This blog post is about how JustWatch has been using Snowplow to build a highly effective and differentiated advertising technology business and drive spectacular business growth. You can download this story as a pdf here. “Snowplow provides rich, granular data that enabled us to build a sophisticated audience intelligence and double the efficiency of trailer advertising campaigns for our clients compared to the industry average” Dominik Raute, Co-Founder & CTO, JustWatch JustWatch: a data-driven company JustWatch...

How to develop better games with level analytics

12 April 2017  •  Colm O Griobhtha
Summary Product managers and game designers generally aim to design game levels in such a way that they challenge gamers enough to make completing a level satisfying, but not so challenging that they drop out and stop playing the game. This blog post shows an example of how product managers and game designers can use a well designed dashboard to better understand user behaviour across a game’s levels, design highly playable game levels, A/B test...