Snowplow at GDC: why gaming companies don’t need to build their own event data pipeline

18 April 2017  •  Yali Sassoon
We at Snowplow were very excited to be invited by AWS to this year’s Game Developers Conference (GDC) in San Francisco. We both presented at the AWS Developer Day and demoed Snowplow at the AWS stand. Alex Dean, my cofounder, and I were delighted to speak at the AWS Developer Day. You can view our presentation, “Open Source Game Analytics Powered by AWS”, below. And the slides by themselves: Snowplow: open...

How JustWatch uses Snowplow data to build a differentiated service for advertising movies and drive spectacular growth

13 April 2017  •  Giuseppe Gaviani
This blog post is about how JustWatch has been using Snowplow to build a highly effective and differentiated advertising technology business and drive spectacular business growth. You can download this story as a PDF here. “Snowplow provides rich, granular data that enabled us to build a sophisticated audience intelligence and double the efficiency of trailer advertising campaigns for our clients compared to the industry average” - Dominik Raute, Co-Founder & CTO, JustWatch. JustWatch: a data-driven company JustWatch...

How to develop better games with level analytics

12 April 2017  •  Colm O Griobhtha
Summary: Product managers and game designers generally aim to design game levels that challenge players enough to make completing a level satisfying, but not so challenging that they give up and stop playing the game. This blog post shows an example of how product managers and game designers can use a well-designed dashboard to better understand user behaviour across a game’s levels, design highly playable game levels, A/B test...

Snowplow Python Analytics SDK 0.2.0 released

11 April 2017  •  Anton Parkhomenko
We are pleased to announce the 0.2.0 release of the Snowplow Python Analytics SDK, a library providing tools to process and analyze the Snowplow enriched event format in Python-compatible data processing frameworks such as Apache Spark and AWS Lambda. This release adds new run manifest functionality, along with many internal changes. In the rest of this post we will cover: run manifests, using the run manifest, documentation, other changes, upgrading, and getting help. 1. Run manifests This...
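To make the idea concrete, here is a minimal sketch of how a run manifest might guard a batch job so that each enriched-event run folder is processed at most once. It assumes the manifest is backed by DynamoDB and that the SDK exposes a `RunManifests` class and a `list_runids` helper; those names, plus the bucket, table name and `process` function below, should be checked against the 0.2.0 docs rather than taken as given.

```python
# Hedged sketch: skip run folders that a previous job already processed.
# RunManifests / list_runids are our understanding of the 0.2.0 API;
# verify against the SDK documentation. Names below are placeholders.
import boto3
from snowplow_analytics_sdk.run_manifests import RunManifests, list_runids

s3 = boto3.client('s3')
dynamodb = boto3.client('dynamodb')

run_manifests = RunManifests(dynamodb, 'snowplow-run-manifests')
run_manifests.create()  # one-off: create the DynamoDB manifest table

def process(run_id):
    print('processing', run_id)  # stand-in for your Spark/Lambda logic

for run_id in list_runids(s3, 's3://acme-snowplow/enriched/archive/'):
    if not run_manifests.contains(run_id):
        process(run_id)
        run_manifests.add(run_id)  # record the run folder as processed
```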

Snowplow Analytics gets nod at MeasureCamp London

03 April 2017  •  Dilyan Damyanov
It was a busy Saturday in Pimlico as hundreds descended on the area for the 10th edition of MeasureCamp London on 25 March. My colleague Diogo and I were there representing Snowplow’s Analytics team. The dozens of attendee-led sessions were heavily dominated by topics around Google Analytics and its suite of accompanying tools and services. But open-source platforms such as Snowplow got their fair share of shout-outs....

Dataflow Runner 0.2.0 released

31 March 2017  •  Benjamin Fradet
Building on the initial release of Dataflow Runner last month, we are proud to announce version 0.2.0, which aims to bring Dataflow Runner up to feature parity with our long-standing EmrEtlRunner application. As a quick reminder, Dataflow Runner is a cloud-agnostic tool for creating clusters and running jobflows which, for the moment, only supports AWS EMR. If you need a refresher on the rationale behind Dataflow Runner, feel free to check out the RFC on the subject....

Google Cloud Dataflow example project released

30 March 2017
We are pleased to announce the release of our new Google Cloud Dataflow Example Project! This is a simple time-series analysis stream processing job written in Scala for Google Cloud Dataflow, Google’s unified data processing platform; it processes JSON events from Google Cloud Pub/Sub and writes aggregates to Google Cloud Bigtable. The Snowplow GCP Dataflow Streaming Example Project can help you jumpstart your own real-time event processing pipeline on Google Cloud Platform (GCP). In this...

How Peak uses Snowplow to drive product development and neuroscience

03 March 2017  •  Giuseppe Gaviani
This blog post explains how Peak has been using Snowplow since July 2015 to drive its business through product development and neuroscience. You can download this story as a PDF here. “Snowplow is really powerful when you start to hit that growth curve and going upwards: when you see the signs of accelerating growth and you need to start collecting as much event data as possible” - Thomas in’t Veld, Lead Data Scientist, Peak. About Peak Peak...

Sigfig and Weebly talk at second Snowplow Meetup San Francisco

24 February 2017  •  Yali Sassoon
Last night we were delighted to host our second Snowplow Meetup San Francisco at the lovely Looker offices. The event kicked off with a talk from Sigfig’s Benny Wijatno and Jenna Lemonias, who gave an overview of Sigfig before exploring how the company uses Snowplow to answer a wide variety of questions related to customer acquisition. Weebly’s Audrey Carstensen and Bo Han followed up with an overview of how Snowplow is...

Snowplow 87 Chichen Itza released

21 February 2017  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 87 Chichen Itza. This release contains a wide array of new features, stability enhancements and performance improvements for EmrEtlRunner and StorageLoader. As of this release, EmrEtlRunner lets you specify EBS volumes for your Hadoop worker nodes; meanwhile, StorageLoader now writes to a dedicated manifest table to record each load. Continuing this release series named after archaeological sites, Release 87 is Chichen Itza, the ancient...

Snowplow away week in Berlin

20 February 2017  •  Giuseppe Gaviani
Some of the Snowplow team works remotely, so last November the team went on an away week in Berlin to rekindle the team spirit, timed to coincide with our third Snowplow Meetup there. Team members travelled from far and wide, from four countries (Russia, Canada, France and the United Kingdom), to convene in Berlin. Here are some of the things the team did on their away week… It started with a session about...

Snowplow Meetup London Number 4: a roundup

15 February 2017  •  Giuseppe Gaviani
Our fourth Snowplow London Meetup took place on 8 February at CodeNode. It was a fun and informative event, with around 60 people attending, great talks and lots of interesting questions from the audience. We filmed the talks, which you can watch via the links below along with the presentation slides: How Gousto is moving to the real-time pipeline to enable just-in-time personalization; Why Snowplow is at the heart of Busuu’s data and...

Snowplow .NET Tracker 1.0.0 supporting mobile devices through Xamarin released

15 February 2017  •  Ed Lewis
We’re pleased to announce the 1.0.0 release of Snowplow’s .NET Tracker. This is a major reboot of the existing .NET Tracker, converting it into a .NET Standard project; this conversion brings with it support for the tracker on mobile devices through Xamarin, plus all platforms that support .NET Core (Windows, Linux and macOS). Here is our mobile demonstration app for the tracker running on Xamarin: Read on for more: A brief history of .NET Standard...

Introducing Dataflow Runner

10 February 2017  •  Joshua Beemster
We are pleased to announce the release of Dataflow Runner, a new open-source system for creating and running AWS EMR jobflow clusters and steps. Big thanks to Snowplow intern Manoj Rajandrakumar for all of his hard work on this project! This release is the first step in our journey to deconstruct EmrEtlRunner into two separate applications, Dataflow Runner and snowplowctl, per our RFC on Discourse. In the rest of this post we...

Iglu Ruby Client 0.1.0 released

08 February 2017  •  Anton Parkhomenko
We are pleased to announce the initial release of the Iglu Ruby Client, our third library in the family of Iglu clients. In the rest of this post we will cover: introducing the Iglu Ruby Client, use cases, setup guide, usage, roadmap and upcoming features, and getting help. 1. Introducing Iglu Ruby Client Iglu clients are simple SDKs which let users fetch schemas for self-describing data and validate that data against its schema. As part of broadening...
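To illustrate what an Iglu client does, here is a language-agnostic sketch in Python (deliberately not the Ruby API): a self-describing JSON carries an Iglu schema URI alongside its data, and the client resolves that URI to a JSON Schema and validates the payload against it. The registry, schema and event below are hypothetical; a real client fetches schemas over HTTP from an Iglu repository and caches them.

```python
# Conceptual sketch only: this mimics the fetch-and-validate behaviour of
# an Iglu client, not the Ruby library's actual API.
from jsonschema import validate  # pip install jsonschema

# Hypothetical in-memory "registry" standing in for an Iglu repository.
REGISTRY = {
    "iglu:com.acme/page_ping/jsonschema/1-0-0": {
        "type": "object",
        "properties": {"page": {"type": "string"}},
        "required": ["page"],
    }
}

def validate_self_describing(event):
    """Resolve the event's schema URI and validate its data against it."""
    schema = REGISTRY[event["schema"]]  # real clients fetch and cache schemas
    validate(instance=event["data"], schema=schema)

validate_self_describing({
    "schema": "iglu:com.acme/page_ping/jsonschema/1-0-0",
    "data": {"page": "/pricing"},
})
```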