Snowplow S3 Loader 0.6.0 released

14 September 2017  •  Enes Aldemir
We are pleased to release version 0.6.0 of Snowplow S3 Loader, formerly known as Kinesis S3, our project dedicated to storing data, including Snowplow raw and enriched event streams, to Amazon S3. This post will cover: NSQ Support Support for “AT_TIMESTAMP” as initial position Upgrading Contributing 1. NSQ Support This release introduces NSQ as an event source - it is for this reason that we have renamed the project from Kinesis S3. Adding NSQ support...

Elasticsearch Loader 0.10.0 released

12 September 2017  •  Enes Aldemir
We are thrilled to announce version 0.10.0 of the Snowplow Elasticsearch Loader, our application for writing Snowplow enriched events and more to Elasticsearch. In this post, we will cover: NSQ support Support for writing raw JSONs Support for “AT_TIMESTAMP” as initial position Configuration changes Contributing 1. NSQ Support With this release, we are adding support for NSQ as an event source: the loader can now sink Snowplow enriched events from an NSQ topic to Elasticsearch....

Snowplow 92 Maiden Castle released

11 September 2017  •  Ben Fradet
We are pleased to announce the release of Snowplow 92 Maiden Castle. This release is a direct follow-up of Snowplow 91 Stonehenge, incorporating various improvements from seeing R90 and R91 operate in the wild. In particular, this release fixes some important gotchas in EmrEtlRunner’s --skip behavior, as well as a bug in the handling of run locks. If you’d like to know more about R92 Maiden Castle, named after the Iron Age hill fort in...

RDB Loader 0.13.0 released

06 September 2017  •  Anton Parkhomenko
We are thrilled to announce version 0.13.0 of Relational Database Loader, our Snowplow component that lets you load your data into relational databases such as Redshift and PostgreSQL. This release marks the migration of our RDB Loader and RDB Shredder apps from part of the snowplow/snowplow “mono-repo” into an independent project with its own release cadence. In this post, we will cover: Dedicated repository Single folder load Dry run Other changes Upgrading Contributing 1. Dedicated...

Snowplow Mini 0.3.0 released

30 August 2017  •  Enes Aldemir
We are pleased to announce the 0.3.0 release of Snowplow Mini, our accessible “Snowplow in a box” distribution. Snowplow Mini is the complete Snowplow real-time pipeline running on a single instance, available for easy deployment as a pre-built AMI. Use it to: Set up an inexpensive and easily discardable Snowplow stack for testing your tracker and schema changes Learn about Snowplow without having to set up a horizontally-scalable, highly-available production-grade pipeline This release focuses on...

Snowplow 91 Stonehenge released with important bug fix

17 August 2017  •  Ben Fradet
We are pleased to announce the release of Snowplow 91 Stonehenge. This release revolves around making EmrEtlRunner, the component launching the EMR steps for the batch pipeline, significantly more robust. Most notably, this release fixes a long-standing bug in the way the staging step was performed, which affected all users of the Clojure Collector (issue #3085). This release also lays important groundwork for our planned migration away from EmrEtlRunner towards separate snowplowctl and Dataflow Runner...

Snowplow 90 Lascaux released, moving loading step onto EMR

26 July 2017  •  Anton Parkhomenko
We are tremendously excited to announce the release of Snowplow 90 Lascaux. This release introduces RDB Loader, a new EMR-run application replacing our trusty StorageLoader, as proposed in our Splitting EmrEtlRunner RFC. This release also brings various enhancements and alterations in EmrEtlRunner. Read on for more information on R90 Lascaux, named after the Upper Paleolithic cave complex in southwestern France: RDB Loader Other improvements Upgrading Roadmap Getting help 1. RDB Loader 1.1 The rationale for...

Elasticsearch Loader 0.9.0 released

21 July 2017  •  Ben Fradet
We are thrilled to announce version 0.9.0 of Elasticsearch Loader, our component that lets you sink your Kinesis streams of Snowplow enriched events to Elasticsearch. This release adds support for Elasticsearch 5 and other important features such as the possibility to use SSL when relying on the REST API of Elasticsearch and the ability to sign requests when using Amazon Elasticsearch Service. In this post, we will cover: Support for Elasticsearch 5 Security features Bug...

Kinesis S3 0.5.0 released

07 July 2017  •  Ben Fradet
We are proud to be releasing version 0.5.0 of Kinesis S3, our project dedicated to sinking Kinesis streams, including Snowplow raw and enriched event streams, to S3. This release revolves around community-driven improvements as well as the modernization of the project. This post will cover: Fix silent suppression of failures Community contributions Project modernization Roadmap Contributing 1. Fix silent suppression of failures We’ve uncovered a situation where failures prior to the serialization of the records...

Snowplow .NET Analytics SDK 0.1.0 released

15 June 2017  •  Devesh Shetty
Following in the footsteps of the Snowplow Scala Analytics SDK and Snowplow Python Analytics SDK, we are happy to announce the release of the Snowplow .NET Analytics SDK. This SDK makes your Snowplow enriched events easier to work with from Azure Data Lake Analytics, Azure Functions, AWS Lambda, Microsoft Orleans and other .NET-compatible data processing frameworks. This SDK has been developed as a first step towards our RFC, Porting Snowplow to Microsoft Azure. Over time,...

Snowplow 89 Plain of Jars released, porting Snowplow to Spark

12 June 2017  •  Ben Fradet
We are tremendously excited to announce the release of Snowplow 89 Plain of Jars. This release centers around the port of our batch pipeline from Twitter Scalding to Apache Spark, a direct implementation of our most popular RFC, Migrating the Snowplow batch jobs from Scalding to Spark. Read on for more information on R89 Plain of Jars, named after an archeological site in Laos: Thanks Why Spark? Spark Enrich and RDB Shredder Under the hood...

Dataflow Runner 0.3.0 released

30 May 2017  •  Ben Fradet
We are pleased to announce version 0.3.0 of Dataflow Runner, our cloud-agnostic tool to create clusters and run jobflows. This release is centered around new features and usability improvements. In this post, we will cover: Preventing overlapping job runs through locks Tagging playbooks New template functions Other updates Roadmap Contributing 1. Preventing overlapping job runs through locks This release introduces a mechanism to prevent two jobs from running at the same time. This is great...

Snowplow Scala Analytics SDK 0.2.0 released

24 May 2017  •  Anton Parkhomenko
We are pleased to announce the 0.2.0 release of the Snowplow Scala Analytics SDK, a library providing tools to process and analyze Snowplow enriched events in Scala-compatible data processing frameworks such as Apache Spark, AWS Lambda, Apache Flink and Scalding, as well as other JVM-compatible data processing frameworks. This release adds run manifest functionality, removes the Scalaz dependency and adds SDK artifacts to Maven Central, along with many other internal changes. In the rest of this...

Snowplow JavaScript Tracker 2.8.0 released

18 May 2017  •  Ben Fradet
We are pleased to announce a new release of the Snowplow JavaScript Tracker. Version 2.8.0 gives you much more flexibility and control in the area of in-browser user privacy, as well as adding new integrations for Parrable and OptimizelyX. Read on below the fold for: State storage strategy Opt-out cookie Better form tracking for passwords New OptimizelyX and Parrable contexts Extracting valuable metadata from the tracker Improved page activity handling Upgrading Documentation and help 1....

Introducing Factotum Server

28 April 2017  •  Nicholas Ung
We are pleased to announce the release of Factotum Server, a new open-source system for scheduling and executing Factotum jobs. In previous posts, we have talked about how our pipeline orchestration journey started with cron and make, before moving on to release Factotum. Initially, the only way to interact with Factotum has been through the CLI, but now we have Factotum Server. Where Factotum fills the gap of our previous make-based solution, Factotum Server replaces...

Snowplow 88 Angkor Wat released

27 April 2017  •  Anton Parkhomenko
We are pleased to announce the release of Snowplow 88 Angkor Wat. This release introduces event de-duplication across different pipeline runs, powered by DynamoDB, along with an important refactoring of the batch pipeline configuration. Read on for more information on R88 Angkor Wat, named after the largest religious monument in the world: New storage targets configuration Cross-batch natural deduplication Upgrading Roadmap Getting help 1. New storage targets configuration Historically storage targets for the Snowplow batch...

Snowplow Python Analytics SDK 0.2.0 released

11 April 2017  •  Anton Parkhomenko
We are pleased to announce the 0.2.0 release of the Snowplow Python Analytics SDK, a library providing tools to process and analyze Snowplow enriched event format in Python-compatible data processing frameworks such as Apache Spark and AWS Lambda. This release adds new run manifest functionality, along with many internal changes. In the rest of this post we will cover: Run manifests Using the run manifest Documentation Other changes Upgrading Getting help 1. Run manifests This...

Dataflow Runner 0.2.0 released

31 March 2017  •  Ben Fradet
Building on the initial release of Dataflow Runner last month, we are proud to announce version 0.2.0, aiming to bring Dataflow Runner up to feature parity with our long-standing EmrEtlRunner application. As a quick reminder, Dataflow Runner is a cloud-agnostic tool to create clusters and run jobflows which, for the moment, only supports AWS EMR. If you need a refresher on the rationale behind Dataflow Runner, feel free to check out the RFC on the subject....

Google Cloud Dataflow example project released

30 March 2017  •  Guilherme Grijó Pires
We are pleased to announce the release of our new Google Cloud Dataflow Example Project! This is a simple time series analysis stream processing job written in Scala for the Google Cloud Dataflow unified data processing platform, processing JSON events from Google Cloud Pub/Sub and writing aggregates to Google Cloud Bigtable. The Snowplow GCP Dataflow Streaming Example Project can help you jumpstart your own real-time event processing pipeline on Google Cloud Platform (GCP). In this...

Snowplow 87 Chichen Itza released

21 February 2017  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 87 Chichen Itza. This release contains a wide array of new features, stability enhancements and performance improvements for EmrEtlRunner and StorageLoader. As of this release EmrEtlRunner lets you specify EBS volumes for your Hadoop worker nodes; meanwhile StorageLoader now writes to a dedicated manifest table to record each load. Continuing with this release series named for archaeological sites, Release 87 is Chichen Itza, the ancient...

Snowplow .NET Tracker 1.0.0 supporting mobile devices through Xamarin released

15 February 2017  •  Ed Lewis
We’re pleased to announce the 1.0.0 release of Snowplow’s .NET Tracker. This is a major reboot of the existing .NET Tracker, converting it into a .NET Standard project; this conversion brings with it support for the tracker on mobile devices through Xamarin, plus all platforms that support .NET Core (Windows, Linux and macOS). Here is our mobile demonstration app for the tracker running on Xamarin: Read on for more: A brief history of .NET Standard...

Introducing Dataflow Runner

10 February 2017  •  Joshua Beemster
We are pleased to announce the release of Dataflow Runner, a new open-source system for the creation and running of AWS EMR jobflow clusters and steps. Big thanks to Snowplow intern Manoj Rajandrakumar for all of his hard work on this project! This release signals the first step in our journey to deconstruct EmrEtlRunner into two separate applications, a Dataflow Runner and snowplowctl, per our RFC on Discourse. In the rest of this post we...

Iglu Ruby Client 0.1.0 released

08 February 2017  •  Anton Parkhomenko
We are pleased to announce the initial release of the Iglu Ruby Client, our third library in the family of Iglu clients. In the rest of this post we will cover: Introducing Iglu Ruby Client Use cases Setup guide Usage Roadmap and upcoming features Getting help 1. Introducing Iglu Ruby Client Iglu clients are simple SDKs which let users fetch schemas for self-describing data and validate that data against its schema. As part of broadening...

Snowplow Javascript Tracker 2.7.0 released

09 January 2017  •  Yali Sassoon
We are delighted to kick off 2017 with a new release of our Javascript Tracker. Version 2.7.0 includes a number of new and improved features including: Improved tracking for single-page webapps Content Security Policy compliance Automatic and manual error tracking New configuration options for first party cookies More elegant Optimizely integration New trackSelfDescribingEvent method 1. Improved tracking for single-page webapps The webPage context is invaluable when you analyse or model web data, and want to...

Factotum 0.4.0 released with support for constraints

22 December 2016  •  Joshua Beemster
We’re pleased to announce the 0.4.0 release of Snowplow’s DAG running tool Factotum! This release centers around making DAGs safer to run on distributed clusters by constraining the run to a specific host. In the rest of this post we will cover: Constraining job runs Downloading and running Factotum Roadmap Contributing 1. Constraining job runs This release adds the ability to constrain your DAG’s execution to a single host. This allows for job distribution to...

Snowplow 86 Petra released

20 December 2016  •  Anton Parkhomenko
We are pleased to announce the release of Snowplow 86 Petra. This release introduces additional event de-duplication functionality for our Redshift load process, plus a brand new data model that makes it easier to get started with web data. This release also adds support for AWS’s newest regions: Ohio, Montreal and London. Having exhausted the bird population, we needed a new set of names for our Snowplow releases. We have decided to name this release...

SQL Runner 0.5.0 released

12 December 2016  •  Joshua Beemster
We are pleased to announce version 0.5.0 of SQL Runner. This release adds some powerful new features, including local and Consul-based remote locking to ensure that SQL Runner runs your playbooks as singletons. Locking your run Checking and deleting locks Running a single query Other changes Upgrading Getting help 1. Locking your run This release adds the ability to lock your run - this ensures that you cannot accidentally start another job whilst one is...

Snowplow 85 Metamorphosis released with beta Apache Kafka support

15 November 2016  •  Alex Dean
We are pleased to announce the release of Snowplow 85 Metamorphosis. This release brings initial beta support for using Apache Kafka with the Snowplow real-time pipeline, as an alternative to Amazon Kinesis. Metamorphosis is one of Franz Kafka’s most famous books, and an apt codename for this release, as our first step towards an implementation of the full Snowplow platform that can be run off the Amazon cloud, on-premise. (We’ll come up with a new...

Factotum 0.3.0 released with webhooks

07 November 2016  •  Ed Lewis
We’re pleased to announce the 0.3.0 release of Snowplow’s DAG running tool Factotum! This release centers around making DAGs easier to create, monitor and reason about, including adding outbound webhooks to Factotum. In the rest of this post we will cover: Improving the workflow when creating DAGs Improving job monitoring using webhooks Behaviors on task failure Extras Downloading and running Factotum Roadmap Contributing 1. Improving the workflow when creating DAGs We’ve decided that to separate...

Snowplow Python Tracker 0.8.0 released

12 October 2016  •  Yali Sassoon
We are delighted to release version 0.8.0 of the Snowplow Python Tracker, for tracking events from your Python apps, services and games. This release adds Python 3.4-5 support, 10 new event types and much richer timestamp support. Read on for: Python 3.4 and 3.5 support First class support for 10 new event types Support for true timestamps and device sent timestamps Updated API for sending self-describing events Other changes Huge thanks to Snowplow user Adam...

Snowplow 84 Steller's Sea Eagle released with Elasticsearch 2.x support

08 October 2016  •  Joshua Beemster
We are pleased to announce the release of Snowplow 84 Steller’s Sea Eagle. This release brings support for Elasticsearch 2.x to the Kinesis Elasticsearch Sink for both Transport and HTTP clients. Elasticsearch 2.x support Elasticsearch Sink buffer Override the network_id cookie with nuid param Hardcoded cookie path Migrating Redshift assets to Iglu Central Other changes Upgrading Roadmap Getting help 1. Elasticsearch 2.x support This release brings full support for Elasticsearch 2.x for both the HTTP...

Iglu 6 Ceres released with significant updates to Igluctl

07 October 2016  •  Yali Sassoon
We are pleased to announce a new Iglu release with some significant updates to Igluctl - our Iglu command-line tool. Read on for more information on Release 6 Ceres, named after the first postage stamp release in France: New option to lint schemas to a higher standard Publish schemas and jsonpath files to S3 Other updates 1. New option to lint schemas to a higher standard Snowplow users will define JSON Schemas for event and...

Kinesis Tee 0.1.0 released for Kinesis stream filtering and transformation

03 October 2016  •  Ed Lewis
We are pleased to announce the release of version 0.1.0 of Kinesis Tee. Kinesis Tee is like Unix tee, but for Kinesis streams. You can use it to: Write a Kinesis stream to another Kinesis stream (in the same region, or a different AWS account/region) Transform the format of a Kinesis stream Filter records from a Kinesis stream based on JavaScript rules In the rest of this post we will cover: Introducing Kinesis Tee Example:...

Introducing Sauna, a decisioning and response platform

22 September 2016  •  Alex Dean
It’s not every day that we get to announce an all-new category of software product here at Snowplow: we are hugely excited to be releasing version 0.1.0 of Sauna, our new open-source decisioning and response platform. Our Snowplow platform is about enabling you, as a business, to track and capture events across all your different channels, in granular detail, in a data warehouse, so you can build intelligence on that data. The data that flows...

Snowplow 83 Bald Eagle released with SQL Query Enrichment

06 September 2016  •  Anton Parkhomenko
We are pleased to announce the release of Snowplow 83 Bald Eagle. This release introduces our powerful new SQL Query Enrichment, long-awaited support for the EU Frankfurt AWS region, plus POST support for our Iglu webhook adapter. SQL Query Enrichment Support for eu-central-1 (Frankfurt) POST support for the Iglu webhook adapter Other improvements Upgrading Roadmap Getting help 1. SQL Query Enrichment The SQL Query Enrichment lets us perform dimension widening on an incoming Snowplow event...

Snowplow Android Tracker 0.6.0 released with automatic crash tracking

29 August 2016  •  Joshua Beemster
We are pleased to announce the release of the Snowplow Android Tracker version 0.6.0. This is our first mobile tracker release featuring automated event tracking, in the form of uncaught exceptions and lifecycle events. The Tracker has also undergone a great deal of refactoring to simplify its codebase and make it easier to use. This release post will cover the following topics: Uncaught exception tracking Lifecycle event tracking Removing RxJava Singleton setup Client session updates...

Snowplow Ruby Tracker 0.6.0 released

17 August 2016  •  Ed Lewis
We are pleased to announce the release of version 0.6.0 of the Snowplow Ruby Tracker. This release introduces true timestamp support, and marks the end of our support for Ruby 1.9.3. Read on for more detail on: True timestamp support Device-sent timestamp support Self describing events Upgrading Getting help 1. True timestamp support True timestamps in Snowplow are a way to indicate that you really trust the time given as accurate; this is particularly useful...

Snowplow 82 Tawny Eagle released with Kinesis Elasticsearch Service support

08 August 2016  •  Joshua Beemster
We are happy to announce the release of Snowplow 82 Tawny Eagle! This release updates the Kinesis Elasticsearch Sink with support for sending events via HTTP, allowing us to now support Amazon Elasticsearch Service. Kinesis Elasticsearch Sink Distribution changes Upgrading Getting help 1. Kinesis Elasticsearch Sink This release adds support to the Kinesis pipeline for loading of an Elasticsearch cluster over HTTP. This allows Snowplow to now load Amazon Elasticsearch Service, which only supports interaction...

Snowplow Tracking CLI 0.1.0 released

04 August 2016  •  Ronny Yabar
We are pleased to announce the first release of the Snowplow Tracking CLI! This is a command-line application (written in Golang) to make it fast and easy to send an event to Snowplow directly from the command line. You can use the app to embed Snowplow tracking directly into your shell scripts. In the rest of this post we will cover: How to install the app How to use the app Examples Under the hood...

Iglu Schema Registry 5 Scinde Dawk released

31 July 2016  •  Anton Parkhomenko
We are pleased to announce the fifth release of the Iglu Schema Registry System, with an initial release of igluctl - an Iglu command-line tool and Schema DDL as part of the Iglu project. Read on for more information on Release 5 Scinde Dawk, named after the first postage stamp in Asia: igluctl Schema DDL Migration guide Iglu roadmap Getting help 1. igluctl The main feature of this release is our new igluctl command-line application, which...

Snowplow C++ Tracker 0.1.0 released

23 June 2016  •  Ed Lewis
We are pleased to announce the release of the Snowplow C++ Tracker. The Tracker is designed to work asynchronously and dependency-free within your C++ code to provide great performance in your applications, games and servers, even under heavy load, while also storing all of your events persistently allowing recovery from temporary network outages. In the rest of this post we will cover: How to install the tracker How to use the tracker Core features Roadmap...

Snowplow 81 Kangaroo Island Emu released

16 June 2016  •  Fred Blundun
We are happy to announce the release of Snowplow 81 Kangaroo Island Emu! At the heart of this release is the Hadoop Event Recovery project, which allows you to fix up Snowplow bad rows and make them ready for reprocessing. Hadoop Event Recovery Stream Enrich race condition New schemas Upgrading Getting help 1. Hadoop Event Recovery In April 2014 we released Scala Hadoop Bad Rows as part of Snowplow 0.9.2. This was a simple project...

Factotum 0.2.0 released

13 June 2016  •  Ed Lewis
We are pleased to announce release 0.2.0 of Snowplow’s DAG running tool, Factotum. This release introduces variables for jobs and the ability to start jobs from a given task. In the rest of this post we will cover: Job configuration variables Starting a job from a given task Output improvements Downloading and running Factotum Roadmap Contributing 1. Job configuration variables Jobs often contain per-run information such as a target hostname or IP address. In Factotum...

Snowplow 80 Southern Cassowary released

30 May 2016  •  Fred Blundun
Snowplow 80 Southern Cassowary is now available! This is a real-time pipeline release which improves stability and brings the real-time pipeline up-to-date with our Hadoop pipeline. The latest Common Enrich Exiting on error Configurable maxRecords Changes to logging Continuous deployment Other improvements Upgrading Getting help The latest Common Enrich This version of Stream Enrich uses the latest version of Scala Common Enrich, the library containing Snowplow’s core enrichment logic. Among other things, this means that...

Iglu Schema Registry 4 Epaulettes released

22 May 2016  •  Anton Parkhomenko
We are pleased to announce the fourth release of the Iglu Schema Registry System, with an initial release of the Iglu Core library, implemented in Scala. Read on for more information on Release 4 Epaulettes, named after the famous Belgian postage stamps: Scala Iglu Core Registry Syncer updates Iglu roadmap Getting help 1. Scala Iglu Core Why we created Iglu Core Our initial development of Iglu two years ago was a somewhat piecemeal process. The...

Introducing Avalanche for load-testing Snowplow

20 May 2016  •  Joshua Beemster
We are pleased to announce the very first release of Avalanche, the Snowplow load-testing project. As the Snowplow platform matures and is adopted more and more widely, understanding how Snowplow performs under various event scales and distributions becomes increasingly important. Our new open-source Avalanche project is our attempt to create a standardized framework for testing Snowplow batch and real-time pipelines under various loads. It will hopefully also expand our and the community’s knowledge on what...

Snowplow Python Analytics SDK 0.1.0 released

17 May 2016  •  Fred Blundun
Following in the footsteps of the Snowplow Scala Analytics SDK, we are happy to announce the release of the Snowplow Python Analytics SDK! This library makes your Snowplow enriched events easier to work with in Python-compatible data processing frameworks such as Apache Spark and AWS Lambda. Some good use cases for the SDK include: Performing event data modeling in PySpark as part of our Hadoop batch pipeline Developing machine learning models on your event data using...

Snowplow Scala Tracker 0.3.0 released

14 May 2016  •  Anton Parkhomenko
We are pleased to release version 0.3.0 of the Snowplow Scala Tracker. This release introduces a user-settable “true timestamp”, as well as several bug fixes. In the rest of this post we will cover: True timestamp Availability on JCenter and Maven Central Minor updates and bug fixes Upgrading Roadmap Getting help 1. True timestamp Last year we published the blog post Improving Snowplow’s understanding of time, which introduced a new tracker parameter, true_tstamp. This parameter...

Snowplow 79 Black Swan with API Request Enrichment released

12 May 2016  •  Anton Parkhomenko
We are pleased to announce the release of Snowplow 79 Black Swan. This appropriately-named release introduces our powerful new API Request Enrichment, plus a new HTTP Header Extractor Enrichment and several other improvements on the enrichments side. API Request Enrichment HTTP Header Extractor Enrichment Iglu client update Other improvements Upgrading Roadmap Getting help 1. API Request Enrichment The API Request Enrichment lets us perform dimension widening on an incoming Snowplow event using any internal or...

Snowplow Golang Tracker 0.1.0 released

24 April 2016  •  Joshua Beemster
We are pleased to announce the release of the Snowplow Golang Tracker. The Tracker is designed to work asynchronously within your Golang code to provide great performance in your applications and servers, even under heavy load, while also storing all of your events persistently in the event of network failure. It will also be used as a building block for a number of projects, including a new daemon to support robust asynchronous sending for the...

Introducing Factotum data pipeline runner

09 April 2016  •  Ed Lewis
We are pleased to announce the release of Factotum, a new open-source system for the execution of data pipeline jobs. Pipeline orchestration is a common problem faced by data teams, and one which Snowplow has discussed in the past. As part of the Snowplow Managed Service we operate numerous data pipelines for customers, with many pipelines including customer-specific event data modeling. As we started to outgrow our existing Make-based solution, we reviewed many job...

Introducing Snowplow Mini

08 April 2016  •  Joshua Beemster
We’ve built Snowplow for robustness, scalability and flexibility. We have not built Snowplow for ease of use or ease of setup. Nor has the Snowplow Batch Pipeline been built for speed: you might have to wait several hours from sending an event before you can view and analyze that event data in Redshift. There are occasions when you might want to work with Snowplow in an easier, faster way. Two common examples are: New users...

Schema Guru 0.6.0 released with SQL migrations support

07 April 2016  •  Anton Parkhomenko
We are pleased to announce the release of Schema Guru 0.6.0, with long-awaited initial support for database migrations in SQL. This release is an important step in allowing Iglu users to easily and safely upgrade Redshift table definitions as they evolve their underlying JSON Schemas. This release post will cover the following topics: Introducing migrations Redshift migrations in Schema Guru New --force flag Minor CLI changes Upgrading Getting help Plans for future releases 1. Introducing...

Snowplow Scala Analytics SDK 0.1.0 released

23 March 2016  •  Alex Dean
We are pleased to announce the release of our first analytics SDK for Snowplow, created for data engineers and data scientists working with Snowplow in Scala. The Snowplow Analytics SDK for Scala lets you work with Snowplow enriched events in your Scala event processing, data modeling and machine-learning jobs. You can use this SDK with Apache Spark, AWS Lambda, Apache Flink, Scalding, Apache Samza and other Scala-compatible data processing frameworks. Some good use cases for...

Google Accelerated Mobile Pages adds support for Snowplow

19 March 2016  •  Alex Dean
We are pleased to announce that Google’s Accelerated Mobile Pages Project (AMPP or AMP) now supports Snowplow. AMP is an open source initiative led by Google to improve the mobile web experience by optimizing web pages for mobile devices. As of this week, Snowplow is natively integrated in the project, so pages optimized with AMP HTML can be tracked in Snowplow by adding the appropriate amp-analytics tag to your pages. Read on after the fold...

Snowplow 78 Great Hornbill released

15 March 2016  •  Fred Blundun
We are pleased to announce the immediate availability of Snowplow 78 Great Hornbill! This release brings our Kinesis pipeline functionally up-to-date with our Hadoop pipeline, and makes various further improvements to the Kinesis pipeline. Access to the latest Common Enrich version Click redirect mode Configurable cookie name Randomized partition keys Kinesis Elasticsearch Sink: increased flexibility New format for bad rows Kinesis Client Library upgrade Renaming Scala Kinesis Enrich to Stream Enrich Other improvements Upgrading Getting...

Iglu JSON Schema Registry 3 Penny Black released

04 March 2016  •  Fred Blundun
We are excited to announce the immediate availability of a new version of Iglu, incorporating a release of the Swagger-powered Scala Repo Server. Iglu has existed as a project at Snowplow for over two years now: after a period of relative quiet, we have an ambitious release schedule for Iglu planned for 2016, starting with this release. To reflect the growing importance of Iglu, and the number of moving parts within the platform, we will...

Snowplow JavaScript Tracker 2.6.0 released with Optimizely and Augur integration

03 March 2016  •  Joshua Beemster
We are excited to announce the release of version 2.6.0 of the Snowplow JavaScript Tracker! This release brings turnkey Optimizely and Augur.io integration, so you can automatically grab A/B testing data (from Optimizely) and device and user recognition data (from Augur) with the events you track with the JavaScript Tracker. In addition, we have rolled out support for Enhanced Ecommerce tracking, improved domain management and better handling of time! Read on to find out more…...

Snowplow 77 Great Auk released with EMR 4.x series support

28 February 2016  •  Fred Blundun
Snowplow release 77 Great Auk is now available! This release focuses on the command-line applications used to orchestrate Snowplow, bringing Snowplow up-to-date with the new 4.x series of Elastic MapReduce releases. Elastic MapReduce AMI 4.x series compatibility Moving towards running Storage Loader on Hadoop Retrying the job in the face of bootstrap failures Monitoring improvements Removal of snowplow-emr-etl-runner.sh and snowplow-storage-loader.sh Bug fixes and other improvements Upgrading Roadmap Getting help 1. Elastic MapReduce AMI 4.x series...

Schema Guru 0.5.0 released

11 February 2016  •  Anton Parkhomenko
We are pleased to announce the releases of Schema Guru 0.5.0 and Schema DDL 0.3.0, with JSON Schema and Redshift DDL processing enhancements and several bug fixes. This release post will cover the following topics: More git-friendly DDL files Added Java interoperability Fixed DDL file version bug Improvements in Schema-to-DDL transformation Upgrading Getting help Plans for future releases 1. More git-friendly DDL files Usually Schema Guru users store their DDL files along with their JSON...

Snowplow 76 Changeable Hawk-Eagle released

26 January 2016  •  Alex Dean
We are pleased to announce the release of Snowplow 76 Changeable Hawk-Eagle. This release introduces an event de-duplication process which runs on Hadoop, and also includes an important bug fix for our recent SendGrid webhook support (#2328). Here are the sections after the fold: Event de-duplication in Hadoop Shred SendGrid webhook bug fix Upgrading Roadmap and contributing Getting help 1. Event de-duplication in Hadoop Shred 1.1 Event duplicates 101 Duplicate events are an unfortunate fact...

Snowplow Objective-C Tracker 0.6.0 released

18 January 2016  •  Joshua Beemster
We are pleased to release version 0.6.0 of the Snowplow Objective-C Tracker. This release refactors the event tracking API, introduces tvOS support and fixes an important bug with client sessionization (#257). Many thanks to community member Jason for his contributions to this release! In the rest of this post we will cover: Event batching Event creation API updates Geolocation context iOS 9.0 and Xcode 7 changes tvOS support Demonstration app Other changes Upgrading Getting help...

Snowplow 75 Long-Legged Buzzard released with support for Urban Airship Connect and SendGrid

02 January 2016  •  Ed Lewis
We are pleased to announce the immediate availability of Snowplow 75 Long-Legged Buzzard. This release lets you warehouse the event streams generated by Urban Airship and SendGrid, and also updates our web-recalculate data model. The new webhook integrations are as follows: Urban Airship - for tracking mobile app-related events from Urban Airship using the new Urban Airship Connect product SendGrid - for tracking email-related events delivered by SendGrid via SendGrid webhooks Here are the sections...

Snowplow 74 European Honey Buzzard with Weather Enrichment released

22 December 2015  •  Anton Parkhomenko
We are pleased to announce the release of Snowplow 74 European Honey Buzzard. This release adds a Weather Enrichment to the Hadoop pipeline - making Snowplow the first event analytics platform with built-in weather analytics! The rest of this post will cover the following topics: Introducing the weather enrichment Configuring the weather enrichment Upgrading Getting help Upcoming releases 1. Introducing the weather enrichment Snowplow has a steadily growing collection of configurable event enrichments -...

Scala Weather 0.1.0 released

13 December 2015  •  Anton Parkhomenko
We are pleased to announce the release of Scala Weather version 0.1.0. Scala Weather is a high-performance Scala library for fetching historical, forecast and current weather data from the OpenWeatherMap API. We are pleased to be working with OpenWeatherMap.org, Snowplow’s third external data provider after MaxMind and Open Exchange Rates. This release post will cover the following topics: Why we wrote this library Usage The cache client Getting help Plans for next release 1. Why...

Snowplow 73 Cuban Macaw released

04 December 2015  •  Fred Blundun
Snowplow release 73 Cuban Macaw is now generally available! This release adds the ability to automatically load bad rows from the Snowplow Elastic MapReduce jobflow into Elasticsearch for analysis, and formally separates the Snowplow enriched event format from the TSV format used to load Redshift. The rest of this post will cover the following topics: Loading bad rows into Elasticsearch Changes to the event format loaded into Redshift and Postgres Improved Hadoop job performance Better...

SQL Runner 0.4.0 released

03 December 2015  •  Joshua Beemster
We are pleased to announce version 0.4.0 of SQL Runner. SQL Runner is an open source app, written in Go, that makes it easy to execute SQL statements programmatically as part of a Snowplow data pipeline. This release adds some powerful new features to SQL Runner - many thanks to community member Alessandro Andrioni for his huge contributions towards yet another release! Consul support Dry run mode Environment variables template function File loading order Upgrading...

Schema Guru 0.4.0 with Apache Spark support released

17 November 2015  •  Anton Parkhomenko
We are pleased to announce the release of Schema Guru version 0.4.0 with Apache Spark support, new features in both schema and ddl subcommands, bug fixes and other enhancements. In support of this, we have also released version 0.2.0 of the schema-ddl library, with Scala 2.11 support, Amazon Redshift COMMENT ON and a more precise schema-to-DDL transformation algorithm. This release post will cover the following topics: Apache Spark support Predefined enumerations Comments on Redshift table...

SQL Runner 0.3.0 released

05 November 2015  •  Joshua Beemster
We are pleased to announce version 0.3.0 of SQL Runner. SQL Runner is an open source app, written in Go, that makes it easy to execute SQL statements programmatically as part of a Snowplow data pipeline. This release adds some powerful new features to SQL Runner - many thanks to community member Alessandro Andrioni for his huge contributions towards this release! For the first time, we are also publishing SQL Runner binaries for Windows and...

Iglu Objective-C Client 0.1.0 released

19 October 2015  •  Joshua Beemster
We are pleased to announce the release of version 0.1.0 of the Iglu Objective-C Client. This is the second Iglu client to be released (following the Iglu Scala Client) and will allow you to test and validate all of your Snowplow self-describing JSONs directly in your OS X and iOS applications. The rest of this post will cover the following topics: How to install the client How to use the client Why you should use...

Snowplow 72 Great Spotted Kiwi released

15 October 2015  •  Alex Dean
We are pleased to announce the release of Snowplow version 72 Great Spotted Kiwi. This release adds the ability to track clicks through the Snowplow Clojure Collector, adds a cookie extractor enrichment and introduces new deduplication queries leveraging R71’s event fingerprint. The rest of this post will cover the following topics: Click tracking New cookie extractor enrichment New deduplication queries Upgrading Getting help Upcoming releases 1. Click tracking Although the Snowplow JavaScript Tracker offers link...

Snowplow Scala Tracker 0.2.0 released

14 October 2015  •  Anton Parkhomenko
We are pleased to release version 0.2.0 of the Snowplow Scala Tracker. This release introduces a new custom context with EC2 instance metadata, a batch-based emitter, new tracking methods and one breaking API change. In the rest of this post we will cover: EC2 custom context Batch emitter New track methods Device sent timestamp Other updates Bug fixes Upgrading Getting help 1. EC2 custom context On any AWS EC2 instance, you can access basic information...

Snowplow Node.js Tracker 0.2.0 released

09 October 2015  •  Fred Blundun
Version 0.2.0 of the Snowplow Node.js Tracker is now available! This release changes the Tracker’s architecture and adds the ability to send Snowplow events via either GET or POST. Read on for more information… Emitters Vagrant quickstart Getting help 1. Emitters This release brings the Node.js Tracker’s API closer to those of other trackers with the addition of Emitters, objects which control how and when the events created by the Tracker are sent to the...

Snowplow Unity Tracker 0.1.0 released

08 October 2015  •  Joshua Beemster
We are pleased to announce the release of our much-requested Snowplow Unity Tracker. This Tracker rounds out our support for popular mobile environments, and is an important part of our analytics offering for videogame companies. The Tracker is designed to work completely asynchronously within your Unity code to provide great performance in your games, even under heavy load. In the rest of this post we will cover: How to install the tracker How to use...

Snowplow 71 Stork-Billed Kingfisher released

02 October 2015  •  Fred Blundun
We are pleased to announce the release of Snowplow version 71 Stork-Billed Kingfisher. This release significantly overhauls Snowplow’s handling of time and introduces event fingerprinting to support deduplication efforts. It also brings our validation of unstructured events and custom context JSONs “upstream” from our Hadoop Shred process into our Hadoop Enrich process. The rest of this post will cover the following topics: Better handling of event time JSON validation in Scala Common Enrich New unstructured...

Samza Scala example project released

30 September 2015  •  Alex Dean
We are pleased to announce the release of our new Samza Scala Example Project! This is a simple stream processing job written in Scala for the Apache Samza framework, processing JSON events from an Apache Kafka topic and regularly emitting aggregates to a second Kafka topic. This project was built by the Data Engineering team at Snowplow Analytics as a proof-of-concept for porting the Snowplow Enrichment process (which is written in Scala) to Samza. Read...

Snowplow Java Tracker 0.8.0 released

14 September 2015  •  Joshua Beemster
We are pleased to release version 0.8.0 of the Snowplow Java Tracker. This release introduces several performance upgrades and a complete rework of the API. Many thanks to David Stendardi from Viadeo for his contributions! In the rest of this post we will cover: API updates Emitter changes Performance Changing the Subject Other improvements Upgrading Documentation Getting help 1. API updates This release introduces a host of API changes to make the Tracker more modular...

SQL Runner 0.2.0 released

13 September 2015  •  Alex Dean
We are pleased to announce version 0.2.0 of SQL Runner. SQL Runner is an open source app, written in Go, that makes it easy to execute SQL statements programmatically as part of the Snowplow data pipeline. To use SQL Runner, you assemble a playbook i.e. a YAML file that lists the different .sql files to be run and the database they are to be run against. It is possible to specify which sequence the files...

Snowplow Objective-C Tracker 0.5.0 released

03 September 2015  •  Joshua Beemster
We are pleased to release version 0.5.0 of the Snowplow Objective-C Tracker. This release introduces client sessionization, several performance upgrades and some breaking API changes. In the rest of this post we will cover: Client sessionization Tracker performance Event decoration API changes Demonstration app Other changes Upgrading Getting help 1. Client sessionization This release lets you add a new client_session context to each of your Snowplow events, allowing you to easily group events from a...

Huskimo 0.3.0 released: warehouse your Twilio telephony data in Redshift

30 August 2015  •  Alex Dean
We are pleased to announce the release of Huskimo 0.3.0, for companies who use Twilio and would like to analyze their telephony data in Amazon Redshift, alongside their Snowplow event data. For readers who missed our Huskimo introductory post: Huskimo is a new open-source project which connects to third-party SaaS platforms (Singular and now Twilio), exports their data via API, and then uploads that data into your Redshift instance. Huskimo is a complement to Snowplow’s...

Kinesis S3 0.4.0 released with gzip support

26 August 2015  •  Joshua Beemster
We are pleased to announce the release of Kinesis S3 version 0.4.0. Many thanks to Kacper Bielecki from Avari for his contribution to this release! Table of contents: gzip support Infinite loops Safer record batching Bug fixes Upgrading Getting help 1. gzip support Kinesis S3 now supports gzip as a second storage/compression option for the files it writes out to S3. Using this format, each record is treated as a byte array containing a UTF-8...

AWS Lambda Scala example project released

20 August 2015  •  Vincent Ohprecio
We are pleased to announce the release of our new AWS Lambda Scala Example Project! This is a simple time series analysis stream processing job written in Scala for AWS Lambda, processing JSON events from Amazon Kinesis and writing aggregates to Amazon DynamoDB. AWS Lambda can help you jumpstart your own real-time event processing pipeline, without having to set up and manage clusters of server infrastructure. We will take you through the steps to get this...

Snowplow 70 Bornean Green Magpie released

19 August 2015  •  Fred Blundun
We are happy to announce the release of Snowplow version 70 Bornean Green Magpie. This release focuses on improving our StorageLoader and EmrEtlRunner components and is the first step towards combining the two into a single CLI application. The rest of this post will cover the following topics: Combined configuration Move to JRuby Improved retry logic App monitoring with Snowplow Compression support Loading Postgres via stdin Multiple in buckets New safety checks Other changes Upgrading...

Snowplow Objective-C Tracker 0.4.0 released

16 August 2015  •  Joshua Beemster
We are pleased to release version 0.4.0 of the Snowplow Objective-C Tracker. Many thanks to Alex Denisov from Blacklane, James Duncan Davidson from Wunderlist, Agarwal Swapnil and Hao Lian for their huge contributions to this release! In the rest of this post we will cover: Tracker performance Emitter callback Static library Demonstration app Other changes Upgrading Getting help 1. Tracker performance This release brings a complete rework of how the tracker sends events to address...

Snowplow Ruby Tracker 0.5.0 released

11 August 2015  •  Fred Blundun
We are happy to announce the release of version 0.5.0 of the Snowplow Ruby Tracker. As well as making the Tracker more robust, this release introduces several breaking API changes. Read on for more detail on: Improved concurrency More robust error handling The SelfDescribingJson class New setFingerprint method Upgrading Getting help 1. Improved concurrency The Ruby Tracker’s AsyncEmitter class now uses the Queue class to implement the producer-consumer pattern, where a fixed pool of threads...

Snowplow Python Tracker 0.7.0 released

07 August 2015  •  Fred Blundun
We are pleased to announce the release of version 0.7.0 of the Snowplow Python Tracker. This release is focused on making the Tracker more robust. The rest of this post will cover: Better concurrency Better error handling The SelfDescribingJson class Unicode support Upgrading Getting help 1. Better concurrency The Python Tracker’s AsyncEmitter now uses the Queue class to implement the producer-consumer pattern where a fixed pool of threads work on sending events. Reusing threads this...

Schema Guru 0.3.0 released for generating Redshift tables from JSON Schemas

29 July 2015  •  Anton Parkhomenko
We are pleased to announce the release of Schema Guru 0.3.0 and Schema DDL 0.1.0, our tools to work with JSON Schemas. This release post will cover the following new topics: Meet the Schema DDL library Commands and CLI changes Overview of the ddl command ddl command for Snowplow users Advanced options for ddl command Upgrading Getting help Plans for next release 1. Meet the Schema DDL library Schema DDL is a new Scala library...

Snowplow Android Tracker 0.5.0 released

28 July 2015  •  Joshua Beemster
We are pleased to announce the release of the Snowplow Android Tracker version 0.5.0. The Tracker has undergone a series of performance improvements, plus the addition of client-side sessionization. This release post will cover the following topics: Client-side sessionization Tracker performance Event building Other changes Demo app Documentation Getting help 1. Client-side sessionization This release lets you add a new client_session context to each of your Snowplow events, allowing you to easily group events from...

Snowplow 69 Blue-Bellied Roller released with new and updated SQL data models

24 July 2015  •  Christophe Bogaert
We are pleased to announce the release of Snowplow 69, Blue-Bellied Roller, which contains new and updated SQL data models. The blue-bellied roller is a beautiful African bird that breeds in a narrow belt from Senegal to the northeast of the Congo. It has a dark green back, a white head, neck and breast, and a blue belly and tail. This post covers: Updated data model: incremental New data model: mobile New data model: deduplicate...

Snowplow 68 Turquoise Jay released

23 July 2015  •  Fred Blundun
We are happy to announce the release of Snowplow 68, Turquoise Jay. This is a small release which adapts the EmrEtlRunner to use the new Elastic MapReduce API. Table of contents: Updates to the Elastic MapReduce API Multiple “in” buckets Backwards compatibility with old Hadoop Enrich versions Upgrading Getting help 1. Updates to the Elastic MapReduce API The Snowplow EmrEtlRunner uses Rob Slifka’s Elasticity Ruby library to interact with the Elastic MapReduce API. AWS recently...

Snowplow JavaScript Tracker 2.5.0 released

22 July 2015  •  Fred Blundun
We are excited to announce the release of version 2.5.0 of the Snowplow JavaScript Tracker! Among other things, this release adds new IDs for sessions and pageviews, making rich in-page and in-session analytics easier. Read on for more information: The session ID The page view ID Context-generating functions New Grunt task Breaking change to trackPageView Breaking change to session cookie timeouts Upgrading Documentation and help 1. The session ID In April, Snowplow Release 63 Red-Cheeked...

Snowplow 67 Bohemian Waxwing released

13 July 2015  •  Joshua Beemster
We are pleased to announce the release of Snowplow 67, Bohemian Waxwing. This release brings a host of upgrades to our real-time Amazon Kinesis pipeline as well as the embedding of Snowplow tracking into this pipeline. Table of contents: Embedded Snowplow tracking Handling outsized event payloads More informative bad rows Improved Vagrant VM New Kinesis S3 repository Other changes Upgrading Getting help 1. Embedded Snowplow tracking Both Scala Kinesis Enrich and Kinesis Elasticsearch Sink now...

AWS Lambda Node.js example project released

11 July 2015  •  Vincent Ohprecio
We are pleased to announce the release of our new AWS Lambda Node.js Example Project! This is a simple time series analysis stream processing job written in Node.js for AWS Lambda, processing JSON events from Amazon Kinesis and writing aggregates to Amazon DynamoDB. AWS Lambda can help you jumpstart your own real-time event processing pipeline, without having to set up and manage clusters of server infrastructure. We will take you through the steps to get...

Kinesis S3 0.3.0 released

07 July 2015  •  Joshua Beemster
We are pleased to announce the release of Kinesis S3 version 0.3.0. This release greatly improves the speed, efficiency, and reliability of Snowplow’s real-time S3 sink for Kinesis streams. Table of contents: Embedded Snowplow tracking Optimization and efficiency More informative bad rows Improved Vagrant VM Other changes Upgrading Getting help 1. Embedded Snowplow tracking This release brings with it the ability to record Snowplow events from within the sink application itself. These events include a...

Schema Guru 0.2.0 released with brand-new web UI and support for self-describing JSON Schema

05 July 2015  •  Anton Parkhomenko
Almost a month has passed since the first release of Schema Guru, our tool for deriving JSON Schemas from multiple JSON instances. That release was something of a proof-of-concept - in this 0.2.0 release we are adding much richer functionality, plus deeper integration with the Snowplow platform. This release post will cover the following new features: Web UI Newline-delimited JSON Duplicated keys warning Base64 pattern Enums Schema segmentation Self-describing schemas Upgrading Getting help Plans for...

Snowplow Android Tracker 0.4.0 released

22 June 2015  •  Joshua Beemster
We are pleased to announce the release of the fourth version of the Snowplow Android Tracker. The Tracker has undergone a series of changes in light of the issues around the Android dex limit, resulting in the library being split in two, allowing users to either use an RxJava-based version of the tracker, or a “classic” version using a standard Java threadpool. Big thanks to Duncan at Wunderlist for his work on splitting apart the...

Snowplow 66 Oriental Skylark released

16 June 2015  •  Alex Dean
We are pleased to announce the release of Snowplow 66, Oriental Skylark. This release upgrades our Hadoop Enrichment process to run on Hadoop 2.4, re-enables our Kinesis-Hadoop lambda architecture and also introduces a new scriptable enrichment powered by JavaScript - our most powerful enrichment yet! Table of contents: Our enrichment process on Hadoop 2.4 Re-enabled Kinesis-Hadoop lambda architecture JavaScript scripting enrichment Other changes Upgrading Getting help 1. Our enrichment process on Hadoop 2.4 Since the...

Apache Spark Streaming example project released

10 June 2015  •  Vincent Ohprecio
We are pleased to announce the release of our new Apache Spark Streaming Example Project! This is a simple time series analysis stream processing job written in Scala for the Spark Streaming cluster computing platform, processing JSON events from Amazon Kinesis and writing aggregates to Amazon DynamoDB. The Snowplow Apache Spark Streaming Example Project can help you jumpstart your own real-time event processing pipeline. We will take you through the steps to get this simple...

Schema Guru 0.1.0 released for deriving JSON Schemas from JSONs

03 June 2015  •  Anton Parkhomenko
We’re pleased to announce the first release of Schema Guru, a tool for automatically deriving JSON Schemas from a collection of JSON instances. This release is part of a new R&D focus at Snowplow Analytics in improving the tooling available around JSON Schema, a technology used widely in our own Snowplow and Iglu projects. Read on after the fold for: Why Schema Guru? Current features Design principles A fuller example Getting help Roadmap 1. Why...

Snowplow Scala Tracker 0.1.0 released

29 May 2015  •  Fred Blundun
We are pleased to announce the release of the new Snowplow Scala Tracker! This initial release allows you to build and send unstructured events and custom contexts using the json4s library. We plan to move Snowplow towards being “self-hosting” by sending Snowplow events from within our own apps for monitoring purposes; the idea is that you should be able to monitor the health of one deployment of Snowplow by using a second instance. We will...

Spark Example Project 0.3.0 released for getting started with Apache Spark on EMR

10 May 2015  •  Alex Dean
We are pleased to announce the release of our Spark Example Project 0.3.0, building on the original release of the project last year. This release is part of a renewed focus on the Apache Spark stack at Snowplow. In particular, we are exploring Spark’s applicability to two Snowplow-specific problem domains: Using Spark and Spark Streaming to implement r64 Palila-style data modeling outside of Redshift SQL Using Spark Streaming to deliver “analytics-on-write” realtime dashboards as part...

Snowplow 65 Scarlet Rosefinch released

08 May 2015  •  Fred Blundun
We are pleased to announce the release of Snowplow 65, Scarlet Rosefinch. This release greatly improves the speed, efficiency, and reliability of Snowplow’s real-time Kinesis pipeline. Table of contents: Enhanced performance CORS support Increased reliability Loading configuration from DynamoDB Randomized partition keys for bad streams Removal of automatic stream creation Improved Elasticsearch index initialization Other changes Upgrading Getting help 1. Enhanced performance Kinesis’ new PutRecords API enabled the biggest performance improvement: rather than sending events...

Snowplow 64 Palila released with support for data models

16 April 2015  •  Christophe Bogaert
We are excited to announce the immediate availability of Snowplow 64, Palila. This is a major release which adds a new data modeling stage to the Snowplow pipeline, as well as fixes a small number of important bugs across the rest of Snowplow. In this post, we will cover: Why model your Snowplow data? Understanding how the data modeling takes place The basic Snowplow data model Implementing the SQL Runner data model Implementing the Looker...

Snowplow 63 Red-Cheeked Cordon-Bleu released

02 April 2015  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 63, Red-Cheeked Cordon-Bleu. This is a major release which adds two new enrichments, upgrades existing enrichments and significantly extends and improves our Canonical Event Model for loading into Redshift, Elasticsearch and Postgres. The new and upgraded enrichments are as follows: New enrichment: parsing useragent strings using the ua_parser library New enrichment: converting the money amounts in e-commerce transactions into a base currency using Open Exchange...

Snowplow ActionScript 3 Tracker 0.1.0 released

23 March 2015  •  Alex Dean
We are pleased to announce the release of our new Snowplow ActionScript 3 Tracker, contributed by Snowplow customer Viewbix. This is Snowplow’s first customer-contributed tracker - an exciting milestone for us! Huge thanks to Dani, Ephraim, Mark and Nati and the rest of the team at Viewbix for making this tracker a reality. The Snowplow ActionScript 3.0 (AS3) Tracker supports ActionScript 3.0, and lets you add analytics to your Flash Player 9+, Flash Lite 4...

Snowplow 62 Tropical Parula released

17 March 2015  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 62, Tropical Parula. This release is designed to fix an incompatibility issue between r61’s EmrEtlRunner and some older Elastic Beanstalk configurations. It also includes some other EmrEtlRunner improvements. Many thanks to Snowplow community member Dani Solà from Simply Business for his contribution to this release! Fix to support legacy Beanstalk access logs Custom bootstrap actions Other improvements to EmrEtlRunner Upgrading Getting help 1. Fix to...

Snowplow JavaScript Tracker 2.4.0 released

15 March 2015  •  Fred Blundun
We are pleased to announce the release of version 2.4.0 of the Snowplow JavaScript Tracker! This release adds support for cross-domain tracking and a new method to track timing events. Read on for more information: Tracking users cross-domain Tracking timings Dynamic handling of single-page apps Improved PerformanceTiming context Other improvements Upgrading Documentation and help 1. Tracking users cross-domain Version 2.4.0 of the JavaScript Tracker adds support for tracking users cross-domain. When a user clicks on...

Snowplow JavaScript Tracker 2.3.0 released

03 March 2015  •  Fred Blundun
We are pleased to announce the release of version 2.3.0 of the Snowplow JavaScript Tracker! This release adds a number of new features including the ability to send events by POST rather than GET, some new contexts, and improved automatic form tracking. This blog post will cover the changes in detail. POST support Customizable form tracking Automatic contexts Development quickstart Other improvements Upgrading Documentation and getting help 1. POST support Until now, the JavaScript Tracker...

Snowplow 61 Pygmy Parrot released

02 March 2015  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 61, Pygmy Parrot. This release has a variety of new features, operational enhancements and bug fixes. The major additions are: You can now parse Amazon CloudFront access logs using Snowplow The latest Clojure Collector version supports Tomcat 8 and CORS, ready for cross-domain POST from JavaScript and ActionScript EmrEtlRunner’s failure handling and Clojure Collector log handling have been improved The rest of this post will...

Snowplow Android Tracker 0.3.0 released

18 February 2015  •  Joshua Beemster
We are pleased to announce the release of the third version of the Snowplow Android Tracker. The Tracker has undergone a series of changes including removing the dependency on the Java Core Library and a move towards using RxJava as a way of implementing asynchronous background tasks. Big thanks to Hamid at Trello for his suggestions and guidance in using Rx to track events on Android. Please note that version 0.3.0 of the Android Tracker...

Snowplow Objective-C Tracker 0.3.0 released

15 February 2015  •  Alex Dean
We are pleased to release version 0.3.0 of the Snowplow Objective-C Tracker. Many thanks to James Duncan Davidson and atdrendel from 6Wunderkinder, and former Snowplow intern Jonathan Almeida for their huge contributions to this release! In the rest of this post we will cover: Mac OS-X support New trackTimingWithCategory event Removed AFNetworking dependency Other API changes Upgrading Getting help 1. Mac OS-X support The team at 6Wunderkinder have added Mac OS X support to the...

Snowplow Python Tracker 0.6.0 released

14 February 2015  •  Fred Blundun
We are pleased to announce the release of version 0.6.0.post1 of the Snowplow Python Tracker. This version adds several methods to help identify users by adding client-side data to events. This makes the Tracker more powerful when used in conjunction with a web framework such as Django or Flask. The rest of this post will cover: set_ip_address set_useragent_user_id set_domain_user_id set_network_user_id Improved logging Upgrading and compatibility Other changes Getting help 1. set_ip_address The ip_address field in...

Snowplow 60 Bee Hummingbird released

03 February 2015  •  Fred Blundun
We are happy to announce the release of Snowplow 60! Our sixtieth release focuses on the Snowplow Kinesis flow, and includes: A new Kinesis “sink app” that reads the Scala Stream Collector’s Kinesis stream of raw events and stores these raw events in Amazon S3 in an optimized format An updated version of our Hadoop Enrichment process that supports as an input format the events stored in S3 by the new Kinesis sink app Together,...

Snowplow Java Tracker 0.7.0 released

24 January 2015  •  Alex Dean
We are pleased to release version 0.7.0 of the Snowplow Java Tracker. Many thanks to David Stendardi from Viadeo, former Snowplow intern Jonathan Almeida and Hamid from Trello for their contributions to this release! In the rest of this post we will cover: Architectural updates API updates Testing updates Upgrading the Java Tracker Documentation Getting help 1. Architectural updates Some Snowplow Java and Android Tracker users have reported serious performance issues running these trackers respectively...

Snowplow Ruby Tracker 0.4.1 released

06 January 2015  •  Fred Blundun
We are happy to announce the release of version 0.4.1 of the Snowplow Ruby Tracker. This is a bugfix release which resolves compatibility issues between the Ruby Tracker and the rest of the Snowplow data pipeline. Please note that version 0.2.0 of the Ruby Tracker is dependent upon Snowplow 0.9.14 for POST support; for more information please refer to the technical documentation. Read on for more detail on: POST request format fix Compatibility Getting help...

Snowplow PHP Tracker 0.2.0 released

05 January 2015  •  Joshua Beemster
We are pleased to announce the release of the second version of the Snowplow PHP Tracker. The tracker now supports a variety of synchronous, asynchronous and out-of-band event emitters for GET and POST requests. Please note that version 0.2.0 of the PHP Tracker is dependent upon Snowplow 0.9.14; for more information please refer to the technical documentation. This release post will cover the following topics: New emitters explained New client passthrough functions Debug mode added...

Snowplow 0.9.14 released with additional webhooks

31 December 2014  •  Alex Dean
We are pleased to announce the release of Snowplow 0.9.14, our 17th and final release of Snowplow for 2014! This release contains a variety of important bug fixes, plus support for three new event streams which can be loaded into your Snowplow event warehouse and unified log: Mandrill - for tracking email and email-related events delivered by Mandrill PagerDuty - for tracking incidents generated by PagerDuty Pingdom - for tracking site outages detected by Pingdom...

New Java and Android Tracker versions released

27 December 2014  •  Alex Dean
We are pleased to release new versions of the Snowplow Android Tracker (0.2.0) and the Snowplow Java Tracker (0.6.0), as well as the Java Tracker Core (0.2.0) that underpins both trackers. Many thanks to XiaoyiLI from Viadeo, Hamid from Trello and former Snowplow intern Jonathan Almeida for their contributions to these releases! In the rest of this post we will cover: Vagrant support Updates to Java Tracker Core Updates to the Java Tracker Updates to...

Snowplow JavaScript Tracker 2.2.0 released

15 December 2014  •  Fred Blundun
We are happy to announce the release of version 2.2.0 of the Snowplow JavaScript Tracker. This release improves the Tracker’s callback support, making it possible to access previously internal variables such as the tracker-generated user fingerprint and user ID. It also adds the option to disable the Tracker’s use of localStorage and first-party cookies. The rest of this blog post will cover the following topics: More powerful callbacks Disabling localStorage and cookies Non-integer offsets...

Snowplow 0.9.13 released with important bug fixes

01 December 2014  •  Fred Blundun
We are happy to announce the release of Snowplow 0.9.13 fixing two bugs found in last week’s release. Read on for more information. Safer URI parsing Fixed dependency conflict Upgrading Help 1. Safer URI parsing Version 0.9.12 used the Net-a-Porter URI library to fix up non-compliant URIs which initially failed validation. This made the enrichment process more forgiving of bad URIs. It also introduced a bug: exceptions thrown by the new step were not caught....

Snowplow 0.9.12 released with real-time loading of data into Elasticsearch beta

26 November 2014  •  Fred Blundun
Back in February, we introduced initial support for real-time event analytics using Amazon Kinesis. We are excited to announce the release of Snowplow 0.9.12, which significantly improves and extends our Kinesis support. The major new feature is our all-new Kinesis Elasticsearch Sink, which streams event data from Kinesis into Elasticsearch in real time. The data is then available to power real-time dashboards and analysis (e.g. using Kibana). In addition to enabling real-time loading of data...

Snowplow 0.9.11 released with support for webhooks

10 November 2014  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 0.9.11. For the first time, you can now use Snowplow to collect, store and analyze event streams generated by supported third-party software. Many Software-as-a-Service vendors publish their own internal event streams for customers to consume - these event stream APIs are often referred to as “webhooks”, sometimes as “streaming APIs”, “postbacks” or “HTTP response APIs”. Snowplow 0.9.11 adds first-class support for an initial set of...

Snowplow iOS Tracker 0.2.0 released

08 November 2014  •  Alex Dean
We are pleased to announce the release of version 0.2.0 of the Snowplow iOS Tracker. This is an important update which changes the Tracker’s approach to recording Apple’s Identifier For Advertisers (IFA). Apps that do not display advertisements are not allowed to access the IFA on an iOS device, and Apple will reject apps that attempt to do this. Unfortunately, the Snowplow iOS Tracker v0.1.x was configured to always record the IFA as part of...

Snowplow Ruby Tracker 0.4.0 released

07 November 2014  •  Fred Blundun
We are pleased to announce the release of version 0.4.0 of the Snowplow Ruby Tracker. This release adds several methods to help identify users using client-side data, making the Ruby Tracker much more powerful when used from a Ruby web or e-commerce framework such as Rails, Sinatra or Spree. The rest of this post will cover: set_ip_address set_useragent_user_id set_domain_user_id set_network_user_id Other changes Getting help 1. set_ip_address The ip_address field in the Snowplow event model is...

Snowplow JavaScript Tracker 2.1.1 released with new events

06 November 2014  •  Fred Blundun
We are delighted to announce the release of version 2.1.1 of the Snowplow JavaScript Tracker! This release contains a number of new features, most prominently several new unstructured events and a context for recording the browser’s PerformanceTiming. This blog post will cover the following topics: New events Page performance context Link content Tracker core integration Custom callbacks forceSecureTracker Outbound queue New example page Other improvements Upgrading Getting help 1. New events 1.1 Automatic form tracking...

Snowplow 0.9.10 released with support for new JavaScript Tracker v2.1.0 events

06 November 2014  •  Alex Dean
We are pleased to announce the release of Snowplow 0.9.10. This is a minimalistic release designed to support the new events and context of the Snowplow JavaScript Tracker v2.1.1, also released today. This release is primarily targeted at Snowplow users of Amazon Redshift who are upgrading to the latest Snowplow JavaScript Tracker (v2.1.0+). Here are the sections after the fold: New Redshift tables New JSON Path files A note on link_clicks Upgrading Documentation and help...

Snowplow 0.9.9 released with campaign attribution enrichment

27 October 2014  •  Fred Blundun
We are pleased to announce the release of Snowplow 0.9.9. This is primarily a comprehensive bug fix release, although it also adds the new campaign_attribution enrichment to our enrichment registry. Here are the sections after the fold: The campaign_attribution enrichment Clojure Collector fixes StorageLoader fixes EmrEtlRunner fixes and enhancements Hadoop Enrich fixes and enhancements Upgrading Documentation and help 1. The campaign_attribution enrichment Snowplow has five fields relating to campaign attribution: mkt_medium, mkt_source, mkt_term, mkt_content, and...

Snowplow PHP Tracker 0.1.0 released

30 September 2014  •  Joshua Beemster
We are pleased to announce the release of the first version of the Snowplow PHP Tracker. The tracker supports synchronous GET and POST requests. This introductory post will cover the following topics: Installation How to use the tracker Getting help 1. Installation The Snowplow PHP Tracker is published to Packagist, the central repository for Composer PHP packages. To add it to your project, add it as a requirement in your composer.json file: { "require": {...

Snowplow .NET Tracker 0.1.0 released

29 September 2014  •  Fred Blundun
We are pleased to announce the release of the first version of the Snowplow .NET Tracker. The tracker supports synchronous and asynchronous GET and POST requests and has an offline mode which stores unsent events using Message Queueing. This introductory post will cover the following topics: Installation How to use the tracker Features Logging Getting help 1. Installation The Snowplow .NET Tracker is published to NuGet, the .NET package manager. To add it to your...

Snowplow 0.9.8 released for mobile analytics

18 September 2014  •  Alex Dean
We are hugely excited to announce the release of the long-awaited Snowplow version 0.9.8, adding event analytics support for iOS and Android applications. Mobile event analytics has been the most requested feature from the Snowplow community for some time, with many users keen to feed their Snowplow data pipeline with events from mobile apps, alongside their existing websites and server software. Mobile event analytics is a major step in Snowplow’s journey from a web analytics...

Snowplow iOS Tracker 0.1.1 released

17 September 2014  •  Jonathan Almeida
We’re extremely excited to announce our initial release of the Snowplow iOS Tracker. Mobile trackers have been one of the Snowplow community’s most highly requested features, and we are very pleased to finally have this ready for release. The Snowplow iOS Tracker will allow you to track Snowplow events from your iOS applications and games. This release comes with many features you may already be familiar with in other Snowplow Trackers, along with a few...

Snowplow Android Tracker 0.1.1 released

17 September 2014  •  Jonathan Almeida
We are proud to release the Snowplow Android Tracker, one of the most requested Trackers so far. This is a major milestone for us, leveraging Snowplow 0.9.8 for mobile analytics support. The Android Tracker has evolved in tandem with the Java Tracker. We have based the Android Tracker on the same Java Tracker Core that powers the Java Tracker, along with a few additions, such as tracking geographical location, and sending mobile-specific context data. So...

Snowplow 0.9.7 released with important bug fixes

02 September 2014  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow version 0.9.7. 0.9.7 is a “tidy-up” release which fixes some important bugs, particularly: A bug in 0.9.5 onwards which was preventing events containing multiple JSONs from being shredded successfully (#939) Our Hive table definition falling behind Snowplow 0.9.6’s enriched event format updates (#965) A bug in EmrEtlRunner causing issues running Snowplow inside some VPC environments (#956) As well as these important fixes, 0.9.7 comes with...

Snowplow Ruby Tracker 0.3.0 released

29 August 2014  •  Fred Blundun
We are happy to announce the release of the Snowplow Ruby Tracker version 0.3.0. This version adds support for asynchronous requests and POST requests, and introduces the new Subject and Emitter classes. The rest of this post will cover: The Subject class The Emitter class Chainable methods Logging Contracts Other changes Upgrading Getting help 1. The Subject class An instance of the Subject class represents a user who is performing an event in the Subject-Verb-Direct...

Iglu release 2 with a new RESTful schema server

28 August 2014  •  Ben Fradet
We are pleased to announce the second release of Iglu, our machine-readable schema repository system for JSON Schema. If you are not familiar with what Iglu is, please read the blog post for the initial release of Iglu. Iglu release 2 introduces a new Scala-based repository server, allowing users to publish, test and serve schemas via an easy-to-use RESTful interface. This is a huge step forward compared to our current approach, which involves uploading schemas...

Snowplow Java Tracker 0.5.0 released

18 August 2014  •  Jonathan Almeida
We’re excited to announce another release of the Snowplow Java Tracker version 0.5.0. This release comes with a few changes to the Tracker method signatures to support our upcoming Snowplow 0.9.7 release with POST support, bug fixes, and more. Notably, we’ve added a new class for supporting your context data. I’ll be covering everything mentioned above in more detail: Project structure changes Collector endpoint changes for POST requests The SchemaPayload Class Emitter callback Configuring the...

Snowplow Python Tracker 0.5.0 released

13 August 2014  •  Fred Blundun
We are happy to announce the release of version 0.5.0 of the Snowplow Python Tracker! This release is focused mainly on synchronizing the Python Tracker’s support for POST requests with the rest of Snowplow, but also makes its API more consistent. In this post we will cover: POST requests New feature: multiple emitters More consistent API for callbacks More consistent API for tracker methods UUIDs Bug fix: flushing an empty buffer Upgrading Support 1. Updated...

Snowplow Node.js Tracker 0.1.0 released

08 August 2014  •  Fred Blundun
We are delighted to announce the release of the first version of the Snowplow Node.js Tracker. This is an npm module designed to send Snowplow events to a Snowplow collector from a Node.js environment. This post will cover installing and setting up the Node.js Tracker and introduce its main features. Background How to install the tracker How to use the tracker Features Getting help 1. Background The Snowplow Node.js Tracker is our first release making...

Snowplow Ruby Tracker 0.2.0 released

31 July 2014  •  Fred Blundun
We are pleased to announce the release of the Snowplow Ruby Tracker version 0.2.0. This release brings the Ruby Tracker up to date with the other Snowplow trackers, particularly around support of self-describing custom contexts and unstructured events. Huge thanks go to Elijah Tabb, a.k.a. ebear, for contributing the updated track_unstruct_event and track_screen_view tracker API methods among other features! Read on for more information… New tracker initialization method Updated format for unstructured events Updated format...

Snowplow 0.9.6 released with configurable enrichments

26 July 2014  •  Fred Blundun
We are pleased to announce the release of Snowplow 0.9.6. This release does four things: It fixes some important bugs discovered in Snowplow 0.9.5, related to our new shredding functionality It introduces new JSON-based configurations for Snowplow’s existing enrichments It extends our geo-IP lookup enrichment to support all five of MaxMind’s commercial databases It extends our referer-parsing enrichment to support a user-configurable list of internal domains We are really excited about our new JSON-configurable enrichments....

Snowplow Java Tracker 0.4.0 released

23 July 2014  •  Jonathan Almeida
We’re excited to announce another release of the Snowplow Java Tracker version 0.4.0. This release makes some significant updates to the Java Tracker. The main objective for this release was to bring the Tracker much closer in functional terms to the Python Tracker. In doing so, we’ve added new Emitter, TrackerPayload and Subject classes along with various changes to the existing Tracker class. One of the other more notable features in this release is support...

Snowplow Java Tracker 0.3.0 released

13 July 2014  •  Jonathan Almeida
Today we are introducing the release of the Snowplow Java Tracker version 0.3.0. Similar to the previous 0.2.0 release, this too is a mixture of minor & stability fixes. We’ve made only a few minor interface changes, so it shouldn’t affect current users of the Java Tracker too much. You can find more on the new additions further down in this post: Strings replaced with Maps for Context Timestamp for Trackers Logging with SLF4J Dependency...

Snowplow 0.9.5 released with JSON validation and shredding

09 July 2014  •  Alex Dean
We are hugely excited to announce the release of Snowplow 0.9.5: the first event analytics system to validate incoming event and context JSONs (using JSON Schema), and then automatically shred those JSONs into dedicated tables in Amazon Redshift. Here are some sample rows from this website, showing schema.org’s WebPage schema being loaded into Redshift as a dedicated table. With the release of Snowplow 0.9.1 back in April, we were...

Snowplow JavaScript Tracker 2.0.0 released

03 July 2014  •  Fred Blundun
We are happy to announce the release of the Snowplow JavaScript Tracker version 2.0.0. This release makes some significant changes to the public API as well as introducing a number of new features, including tracker namespacing and new link click tracking and ad tracking capabilities. This blog post will cover the following changes: Changes to the Snowplow API New feature: tracker namespacing New feature: link click tracking New feature: ad tracking New feature: offline tracking...

Snowplow Java Tracker 0.2.0 released

02 July 2014  •  Jonathan Almeida
We are pleased to announce the release of the Snowplow Java Tracker version 0.2.0. This release comes shortly after we introduced the community-contributed event tracker a little more than a week ago. In that previous post, we also mentioned our roadmap for the Java Tracker to include Android support as well as numerous other features. This release doesn’t directly act on that roadmap, but is largely a refactoring for future releases of the tracker with...

Iglu schema repository 0.1.0 released

01 July 2014  •  Alex Dean
We are hugely excited to announce the release of Iglu, our first new product since launching our Snowplow prototype two and a half years ago. Iglu is a machine-readable schema repository initially supporting JSON Schemas. It is a key building block of the next Snowplow release, 0.9.5, which will validate incoming unstructured events and custom contexts using JSON Schema. As far as we know, Iglu is the first machine-readable schema repository for JSON Schema, and...

Snowplow Java Tracker 0.1.0 by Kevin Gleason released

20 June 2014  •  Alex Dean
We are proud to announce the release of our new Snowplow Java Tracker, developed by Snowplow community member Kevin Gleason. This is our first community-contributed event tracker - a real milestone for us at Snowplow and it’s all thanks to Kevin’s fantastic work! The Snowplow Java Tracker is a simple client library for Snowplow, designed to send raw Snowplow events to a Snowplow collector. Use this tracker to add analytics to your Java-based desktop and...

Snowplow Python Tracker 0.4.0 released

10 June 2014  •  Fred Blundun
We are happy to announce the release of the Snowplow Python Tracker version 0.4.0. This version introduces the Subject class, which lets you keep track of multiple users at once, and several Emitter classes, which let you send events asynchronously, pass them to a Celery worker, or even send them to a Redis database. We have added support for sending batches of events in POST requests, although the Snowplow collectors do not yet support POST...

Snowplow 0.9.4 released with improved Looker models

30 May 2014  •  Yali Sassoon
We are very pleased to release Snowplow 0.9.4, which includes a new base LookML data model and dashboard to get Snowplow users started with Looker. The new base model has some significant improvements over the old one: Querying the data is much faster. When new Snowplow event data is loaded into Redshift, Looker automatically detects it and generates the relevant session-level and visitor-level derived tables, so that they are ready to be queried directly. We’ve...

Snowplow 0.9.3 released with Clojure Collector fixes

21 May 2014  •  Alex Dean
We are pleased to announce the release of Snowplow 0.9.3, with a whole host of incremental improvements to EmrEtlRunner, plus two important bug fixes for Clojure Collector users. The first Clojure Collector issue was a problem in the file move functionality in EmrEtlRunner, which was preventing Clojure Collector users from scaling beyond a single instance without data loss. Many thanks to community members Derk Busser and Ryan Doherty for identifying the issue and working with...

Snowplow 0.9.2 released to support new CloudFront log format

30 April 2014  •  Alex Dean
We have now released Snowplow 0.9.2, adding Snowplow support for the updated CloudFront access log file format introduced by Amazon on the morning of 29th April. This release was a highly collaborative effort with the Snowplow community (see this email thread for background). If you currently use the Snowplow CloudFront-based event collector, you are recommended to upgrade to this release as soon as possible. As well as support for the new log file format, this...

Snowplow Python Tracker 0.3.0 released

25 April 2014  •  Fred Blundun
We are pleased to announce the release of the Snowplow Python Tracker version 0.3.0. In this version we have added support for Snowplow custom contexts for all events. We have also updated the API for tracker initialization and ecommerce transaction tracking, added the option to turn off Pycontracts to improve performance, and added an event vendor parameter for custom unstructured events. In the rest of the post we will cover: Tracker initialization Disabling contracts Ecommerce...

Snowplow Ruby Tracker 0.1.0 released

23 April 2014  •  Fred Blundun
We are happy to announce the release of the new Snowplow Ruby Tracker. This is a Ruby gem designed to send Snowplow events to a Snowplow collector from a Ruby or Rails environment. This post will cover installing and setting up the Tracker, and provide some basic information about its features: How to install the tracker How to use the tracker Features Getting help 1. How to install the tracker The Snowplow Ruby Tracker is...

Spark Example Project released for running Spark jobs on EMR

17 April 2014  •  Alex Dean
On Saturday I attended Hack the Tower, the monthly collaborative hackday for the London Java and Scala user groups hosted at the Salesforce offices in Liverpool Street. It’s an opportunity to catch up with others in the Scala community, and to work collaboratively on non-core projects which may have longer-term value for us here at Snowplow. It also means I can code against the backdrop of some of the best views in London (see below)!...

Snowplow Python Tracker 0.2.0 released

15 April 2014  •  Fred Blundun
We are happy to announce the release of the Snowplow Python Tracker version 0.2.0. This release adds support for Python 2.7, makes some improvements to the Tracker API, and expands the test suite. This post will cover: Changes to the API Python 2.7 Integration tests Other improvements Upgrading Support 1. Changes to the API The call to import the tracker module has not changed: from snowplow_tracker.tracker import Tracker Tracker initialization has been simplified: t =...

Snowplow 0.9.1 released with initial JSON support

11 April 2014  •  Alex Dean
We are hugely excited to announce the immediate availability of Snowplow 0.9.1. This release introduces initial support for JSON-based custom unstructured events and custom contexts in the Snowplow Enrichment and Storage processes; this is the most-requested feature from our community and a key building block for mobile and app event tracking in Snowplow. Snowplow’s event trackers have supported custom unstructured events and custom contexts for some time, but prior to 0.9.1 there had been no...

Snowplow Python Tracker 0.1.0 by wintern Anuj More released

28 March 2014  •  Alex Dean
We are proud to announce the release of our new Snowplow Python Tracker, developed by Snowplow wintern Anuj More. Anuj was one of our two remote interns this winter, joining the Snowplow team from his base in Mumbai to work on making it easy to send events to Snowplow from Python environments. The Snowplow Python Tracker is a simple PyPI-hosted client library for Snowplow, designed to send raw Snowplow events to a Snowplow collector. Use...

Snowplow JavaScript Tracker 1.0.0 released

27 March 2014  •  Fred Blundun
We are pleased to announce the release of the Snowplow JavaScript Tracker version 1.0.0. This release adds new options for user fingerprinting and makes some minor changes to the Tracker API. In addition, we have moved to a module-based project structure and added automated testing. This post will cover the following topics: New feature: user fingerprint options Changes to the Snowplow API Move to modules Automated testing Removed deprecated functionality Other structural improvements Upgrading Getting...

Snowplow JavaScript Tracker 0.14.0 released with new features

12 February 2014  •  Fred Blundun
Alex writes: this is the first blog post - and code release - by Snowplow “springtern” Fred Blundun. Stay tuned for another blog post soon introducing Fred! We are pleased to announce the release of the Snowplow JavaScript Tracker version 0.14.0. In this release we have introduced some new tracking options and compressed our tracker for better load times. We have also updated our build process to use Grunt. This blog post will cover the...

Snowplow 0.9.0 released with beta Amazon Kinesis support

04 February 2014  •  Alex Dean
We are hugely excited to announce the release of Snowplow 0.9.0. This release introduces our initial beta support for Amazon Kinesis in the Snowplow Collector and Enrichment components, and was developed in close collaboration with Snowplow wintern Brandon Amos. At Snowplow we are hugely excited about Kinesis’s potential, not just to enable near-real-time event analytics, but more fundamentally to serve as a business’s unified log, aka its “digital nervous system”. This is a concept we...

Snowplow JavaScript Tracker 0.13.0 released with custom contexts

27 January 2014  •  Alex Dean
We’re pleased to announce the immediate availability of the Snowplow JavaScript Tracker version 0.13.0. This is the first new release of the Snowplow JavaScript Tracker since separating it from the main Snowplow repository last year. The primary objective of this release was to introduce some key new tracking capabilities, in preparation for adding these to our Enrichment process. Secondarily, we also wanted to perform some outstanding housekeeping and tidy-up of the newly-independent repository. In the...

A guide to custom contexts in Snowplow JavaScript Tracker 0.13.0

27 January 2014  •  Alex Dean
WARNING: This blog post contains outdated information. To review the current approach, please refer to our wiki post Custom contexts. — Earlier today we announced the release of Snowplow JavaScript Tracker 0.13.0, which updated all of our track...() methods to support a new argument for setting custom JSON contexts. In our earlier blog post we introduced the idea of custom contexts only very briefly. In this blog post, we will take a detailed look at...

Scala Forex library by wintern Jiawen Zhou released

17 January 2014  •  Alex Dean
We are proud to announce the release of our new Scala Forex library, developed by Snowplow wintern Jiawen Zhou. Jiawen joined us in the Snowplow offices in London this winter and was tasked with taking Scala Forex from a README file to an enterprise-strength Scala library for foreign exchange operations. One month later and we are hugely excited to be sharing her work with the community! Scala Forex is a high-performance Scala library for performing...

Snowplow 0.8.13 released with Looker support

08 January 2014  •  Yali Sassoon
We are very pleased to announce the release of Snowplow 0.8.13. This release makes it easy for Snowplow users to get started analyzing their Snowplow data with Looker, by providing an initial Snowplow data model for Looker so that a whole host of standard dimensions, metrics, entities and events are recognized in the Looker query interface. In this post we will cover: What’s so special about analyzing Snowplow data with Looker? What does the Looker...

Five things that make analyzing Snowplow data in Looker an absolute pleasure

08 January 2014  •  Yali Sassoon
Towards the end of 2013 we published our first blog post on Looker where we explored at a technical level why Looker is so well suited to analyzing Snowplow data. Today we released Snowplow 0.8.13, the Looker release. This includes a metadata model to make it easy for Snowplow users to get up and running with Looker on top of Snowplow very quickly. In this post, we get a bit less theoretical, and highlight five...

Snowplow 0.8.12 released with a variety of improvements to the Scalding Enrichment process

07 January 2014  •  Alex Dean
We are very pleased to announce the immediate availability of Snowplow 0.8.12. We have quite a packed schedule of releases planned over the next few weeks - and we are kicking off with 0.8.12, which consists of various small improvements to our Scalding-based Enrichment process, plus some architectural re-work to prepare for the coming releases (in particular, Amazon Kinesis support). Background on this release Scalding Enrichment improvements Re-architecting our Enrichment process Installing this release 1....

Snowplow 0.8.11 released - supports all Cloudfront log file formats and host of small improvements for power users

22 October 2013  •  Alex Dean
We’re very pleased to announce the release of Snowplow 0.8.11. This release includes two different sets of updates: Critical update: support for Amazon’s new CloudFront log file format (rolled out by Amazon during 21st October 2013) Nice-to-have additions - the most significant of which is IP anonymization We’ll discuss the updates one at a time, before covering how to upgrade to the latest version. Critical upgrade: support for Amazon’s new CloudFront log file format IP...

Snowplow 0.8.10 released with analytics cubes and recipes 'baked in'

18 October 2013  •  Yali Sassoon
We are pleased to announce the release of Snowplow 0.8.10. In this release, we have taken many of the SQL recipes we have covered in the Analysts Cookbook and ‘baked them’ into Snowplow by providing them as views that can be added directly to your Snowplow data in Amazon Redshift or PostgreSQL. Background on this release Reorganizing the Snowplow database Seeing a recipe in action: charting the number of uniques over time Seeing a cube...

Snowplow 0.8.9 released to handle CloudFront log file format change

05 September 2013  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 0.8.9. This release was necessitated by an unannounced change Amazon made to the CloudFront access log file format on 17th August, discussed in this AWS Forum thread and this snowplow-user email thread. Essentially, Amazon switched from URL-encoding all “%” signs found in the cs-uri-query field, to only URL-encoding them if they were not already escaped, i.e. were not followed by “25” (“%25”). This unannounced change...
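The encoding change described in the excerpt above can be illustrated with a minimal Python sketch. This is only a reading of the one-sentence description, not Amazon's implementation or Snowplow's fix; the function names encode_percent_old and encode_percent_new are hypothetical and exist purely for illustration.

```python
import re


def encode_percent_old(query: str) -> str:
    """Assumed pre-change behavior: every '%' is URL-encoded as '%25'."""
    return query.replace("%", "%25")


def encode_percent_new(query: str) -> str:
    """Assumed post-change behavior: a '%' is encoded as '%25' only when it
    is not already followed by '25' (i.e. it does not already look escaped)."""
    return re.sub(r"%(?!25)", "%25", query)


# Hypothetical query string: one '%' already followed by '25', one followed by '20'.
sample = "a%25b%20"
old_result = encode_percent_old(sample)  # both '%' signs are encoded again
new_result = encode_percent_new(sample)  # only the '%' before '20' is encoded
```

Under these assumptions the same raw query string yields different log output before and after the change, which is why a parser written against the old format would need updating.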

Snowplow 0.8.8 released with Postgres and Hive support

05 August 2013  •  Alex Dean
We are pleased to announce the immediate release of Snowplow 0.8.8. This is a big release for us: it adds the ability to store your Snowplow events in the popular PostgreSQL open-source database. This has been the most requested Snowplow feature all summer, so we are delighted to finally release it. And if you are already happily using Snowplow with Redshift, there are two other new features to check out: We have added support for...

.NET (C#) support added to referer-parser

09 July 2013  •  Alex Dean
We are pleased to announce the addition of .NET support (C#) to our standalone referer-parser library. Many thanks to Sepp Wijnands at iPerform Software for contributing this latest port! To recap: referer-parser is a simple library for extracting search marketing attribution data from referer (sic) URLs. You supply referer-parser with a referer URL; it then tells you the medium, source and term (in the case of a search) for this referrer. The Scala implementation of...

Snowplow 0.8.7 released with JavaScript Tracker improvements

07 July 2013  •  Alex Dean
After a brief summer intermission, we are pleased to announce the release of Snowplow 0.8.7. This is a small release, primarily consisting of bug fixes for the JavaScript Tracker, which is bumped to version 0.12.0. As well as some tweaks and improvements, this release fixes bugs which only occurred on older versions of Internet Explorer, and fixes a bug which prevented the setCustomUrl() method from working properly. Many thanks to community member mfu0 and Snowplow...

Snowplow Tracker for Lua event analytics released

03 July 2013  •  Alex Dean
We are very pleased to announce the release of our SnowplowTracker for Lua event analytics. This is our fourth tracker to be released, following on from our JavaScript, Pixel and Arduino Trackers. As a lightweight, easily-embeddable scripting language, Lua is available in a huge number of different computing environments and platforms, from World of Warcraft through OpenResty to Adobe Lightroom. And now, the Snowplow Lua Tracker lets you collect event data from these Lua-based applications,...

Snowplow 0.8.6 released with performance improvements

03 June 2013  •  Alex Dean
We are very pleased to announce the release of Snowplow 0.8.6, with two significant performance-related improvements to the Hadoop ETL. These improvements are: The Hadoop ETL process is now much faster at processing raw Snowplow log files generated by the CloudFront Collector, because we have tackled the Hadoop “small files problem” You can now configure your ETL process on Elastic MapReduce to use Task instances alongside your Master and Core instances; optionally these task instances...

Snowplow 0.8.5 released with ETL bug fixes

24 May 2013  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 0.8.5. This is a bug fixing release, following on from our launch last week of Snowplow 0.8.4 with geo-IP lookups. This release fixes one showstopper issue with Snowplow 0.8.4, and also includes a set of smaller enhancements to help the Scalding ETL better handle “bad quality” event data from webpages. We recommend everybody on the Snowplow 0.8.x series upgrade to this version. Many thanks to...

Snowplow 0.8.4 released with MaxMind geo-IP lookups

16 May 2013  •  Alex Dean
We are pleased to announce the immediate availability of Snowplow 0.8.4. This is a big release, which adds geo-IP lookups to the Snowplow Enrichment stage, using the excellent GeoLite City database from MaxMind, Inc. This has been one of the most requested features from the Snowplow community, so we are delighted to launch it. Now you can determine the location of your website visitors directly from the Snowplow events table, and plot that data on...

A guide to unstructured events in Snowplow 0.8.3

14 May 2013  •  Alex Dean
Earlier today we announced the release of Snowplow 0.8.3, which updated our JavaScript Tracker to add the ability to send custom unstructured events to a Snowplow collector with trackUnstructEvent(). In our earlier blog post we briefly introduced the capabilities of trackUnstructEvent with some example code. In this blog post, we will take a detailed look at Snowplow’s custom unstructured events functionality, so you can understand how best to send unstructured events to Snowplow. Understanding the...

Snowplow 0.8.3 released with unstructured events

14 May 2013  •  Alex Dean
We’re pleased to announce the release of Snowplow 0.8.3. This release updates our JavaScript Tracker to version 0.11.2, adding the ability to send custom unstructured events to a Snowplow collector with trackUnstructEvent(). The Clojure Collector is also bumped to 0.5.0, to include some important bug fixes. Please note that this release only adds unstructured events to the JavaScript Tracker - adding unstructured events to our Enrichment process and storage targets is on the roadmap -...

Snowplow 0.8.2 released with Clojure Collector enhancements

08 May 2013  •  Alex Dean
We’re pleased to announce the immediate availability of Snowplow 0.8.2. This release updates the Clojure Collector only; if you are using the CloudFront Collector, then no upgrade to 0.8.2 is necessary. Many thanks to community member Mark H. Butler for his major contributions to this release - much appreciated Mark! This release bumps the Clojure Collector to version 0.4.0. There are three main changes to the Collector: Building the Collector’s warfile is now much simpler,...

Snowplow 0.8.1 released with referer URL parsing

12 April 2013  •  Alex Dean
Just nine days after our Snowplow 0.8.0 release, we are pleased to have our next release ready: Snowplow 0.8.1. With the last release we promised that the new Scalding-based ETL/enrichment process would lay a strong technical foundation for our roadmap - and hopefully this release bears that out! Until this release, Snowplow has provided users the raw referer URL, from which analysts can deduce who the referer was. In this release, Snowplow processes that referer...

Snowplow 0.8.0 released with all-new Scalding-based data enrichment

03 April 2013  •  Alex Dean
A new month, a new release! We’re excited to announce the immediate availability of Snowplow version 0.8.0. This has been our most complex release to date: we have done a full rewrite of our ETL (aka enrichment) process, adding a few nice data quality enhancements along the way. This release has been heavily informed by our January blog post, The Snowplow development roadmap for the ETL step - from ETL to enrichment. In technical terms, we...

Snowplow Arduino Tracker released - sensor and event analytics for the internet of things

25 March 2013  •  Alex Dean
Today we are releasing our first non-Web tracker for Snowplow - an event tracker for the Arduino open-source electronics prototyping platform. The Snowplow Arduino Tracker lets you track sensor and event-stream information from one or more IP-connected Arduino boards. We chose this as our first non-Web tracker because we’re hugely excited about the potential of sophisticated analytics for the Internet of Things, following in the footsteps of great projects like Cosm and Exosite. And of...

Snowplow 0.7.6 released with Redshift data warehouse support

03 March 2013  •  Alex Dean
We’re excited to announce the immediate release of Snowplow version 0.7.6 with support for storing your Snowplow events in Amazon Redshift. We were very excited when Amazon announced Redshift back in late 2012, and we have been working to integrate Snowplow data since Redshift became generally available two weeks ago. Our tests with Redshift since launch have not disappointed - and we can’t wait to see what the Snowplow community do with the new platform!...

Snowplow 0.7.5 released with important JavaScript fix

25 February 2013  •  Alex Dean
We are releasing Snowplow version 0.7.5 - which upgrades the JavaScript tracker to version 0.11.1. This is a small but important release - because we are fixing an issue introduced in a Snowplow release a month ago: if you are on versions 0.9.1 to 0.11.0 of the JavaScript tracker, please upgrade! Essentially, version 0.9.1 of the JavaScript tracker (released in Snowplow 0.7.2) fixed an old bug which we inherited from the Piwik JavaScript tracker when we...

Snowplow 0.7.4 released for better eventstream analytics

22 February 2013  •  Alex Dean
Another week, another release! We’re excited to announce Snowplow version 0.7.4. The primary purpose of this release is to clean up and rationalise our event data model, in particular around user IDs and event timestamps. This release should lay the foundations for more sophisticated eventstream analytics (such as funnel analysis), by: Enabling companies to assign custom user IDs (e.g. when a customer logs on) Distinguish between IDs set at a domain level (via first-party cookies)...

Snowplow 0.7.3 released, tracking additional data

15 February 2013  •  Alex Dean
We’re excited to announce the release of Snowplow version 0.7.3. This release adds a set of 16 all-new fields to our event model: A new Event Vendor field The Page URL split out into its component parts (scheme, host, port, path, querystring, fragment/anchor) The web page’s character set The web page’s width and height The browser’s viewport (i.e. visible width and height) For page pings, we are now tracking the user’s scrolling during the last...

Snowplow 0.7.2 released, with the new Pixel tracker

29 January 2013  •  Alex Dean
We’re excited to announce the release of Snowplow version 0.7.2. As well as a couple of bug fixes, this release includes our second Snowplow tracker - the Pixel Tracker, to be used in web environments where a JavaScript-based tracker is not an option. One of the bug fixes is particularly important: we are recommending that all users of the Clojure-based Collector upgrade to the new version (0.2.0) due to a serious bug in the way...

Introducing the Pixel tracker

29 January 2013  •  Yali Sassoon
The Pixel tracker enables companies running Snowplow to track users in environments that do not support JavaScript. In this blog post we will cover: The purpose of the Pixel tracker How it works Considerations when using the Pixel tracker with the Clojure collector in particular Next steps on the Snowplow tracker roadmap What is the purpose of the Pixel tracker? Our aim with Snowplow has been to enable companies to track user events across all...

Snowplow 0.7.1 released, with easier-to-run Ruby apps

22 January 2013  •  Alex Dean
We’re happy to announce the release of Snowplow version 0.7.1. This release is designed to make it much easier to install and run the two Snowplow Ruby applications: EmrEtlRunner - which runs the Snowplow ETL job StorageLoader - which loads Snowplow events into Infobright From the feedback we received, setting up and running these two Ruby apps was the most challenging (and error-prone) part of the Snowplow experience. Many thanks to all of those in...

Scala MaxMind GeoIP library released

16 January 2013  •  Alex Dean
A short blog post this, to announce the release of Scala MaxMind GeoIP, our Scala wrapper for the MaxMind Java Geo-IP library. We have extracted Scala MaxMind GeoIP from our current (ongoing) work porting our ETL process from Apache Hive to Scalding. We extracted this as a separate library for two main reasons: Being good open-source citizens - as with our referer-parser library, we believe this library will be useful to the wider community of...

Snowplow 0.7.0 released, with new Clojure-based collector

03 January 2013  •  Alex Dean
Today we are hugely excited to announce the release of Snowplow version 0.7.0, which includes an experimental new Clojure-based collector designed to run on Amazon Elastic Beanstalk. This release allows you to use Snowplow to uniquely identify and track users across multiple domains - even across a whole content or advertising network. Many thanks to community member Simon Rumble for developing many of the ideas underpinning the new collector in SnowCannon, his node.js-based collector for...

referer-parser now with Java, Scala and Python support

02 January 2013  •  Alex Dean
Happy New Year all! It’s been three months since we introduced our Attlib project, now renamed to referer-parser, and we are pleased to announce that referer-parser is now available in three additional languages: Java, Scala and Python. To recap: referer-parser is a simple library for extracting search marketing attribution data from referer (sic) URLs. You supply referer-parser with a referer URL; it then tells you whether the URL is from a search engine - and...

Snowplow 0.6.5 released, with improved event tracking

26 December 2012  •  Alex Dean
We’re excited to announce our next Snowplow release - version 0.6.5, a Boxing Day release for Snowplow! This is a big release for us, as it introduces the idea of event types - every event sent by the JavaScript tracker to the collector now has an event field which specifies what type of event it is. This should be really helpful for a couple of things: It should make querying Snowplow events much easier It...

Snowplow 0.6.4 released, with Infobright improvements

20 December 2012  •  Alex Dean
We’re happy to announce our next Snowplow release - version 0.6.4. This release includes updates: An upgraded Infobright table definition which scales to millions of pageviews easily Clarified Hive table definitions Before we start - a big thanks to the community members who helped out on this release: Gilles Moncaubeig @ OverBlog worked closely with us on the updated Infobright table definition Mike Moulton @ meltmedia for flagging the missing Hive table definition We’ll take...

Snowplow 0.6.3 released, with JavaScript and HiveQL bug fixes

18 December 2012  •  Alex Dean
Today we are releasing Snowplow version 0.6.3 - another clean-up release following on from the 0.6.2 release. This release bumps the JavaScript Tracker to version 0.8.2, and the Hive-data-format HiveQL file to version 0.5.2. Many thanks to the community members who contributed bug fixes to this release: Mike Moulton @ meltmedia, Simon Andersson @ Qwaya and Michael Tibben @ 99designs. We’ll take a look at both fixes below: JavaScript tracker fixes This release fixes the...

Snowplow 0.6.2 released, with JavaScript tracker bug fixes

28 November 2012  •  Alex Dean
Today we are releasing Snowplow version 0.6.2 - a clean-up release after yesterday’s 0.6.1 release. This release bumps the JavaScript Tracker to version 0.8.1; the updated minified tracker is available as always here: http(s)://d1fc8wv8zag5ca.cloudfront.net/0.8.1/sp.js This release fixes two bugs: Issue #101 - we had left in a console.log() in the production version, which should only have been printed in debug mode. Harmless but worth taking out. Many thanks to Michael Tibben @ 99designs for spotting...

Snowplow 0.6.1 released, with lots of small improvements

27 November 2012  •  Alex Dean
We’re happy to announce our next Snowplow release - version 0.6.1. This release includes updates: Additional data collection. The JavaScript tracker has been updated to capture additional data points, including a user fingerprint (which can be used as a user_id for companies tracking users across domains), the tracker version, browser timezone and color depth JavaScript tracker updates. A number of updates have been made to make the JavaScript tracker more robust Updates to the ETL...

Integrating Snowplow with Google Tag Manager

16 November 2012  •  Yali Sassoon
A month and a half ago, Google launched Google Tag Manager (GTM), a free tag management solution. That was a defining moment in tag management history as it will no doubt bring tag management, until now the preserve of big enterprises, into the mainstream. We have spent some time testing how to get Snowplow tags working well with Google Tag Manager, and have documented our recommended approach to setting up Snowplow with GTM on the...

Snowplow 0.6.0 released, with the new StorageLoader

12 November 2012  •  Alex Dean
We’re very pleased to start the week by releasing a new version of Snowplow - version 0.6.0. This is a big release for us - as it includes the first version of our all-new StorageLoader. The release also includes a small set of tweaks and bug fixes across the existing Snowplow components, but let’s start by introducing StorageLoader: Introducing StorageLoader Up until now, Snowplow has stored all its data in S3, where it can be...

Snowplow 0.5.2 released, and introducing the Sluice Ruby gem

06 November 2012  •  Alex Dean
Another week, another release: Snowplow 0.5.2! This is a small release, consisting just of a small set of bug fixes and improvements to EmrEtlRunner - although we’ll also use this post to introduce our new Ruby gem, called Sluice. Many thanks to community member Tom Erik Stower for his testing of EmrEtlRunner over the weekend, which helped us to identify and fix these bugs: Bugs fixed Issue 71: the template config.yml (in the GitHub repo...

Snowplow 0.5.1 released, with lots of small improvements

01 November 2012  •  Alex Dean
We have just released Snowplow 0.5.1! Rather than one large new feature, version 0.5.1 is an incremental release which contains lots of small fixes and improvements to the ETL and storage sub-systems. The two big themes of these updates are: Improving the robustness of the ETL process Laying the foundations for loading Snowplow events into Infobright Community Edition (ICE) To take each of these themes in turn: 1. A more robust ETL process The Hive...

Snowplow 0.5.0 released, now with a Ruby gem to run Snowplow's ETL process on Amazon EMR

25 October 2012  •  Alex Dean
We have just released Snowplow 0.5.0, with an all-new component, the Snowplow EmrEtlRunner. EmrEtlRunner is a Ruby application to run Snowplow’s Hive-based ETL (extract, transform, load) process on Amazon Elastic MapReduce with minimum fuss. We are hugely grateful to community member Michael Tibben from 99designs for his contributions to EmrEtlRunner: thanks to Michael, EmrEtlRunner is more efficient, more flexible and more robust than it otherwise would have been - and ready sooner. Many thanks Michael!...

Infobright Ruby Loader Released

21 October 2012  •  Alex Dean
We’re pleased to start the week with the release of a new Ruby gem, our Infobright Ruby Loader (IRL). At Snowplow we’re committed to supporting multiple different storage and analytics options for Snowplow events, alongside our current Hive-based approach. One of the alternative data stores we are working with is Infobright, a columnar database which is available in open source and commercial versions. For all but the largest Snowplow users, columnar databases such as Infobright...

Snowplow 0.4.10 released

11 October 2012  •  Alex Dean
We have just released version 0.4.10 of Snowplow - people using 0.4.8 can jump straight to this version. This version updates: snowplow.js to version 0.7.0 the Hive deserializer to version 0.4.9 Big thanks to community members Michael Tibben from 99designs and Simon Andersson from Qwaya for their most-helpful contributions to this release! Main changes The main changes are as follows: The querystring parameter for site ID which the JavaScript tracker sends to your collector is...

Attlib - an open source library for extracting search marketing attribution data from referrer URLs

11 October 2012  •  Yali Sassoon
Update 17-Dec-12: We have renamed Attlib to referer-parser, to make it clearer what Attlib does: parse referer URLs. The repository has been updated accordingly. Some of the example code below is out-of-date now: we recommend checking out the repository for more information. Last night we published Attlib, an open source Ruby library for extracting search marketing attribution data from referer (sic) URLs. In this post we talk through: What Attlib does, and how to use...

Snowplow 0.4.8 released

14 September 2012  •  Alex Dean
We have just released Snowplow version 0.4.8, with a set of enhancements to the existing Hive deserializer: The Hive deserializer now supports Amazon’s new CloudFront log file format (launched 12 September 2012) as well as the older format The Hive deserializer now supports a tracking pixel called simply i (saving some characters versus ice.png) (issue #35) The Hive deserializer now works if the CloudFront distribution has Forward Query String = yes (issue #39) The Hive...

Snowplow 0.4.7 released with additional JavaScript tracking options

06 September 2012  •  Alex Dean
We have just released Snowplow version 0.4.7. This release bumps the Snowplow JavaScript tracker to version 0.6, with two significant new features: The ability to set a site ID for your tracking - useful for multi-site publishers The ability to log ecommerce transactions - useful for merchants wanting to track orders A huge thanks to community member Simon Andersson from Qwaya for contributing the ecommerce tracking functionality - thank you Simon! We’ll take a look...

Snowplow 0.4.6 released

20 August 2012  •  Alex Dean
Over the weekend we released Snowplow version 0.4.6. This was a minor release that added a new capability into the Snowplow JavaScript tracker. Specifically, with the JavaScript tracker you can now specify your own collector URL, rather than simply pass in an account ID which resolves to a CloudFront bucket. You can use this feature in your JavaScript invocation code like so: <!-- Snowplow starts plowing --> <script type="text/javascript"> var _snaq = _snaq || []; _snaq.push(['setCollectorUrl',...

SnowCannon - a node.js collector for Snowplow

13 August 2012  •  Alex Dean
We are hugely excited to introduce SnowCannon, a Node.js collector for Snowplow, authored by @shermozle (http://twitter.com/shermozle). SnowCannon is an alternative collector to the default CloudFront collector included with Snowplow. It offers a number of significant advantages over the CloudFront collector: It allows the use of 3rd party cookies. In particular, this makes it possible to track usage across multiple domains It enables real-time analytics. (This is not possible with the CloudFront-enabled collector, where there’s a...