Snowplow Event Recovery 0.1.0 released

16 January 2019  •  Ben Fradet
We are excited to announce the release of Snowplow Event Recovery. All Snowplow pipelines are non-lossy: if something goes wrong during schema validation or enrichment, the payloads (along with the errors that occurred) are stored in a bad rows storage solution, be it a data stream or object storage, rather than being discarded. The goal of recovery is to fix the payloads contained in these bad rows so that they are ready to be...
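To make the idea concrete, here is a minimal sketch in Python, purely illustrative and not the Event Recovery API itself, of what recovering a bad row amounts to: parse the stored JSON, apply a fix to the original payload, and hand back a corrected payload ready for reprocessing. The `line` and `errors` field names assume the bad row format in use at the time of this release; the `fix` function is hypothetical.

```python
import json

def recover_bad_row(bad_row_json, fix):
    """Illustrative only: parse a bad row, apply a user-supplied fix to the
    original payload, and return the corrected payload for reprocessing.

    Assumes a bad row shape with a "line" field holding the original
    payload and an "errors" array describing what went wrong.
    """
    bad_row = json.loads(bad_row_json)
    payload = bad_row["line"]      # the original collector payload
    errors = bad_row["errors"]     # why validation/enrichment failed
    return fix(payload, errors)    # a payload ready to replay

# Example: a hypothetical fix that strips a malformed query-string parameter.
fixed = recover_bad_row(
    '{"line": "GET /i?e=pv&bad=%", "errors": [{"message": "invalid % encoding"}]}',
    lambda payload, errors: payload.replace("&bad=%", ""),
)
print(fixed)  # GET /i?e=pv
```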

Snowplow Spotlight Anthony Mandelli

11 January 2019  •  Miriam de Medwe
Anthony Mandelli - Digital Marketing Manager based in New Jersey What do you do at Snowplow? I’m the Digital Marketing Manager at Snowplow. My primary responsibility is writing our content like blog posts, web pages, and case studies. Outside of that, I do my best to support the rest of the marketing team who handle our social media, paid advertising, and email campaigns by reviewing their content. I also help maintain our website. Why did...

Snowplow Spotlight Cara Baestlein

19 December 2018  •  Miriam de Medwe
Sales and Implementation Engineering Lead based in Berlin, Germany What do you do at Snowplow? As part of Snowplow’s Professional Services team, I get to help clients around the world design and implement their data collection, as well as use the data most effectively to answer the questions that make the difference to their business. Why did you decide to go into data analytics? Studying economics at university, what I enjoyed most was putting the...

Debugging bad data in GCP with BigQuery

19 December 2018  •  Colm O Griobhtha
One of the key features of the Snowplow pipeline is that it’s architected to ensure data quality up front: rather than spending a lot of time cleaning and making sense of the data before using it, schemas are defined up front and used to validate data as it comes through the pipeline. Another key feature is that it’s highly loss-averse: when data fails validation, those events are preserved as bad rows. Read more about...
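As a taste of the approach, a query-side sketch might count the most common failure messages once bad rows have been loaded into BigQuery. The project, dataset, table, and column names below are hypothetical; adjust them to wherever your pipeline sinks its bad rows.

```python
# Minimal sketch: surface the most frequent bad-row error messages.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT error_message, COUNT(*) AS occurrences
    FROM `my-project.snowplow.bad_rows`   -- hypothetical table
    GROUP BY error_message
    ORDER BY occurrences DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.error_message, row.occurrences)
```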

Snowplow Google Cloud Storage Loader 0.1.0 released

03 December 2018  •  Ben Fradet
We are pleased to release the first version of the Snowplow Google Cloud Storage Loader. This application reads data from a Google Pub/Sub topic and writes it to a Google Cloud Storage bucket. It is an essential component of the Snowplow for GCP stack we are launching: it enables users to sink any bad data from Pub/Sub to Cloud Storage, from where it can be reprocessed, and subsequently sink either the raw or enriched...
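The loader itself runs as a Cloud Dataflow job, but the pattern it implements is easy to picture. Below is a minimal sketch of the same idea using the Google Cloud client libraries: pull messages from a Pub/Sub subscription and write each payload to a Cloud Storage bucket. The project, subscription, and bucket names are placeholders, and this is not the loader's own code.

```python
import uuid
from google.cloud import pubsub_v1, storage

subscriber = pubsub_v1.SubscriberClient()
subscription = subscriber.subscription_path("my-project", "bad-rows-sub")
bucket = storage.Client().bucket("my-bad-rows-bucket")

response = subscriber.pull(subscription=subscription, max_messages=100)
for received in response.received_messages:
    # One object per message for simplicity; the real loader windows and
    # batches its output rather than writing a file per event.
    blob = bucket.blob(f"bad-rows/{uuid.uuid4()}.json")
    blob.upload_from_string(received.message.data)
    subscriber.acknowledge(subscription=subscription, ack_ids=[received.ack_id])
```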

Snowplow for Google Cloud Platform is here

03 December 2018  •  Anthony Mandelli
Since the early days of Snowplow Analytics, we’ve been committed to giving our users very granular, highly structured data because we believe that’s what you need to be truly data driven. Doing awesome things with this data, though, has historically been challenging because of how detailed it is. Thanks to Google, we have a solution to that problem. Google Cloud Platform (GCP) has grown, over the last ten years, to become one of the largest,...

Snowplow BigQuery Loader released

03 December 2018  •  Anton Parkhomenko
We are tremendously excited to announce the public release of the Snowplow BigQuery Loader. Google BigQuery is a highly scalable, fully managed data warehouse with real-time ingestion and rich support for semi-structured data. Since its launch, many Snowplow users and prospective users have asked us to support loading their Snowplow data into BigQuery as a storage target. This release enables us to do just that. The BigQuery Loader was the key “missing...

Long sales cycles don't have to be trouble

14 November 2018  •  Anthony Mandelli
Retailers know that understanding the way customers behave during the sales cycle is the key to optimizing this process so it’s enjoyable, rewarding, and painless for the customer and efficient for the retailer. Marketers want to connect activities like advertising campaigns to downstream activities like making a purchase. While this process might be straightforward for many companies, it can be quite convoluted for retailers with longer sales cycles, such as those selling high-value goods like cars,...

Snowplow Objective-C Tracker 0.9.0 released

31 October 2018  •  Mike Hadam
We are pleased to announce a new release of the Snowplow Objective-C Tracker. Version 0.9.0 introduces an application context and lifecycle event tracking. Read on below the fold for:

1. Application context
2. Lifecycle tracking
3. Updates and bug fixes
4. Documentation
5. Getting help

1. Application context

In this release we introduce an application context. This feature allows one to determine which version of an app sent a particular event. This can be enabled in the tracker initialization, and...

Right to be Forgotten Spark job released for meeting GDPR requirements

31 October 2018  •  Konstantinos Servis
We are pleased to announce the release of our R2F (Right to be Forgotten) Spark job. This is a stand-alone Spark job that removes rows from your Snowplow enriched events archive in Amazon S3, based on specific PII identifiers. It lets Snowplow users easily remove data about a specific individual when that data subject has requested it by exercising his or her “right to be forgotten” under Article 17 of the GDPR. For those deploying...
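To picture what such a job does, here is a much-simplified PySpark sketch of the underlying idea, not the R2F job's actual interface (which is driven by removal criteria you supply). The S3 paths, the user ID value, and the column index are illustrative; check the enriched event format of your pipeline version before relying on any position.

```python
# Filter rows matching a PII identifier out of an enriched events archive.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("r2f-sketch").getOrCreate()

USER_ID_INDEX = 12          # assumed position of user_id in the enriched TSV
forgotten_user = "some-user-id"

def keep(line):
    # Keep any row whose user_id column does not match the data subject.
    return line.split("\t")[USER_ID_INDEX] != forgotten_user

events = spark.sparkContext.textFile("s3://my-archive/enriched/")
events.filter(keep).saveAsTextFile("s3://my-archive/enriched-filtered/")
```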