Since launching the open source quick start on AWS a couple of months ago, it’s been really exciting to welcome so many newcomers to the Snowplow community who are getting started with open source Snowplow quicker than ever before. And so today we are really pleased to announce the open source quick start for GCP.
The quick start was originally designed with one key objective: to make it easier for companies that want to collect best-in-class behavioural data to get started with Snowplow open source.
Getting to grips with Snowplow open source has, until recently, involved quite a steep learning curve. Understanding each of the Snowplow microservices, how to configure them, and how to spin up the required infrastructure for an operational pipeline was a little daunting. But we also believe that it is important to maintain the modular, microservice-based architecture of the Snowplow pipeline for a number of reasons:
- To give you the ability to scale each part of your pipeline as needed – giving you fine-grained control so that you can build resilience into your pipeline as your business & event volumes scale
- To give you flexibility around the architecture of your pipeline – whilst we wanted to make it easy to set up an entire data pipeline, we also wanted to maintain the level of control that you have over the topology
To achieve this, we have leveraged the power of Terraform – one of the leading Infrastructure-as-Code solutions on the market today – and created a set of Terraform modules for each part of the Snowplow pipeline. These modules enable you to spin up the required infrastructure, such as load balancers, streams and VMs, and deploy the Snowplow microservices rapidly and reliably.
To accompany these modules, we have also added quick start example scripts. These are essentially wrappers that chain the modules together and consolidate the required inputs into one file, making it even simpler to spin up an entire pipeline by running a single Terraform command.
This means that you can focus more time on collecting and getting the most value from your behavioural data:
- Start tracking our out-of-the-box web, mobile and server-side events using our suite of trackers
- Create your own custom events and entities that precisely describe your business
- Easily enable and disable our full suite of out-of-the-box real-time enrichments & transformations
- Consume your enriched data in near real time from Pub/Sub and Postgres
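To make the custom events mentioned above a little more concrete, here is a minimal sketch of the self-describing JSON envelope that Snowplow custom events and entities use: a `schema` field holding an Iglu schema URI and a `data` field holding the payload. The vendor `com.acme`, the event name `add_to_basket`, its fields, and the helper `self_describing` are all hypothetical and chosen purely for illustration; the regular expression is a simplified check, not the official validation logic.

```python
import json
import re

# Simplified pattern for an Iglu schema URI:
#   iglu:<vendor>/<name>/<format>/<model>-<revision>-<addition>
# This is an illustrative check only, not Snowplow's own validator.
IGLU_URI = re.compile(
    r"^iglu:[A-Za-z0-9_.-]+/[A-Za-z0-9_-]+/[A-Za-z0-9_-]+/\d+-\d+-\d+$"
)


def self_describing(schema: str, data: dict) -> dict:
    """Wrap a payload in a self-describing JSON envelope (hypothetical helper)."""
    if not IGLU_URI.match(schema):
        raise ValueError(f"not a valid Iglu schema URI: {schema!r}")
    return {"schema": schema, "data": data}


# Hypothetical custom event describing an e-commerce action.
event = self_describing(
    "iglu:com.acme/add_to_basket/jsonschema/1-0-0",
    {"sku": "ABC-123", "quantity": 2},
)

print(json.dumps(event, indent=2))
```

A payload shaped like this is what you would hand to one of the trackers as a custom event; the schema itself would need to be available to your pipeline for the event to validate.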
Get started today
Our Quick Start guide will walk you through spinning up your open source pipeline on GCP.
Whilst we are launching with just one warehouse – Postgres – the ability to load to BigQuery is coming soon and we plan to continue to expand the number of destinations available within the quick start.
All of these features can be found on our public roadmap. If you would like us to prioritise something, do let us know there.