We are delighted to announce our Quick Start edition for open source, enabling you to have data flowing through an open source pipeline to stream, lake and warehouse in less than an hour.
Our open source community is incredibly important to us here at Snowplow. We are developers at heart: we launched our open source solution – the engine behind our behavioural data platform – all the way back in 2012, and we have continued to invest in open source ever since. Some key highlights over the years:
- Built one of the most reliable and scalable solutions available today for delivering rich behavioural data
- Launched the new failed events format, making it easier to spot & diagnose data quality issues
- Invested heavily in solutions that make data teams' jobs easier, such as building automatic table migrations into our warehouse loaders
- Held over 30 meetups worldwide, in locations such as Toronto, Tel-Aviv and Australia with over 2,000 attendees
Over the past year, we have listened to feedback from those wanting to get started with our open source, and learnt that we needed to make it far quicker and easier to set up. So for the past month we have been working on a Quick Start edition for open source that takes the complexity out of setting up the infrastructure, cloud services & applications required to get started with Snowplow open source.
This pipeline is perfect for getting to grips with the Snowplow architecture, and supports delivering & demonstrating almost the full functionality of the Snowplow open source product. It is easy to deploy, and will have you collecting rich behavioural data far quicker than ever before.
Deliver rich data to stream, lake or warehouse
- Quickly start tracking our out-of-the-box web, mobile and server side events using our suite of trackers
- Create your own custom events and entities
- Easily enable and disable our full suite of out-of-the-box real time enrichments
- Consume your enriched data in near real time from Kinesis
- Query your data on S3 and in Postgres
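Custom events in Snowplow are expressed as self-describing JSON: a schema reference paired with the event data. Here is a minimal sketch in Python of what such a payload looks like – the `com.acme` vendor, the `basket_add` event name, and its fields are hypothetical placeholders, not real published schemas:

```python
import json

# A self-describing (custom) event: an Iglu schema URI plus the event data.
# The "com.acme" vendor and "basket_add" event are hypothetical examples;
# in practice the schema would be one you have published to your own registry.
custom_event = {
    "schema": "iglu:com.acme/basket_add/jsonschema/1-0-0",
    "data": {
        "sku": "SP-001",
        "quantity": 2,
        "unit_price": 9.99,
    },
}

# Serialise the payload as it would be attached to a tracked event.
payload = json.dumps(custom_event)
print(payload)
```

Entities (contexts) attached to events follow the same schema-plus-data shape, which is what lets the pipeline validate every event against its schema at enrichment time.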
How the quick start edition works under the hood
We have built a set of Terraform modules that automate the setup & deployment of the infrastructure & applications required for an operational Snowplow open source pipeline, with just a handful of input variables required on your side.
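In rough terms, using the modules looks something like the sketch below. The module source address and variable names here are illustrative only – see the Quick Start guide for the real module interface:

```hcl
# Hypothetical sketch of wiring up a quick-start pipeline module.
# The source address and input variable names are illustrative,
# not the actual module interface.
module "snowplow_pipeline" {
  source = "github.com/example/snowplow-quickstart"

  # A handful of inputs is all that is required on your side.
  prefix     = "sp-quickstart"
  aws_region = "eu-west-1"
  vpc_id     = "vpc-0123456789abcdef0"
  subnet_ids = ["subnet-aaa111", "subnet-bbb222"]
}
```

A `terraform init` followed by `terraform apply` then stands up the collector, enrichment, and loading applications along with the underlying cloud resources.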
Improving our product with telemetry
We want to make this experience as easy & as valuable as possible for open source users new to Snowplow, and so we have added (optional) telemetry to the Quick Start so that we can be more data-driven about future improvements. This telemetry gives us a very basic view of the topology and health of your pipeline, so that we can start to understand where our open source users are running into problems and the value that this edition provides.
We will always be completely transparent about what we are tracking – the telemetry module is open source so you can inspect the code yourself, and we have made it easy to opt out entirely should you wish.
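Assuming the telemetry switch follows the usual Terraform input-variable pattern, opting out would be a single line in your `terraform.tfvars` – the exact variable name may differ, so treat this as a sketch and check the module's documentation:

```hcl
# Hypothetical: disable telemetry via a module input variable.
# Verify the actual variable name in the module's documentation.
telemetry_enabled = false
```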
Get started today
Our Quick Start guide will walk you through spinning up your open source pipeline, and teach you about some of the core functionality that the Snowplow pipeline provides. You can also find a side-by-side feature comparison.
What about GCP? Whilst we are launching on AWS, support for GCP is coming in the next couple of months, so watch this space.
We have plenty of plans for making the open source experience even better, including:
- Support for Redshift and Snowflake loading
- An open source UI, where you can monitor your pipeline
- Support for Kafka as the streaming service
- … And much more
All of these features can be found on our public roadmap. If you are keen for us to prioritise something, do let us know there.
Let us know what you think
We would, as always, love to hear from you. If you have any suggestions or feedback that you would like to share with us, or you just want to say hello – please reach out to me at email@example.com (Product Manager for Open Source).