Snowplow Open-Source: the engine behind our Behavioral Data Platform

Collect, enrich and deliver behavioral event data to event streams, data warehouses and data lakes.

Trusted by thousands of organizations worldwide

The most widely adopted behavioral data pipeline and the third-most-adopted web tracker in the world

Three ways to get started with Snowplow Open-Source

Access the code on GitHub

Start setting up Snowplow by accessing the full codebase on GitHub.

Explore the documentation

We’ll help you on your way to capturing your first Snowplow events.

Join the discussion on Discourse

Our open source community hangs out on Discourse – post here for help!

Deliver rich, high-quality behavioral data at scale

Build a rich, high-quality behavioral data asset

Rich data out-of-the-box

Snowplow’s SDKs deliver rich web and mobile tracking out-of-the-box, including page views, time on page, scroll depth and sessions.

Enrich data in real-time

Out-of-the-box enrichments automatically add geographic, browser, OS, device and campaign parameters to your data, with the flexibility to enrich your data against any first- or third-party data sets.
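
To make the campaign part of this concrete, here is a minimal Python sketch of the idea behind campaign attribution: read the `utm_*` query parameters from a page URL and copy them into `mkt_*` fields on the event. The function name `campaign_fields` and the example URL are illustrative assumptions. The real enrichment runs inside the pipeline and is configurable, so treat this as an outline of the technique rather than Snowplow's implementation.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed mapping from UTM query parameters to campaign fields on the event.
UTM_TO_FIELD = {
    "utm_medium": "mkt_medium",
    "utm_source": "mkt_source",
    "utm_term": "mkt_term",
    "utm_content": "mkt_content",
    "utm_campaign": "mkt_campaign",
}


def campaign_fields(page_url):
    """Return the campaign fields found in the URL's query string."""
    query = parse_qs(urlsplit(page_url).query)
    # parse_qs returns a list per parameter; keep the first value only.
    return {
        field: query[param][0]
        for param, field in UTM_TO_FIELD.items()
        if param in query
    }


fields = campaign_fields(
    "https://example.com/?utm_source=newsletter&utm_medium=email"
)
```

Parameters that are absent from the URL are simply left out of the result, so an event with no campaign information gains no extra fields.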

Extend the data to describe your business

Define your own event and entity schemas, and capture an unlimited number of properties with each event. These schemas are used to validate your data and optimize its structure in the warehouse for easier querying.
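
As a rough illustration, the Python sketch below writes a self-describing JSON Schema as a dictionary and checks an event payload against it. The vendor `com.acme`, the event name `add_to_basket`, its properties and the `validate` helper are all hypothetical. In a real pipeline, validation is done by the pipeline itself against your published schemas, not by hand-written code like this toy checker.

```python
# Hypothetical schema for a custom "add_to_basket" event, written in the
# self-describing JSON Schema layout.
ADD_TO_BASKET_SCHEMA = {
    "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
    "self": {
        "vendor": "com.acme",
        "name": "add_to_basket",
        "format": "jsonschema",
        "version": "1-0-0",
    },
    "type": "object",
    "properties": {
        "sku": {"type": "string"},
        "quantity": {"type": "integer"},
    },
    "required": ["sku", "quantity"],
    "additionalProperties": False,
}

# JSON Schema type names mapped to Python types (only those used above).
_TYPES = {"string": str, "integer": int}


def schema_uri(schema):
    """Build the iglu: URI that an event uses to reference this schema."""
    s = schema["self"]
    return f"iglu:{s['vendor']}/{s['name']}/{s['format']}/{s['version']}"


def validate(event, schema):
    """Return a list of problems; an empty list means the event is valid."""
    if event.get("schema") != schema_uri(schema):
        return ["schema URI mismatch"]
    data = event.get("data", {})
    problems = [
        f"missing property: {name}"
        for name in schema["required"]
        if name not in data
    ]
    for name, value in data.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected property: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, _TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}")
    return problems


event = {
    "schema": "iglu:com.acme/add_to_basket/jsonschema/1-0-0",
    "data": {"sku": "A1", "quantity": 2},
}
problems = validate(event, ADD_TO_BASKET_SCHEMA)
```

The event carries a reference to the schema it claims to follow, which is what lets each event be checked on its own.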

A single log-level event table with multiple sources

Architected from the ground up for reliability and scale

Scales without limits

Snowplow is built for scale, as proven by the countless companies that use our open source tech, such as Capital One, GitLab and The Washington Post.

Independently scalable microservices

Adopt best practice data engineering principles for managing your data at scale, with our suite of fast, reliable and scalable microservices.

Fully observable

Each microservice in the pipeline is observable, with latency and throughput reported in real time to CloudWatch, Google Cloud Monitoring or forwarded to your monitoring tool of choice.

Benefit from over 9 years’ experience running behavioral data pipelines in the cloud

Open source first

Benefit from the tens of thousands of companies worldwide using our tracking SDKs, and the excellence that comes from doing things in the open since 2012.

Join a community of experts

Become part of an established and growing community; get support, share knowledge and solve problems together.

Hit the ground running

Get started with Snowplow open source and start realizing the value of your behavioral data in less than a day.

15+ trackers

Collect data from web, mobile, server, email, IoT and more with our extensive set of trackers and third-party webhooks.

15+ enrichments

Further enrich your behavioral data, in real time, with our extensive set of configurable enrichments.

9+ destinations

Power more use cases by delivering your behavioral data to real-time streams, your data lake and your data warehouse.

The building blocks that enable you to do more with your data

pipeline diagram showing sources - validation - enrichment - real-time event stream - data warehouse

Define custom events and entities

Define and evolve your own custom schemas, and automatically handle schema migrations with our Iglu schema registry and warehouse loader technology.

Control and obfuscate PII data

Easily collect fully anonymous data with our JavaScript tracker, or obfuscate and remove personally identifiable information with our configurable PII and IP anonymization.
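
The two obfuscation ideas mentioned here can be sketched in a few lines of Python: replacing an identifier with a salted hash, and masking the trailing octets of an IPv4 address. The function names, the salt handling and the `x` mask character are assumptions made for illustration. This outlines the general technique and is not the code the pipeline runs.

```python
import hashlib


def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 hex digest."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


def anonymize_ipv4(ip, octets=1):
    """Mask the last `octets` octets of a dotted IPv4 address with 'x'."""
    parts = ip.split(".")
    if len(parts) != 4 or not 1 <= octets <= 4:
        raise ValueError("expected an IPv4 address and 1 to 4 octets to mask")
    return ".".join(parts[: 4 - octets] + ["x"] * octets)


hashed_user = pseudonymize("user@example.com", salt="hypothetical-salt")
masked_ip = anonymize_ipv4("203.0.113.42", octets=2)
```

Hashing with a fixed salt keeps the same user mapped to the same opaque value, so sessions can still be joined without storing the raw identifier.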

Roll out tracking with confidence

Validate data in real time in your development environment with Snowplow Mini, and write automated tests against your tracking with Snowplow Micro so you can catch data quality issues before they hit production.
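
An automated check against Snowplow Micro might look roughly like the Python sketch below. It assumes a Micro instance is already running at `http://localhost:9090` and that its `/micro/all` endpoint returns a JSON summary with `good` and `bad` counts. The helper names are hypothetical, and the exact response shape should be confirmed against the Micro documentation for your version.

```python
import json
from urllib.request import urlopen

# Assumed address of a locally running Snowplow Micro instance.
MICRO_URL = "http://localhost:9090"


def fetch_summary(base_url=MICRO_URL):
    """Fetch the event counts reported by Micro (needs a running instance)."""
    with urlopen(f"{base_url}/micro/all") as response:
        return json.load(response)


def check_tracking(summary, expected_good):
    """True when no event failed validation and the expected number passed."""
    return summary["bad"] == 0 and summary["good"] == expected_good


# In a test suite you would exercise the site or app first, then run:
#     assert check_tracking(fetch_summary(), expected_good=3)
```

Keeping the check separate from the HTTP call means the pass/fail rule can be unit tested without a running Micro instance.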

Out-of-the-box data models

Query your data directly in your BI tool or feed it into your machine learning models with our performant web and mobile data models, which deliver aggregated tables by user, session, web page or mobile screen.

Reprocess failed events

Diagnose, recover and reprocess events that fail processing. The pipeline has been built from the ground up to be non-lossy, and every failed event comes with rich metadata about its failure.

The Snowplow Behavioral Data Platform

Behavioral Data Management

Workflow tooling that solves organizational problems with behavioral data productivity and governance – available only on Insights.

Behavioral Data Engine

The open source layer that empowers anyone to capture their customers’ behavior in a high-fidelity, machine-readable way.

Behavioral Data Fabric

Our unique deployment model underpinned with SLAs and security for resilience at scale – available only on Insights.

data platform illustration - flexibility - data quality - ownership

Give your organization a competitive advantage with high-quality behavioral data