Snowplow Open-Source: the engine behind our Behavioral Data Platform
Collect, enrich and deliver behavioral event data to event streams, data warehouses and data lakes.
Trusted by thousands of organizations worldwide
The most widely adopted behavioral data pipeline and the third most adopted web tracker in the world
Get started today with Snowplow Open-Source
Get started in under an hour
Set up a Snowplow pipeline in under an hour with the open source quick start.
Access the code on GitHub
Explore Snowplow open source by accessing the full codebase on GitHub.
Explore the documentation
We’ll help you on your way to capturing your first Snowplow events.
Join the discussion on Discourse
Our open source community hangs out on Discourse – post here for help!
Deliver rich, high-quality behavioral data at scale
Build a rich, high-quality behavioral data asset
Rich data out-of-the-box
Snowplow’s SDKs deliver rich web and mobile tracking out-of-the-box, including page views, time on page, scroll depth and sessions.
Enrich data in real-time
Out-of-the-box enrichments automatically add geographic, browser, OS, device and campaign parameters to your data, with the flexibility to enrich your data against any 1st or 3rd party data sets.
Extend the data to describe your business
Define your own event and entity schemas, and capture an unlimited number of properties with each event. These schemas are used to validate your data and optimize its structure in the warehouse for easier querying.
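As a rough illustration of how a schema can gate incoming data, here is a minimal Python sketch. It assumes the self-describing JSON layout (a `schema` key plus a `data` object). The vendor `com.example`, the `add_to_basket` event, its properties and the `validate` helper are all hypothetical, and the hand-rolled check only stands in for real JSON Schema validation; it is not Snowplow's own validator.

```python
import re

# Hypothetical schema for a custom "add_to_basket" event; the vendor, name and
# properties are made up for illustration only.
SCHEMA_KEY = "iglu:com.example/add_to_basket/jsonschema/1-0-0"
SCHEMA = {
    "required": ["sku", "quantity"],
    "properties": {"sku": str, "quantity": int, "currency": str},
}

# Assumed key layout: iglu:vendor/name/format/MODEL-REVISION-ADDITION.
KEY_PATTERN = re.compile(r"^iglu:([\w.\-]+)/([\w\-]+)/([\w\-]+)/(\d+)-(\d+)-(\d+)$")


def validate(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event is valid."""
    errors = []
    key = str(event.get("schema", ""))
    if not KEY_PATTERN.match(key):
        errors.append("malformed schema key")
    elif key != SCHEMA_KEY:
        errors.append("unknown schema")
    data = event.get("data", {})
    # Every required property must be present.
    for field in SCHEMA["required"]:
        if field not in data:
            errors.append(f"missing required property: {field}")
    # Every supplied property must be declared and have the declared type.
    for field, value in data.items():
        expected = SCHEMA["properties"].get(field)
        if expected is None:
            errors.append(f"unexpected property: {field}")
        elif isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"wrong type for property: {field}")
    return errors
```

Calling `validate` on an event whose `quantity` is the string `"2"` rather than an integer returns a single type error, which is the kind of issue a schema is meant to catch before the data reaches the warehouse.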
Architected from the ground up for reliability and scale
Scales without limits
Snowplow is built for scale, as proven by the many companies that run our open source tech, such as Capital One, GitLab and The Washington Post.
Independently scalable microservices
Adopt best practice data engineering principles for managing your data at scale, with our suite of fast, reliable and scalable microservices.
Each microservice in the pipeline is observable, with latency and throughput reported in real time to CloudWatch or Google Cloud Monitoring, or forwarded to your monitoring tool of choice.
Benefit from over 9 years' experience running behavioral data pipelines in the cloud
Open source first
Benefit from the tens of thousands of companies worldwide using our tracking SDKs, and the excellence that comes from doing things in the open since 2012.
Join a community of experts
Become part of an established and growing community; get support, share knowledge and solve problems together.
Hit the ground running
Get started with Snowplow open source and start realizing the value of your behavioral data in less than a day.
Collect behavioral data from all of your platforms and channels, enrich it and deliver it to the places you need it
Collect data from web, mobile, server, email, IoT and more with our extensive set of trackers & 3rd party webhooks.
Further enrich your behavioral data, in real time, with our extensive set of configurable enrichments.
Power more use cases by delivering your behavioral data to real-time streams, your data lake and your data warehouse.
The building blocks that enable you to do more with your data
Define custom events and entities
Define and evolve your own custom schemas, and automatically handle schema migrations with our Iglu schema registry and warehouse loader technology.
Control and obfuscate PII data
Roll out tracking with confidence
Validate data in real time in your development environment with Snowplow Mini, and write automated tests against your tracking with Snowplow Micro so you can catch data quality issues before they hit production.
Out-of-the-box data models
Query your data directly in your BI tool or feed it into your machine learning models with our performant web and mobile data models, which deliver aggregated tables by user, session, web page or mobile screen.
Reprocess failed events
Diagnose, recover and reprocess data that fails processing. The pipeline has been built from the ground up to be non-lossy, and any failed event comes with rich metadata about its failure.
The Snowplow Behavioral Data Platform
Behavioral Data Management
Workflow tooling that solves organizational problems with behavioral data productivity and governance – available only on Insights.
Behavioral Data Engine
The open source layer that empowers anyone to capture their customers’ behavior in a high-fidelity, machine-readable way.
Behavioral Data Fabric
Our unique deployment model, underpinned by SLAs and security for resilience at scale – available only on Insights.