There are occasions when you might want to work with Snowplow in an easier, faster way. Two common examples are:
Today we’re delighted to announce Snowplow Mini to meet the above two use cases: the complete Snowplow Real-Time Pipeline on a single AMI / EC2 instance. Download and set up in minutes…
Snowplow Mini is the complete Snowplow Real-Time Pipeline running on a single instance, available for easy install as a pre-built AMI. Set it up in minutes by following the quickstart guide.
Once deployed, you fetch the public IP for your Snowplow Mini instance from the EC2 console. You can then:
1.1 Log into Snowplow Mini
1.2 Record events
1.3 Explore your data in Elasticsearch and Kibana
1.4 Debug bad data in Elasticsearch and Kibana
Once Snowplow Mini is up and running, you should be able to fetch the IP address of the instance it is running on from the EC2 console:
Navigate to that IP address in the browser. There you’ll find Snowplow Mini:
Send some events in! You can do this directly from the Snowplow Mini UI by selecting the Example Events tab and clicking the different buttons. Each button click will be recorded as an event:
You can, more usefully, send events using any one of our Snowplow trackers. Simply configure the tracker to use the Snowplow Mini collector endpoint on
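To make the tracker configuration concrete, here is a minimal Python sketch that builds the kind of GET request a tracker sends for a single page view. It assumes the collector accepts the standard Snowplow tracker protocol on the /i pixel path. The function name, the tracker version label and the example host are illustrative only, so substitute your own Snowplow Mini collector endpoint. In practice you would use one of the Snowplow trackers rather than building requests by hand.

```python
import time
import uuid
from urllib.parse import urlencode


def build_page_view_url(collector_base, app_id, page_url, page_title):
    """Return a GET URL encoding one page view in the Snowplow tracker protocol.

    `collector_base` is an assumption: pass the scheme, host and port of
    your own Snowplow Mini collector.
    """
    params = {
        "e": "pv",                            # event type: page view
        "p": "web",                           # platform
        "tv": "py-sketch-0.1",                # tracker version label (hypothetical)
        "aid": app_id,                        # application ID
        "url": page_url,                      # page URL
        "page": page_title,                   # page title
        "eid": str(uuid.uuid4()),             # unique event ID
        "dtm": str(int(time.time() * 1000)),  # device timestamp in milliseconds
    }
    return collector_base.rstrip("/") + "/i?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder host: replace with your own collector endpoint.
    print(build_page_view_url("http://collector.example", "demo-app",
                              "http://example.com/", "Home page"))
```

Requesting the returned URL (for example by pasting it into the browser) should record one page view, which you can then look for in Kibana.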
Sending in events is great, but now we want to look at the data.
The simplest way to get started is to view the data in Kibana. This will require a quick initial setup.
Navigate in the browser to http://<<your-snowplow-mini-public-ip>>:5601. Kibana will invite you to set up an index pattern. Let’s first set up an index for ‘good’ data (i.e. data that is successfully processed) by entering the following values:
Hit the create button. Now we have our good index set up:
Now let’s create a second index for our bad data. Click the Add New button on the top right of the screen and then enter the following values to configure the index for bad data:
Now let’s look at our data. Hit the “Discover” menu item:
We can build graphs in the Visualize section and assemble them together in the Dashboards section. You can also use other tools for visualizing the data in Elasticsearch: Elasticsearch can be queried directly on
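As a rough illustration of querying Elasticsearch directly, the Python sketch below builds a search request that counts events by type in the good index. It only constructs the URL and JSON body, so you can send them with whichever HTTP client you prefer. The Elasticsearch base URL is left as a parameter, and the `event` field name and the `by_value` aggregation label are assumptions that may need adjusting to match your data.

```python
import json


def build_event_count_search(es_base, index="good", field="event", size=10):
    """Return (url, body) for a terms aggregation counting documents per value of `field`.

    `es_base` is an assumption: pass the address of the Elasticsearch
    instance on your own Snowplow Mini box.
    """
    url = "{}/{}/_search".format(es_base.rstrip("/"), index)
    body = {
        "size": 0,  # return only the aggregation, not individual hits
        "aggs": {
            "by_value": {"terms": {"field": field, "size": size}},
        },
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder host: replace with your own Elasticsearch address.
    url, body = build_event_count_search("http://elasticsearch.example")
    print(url)
    print(body)
```

POSTing the body to the URL should return one bucket per distinct value of the chosen field, each with a document count.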
One of the primary uses of Snowplow Mini is to enable Snowplow users to debug updates to their tracker instrumentation in real-time, significantly reducing updates to tracker deployments.
If you have defined your own event and entity (context) schemas, you will need to push these schemas to the Iglu repository that is bundled with Snowplow Mini. There is a simple script you can run to copy those schemas from your local machine to Snowplow Mini: instructions can be found here.
Once you’ve done that, you can start sending data into Snowplow Mini to see if it is processed successfully. Each event you send should either land in the good index or bad index. To switch from one to the other in Kibana, select the cog icon in the top right of the screen and then select the index you want to view from the dropdown:
In the below example you can see that one bad event has landed. It is straightforward to drill in and identify the issue with processing the event (it has an invalid type of
More information on debugging your data in Elasticsearch / Kibana can be found here.
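If many events land in the bad index, it can help to group them by error message rather than inspecting them one at a time. The sketch below is a hypothetical helper, not part of Snowplow Mini. It assumes each bad document, once fetched from Elasticsearch, is a dictionary with an `errors` list whose entries are either plain strings or objects with a `message` key; check this against your own documents before relying on it.

```python
from collections import Counter


def count_error_messages(bad_rows):
    """Return (message, count) pairs, most frequent first.

    Assumes each row is a dict with an optional `errors` list of strings
    or of dicts carrying a `message` key.
    """
    counts = Counter()
    for row in bad_rows:
        for err in row.get("errors", []):
            message = err.get("message", "") if isinstance(err, dict) else str(err)
            counts[message] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"errors": ["schema not found"]},
        {"errors": [{"message": "schema not found"}, {"message": "bad field"}]},
    ]
    for message, count in count_error_messages(sample):
        print(count, message)
```

A single dominant message usually points to one faulty tracker call or one missing schema.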
The pipeline running on Snowplow Mini is essentially the Snowplow Real-Time Pipeline:
The key difference is that on Snowplow Mini:
This diagram illustrates the mini data pipeline:
The current Snowplow Mini stack consists of the following applications:
As so many services are running on the box, we recommend a t2.medium or higher for a smooth experience during testing and use. The right size depends on a number of factors, such as the number of users and the volume of events being sent into the instance.
We have big plans for Snowplow Mini:
We also want to make it easy to set up and run Snowplow Mini outside of EC2 by:
If you have an idea of something you would like to see or need from Snowplow Mini, please raise an issue!
For more details on this release, please check out the release notes on GitHub.
If you have any questions or run into any problems, please raise an issue.