Snowplow Mini 0.6.0 released

08 August 2018  •  Oguzhan Unlu

We are pleased to announce the 0.6.0 release of Snowplow Mini, our accessible “Snowplow in a box” distribution.

Snowplow Mini is the complete Snowplow real-time pipeline running on a single instance, available for easy deployment as a pre-built AMI on AWS as well as a hosted image for GCP. Use it to:

  1. Set up an inexpensive and easily discardable Snowplow stack for testing your tracker and schema changes
  2. Learn about Snowplow without having to set up a horizontally-scalable, highly-available production-grade pipeline

This release brings the Snowplow Mini experience to Google Cloud Platform, migrates the underlying infrastructure to Docker, and bumps the Elasticsearch stack to the latest stable version. Also, as of this release, Snowplow Mini is available in three sizes (large, xlarge and xxlarge) to suit different needs.

Read on for:

  1. Google Cloud Platform support
  2. Docker migration
  3. New Iglu Server
  4. Freshening Elasticsearch stack
  5. Other updates
  6. Documentation and getting help

1. Google Cloud Platform support

Version 0.6.0 introduces Snowplow Mini to the GCP ecosystem, enabling our users to have the Snowplow real-time pipeline experience on GCP!

We offer three different images for the three new sizes of Snowplow Mini.

Check out the Snowplow Mini GCP Setup Guide to find out how to use them and more!

2. Docker migration

Up until this release, we managed Snowplow Mini’s services with SysVinit, the traditional Linux init system. Although this approach is mature, we wanted to leverage our Docker images and benefit from the advantages of managing Snowplow Mini with Docker, e.g. portability across machines, an out-of-the-box logging service, volume management, and more.

This migration also comes with some changes under the hood, including:

  • Bumping the Iglu Server version to 0.3.0 (#152)
  • Bumping the Elasticsearch & Kibana versions to 6.3.1 (#79)
  • Bumping Stream Enrich to 0.18.0 (#174)
  • Bumping the Scala Stream Collector to 0.13.0 (#176)

3. New Iglu Server

One of our goals for Snowplow Mini is to make it stateless, meaning that all the required services, such as Iglu Server, Elasticsearch and Postgres, live outside the actual box running Snowplow Mini.

As part of this goal, we introduced a number of Control Plane features in Snowplow Mini 0.4.0. Today we are adding a new one on top of them: enabling Iglu Server to use an external Postgres instance.

Rather than exposing only the external Postgres settings, which live in Iglu Server’s configuration file, we are introducing the ability to upload the whole Iglu Server configuration file. This lets you adjust every part of the configuration, including the Postgres connection details.

Note that this release also bumps Iglu Server to 0.3.0, which introduced a new configuration parameter, repo-server.baseURL. If you want to use Iglu Server’s Swagger UI, upload your own Iglu Server config file with repo-server.baseURL set to <snowplow-mini-deployment-address>/iglu-server. Omit the protocol (i.e. http:// or https://), because Swagger UI prepends it automatically.
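
As a rough illustration of the protocol rule, the following Python sketch derives a repo-server.baseURL value from a deployment address. The helper name iglu_base_url and the example hostname are hypothetical; only the /iglu-server suffix and the requirement to drop the protocol come from this post.

```python
from urllib.parse import urlsplit


def iglu_base_url(deployment_address: str) -> str:
    """Return a value for repo-server.baseURL (hypothetical helper).

    Drops any http:// or https:// prefix and trailing slash, then
    appends the /iglu-server path described in this post.
    """
    address = deployment_address.strip()
    # urlsplit only recognises the host when a scheme or "//" is present,
    # so prefix bare addresses with "//".
    if "://" not in address:
        address = "//" + address
    parts = urlsplit(address)
    host_and_path = parts.netloc + parts.path.rstrip("/")
    return host_and_path + "/iglu-server"


# Example address is an assumption, not a real deployment.
print(iglu_base_url("https://mini.example.com"))  # mini.example.com/iglu-server
```

The resulting string is what would go into the uploaded config file; Swagger UI then adds the protocol itself.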

4. Freshening Elasticsearch stack

Most of the recent issues we faced with Snowplow Mini were due to running very old versions of Elasticsearch (1.7.5) and Kibana (4.0.1). Although we had considered upgrading them before, there was a tradeoff between heavier resource usage and having an up-to-date Elasticsearch stack. We finally made the call and bumped both to 6.3.1 at the expense of using more resources, RAM especially.

5. Other updates

Until today, Snowplow Mini was run on AWS t2.medium instances, which served well for demonstration purposes. However, we observed that usage of Snowplow Mini started to outgrow that original purpose, and limited machine resources became an obstacle, causing issues with Elasticsearch among other things. This is why 0.6.0 is available in three sizes:

  • large : The same image as published so far. Elasticsearch has a 4g heap size and the Snowplow apps have a 0.5g heap size.
  • xlarge : Double the large image. Elasticsearch has an 8g heap size and the Snowplow apps have a 1.5g heap size.
  • xxlarge : Double the xlarge image. Elasticsearch has a 16g heap size and the Snowplow apps have a 3g heap size.
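
To make the size choice concrete, here is a minimal Python sketch that records the heap sizes listed above and picks the smallest image that meets an Elasticsearch heap requirement. The function smallest_size_for and the selection rule are hypothetical, intended only as a way to reason about the three sizes; they are not part of Snowplow Mini.

```python
# Heap sizes in GB per image size, copied from the list in this post.
HEAP_SIZES_GB = {
    "large": {"elasticsearch": 4.0, "snowplow_apps": 0.5},
    "xlarge": {"elasticsearch": 8.0, "snowplow_apps": 1.5},
    "xxlarge": {"elasticsearch": 16.0, "snowplow_apps": 3.0},
}


def smallest_size_for(es_heap_gb: float) -> str:
    """Pick the smallest size whose Elasticsearch heap is large enough.

    Hypothetical helper: the selection rule is an assumption made for
    illustration, not official sizing guidance.
    """
    for size in ("large", "xlarge", "xxlarge"):
        if HEAP_SIZES_GB[size]["elasticsearch"] >= es_heap_gb:
            return size
    raise ValueError(f"no size offers {es_heap_gb}g of Elasticsearch heap")


print(smallest_size_for(6))  # xlarge
```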

What’s more, as part of bumping Elasticsearch to 6.x, we had to remove the Head plugin, since site plugins were removed from Elasticsearch as of 5.x. However, the Head plugin can still be used as a Google Chrome extension.

6. Documentation and getting help

To learn more about getting started with Snowplow Mini, check out the Quickstart guide.

If you run into any problems, please raise a bug, join our Gitter room, or get in touch with us through the usual channels.