The point of Sauna, then, is to deliver this third piece: making it easier for you to act on your insights by pushing the output of the computation and intelligence performed on your event streams (in step 2) out to different channels.
Read on below the fold to find out more:
When we started building Snowplow almost five years ago, our focus was on delivering a scalable open-source event data pipeline. Our view was that you shouldn’t need to have the scale or deep pockets of a Google or Facebook to warehouse your clickstream data; we understood that establishing an event data warehouse was the essential first step to driving insight for your business.
Fast forward to today and it’s clear that much has changed: the importance of owning your own event stream data is a given, and the frontier has moved on to turning our insights into actions. It’s no longer enough just to understand which of your customers have a high propensity to churn - what can you do about it, ideally in near-real-time?
Sauna is an all-new open-source product designed to make it easy for business analysts to turn the insights they derive from their event streams into actions, often performed via third-party marketing systems like Optimizely and SendGrid. If Snowplow makes it easier for data analysts to build intelligence on their data, Sauna is here to make it easy for them to action that intelligence, by feeding it back into the different channels used to manage customer engagement. Whilst Snowplow provides you with a complete view of your users' journeys, Sauna enables you to intervene in those journeys, hopefully to improve them.
There are many different channels that an individual company might use to engage with its users, including (but not limited to) email, push notifications and on-site or in-app messaging.
Companies often use third-party providers to manage their communications via each of these channels. Whilst there are many excellent providers of, say, email marketing, there are no general-purpose frameworks for individual companies to push intelligence and decisions built on their event-level data into the myriad channels (and third-party providers) a company works with. Sauna has been built to solve that problem.
If Snowplow is all about consolidating event streams from many sources into an event warehouse in Redshift, then Sauna is its complement: once you have the output of your analysis in Redshift, you can use Sauna to automatically pipe that data into Optimizely or SendGrid; a variety of integrations with other systems will be added to Sauna in due course.
Although Sauna is complementary to Snowplow (and built by the same team), you don’t have to be a Snowplow user to use Sauna; you don’t even have to be running your company on AWS. Sauna is for anybody who wants to make decisions based on their event stream data and then to act on those decisions, particularly via another software system.
Popular enterprise middleware frameworks like Apache Camel and MuleSoft have existed for many years. These technologies have typically been targeted at back-end developers, providing relatively low-level building blocks and frameworks for integrating various software systems together.
More recently, user-programmable rules engines have emerged, most famously IFTTT, which is hosted, and Huginn, which is an open-source Ruby project.
How does Sauna compare to all this? Firstly, Sauna is a platform, not a framework: it is targeted at business analysts and other non-engineers who want to be able to respond to the insights they are generating without involving their Tech team in costly bespoke integration work.
Secondly, unlike IFTTT and Huginn, Sauna has been built to inter-operate with a company’s unified events log: like Snowplow, Sauna is designed from the ground-up to be horizontally scalable and handle massive data volumes.
Sauna is a single executable, written in Scala using the Akka actor framework. Sauna is composed of three distinct types of module:

- Observers, which watch for new events, such as files landing in the local filesystem or in Amazon S3
- Responders, which act on those events, typically by performing an action in a third-party system
- Loggers, which record what Sauna has observed and done
This technical architecture shows how these module types fit together within Sauna:
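As a rough sketch of how these module types fit together, here is a simplified, non-actor rendering of the flow; all names and interfaces below are illustrative only (the real Sauna wires its modules together as Akka actors):

```scala
// A minimal, hypothetical sketch of Sauna's three module types and how
// they fit together. Names and the push-based flow are illustrative,
// not Sauna's real API.
object SaunaSketch {
  final case class FileAppeared(path: String, contents: String)

  trait Observer  { def poll(): Seq[FileAppeared] }       // watches a landing area for new files
  trait Responder { def respond(e: FileAppeared): Unit }  // acts on a file, e.g. calling a third-party API
  trait Logger    { def log(message: String): Unit }      // records what Sauna has done

  // One pass of the pipeline: every file an observer sees is offered
  // to each responder, and the outcome is logged.
  def runOnce(observers: Seq[Observer], responders: Seq[Responder], logger: Logger): Unit =
    for {
      observer  <- observers
      event     <- observer.poll()
      responder <- responders
    } {
      responder.respond(event)
      logger.log(s"Handled ${event.path}")
    }
}
```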
This first release of Sauna is shipping with the following modules:

- a Local Filesystem Observer and an Amazon S3 Observer
- a SendGrid Responder and an Optimizely Responder
We’ll go through each of the two responders - the core of Sauna - in the next two sections.
Our first responder allows you to use Sauna with SendGrid, the marketing and transactional email service provider (ESP).
This Sauna responder has a single responder action, which lets you export user-level data from your event warehouse and upload this data to SendGrid for use in email marketing. The SendGrid Responder will wait for files of email recipients to arrive in its configured file landing area, and then upload these email recipients into the SendGrid Contacts Database (part of the SendGrid Marketing Campaigns suite).
The responder works with both of our observers, local filesystem and Amazon S3. Coupling this responder with Redshift’s UNLOAD statement and our SQL Runner, you can schedule nightly updates to your email marketing lists based on your Snowplow data in Redshift.
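As a sketch of that workflow, the hypothetical UNLOAD below writes a tab-separated file of email recipients to an S3 bucket watched by the Amazon S3 observer; the table, columns, bucket and path are all illustrative, so consult the Sauna wiki for the exact file naming and format the SendGrid Responder expects:

```sql
-- Scheduled nightly via SQL Runner (hypothetical table and bucket names)
UNLOAD ('SELECT email, first_name, last_name FROM derived.churn_risk_users')
TO 's3://my-sauna-bucket/sendgrid/recipients/'
CREDENTIALS 'aws_access_key_id=<access-key>;aws_secret_access_key=<secret-key>'
DELIMITER AS '\t'
ALLOWOVERWRITE
PARALLEL OFF;
```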
Under the hood, the SendGrid Responder uses SendGrid's Contacts API. This responder saves you from a costly manual integration of your data pipeline with SendGrid via this API.
For more information on the SendGrid Responder, please check out:
Our second responder in this release adds support for Optimizely, the A/B testing service.
This responder supports two responder actions:

- uploading targeting lists to Optimizely
- uploading customer profiles into Optimizely's Dynamic Customer Profiles (DCP) service
As with our SendGrid Responder, the Optimizely Responder works with both of our observers, local filesystem and Amazon S3. Coupling this responder with Redshift’s UNLOAD statement and our SQL Runner, you can schedule nightly updates to your A/B testing targeting lists or DCP profiles, all based on your Snowplow data in Redshift.
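The same pattern applies here. For example, a nightly UNLOAD like this hypothetical one could refresh a targeting list of visitor IDs (the table, columns and S3 path are illustrative; see the Sauna wiki for the exact file format the responder expects):

```sql
-- Scheduled nightly via SQL Runner (hypothetical table and bucket names)
UNLOAD ('SELECT domain_userid FROM derived.high_value_visitors')
TO 's3://my-sauna-bucket/optimizely/targeting_lists/'
CREDENTIALS 'aws_access_key_id=<access-key>;aws_secret_access_key=<secret-key>'
DELIMITER AS '\t'
ALLOWOVERWRITE;
```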
Under the hood, Sauna makes use of Optimizely’s Targeting List and Bulk Upload APIs. This responder saves you from having to manually integrate either or both of these APIs into your data pipeline.
For more information on the Optimizely Responder, please check out:
Ready to get started with Sauna? You can deploy it onto a single server - version 0.1.0 doesn’t yet support clustering - and put it through its paces.
You’ll find all the necessary documentation for devops and systems administrators on the Setting up Sauna page of the Sauna wiki.
We’re taking an exploratory, iterative approach with Sauna - the first release is deliberately narrow, focused on just two marketing platforms and supporting only relatively “batchy” source data.
However, we have ambitious plans for Sauna’s future. In the short term, summer intern Manoj Rajandrakumar has been working on an additional responder for Urban Airship, which we hope to release soon (here is a sneak peek of the user guide).
Looking to the future, we are also very interested in extending Sauna to be able to respond to decisions in near-real-time. Our current thinking is to use JSON Schema (or Avro) to define specific commands (e.g. “send email”, “raise PagerDuty incident”), and for Sauna to then be able to read those commands from Amazon Kinesis or Apache Kafka streams. This would involve adding new observers for Kinesis and Kafka, as well as defining the new command schemas, which is discussed in Command schema: design (issue #54).
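To make that concrete, such a command might look something like the following self-describing JSON; the schema URI and fields here are purely illustrative, as the real command schemas are still being designed:

```json
{
  "schema": "iglu:com.acme/send_email_command/jsonschema/1-0-0",
  "data": {
    "recipient": "jane@example.com",
    "emailTemplate": "win_back_offer"
  }
}
```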
Lastly, while Sauna currently runs on a single server, it has been built on top of Akka, and we will be working to add Akka Cluster support for a distributed multi-node setup (issue #56).
Sauna is completely open source - and has been from the start! If you’d like to get involved, perhaps adding a new observer, responder or logger, please do check out the repository.
If you are looking for an additional integration to be added to Sauna please get in touch to discuss sponsorship options.
And finally, we are super-excited to be developing a new software category - decisioning and response - through the Sauna project. If you have general thoughts or ideas on what the future of Sauna should look like, do please open a new thread on our forums.