Five things that make analyzing Snowplow data in Looker an absolute pleasure

Towards the end of 2013 we published our first blog post on Looker where we explored at a technical level why Looker is so well suited to analyzing Snowplow data. Today we released Snowplow 0.8.13, the Looker release. This includes a metadata model to make it easy for Snowplow users to get up and running with Looker on top of Snowplow very quickly. In this post, we get a bit less theoretical, and highlight five very tangible reasons why analyzing Snowplow data with Looker is such an absolute pleasure.

  1. Slice and dice any combination of dimension and metrics
  2. Quickly and easily define dimensions and metrics that are specific to your business using Looker’s lightweight metadata model
  3. Drill up and drill right down to visitor-level and event-level data
  4. Dashboards are a strating point for more involved analysis
  5. Access your data from any application: Looker as a general purpose data server

With Looker, you can slice your Snowplow data by any dimension / metric combination. To give some illustrative examples - we can plot the number of visits, bounce rate, pages per visit and events per visit by landing page:

We may want to plot the number of new visitors by landing page over time:

Or perhaps we want to compare the number of transactions by customers based on the channel they were first acquired on (first touch referer source):

Creating the above slices of data is as simple as selecting the dimension / metric combination from the long list provided in the Looker UI.

Looker’s metadata model makes it very easy to define and analyze busines specific:

  1. Events
  2. Dimensions (e.g. audience segments, session classification)
  3. Funnels
  4. Metrics

To give a very specific example: at Snowplow we are very interested in whether or not visitors to our website visit the ‘services’ pages, for example, as that indicates that they are potentially interested in our Pro Services offering.

We can add a dimension to our events.lookerml model that categorises whether a specific event has occurred on a services page or not:

# Snowplow-website specific dimension
  - dimension: occurred_on_services_page
    type: yesno
    sql: ${page_urlpath} LIKE '%services%'

We can then create a metric that counts the number of events that occur on services pages, further down the events.lookerml definition:

# Snowplow-website specific metric
  - measure: events_on_services_page_count
    type: count
    filters:
      occurred_on_services_page: yes

Both the above dimension and metric will now be available to include in any report produced in the Explorer. For example, we can now compare the number of events that occurred on services page by marketing campaign, landing page or over time.

We can define additional derived metrics (e.g. average events on a service page per visitor / session) or dimensions (e.g. classify visitors by whether or not they have visited the services pages at all) by simply extending the metadata model. The Looker metadata model is flexible enough to extend with your business, as you become more sophisticated in your use of data.

To illustrate this, let’s start by comparing visit and engagement levels by refer medium for the last month (i.e. a session-level analysis):

We can see visitors referered from other websites appear to engage more deeply, on average. We can explore that further, to see if it is true across e.g. all landing pages, by clicking on the Landing Page Count (which is “7” and circled above):

This opens another view, which lets us compare events per visit and bounce rates by the seven different landing pages that users refered to our websites from other websites were driven to. It looks like users refered from external websites to our recipe on market basket analysis engaged particularly deeply with our website:

We can explore this further by clicking on the visit count to see the actual visits. For example, if we click on the “17” visits to the market basket analysis (circled above)…

…we are shown an actual list of the 17 visits, including the cookie ID and the time each visitor spent on the website. (Note that all but the 3rd visitor were visiting our website for the first time).

It looks like the 9th visitor on the list was on our website for a particularly long period of time - let’s click on “Event Stream” (circled above) to find out what he / she actually did on the website:

We are now shown the complete event stream for that user on that session. Incredibly, the visitor only visited that recipe page (did not navigate to any other pages on our website).

<h2>4. Dashboards are a starting point for more involved analysis</h2>

It is straightforward in Looker to develop customized dashboards. The following is an example of one included in our Looker release:

Most BI tools offer great dashboarding facilities. What we like particularly about Looker’s is that clicking on any of the graphs sends you straight into the Explorer, so you can then start slicing / dicing and drilling in as described in the sections above. For example, if you clicked on the data point circled above (representing the number of visits from search engines to the website on January 6th) brings up a list of all those different sessions. We can then click on the Event Stream for any of those sessions to see what actually occurred.

<h2>5. Access your data from any application: Looker as a general purpose data server</h2>

As well as enabling users to plot graphs directly in Looker, it is also possible to use Looker as a data server to make your data easily available to other applications to visualize.

You can set Looker up to make specific slices of data available at designated URLs, in JSON, CSV or tab-delimited format, so that it can easily be ingested and refreshed from any application, including:

  • Web applications (e.g. built using D3.js)
  • Analytics tools e.g. R, Python / Pandas
  • Spreadsheets e.g. Google Docs and Excel

Say for example the following cut of data was important to us (the number of visits and events per visitor by web page, for the last week):

We can use Looker to publish the data to a URL. We’ve published the above view to the following URLs - check them out in your browser to see how easy it is to fetch the data:

  • Tab-delimited data
  • CSV data
  • JSON data
  • Open the data in Google Spreadsheets using =ImportData("https://snowplowanalytics.looker.com/looks/xyY6yZTg23RzrYpT2y9y4t5tggSMq7xz.csv")

You can see how the data looks in Google Spreadsheets below:

Because the data is being served live, it is always up-to-date. Pretty cool, huh?

Want to get started with Looker?

Then get in touch with the team at Looker or the team at Snowplow to arrange a trial.

Thoughts or questions? Come join us in our Discourse forum!

Yali Sassoon

Yali is co-founder and analytics lead at Snowplow. You can find in him on , Twitter and LinkedIn.