Modelling page view events as a graph

13 August 2018  •  Dilyan Damyanov
In the previous post in this series we started exploring options for modelling event data as a graph in general. We looked at three ways of modelling atomic event data: the event grammar approach, the event graph approach, and the denormalised graph approach. Ultimately we chose to model events as a denormalised graph, where the same things are represented multiple times in different ways (e.g. as both a node and a relationship). That adds redundancy to...

Building a model for event data as a graph

26 March 2018  •  Dilyan Damyanov
In recent months we’ve been busy expanding the range of storage targets into which Snowplow users can load their enriched events. We recently launched our Snowflake Loader, and work is underway to add support for Google’s BigQuery. Thinking even further ahead, one intriguing option is to add a graph-based storage target for Snowplow. We’d like to take the community with us on this journey, so we will be documenting our progress in a series of...

An introduction to event data modeling

16 March 2016  •  Yali Sassoon
Data modeling is an essential step in the Snowplow data pipeline. We find that those companies that are most successful at using Snowplow data are those that actively develop their event data models: progressively pushing more and more Snowplow data throughout their organizations so that marketers, product managers, merchandising and editorial teams can use the data to inform and drive decision making. ‘Event data modeling’ is a very new discipline and as a result, there’s...

Data modeling in Spark (Part 1): Running SQL queries on DataFrames in Spark SQL

02 December 2015  •  Christophe Bogaert
An updated version of this blog post was posted to Discourse. We have been thinking about Apache Spark for some time now at Snowplow. This blog post is the first in a series that will explore data modeling in Spark using Snowplow data. It’s similar to Justine’s write-up and covers the basics: loading events into a Spark DataFrame on a local machine and running simple SQL queries against the data. Data modeling is a critical step in...