Snowplow for retail part 4: what can we do with data when we're growing?

06 March 2019  •  Archit Goyal

We recommend reading the first post in this series before diving into this one, so that you have all the context you need.

This post looks at a growing data team: one with several analysts, and perhaps some spare engineering resource, because the company is starting to see real value in the analytics you have served to date.

We’re working under the assumption that you’ve already taken all the steps recommended for a data team that’s just starting out.

That means that at this point you have (where applicable):

  • Aggregated and refined web and mobile data
  • Joined data from web, mobile and offline
  • Created a marketing attribution model

Now that your team has several analysts, you can go further by taking the following steps:

  1. Track online conversions server side
  2. Use webhooks to track app installs and social media clicks
  3. Understand the funnels in your products
  4. Begin building a more intelligent user stitch

Again, let’s take a non-technical look at how each of these could be achieved with Snowplow. All the ideas mentioned here are things we have seen Snowplow users do.

1. Track conversions server-side

Rather than rewrite what has already been brilliantly covered by Rebecca, please read her blog post on the subject.

In short, ensure that you are always tracking your mission-critical events server-side, as this is the most robust way to track them. Doing this also helps maintain good data governance by limiting what information is pushed to the data layer.

2. Use webhooks to track app installs and social media clicks

Use Snowplow’s third party integrations to track events such as app installs and social media clicks.

Any third party data can be sent through the Snowplow pipeline so that it lands in your warehouse in the same format as all the other events. This means you can use whatever service you are comfortable with to track social media clicks and send this event to the Snowplow pipeline. Yali also wrote a great blog post that goes into more detail on ad impressions and link clicks.

With these events in your data warehouse, you can gain an even better understanding of your user journeys. You can also add this data to your attribution model and refine it further.

Remember, you can figure out which user installed the app or clicked on the Facebook link using the third party cookie or IDFA/IDFV as described in the earlier section on how to “Join web, mobile and offline data”.

As your team gets more sophisticated, so too should your tracking. With a customizable tool like Snowplow, your tracking design is limited only by your imagination. With more sophisticated tracking, you can gain an even better understanding of your users and how your money is being spent.

3. Funnels

We have written an ebook on Product Analytics, which you can download now.

To look specifically at user journeys, most Snowplow Insights customers can take advantage of our Indicative integration with a free Indicative account. Indicative is one of the best tools out there for understanding how users move through your product. Do get in touch for more information or to request a demo!

Example: the signup process

Let’s assume the product manager wants some insights into how users move through the signup process.

Out of the box tracking for form filling behavior (not just form submission behavior) lets you understand how people are dropping off. What we have seen many clients do is add a custom field to a signup custom event that holds the failure reason if a user enters an invalid value, which is what is shown in the example below.

This is a user flow for someone going through a signup process in an app. Each row contains one event. There are three event types: screen views (out of the box), application background (out of the box) and signup flow custom events. There is one entity, a screen entity, that tells us which screen the event took place on.
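To make the custom event and its entity more concrete, here is a minimal sketch of what one row might carry. Snowplow describes custom events and entities as self-describing JSON (a `schema` reference plus a `data` object); everything else here is an assumption for illustration: the `com.acme` schema URIs, the field names, and the `event`/`entities` wrapper are hypothetical and not taken from a real tracking design.

```python
import json

# Hypothetical schema URIs: "com.acme" and the field names below are
# illustrative only, not schemas that ship with Snowplow.
SIGNUP_SCHEMA = "iglu:com.acme/signup_flow/jsonschema/1-0-0"
SCREEN_SCHEMA = "iglu:com.acme/screen/jsonschema/1-0-0"


def signup_event(step, field, failure_reason=None):
    """Build one signup flow event with a screen entity attached.

    failure_reason stays None when the user's input was valid.
    """
    return {
        "event": {
            "schema": SIGNUP_SCHEMA,
            "data": {
                "step": step,
                "field": field,
                "failure_reason": failure_reason,
            },
        },
        "entities": [
            {"schema": SCREEN_SCHEMA, "data": {"name": "signup"}},
        ],
    }


payload = signup_event(step=10, field="email", failure_reason="invalid_character")
print(json.dumps(payload, indent=2))
```

Keeping the failure reason as its own field, rather than burying it in a free-text label, is what makes it easy to group and count later.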

[Image: signup flow]

[Image: timestamps]

Some quick insights from this table:

  • Which step took the longest to complete (step 2)
  • Where the user gave up: after their email address attempt had an invalid underscore character (steps 10 and 11)
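Both insights can be read off a single user's events with very little code. The sketch below is one way to do it, under stated assumptions: the events, field names and timestamps are made up, and a step's duration is defined as the time elapsed since the previous event, which is only one of several reasonable definitions.

```python
from datetime import datetime

# Illustrative events for one user; names and timestamps are made up.
events = [
    {"step": 1, "name": "screen_view", "ts": "2019-03-06T10:00:00"},
    {"step": 2, "name": "signup_flow", "ts": "2019-03-06T10:00:45"},
    {"step": 3, "name": "signup_flow", "ts": "2019-03-06T10:00:50"},
    {"step": 4, "name": "application_background", "ts": "2019-03-06T10:01:00"},
]


def step_durations(events):
    """Seconds each step took, measured as the gap since the previous event."""
    ordered = sorted(events, key=lambda e: e["ts"])
    durations = {}
    for previous, current in zip(ordered, ordered[1:]):
        start = datetime.fromisoformat(previous["ts"])
        end = datetime.fromisoformat(current["ts"])
        durations[current["step"]] = (end - start).total_seconds()
    return durations


durations = step_durations(events)
slowest_step = max(durations, key=durations.get)   # step with the longest gap
last_event = max(events, key=lambda e: e["ts"])    # where the journey ended

print(slowest_step, last_event["name"])
```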

A single user’s journey isn’t a reliable basis for conclusions. But now that you understand the shape of the data, it is easy to aggregate up to millions of sessions and users and show this in Indicative:

[Image: indicative]

The UX designer and product manager can use this to maximize conversions where possible.
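If you want to sanity check the aggregated picture yourself, the underlying arithmetic is simple. The following is a rough sketch, assuming you have already reduced each user's events to the furthest signup step they reached and the failure reason (if any) they hit; the rows and the three-step funnel are invented for illustration.

```python
from collections import Counter

# Illustrative rows: (user_id, furthest_step_reached, failure_reason).
sessions = [
    ("u1", 3, None),
    ("u2", 2, "invalid_character"),
    ("u3", 1, None),
    ("u4", 2, "invalid_character"),
    ("u5", 3, None),
]
FUNNEL_STEPS = [1, 2, 3]


def funnel_counts(sessions, steps):
    """Number of users who reached at least each step of the funnel."""
    return {
        step: sum(1 for _, furthest, _ in sessions if furthest >= step)
        for step in steps
    }


counts = funnel_counts(sessions, FUNNEL_STEPS)
# Share of users who entered the funnel and completed the final step.
conversion = counts[FUNNEL_STEPS[-1]] / counts[FUNNEL_STEPS[0]]
# How often each failure reason appears across users.
failure_reasons = Counter(reason for _, _, reason in sessions if reason is not None)

print(counts, conversion, failure_reasons.most_common(1))
```

The step with the largest drop between consecutive counts, together with the most common failure reason, is usually the first place to look.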

4. Behavioral user stitching

In a previous section we talked through how to do a standard user stitch that relies on a user identifying themselves. Snowplow was built as a tool to capture many user identifiers so it is well suited to that task.

However, in some cases a user will just refuse to identify themselves, browsing your site in blissful anonymity. For example, you might have multiple people browsing your site with the same browser (a family, office or school) where not all of them identify themselves in every session: how would the analysts on your data team separate their behavior?

Let’s steer clear of machine learning as a one-stop solution for now, though. What can you do to build on the user stitch of the previous post?

Assuming each family member identifies themselves at some point, you have some sessions where you have a high confidence that you know who they are. You can start assigning probabilities to future sessions on the same device based on behavior.

  • One family member (Josephine) goes on the careers pages often
  • The other family member (Joseph) searches for products often

You can build a simple ranking of frequent behavior using the event level data. Then, when a session starts and the user doesn’t identify themselves you can guess who they are.

  • They search for “hobs”: they want to buy hobs, probably Joseph?
  • Wait, they just searched for “jobs”: slightly more likely to be Joseph than Josephine
  • They view 6 job pages: pretty sure it’s Josephine
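The idea of ranking frequent behavior and then guessing can be sketched in a few lines. This is an illustrative scoring scheme rather than a recommended model: the behavior counts are made up, the event names are hypothetical, both people are assumed equally likely before any evidence, and add-one smoothing is used so that an unseen behavior never rules someone out completely.

```python
from collections import Counter

# Behavior counts from sessions where each person identified themselves
# (made-up numbers for illustration).
history = {
    "Josephine": Counter({"careers_page_view": 40, "product_search": 5}),
    "Joseph": Counter({"careers_page_view": 3, "product_search": 30}),
}


def guess_user(session_events, history):
    """Share of the evidence each known user gets for an anonymous session."""
    behaviors = set()
    for counts in history.values():
        behaviors.update(counts)
    scores = {}
    for user, counts in history.items():
        total = sum(counts.values())
        score = 1.0
        for event in session_events:
            # Add-one smoothing: unseen behaviors get a small, non-zero weight.
            score *= (counts[event] + 1) / (total + len(behaviors))
        scores[user] = score
    norm = sum(scores.values())
    return {user: score / norm for user, score in scores.items()}


after_search = guess_user(["product_search"], history)
after_job_pages = guess_user(["product_search"] + ["careers_page_view"] * 6, history)

print(max(after_search, key=after_search.get))
print(max(after_job_pages, key=after_job_pages.get))
```

With these numbers, a single search points to Joseph, and six careers page views afterwards swing the guess to Josephine, mirroring the progression in the list above.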

This is a very simplified example just meant to illustrate what you can do with access to rich, event level data. The model that you build will be specific to your business and will be designed after thorough exploration of this rich dataset.

Read part 5 next: What can we do with the data when we’re well established?