Snowplow Python Tracker 0.7.0 released

07 August 2015  •  Fred Blundun

We are pleased to announce the release of version 0.7.0 of the Snowplow Python Tracker. This release is focused on making the Tracker more robust.

The rest of this post will cover:

  1. Better concurrency
  2. Better error handling
  3. The SelfDescribingJson class
  4. Unicode support
  5. Upgrading
  6. Getting help

1. Better concurrency

The Python Tracker’s AsyncEmitter now uses the Queue class to implement the producer-consumer pattern, in which a fixed pool of threads works on sending events. Reusing threads this way performs better than the previous implementation, in which a new thread was created for every network request.

You can configure the number of threads to use with the new thread_count keyword argument of the AsyncEmitter’s constructor. If your application only rarely sends events, the default of 1 thread should be good enough; otherwise try experimenting with different values to determine which works best.
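To illustrate the pattern, here is a minimal sketch of a queue-backed emitter with a fixed pool of worker threads. It is not the Tracker's actual code: the class `MiniAsyncEmitter`, its `send` callable and its method names are hypothetical, and it uses the Python 3 `queue` module rather than Python 2's `Queue`.

```python
import queue
import threading


class MiniAsyncEmitter:
    """Hypothetical stand-in for an emitter; not the tracker's real code."""

    def __init__(self, send, thread_count=1):
        self._send = send              # callable that delivers one event
        self._queue = queue.Queue()    # thread-safe buffer between producer and workers
        for _ in range(thread_count):
            # A fixed pool: threads are created once and then reused.
            worker = threading.Thread(target=self._consume, daemon=True)
            worker.start()

    def _consume(self):
        # Each worker loops forever, pulling one event at a time.
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            finally:
                self._queue.task_done()

    def input(self, event):
        # Producer side: enqueue the event and return immediately.
        self._queue.put(event)

    def sync_flush(self):
        # Block until every queued event has been handed to send().
        self._queue.join()


sent = []
lock = threading.Lock()


def record(event):
    # Stand-in for a network request; just remembers the event.
    with lock:
        sent.append(event)


emitter = MiniAsyncEmitter(record, thread_count=3)
for i in range(10):
    emitter.input({"id": i})
emitter.sync_flush()
```

Because the queue hands each event to exactly one worker, no event is sent twice or dropped, although the order in which events are sent is not guaranteed once more than one thread is in use.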

We have also eliminated a race condition where sending many events at once could cause events to be duplicated or skipped.

2. Better error handling

The previous Tracker version only treated requests with status code 200 as successful. This behavior has been broadened to include all 2xx and 3xx status codes.

If a request causes a network-related exception, the Tracker will now catch that exception and treat the request as failed. This means that network unavailability will no longer cause the Tracker to throw an exception.
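The two rules above could be sketched roughly as follows. This is an illustration under assumptions, not the Tracker's implementation: `is_success`, `attempt` and the zero-argument `send` callable are hypothetical names, and the sketch catches Python 3's `OSError` family rather than the exceptions raised by whichever HTTP library the Tracker uses.

```python
def is_success(status_code):
    """Return True for any 2xx or 3xx HTTP status code."""
    return 200 <= status_code < 400


def attempt(send):
    """Call send() and report success; network errors count as failure.

    `send` is a hypothetical zero-argument callable returning a status code.
    """
    try:
        return is_success(send())
    except OSError:
        # ConnectionError and TimeoutError are OSError subclasses in Python 3.
        return False


def unreachable():
    # Simulates the network being unavailable.
    raise ConnectionError("collector unreachable")


results = [
    attempt(lambda: 200),
    attempt(lambda: 302),
    attempt(lambda: 500),
    attempt(unreachable),
]
```

Here `results` is `[True, True, False, False]`: the redirect counts as a success, and the connection error is reported as a failed request instead of propagating to the caller.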

3. The SelfDescribingJson class

The new SelfDescribingJson class is used to make building unstructured events and custom contexts more straightforward.

So instead of fully specifying the JSON you wish to send like this:

my_event = {
	'schema': 'iglu:com.acme/myevent/jsonschema/1-0-0',
	'data': {
		'color': 'red'
	}
}
my_context = {
	'schema': 'iglu:com.acme/mycontext/jsonschema/1-0-1',
	'data': {
		'size': 5
	}
}
my_tracker.track_unstruct_event(my_event, [my_context])

you would now use the SelfDescribingJson class to automatically handle the “schema” and “data” fields like this:

from snowplow_tracker import SelfDescribingJson
my_event = SelfDescribingJson(
	'iglu:com.acme/myevent/jsonschema/1-0-0',
	{
		'color': 'red'
	}
)

my_context = SelfDescribingJson(
	'iglu:com.acme/mycontext/jsonschema/1-0-1',
	{
		'size': 5
	}
)

my_tracker.track_unstruct_event(my_event, [my_context])

WARNING: This is a breaking change: old-style API calls that manually construct unstructured events and custom contexts will no longer work.
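Conceptually, a class like this only needs to pair a schema URI with a data payload and produce the “schema”/“data” envelope on demand. The following sketch shows the idea; the class name `SketchSelfDescribingJson` and its `to_json` method are assumptions made for illustration, and the real class ships with the Tracker.

```python
class SketchSelfDescribingJson:
    """Illustrative only; use the class provided by the tracker in real code."""

    def __init__(self, schema, data):
        self.schema = schema   # Iglu schema URI
        self.data = data       # payload described by that schema

    def to_json(self):
        # Wrap the payload in the "schema"/"data" envelope.
        return {"schema": self.schema, "data": self.data}


my_event = SketchSelfDescribingJson(
    "iglu:com.acme/myevent/jsonschema/1-0-0",
    {"color": "red"},
)
envelope = my_event.to_json()
```

The resulting `envelope` has the same shape as the dictionary that previously had to be written by hand.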

4. Unicode support

Michael Thomas (@mthomas on GitHub) made the Tracker compatible with Python 2.x’s unicode strings. Many thanks, Michael!

5. Upgrading

To add the Snowplow Tracker as a dependency to your own Python app, edit your requirements.txt and add:

snowplow-tracker~=0.7.0

If you are upgrading from the previous version, note that the AsyncEmitter’s new default of a single worker thread may not be fast enough for your workload. If this is the case, experiment with larger values of the thread_count argument.

You will need to update all unstructured events and custom contexts to use the SelfDescribingJson class as described above.

Finally, note that synchronously flushing the AsyncEmitter from inside an on_success or on_failure callback can now lead to deadlock. This is because these callbacks get executed by the worker threads. You can avoid deadlock by performing the flush in a different thread, but it shouldn’t be necessary to do a synchronous flush in a callback at all.
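The deadlock and its workaround can be demonstrated with a small standalone sketch. It does not use the Tracker itself: the `events` queue, the `flush` function and the single `worker` thread are hypothetical stand-ins for the emitter's internals, written against Python 3's standard library.

```python
import queue
import threading

events = queue.Queue()
flushed = threading.Event()


def flush():
    # Synchronous flush: blocks until every queued event is processed.
    events.join()
    flushed.set()


def on_success(event):
    # Calling flush() directly here would make the worker wait on itself.
    # Handing it to a separate thread lets the worker keep draining the queue.
    threading.Thread(target=flush, daemon=True).start()


def worker():
    while True:
        event = events.get()
        try:
            on_success(event)   # callbacks run on the worker thread
        finally:
            events.task_done()


threading.Thread(target=worker, daemon=True).start()
for i in range(3):
    events.put({"id": i})

# Wait for a flush to finish; a timeout guards against hanging forever.
completed = flushed.wait(timeout=5)
```

If `on_success` called `flush()` directly, the only worker would block in `events.join()` before reaching `task_done()`, so the join could never return.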

6. Getting help

Useful links:

If you have an idea for a new feature or want help getting things set up, please get in touch. And of course raise an issue if you spot any bugs!