As described in our Discourse post, MaxMind will not provide monthly updates to their now-legacy databases starting April 2nd.
To tackle this issue and keep the IP Lookups Enrichment as accurate as possible, we are releasing a new version of the enrichment, for both the batch and streaming pipelines, which interacts with GeoIP2 databases, MaxMind’s new format.
A special thanks to Tiago Macedo and Andrew Korzhuev, who worked on the scala-maxmind-iplookups library upgrade, without which this enrichment upgrade wouldn’t have been possible.
On the security side of things, we have made the cross-domain policy of the Clojure Collector configurable; this change is in line with the updates made to the Scala Stream Collector back in Release 98 Argentomagus.
First, what is a Flash cross-domain policy? Quoting the Adobe website:
A cross-domain policy file is an XML document that grants a web client, such as Adobe Flash Player or Adobe Acrobat (though not necessarily limited to these), permission to handle data across domains. When clients request content hosted on a particular source domain and that content make requests directed towards a domain other than its own, the remote domain needs to host a cross-domain policy file that grants access to the source domain, allowing the client to continue the transaction.
To allow a Flash media player hosted on another web server to access content from the Adobe Media Server web server, we require a crossdomain.xml file. A typical use case will be HTTP streaming (VOD or Live) to a Flash Player. The crossdomain.xml file grants a web client the required permission to handle data across multiple domains.
A cross-domain policy file gives the necessary permissions when, for example, you are trying to make a request to a Snowplow collector from a Flash game given that both are running on different hosts.
The Clojure Collector embeds what was a very permissive cross-domain policy file, giving permission to any domain and not enforcing HTTPS:
With this release, we’re completely removing the /crossdomain.xml route by default. Should you need it, manually re-enable it by adding the following two environment properties to your Elastic Beanstalk application:
SP_CDP_DOMAIN: the domain that is granted access, *.acme.com will match both
SP_CDP_SECURE: a boolean indicating whether to grant access only to HTTPS sources or to both HTTPS and HTTP sources
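To make the effect of these two properties concrete, here is a minimal Python sketch of how a policy file could be rendered from them. It is not the Clojure Collector's actual code: the function name render_cross_domain_policy is hypothetical, and treating the route as disabled unless both properties are set is an assumption based on the description above. The allow-access-from element and its domain and secure attributes come from Adobe's cross-domain policy format.

```python
import os
from xml.sax.saxutils import quoteattr


def render_cross_domain_policy(env=os.environ):
    """Return a crossdomain.xml body, or None when the route stays disabled.

    Hypothetical helper: assumes the route is only served when both
    SP_CDP_DOMAIN and SP_CDP_SECURE are present in the environment.
    """
    domain = env.get("SP_CDP_DOMAIN")
    secure = env.get("SP_CDP_SECURE")
    if not domain or secure is None:
        return None  # default behaviour: no /crossdomain.xml route

    # Anything other than "true" is treated as false in this sketch.
    secure_flag = "true" if secure.strip().lower() == "true" else "false"

    return (
        '<?xml version="1.0"?>\n'
        "<cross-domain-policy>\n"
        f"  <allow-access-from domain={quoteattr(domain)} "
        f'secure="{secure_flag}"/>\n'
        "</cross-domain-policy>\n"
    )


if __name__ == "__main__":
    example = {"SP_CDP_DOMAIN": "*.acme.com", "SP_CDP_SECURE": "true"}
    print(render_cross_domain_policy(example))
```

With secure set to true, Flash clients loaded over plain HTTP are not granted access, which is the stricter of the two settings.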
This release also marks the availability of the PII enrichment for the batch pipeline; check out the dedicated blog post to learn more.
This release contains quite a few community contributions which we’d like to highlight. Huge thanks to everyone involved!
Thanks to Mike Robins from Snowflake Analytics, the extraction of IP addresses from collector payloads originating from the Scala Stream Collector has improved.
Snowplow now successfully extracts IPv6 IPs from these Scala Stream Collector payloads, and now inspects the Forwarded header in addition to the historically supported
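As a rough illustration of the Forwarded header inspection mentioned above, the Python sketch below pulls the first usable client IP, IPv4 or IPv6, out of a Forwarded header value as defined in RFC 7239. It is not Snowplow's implementation; the function name forwarded_for_ip and the choice to skip obfuscated or unknown identifiers are assumptions made for this example.

```python
import ipaddress
import re

# Matches each `for=` parameter, quoted or not, in a Forwarded header value.
_FOR_PARAM = re.compile(r'(?:^|[;,\s])for=("[^"]*"|[^;,\s]+)', re.IGNORECASE)


def _clean(candidate):
    """Strip quotes, brackets and an optional port; return an IP or None."""
    value = candidate.strip().strip('"')
    if value.startswith("["):
        # Bracketed IPv6, optionally followed by a port: [2001:db8::1]:4711
        value = value[1:].split("]", 1)[0]
    elif value.count(":") == 1:
        # IPv4 followed by a port: 192.0.2.43:8080
        value = value.split(":", 1)[0]
    try:
        return str(ipaddress.ip_address(value))
    except ValueError:
        return None  # e.g. "unknown" or an obfuscated identifier


def forwarded_for_ip(header_value):
    """Return the first valid IP found in the header's `for=` parameters."""
    for match in _FOR_PARAM.finditer(header_value):
        ip = _clean(match.group(1))
        if ip is not None:
            return ip
    return None


if __name__ == "__main__":
    print(forwarded_for_ip('for="[2001:db8:cafe::17]:4711"'))
    print(forwarded_for_ip("for=192.0.2.60;proto=http;by=203.0.113.43"))
```

Validating each candidate with the standard ipaddress module keeps non-IP tokens, which RFC 7239 allows in the for parameter, from being reported as client IPs.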
subaccount property in the Mandrill events format has meant that many Mandrill events have been failing enrichment.
To resolve this, community member Adam Gray has authored new 1-0-1 schemas for our Mandrill events, and updated the adapter to emit these new versions.
Finally, thanks to Kristoffer Snabb and Thales Mello for improving the repo-embedded documentation, as follows:
Whether you are using the batch or streaming pipeline, it is important to perform this upgrade if you make use of the MaxMind IP Lookups Enrichment.
To make use of the new enrichment, you will need to update your ip_lookups.json so that it conforms to the new
An example is provided in the GitHub repository.
If you are a streaming pipeline user, a version of Stream Enrich incorporating the upgraded IP Lookups Enrichment can be found on our Bintray here.
If you are a batch pipeline user, you’ll need to either update your EmrEtlRunner configuration to the following:
or directly make use of the new Spark Enrich available at:
The new Clojure Collector is available in S3 at:
To re-enable the /crossdomain.xml path, make sure to specify the SP_CDP_DOMAIN and SP_CDP_SECURE environment properties as described above.
We have a packed schedule of new and improved features coming for Snowplow. Upcoming Snowplow releases will include:
For more details on this release, please check out the release notes on GitHub.
If you have any questions or run into any problems, please visit our Discourse forum.