We are pleased to announce the fourth release of the Iglu Schema Registry System, with an initial release of the Iglu Core library, implemented in Scala.
Read on for more information on Release 4 Epaulettes, named after the famous Belgian postage stamps:
1. Scala Iglu Core
Why we created Iglu Core
Our initial development of Iglu two years ago was a somewhat piecemeal process. The design was centred on a few core ideas such as self-describing schemas, SchemaVer and several associated applications and libraries, including Schema Guru, Iglu Scala Client and of course the Snowplow platform itself.
Working on these applications, we found ourselves implementing the same Iglu-related data structures and functions multiple times. To clean up this rather piecemeal approach, we decided to extract this common functionality into a single library – Iglu Core.
The goal of Iglu Core is to provide a reference implementation of the Iglu concepts, which can then be re-implemented for other languages. This is important because Iglu is designed to be platform and language independent – it should be as usable from Scala as it is from Arduino or C++ or JavaScript.
Core concepts
The key elements introduced in our Iglu Core library are:
- SchemaKey, which contains information about the schema for a self-describing entity. A self-describing entity can be JSON data, a JSON Schema or any other rich schema or type system that can be made self-describing
- SchemaVer, part of the SchemaKey holding semantic information about the schema’s version. This is a triplet of MODEL, REVISION and ADDITION
- SchemaCriterion, a default way to filter self-describing entities. It holds a SchemaKey where some or all of the version components (MODEL, REVISION, ADDITION) can be unfilled
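To make these three concepts concrete, here is a minimal, self-contained sketch of how they could be modelled in Scala. It is an illustration under our own assumptions, not the Iglu Core source: the field names, the asString, toSchemaUri and matches helpers, and the use of Option for unfilled version components are all hypothetical.

```scala
// Illustrative sketch only: simplified stand-ins, not the Iglu Core source.

// SchemaVer: the MODEL-REVISION-ADDITION triplet
final case class SchemaVer(model: Int, revision: Int, addition: Int) {
  def asString: String = s"$model-$revision-$addition"
}

// SchemaKey: identifies the schema of a self-describing entity
final case class SchemaKey(vendor: String, name: String, format: String, version: SchemaVer) {
  // Render as an Iglu schema URI, e.g. iglu:com.acme/event/jsonschema/1-0-2
  def toSchemaUri: String = s"iglu:$vendor/$name/$format/${version.asString}"
}

// SchemaCriterion: like a SchemaKey, but any version component may be unfilled (None)
final case class SchemaCriterion(
  vendor: String,
  name: String,
  format: String,
  model: Option[Int] = None,
  revision: Option[Int] = None,
  addition: Option[Int] = None
) {
  // An unfilled component matches any value; a filled one must be equal
  def matches(key: SchemaKey): Boolean =
    vendor == key.vendor && name == key.name && format == key.format &&
      model.forall(_ == key.version.model) &&
      revision.forall(_ == key.version.revision) &&
      addition.forall(_ == key.version.addition)
}

val key       = SchemaKey("com.acme", "event", "jsonschema", SchemaVer(1, 0, 2))
val anyModel1 = SchemaCriterion("com.acme", "event", "jsonschema", model = Some(1))
val exactly2  = SchemaCriterion("com.acme", "event", "jsonschema", model = Some(2))
```

With these definitions, anyModel1.matches(key) holds because only MODEL is filled in, while exactly2.matches(key) does not.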
Scala-specific features
Alongside the key elements set out above, the Scala implementation of Iglu Core has some neat Scala-specific features.
Scala Iglu Core contains type classes for injecting and extracting the SchemaKey for various data types, such as the JSON representations used by different Scala libraries, including Json4s and Circe.
The library also offers container classes called SelfDescribingSchema and SelfDescribingData, to represent the SchemaKey along with the data that key describes.
Use these containers to store, serialize and exchange data inside your Scala code in a more type-safe and concise way.
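The type-class idea can be sketched without any JSON library. The example below is a simplified illustration, not the Iglu Core API: the ExtractSchemaKey trait, the schemaKeyOf helper and the ToyJson alias (a plain Map standing in for a JSON AST) are hypothetical names chosen for this sketch, and the container is a cut-down version of SelfDescribingData.

```scala
// Illustrative sketch only: hypothetical names, not the Iglu Core API.
final case class SchemaVer(model: Int, revision: Int, addition: Int)
final case class SchemaKey(vendor: String, name: String, format: String, version: SchemaVer)

// Container pairing a SchemaKey with the data that key describes
final case class SelfDescribingData[D](schema: SchemaKey, data: D)

// Type class: how to find a SchemaKey inside some entity type E
trait ExtractSchemaKey[E] {
  def extractSchemaKey(entity: E): Option[SchemaKey]
}

// Toy stand-in for a JSON AST: the "schema" field holds an Iglu URI
type ToyJson = Map[String, String]

object ExtractSchemaKey {
  private val IgluUri = """iglu:([^/]+)/([^/]+)/([^/]+)/(\d+)-(\d+)-(\d+)""".r

  // Instance for the toy representation; a real one would target Json4s or Circe
  implicit val toyJsonExtract: ExtractSchemaKey[ToyJson] = new ExtractSchemaKey[ToyJson] {
    def extractSchemaKey(entity: ToyJson): Option[SchemaKey] =
      entity.get("schema").collect {
        case IgluUri(vendor, name, format, m, r, a) =>
          SchemaKey(vendor, name, format, SchemaVer(m.toInt, r.toInt, a.toInt))
      }
  }
}

// Works for any entity type that has an ExtractSchemaKey instance in scope
def schemaKeyOf[E](entity: E)(implicit ev: ExtractSchemaKey[E]): Option[SchemaKey] =
  ev.extractSchemaKey(entity)

val event: ToyJson = Map("schema" -> "iglu:com.acme/event/jsonschema/1-0-0", "data" -> "{}")
val extracted = schemaKeyOf(event)
val wrapped   = extracted.map(k => SelfDescribingData(k, event("data")))
```

The benefit of this design is that code such as schemaKeyOf never needs to know which JSON library is in use; supporting a new representation only means supplying another type-class instance.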
Using Iglu Core
Iglu Core has been designed around Snowplow and Iglu’s own requirements, but we expect the library will be useful to external implementers as well.
Typically you won’t have to learn the details of the Scala Iglu Core’s type classes, since we are also providing complete implementations for popular Scala JSON libraries, starting with iglu-core-json4s and iglu-core-circe.
Just include the appropriate implementation as a dependency in your project (the artifacts are available in Maven Central):
// Json4s implementation
val igluJson4s = "com.snowplowanalytics" %% "iglu-core-json4s" % "0.1.0"
// Or: Circe implementation
val igluCirce = "com.snowplowanalytics" %% "iglu-core-circe" % "0.1.0"
Here is an example using iglu-core-json4s:
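import org.json4s.JValue
import com.snowplowanalytics.iglu.core.{SchemaKey, SchemaVer, SelfDescribingData}
import com.snowplowanalytics.iglu.core.json4s._

// Bring the Json4s stringifier into implicit scope so that asString is available
implicit val stringifyData = StringifyData

// The fourth argument is the schema version (MODEL, REVISION, ADDITION)
val schemaKey = SchemaKey("com.acme", "event", "jsonschema", SchemaVer(1,0,0))
val data: JValue = ???  // your JSON payload goes here

// Serialize the key together with the data as a self-describing JSON string
SelfDescribingData(schemaKey, data).asString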
More detailed information can be found on wiki pages dedicated to Iglu Core and Scala Iglu Core.
2. Registry Syncer updates
Until recently, a static Iglu registry was the default way to host schemas; that is now changing as the Scala-based RESTful registry server starts to mature.
To help our users work with the registry server, Iglu includes a tool called Registry Syncer, a simple Bash script allowing you to populate a registry server over HTTP in a few commands.
This release introduces the following minor improvements to Registry Syncer:
- We changed the name from Repo Syncer (as we are now referring to “schema registries” not “schema repositories”)
- The synchronization process now stops on the first failure
- We use PUT instead of POST, so existing schemas can be automatically overridden
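The stop-on-first-failure behaviour can be illustrated with a short Scala sketch. This is not the Registry Syncer script (which is written in Bash); the syncAll function is hypothetical, and the put parameter is an assumed stand-in for whatever performs the HTTP PUT of a single schema file.

```scala
// Hypothetical sketch, not the Registry Syncer script itself.
// `put` stands in for an HTTP PUT of one schema; Left carries an error message.
def syncAll(paths: List[String])(put: String => Either[String, Unit]): Either[String, Int] =
  paths.foldLeft[Either[String, Int]](Right(0)) {
    case (Right(uploaded), path) => put(path).map(_ => uploaded + 1)
    case (failed, _)             => failed // already failed: skip the remaining uploads
  }

// Fake uploader that records every attempt and fails on one particular schema
var attempted = List.empty[String]
val result = syncAll(List("a/1-0-0", "b/1-0-0", "c/1-0-0")) { path =>
  attempted = attempted :+ path
  if (path == "b/1-0-0") Left(s"upload failed: $path") else Right(())
}
```

Here the third schema is never attempted: result carries the first error, and attempted contains only the first two paths. Because PUT replaces an existing schema, re-running the whole sync after fixing the failure is safe.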
To bootstrap your RESTful registry server with schemas, run:
${iglu_dir}/0-common/registry-syncer/sync.bash http://iglu.acme.com:8080 ${super_api_key} ${schemas_dir}
where ${iglu_dir} holds a checked-out copy of the Iglu repository, ${super_api_key} is the API key you created earlier and ${schemas_dir} holds a directory of schemas.
3. Iglu roadmap
We have a lot planned for Iglu – both in terms of new functionality and ongoing clean-up and consolidation of our existing Iglu technology.
The next release will introduce an Iglu command-line tool, “Iglu CLI”, to help users with various Iglu-related tasks. To start with, we will port over to Iglu CLI:
- Schema Guru’s current schema-guru ddl command, which will evolve into a static registry generator command in Iglu CLI
- Our Registry Syncer, which will be ported from Bash into Scala and added as an Iglu CLI sub-command
Beyond Iglu CLI we have plenty more planned for Iglu, including adding first class support within Iglu for database table definitions (such as Redshift), mappings between different data formats (e.g. JSON Schema to Redshift), and schema migrations. Stay tuned!
4. Getting help
If you have any questions or run into any problems, please raise an issue or get in touch with us through the usual channels.