Schema Guru 0.5.0 released

11 February 2016  •  Anton Parkhomenko

We are pleased to announce the releases of Schema Guru 0.5.0 and Schema DDL 0.3.0, with JSON Schema and Redshift DDL processing enhancements and several bug fixes.

This release post will cover the following topics:

  1. More git-friendly DDL files
  2. Added Java interoperability
  3. Fixed DDL file version bug
  4. Improvements in Schema-to-DDL transformation
  5. Upgrading
  6. Getting help
  7. Plans for future releases

1. More git-friendly DDL files

Schema Guru users usually store their DDL files alongside their JSON Schemas in a single git repository. When a user adds or modifies schemas and then regenerates the DDL files, every DDL file gets a fresh timestamp, which leads to confusing git diffs.

To avoid this you can now use the --no-header option, which makes Schema Guru generate DDL files without any header information, just plain DDL.

2. Added Java interoperability

Java users have been keen to use Schema Guru from their code. As of this release, all the schema-to-DDL processing and schema flattening features of the Schema DDL library are available from Java.

3. Fixed DDL file version bug

Schema DDL had a long-standing bug where it versioned all Redshift DDLs with a hardcoded _1 version postfix.

To work well with SchemaVer, our Redshift DDL should be versioned after the MODEL element of the JSON Schema's version. For example, schemas with SchemaVers 1-0-0, 1-2-0 or 1-2-3 should all result in a table with the version postfix _1, and events having any of these three versions can be loaded into this _1 table.

Many thanks to community member Cameron Bytheway for his fix here!
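
The versioning rule described above can be illustrated with a minimal Python sketch. This is not the Schema DDL implementation: the function name table_version_postfix and the regular expression are hypothetical, chosen only to show how the MODEL element of a SchemaVer maps to a table postfix.

```python
import re

# Hypothetical helper, not part of Schema DDL: SchemaVer has the form
# MODEL-REVISION-ADDITION, for example 1-2-3.
SCHEMAVER_PATTERN = re.compile(r"([1-9][0-9]*)-(0|[1-9][0-9]*)-(0|[1-9][0-9]*)")


def table_version_postfix(schema_ver: str) -> str:
    """Return the table postfix derived from the MODEL element only."""
    match = SCHEMAVER_PATTERN.fullmatch(schema_ver)
    if match is None:
        raise ValueError(f"not a valid SchemaVer: {schema_ver!r}")
    # REVISION and ADDITION are deliberately ignored.
    return f"_{match.group(1)}"


for version in ("1-0-0", "1-2-0", "1-2-3", "2-0-0"):
    print(version, "->", table_version_postfix(version))
```

Under this sketch, the first three versions share one postfix, while a schema with a new MODEL gets a postfix of its own.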

4. Improvements in Schema-to-DDL transformation

With each new release Schema Guru is steadily growing smarter at transforming JSON Schemas into DDL files, detecting various clues about how to map JSON Schema properties into column definitions.

This release brings the following improvements:

  • A property of type string with equal minLength and maxLength will become CHAR, even if it can also be null
  • A property of type number with multipleOf equal to 1 will become INT
  • A property of type number with multipleOf equal to 0.01 will become DECIMAL with 2 digits after the decimal point (this is useful for monetary amounts)
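
The three rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code Schema DDL runs: the function names are hypothetical, the CHAR length argument and the default DECIMAL precision are placeholders not taken from this post, and ignoring null for number properties is also an assumption.

```python
from typing import Optional


def non_null_types(prop: dict) -> set:
    """Collect the declared JSON Schema types, leaving out "null"."""
    declared = prop.get("type", [])
    types = {declared} if isinstance(declared, str) else set(declared)
    return types - {"null"}


def sketch_column_type(prop: dict, decimal_precision: int = 36) -> Optional[str]:
    """Hypothetical mapping covering only the three rules listed above.

    decimal_precision is a placeholder; this post does not state the
    precision Schema DDL actually uses.
    """
    types = non_null_types(prop)
    if types == {"string"}:
        min_len, max_len = prop.get("minLength"), prop.get("maxLength")
        # Fixed-length strings become CHAR, whether or not null is allowed.
        if min_len is not None and min_len == max_len:
            return f"CHAR({max_len})"
    if types == {"number"}:
        multiple_of = prop.get("multipleOf")
        if multiple_of == 1:
            return "INT"
        if multiple_of == 0.01:
            return f"DECIMAL({decimal_precision},2)"
    return None  # no rule from this list applies


print(sketch_column_type({"type": ["string", "null"], "minLength": 3, "maxLength": 3}))
print(sketch_column_type({"type": "number", "multipleOf": 1}))
print(sketch_column_type({"type": "number", "multipleOf": 0.01}))
```

Returning None for anything else keeps the sketch limited to the rules added in this release; the real library has many more mappings.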

5. Upgrading

Schema Guru CLI

Simply download the latest Schema Guru from Bintray:

$ wget http://dl.bintray.com/snowplow/snowplow-generic/schema_guru_0.5.0.zip
$ unzip schema_guru_0.5.0.zip

Assuming you have a recent JVM installed, running should be as simple as:

$ ./schema-guru-0.5.0 {schema|ddl} {input} {options}

Schema Guru web UI and Spark Job

No changes have been made to either the Schema Guru web UI or the Spark Job, so you can still freely use the 0.4.0 versions. Versions with the 0.5.0 badge are also available on Bintray for consistency.

6. Getting help

For more details on this release, please check out the Schema Guru 0.5.0 release on GitHub.

More details on the technical architecture of Schema Guru can be found on the For Developers page of the Schema Guru wiki.

If you have any questions or run into any problems, please raise an issue or get in touch with us through the usual channels.

7. Plans for future releases

We have plenty of features planned for Schema Guru! The roadmap includes:

  • Generating schemas in Apache Avro format (issue #38)
  • Deriving the required property in our schema subcommand (issue #54)
  • Generating CREATE TABLE DDL for other databases (issue #26)