Getting started

Snow Owl® is a highly scalable, open source terminology server and collaborative authoring platform. It allows you to store, search and author high volumes of terminology artifacts quickly and efficiently.

Here are a few use cases for Snow Owl:

You work in the healthcare industry and are interested in using a terminology server for browsing, accessing and distributing components of various terminologies and classifications to third-party consumers. In this case, you can use Snow Owl to load the necessary terminologies and access them via FHIR and proprietary APIs.
You are responsible for maintaining and publishing new versions of a particular terminology. In this case, you can use Snow Owl to collaboratively access and author the terminology content and at the end of your release schedule publish it with confidence and zero errors.
You have an Electronic Health Record system and would like to capture, maintain and query clinical information in a structured and standardized manner. Your Snow Owl terminology server can integrate with your EHR server via standard APIs to provide the necessary access for both terminology binding and data processing and analytics.

In this tutorial, you will be guided through the process of getting Snow Owl up and running, taking a peek inside it, and performing basic operations like importing SNOMED CT RF2 content, searching, and modifying your data. At the end of this tutorial, you should have a good idea of what Snow Owl is, how it works, and hopefully be inspired to see how you can use it for your needs.

Basic Concepts

There are a few concepts that are core to Snow Owl. Understanding these concepts from the outset will tremendously help ease the learning process.

Terminology / Code System

A terminology (also known as a code system, classification, or ontology) defines and encapsulates a set of terminology components (e.g. a set of codes with their meanings) and versions. A terminology is identified by a unique name and stored in a repository. Multiple code systems can exist side by side in a single repository as long as their names are unique.

Terminology Component

A terminology component is a basic element in a code system with actual clinical meaning or use. For example, in SNOMED CT the Concept, Description, Relationship, and Reference Set Member are terminology components.

Version

A version refers to an important snapshot in time that is consistent across many terminology components; it is also known as a tag or label. It is often created when the state of the terminology is deemed ready to be published and distributed to downstream customers or for internal use. A version is identified by its version ID (or version tag) within a given code system.

Repository

A repository manages changes to a set of data over time in the form of revisions. It is conceptually very similar to a source code repository (like a Git repository), but information stored in it must conform to a predefined schema (e.g. the SNOMED CT Concepts RF2 schema) rather than being stored in pure binary or textual format. This way a repository can support various full-text search functionalities, semantic queries, and evaluations on the stored, revision-controlled terminology data.

A repository is identified by a name and this name is used to refer to the repository when performing create, read, update, delete and other operations against the revisions in it. Repositories organize revisions into branches and commits.

Revision

A revision is the basic unit of information stored in a repository about a terminology component or artifact. It contains two types of information:

one is the actual data that you care about, for example a single code from a code system with its meaning and properties.
the other is revision control information (aka revision metadata). Each revision is identified by a random Universally Unique IDentifier (UUID) that is assigned when performing a commit in the repository. Also, during a commit each revision is associated with a branch and timestamp. Revisions can be compared, restored, and merged.
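
To make the two parts of a revision concrete, the following Python sketch models them as a small data class. It is a conceptual illustration only, not Snow Owl's internal storage format; the names `Revision` and `commit`, and the use of plain dictionaries for the data, are assumptions made for this example.

```python
from __future__ import annotations

import time
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """Conceptual model only; not Snow Owl's actual storage format."""
    id: str         # random UUID assigned when the commit is performed
    branch: str     # branch the commit was made on
    timestamp: int  # commit time in milliseconds since the epoch
    data: dict      # the actual content, e.g. one code with its properties


def commit(branch: str, documents: list[dict]) -> list[Revision]:
    """Wrap each document in a revision; all of them share one branch and timestamp."""
    timestamp = int(time.time() * 1000)
    return [
        Revision(id=str(uuid.uuid4()), branch=branch, timestamp=timestamp, data=doc)
        for doc in documents
    ]
```

Every revision produced by one `commit` call gets its own identifier, while the branch and timestamp are shared by the whole commit.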

Branch

A set of components under version control may be branched or forked at a point in time so that, from that time forward, two copies of those components may develop at different speeds or in different ways independently of each other. At a later point in time the changes made on one of these branches can be merged into the other.

Branches are organized into hierarchies like directories in file systems. A child branch has access to all of the information that is stored on its parent branch up until its baseTimestamp, which is the time the branch was created. Each repository has a predefined root branch, called MAIN.
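
The visibility rule described above can be sketched in a few lines of Python. This is a simplified conceptual model, not Snow Owl's implementation: the `Branch` class, its field names, and the `is_visible` function are hypothetical, and timestamps are plain integers.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Branch:
    """Conceptual model only; names are illustrative."""
    path: str                        # e.g. "MAIN" or "MAIN/SNOMEDCT"
    base_timestamp: int              # the time the branch was created
    parent: Optional[Branch] = None  # None for the root branch


def is_visible(branch: Branch, rev_branch_path: str, rev_timestamp: int) -> bool:
    """Return True if a revision committed on rev_branch_path at rev_timestamp
    can be seen from the given branch."""
    if rev_branch_path == branch.path:
        return True  # committed directly on this branch
    if branch.parent is None:
        return False  # reached the root without finding the revision's branch
    # Content inherited from ancestors is cut off at this branch's base timestamp.
    return rev_timestamp <= branch.base_timestamp and is_visible(
        branch.parent, rev_branch_path, rev_timestamp
    )
```

With `main = Branch("MAIN", 0)` and `child = Branch("MAIN/SNOMEDCT", 100, main)`, a revision committed on MAIN at time 50 is visible from the child branch, while one committed on MAIN at time 150 is not.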

Commit

A commit represents a set of changes made against a branch in a repository. After a successful commit, the changes made by the commit are immediately available and searchable on the given branch.

Merge / Rebase

A merge/rebase is an operation in which two sets of changes are applied to a set of components. A merge/rebase always happens between two branches, denoting one as the source and the other as the target of the operation.

Installation

Snow Owl requires Java 11 or newer. As of this writing, JDK version 11.0.2 is recommended (Oracle or OpenJDK is preferred). Java installation varies from platform to platform, so we won’t go into those details here; Oracle’s recommended installation documentation can be found on Oracle’s website. Before you install Snow Owl, check your Java version by running the following commands, then install or upgrade if needed:

java -version
echo $JAVA_HOME
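
If you want to automate this check, the version string printed by `java -version` can be parsed with a short script. The sketch below is one possible approach in Python; the function name `java_major_version` is made up for this example, and the regular expression assumes the usual `version "..."` format printed by Oracle and OpenJDK builds.

```python
import re
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and older report themselves as "1.8.0_...", "1.7.0_...", and so on.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def meets_requirement(version_output: str, minimum: int = 11) -> bool:
    """Return True if the reported major version is at least `minimum`."""
    major = java_major_version(version_output)
    return major is not None and major >= minimum
```

Note that `java -version` writes to standard error, so when calling it from Python the text is found in the `stderr` attribute of `subprocess.run(["java", "-version"], capture_output=True, text=True)`.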

Once we have Java set up, we can download and run Snow Owl. The binaries are available on the Releases page. For each release, you can choose between a zip or tar archive and a DEB or RPM package.

Installation example with zip

For simplicity, let's use a zip file.

Let's download the most recent Snow Owl release as follows:

curl -L -O https://github.com/b2ihealthcare/snow-owl/releases/download/<version>/snow-owl-oss-<version>.zip

Then extract it as follows:

unzip snow-owl-oss-<version>.zip

This creates a number of files and folders in your current directory. Next, go into the bin directory:

cd snow-owl-oss-<version>/bin

And now we are ready to start the instance:

./startup

Successfully running instance

If everything goes well with the installation, you should see log messages similar to the following:

TODO example output

Explore Snow Owl

Now that we have our instance up and running, the next step is to understand how to communicate with it. Fortunately, Snow Owl provides very comprehensive and powerful APIs to interact with your instance.

REST API

Among other things, the API can be used to:

Check your instance health, status, and statistics
Administer your instance data
Perform CRUD (Create, Read, Update, and Delete) and search operations against your terminologies
Execute advanced search operations such as paging, sorting, filtering, scripting, aggregations, and many others

Check Health

Let’s start with a basic health check, which we can use to see how our instance is doing. We’ll be using curl to do this, but you can use any tool that allows you to make HTTP/REST calls. Let’s assume that we are still on the same node where we started Snow Owl and open another command shell window.

We will use Snow Owl's Core API to check its status. Run the following command in a terminal:

curl http://localhost:8080/snowowl/info?pretty

And the response:

{
  "version": "<version>",
  "description": "You Know, for Terminologies",
  "repositories": {
    "items": [ {
      "id" : "snomed",
      "health" : "GREEN",
      "diagnosis" : "",
      "indices" : [ {
        "index" : "snomed-relationship",
        "status" : "GREEN"
      }, {
        "index" : "snomed-commit",
        "status" : "GREEN"
      }, ... ]
    } ]
  }
}

We can see the installed version along with the available repositories, their overall health (e.g. "snomed" with health "GREEN"), and the associated indices with their status (e.g. "snomed-relationship" with status "GREEN").

Repository indices store content for any number of code systems that share the same data structure and API; in the case of "snomed", these are the International Edition of SNOMED CT and its extensions.

Whenever we ask for repository status, we either get GREEN, YELLOW, or RED and an optional diagnosis message.

GREEN - everything is good (repository is fully functional)
YELLOW - some data or functionality is not available, or diagnostic operation is in progress (repository is partially functional)
RED - diagnostic operation required in order to continue (repository is not functional)
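
A monitoring script could apply these rules to the parsed response of the info endpoint. The following Python sketch assumes the response has already been loaded into a dictionary with the shape shown earlier; the function name `unhealthy_repositories` is hypothetical.

```python
def unhealthy_repositories(info: dict) -> dict:
    """Map repository id -> (health, diagnosis) for every repository that is not GREEN."""
    problems = {}
    for repo in info.get("repositories", {}).get("items", []):
        health = repo.get("health")
        if health != "GREEN":
            problems[repo.get("id")] = (health, repo.get("diagnosis", ""))
    return problems
```

An empty result means every listed repository is fully functional. Any YELLOW or RED entry is returned together with its optional diagnosis message so that it can be logged or shown to an administrator.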

List available Code Systems

Now let's take a peek at our code systems:

curl http://localhost:8080/snowowl/codesystems?pretty

The response:

{
  "items" : [ ],
  "limit" : 0,
  "total" : 0
}

...it sure looks empty! This is expected, as Snow Owl does not contain any predefined code system metadata out of the box. We can create the first code system with the following request:

curl -X POST \
-H "Content-type: application/json" \
http://localhost:8080/snowowl/codesystems \
-d '{
  "id": "SNOMEDCT",
  "url": "http://snomed.info/sct/900000000000207008",
  "title": "SNOMED CT International Edition",
  "language": "en",
  "description": "SNOMED CT International Edition",
  "status": "active",
  "copyright": "(C) 2022 International Health Terminology Standards Development Organisation 2002-2022. All rights reserved.",
  "owner": "snowowl",
  "contact": "https://snomed.org",
  "oid": "2.16.840.1.113883.6.96",
  "toolingId": "snomed",
  "settings": {
    "moduleIds": [
      "900000000000207008",
      "900000000000012004"
    ],
    "locales": [
      "en-x-900000000000508004",
      "en-x-900000000000509007"
    ],
    "languages": [
      {
        "languageTag": "en",
        "languageRefSetIds": [
          "900000000000509007",
          "900000000000508004"
        ]
      },
      {
        "languageTag": "en-us",
        "languageRefSetIds": [
          "900000000000509007"
        ]
      },
      {
        "languageTag": "en-gb",
        "languageRefSetIds": [
          "900000000000508004"
        ]
      }
    ],
    "publisher": "SNOMED International",
    "namespace": "373872000",
    "maintainerType": "SNOMED_INTERNATIONAL"
  }
}'

Use of SNOMED CT is subject to additional conditions not listed here, and the full copyright notice has been shortened for brevity in the request above. Please see https://www.snomed.org/snomed-ct/get-snomed for details.

The request body includes:

The code system identifier "SNOMEDCT"
Various pieces of metadata offering a human-readable title, ownership and contact information, code system status, URL and OID for identification, etc.
The tooling identifier "snomed" that points to the repository that will store content
Additional code system settings stored as key-value pairs
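
The same request can also be prepared from a script. The Python sketch below only builds the HTTP request object using the standard library and does not send it; the function name `build_create_code_system_request` is an assumption for this example, and the dictionary passed to it should carry the same fields as the JSON body shown above.

```python
import json
import urllib.request


def build_create_code_system_request(base_url: str, code_system: dict) -> urllib.request.Request:
    """Prepare a POST request that registers a code system (nothing is sent here)."""
    body = json.dumps(code_system).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/codesystems",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Abbreviated payload for illustration; a real request should include the
# remaining metadata and settings from the curl example.
request = build_create_code_system_request(
    "http://localhost:8080/snowowl",
    {"id": "SNOMEDCT", "title": "SNOMED CT International Edition", "toolingId": "snomed"},
)
```

Passing the prepared object to `urllib.request.urlopen(request)` would submit it to a running instance.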

If everything goes well, the command will run without any errors (the server returns a "204 No Content" response). We can double-check that code system metadata has been registered correctly with the following request:

curl http://localhost:8080/snowowl/codesystems/SNOMEDCT?pretty

The expected response is:

{
  "id": "SNOMEDCT",
  "url": "http://snomed.info/sct/900000000000207008",
  "title": "SNOMED CT International Edition",
  "language": "en",
  ...
  "branchPath": "MAIN/SNOMEDCT",
  ...
}

In addition to the submitted values, you will find that additional administrative properties also appear in the output. One example is branchPath which specifies the working branch of the code system within the repository.

SNOMED CT

Now that we have a code system, let's take a look at its content! We can list concepts using either the SNOMED CT API tailored to this tooling, or the FHIR API for a representation that is uniform across different kinds of code systems. For the sake of simplicity, we will use the former in this example.

To list all available concepts in a code system, use the following command (the second path segment, SNOMEDCT, is the code system identifier):

curl http://localhost:8080/snowowl/snomedct/SNOMEDCT/concepts?pretty

The expected response is:

{
  "items": [ ],
  "limit": 50,
  "total": 0
}

The concept list is empty, indicating that we haven't imported anything into Snow Owl - yet.

Import RF2 distribution

Let's import an RF2 release in SNAPSHOT mode so that we can further explore the available SNOMED CT APIs! To do so, use the import request from the SNOMED CT API as follows (the second path segment, SNOMEDCT, is the code system identifier):

curl -v http://localhost:8080/snowowl/snomedct/SNOMEDCT/import?type=snapshot\&createVersions=false \
-F file=@SnomedCT_RF2Release_INT_20170731.zip

Curl will display the entire interaction between it and the server, including many request and response headers. We are interested in these two (response) rows in particular:

< HTTP/1.1 201 Created
< Location: http://localhost:8080/snowowl/snomedct/SNOMEDCT/import/107f6efa69886bfdd73db5586dcf0e15f738efed

The first one indicates that the file was uploaded successfully and a resource has been created to track import progress, while the second row indicates the location of this resource.

Depending on the size and type of the RF2 package, hardware and Snow Owl configuration, RF2 imports might take hours to complete. Official SNAPSHOT distributions can be imported in less than 30 minutes by allocating 6 GB of heap size to Snow Owl and configuring it to use a solid state disk for the data directory.

The process itself is asynchronous and its status can be checked by periodically sending a GET request to the location indicated by the response header:

curl http://localhost:8080/snowowl/snomedct/SNOMEDCT/import/107f6efa69886bfdd73db5586dcf0e15f738efed?pretty

The expected response while the import is running:

{
  "id" : "107f6efa69886bfdd73db5586dcf0e15f738efed",
  "status" : "RUNNING"
}

Upon completion, you should receive a different response which lists component identifiers visited during the import as well as any defects encountered in uploaded release files:

{
  "id" : "107f6efa69886bfdd73db5586dcf0e15f738efed",
  "status" : "FINISHED",
  "response" : {
    "visitedComponents" : [ ... ],
    "defects" : [ ],
    "success" : true
  }
}
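
The periodic status check can be automated with a small polling loop. The Python sketch below is one way to do it under a few assumptions: `wait_for_import` and its parameters are hypothetical names, the caller supplies a `fetch_status` function that returns the parsed JSON of the import resource, and any status other than "RUNNING" is treated as final, since only RUNNING and FINISHED appear in the examples above.

```python
import time
from typing import Callable


def wait_for_import(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 10.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the import resource until it is no longer RUNNING and return the last response."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") != "RUNNING":
            return status
        sleep(poll_seconds)
    raise TimeoutError("import still running after the maximum number of polls")
```

In practice `fetch_status` could issue a GET request to the URL from the Location header and parse the body with `json.load`. After the function returns, check the `success` flag and the `defects` list in the `response` object.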

Search SNOMED CT

GET the ROOT concept:

curl 'http://localhost:8080/snowowl/snomedct/MAIN/concepts/138875005'

And the response:

{
  "id": "138875005",
  "released": true,
  "active": true,
  "effectiveTime": "20020131",
  "moduleId": "900000000000207008",
  "iconId": "138875005",
  "definitionStatus": "PRIMITIVE",
  "subclassDefinitionStatus": "NON_DISJOINT_SUBCLASSES"
}

Search by ECL:

curl 'http://localhost:8080/snowowl/snomedct/MAIN/concepts?active=true&ecl=%3C!138875005&limit=1'

And the response:

{
  "items": [
    {
      "id": "308916002",
      "released": true,
      "active": true,
      "effectiveTime": "20020131",
      "moduleId": "900000000000207008",
      "iconId": "138875005",
      "definitionStatus": "PRIMITIVE",
      "subclassDefinitionStatus": "NON_DISJOINT_SUBCLASSES"
    }
  ],
  "searchAfter": "AoE_BWVlYzI3Mjc0LTYyZTctNDg3NS05NmVlLThhNTk3OTcxOTJiNw==",
  "limit": 1,
  "total": 19
}
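
The response reports 19 matches but returns only one item because of `limit=1`, together with a `searchAfter` token. The Python sketch below shows how a client might build the URL-encoded query and walk through all pages. It assumes that the `searchAfter` value from a response can be sent back as a `searchAfter` query parameter to fetch the next page; the function names and the injected `fetch_page` callable are hypothetical.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def concept_search_url(
    base_url: str,
    path: str,
    ecl: str,
    limit: int = 50,
    search_after: Optional[str] = None,
) -> str:
    """Build a concept search URL with the ECL expression percent-encoded."""
    params = {"active": "true", "ecl": ecl, "limit": limit}
    if search_after is not None:
        # Assumption: the token from the previous page is passed back under this name.
        params["searchAfter"] = search_after
    return f"{base_url}/snomedct/{path}/concepts?{urlencode(params)}"


def iterate_concepts(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield concepts page by page until a page is empty or carries no token."""
    search_after = None
    while True:
        page = fetch_page(search_after)
        items = page.get("items", [])
        yield from items
        search_after = page.get("searchAfter")
        if not items or search_after is None:
            break
```

Here `fetch_page` would request `concept_search_url(...)` with the given token and return the parsed JSON. Keeping the HTTP call outside the generator makes the paging logic easy to test without a running server.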

Create a Concept

Version SNOMED CT

Export SNOMED CT

Conclusion

Snow Owl is both a simple and complex product. We’ve so far learned the basics of what it is, how to look inside of it, and how to work with it using some of the available APIs. Hopefully this tutorial has given you a better understanding of what Snow Owl is and more importantly, inspired you to further experiment with the rest of its great features!