Technology stack

The technology stack behind the Terminology Server consists of the following components:

  • The Terminology Server application

  • Elasticsearch as the data layer

  • An LDAP-compliant authentication and authorization service

  • Optional: A reverse-proxy handling the requests towards the REST API

Terminology Server

Outgoing communication from the Terminology Server goes via:

  • HTTP(s) towards Elasticsearch

  • LDAP(s) towards the A&A service

Incoming communication is handled through HTTP port 8080.

A selected reverse proxy solution is responsible for channeling all incoming traffic through to the Terminology Server.
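As an illustration, a minimal (assumed) NGINX server block that channels incoming traffic to the Terminology Server's HTTP port could look like this; the release package ships its own snowowl.conf, which is authoritative:

```nginx
server {
    listen 80;
    server_name snow-owl.example.com;   # hypothetical domain

    location /snowowl/ {
        # forward REST API traffic to the Terminology Server's HTTP port
        proxy_pass http://127.0.0.1:8080/;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```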

Elasticsearch

The currently supported version of Elasticsearch is v7.17.1, which is upward compatible with any patch releases on the 7.x version stream. Elasticsearch v8 is not supported yet.

The Elasticsearch cluster can either be:

  • a co-located, single-node, self-hosted cluster

  • a managed Elasticsearch cluster hosted by elastic.co

Warning: having a co-located Elasticsearch service next to the Terminology Server has a direct impact on the hardware requirements. See our list of recommended hardware on the next page.

LDAP-compliant A&A service

For authorization and authentication, the application supports any traditional LDAP directory server. We recommend starting with OpenLDAP and evolving to other solutions later: it is easy to set up and maintain while keeping Snow Owl's user data isolated from any other A&A services.

Reverse proxy

A reverse proxy, such as NGINX, is recommended between the Terminology Server and either the intranet or the internet. This increases security and helps channel REST API requests appropriately.

With a preconfigured domain name and DNS record, the default installation package can take care of requesting and maintaining the necessary certificates for secure HTTP. See the details of this in the Configuration section.

Note: to simplify the initial setup process, the Terminology Server ships with a default configuration of a co-located Elasticsearch cluster, a pre-populated OpenLDAP server, and an NGINX reverse proxy with the ability to opt in for an SSL certificate.
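As an illustration, the default stack can be sketched in docker-compose terms roughly as follows; service names, images, and tags here are assumptions, and the shipped docker-compose.yml is authoritative:

```yaml
# illustrative sketch only; the real file ships in ./snow-owl/docker/
services:
  snowowl:
    image: b2i/snow-owl:8.x                  # hypothetical image name
    ports:
      - "127.0.0.1:8080:8080"                # REST API, proxied by nginx
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.1
  ldap:
    image: osixia/openldap:latest            # hypothetical choice of OpenLDAP image
  nginx:
    image: nginx:stable
    ports:
      - "80:80"
      - "443:443"
```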

Snow Owl® TS Admin Guide (v8.9.2)

Introduction

Welcome to the official documentation of the Snow Owl Terminology Server: the search and authoring engine that powers the Snow Owl Authoring Platform and the Snowray Terminology Service. If you want to learn how to install and provision the Terminology Server, you've come to the right place. This guide shows you how to:

  • Select the appropriate hardware and software environment to host the service

  • Download, install and configure the entire technology stack necessary for operating the server

  • Handle release packages to upgrade to a newer version

  • Perform a data backup or a restore

  • Manage intermittent tasks, e.g. adding/revoking user access

In case you would like to skip ahead, here is a set of quick links leading to the different sections of the guide: 🗺️ Plan your deployment, 📤 Configuration, ↗️ Software Upgrades, 💾 Backup and Restore, 📋 Miscellaneous.

[Figure: Snow Owl 8.x Terminology Server Architecture Diagram]

Hardware requirements

Snow Owl TS and co-located Elasticsearch cluster

For installations where Snow Owl TS and Elasticsearch are co-located, we recommend the following hardware specification:

    Snow Owl TS + ES     Cloud                 Dedicated
    vCPU                 8                     8
    Memory               32 GB                 32 GB
    I/O performance      >= 5000 IOPS SSD      >= 5000 IOPS SSD
    Disk space           200 GB                200 GB

Snow Owl TS and managed Elasticsearch cluster

For installations where Snow Owl TS connects to a managed Elasticsearch cluster at elastic.co, we recommend the following hardware specification:

    Snow Owl TS          Cloud                         Dedicated
    vCPU                 8 (compute optimized)         8
    Memory               16 GB                         16 GB
    I/O performance      OS: balanced disk             OS: HDD / SSD
                         TS file storage: local SSD    TS file storage: SSD
    Disk space           OS: 20 GB                     OS: 20 GB
                         TS file storage: 100 GB       TS file storage: 100 GB

    Elasticsearch @ elastic.co     Cloud
    vCPU                           8 (compute optimized)
    Memory                         4 GB
    I/O performance                handled by elastic.co
    Disk space                     180 GB

Cloud VMs

Here are a few examples of Virtual Machine types that could be used for hosting the Terminology Server at the three most popular cloud providers (including but not limited to):

    Cloud Provider     VM type
    GCP                c2d-highcpu-8
    AWS                c5d-2xlarge
    Azure              F8s v2

Software requirements

Operating System

The Terminology Server should be installed on an x86_64 / amd64 Linux operating system where Docker Engine is available; see Docker's list of supported distributions.

Here is the list of distributions that we suggest, in order of recommendation:

  • CentOS 7

  • Ubuntu 20.04 (or 18.04)

  • Debian 10 - Buster

Software packages

Before starting the actual deployment of the Terminology Server, make sure the following packages are installed and configured properly:

  • Docker Engine

  • docker-compose

  • the ability to execute bash scripts

Firewall

In case a reverse proxy is used, the Terminology Server requires two ports to be opened, either towards the intranet or the internet (depending on usage):

  • http:80

  • https:443

In case there is no reverse proxy installed, the following port must be opened to be able to access the server's REST API:

  • http:8080
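On distributions that use firewalld (e.g. CentOS 7), the required ports could be opened as follows; this is an assumed sketch, adapt it to the firewall actually in use:

```
# reverse-proxy setup: open HTTP and HTTPS (run as root)
firewall-cmd --permanent --add-port=80/tcp
firewall-cmd --permanent --add-port=443/tcp
# without a reverse proxy, open the REST API port instead:
# firewall-cmd --permanent --add-port=8080/tcp
firewall-cmd --reload
```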

    Release package

    Terminology Server releases are shared with customers through custom download URLs. The downloaded artifact is a Linux (tar.gz) archive that contains:

    • an initial folder structure

    • the configuration files for all services

    • a docker-compose.yml file that brings together the entire technology stack to run and manage the service

    • the credentials required to pull our proprietary docker images

As a best practice, extract the content of the archive under /opt, so the deployment folder will be /opt/snow-owl. The docker-compose setup relies on this path; however, if required, it can be changed later by editing the ./snow-owl/docker/.env file (see the DEPLOYMENT_FOLDER environment variable).
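For illustration only, the relevant entry in ./snow-owl/docker/.env would then read as follows (the full file ships with the release package; only the variable name is taken from this guide):

```
DEPLOYMENT_FOLDER=/opt/snow-owl
```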

When decompressing the archive, it is important to use the --same-owner and --preserve-permissions options so the docker containers can access the files and folders appropriately:

    tar --extract \
        --gzip \
        --verbose \
        --same-owner \
        --preserve-permissions \
        --file=/path/to/snow-owl-linux-x86_64.tar.gz \
        --directory=/opt/

    The next page will describe the content of the release package in more detail.

Preload dataset (optional)

In certain cases, a pre-built dataset is shipped together with the Terminology Server. This eases the initial setup procedure and helps get going fast.

Note: this method is only applicable to deployments where the Elasticsearch cluster is co-located with the Terminology Server.

To load data into a managed Elasticsearch cluster, there are several options:

  • use cross-cluster replication

  • use snapshot-restore

  • use Snow Owl to rebuild the data on the remote cluster

These datasets are the compressed form of the Elasticsearch data folder and follow the same structure, except for having a top-level folder called indexes. This is the same folder as ./snow-owl/resources/indexes, so to load the dataset, simply extract the contents of the dataset archive to this path:

    tar --extract \
        --gzip \
        --verbose \
        --same-owner \
        --preserve-permissions \
        --file=snow-owl-resources.tar.gz \
        --directory=/opt/snow-owl/resources/

Warning: make sure to validate the file ownership of the indexes folder after decompression. Elasticsearch requires UID=1000 and GID=0 to be set on its data folder:

    chown -R 1000:0 /opt/snow-owl/resources

Perform an upgrade

When a new Snow Owl Terminology Server release is available, we recommend performing the following steps.

New releases are distributed the same way: a docker stack and its configuration within an archive.

It is advised to decompress the new release files to a temporary folder and compare the contents of ./snow-owl/docker:

    [root@host]# diff /opt/snow-owl/docker/ /opt/new-snow-owl-release/snow-owl/docker/
    Common subdirectories: /opt/snow-owl/docker/configs and /opt/new-snow-owl-release/snow-owl/docker/configs
    diff /opt/snow-owl/docker/.env /opt/new-snow-owl-release/snow-owl/docker/.env
    10c10
    < ELASTICSEARCH_VERSION=7.16.3
    ---
    > ELASTICSEARCH_VERSION=7.17.1
    24c24
    < SNOWOWL_VERSION=8.1.0
    ---
    > SNOWOWL_VERSION=8.1.1
    

The changes are usually restricted to version numbers in the .env file. In such cases, it is equally acceptable to overwrite the contents of the ./snow-owl/docker folder as-is or cherry-pick the necessary modifications by hand.

Once the new version of the files is in place, it is sufficient to issue the following commands (in the folder ./snow-owl/docker); an explicit stop of the service is not required:

    docker-compose pull
    docker-compose up -d

Warning: do not use docker-compose restart because it won't pick up any .yml or .env file changes. See the official docker guide.

Restore

Using the custom backup container it is possible to restore:

  • the Elasticsearch indices

  • the OpenLDAP database (if present)

To restore any of the data, perform the following steps:

  • stop the Snow Owl, Elasticsearch, and OpenLDAP containers (in the folder ./snow-owl/docker):

        docker-compose stop snowowl elasticsearch ldap

  • (re)move the contents of the old / corrupted Elasticsearch data folder:

        mv -t /tmp ./snow-owl/resources/indexes/nodes

  • restart the Elasticsearch container only (keep Snow Owl stopped):

        docker-compose start elasticsearch

  • use the backup container's terminal and execute the restore script:

        root@host:# docker exec -it backup bash
        root@ad36cfb0448c:# /backup/restore.sh

      • without any parameters, if only the Elasticsearch indices have to be restored

      • with parameter -l in case the Elasticsearch indices and the OpenLDAP database have to be restored at the same time:

        root@ad36cfb0448c:# /backup/restore.sh -l

  • the script will list all available backups and prompt for a selection:

        ################################
        Snow Owl restore script STARTED.

        #### Verify Elasticsearch snapshot repository ####

        Checking existence of repository 'snowowl-snapshots' ...
        Repository with name 'snowowl-snapshots' is present, verifying repository state ...
        Repository 'snowowl-snapshots' is functional

        #### Select backup to restore ####

        Found 10 available backups under '/backup'
        Please select the backup to restore by choosing the right number in the menu below (hit Enter when the selection was made)

         1) snowowl-daily-20220323030001
         2) snowowl-daily-20220324030001
         3) snowowl-daily-20220325030002
         4) snowowl-daily-20220326030002
         5) snowowl-daily-20220329030001
         6) snowowl-daily-20220330030001
         7) snowowl-daily-20220331030002
         8) snowowl-daily-20220401030002
         9) snowowl-daily-20220402030001
        10) snowowl-daily-20220405030002

        #?

  • enter the numerical identifier of the backup to restore and wait until the process finishes

  • exit the backup container and restart all containers:

        root@ad36cfb0448c:# exit
        root@host:# docker-compose up -d

Note: in case only the contents of the OpenLDAP server have to be restored, it is sufficient to extract the contents of the backup archive to ./snow-owl/ldap and restart the container.

Get SSL certificate (optional)

Secure HTTP is a must when the Terminology Server is a public-facing instance. For such cases, we provide a pre-configured environment and a convenience script to acquire the necessary SSL certificate.

SSL certificate retrieval and renewal are managed by certbot, the official ACME client recommended by Let's Encrypt.

To be able to obtain an SSL certificate, the following requirements must be met:

  • docker and docker-compose are installed

  • the server instance has a public IP address

  • a DNS A record is configured for the desired domain name, routing to the server's IP address

For the sake of example, let's say the target domain name is snow-owl.b2ihealthcare.com.

Go to the sub-folder ./snow-owl/docker/configs/cert, make sure the init-certificate.sh script is executable, and get some details about its parameters:

    [root@host]# pwd
    /opt/snow-owl/docker/configs/cert
    [root@host]# chmod +x init-certificate.sh
    [root@host]# ./init-certificate.sh -h
      DESCRIPTION:

         Get certificate for the specified domain name using Let's Encrypt and certbot

      OPTIONS:
         -h
            Show this help
         -d domain
            Define the domain name to get the certificate for
         -e email (optional)
            The email address to use for the certificate registration

      EXAMPLES:

         ./init-certificate.sh -d mywebsite.com -e [email protected]

         ./init-certificate.sh -d example.com

As you can see, -d specifies the domain name and -e specifies a contact email address (optional). Now execute the script with our example parameters:

    ./init-certificate.sh -d snow-owl.b2ihealthcare.com -e [email protected]

Warning: script execution will overwrite the files ./snow-owl/docker/docker-compose.yml and ./snow-owl/docker/configs/nginx/nginx.conf. Make a note of any changes if required.

After successful execution, a new folder ./snow-owl/cert is created, which contains all the certificate files required by NGINX. The docker-compose.yml file is also amended with a piece of code that guarantees automatic renewal of the certificate:

      nginx:
        image: nginx:stable
        container_name: nginx
        volumes:
          - ./configs/nginx/conf.d/:/etc/nginx/conf.d/
          - ./configs/nginx/nginx.conf:/etc/nginx/nginx.conf
          - ${CERT_FOLDER}/conf:/etc/letsencrypt
          - ${CERT_FOLDER}/www:/var/www/certbot
        depends_on:
          - snowowl
        ports:
          - "80:80"
          - "443:443"
        # Reload nginx config every 6 hours and restart
        command: "/bin/sh -c 'while :; do sleep 6h & wait $${!}; nginx -s reload; done & nginx -g \"daemon off;\"'"
        restart: unless-stopped
      certbot:
        image: certbot/certbot:latest
        container_name: certbot
        volumes:
          - ${CERT_FOLDER}/conf:/etc/letsencrypt
          - ${CERT_FOLDER}/www:/var/www/certbot
        # Check for SSL cert renewal every 12 hours
        entrypoint: "/bin/sh -c 'trap exit TERM; while :; do certbot renew; sleep 12h & wait $${!}; done;'"
        restart: unless-stopped

At this point everything is prepared for secure HTTP; let's see what else needs to be configured before spinning up the service.

    Backup

Note: this method is only applicable to deployments where the Elasticsearch cluster is co-located with the Snow Owl Terminology Server. A managed Elasticsearch service will automatically configure a snapshot policy upon creation; see the elastic.co documentation for details.

    The Terminology Server release package contains a built-in solution to perform rolling and permanent data backups. The docker stack has a specialized container (called snow-owl-backup) that is responsible for creating scheduled backups of:

    • the Elasticsearch indices

    • the OpenLDAP database (if present)

For the Elasticsearch indices, the backup container uses the Snapshot API. Snapshots are labeled in a predefined format with timestamps, e.g. snowowl-daily-20220324030001.

    The OpenLDAP database is backed up by compressing the contents of the folder under ./snow-owl/ldap. Filenames are generated using the name of the corresponding Elasticsearch snapshot. E.g. snowowl-daily-20220324030001.tar.gz.
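Both label styles follow the same pattern: a name plus a timestamp. A quick shell sketch of how such a label can be generated (the %Y%m%d%H%M%S format is an assumption inferred from the examples above):

```shell
# generate a backup label in the assumed <name>-<timestamp> format,
# e.g. snowowl-daily-20220324030001
label="snowowl-daily-$(date +%Y%m%d%H%M%S)"
echo "$label"
```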

Warning (Backup Window): when a backup operation is running, the Terminology Server blocks all write operations on the Elasticsearch indices. This prevents data loss and ensures consistent backups.

Warning (Backup Duration): the very first backup of an Elasticsearch cluster takes more time (depending on size and I/O performance, between 20 and 40 minutes); subsequent backups should take significantly less: 1 - 5 minutes.

Daily backups

Daily backups are rolling backups, scheduled and cleaned up based on the settings specified in the ./snow-owl/docker/.env file. Here is a summary of the important settings that can be changed.

BACKUP_FOLDER

To store backups redundantly, it is advised to mount a remote file share to a local path on the host. By default, this folder is configured to be ./snow-owl/backup. It contains:

    • the snapshot files of the Elasticsearch cluster

    • the backup files of the OpenLDAP database

    • extra configuration files

Note: make sure the remote file share has enough free space to store around double the size of the ./snow-owl/resources/indexes folder.

CRON_DAYS, CRON_HOURS, CRON_MINUTES

Backup jobs are scheduled by crond, so cron expressions can be defined here to specify when a daily backup should happen.

NUMBER_OF_DAILY_BACKUPS_TO_KEEP

Tells the backup container how many daily backups must be kept.

Example daily backup config

Let's say we have an external file share mounted at /mnt/external_folder, there is a need to create daily backups after each working day, during the night at 2:00 am, and only the last two weeks' worth of data should be kept (assuming 5 working days each week):

    BACKUP_FOLDER=/mnt/external_folder
    NUMBER_OF_DAILY_BACKUPS_TO_KEEP=10

    CRON_DAYS=Tue-Sat
    CRON_HOURS=2
    CRON_MINUTES=0

One-off backups

It is also possible to perform backups occasionally, e.g. before versioning an important SNOMED CT release or before a Terminology Server version upgrade. These backups are kept until manually removed.

To create such a backup, execute the following command using the backup container's terminal:

    root@host:/# docker exec -it backup bash
    root@ad36cfb0448c:/# /backup/backup.sh -l my-backup-label

The script will create a snapshot backup of the Elasticsearch data with a label snowowl-my-backup-label-20220405030002 and an archive that contains the database of the OpenLDAP server with the name snowowl-my-backup-label-20220405030002.tar.gz.


    Folder structure

Here is the list of files and folders extracted from the release package, with their roles described below.

/docker

Contains every configuration file used for the docker stack, including docker-compose.yml.

This folder is considered the context by docker, which means that when executing commands one must either address the config file explicitly or run docker-compose commands directly inside it.

E.g. to verify the status of the stack there are two approaches.

Execute the command inside ./snow-owl/docker:

    [root@host docker]# docker-compose ps -a

Or execute the command from somewhere other than ./snow-owl/docker:

    [root@host ~]# docker-compose --file /opt/snow-owl/docker/docker-compose.yml ps -a

/docker/configs/cert

This folder contains the files necessary to acquire an SSL certificate. Ideally, none of the files here should be changed.

/docker/configs/elasticsearch

There is one important file here, elasticsearch.yml, which can be used for fine-tuning the Elasticsearch cluster. This is not necessary by default, only if advanced configuration is required.

/docker/configs/ldap-boostrap

This folder contains the files used upon the first start of the OpenLDAP server. The files within describe a set of groups and users to set up an initial user access model. User credentials for the test users can be found in the file called 200_users.ldif.

/docker/configs/nginx

Location of all configuration files for NGINX. By default, a non-secure HTTP configuration is assumed. If there is no need for an SSL certificate, the files here will be used. If an SSL certificate was acquired, the main configuration file of NGINX (nginx.conf) will be overwritten with the one under /docker/cert/nginx.conf.

/docker/configs/snowowl

snowowl.yml: the default configuration file of the Terminology Server. It does not need any changes by default either.

users: the list of users for file-based authentication. There is one default user called snowowl, for which the credentials can be found under ./docker/.env.

/docker/docker-compose.yml

The main configuration file for the docker stack. This file is replaced in case an SSL certificate was acquired (with the file /docker/cert/docker-compose.yml). This is where volumes, ports, or environment variables can be configured.

/docker/docker_login.txt

The credentials used for authenticating with the B2i private docker registry.

/docker/.env

The collection of environment variables for the docker-compose.yml file.

This is the file to configure most of the settings of the Terminology Server, including java heap size, Snow Owl or Elasticsearch version, passwords, and folder structure.

/ldap

The location where the OpenLDAP server stores its data.

/logs

Log files of the Terminology Server.

/resources

Location of Elasticsearch and Snow Owl resources.

/resources/indexes

The data folder of Elasticsearch. Datasets must be extracted to this directory.

/resources/attachments

Snow Owl's local file storage. Import and export artifacts are stored here.

Pro tip 💡: in case the Terminology Server is deployed to the cloud, make sure this path is served by a fast SSD (a local or ephemeral SSD is best). This will make import and export processes even faster.

/cert (optional)

In case an SSL certificate is acquired, all the files used by certbot and NGINX are stored here. This folder is automatically created by the certificate retrieval script.

/backup (optional)

The initial folder of all backup artifacts. This should be configured as a network mount to achieve data redundancy.

The complete folder structure:

    snow-owl/
    ├── backup
    ├── docker
    │   ├── configs
    │   │   ├── cert
    │   │   │   ├── conf.d
    │   │   │   ├── docker-compose-cert.yml
    │   │   │   ├── docker-compose.yml
    │   │   │   ├── init-certificate.sh
    │   │   │   └── nginx.conf
    │   │   ├── elasticsearch
    │   │   │   ├── elasticsearch.yml
    │   │   │   └── synonym.txt
    │   │   ├── ldap-boostrap
    │   │   │   ├── 100_groups.ldif
    │   │   │   └── 200_users.ldif
    │   │   ├── nginx
    │   │   │   ├── conf.d
    │   │   │   │   └── snowowl.conf
    │   │   │   └── nginx.conf
    │   │   └── snowowl
    │   │       ├── snowowl.yml
    │   │       └── users
    │   ├── docker-compose.yml
    │   ├── docker_login.txt
    │   └── .env
    ├── ldap
    ├── logs
    └── resources
        ├── attachments 
        └── indexes

Content syndication

In version 8.9.0, a new content syndication feature was introduced which allows data to be moved seamlessly between different Snow Owl Terminology Server deployments.

This functionality is useful when content created in a central deployment (the upstream) needs to be distributed to one or more read-only downstream instances. The resource distribution is designed to be uni-directional and semi-automated: an operator has to configure any new downstream instance to be able to receive data from the central unit.

Configure upstream

To be able to access the upstream server and its content, the following items are required:

  • the HTTP port of Elasticsearch has to be accessible to the downstream Snow Owl and Elasticsearch instances (configured via the http.port property; the default is 9200)

  • the REST API of Snow Owl has to be accessible to the downstream Snow Owl servers

Access Elasticsearch

In case Snow Owl uses a self-hosted Elasticsearch instance, the HTTP port can be opened by modifying the container settings in the docker-compose.yml file. Make sure to remove the localhost prefix from the port declaration (127.0.0.1:9200:9200).

Warning: when opening up a self-hosted Elasticsearch instance, make sure to strengthen security with secure HTTP and username/password access. A detailed guide can be found in the official Elasticsearch security documentation.

In the case of a hosted Elasticsearch instance there is nothing to do; it is already accessible from the outside.

Access Snow Owl

The default reverse proxy configuration (shipped in the release package) exposes the Snow Owl REST API via the URL http(s)://upstream-snow-owl-url/snowowl

Other than that, no additional configuration is needed.

Obtain an Elasticsearch API key

Creating a new API key for Elasticsearch is possible either through its API key API or, in the case of a hosted instance, from within Kibana.

    The content syndication operation requires the following permissions:

    • cluster privilege: monitor

    • index privilege: read

    Here is an example request body for the Api Key API:
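The example body itself did not survive this export; here is a minimal sketch for Elasticsearch's create API key endpoint (POST /_security/api_key) granting exactly the privileges listed above. The key name and index pattern are illustrative:

```json
{
  "name": "snow-owl-syndication",
  "role_descriptors": {
    "syndication": {
      "cluster": ["monitor"],
      "indices": [
        {
          "names": ["*"],
          "privileges": ["read"]
        }
      ]
    }
  }
}
```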

    This request will return with the following response:
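The response has roughly the following shape (all values below are placeholders in the style of the Elasticsearch documentation, not real credentials); the encoded field is the value to note down:

```json
{
  "id": "VuaCfGcBCdbkQm-e5aOx",
  "name": "snow-owl-syndication",
  "api_key": "ui2lp2axTNmsyakw9tvNnw",
  "encoded": "VnVhQ2ZHY0JDZGJrUW0tZTVhT3g6dWkybHAyYXhUTm1zeWFrdzl0dk5udw=="
}
```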

Take note of the encoded API key; that is the value that will be used later on.

To obtain an API key using Kibana instead, follow the official Kibana guide using the same settings as above.

Obtain a Snow Owl API Key

To request an API key from the upstream Snow Owl Terminology Server, use the following REST API endpoint:

POST https://upstream-snow-owl-url/snowowl/token

Select distributable resources

All three major terminology resource types can be configured as distributable. Resources have a settings map that can be updated via their specific REST API endpoints:

  • PUT /codesystems/{codeSystemId}

  • PUT /valuesets/{valueSetId}

  • PUT /conceptmaps/{conceptMapId}

A setting called distributable has to be set, which can be either true or false. Here is an example update request to mark 'Example Code System' as distributable:
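The exact payload was not preserved in this export; assuming the settings map is sent as part of the resource update body, the relevant fragment would be:

```json
{
  "settings": {
    "distributable": true
  }
}
```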

Configure downstream

Elasticsearch

There is one configuration property that must be set before provisioning a new downstream Snow Owl Terminology Server.

Any potential upstream Elasticsearch instance must be listed as an allowed source of information for the downstream Elasticsearch instances, via a configuration parameter in the elasticsearch.yml file. The property is called reindex.remote.whitelist.

The whitelisted URL must contain the upstream HTTP port and must not contain the scheme.
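For example, with a hypothetical upstream host name, the downstream elasticsearch.yml entry could look like:

```yaml
# port included, scheme omitted, as required above
reindex.remote.whitelist: "upstream-elasticsearch.example.com:9200"
```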

Provision a new downstream server

Provisioning a new downstream server has the following prerequisites:

  • start with an empty dataset

  • collect all terminology resource identifiers that need to be syndicated

  • get all the necessary credentials to communicate with the upstream

Collect terminology resources for syndication

To populate a downstream server with terminology resources from an upstream source, one must collect the required resource identifiers or resource version identifiers beforehand.

    Resource identifiers must be in their simple form, e.g.:

    • SNOMED-CT

    • ICD-10

    • LOINC

    Resource version identifiers must be in the following form <resource_id>/<version_id>, e.g.:

    • SNOMED-CT/2020-01-31

    • ICD-10/v2019

    • LOINC/v2.72

To determine which resources are available for syndication, the following upstream REST API endpoint can be used. It returns an Atom feed of resource versions from which the required identifiers can be collected.

Retrieve syndication resource feed

GET https://upstream-snow-owl-url/snowowl/syndication/feed.xml

Retrieves the feed of all distributable resources.

Note: it is not required to list all resource version identifiers for an already selected resource. E.g. if SNOMED-CT is selected as a resource, it is not required to select all of its versions among the resource version identifiers.

Syndicate resources

To kick off a syndication process, the following parameters are required:

  • the list of resource identifiers

  • the list of resource version identifiers

  • the upstream Snow Owl URL without its REST API root context

Note: when there are no existing resources on the downstream server yet, at least one resource identifier or one resource version identifier must be selected.

Warning: Snow Owl resolves all resource dependencies and handles syndication requests rigorously. If, for example, a Value Set depends on a specific SNOMED CT version and that version is not among the selected resources, or does not exist on the downstream server yet, the syndication run will fail, noting that there is a missing dependency. It is always required to list all dependencies of the selected resources for a given syndication run.

The above parameters should be fed to the following downstream Snow Owl REST API endpoint:

Syndicate resource(s)

POST https://downstream-snow-owl-url/snowowl/syndication/syndicate

Syndicates resources from a remote Snow Owl instance. In case no resource identifiers are provided, all existing resources will be syndicated to their latest version.
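Since the parameter table did not survive this export, the following is an illustrative sketch only; every field name below is hypothetical and must be checked against the actual REST API documentation:

```json
{
  "resourceIds": ["SNOMED-CT"],
  "resourceVersionIds": ["SNOMED-CT/2021-01-31"],
  "upstreamUrl": "https://upstream-snow-owl-url",
  "elasticsearchApiKey": "<encoded Elasticsearch API key>",
  "snowOwlApiKey": "<Snow Owl API key>"
}
```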

    The syndication process starts in the background as an asynchronous job. It can be tracked by calling the following endpoint using the job identifier returned in the Location:

    hashtag
    Retrieve syndication job

    GET https://downstream-snow-owl-url/snowowl/syndication/{id}

    Returns the specified syndication run's configuration and status.

    hashtag
    Path Parameters

    (The path parameter is listed in the parameter reference at the end of this page.)

    The returned result object will contain all information related to the given syndication run:

    • status of the run (RUNNING, FINISHED, FAILED)

    • list of successfully syndicated resource versions

    • additional details about created or updated Elasticsearch indices
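The status field drives what a caller should do next. A minimal shell sketch of dispatching on it follows; the curl line is commented out and a stubbed value is used so the snippet stands alone, and the URL, job id, and auth header are placeholders/assumptions:

```shell
JOB_URL="https://downstream-snow-owl-url/snowowl/syndication/<job-id>"  # id from the Location header

# In a live deployment the status would come from the endpoint, e.g.:
#   STATUS=$(curl -s -H "Authorization: Bearer <api-key>" "$JOB_URL" | jq -r '.status')
STATUS="FINISHED"  # stubbed value for illustration

case "$STATUS" in
  RUNNING)  echo "syndication still in progress - poll again later" ;;
  FINISHED) echo "syndication completed" ;;
  FAILED)   echo "syndication failed - inspect the run details for missing dependencies" ;;
esac
```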

    hashtag
    Examples of resource selection

    hashtag
    Code Systems

    There is a need to syndicate the SNOMED-CT US extension. It depends on the SNOMED CT International version 2021-01-31. Provide the following resource identifier and resource version identifier configuration:

    This will syndicate all versions of SNOMED-CT-US and all international versions until 2021-01-31.

    If the configuration is changed to:

    This will syndicate all versions of SNOMED-CT-US and SNOMED-CT international, including all international versions even after 2021-01-31.

    hashtag
    Value Sets

    There is a Value Set with an identifier of VS and members from SNOMED-CT/2020-07-31:

    hashtag
    Concept Maps

    There is a Concept Map with an identifier of CM mapping concepts between LOINC/v2.72 and ICD-10/v2019:

    hashtag
    Keeping a downstream server up-to-date

    If a given downstream server already contains the desired resources and the goal is to keep the content up-to-date, it is not required to fill in the resource and resource version identifiers for the syndication request.

    One can call the POST /syndication/syndicate endpoint with all the credentials and URLs but without specifying any resource or version identifiers. The server will automatically determine - based on the set of existing downstream resources - whether there are any new resource versions available for syndication.
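A hedged sketch of such an update call with curl; URLs and keys are placeholders and the Authorization header format is an assumption:

```shell
# No "resource" or "version" entries: the server determines which new
# versions the downstream instance is missing. All values are placeholders.
BODY='{
  "upstreamUrl": "https://upstream-snow-owl-url.com",
  "upstreamToken": "<upstream-snow-owl-api-key>",
  "upstreamDataUrl": "https://upstream-elasticsearch-url.com:9200",
  "upstreamDataToken": "<upstream-elasticsearch-api-key>"
}'

# Uncomment and fill in real values to run against a live downstream server:
# curl -sS -X POST \
#   -H "Authorization: Bearer <downstream-snow-owl-api-key>" \
#   -H "Content-Type: application/json" \
#   -d "$BODY" \
#   "https://downstream-snow-owl-url/snowowl/syndication/syndicate"
echo "$BODY"
```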

    To check whether there are any updates available, there is an endpoint that can be called:

    hashtag
    Retrieve a list of resource versions which are available for syndication

    GET https://downstream-snow-owl-url/snowowl/syndication/list

    Returns the full list of resource versions to be syndicated based on the search criteria. If no filters are provided, updates are calculated for all existing resources.

    hashtag
    Query Parameters

    (The available query parameters are listed in the parameter reference at the end of this page.)

    If there are any updates, this endpoint returns a list of versions; if there are none, it returns an empty result.
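The update check can be sketched with curl as well. The parameter names are taken from the tables on this page, while the URLs, keys, and the auth header are placeholders and assumptions:

```shell
LIST_URL="https://downstream-snow-owl-url/snowowl/syndication/list"

# Uncomment to query a live downstream server; -G with --data-urlencode
# turns the values into URL query parameters.
# curl -sS -G \
#   -H "Authorization: Bearer <downstream-snow-owl-api-key>" \
#   --data-urlencode "upstreamUrl=https://upstream-snow-owl-url.com" \
#   --data-urlencode "upstreamToken=<upstream-snow-owl-api-key>" \
#   --data-urlencode "limit=10" \
#   "$LIST_URL"
echo "$LIST_URL"
```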

    hashtag
    Parameter reference

    Outline of a complete syndication setup:

    • obtain an Elasticsearch API key with sufficient privileges for authentication and authorization

    • obtain a Snow Owl API key with sufficient privileges for authentication and authorization

    • configure the selected terminology resources as distributable

    • initiate the resource syndication and verify the result

    Authentication (API key generation) parameters:

    • username (String, required): the username to authenticate with

    • password (String, required): the password belonging to the username

    • token (String): previous token to renew

    • expiration (String): expiration interval, e.g. 1d or 2h

    • permissions (List<String>): list of permissions

    Feed query parameters:

    • resource (List<String>): the resource identifier(s) to include in the feed

    • resourceType (List<String>): the types of resources to include in the feed (e.g. conceptmaps, valuesets, codesystems)

    • resourceUrl (List<String>): the URLs of the resources to include in the feed

    • packageTypes (List<String>): the types of packages to include in the feed; only BINARY is supported at the moment

    Version search query parameters:

    • effectiveTime (String): the effective time value to match (yyyyMMdd) or an effective time range value to match (yyyyMMdd...yyyyMMdd), inclusive range

    • createdAt (Long): exact match filter for the resource version "created at" field

    • createdAtFrom (Long): greater than or equal to filter for the resource version "created at" field

    • createdAtTo (Long): less than or equal to filter for the resource version "created at" field

    • limit (int, required): the maximum number of items to return

    Syndicate request body:

    • resource (List<String>): the resource identifier(s) to syndicate, e.g. SNOMEDCT (== latest version)

    • version (List<String>): the version identifier(s) to syndicate, e.g. SNOMEDCT/2022-07-31

    • upstreamUrl (String, required): the URL of the upstream Snow Owl server without its REST API root context, e.g. https://upstream-snow-owl-url.com

    • upstreamToken (String, required): the API key to authenticate with the upstream Snow Owl server

    • upstreamDataUrl (String, required): the URL of the upstream Elasticsearch, including the scheme and port, e.g. https://upstream-elasticsearch-url.com:9200

    • upstreamDataToken (String, required): the API key to authenticate with the upstream Elasticsearch

    Syndication job path parameter:

    • id (String, required): the identifier of a syndication run

    Update check (list) query parameters:

    • resource (List<String>): list of resource identifiers

    • version (List<String>): list of version resource identifiers

    • upstreamUrl (String, required): the URL of the upstream Snow Owl

    • upstreamToken (String, required): API key for the upstream Snow Owl

    • limit (int, required): the number of resource versions to return if there are any

    circle-info

    If a specific version is selected (SNOMED-CT/2020-01-31) and the resource is not listed among the selected resources, then only versions created until 2020-01-31 will be syndicated.

    docker-compose.yml
    ...
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:${ELASTICSEARCH_VERSION}
        container_name: elasticsearch
    ...
        ports:
          - "9200:9200"
    POST /_security/api_key
    {
      "name": "syndication-api-key",
      "expiration": "30d",
      "role_descriptors": { 
        "syndicate-role": {
          "cluster": [
            "monitor"
          ],
          "indices": [
            {
              "names": [
                "*"
              ],
              "privileges": [
                "read"
              ]
            }
          ]
        }
      }
    }
    {
      "id" : "<token_id>",
      "name" : "syndication-api-key",
      "expiration" : 0,
      "api_key" : "<api_key>",
      "encoded" : "<encoded_api_key>"
    }
    {
        "token": "<snow-owl-api-key>"
    }
    PUT /codesystems/example_codesystem_id
    {
      "settings": {
        "distributable": true
      }
    }
    elasticsearch.yml
    ...
    http.port: 9200
    ...
    reindex.remote.whitelist: ["upstream-elasticsearch-url.com:9200", "other-upstream-elasticsearch-url.com:9200"]
    <?xml version="1.0" encoding="UTF-8"?>
    <feed xmlns="http://www.w3.org/2005/Atom">
      <id>urn:uuid:ddce3cd6-2efe-3142-9cce-62e73d3031ca</id>
      <title>Snow Owl® Terminology Server Syndication Feed</title>
      ...
      <entry>
        <id>valuesets/1234/V1.0</id>
        ...
        <title>Valueset example</title>
        <category term="BINARY" scheme="https://b2ihealthcare.com/snow-owl/syndication/binary/1.0.0" label="Binary index"/>
        ...
      </entry>
    </feed>
    {
      "resource": "SNOMED-CT-US",
      "version": "SNOMED-CT/2021-01-31"
    }
    {
      "resource": "SNOMED-CT-US, SNOMED-CT",
      "version": ""
    }
    {
      "resource": "VS",
      "version": "SNOMED-CT/2020-07-31"
    }
    {
      "resource": "CM",
      "version": "LOINC/v2.72, ICD-10/v2019"
    }
    {
        "items": [
            {
                "id": "SNOMED-CT/2022-01-31",
                "version": "2022-01-31",
                "description": "2022-01-31",
                "effectiveTime": "2022-01-31",
                "resource": "codesystems/SNOMED-CT"
            },
            {
                "id": "SNOMED-CT/2022-07-31",
                "version": "2022-07-31",
                "description": "2022-07-31",
                "effectiveTime": "2022-07-31",
                "resource": "codesystems/SNOMED-CT"
            }
        ],
        "limit": 50,
        "total": 2
    }
    API Keys | Kibana Guide [7.17] | Elastic
    Elasticsearch security principles | Elasticsearch Guide [7.17] | Elastic

    User management

    The Snow Owl Terminology Server has two different ways to manage users. The primary authentication and authorization service is the LDAP Directory Server. The secondary option is a file-based user database, used only for administrative purposes. Whenever user access has to be granted or revoked, one of the following methods can be applied.

    hashtag
    LDAP based identity provider

    circle-info

    This is only applicable to the default deployment setup where a co-located OpenLDAP server is used alongside the Terminology Server.

    There are several ways to access and manage an OpenLDAP server; here we describe only one of them: through Apache Directory Studio.

    Apache Directory Studio is an open-source, free application. It is available to download for different platforms (Windows, macOS, and Linux).

    Before accessing the LDAP database there is one technical prerequisite to satisfy: the OpenLDAP server has to be accessible from the machine Apache Directory Studio is installed on. The best and most secure way to achieve that is to set up an SSH tunnel. Follow this link to an article that describes how to configure an SSH tunnel using PuTTY on Windows.

    The OpenLDAP server uses port 389 for communication. This is the port that needs to be tunneled through the SSH connection. Here is what the final configuration looks like in PuTTY:

    Once the SSH tunnel works, it's time to set up our connection in Apache DS. Go to File -> New -> LDAP Connection and set the following:

    Hit the "Check Network Parameter" button to verify the network connection.

    Go to the next page of the wizard and provide your credentials. The default Bind DN and Bind password can be found in the Terminology Server release package under ./snow-owl/docker/.env.

    Hit the "Check Authentication" button to verify your credentials. Hit Finish to complete the setup procedure.

    All users and groups should be browseable now through the LDAP Browser view:

    hashtag
    Grant user access

    To grant access to a new user, an LDAP entry has to be created. Go to the LDAP Browser view and right-click on the organization node, then New -> New Entry:

    It is easiest to use an existing entry as a template:

    Leave everything as is on the Object Classes page, then hit Next. Fill in the new user's credentials:

    On the final page, double click on the userPassword row and provide the user's password:

    Hit Finish to add the user to the database.

    Now we need to assign a role to the user. Before going forward, get hold of the user's DN using the LDAP Browser view:

    Select the desired role group in the Browser view and add a new attribute:

    Select the attribute type uniqueMember and hit Finish:

    Paste the user's DN as the value of the attribute and hit Enter to make your changes permanent:
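For scripted administration, the same membership change can be expressed as an LDIF and applied with the standard OpenLDAP ldapmodify tool. The DNs below are placeholders invented for illustration, and the ldapmodify call is commented out because it needs the live tunnel and bind credentials:

```shell
# LDIF that adds a user's DN to a role group's uniqueMember attribute.
# <role-group-dn> and <user-dn> are placeholders for your directory's DNs.
cat > add-member.ldif <<'EOF'
dn: <role-group-dn>
changetype: modify
add: uniqueMember
uniqueMember: <user-dn>
EOF

# Apply through the SSH tunnel using the bind DN/password from ./snow-owl/docker/.env:
# ldapmodify -H ldap://localhost:389 -D "<bind-dn>" -w "<bind-password>" -f add-member.ldif
cat add-member.ldif
```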

    hashtag
    Revoke user access

    To revoke access the user has to be deleted from the list of users:

    And also has to be removed from the role group:

    hashtag
    Change credentials

    To change either the first or last name, or the password of a user, just edit any of the attributes in the user editor:

    hashtag
    File-based identity provider

    There is a configuration file, ./snow-owl/docker/configs/snowowl/users, that contains the list of users with their credentials. The passwords are hashed using the bcrypt algorithm (variant $2a$). This method of authentication should be used for testing or internal purposes only; users added here will have elevated privileges.

    circle-exclamation

    To apply any changes made to the users file the Terminology Server has to be restarted afterward.

    hashtag
    Grant user access

    To grant access, the users file has to be amended with the new user and their credentials. There are several ways to generate a bcrypt password hash, but here is one that is easy and available on most Linux variants. The htpasswd utility (shipped in the apache2-utils or httpd-tools package, depending on the distribution) has to be installed:

    It will prompt for the password and append the new user to the end of the file.

    hashtag
    Revoke user access

    Simply remove the user's line from the file and restart the service.
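As a sketch, the removal can be scripted with sed. The snippet below operates on a demo copy of the file so it can be run anywhere; the real file lives at ./snow-owl/docker/configs/snowowl/users:

```shell
USERS_FILE=users-demo
# Demo content in the users-file format (username:bcrypt-hash); hashes shortened.
printf '%s\n' 'alice:$2a$10$examplehash' 'bob:$2a$10$examplehash' > "$USERS_FILE"

# Drop the line of the user being revoked (username followed by a colon),
# then restart the Terminology Server to apply the change.
sed -i '/^bob:/d' "$USERS_FILE"
cat "$USERS_FILE"
```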

    hashtag
    Change credentials

    Remove the user's line from the file and regenerate the credentials according to the Grant user access section.

    Grant user access
    Configure SSH tunnel
    Set up LDAP connection
    Provide credentials for the LDAP connection
    Browser LDAP users / groups
    Create new LDAP entry
    Select existing user entry as template
    Configure user details
    Set user credentials
    Copy the user's DN
    Add new attribute
    Select attribute type uniqueMember
    Add new member to role group
    Delete user entry
    Delete role group attribute
    Change user credentials
    htpasswd -nBC 10 my-new-username | head -n1 | sed 's/$2y/$2a/g' >> ./snow-owl/docker/configs/snowowl/users

    Spin up the service

    Full list of steps to perform before spinning up the service:

    1. Extract the Terminology Server release archive to a folder. E.g. /opt/snow-owl

    2. (Optional) Obtain an SSL certificate

      1. Make sure a DNS A record is routed to the host's public IP address

      2. Go into the folder ./snow-owl/docker/cert

      3. Execute the ./init-certificate.sh script:

    3. (Optional) Configure access for managed Elasticsearch Cluster (elastic.co)

    4. (Optional) Extract the dataset to ./snow-owl/resources; the final folder structure should look like ./snow-owl/resources/indexes/nodes/0.

    5. Verify that file ownership is UID=1000 and GID=0:

    6. Check any credentials or settings that need to be changed in ./snow-owl/docker/.env

    7. Authenticate with our private docker registry while in the folder ./snow-owl/docker:

    8. Issue a pull (in folder ./snow-owl/docker)

    9. Spin up the service (in the folder ./snow-owl/docker)

    10. Verify that the REST API of the Terminology Server is available at:

      1. With SSL: https://snow-owl.example.com/snowowl

      2. Without SSL: http://hostname:8080/snowowl

    11. Verify that the server and cluster status is GREEN by querying the following REST API endpoint:

      1. With SSL:

      2. Without SSL:

    12. Enjoy using the Snow Owl Terminology Server 🎉

    ./init-certificate.sh -d snow-owl.example.com
    chown -R 1000:0 ./snow-owl/docker ./snow-owl/logs ./snow-owl/resources
    cat docker_login.txt | docker login -u <username> --password-stdin https://docker.b2i.sg
    docker-compose pull
    docker-compose up -d
    curl https://snow-owl.example.com/snowowl/info
    curl http://hostname:8080/snowowl/info

    Configure Elastic Cloud (optional)

    circle-info

    The release package contains everything that is required to use a co-located Elasticsearch instance by default. Follow these steps only when there is a need for a remote Elasticsearch cluster.

    To configure the Terminology Server to work with a managed Elasticsearch cluster, two settings require attention.

    hashtag
    Configure Terminology Server

    First, the local Elasticsearch container and all its configurations should be removed from the docker-compose.yml file. Once that is done, we have to tell the Terminology Server where to find the cluster. This can be set in the file ./snow-owl/docker/configs/snowowl/snowowl.yml:

    hashtag
    Configure Elastic Cloud

    Snow Owl TS leverages Elasticsearch's synonym filters. To have this feature work properly with a managed Elasticsearch cluster, our custom dictionary has to be uploaded and configured. The synonym file can be found in the release package under ./snow-owl/docker/configs/elasticsearch/synonym.txt. This file needs to be compressed as a zip archive with the following structure:

    For the managed Elasticsearch instance this zip file needs to be configured as a bundle extension. The steps required are covered in this guide in great detail:

    Once the bundle is configured and the cluster is up, we can (re)start the docker stack. If there is any trouble, the Terminology Server will refuse to initialize and report the cause in its log files.

    repository:
      index:
        socketTimeout: 60000
        clusterUrl: https://my-es-cluster.elastic-cloud.com:9243
        clusterUsername: my-es-cluster-user
        clusterPassword: my-es-cluster-pwd
    .
    └── analysis
        └── synonym.txt
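One way to produce an archive with exactly that layout is sketched below. python3's standard-library zipfile module is used so no separate zip binary is required, and the fallback line only exists so the sketch also runs outside the release package:

```shell
mkdir -p bundle/analysis
# Copy the dictionary shipped with the release package, or fall back to a
# placeholder file when running outside of it.
cp ./snow-owl/docker/configs/elasticsearch/synonym.txt bundle/analysis/synonym.txt 2>/dev/null \
  || echo "placeholder" > bundle/analysis/synonym.txt

# Pack the analysis/ folder at the root of the archive, as the layout requires.
(cd bundle && python3 -m zipfile -c ../synonym-bundle.zip analysis)
python3 -m zipfile -l synonym-bundle.zip
```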
    Upload custom plugins and bundles | Elastic Docs (www.elastic.co)
    Welcome to Apache Directory Studio — Apache Directory (directory.apache.org)
    Install | Docker Documentation