Content syndication

With content syndication, data can be seamlessly moved between different Snow Owl Terminology Server deployments.
This functionality is useful when content created in a central (upstream) deployment needs to be distributed to one or more read-only downstream instances. Resource distribution is uni-directional and semi-automated: an operator has to configure each new downstream instance before it can receive data from the upstream deployment.

Configure upstream

To make the upstream server and its content accessible, the following are required:
  • the HTTP port of Elasticsearch has to be accessible for the downstream Snow Owl and Elasticsearch instances (configured via the http.port property, the default is 9200)
  • the REST API of Snow Owl has to be accessible for the downstream Snow Owl servers
  • an Elasticsearch API key with sufficient privileges for authentication and authorization
  • a Snow Owl API key with sufficient privileges for authentication and authorization
  • the selected terminology resources have to be configured as distributable

Access Elasticsearch

In case Snow Owl uses a self-hosted Elasticsearch instance the HTTP port can be opened by modifying the container settings in the docker-compose.yml file. Make sure to remove the localhost IP prefix from the port declaration:
docker-compose.yml
...
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${ELASTICSEARCH_VERSION}
    container_name: elasticsearch
    ...
    ports:
-     - "127.0.0.1:9200:9200"
+     - "9200:9200"
When opening up a self-hosted Elasticsearch instance, make sure to strengthen its security with HTTPS and username/password access.
A detailed guide can be found in the Elasticsearch security documentation.
In the case of a hosted Elasticsearch instance there is nothing to do; it is already accessible from outside.

Access Snow Owl

The default reverse proxy configuration (shipped in the released package) exposes the Snow Owl REST API at http(s)://upstream-snow-owl-url/snowowl
No additional configuration is needed.

Obtain an Elasticsearch API key

A new API key for Elasticsearch can be created either through its API Key API or, in the case of a hosted instance, from within Kibana.
The content syndication operation requires the following permissions:
  • cluster privilege: monitor
  • index privilege: read
Here is an example request for the API Key API:
POST /_security/api_key
{
  "name": "syndication-api-key",
  "expiration": "30d",
  "role_descriptors": {
    "syndicate-role": {
      "cluster": [
        "monitor"
      ],
      "indices": [
        {
          "names": [
            "*"
          ],
          "privileges": [
            "read"
          ]
        }
      ]
    }
  }
}
This request will return with the following response:
{
  "id" : "<token_id>",
  "name" : "syndication-api-key",
  "expiration" : 0,
  "api_key" : "<api_key>",
  "encoded" : "<encoded_api_key>"
}
Take note of the encoded API key (the value of the encoded field); this is the one that will be used later on.
To obtain an API key using Kibana, follow the Kibana API key management guide and apply the same settings as above.
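The encoded value is the Base64 encoding of the id and api_key fields joined by a colon, and Elasticsearch accepts it in an Authorization header with the ApiKey scheme. The following is a minimal Python sketch of that relationship; the function names and the "abc"/"def" values are placeholders chosen for illustration, not part of Snow Owl or Elasticsearch.

```python
import base64


def encode_api_key(key_id: str, api_key: str) -> str:
    """Return the Base64 form of 'id:api_key', matching the 'encoded' response field."""
    return base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")


def api_key_header(encoded: str) -> dict:
    """Build the Authorization header used for Elasticsearch API key authentication."""
    return {"Authorization": f"ApiKey {encoded}"}


# Placeholder values; substitute the id and api_key returned by your own request.
encoded = encode_api_key("abc", "def")
print(api_key_header(encoded))
```

If the response already contains the encoded field, it can be passed to api_key_header directly; recomputing it is only useful as a sanity check.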

Obtain a Snow Owl API Key

To request an API key from the upstream Snow Owl Terminology Server the following REST API endpoint must be used:
POST https://upstream-snow-owl-url/snowowl/token

Select distributable resources

All three major terminology resource types can be configured as distributable. Resources have a settings map that can be updated via their specific REST API endpoints:
  • PUT /codesystems/{codeSystemId}
  • PUT /valuesets/{valueSetId}
  • PUT /conceptmaps/{conceptMapId}
A setting called distributable has to be set with a value of either true or false. Here is an example update request to make the 'Example Code System' distributable:
PUT /codesystems/example_codesystem_id
{
  "settings": {
    "distributable": true
  }
}
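The update request above could be prepared from a script roughly as follows. This is a sketch under assumptions: the resource paths are taken to be relative to the REST API root, the helper name build_distributable_request is hypothetical, and the authentication header is left to the caller because its exact format is not covered here.

```python
import json
import urllib.request


def build_distributable_request(base_url: str, resource_path: str, distributable: bool,
                                auth_headers: dict) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request that sets the 'distributable' setting."""
    body = json.dumps({"settings": {"distributable": distributable}}).encode("utf-8")
    headers = {"Content-Type": "application/json", **auth_headers}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{resource_path.lstrip('/')}",
        data=body,
        headers=headers,
        method="PUT",
    )


request = build_distributable_request(
    "https://upstream-snow-owl-url/snowowl",
    "codesystems/example_codesystem_id",
    True,
    auth_headers={},  # assumption: add whatever header your deployment requires
)
print(request.get_method(), request.full_url)
# Send with urllib.request.urlopen(request) once the headers are in place.
```

The same helper works for value sets and concept maps by changing the resource path.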

Configure downstream

Elasticsearch

There is one configuration property that must be set before provisioning a new downstream Snow Owl Terminology Server.
Any potential upstream Elasticsearch instance must be listed as an allowed source of information for the downstream Elasticsearch instances via a configuration parameter in the elasticsearch.yml file.
The property is called reindex.remote.whitelist:
elasticsearch.yml
...
http.port: 9200
...
reindex.remote.whitelist: ["upstream-elasticsearch-url.com:9200", "other-upstream-elasticsearch-url.com:9200"]
The whitelisted URL must contain the upstream HTTP port and must not contain the scheme.
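Because the syndication request later needs the upstream Elasticsearch URL with its scheme while the whitelist needs it without, deriving one form from the other avoids mismatches. A small Python sketch, with a hypothetical helper name:

```python
from urllib.parse import urlsplit


def to_whitelist_entry(upstream_url: str) -> str:
    """Convert 'https://host:9200' into the 'host:9200' form the whitelist expects."""
    parts = urlsplit(upstream_url)
    if not parts.hostname or parts.port is None:
        raise ValueError(
            f"URL must include a scheme, a host and an explicit port: {upstream_url!r}"
        )
    return f"{parts.hostname}:{parts.port}"


print(to_whitelist_entry("https://upstream-elasticsearch-url.com:9200"))
```

The helper rejects URLs without an explicit port, since the whitelist entry must contain one.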

Provision a new downstream server

Provisioning a new downstream server involves the following steps:
  • start with an empty dataset
  • collect all terminology resource identifiers that need to be syndicated
  • get all the necessary credentials to communicate with upstream
  • initiate the resource syndication and verify the result

Collect terminology resources for syndication

To populate a downstream server with terminology resources via an upstream source, one must collect the required resource identifiers or resource version identifiers beforehand.
Resource identifiers must be in their simple form, e.g.:
  • SNOMED-CT
  • ICD-10
  • LOINC
Resource version identifiers must be in the following form: <resource_id>/<version_id>, e.g.:
  • SNOMED-CT/2020-01-31
  • ICD-10/v2019
  • LOINC/v2.72
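The two identifier forms can be told apart and validated before they are submitted. The following Python sketch is only an illustration of the formats listed above; ResourceRef and parse_identifier are hypothetical names.

```python
from typing import NamedTuple, Optional


class ResourceRef(NamedTuple):
    resource_id: str
    version_id: Optional[str]  # None for a plain resource identifier


def parse_identifier(identifier: str) -> ResourceRef:
    """Split '<resource_id>' or '<resource_id>/<version_id>' into its parts."""
    resource_id, sep, version_id = identifier.strip().partition("/")
    if not resource_id or (sep and not version_id) or "/" in version_id:
        raise ValueError(f"Malformed identifier: {identifier!r}")
    return ResourceRef(resource_id, version_id or None)


for raw in ["SNOMED-CT", "LOINC/v2.72"]:
    print(parse_identifier(raw))
```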
To determine which resources are available for syndication, use the following upstream REST API endpoint. It returns an Atom feed of resource versions from which the required identifiers can be collected.
GET https://upstream-snow-owl-url/snowowl/syndication/feed.xml
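Once downloaded, the feed can be read with any Atom-aware XML parser. The sketch below uses Python's standard library and lists the id and title of each entry. The sample feed is entirely hypothetical: which element of a real entry carries the resource version identifier is an assumption that should be checked against the actual feed.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed content, for illustration only; the real feed may carry
# the identifiers in different elements.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><id>SNOMED-CT/2020-01-31</id><title>SNOMED CT 2020-01-31</title></entry>
  <entry><id>LOINC/v2.72</id><title>LOINC v2.72</title></entry>
</feed>"""


def list_entries(feed_xml: str) -> list:
    """Return (id, title) pairs for every Atom entry in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (entry.findtext("atom:id", default="", namespaces=ATOM_NS),
         entry.findtext("atom:title", default="", namespaces=ATOM_NS))
        for entry in root.findall("atom:entry", ATOM_NS)
    ]


for entry_id, title in list_entries(SAMPLE_FEED):
    print(entry_id, "|", title)
```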
It is not necessary to list the resource version identifiers of a resource that is already selected. For example:
  • If SNOMED-CT is selected as a resource, its versions do not need to be listed among the resource version identifiers.
  • If a specific version is selected (SNOMED-CT/2020-01-31) and the resource itself is not among the selected resources, only versions created up to 2020-01-31 will be syndicated.

Syndicate resources

To kick off a syndication process the following parameters are required:
  • the list of resource identifiers
  • the list of resource version identifiers
  • the upstream Snow Owl URL without its REST API root context:
    • e.g. https://upstream-snow-owl-url.com
  • the API key to authenticate with the upstream Snow Owl server
  • the upstream Elasticsearch URL, including the scheme and port:
    • e.g. https://upstream-elasticsearch-url.com:9200
  • the API key to authenticate with the upstream Elasticsearch
When there are no existing resources on the downstream server yet, at least one resource identifier or one resource version identifier must be selected.
Snow Owl resolves all resource dependencies and handles syndication requests strictly. If, for example, a Value Set depends on a specific SNOMED CT version and that version is neither among the selected resources nor already present on the downstream server, the syndication run fails and reports the missing dependency. Every dependency of the selected resources must therefore be listed for a given syndication run.
The above parameters should be sent to the following downstream Snow Owl REST API endpoint:
POST https://downstream-snow-owl-url/snowowl/syndication/syndicate
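A request body for this endpoint could be assembled along the following lines. This is a sketch, not the definitive request schema: the resource and version keys and their comma-separated form are taken from the selection examples in this document, while the field names for the upstream URLs and API keys are deliberately not guessed and must be supplied by the caller from the API reference. The helper name is hypothetical.

```python
import json


def build_syndication_body(resources, versions, connection_fields,
                           downstream_is_empty=False) -> str:
    """Assemble a JSON body for POST /syndication/syndicate."""
    if downstream_is_empty and not resources and not versions:
        raise ValueError(
            "An empty downstream server needs at least one resource or version identifier"
        )
    body = {
        "resource": ", ".join(resources),
        "version": ", ".join(versions),
    }
    # Upstream URLs and API keys, keyed by the field names from the API reference.
    body.update(connection_fields)
    return json.dumps(body, indent=2)


print(build_syndication_body(
    ["SNOMED-CT-US"],
    ["SNOMED-CT/2021-01-31"],
    connection_fields={},  # assumption: fill in from the endpoint's documentation
))
```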
The syndication process starts in the background as an asynchronous job. It can be tracked by calling the following endpoint with the job identifier returned in the Location response header:
GET https://downstream-snow-owl-url/snowowl/syndication/{id}
The returned result object will contain all information related to the given syndication run:
  • status of the run (RUNNING, FINISHED, FAILED)
  • list of successfully syndicated resource versions
  • additional details about created or updated Elasticsearch indices
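Since the job runs asynchronously, a client typically polls the job endpoint until the run leaves the RUNNING state. The Python sketch below shows one way to do that with a caller-supplied fetch function, so it stays independent of any HTTP library. It assumes the parsed result exposes the state under a key named status; that key name and the helper name wait_for_job are assumptions.

```python
import time


def wait_for_job(fetch_job, poll_seconds=5.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll fetch_job() until the run leaves the RUNNING state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_job()  # expected to return the parsed JSON of GET /syndication/{id}
        status = job.get("status")  # assumption: the state is exposed under 'status'
        if status in ("FINISHED", "FAILED"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Syndication job still {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)


# Simulated responses standing in for real HTTP calls.
fake_responses = iter([{"status": "RUNNING"}, {"status": "RUNNING"}, {"status": "FINISHED"}])
print(wait_for_job(lambda: next(fake_responses), sleep=lambda _seconds: None))
```

In real use, fetch_job would issue the GET request with the downstream credentials and return the decoded JSON.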

Examples of resource selection

Code Systems

Suppose the SNOMED CT US extension (SNOMED-CT-US) needs to be syndicated, and it depends on the SNOMED CT International version 2021-01-31. Provide the following resource identifier and resource version identifier configuration:
{
  "resource": "SNOMED-CT-US",
  "version": "SNOMED-CT/2021-01-31"
}
This will syndicate all versions of SNOMED-CT-US and all international versions until 2021-01-31.
Changing the configuration to the following syndicates all versions of both SNOMED-CT-US and SNOMED-CT, including international versions released after 2021-01-31:
{
  "resource": "SNOMED-CT-US, SNOMED-CT",
  "version": ""
}

Value Sets

For a Value Set with the identifier VS and members from SNOMED-CT/2020-07-31, provide:
{
  "resource": "VS",
  "version": "SNOMED-CT/2020-07-31"
}

Concept Maps

For a Concept Map with the identifier CM that maps concepts between LOINC/v2.72 and ICD-10/v2019, provide:
{
  "resource": "CM",
  "version": "LOINC/v2.72, ICD-10/v2019"
}
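Because a run fails when a dependency is missing, it can help to check a selection on the client side before submitting it. The following Python sketch is a simplified pre-check, not Snow Owl's own dependency resolution: it treats a required version as covered only if its resource is selected, the exact version is selected, or the version already exists downstream, and it ignores the fact that selecting a later version also brings in earlier ones. The function name and arguments are hypothetical.

```python
def missing_dependencies(required_versions, selected_resources, selected_versions,
                         downstream_versions=()):
    """Return required versions not covered by the selection or the downstream server."""
    selected_resources = set(selected_resources)
    covered = set(selected_versions) | set(downstream_versions)
    missing = []
    for version in required_versions:
        resource_id = version.split("/", 1)[0]
        if resource_id not in selected_resources and version not in covered:
            missing.append(version)
    return missing


# Concept Map CM depends on LOINC/v2.72 and ICD-10/v2019; only one is selected here.
print(missing_dependencies(
    required_versions=["LOINC/v2.72", "ICD-10/v2019"],
    selected_resources=["CM"],
    selected_versions=["LOINC/v2.72"],
))
```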

Keeping a downstream server up-to-date

If a given downstream server already contains the desired resources and the goal is to keep the content up-to-date, it is not required to fill in the resource and resource version identifiers for the syndication request.
One can call the POST /syndication/syndicate endpoint with all the credentials and URLs but without specifying any resource or version identifier. The server will automatically determine - based on the set of existing downstream resources - if there are any new resource versions available for syndication.
To check whether any updates are available, call the following endpoint:
GET https://downstream-snow-owl-url/snowowl/syndication/list
It returns the list of resource versions that are available for syndication. If there are no updates, the result is empty.