This method is only applicable to deployments where the Elasticsearch cluster is co-located with the Snow Owl Terminology Server.
A managed Elasticsearch service will automatically configure a snapshot policy upon creation. See details here.
The Terminology Server release package contains a built-in solution to perform rolling and permanent data backups. The docker stack has a specialized container (called snow-owl-backup
) that is responsible for creating scheduled backups of:
the Elasticsearch indices
the OpenLDAP database (if present)
For the Elasticsearch indices, the backup container uses the Snapshot API. Snapshots are labeled in a predefined format with timestamps. E.g. snowowl-daily-20220324030001
The OpenLDAP database is backed up by compressing the contents of the folder under ./snow-owl/ldap
. Filenames are generated using the name of the corresponding Elasticsearch snapshot. E.g. snowowl-daily-20220324030001.tar.gz
.
Backup Window: when a backup operation is running the Terminology Server blocks all write operations on the Elasticsearch indices. This is to prevent data loss and have consistent backups.
Backup Duration: the very first backup of an Elasticsearch cluster takes a bit more time (depends on the size and I/O performance but between 20 minutes - 40 minutes), subsequent backups should take significantly less: 1 - 5 minutes.
Daily backups are rolling backups, scheduled, and cleaned up based on the settings specified in the ./snow-owl/docker/.env
file. Here is a summary of the important settings that could be changed.
To store backups redundantly it is advised to mount a remote file share to a local path on the host. By default, this folder is configured to be at ./snow-owl/backup
. It contains:
the snapshot files of the Elasticsearch cluster
the backup files of the OpenLDAP database
extra configuration files
Make sure the remote file share has enough free space to store around the double of the ./snow-owl/resources/indexes
folder.
Backup jobs are scheduled by crond, so cron-expressions can be defined here to specify the time a daily backup should happen.
This is used to tell the backup container how many daily backups must be kept.
Let's say we have an external file share mounted to /mnt/external_folder
. There is a need to create daily backups after each working day, during the night at 2:00 am. Only the last two-weeks-worth of data should be kept (assuming 5 working days each week).
It is also possible to perform backups occasionally, e.g. before versioning an important SNOMED CT release or before a Terminology Server version upgrade. These backups are kept until manually removed.
To create such backups the following command needs to be executed using the backup container's terminal:
The script will create a snapshot backup of the Elasticsearch data with a label snowowl-my-backup-label-20220405030002
and an archive that contains the database of the OpenLDAP server with the name snowowl-my-backup-label-20220405030002.tar.gz
.