This method is only applicable to deployments where the Elasticsearch cluster is co-located with the Snow Owl Authoring Platform.
A managed Elasticsearch service will automatically configure a snapshot policy upon creation. See details here.
The Authoring Platform release package contains a built-in solution to perform rolling and permanent data backups. The docker stack has a specialized container (called snow-owl-backup
) that is responsible for creating scheduled backups of:
the Elasticsearch indices
the OpenLDAP database (if present)
Bugzilla's configuration files and SQL database
For the Elasticsearch indices, the backup container uses the Snapshot API. Snapshots are labeled in a predefined format with timestamps. E.g. snowowl-daily-20220324030001
The OpenLDAP database is backed up by compressing the contents of the folder under ./snow-owl/ldap
. Filenames are generated using the name of the corresponding Elasticsearch snapshot. E.g. snowowl-daily-20220324030001.tar.gz
.
Bugzilla is backed up by compressing the configuration files and the MySQL database dump. Filenames are generated using the name of the corresponding Elasticsearch snapshot. E.g. snowowl-daily-20220324030001.tar.gz
.
Backup Window: when a backup operation is running the Terminology Server blocks all write operations on the Elasticsearch indices. This is to prevent data loss and have consistent backups.
Backup Duration: the very first backup of an Elasticsearch cluster takes a bit more time (depends on the size and I/O performance but between 20 minutes - 40 minutes), subsequent backups should take significantly less: 1 - 5 minutes.
Daily backups are rolling backups, scheduled, and cleaned up based on the settings specified in the ./snow-owl/docker/.env
file. Here is a summary of the important settings that could be changed.
To store backups redundantly it is advised to mount a remote file share to a local path on the host. By default, this folder is configured to be at ./snow-owl/backup
. It contains:
the snapshot files of the Elasticsearch cluster
the backup files of the OpenLDAP database
the backup files of Bugzilla
extra configuration files
Make sure the remote file share has enough free space to store around the double of the ./snow-owl/resources/indexes
folder.
Backup jobs are scheduled by crond, so cron-expressions can be defined here to specify the time a daily backup should happen.
This is used to tell the backup container how many daily backups must be kept.
Let's say we have an external file share mounted to /mnt/external_folder
. There is a need to create daily backups after each working day, during the night at 2:00 am. Only the last two-weeks-worth of data should be kept (assuming 5 working days each week).
It is also possible to perform backups occasionally, e.g. before versioning an important SNOMED CT release or before a Terminology Server version upgrade. These backups are kept until manually removed.
To create such backups the following command needs to be executed using the backup container's terminal:
The script will create a snapshot backup of the Elasticsearch data with a label snowowl-my-backup-label-20220405030002,
an archive that contains the database of the OpenLDAP server and an archive that contains the configuration and database of Bugzilla with the name snowowl-my-backup-label-20220405030002.tar.gz
.