# Preload dataset (optional)

In some cases, a pre-built dataset is shipped together with the Terminology Server. This simplifies the initial setup and makes it possible to get started quickly.

{% hint style="info" %}
This method is only applicable to deployments where the Elasticsearch cluster is co-located with the Terminology Server.

To load data into a managed Elasticsearch cluster, there are several options:

* use [cross-cluster replication](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/xpack-ccr.html)
* use [snapshot-restore](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/snapshot-restore.html)
* use Snow Owl to rebuild the data to the remote cluster
  {% endhint %}

These datasets are compressed archives of the Elasticsearch data folder and follow the same structure, except that they have a top-level folder called `indexes`. This is the same folder as `./snow-owl/resources/indexes`. To load the dataset, extract the archive into the parent `resources` folder so that its `indexes` folder ends up at this path.

```sh
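# Extract the dataset archive, keeping the ownership and permissions stored in it
tar --extract \
    --gzip \
    --verbose \
    --same-owner \
    --preserve-permissions \
    --file=snow-owl-resources.tar.gz \
    --directory=/opt/snow-owl/resources/

# Set the ownership Elasticsearch requires (UID 1000, GID 0) on everything extracted
chown -R 1000:0 /opt/snow-owl/resources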
```

{% hint style="warning" %}
Make sure to verify the file ownership of the `indexes` folder after extraction. Elasticsearch requires its data folder to be owned by UID 1000 and GID 0.
{% endhint %}
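
One way to perform the ownership check is to list every entry whose owner or group differs from the expected values. The following is a minimal sketch and not part of the Snow Owl distribution: the `list_wrong_ownership` helper and the `RESOURCES_DIR` variable are hypothetical names, and the default path is assumed from the extraction command shown earlier. It relies only on the standard `find` tests `-user` and `-group`, which accept numeric IDs when no account with that name exists.

```sh
# Hypothetical helper: print every entry under a directory whose owner or
# group differs from the expected numeric IDs.
# Usage: list_wrong_ownership <directory> <uid> <gid>
list_wrong_ownership() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Assumed default location; override RESOURCES_DIR if the install path differs.
RESOURCES_DIR="${RESOURCES_DIR:-/opt/snow-owl/resources}"

# Only run the check when the indexes folder is actually present.
if [ -d "$RESOURCES_DIR/indexes" ]; then
    list_wrong_ownership "$RESOURCES_DIR/indexes" 1000 0
fi
```

Empty output means every file and folder already has the expected ownership. Any path that is printed can be corrected with the `chown` command shown earlier.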
