Snow Owl is provided in the following package formats:
Package | Description |
zip/tar.gz | The zip and tar.gz packages are suitable for installation on any system and are the easiest choice for getting started with Snow Owl on most systems. See Install Snow Owl with tar.gz or zip. |
rpm | The rpm package is suitable for installation on Red Hat, CentOS, SLES, openSUSE and other RPM-based systems. RPMs may be downloaded from the Downloads section. See Install Snow Owl with RPM. |
deb | The deb package is suitable for Debian, Ubuntu, and other Debian-based systems. Debian packages may be downloaded from the Downloads section. See Install Snow Owl with Debian Package. |
docker | Images are available for running Snow Owl as Docker containers. They may be downloaded from the official Docker Hub Registry. See Install Snow Owl with Docker. |
Coming Soon!
Snow Owl is provided as a .zip and as a .tar.gz package. These packages can be used to install Snow Owl on any system and are the easiest package format to use when trying out Snow Owl.
The latest stable version of Snow Owl can be found on the Snow Owl Releases page.
Snow Owl requires Java 8 or later. Use the official Oracle distribution or an open-source distribution such as OpenJDK.
zip package
The .zip archive for Snow Owl can be downloaded and installed as follows:
.tar.gz package
The .tar.gz archive for Snow Owl can be downloaded and installed as follows:
Snow Owl can be started from the command line as follows:
By default, Snow Owl runs in the foreground, prints its logs to the standard output (stdout), and can be stopped by pressing Ctrl-C.
All scripts packaged with Snow Owl assume that Bash is available at /bin/bash. As such, Bash should be available at this path either directly or via a symbolic link.
You can test that your instance is running by sending an HTTP request to Snow Owl's status endpoint:
which should give you a response like this:
You can send the Snow Owl process to the background using the combination of nohup and the & character:
Log messages can be found in the $SO_HOME/serviceability/logs/ directory.
To shut down Snow Owl, you can kill the process ID directly:
or using the provided shutdown script:
.zip and .tar.gz archives:
The .zip and .tar.gz packages are entirely self-contained. All files and directories are, by default, contained within $SO_HOME, the directory created when unpacking the archive.
This is very convenient because you don’t have to create any directories to start using Snow Owl, and uninstalling Snow Owl is as easy as removing the $SO_HOME directory. However, it is advisable to change the default locations of the config directory, the data directory, and the logs directory so that you do not delete important data later on.
You now have a test Snow Owl environment set up. Before you start serious development or go into production with Snow Owl, you must do some additional setup:
Learn how to configure Snow Owl.
Configure important Snow Owl settings.
Configure important system settings.
Type | Description | Default Location | Setting |
home | Snow Owl home directory or $SO_HOME | Directory created by unpacking the archive | |
bin | Binary scripts including startup/shutdown to start/stop the instance | $SO_HOME/bin | |
conf | Configuration files including snowowl.yml | $SO_HOME/configuration | |
data | The location of the data files and resources. | $SO_HOME/resources | path.data |
logs | Log files location. | $SO_HOME/serviceability/logs | |
Coming Soon!
Snow Owl uses SLF4J and Logback for logging.
The logging configuration file (serviceability.xml) can be used to configure Snow Owl logging. Its location depends on your installation method; by default it is located in the ${SO_HOME}/configuration folder.
Extensive information on how to customize logging and all the supported appenders can be found on the Logback documentation.
This section includes information on how to set up Snow Owl and get it running, including:
Downloading
Installing
Starting
Configuring
Snow Owl is built using Java, and requires at least Java 8 in order to run. Only Oracle’s Java and the OpenJDK are supported. The same JVM version should be used on all Elasticsearch nodes and clients.
We recommend installing Java version 1.8.0_171 or a later version in the Java 8 release series. We recommend using a supported LTS version of Java.
The version of Java that Snow Owl will use can be configured by setting the JAVA_HOME environment variable.
Snow Owl ships with good defaults and requires very little configuration.
Snow Owl has three configuration files:
snowowl.yml for configuring Snow Owl
serviceability.xml for configuring Snow Owl logging
elasticsearch.yml for configuring the underlying Elasticsearch instance in case of embedded deployments
These files are located in the config directory, whose default location depends on whether the installation is from an archive distribution (tar.gz or zip) or a package distribution (Debian or RPM packages).
For the archive distributions, the config directory location defaults to $SO_PATH_HOME/configuration. The location of the config directory can be changed via the SO_PATH_CONF environment variable as follows:
Alternatively, you can export the SO_PATH_CONF environment variable via the command line or via your shell profile.
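As a minimal sketch of the export approach, the snippet below sets SO_PATH_CONF in the current shell and shows that child processes inherit it. The directory name is a hypothetical placeholder, not a path used by Snow Owl itself.

```shell
# Hypothetical config directory; replace it with your own location.
export SO_PATH_CONF="$HOME/snowowl-config"

# Child processes started from this shell now inherit the variable.
sh -c 'echo "Config directory: $SO_PATH_CONF"'
```

Adding the same export line to your shell profile would make the setting apply to every new session.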
For the package distributions, the config directory location defaults to /etc/snowowl. The location of the config directory can also be changed via the SO_PATH_CONF environment variable, but note that setting this in your shell is not sufficient. Instead, this variable is sourced from /etc/default/snowowl (for the Debian package) and /etc/sysconfig/snowowl (for the RPM package). You will need to edit the SO_PATH_CONF=/etc/snowowl entry in one of these files accordingly to change the config directory location.
The configuration format is YAML. Here is an example of changing the path of the data directory:
Settings can also be flattened as follows:
Environment variables referenced with the ${...} notation within the configuration file will be replaced with the value of the environment variable, for instance:
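The sketch below writes a scratch snowowl.yml that uses the flattened path.data setting together with a ${...} reference. The variable name SNOWOWL_DATA_DIR and the scratch directory are assumptions made for illustration; only the path.data key and the ${...} notation come from this documentation.

```shell
# Scratch directory so the sketch does not touch a real installation.
conf_dir="$(mktemp -d)"

# Hypothetical variable name. The quoted heredoc keeps the ${...} reference
# literal, so Snow Owl (not the shell) would resolve it at startup.
export SNOWOWL_DATA_DIR=/var/lib/snowowl

cat > "$conf_dir/snowowl.yml" <<'EOF'
path.data: ${SNOWOWL_DATA_DIR}
EOF

cat "$conf_dir/snowowl.yml"
```

Quoting the heredoc delimiter matters here: with an unquoted EOF the shell would expand the variable itself and the file would contain the literal path instead of the reference.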
You should rarely need to change Java Virtual Machine (JVM) options. If you do, the most likely change is setting the heap size.
The preferred method of setting JVM options (including system properties and JVM flags) is via the SO_JAVA_OPTS environment variable. For instance:
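A minimal sketch of setting the variable before starting Snow Owl from the same shell. The 4 GB heap is a hypothetical value chosen only to illustrate the -Xms and -Xmx flags.

```shell
# Hypothetical heap size; pick values that fit your server.
export SO_JAVA_OPTS="-Xms4g -Xmx4g"

echo "JVM options for Snow Owl: $SO_JAVA_OPTS"
```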
When using the RPM or Debian packages, SO_JAVA_OPTS can be specified in the system configuration file.
Some other Java programs support the JAVA_OPTS environment variable. This is not a mechanism built into the JVM but instead a convention in the ecosystem. However, we do not support this environment variable, instead supporting setting JVM options via the environment variable SO_JAVA_OPTS as above.
The RPM for Snow Owl can be downloaded from the Downloads section. It can be used to install Snow Owl on any RPM-based system such as openSUSE, SLES, CentOS, Red Hat, and Oracle Enterprise.
RPM install is not supported on distributions with old versions of RPM, such as SLES 11 and CentOS 5. Please see Install Snow Owl with .zip or .tar.gz instead.
On systemd-based distributions, the installation scripts will attempt to set kernel parameters (e.g., vm.max_map_count); you can skip this by masking the systemd-sysctl.service unit.
Use the chkconfig command to configure Snow Owl to start automatically when the system boots up:
Snow Owl can be started and stopped using the service command:
If Snow Owl fails to start for any reason, it will print the reason for failure to STDOUT. Log files can be found in /var/log/snowowl/.
To configure Snow Owl to start automatically when the system boots up, run the following commands:
Snow Owl can be started and stopped as follows:
These commands provide no feedback as to whether Snow Owl was started successfully or not. Instead, this information will be written in the log files located in /var/log/snowowl/.
You can test that your Snow Owl instance is running by sending an HTTP request to:
which should give you a response something like this:
Snow Owl defaults to using /etc/snowowl for runtime configuration. The ownership of this directory and all files in it is set to root:snowowl on package installation, and the directory has the setgid flag set so that any files and subdirectories created under /etc/snowowl are created with this ownership as well (e.g., if a keystore is created using the keystore tool). It is expected that this be maintained so that the Snow Owl process can read the files under this directory via the group permissions.
Snow Owl loads its configuration from the /etc/snowowl/snowowl.yml file by default. The format of this config file is explained in Configuring Snow Owl.
The RPM places config files, logs, and the data directory in the appropriate locations for an RPM-based system:
You now have a test Snow Owl environment set up. Before you start serious development or go into production with Snow Owl, you must do some additional setup:
Learn how to configure Snow Owl.
Configure important Snow Owl settings.
Configure important system settings.
Type | Description | Default Location | Setting |
home | Snow Owl home directory or $SO_HOME | /usr/share/snowowl | |
bin | Binary scripts including startup/shutdown to start/stop the instance | /usr/share/snowowl/bin | |
conf | Configuration files including snowowl.yml | /etc/snowowl | |
data | The location of the data files and resources. | /var/lib/snowowl | path.data |
logs | Log files location. | /var/log/snowowl | |
Snow Owl uses a mmapfs directory by default to store its data. The default operating system limits on mmap counts are likely to be too low, which may result in out-of-memory exceptions.
On Linux, you can increase the limits by running the following command as root:
To set this value permanently, update the vm.max_map_count setting in /etc/sysctl.conf. To verify after rebooting, run sysctl vm.max_map_count.
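The read-only check below is a sketch for verifying the current value without changing anything. The helper name check_max_map_count is hypothetical, and the threshold of 262144 is the minimum recommended by the Elasticsearch documentation; treating it as the right value for Snow Owl is an assumption.

```shell
# Compare the current vm.max_map_count with a required minimum.
# $1 = required minimum, $2 = optional value to test instead of the live one.
check_max_map_count() {
  required="$1"
  current="${2:-$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)}"
  if [ "$current" -ge "$required" ]; then
    echo "ok: vm.max_map_count=$current"
  else
    echo "too low: vm.max_map_count=$current (need at least $required)"
    return 1
  fi
}

# 262144 is assumed here; see the note above.
check_max_map_count 262144 || echo "raise the limit before starting Snow Owl"
```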
The RPM and Debian packages will configure this setting automatically. No further configuration is required.
This is only relevant if you are running Snow Owl with an embedded Elasticsearch and not connecting it to an existing cluster.
Snow Owl (with embedded Elasticsearch) uses a lot of file descriptors or file handles. Running out of file descriptors can be disastrous and will most probably lead to data loss. Make sure to increase the limit on the number of open file descriptors for the user running Snow Owl to 65,536 or higher.
For the .zip and .tar.gz packages, set ulimit -n 65536 as root before starting Snow Owl, or set nofile to 65536 in /etc/security/limits.conf.
RPM and Debian packages already default the maximum number of file descriptors to 65536 and do not require further configuration.
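To confirm the limit for the current session, a small check along these lines can be run as the user that will start Snow Owl. The helper name fd_limit_ok is hypothetical; ulimit -n and the 65536 threshold are taken from the text above.

```shell
# Succeeds when the open file limit is "unlimited" or at least 65536.
# $1 = optional value to test instead of the current session limit.
fd_limit_ok() {
  limit="${1:-$(ulimit -n)}"
  [ "$limit" = "unlimited" ] || [ "$limit" -ge 65536 ]
}

if fd_limit_ok; then
  echo "open file limit is sufficient: $(ulimit -n)"
else
  echo "open file limit is too low: $(ulimit -n) (need 65536 or more)"
fi
```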
While Snow Owl requires very little configuration, there are a number of settings which need to be considered before going into production.
The following settings must be considered before going to production:
By default, Snow Owl includes the OSS version of Elasticsearch and runs it in embedded mode to store terminology data and make it available for search. This is convenient for single-node environments (e.g., for evaluation, testing and development), but it might not be sufficient when you go into production.
To configure Snow Owl to connect to an Elasticsearch cluster, change the clusterUrl property in the snowowl.yml configuration file:
The value for this setting should be a valid HTTP URL pointing to the HTTP API of your Elasticsearch cluster, which by default runs on port 9200.
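As a sketch, the snippet below writes a scratch snowowl.yml containing the property. The nesting under repository and index follows the full setting name repository.index.clusterUrl used elsewhere in this documentation; the host name is a hypothetical placeholder, and the file is written to a temporary directory rather than a real installation.

```shell
# Scratch directory so the sketch does not touch a real installation.
conf_dir="$(mktemp -d)"

# Hypothetical host; 9200 is the default Elasticsearch HTTP port.
cat > "$conf_dir/snowowl.yml" <<'EOF'
repository:
  index:
    clusterUrl: http://elasticsearch.example.internal:9200
EOF

cat "$conf_dir/snowowl.yml"
```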
If you are using the .zip or .tar.gz archives, the data and logs directories are sub-folders of $SO_HOME. If these important folders are left in their default locations, there is a high risk of them being deleted while upgrading Snow Owl to a new version.
In production use, you will almost certainly want to change the locations of the data and log folders.
The RPM and Debian distributions already use custom paths for data and logs.
To allow clients to connect to Snow Owl, make sure you open access to the following ports:
8080/TCP: Used by Snow Owl Server's REST API for HTTP access
8443/TCP: Used by Snow Owl Server's REST API for HTTPS access
2036/TCP: Used by the Net4J binary protocol connecting Snow Owl clients to the server
By default, Snow Owl tells the JVM to use a heap with a minimum and maximum size of 2 GB. When moving to production, it is important to configure heap size to ensure that Snow Owl has enough heap available.
To configure the heap size settings, change the -Xms and -Xmx settings in the SO_JAVA_OPTS environment variable.
The value for these settings depends on the amount of RAM available on your server and whether you are running Elasticsearch on the same node as Snow Owl (either embedded or as a service) or running it in its own cluster. Good rules of thumb are:
Set the minimum heap size (Xms) and maximum heap size (Xmx) to be equal to each other.
Too much heap can subject the JVM to long garbage collection pauses.
Set Xmx to no more than 50% of your physical RAM, to ensure that there is enough physical RAM left for kernel file system caches.
Snow Owl connecting to a remote Elasticsearch cluster requires less memory, but make sure you still allocate enough for your use cases (classification, batch processing, etc.).
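The rules of thumb above can be turned into a quick calculation. This sketch reads the physical RAM from /proc/meminfo (Linux only) and prints equal -Xms and -Xmx values at the 50% upper bound; the helper name suggest_heap_opts is hypothetical, and the result is a ceiling to start from, not a recommendation for every workload.

```shell
# Print equal -Xms/-Xmx flags at 50% of the given RAM size (in MB).
suggest_heap_opts() {
  total_mb="$1"
  heap_mb=$((total_mb / 2))
  echo "-Xms${heap_mb}m -Xmx${heap_mb}m"
}

# Read physical RAM in kB; falls back to 0 where /proc/meminfo is missing.
total_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || true)"
suggest_heap_opts $(( ${total_kb:-0} / 1024 ))

# Worked example with a fixed input of 16 GB:
suggest_heap_opts 16384   # prints: -Xms8192m -Xmx8192m
```

The output can be used as the value of SO_JAVA_OPTS, reduced as needed when Elasticsearch runs on the same node.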
Most operating systems try to use as much memory as possible for file system caches and eagerly swap out unused application memory. This can result in parts of the JVM heap or even its executable pages being swapped out to disk.
Swapping is very bad for performance, and should be avoided at all costs. It can cause garbage collections to last for minutes instead of milliseconds and can cause services to respond slowly or even time out.
There are two approaches to disabling swapping. The preferred option is to completely disable swap, but if this is not an option, you can minimize swappiness.
Usually Snow Owl is the only service running on a box, and its memory usage is controlled by the JVM options. There should be no need to have swap enabled.
On Linux systems, you can disable swap temporarily by running:
To disable it permanently, you will need to edit the /etc/fstab file and comment out any lines that contain the word swap.
Another option available on Linux systems is to ensure that the sysctl value vm.swappiness is set to 1. This reduces the kernel’s tendency to swap and should not lead to swapping under normal circumstances, while still allowing the whole system to swap in emergency conditions.
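A read-only sketch for checking the current swappiness on Linux. The helper name swappiness_is_minimal is hypothetical; the sysctl command in the message is the standard way to change a kernel parameter at runtime and must be run as root.

```shell
# Succeeds when vm.swappiness equals 1.
# $1 = optional value to test instead of the live kernel setting.
swappiness_is_minimal() {
  value="${1:-$(cat /proc/sys/vm/swappiness 2>/dev/null || echo unknown)}"
  [ "$value" = "1" ]
}

if swappiness_is_minimal; then
  echo "vm.swappiness is already 1"
else
  echo "vm.swappiness is not 1; as root, consider: sysctl -w vm.swappiness=1"
fi
```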
By default, Snow Owl starts and connects to an embedded Elasticsearch cluster available on http://localhost:9200. This cluster has only a single node and its discovery method is set to single-node, which means it is not able to connect to other Elasticsearch clusters and will be used exclusively by Snow Owl.
This single node Elasticsearch cluster can easily serve Snow Owl in testing, evaluation and small authoring environments, but it is recommended to customize how Snow Owl connects to an Elasticsearch cluster in larger environments (especially when planning to scale with user demand).
You have two options to configure Elasticsearch used by Snow Owl.
The first option is to configure the underlying Elasticsearch instance by editing the configuration file elasticsearch.yml, which, depending on your installation, is available in the configuration directory. If the file does not exist, you can create it and Snow Owl will pick it up during the next startup.
The embedded Elasticsearch version is 6.3.2. If you are configuring it to connect to an existing Elasticsearch cluster, then make sure that the cluster version matches this version.
The second option is to configure Snow Owl to use a remote Elasticsearch cluster without the embedded instance. In order to use this feature you need to set the repository.index.clusterUrl configuration parameter to the remote address of your Elasticsearch cluster. When Snow Owl is configured to connect to a remote Elasticsearch cluster, it won't boot up the embedded instance, which reduces the memory requirements of Snow Owl slightly.
You can connect to self-hosted clusters or hosted solutions provided by AWS and Elastic.co for example.
Ideally, Snow Owl should run alone on a server and use all of the resources available to it. In order to do so, you need to configure your operating system to allow the user running Snow Owl to access more resources than allowed by default.
The following settings must be considered before going to production:
Where to configure system settings depends on which package you have used to install Snow Owl, and which operating system you are using.
When using the .zip or .tar.gz packages, system settings can be configured:
temporarily with ulimit, or
permanently in /etc/security/limits.conf.
When using the RPM or Debian packages, most system settings are set in the system configuration file. However, systems which use systemd require that system limits are specified in a systemd configuration file.
On Linux systems, ulimit can be used to change resource limits on a temporary basis. Limits usually need to be set as root before switching to the user that will run Snow Owl. For example, to set the number of open file handles (ulimit -n) to 65,536, you can do the following:
The new limit is only applied during the current session.
You can consult all currently applied limits with ulimit -a.
On Linux systems, persistent limits can be set for a particular user by editing the /etc/security/limits.conf file. To set the maximum number of open files for the snowowl user to 65,536, add the following line to the limits.conf file:
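A sketch of what such a line looks like, using the standard limits.conf column order of domain, type, item and value. The helper name limits_line is hypothetical; the "-" type sets both the soft and the hard limit.

```shell
# Print a limits.conf entry for the nofile item.
# $1 = user (domain), $2 = maximum number of open files.
limits_line() {
  # "-" applies the value to both the soft and the hard limit.
  printf '%s - nofile %s\n' "$1" "$2"
}

limits_line snowowl 65536   # prints: snowowl - nofile 65536
```

The printed line is what would be appended to /etc/security/limits.conf by an administrator with root access.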
This change will only take effect the next time the snowowl user opens a new session.
When using the RPM or Debian packages, system settings and environment variables can be specified in the system configuration file, which is located in:
However, for systems which use systemd, system limits need to be specified via systemd.
When using the RPM or Debian packages on systems that use systemd, system limits must be specified via systemd.
The systemd service file (/usr/lib/systemd/system/snowowl.service) contains the limits that are applied by default.
To override them, add a file called /etc/systemd/system/snowowl.service.d/override.conf (alternatively, you may run sudo systemctl edit snowowl which opens the file automatically inside your default editor). Set any changes in this file, such as:
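As an illustration, the sketch below prints the contents of an override file that raises the open file limit. LimitNOFILE is a standard systemd service directive; choosing it (and the value 65536) as the limit to override is an assumption, and the helper name render_override is hypothetical.

```shell
# Print a systemd drop-in that overrides the open file limit.
# $1 = new value for LimitNOFILE.
render_override() {
  printf '[Service]\nLimitNOFILE=%s\n' "$1"
}

render_override 65536

# To install it (requires root), the output could be written to the
# override.conf path named above, followed by: systemctl daemon-reload
```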
Once finished, run the following command to reload units:
Package | Location |
RPM | /etc/sysconfig/snowowl |
Debian | /etc/default/snowowl |
Snow Owl uses a number of thread pools for different types of operations. It is important that it is able to create new threads whenever needed. Make sure that the number of threads that the Snow Owl user can create is at least 4096.
This can be done by setting ulimit -u 4096 as root before starting Snow Owl, or by setting nproc to 4096 in /etc/security/limits.conf.
The package distributions when run as services under systemd will configure the number of threads for the Snow Owl process automatically. No additional configuration is required.
The method for starting Snow Owl varies depending on how you installed it.
If you installed Snow Owl with a .tar.gz or zip package, you can start Snow Owl from the command line.
Snow Owl can be started from the command line as follows:
By default, Snow Owl runs in the foreground, prints some of its logs to the standard output (stdout), and can be stopped by pressing Ctrl-C.
All scripts packaged with Snow Owl assume that Bash is available at /bin/bash. As such, Bash should be available at this path either directly or via a symbolic link.
To run Snow Owl as a daemon, use the following command:
Log messages can be found in the $SO_HOME/serviceability/logs/ directory.
The startup scripts provided in the RPM and Debian packages take care of starting and stopping the Snow Owl process for you.
Snow Owl is not started automatically after installation. How to start and stop Snow Owl depends on whether your system uses SysV init or systemd (used by newer distributions). You can tell which is being used by running this command:
Use the chkconfig command to configure Snow Owl to start automatically when the system boots up:
Snow Owl can be started and stopped using the service command:
If Snow Owl fails to start for any reason, it will print the reason for failure to STDOUT. Log files can be found in /var/log/snowowl/.
To configure Snow Owl to start automatically when the system boots up, run the following commands:
Snow Owl can be started and stopped as follows:
These commands provide no feedback as to whether Snow Owl was started successfully or not. Instead, this information will be written in the log files located in /var/log/snowowl/.
Snow Owl security features enable you to easily secure your terminology server. You can password-protect your data as well as implement more advanced security measures such as role-based access control and auditing.
You can choose the following security realms/identity providers to authenticate your users:
You can manage and authenticate users with the built-in file realm. All the data about the users for the file realm is stored in the users file. The file is located in SO_PATH_CONF and is read on startup.
You need to explicitly select the file realm in the snowowl.yml configuration file in order to use it for authentication.
In the above configuration, the file realm uses the users file to read your users from. Each row in the file represents a username and password delimited by the : character. The passwords are BCrypt hashes. The default users file comes with a default snowowl user with the default snowowl password.
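To make the row format concrete, the sketch below builds a sample users file in a temporary location and lists the usernames in it. The usernames and hash strings are dummy placeholders (real entries hold full BCrypt hashes), and the helper name list_users is hypothetical.

```shell
# Sample file in a temporary location; not a real Snow Owl users file.
users_file="$(mktemp)"

# Placeholder rows in username:hash form. The hashes are dummy strings
# that only imitate the BCrypt "$2a$" prefix.
printf '%s\n' \
  'alice:$2a$10$PLACEHOLDERHASH' \
  'bob:$2a$10$ANOTHERPLACEHOLDER' > "$users_file"

# The username is everything before the first ":" on each row.
list_users() {
  cut -d: -f1 "$1"
}

list_users "$users_file"
```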
An orderly shutdown of Snow Owl ensures that Snow Owl has a chance to clean up and close outstanding resources. For example, an instance that is shut down in an orderly fashion will initiate an orderly shutdown of the embedded Elasticsearch instance, gracefully close and disconnect connections and perform other related cleanup activities. You can help ensure an orderly shutdown by properly stopping Snow Owl.
If you’re running Snow Owl as a service, you can stop Snow Owl via the service management functionality provided by your installation.
If you’re running Snow Owl directly, you can stop Snow Owl by sending Ctrl-C if you’re running Snow Owl in the console, or by invoking the provided shutdown script as follows:
You can configure security to communicate with a Lightweight Directory Access Protocol (LDAP) server to authenticate users. To integrate with LDAP, you configure an ldap realm in the snowowl.yml configuration file.
At a minimum, you must set the realm type to ldap, specify the url of the LDAP server and set the rootDnPassword in the snowowl.yml configuration file. Your users should be available under the specified baseDn entry, and there should also be a cn=admin entry to allow Snow Owl to read user data. By default, Snow Owl expects the username of a user to be present in the uid property. You can change this in the userIdProperty setting.
Coming soon!