Installation Guide

Installing Single-machine and Multi-machine systems

This guide describes how to install the TigerGraph platform either as a single node or as a multi-node cluster. Please use the Table of Contents to go to the appropriate section of this guide.

If you are installing the Developer Edition, you can also install a Docker image or a virtual machine (VirtualBox) image. Your welcome email message will direct you to the appropriate resources.

Preparation

This section is for new installations. If you are updating from a previous version of the TigerGraph platform, first read the section below on Upgrading an Existing Installation.

Before you can install the TigerGraph system, you need the following:

  1. One or more servers that meet the minimum Hardware and Software Requirements with regard to operating system, memory, and hard disk space, and that have enough memory and storage to hold your graph data.

  2. sudo or root privilege.

  3. A license key provided by TigerGraph (not applicable to Developer Edition).

  4. A TigerGraph system package.

  5. If your package is a *.tar.gz file, you may need to install some software prerequisites.

Use a Bash shell; other shells may cause installation issues.

Obtaining a TigerGraph Package

If you do not yet have a TigerGraph system package, you can request one at www.tigergraph.com/download/.

Software Prerequisites for *.tar.gz Packages

If your package is a *.tar.gz file, you also need to ensure that your machine has the following software prerequisites.

  1. Pre-install these basic Linux utilities on your server, if necessary:

    • tar

    • curl

    • ip

    • more

    • crontab

    • ssh/sshd

    • netstat

    • semanage

  2. If you are installing a cluster, you also need the following:

    • ntpd

    • iptables/firewalld

  3. If you will use the password login method (P method) instead of the SSH key login method (K method) to install the TigerGraph platform, you will also need the following:

    • sshpass
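The prerequisite lists above can be checked before running the installer. The following is a minimal sketch, not part of the TigerGraph package: the function name missing_tools is hypothetical, and it only tests whether each command is on the PATH, not whether its version is suitable.

```shell
#!/usr/bin/env bash
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not something shipped with TigerGraph.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Basic utilities from item 1; add ntpd, iptables, or sshpass as your setup requires.
missing_tools tar curl ip more crontab ssh sshd netstat semanage
```

An empty result means every listed command was found. Any names that are printed should be installed with your distribution's package manager before you continue.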

Installation

For 3.0 Beta, the installer may have issues with the "SSH with password" login method on EC2 instances. Please use SSH with a key file for the time being.

If your EC2 machine was created to be accessed via SSH password, run these commands and then continue with the installation:

ssh-keygen
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
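If the commands above may be run more than once, it can help to add the key only when it is not already present. The following is a sketch under that assumption; authorize_key is a hypothetical helper name, and both file paths are supplied by the caller.

```shell
#!/usr/bin/env bash
# Append a public key to an authorized_keys file only if it is not already listed.
# authorize_key is a hypothetical helper; it is not part of the TigerGraph installer.
authorize_key() {
  local pub_file="$1" auth_file="$2"
  touch "$auth_file"
  chmod 600 "$auth_file"
  # -F: fixed strings, -x: whole-line match, -f: read the pattern from the key file
  if ! grep -qxF -f "$pub_file" "$auth_file"; then
    cat "$pub_file" >> "$auth_file"
  fi
}
```

Typical usage would be: authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys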

The name of your package may vary, depending on the product edition (e.g., developer or enterprise) and the version (e.g., 2.0.1). For the examples here, we will assume the name is tigergraph-x.y.z.tar.gz. Substitute the name of your actual package file.

  1. Extract the package:

Example: extract for <version> = x.y.z
tar -xzf tigergraph-x.y.z.tar.gz

2. A folder named tigergraph-<version>-offline (or tigergraph-<version>-developer) will be created. Change into this folder. To install with default settings, run the install.sh script with the following commands:

Example: Default installation for <version> = 3.0.0
# For single-node installation, run:
cd tigergraph-*/
sudo ./install.sh
# For cluster/remote installation, run:
cd tigergraph-*/
./install.sh
# For non-interactive installation, see Non-Interactive Mode Installation below.
# To install the Developer Edition, run:
cd tigergraph-*/
sudo ./install.sh

The installer will ask you a few questions:

  • Do you agree to the License Terms and Conditions?

  • What is your license key? (not applicable to Developer Edition)

  • Do you want to use the default TigerGraph user name or select/create your own?

  • Do you want to use the default TigerGraph user password or create your own?

  • Do you want to use the default installation folder or select/create your own?

  • Do you want to use the default data location folder or select/create your own?

  • Do you want to use the default log location folder or select/create your own?

  • Do you want to use the default temp folder or select/create your own?

  • What is the default SSH port for your machine?

To see the default settings and to learn how to customize the installation, read the Installation Options section below.

License keys are long (over 100 characters). If you copy and paste the license key, be careful not to include an end-of-line character.
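One way to guard against a stray end-of-line character is to strip it before handing the key to the installer. This is a minimal sketch; clean_license_key is a hypothetical helper name, and it removes only carriage returns and line feeds.

```shell
#!/usr/bin/env bash
# Remove carriage returns and line feeds from a pasted license key.
# clean_license_key is a hypothetical helper, not part of the TigerGraph package.
clean_license_key() {
  printf '%s' "$1" | tr -d '\r\n'
}
```

The cleaned value could then be passed with the -l option described under Command Line Options, for example: ./install.sh -l "$(clean_license_key "$RAW_KEY")", where RAW_KEY is a placeholder variable holding the pasted text.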

3. After installation is complete, you can log in as the tigergraph user with this command:

su tigergraph

To confirm correct operation:

1. Try the command gadmin status

If the system is installed correctly and the license is activated, the command should report that zk, kafka, etcd, dict, ts3, ifm, ctrl, nginx, gsql, restpp, and gui are up and ready. Since there is no graph data loaded yet, gse and gpe will show "warm up".

2. Try the command gsql --version

4. Basic installation is now finished! Please see Post-Installation Notes below.

Installation Options

The following default settings will be applied if no parameters are specified:

  • The installer will create a user called tigergraph, with password tigergraph.

  • The root directory for the installation (referred to as <TigerGraph.Root.Dir>) is a folder called tigergraph located in the tigergraph user's home directory, i.e., /home/tigergraph/tigergraph.

  • The App, Data, Log, and Temp files are placed within the root directory:

    • App Path: /home/tigergraph/tigergraph/app

    • Data Path: /home/tigergraph/tigergraph/data

    • Log Path: /home/tigergraph/tigergraph/log

    • Temp Path: /home/tigergraph/tigergraph/tmp

The installation can be customized using three different methods:

  1. Interactive mode

  2. Command line options passed to the install.sh script

  3. Non-interactive mode with the install_conf.json file

Command Line Options

# Installation options of enterprise edition
Usage:
./install.sh [-n] [-u <user>] [-p <password>] [-r <tigergraph_root_dir>] [-l <license_key>] [-F] [-N]
./install.sh -U
./install.sh -h
Options:
-h -- Show the help
-u -- TigerGraph user [default: tigergraph]
-p -- TigerGraph password [default: tigergraph]
-r -- TigerGraph.Root.Dir [default: <tigergraph_user_home>/tigergraph]
-l -- TigerGraph license key
-n -- Non-interactive option: suppress prompts, and continue installation using default config
-U -- Upgrade tigergraph system from existing platform with no config change and
should run under tigergraph user
-F -- Set iptables (firewall) rules to open tcp ports among cluster nodes
-N -- Set NTP system time synchronization among cluster nodes
[NOTE ]: Using option '-n' will non-interactively install the platform on single node
or cluster with all configurations from config file "platform_config.json".
In this case, the config file should be modified before installation.
[WARNING ]: Installer fails if any option (except -F and -N) is provided with option '-n' at the same time.
# Installation options of developer edition
Usage:
./install.sh [-u <user>] [-p <password>] [-r <tigergraph_root_dir>]
./install.sh -h
Options:
-h -- show the help
-u -- TigerGraph user [default: tigergraph]
-p -- TigerGraph password [default: tigergraph]
-r -- TigerGraph.Root.Dir [default: <tigergraph_user_home>/tigergraph]
-n -- Non-interactive option: suppress prompts, and continue installation using default config
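The warning above says that -n may be combined only with -F and -N. The following sketch checks an argument list against that rule before the installer is started. The function name check_install_flags is hypothetical, and the check is deliberately simple: it treats every argument that begins with a dash as an option.

```shell
#!/usr/bin/env bash
# Return non-zero if -n is combined with any option other than -F or -N.
# check_install_flags is a hypothetical pre-flight helper, not part of install.sh.
check_install_flags() {
  local has_n=0 other=0 arg
  for arg in "$@"; do
    case "$arg" in
      -n) has_n=1 ;;
      -F|-N) ;;
      -*) other=1 ;;
    esac
  done
  if [ "$has_n" -eq 1 ] && [ "$other" -eq 1 ]; then
    echo "error: -n may only be combined with -F and -N" >&2
    return 1
  fi
  return 0
}
```

For example, check_install_flags -n -F -N succeeds, while check_install_flags -n -u tigergraph reports an error.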

TigerGraph cluster configuration enables the graph database to be partitioned and distributed across multiple server nodes in a local network (not available in the Developer Edition). The cluster can either be a physical cluster or a network virtual cluster from a cloud service such as Amazon EC2 or Microsoft Azure.

  1. The installation of TigerGraph 3.x has been validated on Amazon EC2, on Microsoft Azure, and on a physical on-premises cluster. For Amazon EC2, make sure all TCP ports are open among all cluster nodes; otherwise, services may not start.

  2. In TigerGraph 3.x, the installation machine can be within or outside the cluster. If outside the cluster, the installation machine should be a Linux machine.

  3. Currently, every machine in the cluster must have a sudo user with the same username and the same password or SSH key.

  4. To install a high-availability (HA) cluster (with at least 3 nodes), set ReplicationFactor to 2 or higher. The default value of 1 means HA is off. Choose a value that is a factor of the number of nodes, e.g., ReplicationFactor = 2 or 3 for a 6-node cluster.

  5. For cluster installation, the installation script does not need to be run with sudo privileges.

During cluster configuration, the user is required to provide the following information regarding the cluster:

  • The node id (e.g. m1) and its IP address (e.g. 172.30.3.2).

  • The login credentials for the nodes.

  • The ReplicationFactor, which controls the HA setup.

Interactive Mode Installation

In interactive mode, the installer will first ask the same basic questions it asks for single-node installation. It will then ask how many machines are in your cluster. Then it will prompt for the IP addresses of the machines, assigning each machine an alias m1, m2, m3, etc. Next it will ask for sudo user name and credentials information. Last, it will ask the user if they accept some changes to the system. (See non-interactive mode installation below for details about user credentials.) A screenshot of interactive installation is shown below.

Non-Interactive Mode Installation

For non-interactive mode installation, the user must review and modify all the settings in the file install_conf.json before running the installer. This file is in the folder with your install.sh file and other TigerGraph package files.

The following are some advanced configuration options:

  1. Node List: Each machine in the cluster is defined as a key:value pair, where the key is a machine alias (m1, m2, m3, etc.) and the value is the machine's public IP address. Use as many key:value pairs as you need. NOTE: If you choose names other than m1, m2, etc., be sure to list them in alphanumeric order in the config file, because the first machine ("m1") has a special role in some cases. The installer will auto-detect the local IP addresses and use them to configure the system. If the installer detects more than one local IP address, it will ask the user to select one for configuration. One example of NodeList:

    "NodeList": ["m1: 192.168.55.42", "m2: 192.168.55.46", "m3: 192.168.55.47" ]

    Note: The entry is a JSON array of strings, so each key:value pair should be quoted as a string, and the pairs should be separated by commas.

  2. Login Config: Two login methods are supported:

    • SSH with password

    • SSH with key file

    For SSH with password, you must input the sudo/root user and its password. For SSH with key file, you must specify the AWS EC2 key file or other key file by its absolute path.

  3. Replication Factor: If you would like to enable the HA feature, make sure you have at least 3 nodes in the cluster and set the replication factor to 2 or higher. For example, if your cluster has 6 nodes, you could set the replication factor to 2 or 3. If you set the replication factor to 2, then 3 nodes will be used for one copy of the data and the other 3 nodes will be used as a replica copy of the data. Reminder: Set the replication factor to a factor of the number of nodes to maximize the HA benefit. Otherwise, some nodes may not be utilized as part of the HA cluster.
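The arithmetic behind the replication factor can be sketched as follows. This is an illustration only: ha_layout is a hypothetical helper, and it assumes that the nodes per copy equal the node count divided by the replication factor and that any remainder is left unused. The upper bound on the replication factor is also an assumption of this sketch.

```shell
#!/usr/bin/env bash
# Show how a node count splits into data copies for a given replication factor.
# ha_layout is a hypothetical helper; the "unused" count is a simplification.
ha_layout() {
  local nodes="$1" rf="$2"
  if [ "$nodes" -lt 3 ] || [ "$rf" -lt 2 ] || [ "$rf" -gt "$nodes" ]; then
    echo "HA needs at least 3 nodes and a replication factor from 2 to the node count" >&2
    return 1
  fi
  printf '%d copies x %d nodes per copy, %d nodes unused\n' \
    "$rf" "$((nodes / rf))" "$((nodes % rf))"
}

ha_layout 6 2   # prints: 2 copies x 3 nodes per copy, 0 nodes unused
```

With 7 nodes and a replication factor of 3, the same function reports one unused node, which is the situation the reminder above warns about.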

Below is a sample install_conf.json file.

The node names (e.g., m1, m2, etc.) MUST be given in alphanumeric order, because the first machine has a special role in some situations. In our documentation we will refer to this machine as m1.

install_conf.json example
{
  "tigergraph.user.name": "tigergraph",
  "tigergraph.user.password": "tigergraph",
  "tigergraph.root.dir": "/home/tigergraph/tigergraph",
  "license.key": "Replace_With_Trial_Or_Official_License_Key_String",
  "notes (this is a comment)": "The cluster.option block is for remote installation or cluster installation (installation that has more than one node). Skip it if installing locally on single node",
  "cluster.option": {
    "notes (this is a comment)": "Set enable.cluster to true for cluster installation",
    "enable.cluster": "false",
    "nodes.ip": {
      "m1": "192.168.1.1",
      "m2": "192.168.1.2",
      "m3": "192.168.1.3",
      "m4": "192.168.1.4"
    },
    "nodes.login": {
      "supported.methods (this is a comment)": "'P' for SSH using password or 'K' for SSH using key file (e.g. ec2_key.pem)",
      "notes (this is a comment)": "All nodes must use the same ssh port, the same sudo user, and the same password or the same key file",
      "ssh.port": "22",
      "chosen.method": "P (or K)",
      "P": {
        "sudo.user.name": "sudoUserName",
        "sudo.user.password": "sudoUserPassword"
      },
      "K": {
        "sudo.user.name": "sudoUserName",
        "ssh.key.file": "/path/to/my_key.pem (if empty, the installer will use default ssh key file such as ~/.ssh/id_rsa)"
      }
    },
    "HA.option": {
      "HA.option notes (this is a comment)": "option to install a high-availability cluster (with at least 3 nodes), default value: false",
      "enable.HA": "false (or true)"
    }
  }
}
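Because the node aliases must appear in order, it can be convenient to generate the nodes.ip block from a list of IP addresses rather than typing it by hand. The following is a sketch only; nodes_ip_block is a hypothetical helper, and it assumes the m1, m2, ... naming used in the sample file.

```shell
#!/usr/bin/env bash
# Print a "nodes.ip" JSON fragment, assigning aliases m1, m2, ... in argument order.
# nodes_ip_block is a hypothetical helper, not part of the TigerGraph package.
nodes_ip_block() {
  local i=1 ip
  printf '"nodes.ip": {\n'
  for ip in "$@"; do
    if [ "$i" -gt 1 ]; then
      printf ',\n'
    fi
    printf '  "m%d": "%s"' "$i" "$ip"
    i=$((i + 1))
  done
  printf '\n}\n'
}

nodes_ip_block 192.168.1.1 192.168.1.2 192.168.1.3
```

The printed fragment can be pasted into the cluster.option section of install_conf.json. The first address given becomes m1, so list the machine intended for that role first.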

Post-Installation Notes

Change Your Password

If you installed with the default password, we recommend that you change it now.

Additional Customization

To perform additional customization, run gadmin --configure, followed by gadmin config-apply. On a cluster, gadmin --configure must be run on node m1. The gadmin config-apply command must also be run on node m1, since only node m1 contains the pkg_pool resources. If you configured one or more items of gpe.servers, gse.servers, restpp.servers, kafka.servers, zk.servers, dictserver.servers, gpe.replicas, or gse.replicas, you must reinstall the package by running the command gadmin pkg-install reset on node m1.

For details, see the appropriate sections of the TigerGraph System Administrators Guide v2.1.

Learning To Use TigerGraph

If you are a first-time user:

Upgrading an Existing Installation

Developer Edition upgrade is not supported

The Developer Edition is not designed for upgrade from one version to another. It is not possible to upgrade a Developer Edition installation to Enterprise Edition.

If you have written User-Defined Functions for your queries, be sure to make a backup of these files:

  • <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp

  • <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp
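The backup of the two User-Defined Function files can be scripted. This is a minimal sketch: backup_udf is a hypothetical helper, and the root directory and destination folder are supplied by the caller.

```shell
#!/usr/bin/env bash
# Copy the two UDF header files from a TigerGraph root directory to a backup folder.
# backup_udf is a hypothetical helper; it fails if either source file is missing.
backup_udf() {
  local root_dir="$1" dest_dir="$2"
  local src="$root_dir/dev/gdk/gsql/src/QueryUdf"
  mkdir -p "$dest_dir" || return 1
  # -p preserves timestamps and permissions on the copies
  cp -p "$src/ExprFunctions.hpp" "$src/ExprUtil.hpp" "$dest_dir"/
}
```

With the default root directory, a call might look like: backup_udf /home/tigergraph/tigergraph ~/udf_backup, where ~/udf_backup is a placeholder destination.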

Upgrading from v2.x to v3.x

  1. Make sure all data is consumed and no active jobs are running. The following instructions will guide you to force all KAFKA data to be consumed for each graph. For graphs requiring an authentication token, the endpoint must be called for each graph utilizing their respective tokens. For graphs not requiring tokens, you can call the endpoint once for all graphs.

    • /rebuildnow endpoint: Call the /rebuildnow endpoint to force the engine to rebuild the graphs. It is a non-blocking URL call: users can run queries and loading jobs while the rebuild is in progress. The endpoint takes four optional parameters (with examples below):

    1. threadnum: a parameter used to control the number of threads used to do the rebuild. If not specified then it uses the default threadnum in gium.

    2. vertextype: a vertex type name. If specified, the rebuild is done only for vertices of this type.

    3. segid: a list of parameters used to specify which segments get the rebuild. If not specified, the rebuild is done for all segments.

    4. path: path to write the summary file on each machine in the cluster. This can be used to indicate that the rebuild has finished and it also records the summary of the rebuild on each machine. The default path will be /tmp/rebuildnow.

    # Rebuild all segments using the default thread number and the default output path "/tmp/rebuildnow"
    curl -X GET "localhost:9000/rebuildnow/graph_name"
    # Rebuild all segments using the default thread number and write the output into the folder given by path = "/data/rebuild"
    curl -X GET "localhost:9000/rebuildnow/graph_name?path=/data/rebuild"
    # Same as above, but rebuild only the "lineitem" vertex type
    curl -X GET "localhost:9000/rebuildnow/graph_name?path=/data/rebuild&vertextype=lineitem"
    # Rebuild all segments using 4 threads
    curl -X GET "localhost:9000/rebuildnow/graph_name?threadnum=4"
    # Rebuild segment ids 1 and 2 using 4 threads
    curl -X GET "localhost:9000/rebuildnow/graph_name?threadnum=4&segid=1&segid=2"

    The call writes two files into the folder given by the path parameter: init.summary.txt is created at the start to record all segment info, and finished.summary.txt is written at the end, so that users know when the rebuild has finished. Below is an example of running the curl request and its output.

    time curl -H "GSQL-TIMEOUT:8000000" -X GET "http://localhost:9000/rebuildnow?path=/tmp/rebuildlater"
    {"version":{"edition":"enterprise","api":"v2","schema":0},"error":false,"message":"RebuildNow finished, please check details in the folder: /tmp/rebuildnow","results":[],"code":"REST-0000"}
    real 0m20.267s
    user 0m0.004s
    tigergraph@ubuntu core/gpe [tg_2.5.1_dev-CORE-706] $ ls -latr /tmp/rebuildnow/
    total 92
    drwxrwxrwx 2 tigergraph tigergraph 4096 Nov 6 18:33 .
    -rw-rw-r-- 1 tigergraph tigergraph 35108 Nov 6 21:53 init.summary.txt
    -rw-rw-r-- 1 tigergraph tigergraph 35108 Nov 6 21:53 finish.summary.txt
    cat finished.summary.txt
    [SELECTED] Segment id: 106, vertextype: 0, vertexsubtypeid: 0, vertexcount: 187732, edgecount: 563196, deletevertexcount: 0, postqueue_pos: 16344, transaction id: 16344, rebuild ts: 1573106412990
    [SKIPPED] Segment id: 6, vertextype: 0, vertexsubtypeid: 0, vertexcount: 85732, edgecount: 3106, deletevertexcount: 0, postqueue_pos: 16344, transaction id: 16344, rebuild ts: 1573106412900

    NOTE: The /rebuildnow endpoint does not guarantee that all KAFKA messages are consumed by the engine (you have to wait until there is no PullDelta in the GPE log to guarantee that all KAFKA messages have been consumed by the engine). It only guarantees that all the in-memory graph updates are persisted to disk data.

  2. Make sure the config is applied by using this command: gadmin config-apply

  3. Stop your TigerGraph system with this command: gadmin stop all admin ts3 -y
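When the endpoint is called for several graphs or with varying parameters, the URL can be assembled by a small helper. The following is a sketch: rebuildnow_url is a hypothetical function, and the host and port are taken from the examples in this section.

```shell
#!/usr/bin/env bash
# Build a /rebuildnow URL from a graph name and optional key=value parameters.
# rebuildnow_url is a hypothetical helper; localhost:9000 is taken from the examples.
rebuildnow_url() {
  local graph="$1"
  shift
  local url="localhost:9000/rebuildnow/${graph}" sep="?" kv
  for kv in "$@"; do
    url="${url}${sep}${kv}"
    sep="&"
  done
  printf '%s\n' "$url"
}

rebuildnow_url graph_name threadnum=4 segid=1 segid=2
```

The result can be passed straight to curl, for example: curl -X GET "$(rebuildnow_url graph_name path=/data/rebuild)". Parameter values are not URL-encoded by this sketch.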

  4. Install version 3.0.0 with the same cluster config and HA options as your previous installation.

    • If you have enabled HA in your 2.5.x installation, specify the same ReplicationFactor in the 3.0.0 installer as previously configured. Otherwise, leave it as 1.

    • NOTE: If your old 2.5.x system is installed in the cluster [m1, m2, m3, m4], you can only install 3.0.0 on the same machines [m1, m2, m3, m4]. Only the IP of m1 must stay the same; the order of m2 to m4 does not matter.

    • Please specify a valid license key.

  5. After installing, log in as the tigergraph user. The gadmin version command should now report 3.0.0. If not, please check your installation.

  6. The following instructions assume that you are logged in as the tigergraph user:

    1. Download the migration tool here and unzip it (link to tool to be added).

    2. Change directories to the migration_tool folder.

    3. Find the gsql.cfg file that belongs to 2.5.x. The default location is: ~/.gsql/gsql.cfg

    4. Run migration tool with ./migration_tool.sh ~/.gsql/gsql.cfg

    5. If any errors occur, please check the error message, as well as debug.log under the migration tool folder.

Notice: If you don’t activate a valid license when installing 3.0, the migration might fail at the end when running these two commands:

gsql recompile loading job
gsql install query -force all

Upgrading from 2.x to 2.x

All sections below are for versions prior to v3.0. If your specific versions are not listed below, please upgrade as follows:

  1. Download the latest version of TigerGraph to your system.

  2. Extract the tarball.

  3. Run the tigergraph.bin file that was extracted from the tarball: bash tigergraph.bin

Updating from v2.1.7 to v2.2.x

These steps assume that v2.1.7 is installed. To upgrade to v2.2 from a version older than v2.1.7, please upgrade to v2.1.7 first. If the tigergraph username and password have been changed, please have them ready, as you will need them in order to update the system.

  1. As the tigergraph user, download tigergraph-2.2.x-offline.tar.gz and extract the tarball file.

  2. Download the post_upgrade.sh script that is attached here.

  3. Run tigergraph.bin under the same folder to upgrade to 2.2.x.

  4. Run the post-upgrade script that was downloaded in step 2: post_upgrade.sh -u <sudoUser> [-P <sudoPass> | -K <sshKey>] -p <tigergraphUserPass>

Updating from v2.0 to v2.1

v2.0 can be upgraded to v2.1 Enterprise Edition. The data store format and GSQL language scripts in v2.0 are forward compatible to v2.1.

Upgrading from v1.x to v2.x

The data store format between 1.x and 2.x for single servers is forward compatible but not backward compatible. For a single server platform, users can upgrade from 1.x to 2.x without reloading data or recreating the graph schema. Some details of the GSQL language have changed, so some loading jobs and queries will need to be revised and reinstalled.

For a cluster configuration, direct upgrade from 1.x to 2.x is not supported at this time. Users interested in migrating from 1.x to 2.x need to export their data and metadata, install v2.x, and then reload data and metadata, with some small modifications. Please contact support@tigergraph.com for assistance.

Please consult the Release Notes for all the versions between your current version and your target version (e.g., v2.1) to see a summary of specification changes. Contact support@tigergraph.com for assistance.

Workflow for Direct Upgrade

  1. Verify that your data store is compatible and is eligible for direct update / upgrade.

  2. Review the specification changes and how they may affect your applications (loading jobs and queries).

  3. Stop issuing new commands to your TigerGraph system and allow any operations to complete.

  4. (Recommended) Back up your data, as a precaution.

  5. Follow the procedure at the beginning of this document for installing a new system. The installer will automatically shut down your system and start it again.

Be sure to specify the same username as your current installation. Otherwise, if you use a different user name, it will be treated as a new installation, with an empty graph.

  6. Pay attention to output messages during the installation process, which may alert you to additional tasks or checks you should perform.

  7. Run the command gsql to start the GSQL shell. The first time after an update, gsql performs two important operations:

    1. Copies your catalog from your old installation to the new installation.

    2. Compares the files in the backup /dev_<datetime>/gdk/gsql/src folder to the new /dev/gdk/gsql/src folder. Pay attention to any files residing in the old folder but not in the new folder. Review them and copy them to the new folder if appropriate. See the example below.

  8. Revise and reinstall loading jobs, user-defined functions, and queries as needed.
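The folder comparison described above can also be done by hand. The following sketch lists files that exist in the old source folder but not in the new one. The function name only_in_old is hypothetical, and it compares file paths only, not file contents.

```shell
#!/usr/bin/env bash
# List files (relative paths) present under the old folder but absent from the new one.
# only_in_old is a hypothetical helper; it does not compare file contents.
only_in_old() {
  local old="$1" new="$2" rel
  (cd "$old" && find . -type f | sort) | while IFS= read -r rel; do
    [ -e "$new/$rel" ] || printf '%s\n' "${rel#./}"
  done
}
```

The two arguments would be the backup src folder and the new src folder under your TigerGraph root directory. Each path printed is a candidate to review and copy across.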