By Mohamad Wael


How to configure replication in OrientDB?


Start by downloading OrientDB on each server where replication is to be configured. It can be obtained from the OrientDB website, or fetched directly with wget, as in:

$ wget https://s3.us-east-2.amazonaws.com/orientdb3/releases/3.1.10/orientdb-3.1.10.tar.gz

Next, extract the archive with tar, as in tar -xzf orientdb-3.1.10.tar.gz, change into the bin directory of the extracted folder, which holds the scripts used to run OrientDB, as in cd orientdb-3.1.10/bin, and execute ./dserver.sh.

The dserver.sh script starts OrientDB in distributed mode. It is run once at this point so that OrientDB asks for a root password and a node name on each server where the script is run. Either prompt can be left blank to have the value generated automatically. Choose meaningful node names, such as location1, location2, and so on.
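The download and extraction steps can be gathered into a small script. This is only a sketch: the version and URL are the ones from the wget command above, and install_orientdb is a hypothetical helper name, not something shipped with OrientDB.

```shell
# Version and URL are taken from the wget command above; adjust as needed.
ORIENTDB_VERSION="3.1.10"
ARCHIVE="orientdb-${ORIENTDB_VERSION}.tar.gz"
URL="https://s3.us-east-2.amazonaws.com/orientdb3/releases/${ORIENTDB_VERSION}/${ARCHIVE}"

# install_orientdb is a hypothetical helper name, not part of orientdb.
install_orientdb() {
    # Download the release archive.
    wget "$URL" || return 1
    # Extract it into orientdb-<version>/.
    tar -xzf "$ARCHIVE" || return 1
    # Enter the directory that holds dserver.sh and the other scripts.
    cd "orientdb-${ORIENTDB_VERSION}/bin" || return 1
}

echo "Would download: $URL"
# To actually run the steps: install_orientdb && ./dserver.sh
```

The function is only defined here, not called, so nothing is downloaded until the last commented line is run by hand.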

...
+---------------------------------------------------------------+
|                WARNING: FIRST RUN CONFIGURATION               |
+---------------------------------------------------------------+
| This is the first time the server is running. Please type a   |
| password of your choice for the 'root' user or leave it blank |
| to auto-generate it.                                          |
|                                                               |
| To avoid this message set the environment variable or JVM     |
| setting ORIENTDB_ROOT_PASSWORD to the root password to use.   |
+---------------------------------------------------------------+

Root password [BLANK=auto generate it]: ***********
Please confirm the root password: ***********
...

+---------------------------------------------------------------+
|         WARNING: FIRST DISTRIBUTED RUN CONFIGURATION          |
+---------------------------------------------------------------+
| This is the first time that the server is running as          |
| distributed. Please type the name you want to assign to the   |
| current server node.                                          |
|                                                               |
| To avoid this message set the environment variable or JVM     |
| setting ORIENTDB_NODE_NAME to the server node name to use.    |
+---------------------------------------------------------------+

Node name [BLANK=auto generate it]: development

# After that, hit ctrl-c to stop orientdb,
# or execute the ./shutdown.sh script in
# the orientdb bin directory to shut it
# down.
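As both banners state, the prompts can be skipped by setting ORIENTDB_ROOT_PASSWORD and ORIENTDB_NODE_NAME before the first run. A minimal sketch, assuming it is run from the bin directory; the password value is a placeholder to replace, and the node name is the one typed in the session above.

```shell
# Placeholder password: replace it with a secure one of your choice.
export ORIENTDB_ROOT_PASSWORD="change-me"
# Node name used in the session above.
export ORIENTDB_NODE_NAME="development"

# Start the server only when the script is present in the current directory.
if [ -x ./dserver.sh ]; then
    ./dserver.sh
else
    echo "dserver.sh not found, run this from the orientdb bin directory"
fi
```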

OrientDB recommends 4 GB of RAM for distributed mode. If memory is limited, the heap size can be changed by editing the script orientdb-version/bin/dserver.sh, for example with nano, using ctrl-w to search for the memory settings, which should look as follows:

# Excerpt from bin/dserver.sh


# ORIENTDB memory options, default to 4GB of heap.

if [ -z "$ORIENTDB_OPTS_MEMORY" ] ; then
    ORIENTDB_OPTS_MEMORY="-Xms4G -Xmx4G"
fi


# Xms is the initial heap size, and Xmx is the
# maximum heap size.
# Replace 4G with, for example, 512M, which means
# 512 megabytes.
# If using nano, hit ctrl-x, followed by the
# y character, followed by enter, to save the
# changes and exit.
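Since the excerpt applies the 4G default only when ORIENTDB_OPTS_MEMORY is empty, exporting the variable before launching should give the same result as editing the script. This is inferred from the excerpt itself, so treat it as a sketch; the 512M value is just the example used in the comments above.

```shell
# Set the heap options before starting the server.
export ORIENTDB_OPTS_MEMORY="-Xms512M -Xmx512M"

# Same check as in the dserver.sh excerpt: the 4G default is applied
# only when the variable is empty, so the value above is kept.
if [ -z "$ORIENTDB_OPTS_MEMORY" ] ; then
    ORIENTDB_OPTS_MEMORY="-Xms4G -Xmx4G"
fi

echo "Heap options in effect: $ORIENTDB_OPTS_MEMORY"
```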

Now it is time to configure clustering between the nodes. The OHazelcastPlugin is configured in config/orientdb-server-config.xml, which states whether the plugin is enabled, where the default configuration file for distributed databases is, for example default-distributed-db-config.json, and where the file that configures cluster membership and the network protocol is, for example hazelcast.xml. The default configuration in orientdb-server-config.xml is sufficient.

<!-- orientdb-version/config/orientdb-server-config.xml -->

<handler class="com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin">
    <parameters>
        <parameter value="${distributed}" name="enabled"/>
        <parameter value="${ORIENTDB_HOME}/config/default-distributed-db-config.json" name="configuration.db.default"/>
        <parameter value="${ORIENTDB_HOME}/config/hazelcast.xml" name="configuration.hazelcast"/>
        <parameter value="development" name="nodeName"/>
    </parameters>
</handler>

To configure cluster membership and the network protocol, edit config/hazelcast.xml on each server.

<?xml version="1.0" encoding="UTF-8"?>
<!-- The config/hazelcast.xml file. -->
<hazelcast
        xsi:schemaLocation="http://www.hazelcast.com/schema/config ...>

    <group>
        <name>orientdb</name>
        <password>orientdb</password>
    </group>

    <properties>
        <property name="hazelcast.phone.home.enabled">false</property>
        ...
    </properties>

    <network>
        <port auto-increment="false">2434</port>
        <join>

            <multicast enabled="false">
                <multicast-group>235.1.1.1</multicast-group>
                <multicast-port>2434</multicast-port>
            </multicast>

            <tcp-ip enabled="true">
                <member>ipaddress</member>
                <member>ipaddress:port</member>
                <member>host</member>
                <member>host:port</member>
                ...
            </tcp-ip>

        </join>
    </network>

    <executor-service>
        <pool-size>16</pool-size>
    </executor-service>

</hazelcast>

The group name and password are the cluster's name and password. Make sure to change them, and choose a secure password.

The network element configures the port and the protocol that the nodes use to find each other. Here auto-increment is disabled. When it is enabled and the specified port is already bound, the next port is tried.

The protocol can be set to multicast, for example when the nodes are on the same local network or on the same PC. In that case nothing else needs to be configured, just make sure that multicast is enabled, as in enabled="true", and that tcp-ip is disabled. Only one of the two, tcp-ip or multicast, can be enabled.

When the nodes are on different or remote networks, or for any other reason, the protocol can be set to tcp-ip. For tcp-ip, specify the IP address or host name of each member, optionally followed by a port number, as in 192.168.0.4:2434. For each server that is to be part of the group, add its details using the member tag.
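When there are several servers, the member entries can be generated rather than typed by hand. The following is a minimal sketch: hazelcast_members is a hypothetical helper, the addresses are made up, and the port is the one from the port element of the excerpt above.

```shell
# Port from the <port> element of the hazelcast.xml excerpt.
HAZELCAST_PORT=2434

# hazelcast_members is a hypothetical helper, not an orientdb or hazelcast tool.
# It prints one <member> line per address or host name given as argument.
hazelcast_members() {
    for host in "$@"; do
        printf '<member>%s:%s</member>\n' "$host" "$HAZELCAST_PORT"
    done
}

# Example addresses, to be replaced by the real servers.
hazelcast_members 192.168.0.4 192.168.0.5
```

The printed lines can then be pasted inside the tcp-ip element of hazelcast.xml on every server.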

With the network protocol configured, it is time to configure database replication, which is done in config/default-distributed-db-config.json. This file is copied, and kept updated, as distributed-config.json in each database folder under orientdb-version/databases.

// The config/default-distributed-db-config.json file .
{
  "autoDeploy": true,
  "executionMode": "undefined",
  "readQuorum": 1,
  "writeQuorum": "majority",
  "readYourWrites": true,
  "newNodeStrategy": "static",
  "servers": {
    "production": "master",
    "development": "replica"
  },
  "clusters": {
    "internal": {
    },
    "*": {
      "servers": ["<NEW_NODE>"]
    }
  }
}
/*
autoDeploy : Automatically deploy the database to nodes which
                do not have it.
executionMode : Default is undefined, to let the client decide.
                If set to asynchronous, an operation is first
                   executed on the local node, before being
                   replicated.
readQuorum : Number of responses that must be coherent, before
                replying to a read operation.
writeQuorum : Number of responses that must be coherent, before
                 replying to a write operation.
              If set to all, all responses must be coherent,
                 if set to majority, n/2 + 1 responses must be
                 coherent, and if set to 1, the quorum is
                 effectively disabled.
readYourWrites : The write quorum is only satisfied if the local
                    node has responded.
newNodeStrategy : Can be set to static or dynamic. If static, a new
                     node is registered as static, if dynamic, a new
                     node is managed as dynamic. When a node is
                     unreachable and the node strategy is dynamic, it
                     does not count toward the quorum.
servers : The role of each server. In this example the server which
             was given the name production is a master, and the one
             given the name development is a replica.
          "servers": {"*": "master"} can be used to state that all
             servers are masters.
          More than one master can be configured. A replica server only
             replicates data, it does not vote in writeQuorum.
clusters : The term cluster in orientdb also means a way of grouping
              records of a certain type, or by a specific value.
           A class is the orientdb way of modeling, so it is the model,
              and it stores its data in records.
           Each class can have multiple clusters, for example to group
              those of its records which are similar.
           A cluster can be physical, stored on disk, or
              in memory.
           So this setting is used to configure clusters.
           As seen in the provided excerpt, the internal cluster is not
              replicated. All other clusters are replicated.
           servers has as a value an array, which is the list of
              servers where cluster records are saved. The special
              value <NEW_NODE> means that new nodes are added
              automatically. An example of using specific node values
              is: ["location1", "location2"].
           Cluster configuration inherits the database settings
              readQuorum, writeQuorum and readYourWrites, and can
              override them.
           Additionally, owner can be used to specify the owner of the
              cluster, as in "owner": "location1". This is called
              static assignment: even if the node is down, a static
              owner is not changed. If no owner is set statically, one
              is chosen dynamically at runtime. */

In the previous example two nodes were specified, one as a master and the other as a replica. Other configuration options are explained in the preceding comment, such as making all nodes masters, which is the default.
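The n/2 + 1 rule given for a writeQuorum of majority can be worked out with integer arithmetic. This is an illustrative sketch only: majority_quorum is a hypothetical helper, and n is assumed to be the number of voting nodes, which according to the comment above excludes replicas.

```shell
# majority_quorum is a hypothetical helper, not an orientdb command.
# $1 = number of voting nodes; prints n/2 + 1 using integer division.
majority_quorum() {
    echo $(( $1 / 2 + 1 ))
}

for masters in 2 3 4 5; do
    echo "masters=$masters majority=$(majority_quorum "$masters")"
done
```

Under this rule two masters give a majority of 2, so both must respond to every write, while three masters give a majority of 2 as well and can therefore tolerate one unreachable master.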

That is it for configuring replication in OrientDB. The servers can now be started using ./dserver.sh. Start the master first, then the replica.

To check that everything is working, create a database on the master server from the console, which can be launched from the bin folder, as in create database remote:localhost/nameOfDb root thePassword. Then open the console on the replica server, issue the command connect remote:localhost/ root thePassword, and run list databases to verify that the created database has been replicated.

If multiple instances of OrientDB run on the same PC, each instance must use different ports. These can be set in orientdb-server-config.xml, for example by opening it with nano config/orientdb-server-config.xml, searching for port with ctrl-w, and choosing the preferred ports.
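The check on the replica can also be scripted. The sketch below assumes that console.sh accepts a list of commands separated by semicolons as a single argument, which should be verified against the console documentation of the installed version. The password is the placeholder used in the text.

```shell
# Placeholder password, as used in the text above.
ROOT_PASSWORD="thePassword"
# Commands from the text, joined with a semicolon (assumed batch syntax).
CHECK_COMMANDS="connect remote:localhost/ root ${ROOT_PASSWORD}; list databases"

# Run the check only when console.sh is present in the current directory.
if [ -x ./console.sh ]; then
    ./console.sh "$CHECK_COMMANDS"
else
    echo "console.sh not found, would run: $CHECK_COMMANDS"
fi
```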

<?xml version="1.0" encoding="UTF-8"?>
<!-- Excerpt from orientdb-server-config.xml file. -->
<listener protocol="binary" socket="default" port-range="2424-2430" ip-address="0.0.0.0"/>
<listener protocol="http" socket="default" port-range="2480-2490" ip-address="0.0.0.0">
    ...
</listener>

<!-- Excerpt from config/hazelcast.xml file .-->
<port auto-increment="false">2434</port>
<!-- instead of changing the port , auto-increment 
     can be set to true for hazelcast.xml .-->
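For a second instance on the same PC, the port ranges can also be rewritten with sed instead of an editor. This is a minimal sketch: set_port_range is a hypothetical helper, and the range 2524-2530 is an arbitrary example, not a value recommended by OrientDB.

```shell
# set_port_range is a hypothetical helper: it reads config text on stdin and
# prints it with the port-range of the chosen listener replaced.
# $1 = listener protocol (binary or http), $2 = new range, e.g. 2524-2530
set_port_range() {
    sed "/<listener protocol=\"$1\"/ s/port-range=\"[0-9-]*\"/port-range=\"$2\"/"
}

# Demo on one listener line copied from the excerpt above.
printf '%s\n' '<listener protocol="binary" socket="default" port-range="2424-2430" ip-address="0.0.0.0"/>' \
    | set_port_range binary 2524-2530
```

On a real installation the function would read config/orientdb-server-config.xml on stdin and write to a new file, which can be reviewed before it replaces the original.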