X-Post: MongoDB Replication Set on Ubuntu 16.04 LTS


A MongoDB Replication Set is a database cluster that provides redundancy in case of a server or database failure. The setup created over the course of this tutorial is a basic set consisting of a primary and a secondary database server.

Furthermore, a delayed / shadow server is implemented as an insurance instance, and an arbiter tackles the dilemma between consistency and availability described in the CAP theorem by computer scientist Eric Brewer.

1. Introduction and Game Plan

1.1 Introduction

In this article, three different kinds of members will be used:

    • Regular MongoServer: A server instance that holds all data stored within the database. The PRIMARY server handles write and read operations, while the SECONDARY server(s) replicate the data from the primary.

    • Shadowed / Hidden MongoServer: This server also holds all data stored within a database, but executes write operations received from the primary instance with a preset delay. If important data has been deleted by a rash or mistyped command, it can still be recovered from the hidden server, where the command has not yet been executed.

    • Mongo Arbiter: In a replication set with an even number of secondary instances that are equally eligible to vote for a new primary, an arbiter is required to create an imbalance in votes. The arbiter itself does not store any data and cannot become the primary of a set. Instead, it is used to break the tie described by the CAP theorem: https://en.wikipedia.org/wiki/CAP_theorem
1.2 Game Plan

MongoDB Cluster
For clarification, the servers involved will be referred to as:

MongoDB Server A mongoA 10.10.10.10
MongoDB Server B mongoB 10.10.10.11
MongoDB Shadow mongoShadow 10.10.10.12
MongoDB Arbiter mongoArbiter 10.10.10.13

Furthermore the replication set will be named: mongoRepSet.

Before continuing, make sure MongoDB 3.4 is installed on each machine involved. MongoDB 3.4 Ubuntu Install Guide

After installing MongoDB, stop the service on all servers for now, as the startup configuration file will be edited momentarily.

2. Replication Servers

The following instructions relate to mongoA, mongoB and mongoShadow, which will each store a replica of your data. If more storage servers are involved, the same procedure applies.

2.1 MongoDB startup configuration

Open the mongo configuration file (located at /etc/mongod.conf) with your text editor of choice.

Change the following lines accordingly:
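The relevant sections of /etc/mongod.conf might look like this (a sketch; the port is the MongoDB default and the replica set name matches the one chosen above):

```yaml
# /etc/mongod.conf (excerpt) — example values
net:
  port: 27017
  bindIp:                 # left blank: the IP is assigned during startup
replication:
  replSetName: mongoRepSet
```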

If ‘bindIp’ is left blank, the instance will automatically bind its IP during startup.
Note: You can change the port to your liking.
If you choose to do so, change the port accordingly in all upcoming steps.

More configuration options

Instructions on how to setup a security key for authorization can be found in the Authorization section below.

Save and close the configuration file.

2.2 Firewall

Since each member of a replication set needs to communicate with all other members, some firewall rules must be added.
In this example, ufw is used to adjust iptables:

mongoA:

mongoB:

mongoShadow:

mongoArbiter:
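As a sketch (assuming the default port 27017), the rules on mongoA could look like the following; the same pattern applies on mongoB, mongoShadow and mongoArbiter, each time allowing the IPs of the other three members:

```shell
# On mongoA: allow the other replica set members to reach mongod
sudo ufw allow from 10.10.10.11 to any port 27017
sudo ufw allow from 10.10.10.12 to any port 27017
sudo ufw allow from 10.10.10.13 to any port 27017
```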

2.3 Replication Set Initiation

Start MongoDB on all servers

Make sure the startup succeeds and the new configuration has been applied
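With systemd on Ubuntu 16.04, starting and checking the service could look like this (a sketch; the log path is the package default):

```shell
# Run on every member
sudo systemctl start mongod
sudo systemctl status mongod            # should report "active (running)"
tail -n 20 /var/log/mongodb/mongod.log  # check for startup errors
```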

On the primary server (here: mongoA) connect to the mongo service
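For example, assuming the default port from above:

```shell
# On mongoA
mongo --host 10.10.10.10 --port 27017
```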

Create a config for the replication
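A configuration document along these lines could be used (a sketch; the one-hour slaveDelay and the member order are example choices):

```javascript
// Replica set configuration — example values
var cfg = {
  _id: "mongoRepSet",
  members: [
    { _id: 0, host: "10.10.10.10:27017" },                     // mongoA
    { _id: 1, host: "10.10.10.11:27017" },                     // mongoB
    { _id: 2, host: "10.10.10.12:27017",                       // mongoShadow
      priority: 0, hidden: true, votes: 0, slaveDelay: 3600 },
    { _id: 3, host: "10.10.10.13:27017",                       // mongoArbiter
      priority: 0, arbiterOnly: true }
  ]
};
```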

Note that mongoShadow, which is supposed to be a hidden member, has been configured with four more variables (priority, hidden, votes and slaveDelay).

Also, mongoArbiter was configured with settings priority and arbiterOnly.

For further information about elections, votes, priorities and delayed members, please see chapter 4.

Initiate the replication set with the config

Check the status of your replication set for errors
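In the mongo shell on mongoA, initiation and the status check might look like:

```javascript
// cfg is the replication config created in the previous step
rs.initiate(cfg)

// After a few seconds, verify the member states
rs.status()
```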

! Connecting all members of a replication set can take a few seconds. However, if the connection cannot be established, check firewall rules, port settings and IPs, and make sure mongo is running on each server !

2.4 Authorization

To avoid unauthorized access to your databases, create users and assign roles. Enable authorization in the configuration file (/etc/mongod.conf) afterwards.

https://docs.mongodb.com/manual/tutorial/create-users/
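As a sketch, a first administrative user could be created like this in the mongo shell on the primary (the user name and role are examples):

```javascript
use admin
db.createUser({
  user: "mongoAdmin",
  pwd: "choose-a-strong-password",
  roles: [ { role: "root", db: "admin" } ]
})
```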

To authorize access from one member of the set to another, create a keyFile, make it read-only and assign its ownership to mongodb.
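For example (assuming the key file lives at /etc/mongod.keyfile):

```shell
# Generate a random key, restrict permissions, hand it to mongodb
openssl rand -base64 756 | sudo tee /etc/mongod.keyfile >/dev/null
sudo chmod 400 /etc/mongod.keyfile
sudo chown mongodb:mongodb /etc/mongod.keyfile
md5sum /etc/mongod.keyfile   # compare this checksum on every member
```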

! Copy the keyfile to each member. Compare checksums to make sure they hold the same information !

Adjust the security section in each config file accordingly, to enable keyFile authorization.
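The security section could then look like this (a sketch, assuming the key file path from the step above):

```yaml
security:
  authorization: enabled
  keyFile: /etc/mongod.keyfile
```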

Restart all servers and login with your newly created user
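For example (the user name is the hypothetical one created above):

```shell
sudo systemctl restart mongod    # on every member
mongo --host 10.10.10.10 -u mongoAdmin -p --authenticationDatabase admin
```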

2.5 Network Compression

If each mongo instance runs on a different server across multiple data centers, all replication data is transferred between them. With limited bandwidth, enabling network compression can lower the traffic this causes.
To enable network compression, add these lines to the configuration file.
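In MongoDB 3.4 this is the net.compression section; snappy is the compressor available in this version:

```yaml
net:
  compression:
    compressors: snappy
```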

Restart all servers afterwards, to apply the new configuration.

3. Creating mongodumps

Usually the mongodump command just needs a host and login credentials to create a database backup file. In a replication set, you want to pass the entire set, including the IP and port of each instance.
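A basic command along these lines could be used (a sketch; user name and output path are examples):

```shell
mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017,10.10.10.12:27017" \
  -u mongoAdmin -p --authenticationDatabase admin \
  --out /backup/dump
```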

When creating a mongodump of a replication set you can pass over extra arguments that influence the behavior of the dump. Attach or replace the following parameters in the basic command above.

Create a mongodump of a specific database only
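For example, to dump only a hypothetical database called myDatabase:

```shell
mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017" \
  --db myDatabase --out /backup/dump
```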

Use the server with the lowest ping to create a backup. If this server is unavailable, continue with the second-best ping
https://docs.mongodb.com/manual/core/read-preference/
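This corresponds to the 'nearest' read preference, e.g.:

```shell
mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017" \
  --readPreference nearest --out /backup/dump
```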

Create a backup not only from the data stored in mongodb, but also from the oplog. The oplog contains a record of the write operations applied to your database.
https://docs.mongodb.com/manual/core/replica-set-oplog/
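For example (note that --oplog dumps all databases and cannot be combined with --db):

```shell
mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017" \
  --oplog --out /backup/dump
```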

Create a zip from the backup

Create an archive from the backup
! This replaces --out in the basic command above !
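As a sketch of both variants:

```shell
# Compressed dump (per-file gzip)
mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017" \
  --gzip --out /backup/dump

# Single archive file — --archive replaces --out
mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017" \
  --archive=/backup/mongoRepSet.archive
```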

This is just a small example of arguments that can be passed to mongodump. See complete list

Create a cronjob if you want to setup timed backups.
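As a sketch, a nightly dump at 02:00 could be scheduled via crontab -e (percent signs must be escaped in crontab entries):

```shell
0 2 * * * mongodump --host "mongoRepSet/10.10.10.10:27017,10.10.10.11:27017" --gzip --archive=/backup/mongoRepSet-$(date +\%F).archive
```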

4. Elections, Votes, Priorities

As explained in the introduction, there are two basic types of replication set members: 1 Primary and 1…n Secondary.

All members ping each other once every two seconds. If a ping is not answered within ten seconds, the other members declare that member ‘inaccessible’. While this has only a low impact if a secondary is affected, an inaccessible primary server would soon lead to a no-read and no-write database if not handled properly.

To crown a new primary member, chosen from all eligible secondary members, an ‘election’ is initiated. The secondary member with the highest priority is converted to the primary. Since shadow databases (hidden) and arbiter instances (empty) are not supposed to become a primary member ever, they have priority(0). Priorities can be adjusted in the replication set configuration before initiation or during runtime.

After each secondary member has proposed itself for the primary role by offering its priority value, votes are cast. Now imagine the following scenario:
In one of the replication sets there are one primary and two secondary servers. All three of them have priority(1) and also vote(1). The primary server becomes inaccessible. An election is initiated and the remaining two servers each vote for themselves, since they know they have the highest priority in the set. This leads to a deadlock, because no clear winner can be found. The election will restart eventually, but the outcome will remain the same.
To avoid these situations, arbiter instances are used. While arbiters are not eligible to become a primary themselves (remember: priority(0)), they still have vote(1) to cast. Due to the odd number of votes, a clear winner will be found and the chosen secondary will become the primary.

Note: The default priority value for each member is 1. It can be set between 0 and 1000.

After restoring a former primary member to a working state, it will stay in secondary mode until a new election is initiated and its vote wins.
To force a new election after restoring the primary member, run
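In the mongo shell on the current primary:

```javascript
rs.stepDown()
```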

This forces the current primary member to step down and will start a new election.
https://docs.mongodb.com/manual/core/replica-set-elections/
https://docs.mongodb.com/manual/tutorial/adjust-replica-set-member-priority/
https://docs.mongodb.com/manual/core/replica-set-priority-0-member/
https://docs.mongodb.com/v3.4/reference/method/rs.stepDown/

5. Robo 3T vs. Studio 3T

If you like working with a graphical user interface to run queries, there is good news for users of both Robo 3T and Studio 3T.
Replication sets are supported by default!

Robo 3T:
Formerly known as ‘Robomongo’, Robo 3T is a free MongoDB GUI with an embedded shell. download

Create a new connection.
Select ‘Replica Set’ from the drop-down menu at the top.
Add all members of your replication set.

Save and connect.


Studio 3T:
Studio 3T is a fully featured IDE for MongoDB professionals, featuring a Visual Query Builder, SQL database import and export, IntelliShell and much more. download

You can use a restricted version for free for private usage, but have to pay for a license if you want to use all of its features in a business environment.

Select ‘Replica Set or Sharded Cluster’ from the drop-down menu.
Add your first member of your replication set to the list.
Use the ‚Discover‘ Button to the right to add all remaining members of the replication set to the list automatically.

Save and connect.

Note: Alternatively you can feed Studio 3T with a mongodb link in the form:
mongodb://10.10.10.10:27017,10.10.10.11:27017,10.10.10.12:27017/dbname?replicaSet=mongoRepSet


 

Please Note!

After some testing, it looks like Robo 3T tries to reach each member of a replica set individually via port 27017, even if you were using an SSH tunnel when creating the connection. The connection will eventually time out if you do not allow access to port 27017 on each server from clients other than the replication set members.

Studio 3T seems to be smarter and uses the replication set route to connect to its members. This means you can access the mongo instances of the replica set even if they are reachable on port 27017 exclusively by another set member.



About the author

Florian Kriegel

Florian Kriegel studies Mobile Computing at Hochschule Hof. Already during his school days he developed and maintained mobile solutions for local companies. Since April 2016, Florian has been part of groupXS Solutions GmbH as a Software Developer, supporting the team both in programming and in marketing. https://florian-kriegel.de/ https://florian-kriegel.de/blog/
