Introduction:

MongoDB is a very popular open-source scale-out NoSQL database. It powers modern big data analytics applications that require low latency read and writes, high availability, and advanced data management.

Usecases:

Key use cases for MongoDB includes real-time analytics, product catalogs, content management, and mobile applications.

The storage solution with MongoDB offers the following key benefits:

  1. Scalability provided by the storage array, which can modularly scale storage with MongoDB 
  2. Non-Disruptive Operations with high availability to deliver maximum uptime and stable performance, specifically during drive failure.
  3. Predictable high performance with low latency, providing excellent response time for the most demanding analytics applications built on MongoDB.

Summary:

This reference document showcases a validated end-to-end solution design for efficiently deploying a virtualized scale-out MongoDB NoSQL database on the block storage using VMware vSphere Hypervisor Environment.

Note: Storage array hosts MongoDB virtual machines (VMs) and databases.

mongodb-vm

MongoDB Architecture:

Sharding divides the dataset and distributes the data over multiple servers, or shards. Each shard is an independent database. Collectively, the shards make up a single logical database.

Shard stores data. In a production shard cluster, each shard own the responsibility of maintaining a replica set. This strategy facilitates high availability and data consistency.

mongodb-architecture

Storage Architecture:

storage-architecture

Below are the 4 Key Steps to Remember before Benchmarking MongoDB with YCSB Workloads on All-Flash Block Storage.

  1. Server Configurations
  • 2 Fibre Channel 16Gbps ports.
  • ESXi 6.7 Installed on hosts.
  • Fibre Channel switch connectivity.
  1. Storage controller Configuration 
  • 2-node HA pair.
  • 4 Fibre Channel 16Gbps ports.
  • SSD Drives.
  1. Configurations on ESXi Host

1.Created three volumes of each 500G on storage controller.

2.Mapped the three volumes to 3-node ESXi hosts from storage controller.

3.Created 3 VMFS6 Datastores on hosts (i.e one storage volume created as one VMFS6 Datastore)

esxi-host

  1. Created 8 Virtual Machines with 2vdisks (i.e 8 VMs resides in 3 Datastores).
  2. Size of vdisk1-16G Thin Prov (where RHEL OS is installed), vdisk2-500G Thin Prov (used as DB – Data disk)

vdisk

  1. Login to all 8 VMs, created partition on 500G disk (added as vdisk2), mount it on the hosts.
[root@localhost opt]# fdisk /dev/sdb

[root@localhost opt]# mkfs -t ext4 /dev/sdb

mke2fs 1.42.9 (28-Dec-2013)

[root@localhost ~]# mkdir -p /data/db1; mount /dev/sdb /data/db1

[root@localhost ~]# cat /etc/fstab | grep sdb

/dev/sdb1 /data/db1 ext4 defaults 0 0

  1. Configuring MongoDB on VMs:

Prerequisites:

  • 3 RHEL 7 VMs as Config Replica Sets
      • 10.20.178.30      configsvr1
      • 10.20.178.92       configsvr2
      • 10.20.178.96      configsvr3
  • 4 RHEL 7 VMs as Shard Replica Sets
  •               10.20.178.93      shardsvr1
  •               10.20.178.95      shardsvr2
  •               10.20.178.94      shardsvr3
  •               10.20.178.46      shardsvr4
  • 1 RHEL 7 VMs as mongos/Query Router
  •               10.20.178.91      mongos

Step1: Edit the hosts file on each VMs,

[root@configsvr1 ~]# cat /etc/hosts

10.20.178.30 configsvr1

10.20.178.92 configsvr2

10.20.178.96 configsvr3

10.20.178.93 shardsvr1

10.20.178.95 shardsvr2

10.20.178.94 shardsvr3

10.20.178.46 shardsvr4

10.20.178.91 mongos

Step2: Install MongoDB on all instances/VMs

[root@configsvr1 ~]# mongod –version

db version v4.0.9

git version: fc525e2d9b0e4bceff5c2201457e564362909765

OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013

allocator: tcmalloc

modules: none

build environment:

distmod: rhel70

distarch: x86_64

target_arch: x86_64

Step3: Create Config Server Replica Set

Change the DB storage path to your own directory in /etc/mongod.conf.

We will use ‘/data/db1’ on all the VM’s,

storage:
  dbPath: /data/db1

Step4: Create the Shard Replica Sets

Change the default storage to your specific directory in /etc/mongod.conf.

storage:
  dbPath: /data/db1

Step 5 – Configure mongos/Query Router

mongos –configdb “replconfig01/configsvr1:27017,configsvr2:27017,configsvr3:27017”

Step 6 – Add shards to mongos/Query Router

mongo –host mongos –port 27017

sh.addShard( “shardreplica01/shardsvr1:27017”)
sh.addShard( “shardreplica01/shardsvr2:27017”)

sh.addShard( “shardreplica01/shardsvr3:27017”)

sh.addShard( “shardreplica01/shardsvr4:27017”)

mongos> sh.status()

— Sharding Status —

 sharding version: {

“_id” : 1,

“minCompatibleVersion” : 5,

“currentVersion” : 6,

“clusterId” : ObjectId(“5cb8d38dd5a157f0f404f31f”)

}

shards:

{  “_id” : “shardreplica01”,  “host” : “shardreplica01/shardsvr1:27017,shardsvr2:27017,shardsvr3:27017,shardsvr4:27017”,  “state” : 1 }

active mongoses:

“4.0.9” : 1

autosplit:

Currently enabled: yes

balancer:

Currently enabled:  yes

Currently running:  no

Failed balancer rounds in last 5 attempts:  0

Migration Results for the last 24 hours:

No recent migrations

databases:

{  “_id” : “config”,  “primary” : “config”,  “partitioned” : true }

Performance Validation with YCSB:

YCSB is a popular Java open-source specification and program suite developed at Yahoo! to compare the relative performance of various NoSQL databases. Its workloads are used in various comparative studies of NoSQL databases

Workloads Used: A, B, C.

  • Workload A: Update heavy workload: 50/50% Mix of Reads/Writes
  • Workload B: Read mostly workload: 95/5% Mix of Reads/Writes
  • Workload C: Read-only: 100% reads.

The following command was used to run workload A,B & C where threads were 8, 16, 32, and 64:

/root/YCSB/bin/ycsb run mongodb -s -P /root/YCSB/workloads/workloada -p mongodb.url=”mongodb://mongos:27017/ycsb1″ -p operationcount=120000000 -p maxexecutiontime=1800 -threads 8

/root/YCSB/bin/ycsb run mongodb -s -P /root/YCSB/workloads/workloadb -p mongodb.url=”mongodb://mongos:27017/ycsb1″ -p operationcount=120000000 -p maxexecutiontime=1800 -threads 8

/root/YCSB/bin/ycsb run mongodb -s -P /root/YCSB/workloads/workloadc -p mongodb.url=”mongodb://mongos:27017/ycsb1″ -p operationcount=120000000 -p maxexecutiontime=1800 -threads 8

/root/YCSB/bin/ycsb run mongodb -s -P /root/YCSB/workloads/workloadd -p mongodb.url=”mongodb://mongos:27017/ycsb1″ -p operationcount=120000000 -p maxexecutiontime=1800 -threads 8

MongoDB

References:

https://www.howtoforge.com/tutorial/deploying-mongodb-sharded-cluster-on-centos-7/

https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload

https://www.netapp.com/us/media/tr-4600.pdf