Configuring MongoDB with Hitachi VSS block

By Prashant Singh posted 10-31-2023 13:50

MongoDB is a popular document-oriented NoSQL database used for high-volume data storage. It offers high performance, scalability, and flexibility, and is designed to overcome the limitations of the relational database approach and of other NoSQL solutions. However, to achieve optimal performance and availability, a well-configured storage system at the back end is important. Hitachi Virtual Storage Software block (VSS block) is a software-defined storage solution from Hitachi Vantara that offers advanced data management features, including snapshots and data migration.

In this blog, we’ll explore how to configure MongoDB with Hitachi VSS block storage and how to run storage performance tests with the popular benchmarking tool Yahoo! Cloud Serving Benchmark (YCSB). Note that in this scenario, we used standalone database servers instead of clustering.

Note: When deploying MongoDB as a cluster, particularly in a production environment, you can use a Replica Set, as explained here.

In addition, we’ll review the client-side and storage-side throughput, latency, and CPU utilization recorded during the benchmarking tests.

The following diagram shows how the concepts of a NoSQL database (in this scenario, MongoDB) correspond to those of an SQL database:

Figure 1: SQL versus NoSQL database correlation

Now, let’s go through the six steps required to get the MongoDB database installed, configured, and ready for use on a Hitachi VSS block storage system. Then we’ll run storage performance tests using the YCSB tool and review the results.

Note: Six client nodes (RHEL 8.1/MongoDB v4.4.18) were used for testing with Hitachi VSS block version 1.10.01. These steps are also valid for the latest available version: VSS block 1.12.

Client node specs: 192 GB RAM, 64 CPU cores

Storage node specs: 116 GB RAM, 14 CPU cores

Step 1: Planning your MongoDB deployment

Before configuring MongoDB with Hitachi VSS block, it's important to plan your MongoDB deployment. Planning helps you determine the appropriate storage capacity and performance requirements by considering the following:

· Determine the number of users.

· Determine your data model and schema design.

· Estimate your data size to determine the size of the database.

· Consider your expected workload.

· Determine your hardware requirements.

· Decide on your deployment architecture.

· Select your hosting platform.

· Determine your backup and recovery strategy.

· Consider security.

For additional considerations, see the MongoDB webpage.
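As a rough illustration of the data-size estimate in the list above, the following sketch computes the raw size of a YCSB-style dataset. It assumes the YCSB core workload defaults of 10 fields of 100 bytes per record, and it ignores keys, BSON overhead, indexes, and WiredTiger compression, so treat the result as a starting point rather than a sizing rule. The function name is hypothetical.

```shell
# Rough raw-data estimate (in GB) for a YCSB-style dataset.
# Assumption: YCSB core workload defaults of 10 fields x 100 bytes per record.
# Ignores keys, BSON overhead, indexes, and storage-engine compression.
estimate_raw_gb() {
    records=$1
    fields=${2:-10}
    field_bytes=${3:-100}
    echo $(( records * fields * field_bytes / 1000000000 ))
}
```

For example, `estimate_raw_gb 250000000` gives the raw size for the 250M-record load used later in this post, which you can compare against the capacity of the provisioned volumes.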

Step 2: Installing MongoDB

The next step in configuring MongoDB with Hitachi VSS block storage is to install MongoDB on your system. Download MongoDB from the official website and follow the installation instructions.

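MongoDB is available in many versions; however, these tests were run on v4.4.18. Therefore, the supporting software described here relates to MongoDB v4.4.18.

To install MongoDB from the binary tarball, run the following commands on every client node (in this scenario, RHEL 8). Note that the archive name shown in these commands is for 4.0.5; substitute the archive for the version you are deploying (v4.4.18 in these tests):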

cd /opt

wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-4.0.5.tgz

tar -zxvf mongodb-linux-x86_64-4.0.5.tgz

ln -s mongodb-linux-x86_64-4.0.5 mongodb

useradd mongod

mkdir -p /var/lib/mongo

chown -R mongod:mongod /opt/mongodb*

chown -R mongod: /var/lib/mongo
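Because the archive name and the target version can easily drift apart, it may be worth confirming which version was actually unpacked. The following is a minimal sketch, assuming that `mongod --version` prints a first line of the form `db version v4.4.18`; the helper name is hypothetical.

```shell
# Extract the version number from `mongod --version` output read on stdin.
# Assumption: the output contains a line such as "db version v4.4.18".
parse_db_version() {
    sed -n 's/^db version v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

A possible use is `/opt/mongodb/bin/mongod --version | parse_db_version`, comparing the result with the version you intended to install on each client node.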

Step 3: Configuring MongoDB

After MongoDB is installed, you must configure it for optimal performance, which includes setting the correct values in the MongoDB configuration files under /etc. For detailed instructions on configuring MongoDB on a Red Hat 8 server, see the official documentation. For higher storage throughput, we used standalone MongoDB database servers instead of high-availability features such as clustering or sharding.

After the MongoDB packages are installed, create and modify the following files and directories:

/etc/mongod.conf: the default configuration file of MongoDB (the default bind IP is the localhost address 127.0.0.1, and the default port is 27017).

/var/lib/mongo: the default data directory of MongoDB (to be created).

/var/log/mongodb/mongod.log: the default log file of MongoDB.

1. To point MongoDB at the filesystem created on top of the volumes provisioned through Hitachi VSS block, modify the following entries in the /etc/mongod.conf file:

vi /etc/mongod.conf

storage:
  dbPath: "/var/lib/mongo"
  journal:
    enabled: true
net:
  port: 27017
  bindIp: "127.0.0.1"

2. Disable SELinux enforcement and flush the firewall rules, then start and enable the MongoDB service, and update the environment variables by using the following commands:

$ sudo setenforce 0

$ sudo iptables -F

$ sudo firewall-cmd --reload

$ sudo systemctl start mongod

$ sudo systemctl enable mongod

$ sudo systemctl status mongod

vi /root/.bash_profile

Before the “export PATH” line, add the following lines to the file:

##mongodb

PATH=$PATH:/opt/mongodb/bin

3. To confirm that MongoDB is correctly installed and functional, start the Mongo Shell using the following command:

# mongo

After a successful installation, the Mongo shell version is displayed, and a shell opens that accepts standard MongoDB commands.

Step 4: Configuring Hitachi VSS block

Hitachi VSS block offers advanced data management features that help you ensure high availability and data protection. After configuring MongoDB, you must configure Hitachi VSS block. This includes creating a Protection Domain, Fault Domain, Storage pool, and Volumes. Then, you must map the volumes accordingly on the compute nodes. In this scenario, volumes were created from the storage pool (created on top of the physical drives allocated to the storage nodes). On each client node, Linux LVM was used to create LVs on top of the volumes provisioned through Hitachi VSS block.

pvcreate <Name_of_all_the_mpath_disks> (for example, pvcreate /dev/mapper/mpathlv….)

vgcreate <Name_of_the_Volume_Group> <Name_of_all_Physical_Volumes_with_space_in_between>

lvcreate -i <total_number_of_volumes/PVs> -I 4M -L <Size_based_on_provisioned_storage>G -n <Name_of_the_Logical_Volume> <Name_of_the_Volume_Group>

mkfs.xfs /dev/<Name_of_the_Volume_Group>/<Name_of_the_Logical_Volume>
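To make the relationship between the physical volumes and the stripe count in the lvcreate step concrete, here is a small dry-run sketch. It only prints the command and does not run it. The function name and the example volume group, logical volume, and device names are hypothetical; the `-i`, `-I 4M`, `-L`, and `-n` options are the ones used in the step above.

```shell
# Print (do not run) an lvcreate command that stripes across every PV passed in.
# Usage: build_lvcreate <vg> <lv> <size_gb> <pv> [<pv> ...]
build_lvcreate() {
    vg=$1
    lv=$2
    size_gb=$3
    shift 3
    stripes=$#   # one stripe per physical volume, matching the -i value above
    echo "lvcreate -i $stripes -I 4M -L ${size_gb}G -n $lv $vg"
}
```

For instance, `build_lvcreate vg_mongo lv_mongo 500 /dev/mapper/mpatha /dev/mapper/mpathb /dev/mapper/mpathc` prints an lvcreate line with `-i 3`, which you can review before running it as root.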

We used the /var/lib/mongo directory to mount the XFS filesystem created on top of the Hitachi VSS block volumes. The MongoDB database is created and stored in this location.

mkdir -p /var/lib/mongo

mount /dev/<Name_of_the_Volume_Group>/<Name_of_the_Logical_Volume> /var/lib/mongo

Step 5: Integrating MongoDB with Hitachi VSS block

After configuration, you must integrate MongoDB and Hitachi VSS block. To do this, set the MongoDB data directory in the /etc/mongod.conf file to the mount point of the filesystem created on top of the volumes provisioned through Hitachi VSS block, so that the MongoDB data is stored on VSS block storage at the back end.

storage:
  dbPath: "/var/lib/mongo"
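One way to double-check this integration is to read the data directory back out of the configuration file and confirm which filesystem backs it. The following is a minimal sketch with a hypothetical helper name; it assumes that the key and its value sit on one line and that the path contains no spaces.

```shell
# Print the dbPath value from a mongod.conf-style YAML file.
# Simplification: assumes "dbPath:" and its value are on one line
# and that the path contains no spaces.
get_dbpath() {
    awk '$1 == "dbPath:" { gsub(/"/, "", $2); print $2; exit }' "$1"
}
```

Running `df -h "$(get_dbpath /etc/mongod.conf)"` should then list the logical volume created in Step 4 rather than the root filesystem.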

Step 6: Testing and Optimizing

After configuring MongoDB with Hitachi VSS block storage, it’s important to test the deployment and optimize it for performance. In this scenario, we used the YCSB benchmarking tool to generate load against MongoDB and measure the performance.

We implemented automation to run all workloads and simultaneously collect NMON and SAR reports from the client as well as the storage nodes. This was later used for data analysis and performance reporting.

Installing YCSB

Access the location where you want to install YCSB.

curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz

tar xfvz ycsb-0.17.0.tar.gz

Running YCSB

Access the location where you installed YCSB.

A YCSB test consists of the following two phases:

1. Loading the database.

2. Running operations on the database.

Both phases are part of every YCSB test run.

The following are the sample commands, one each for load and run operations:

load: Run the load phase

./bin/ycsb.sh load mongodb -threads 64 -P workloads/workloada -p recordcount=250000000 -s | tee /results/moat/MongoDBTesting/HLG_throughput/ALoad_64T_Test_250M_HLG${node}.txt 2>/dev/null &

run: Run the transaction phase

./bin/ycsb.sh run mongodb -s -threads 64 -P workloads/workloada -p operationcount=150000000 -p recordcount=150000000 | tee /results/moat/MongoDBTesting/HLG_throughput/ARun_64T_Test_150M_HLG${node}.txt 2>/dev/null &

Note: 64 threads and a 150M record count were used to obtain the best throughput on the client and storage ends.
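The sample commands above cover workload A only. The following sketch shows one way the same pattern could be generated for the other YCSB core workloads (workloada through workloadf). It is not the automation used for these tests; the function name is hypothetical, and it only prints the commands so that they can be reviewed before anything is run.

```shell
# Print (do not run) the YCSB load and run commands for one core workload.
# Usage: ycsb_cmds <workload_letter> <threads> <recordcount> <operationcount>
ycsb_cmds() {
    wl=$1
    threads=$2
    records=$3
    ops=$4
    echo "./bin/ycsb.sh load mongodb -s -threads $threads -P workloads/workload$wl -p recordcount=$records"
    echo "./bin/ycsb.sh run mongodb -s -threads $threads -P workloads/workload$wl -p recordcount=$records -p operationcount=$ops"
}
```

A loop such as `for wl in a b c d e f; do ycsb_cmds "$wl" 64 150000000 150000000; done` prints the full set, which can be redirected to a script and combined with `tee` to capture the results as in the samples above.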

The following screenshots show the details of throughput values on the client and storage end:

MongoDB Client Nodes throughput:

The data suggests that VSS block (1.10 on VMware ESXi) performs several times better for read-intensive or read-only operations than for write-intensive workloads. The latency values for read-intensive and write-intensive workloads reflect the same pattern. This shows that VSS block can sustain large read workloads (100M record count) with ease.

This data is for VSS block (1.12 on bare metal). The record and operation counts were kept low (50M for load, 30M for run) while running the workloads. An important point to note is that with a lower record count, the latency values decreased significantly. The disk cleanup time on the client side (needed before a fresh retest) is also significantly shorter in a bare-metal setup than in a VMware setup.

Hitachi VSS block Storage Nodes throughput:

The storage throughput is consistent with the client throughput data. As the read percentage of a workload increases, the storage generates more IOPS, resulting in more ops/sec on the client side. During the workload F run, which is a 50-25-25 mix of READ-WRITE-UPDATE, the client and the storage nodes reach their maximum CPU utilization (nearing the 70% threshold).
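To compare client-side numbers across runs like the ones discussed above, the throughput and average latency can be pulled out of the saved YCSB result files. The following is a minimal sketch with a hypothetical function name; it assumes YCSB's summary lines of the form `[SECTION], Metric, value`, for example `[OVERALL], Throughput(ops/sec), ...` and `[READ], AverageLatency(us), ...`.

```shell
# Print overall throughput and per-operation average latency from a YCSB result file.
# Assumption: summary lines have the form "[SECTION], Metric, value".
ycsb_summary() {
    awk -F', ' '
        $1 == "[OVERALL]" && $2 == "Throughput(ops/sec)" {
            print "throughput_ops_sec=" $3
        }
        $2 == "AverageLatency(us)" {
            gsub(/\[|\]/, "", $1)          # "[READ]" becomes "READ"
            print tolower($1) "_avg_latency_us=" $3
        }
    ' "$1"
}
```

Calling `ycsb_summary` on each of the files written by `tee` in the earlier commands gives one short block per run, which is easier to compare than the full YCSB output.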

Summary

In summary, configuring MongoDB with Hitachi VSS block provides a storage back end whose performance can be measured and tuned using the popular benchmarking tool YCSB. The tests can be tuned by varying the thread count, record count, and operation count to obtain the best overall throughput. We noted that increasing the thread count increases latency on the client side, and vice versa. Additionally, with a lower record count, a higher cache hit rate can lead to higher ops/sec with lower latency on the client side, but lower IOPS on the storage side. So, sizing the record count and thread count correctly is the key.
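The trade-off between thread count and latency can be reasoned about with a simple closed-loop approximation: if each client thread waits for one operation at a time, throughput is roughly the thread count divided by the average latency. The sketch below applies that approximation. The function name is hypothetical and the input values in the example are illustrative, not measurements from these tests.

```shell
# Closed-loop estimate of throughput (ops/sec) from thread count and average latency.
# Assumption: each thread issues one operation at a time and waits for it to finish.
estimate_ops_per_sec() {
    threads=$1
    avg_latency_us=$2
    echo $(( threads * 1000000 / avg_latency_us ))
}
```

For example, `estimate_ops_per_sec 64 2000` and `estimate_ops_per_sec 128 4000` give the same result, which illustrates why adding threads stops helping once latency grows in proportion.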

Reference Links

https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-red-hat/

https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads#running-the-workloads

https://knowledge.hitachivantara.com/Documents/Storage/Virtual_Storage_Software_Block

https://linuxconfig.org/how-to-install-mongodb-on-redhat-8

https://www.linuxtechi.com/how-to-install-mongodb-rhel-centos/

https://www.mongodb.com/blog/post/performance-testing-mongodb-30-part-1-throughput-improvements-measured-ycsb

https://www.msystechnologies.com/blog/4-key-steps-to-remember-before-benchmarking-mongodb-with-ycsb-workloads-on-all-flash-block-storage/

