We use mongodb for our production database. We are hoping to improve test reliability & isolation by having each test worker on a machine talk to its own db in mongodb.
So, we are trying to spin up 30 identical dbs on our test server. Each db should start out with empty collections, but the collections need to have indexes & validation rules defined on them.
We tried to do this in two ways:
1. A node script, which uses the mongodb nodejs driver to connect to each db and run createIndex and the other setup commands (the per-db setup is sketched just after this list).
2. Preparing a single correctly configured db, dumping it via mongodump, then replicating it to the other 29 dbs via mongorestore.
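For reference, the per-db setup is along these lines (the collection name, validator, and index here are illustrative placeholders, not our real schema); our node script does the equivalent through the driver:

# Illustrative per-db setup; names and validator are placeholders, not our real schema.
mongo test-1 --eval '
  db.createCollection("users", { validator: { email: { $type: "string" } } });
  db.users.createIndex({ email: 1 }, { unique: true });
'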
Both of these approaches fail to parallelize. While doing the setup on a single db takes ~2s, doing it across 30 dbs takes over 40s. In the case of option 1, we were seeing very inconsistent index creation times, ranging from 40s to over 3 minutes. Since the dbs are independent of each other, we expected this to take about as long as a single db, i.e. 2-5s.
It appears that the mongod instance running on the test server is not able to run index creation across several dbs at the same time. Is this a known limitation? Are we misconfiguring mongo in some way that prevents it from running these operations in parallel?
Thanks for your input!
FYI, the dump/restore command we are using is:
# ... set up db test-1 (takes ~2s)
mongodump --archive="test-1" --db=test-1 > test_output/mongodump-log 2>&1
echo "Setting up dbs via mongorestore"
# this takes 40s. Why?
seq 2 $NUM_SERVERS | xargs -L 1 -I % -P $NUM_SERVERS sh -c '\
mongorestore --archive=test-1 test-% --nsFrom="test-1.*" --nsTo="test-%.*" > test_output/mongorestore-test-% 2>&1; \
'
echo "Done setting up dbs"
As a test, I ran this procedure with a mongodump of an empty db (no docs or indexes) and it succeeded nearly instantly, even with 30 mongorestores. So it does seem to be something related to index creation (even for an empty collection).
You do not seem to start mongorestore in the background, so your script is starting 29 mongorestore processes one after the other, with no parallelism at the source. So even if mongod could do the work in parallel, your script is not issuing it in parallel.
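For example, something along these lines (an untested sketch reusing the names from your script) would actually launch the restores concurrently:

for i in $(seq 2 $NUM_SERVERS); do
  # & pushes each mongorestore into the background so they run concurrently
  mongorestore --archive=test-1 --nsFrom="test-1.*" --nsTo="test-$i.*" \
    > "test_output/mongorestore-test-$i" 2>&1 &
done
# wait for all background restores to finish before continuing
wait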
So, we are trying to spin up 30 identical dbs on our test server.
How many mongod processes are we talking about? Do all 30 dbs live in a single mongod instance? If you have 30 mongod instances, are they on a single machine or on separate machines?
If all 30 databases live in a single mongod process, or if you have 30 mongod processes on a single machine, I don’t think you can expect it to run with the same timings as preparing a single database. This is because to create an index, mongod needs to do a collection scan, which involves reading the whole collection into the cache. Multiply this by 30 and you’re hammering the cache and the disk with IO requests. If your disk cannot handle the requests, mongod will be forced to sit idle while waiting for the disk to complete its work. You can check whether the disk is the bottleneck by looking at the iostat numbers and seeing whether the disk is fully utilized during this process.
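For example, in another terminal while the restores are running (assuming the sysstat tools are installed):

# extended per-device stats, refreshed every second; %util pinned near 100
# on the data volume would point to a disk bottleneck
iostat -x 1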
Actually, you can do a small experiment by trying this process with lower parallelization. Say, start with 2 processes running simultaneously and observe the reported timings. Then gradually increase the parallelization until you no longer see a benefit from adding more processes. I would be interested to see at what point the hardware starts to get overworked.
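Something along these lines (a rough, untested sketch based on your mongorestore command) would let you sweep the parallelization level and compare timings:

for P in 1 2 4 8 16 30; do
  # drop the restored dbs so every run starts from a clean slate
  seq 2 $NUM_SERVERS | xargs -I % mongo test-% --quiet --eval 'db.dropDatabase()' > /dev/null
  start=$(date +%s)
  seq 2 $NUM_SERVERS | xargs -L 1 -I % -P $P sh -c \
    'mongorestore --archive=test-1 --nsFrom="test-1.*" --nsTo="test-%.*" > test_output/mongorestore-test-% 2>&1'
  echo "parallelization=$P took $(( $(date +%s) - start ))s"
done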
If you need further help, please post more details:
How much data is in the database, how many collections, and how many indexes
What is your MongoDB version
What is the spec of your OS, hardware, and how you run the mongod process
Please provide as many details as possible so your experiment can be replicated and your timings verified by another person
Hey Kevin, thanks for the reply. We are looking into your suggestions and running some experiments.
A few clarifications:
We’re running all 30 dbs in a single mongod process on a single machine.
The db setup we’re trying to achieve is indexes over empty collections, so I wouldn’t expect a collection scan to be an expensive operation. We looked at iostat and we don’t seem to be running into disk bottlenecks.
We have about 30 (empty) collections, with about 60 indexes total.
We’re using mongo 3.6.3 (same version for the shell and mongod)
We’re running this on an EC2 instance, an m5.8xlarge, so 32 vCPUs and 128 GiB of memory. We’re using Ubuntu, so mongod is started via systemd. We already looked into the ulimit documentation, and systemd is configured to start the service with the recommended settings.
We ran the parallelization experiment, and setting up 30 dbs takes about the same amount of time with parallelization 1 as with parallelization 5 or 30. (This is leading us to suspect that there’s some locking in the mongod process that’s preventing it from creating these dbs in parallel.)
I admit that everything is local, but I get all the databases, all the collections and some indexes within 4 seconds. My system is:
7886MiB System memory
Intel® Core™ i5-2410M CPU @ 2.30GHz
Linux asus-laptop 4.19.36-1-lts #1 SMP
Since the indexes are built in the background, they might not be finished building at the end.
I also tried with more indexes on an Atlas Free Tier, and I could create 30 dbs with 3 collections per db and 2 two-field indexes per collection. It took around 15 seconds.
We tried this approach (creating indexes via the js driver). It seemed to hit the same bottleneck, and I don't doubt that you would find the times increasing if you created 30 collections with 60 indexes per db.
This approach also experiences the same variability in performance, with some runs taking significantly more than 30s.