mongoc_client_get_database(): precondition failed: client

Hi,
I noticed an issue I haven’t seen before and am not sure how to explain it.
Using:
MongoDB 6.0.5
mongo-cxx-driver 3.7.0 (with mongo-c-driver 1.23.0)

In part of the code I am initializing a collection object using the mongocxx driver syntax:
collection collectionPointer = conn[DB_NAME][collectionName];

The conn comes from a connection pool (using acquire()).
When execution reaches the line above, abort() is called on the application.
Looking at the logs, I saw this:
/path/to/source/mongo-c-driver-1.23.0/src/libmongoc/src/mongoc/mongoc-client.c:1339 mongoc_client_get_database(): precondition failed: client
This seems to be failing in the BSON_ASSERT in that function:
BSON_ASSERT (client);

And in the core file I noticed this trace:

(gdb) bt
#0  0x00007f048b149e87 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f048b14b8cb in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f048a84bc12 in mongoc_client_get_database (client=<optimized out>, name=<optimized out>)
    at /path/to/source/mongo-c-driver-1.23.0/src/libmongoc/src/mongoc/mongoc-client.c:1340
#3  0x00007f04900d2fd6 in mongocxx::v_noabi::database::database(mongocxx::v_noabi::client const&, bsoncxx::v_noabi::string::view_or_value) ()
   from /usr/lib/libmongocxx.so._noabi
#4  0x00007f04900be6a8 in mongocxx::v_noabi::client::database(bsoncxx::v_noabi::string::view_or_value) const & () from /usr/lib/libmongocxx.so._noabi
#5  0x000055f10d4cbea0 in mongocxx::v_noabi::client::operator[](bsoncxx::v_noabi::string::view_or_value) const & (this=0x7eff7d2cc494, name=...)
    at /usr/include/mongocxx/v_noabi/mongocxx/client.hpp:406
#6  0x000055f10d4c2f79 in zmongoPutObject (conn=..., <removed args from here but they are all fine>)
    at zios/zmongo/zmongo_utils.cpp:204
#7  0x000055f10d4f6589 in zmongo_put_object (client_pool=0x7f0340189630, <removed args from here but they are all fine>) at zios/zmongo/zmongo_c.cpp:35
<below trace is irrelevant>

I edited the trace a bit to remove internal info, but the arguments arrived in my function just fine.
Also, in frame 5 I can see that DB_NAME is OK as well.

When printing conn in frame 6 I see this:

(gdb) p conn
$1 = (mongocxx::v_noabi::client &) @0x7eff7d2cc494: {_impl = std::unique_ptr<mongocxx::v_noabi::client::impl> = {get() = 0x7eff66612054}}

Not sure if it’s OK or not.

In the frames above that, the mongocxx/mongoc code does not show me the argument values.

At first I thought it was a one-time thing, but then I saw it again a few days later at the same line, crashing the application.
This code path runs many times (millions per day), so it’s not something I can re-create, since it does not happen consistently.

Do you have any insight into this? Any help will be appreciated.

Thanks!

Hi @Oded_Raiches

mongocxx::client is designed to perform a null check before calling mongoc_client_get_database. This check is supposed to throw an “invalid client object” exception if the internal pointer is null. However, this null check is not thread-safe. If the client object is being modified concurrently by another thread, this may cause a race condition that defeats the null check and allows a null pointer to be passed to mongoc_client_get_database. This would explain the unpredictable application crashes as well as the apparent inconsistencies in the resulting stack trace.

The mongocxx::pool::entry object returned by .acquire() is likely being destroyed (which sets the client object’s internal pointer to null) while the entry’s corresponding client object is still being used by another thread. The pool entry object owns the provided client object: the lifetime of the pool entry object returned by .acquire() must be greater than the scope of the provided client object’s use.

Have you ensured that the pool entry object providing the client object remains valid for the duration of the client object’s use?
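
To make the lifetime rule above concrete, here is a minimal sketch that uses toy stand-in types only. ToyClient, ToyPool, Returner and put_object are hypothetical names invented for this illustration; they are not the mongocxx classes and do not reproduce the driver's internals. The sketch assumes a pool entry modeled as a std::unique_ptr whose deleter hands the client back to the pool, and it shows the safe pattern: the entry lives in the same scope that uses the client reference.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Toy stand-in for a client; NOT mongocxx::client.
struct ToyClient {
    std::string last_db;
    void use(const std::string& db) { last_db = db; }
};

// Toy stand-in for a pool; NOT mongocxx::pool.
class ToyPool {
    std::mutex mu_;
    std::vector<std::unique_ptr<ToyClient>> idle_;

    // Called by the entry's deleter to hand the client back.
    void give_back(ToyClient* c) {
        std::lock_guard<std::mutex> lock(mu_);
        idle_.emplace_back(c);
    }

public:
    // Deleter: returns the client to the pool instead of freeing it.
    struct Returner {
        ToyPool* pool;
        void operator()(ToyClient* c) const { pool->give_back(c); }
    };
    using Entry = std::unique_ptr<ToyClient, Returner>;

    explicit ToyPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            idle_.push_back(std::make_unique<ToyClient>());
        }
    }

    // Returns an empty entry when no client is idle (the toy never blocks).
    Entry acquire() {
        std::lock_guard<std::mutex> lock(mu_);
        if (idle_.empty()) return Entry(nullptr, Returner{this});
        ToyClient* c = idle_.back().release();
        idle_.pop_back();
        return Entry(c, Returner{this});
    }

    std::size_t idle_count() {
        std::lock_guard<std::mutex> lock(mu_);
        return idle_.size();
    }
};

// Safe pattern: the entry outlives every use of the reference taken from it.
std::string put_object(ToyPool& pool, const std::string& db) {
    auto entry = pool.acquire();   // the entry owns the client for this scope
    if (!entry) return "no client";
    ToyClient& conn = *entry;      // valid only while `entry` is alive
    conn.use(db);
    return conn.last_db;           // copied out before `entry` is destroyed
}
```

In this toy model, a reference obtained from *entry must not be stored or handed to another thread that may still use it after the entry goes out of scope, because at that point the client is back in the pool and can be acquired by someone else.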

Hi @Rishabh_Bisht
The flow is rather simple:

zmongo_put_object (client_pool=0x7f0340189630, <removed args from here but they are all fine>)
{
	....
	auto entry = client_pool.acquire();
	if (!entry) {
		return some_error;
	}
	... 
	return zmongoPutObject(*entry, <removed args from here but they are all fine>);
}

zmongoPutObject (conn=..., <removed args from here but they are all fine>)
{
	...
	collection collectionPointer = conn[DB_NAME][collectionName];   <-- failure here
	...
}

The zmongo_put_object function and its incoming pool are used by parallel threads here, but that is supposed to be thread-safe, right?
The conn acquired here is never used by parallel threads.
AFAIU, the conn acquired from the pool is released on exiting zmongo_put_object.
Do you see any issue with this? Also, the scope of the entry and its lifetime are surely larger than the use of the conn.

A mongocxx::pool can be used across multiple threads and used to create clients. However, each mongocxx::client can only be used in a single thread.
Unfortunately, given the description so far, there isn’t enough information to determine the root cause.

Please take a look at Connection pools for reference - this might be a helpful resource.

Thanks for the reply @Rishabh_Bisht
Let me try to explain again. The pool in my application is used by multiple threads calling the acquire function, but each acquired client connection is used only within its own thread and is released at the end of the calling function within that thread, just like this:

auto threadfunc = [](mongocxx::client& client, std::string dbname) {
  auto col = client[dbname]["col"].insert_one({});
};

std::thread t1 ([&]() {
  auto c = pool.acquire();
  threadfunc(*c, "db1");
  threadfunc(*c, "db2");
});

std::thread t2 ([&]() {
  auto c = pool.acquire();
  threadfunc(*c, "db2");
  threadfunc(*c, "db1");
});

t1.join();
t2.join();

I have been using it this way for over a year without any issue, but I have hit the above issue 3 times in the past 2 weeks, which makes me suspect it is an issue new to MongoDB 6.0.5.