Using pymongo to execute transactions on single-node replica sets

Hello,

We are currently using pymongo 3.12.0, but we want to upgrade to 4.1.1. In _socket_from_server in the newer pymongo version, the read_preference is forced to PRIMARY_PREFERRED when connecting to a replica set with the SINGLE topology type. The comment says that this is according to “the spec”, which states that PRIMARY_PREFERRED should be used when connecting directly to a replSet member.

Transactions require the read_preference to be PRIMARY, but this _socket_from_server method completely overrides any preference we pass.

What this means:
For our single-node sites, transactions fail because the read preference is not PRIMARY.
For our multi-node sites, everything works swimmingly.

We would like to be able to use transactions on our single-node replica sets, but it seems that this is not possible by design.

For context, we use multi-node replica sets on our production and staging environments, but we use single-node replica sets in order to have numerous cheap development/testing environments.
For these environments, we do not care about data integrity or persistence, but we do need them to be able to mimic the feature set available to our production environments.

It would be nice to get some insight into how to resolve this.

Thanks for reporting this issue. I cannot reproduce this error using PyMongo 4.1.1 (or any other version):

>>> client = MongoClient()
>>> client.topology_description
<TopologyDescription id: 62b206b9e17622ce1b043822, topology_type: ReplicaSetWithPrimary, servers: [<ServerDescription ('localhost', 27017) server_type: RSPrimary, rtt: 0.002338583999999977>]>
>>> with client.start_session() as s, s.start_transaction():
...     client.t.t.find_one({}, session=s)
...
{'_id': ObjectId('62b2053d003273b84afb7006')}

Same with a client connected directly to the primary:

>>> client = MongoClient(directConnection=True)
>>> client.topology_description
<TopologyDescription id: 62b20544003273b84afb7007, topology_type: Single, servers: [<ServerDescription ('localhost', 27017) server_type: RSPrimary, rtt: 0.000564334999992866>]>
>>> with client.start_session() as s, s.start_transaction():
...     client.t.t.find_one({}, session=s)
...
{'_id': ObjectId('62b2053d003273b84afb7006')}

Could you provide the code that reproduces the error, including the full traceback?

Here is some code that begins the transaction and a matching traceback.

Code that executes transaction:

async with await AsyncDatabase.instance()._client.start_session() as session:
    # PRIMARY_PREFERRED doesn't seem to be supported for transactions, so use PRIMARY instead
    # The type hint from pymongo doesn't have the enum values inherit from ReadPreference, so we must cast it here.
    return await session.with_transaction(
        execute_transaction, read_preference=cast(ReadPreference, ReadPreference.PRIMARY)
    )

Traceback:

backend/[REDACTED]/services/database_test.py:440: in test_all_or_nothing
    await run_in_transaction("test_all_or_nothing", tx)
backend/[REDACTED]/services/database.py:543: in run_in_transaction
    return await session.with_transaction(
        execute_transaction, read_preference=cast(ReadPreference, ReadPreference.PRIMARY)
    )
..[REDACTED]
backend/[REDACTED]/persistence/base.py:532: in _count
    return await self._get_mongo_collection().count_documents(
../.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/thread.py:58: in run
    result = self.fn(*self.args, **self.kwargs)
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/collection.py:1811: in count_documents
    return self._retryable_non_cursor_read(_cmd, session)
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/collection.py:1817: in _retryable_non_cursor_read
    return client._retryable_read(func, self._read_preference_for(s), s)
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/mongo_client.py:1371: in _retryable_read
    return func(session, server, sock_info, read_pref)
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/collection.py:1806: in _cmd
    result = self._aggregate_one_result(sock_info, read_preference, cmd, collation, session)
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/collection.py:1663: in _aggregate_one_result
    result = self._command(
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/collection.py:272: in _command
    return sock_info.command(
../.virtualenvs/[REDACTED]/lib/python3.10/site-packages/pymongo/pool.py:736: in command
    session._apply_to(spec, retryable_write, read_preference, self)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pymongo.client_session.ClientSession object at 0x10f8fe200>
command = [REDACTED]
is_retryable = False, read_preference = PrimaryPreferred(tag_sets=None, max_staleness=-1, hedge=None)
sock_info = SocketInfo(<socket.socket fd=23, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 59648), raddr=('127.0.0.1', 27017)>) at 4522065392

    def _apply_to(self, command, is_retryable, read_preference, sock_info):
        self._check_ended()
        self._materialize()
        if self.options.snapshot:
            self._update_read_concern(command, sock_info)
    
        self._server_session.last_use = time.monotonic()
        command["lsid"] = self._server_session.session_id
    
        if is_retryable:
            command["txnNumber"] = self._server_session.transaction_id
            return
    
        if self.in_transaction:
            if read_preference != ReadPreference.PRIMARY:
>               raise InvalidOperation(
                    "read preference in a transaction must be primary, not: "
                    "%r" % (read_preference,)
                )
E               pymongo.errors.InvalidOperation: read preference in a transaction must be primary, not: PrimaryPreferred(tag_sets=None, max_staleness=-1, hedge=None)

As you can see, the PRIMARY read preference is being passed to with_transaction, but at some point (actually in _socket_from_server) it is being turned into PRIMARY_PREFERRED, which causes the transaction to fail.

Here is the topology description:

(Pdb) AsyncDatabase.instance()._client.topology_description
<TopologyDescription id: 62b44a11581b0cbf8945e9ae, topology_type: Single, servers: [<ServerDescription ('localhost', 27017) server_type: RSPrimary, rtt: 0.002651116764172912>]>
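
For anyone who wants to check which case a given client falls into, here is a minimal sketch of a helper that reports whether the topology type is Single, as in the description above. The helper name has_single_topology is made up for illustration, and the stand-in object only exists so the snippet runs without a server. With PyMongo you would pass a real MongoClient, whose topology_description exposes topology_type_name. Note that a Single topology can also mean a standalone server, so this is only a rough diagnostic.

```python
from types import SimpleNamespace


def has_single_topology(client):
    """Return True when the client's topology type is reported as Single."""
    return client.topology_description.topology_type_name == "Single"


# Stand-in for a MongoClient (assumption: only the one attribute we read is modelled).
fake_client = SimpleNamespace(
    topology_description=SimpleNamespace(topology_type_name="Single")
)
print(has_single_topology(fake_client))
```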

Thank you for the additional info! I’ve reproduced the bug and opened a ticket for it here: https://jira.mongodb.org/browse/PYTHON-3333

Note that this bug only occurs when using directConnection=True, which is not required for your use case. Instead, your apps can connect without directConnection=True (or with directConnection=False), even to a single-member replica set. For example:

>>> client = MongoClient(directConnection=False)
>>> client.topology_description
<TopologyDescription id: 62bcab02b4fdcaaf57288dfa, topology_type: ReplicaSetWithPrimary, servers: [<ServerDescription ('localhost', 27017) server_type: RSPrimary, rtt: 0.0007047529999795188>]>
>>> with client.start_session() as s, s.start_transaction():
...     client.t.t.count_documents({}, session=s)
...
0
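
If the connection settings come from a URI rather than keyword arguments, the same workaround can be sketched as below. This is only an illustration: the helper name build_replica_set_uri and the replica set name "rs0" are assumptions, so substitute the name your deployment actually uses. The replicaSet and directConnection URI options themselves are standard MongoDB connection string options.

```python
from urllib.parse import urlencode


def build_replica_set_uri(host="localhost", port=27017, replica_set="rs0"):
    """Build a URI that asks for replica set discovery instead of a direct connection."""
    # "rs0" is a placeholder replica set name, not taken from this thread.
    options = {"replicaSet": replica_set, "directConnection": "false"}
    return f"mongodb://{host}:{port}/?{urlencode(options)}"


uri = build_replica_set_uri()
print(uri)  # mongodb://localhost:27017/?replicaSet=rs0&directConnection=false
```

The resulting string can be passed to MongoClient(uri). One thing to keep in mind: in replica set mode the driver connects to the host names listed in the replica set configuration, so those names need to be reachable from the client.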