GraphQL query with custom resolver for handling nested objects fails at random

Hi
I am fairly new to MongoDB and I have a strange issue when trying to update an account object nested under a user object, using a custom resolver. Executing the query remotely (e.g. from Postman) fails 1 in 3 times, while executing it through the Realm UI always succeeds and is 10 times faster.


Setup:

The ‘Users’ collection holds user objects that look like this:

{
  _id:...
  externalId:...
  account: {
    ...
    ...
  }
}

I wanted to be able to update one or more fields in the nested account object without having to fetch the object first, build a new account object, and then use a standard UpdateOneUser resolver to perform the operation.

I built a custom resolver:

exports = async function ({ externalId, accountPartial }) {
  const cluster = context.services.get("mongodb-atlas");
  const users = cluster.db("main-db").collection("Users");
  const keys = Object.keys(accountPartial);
  const user = await Promise.all(keys.map(key => {
    const obj = {};
    obj[`account.${key}`] = accountPartial[key];
    return users.findOneAndUpdate(
      { externalId: externalId },
      { $set: obj },
      { returnNewDocument: true }
    );
  }));
  return user[0];
}

and I am calling it via the following mutation:

mutation updateUserAccountPartial($externalId: String!, $accountPartial: InputAccountPartial!) {
  user: custom__userAccountPartialUpdate(
    input: {externalId: $externalId, accountPartial: $accountPartial}
  ) {
    _id
  }
}

Results:

Failed request

There is no problem whatsoever when executing the resolver directly or via the GraphQL UI in the dashboard. When I try to call it via Postman however, on average 1 out of 3 times I get an error:

{
	"message": "failed to update documents: retry count exhausted",
	"locations": [
		{
			"line": 1,
			"column": 116
		}
	],
	"path": [
		"user"
	]
}

and in the GraphQL log I have:
Status: OK
Runtime: ~4000ms

Rule Performance Metrics:

{
  "main-db.Users": {
    "roles": {
      "server": {
        "matching_documents": 202,
        "evaluated_fields": 0,
        "discarded_fields": 0
      }
    },
    "no_matching_role": 0
  }
}

I have only 2 documents in the Users collection, so a matching_documents value of 202 looks strange.

Successful request

For a successful operation, the GraphQL logs are:
Status: OK
Runtime: ~2000ms

Rule Performance Metrics:

{
  "main-db.Users": {
    "roles": {
      "server": {
        "matching_documents": 123,
        "evaluated_fields": 0,
        "discarded_fields": 0
      }
    },
    "no_matching_role": 0
  }
}

Problem summary

When I run the mutation from the GraphQL UI or run the custom resolver function directly, the runtime is approximately 300ms, and it always succeeds.

Could anyone help me figure out:

  1. Why are the execution times so long?
  2. Why do the requests fail at random?
  3. Why is the matching_documents value so large?
  4. How can I fix it?

I will be very grateful for all help.

I would guess that updating nested fields via custom resolvers should be a pretty standard practice. Else, one would have to always make two requests instead of one. Did anyone encounter a similar scenario?
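
One variant that might be worth trying is sketched below. It assumes the failures come from the parallel findOneAndUpdate calls all writing to the same document at once; that is only a guess and nothing in the logs above confirms it. The sketch collects every key into a single $set so that only one update is issued per request. The names buildAccountSet and updateAccountPartial, and the sample fields plan and seats, are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: turns { plan: "pro", seats: 5 } into
// { "account.plan": "pro", "account.seats": 5 } for use with $set.
function buildAccountSet(accountPartial) {
  const set = {};
  for (const [key, value] of Object.entries(accountPartial)) {
    set[`account.${key}`] = value;
  }
  return set;
}

// Same inputs as the resolver above, but a single findOneAndUpdate call
// instead of one call per key. `context` is provided by the Realm runtime.
async function updateAccountPartial({ externalId, accountPartial }) {
  const cluster = context.services.get("mongodb-atlas");
  const users = cluster.db("main-db").collection("Users");
  return users.findOneAndUpdate(
    { externalId },
    { $set: buildAccountSet(accountPartial) },
    { returnNewDocument: true }
  );
}

// In the Realm function editor this would be exposed as:
// exports = updateAccountPartial;
```

With this shape the resolver returns the updated document directly rather than the first element of an array of results. An empty accountPartial would produce an empty $set, which some server versions reject, so a guard for that case may be needed.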

I have the same questions, as we are experiencing the same error. However, we do NOT use GraphQL. For whatever reason, we get the error when attempting to update a document by _id using Realm Web (it fails with retry count exhausted and "matching_documents": "101"). This suggests the issue is not specific to GraphQL.


Thank you for your input, @Jason_Louro! I agree, it does not look like a GraphQL-specific issue.

Hi @_alex and @Jason_Louro - have you received any more info on this, or made any more discoveries yourselves?

I ask because I believe I’m seeing the same or a very similar issue. I have an Atlas Realm Function that performs an updateMany() on our collections (it is ultimately called by our web app via a GraphQL mutation Custom Resolver). This updateMany() sets nested document data positionally in arrays (hence not using the OOTB updateMany GraphQL APIs). It executes in under a second when run from the Atlas Realm Function execution UI or the GraphiQL UI, and its execution time is quite consistent and fast whether it updates 10, 1k, or 2-3k documents.

However, when run via Postman or our web app (making the GraphQL mutation call) - it often hits the 90s GraphQL Function timeout and fails with a 503 error; and this does seem to be dependent on the number of documents being updated by the updateMany() within my Function. Any additional info would be greatly appreciated (if you have any) - thanks much in advance!