Aggregation $function works in postman- gives ReferenceError in jest

I have an aggregation function that works fine when using postman. When I test using supertest I get:

MongoServerError: PlanExecutor error during aggregation :: caused by :: ReferenceError: cov_2nu0vpmwxb is not defined :
    @:3:13
    
        at MessageStream.messageHandler (/Users/Ryan.Moore/fls/fls/apps/chops-api/src/node_modules/mongoose/node_modules/mongodb/src/cmap/connection.ts:753:20)
        at MessageStream.emit (node:events:520:28)
        at processIncomingData (/Users/Ryan.Moore/fls/fls/apps/chops-api/src/node_modules/mongoose/node_modules/mongodb/src/cmap/message_stream.ts:168:12)
        at MessageStream._write (/Users/Ryan.Moore/fls/fls/apps/chops-api/src/node_modules/mongoose/node_modules/mongodb/src/cmap/message_stream.ts:65:5)
        at writeOrBuffer (node:internal/streams/writable:390:12)
        at _write (node:internal/streams/writable:331:10)
        at MessageStream.Writable.write (node:internal/streams/writable:335:10)
        at Socket.ondata (node:internal/streams/readable:777:22)
        at Socket.emit (node:events:520:28)
        at addChunk (node:internal/streams/readable:324:12) {
      ok: 0,
      code: 139,
      codeName: 'JSInterpreterFailure'
    }

The aggregation function is:

/** extractResponseFields - extracts the tag from the response field into a new field
 * response will look like:
 *   "response": "Let me get someone to talk to you about your account.<metric name=\"Missing_accounts\"
 *               stage=\"Handover_to_agent_IB\" /> <metric name=\"Speak_to_agent_DB\" stage=\"entrypointIB\" />
 *               <pause wait=2000 /><metric name=\"Speak to agent_DB\" stage=\"context_set_IB\" />
 *               <pause wait=2000 />I'm handing you over to one of my team who'll be happy to help.
 *               They'll be with you as soon as they can.<metric name=\"DB_Speak_to_agent\"
 *               stage=\"in_hours_handover_IB\" /> <pause wait=2000 /><handover skill=\"DS_Banking_IB\" />"
 *
 * So we need to extract and store each of the values of the tag in a new field e.g.
 *   metrics: [ "Missing_accounts", "Speak_to_agent_DB", "Speak_to_agent_DB", "DB_Speak_to_agent" ]
 *   stages: [ "Handover_to_agent_IB", "entrypointIB", "context_set_IB", "in_hours_handover_IB" ]
 *   handoverSkills: [ "DS_General_IB" ]
 *
 * 1. match all records that have '<tag>' in the response field - pass in like stage="
 * 2. create an additional field to hold the tag data
 * 3. merge the new field data into the original collection
 * @param {*} tag - tag to find in response field.
 * @param {*} fieldName - name of field to create.
 * @returns {Object} A distinct list of the values for this field sorted alphabetically.
 * @returns {Integer} The count of items in the list.
 */
const extractResponseFields = async (tag, fieldName) => {
  console.log('extractResponseFields:', tag, fieldName, tag.length)
  const hs = await Conversations.aggregate([
    { // stage 1 - match all records than have '<tag>' in the response field
      $match: {
        response: {
          $regex: tag
        }
      }
    },
    { // stage 2 - create an additional field to hold the tag data
      $set: {
        [fieldName]:
          {
            $function:
            {
              body: function (response, tag) {
                // split the function by the tag - creates an array
                const arr = response.split(tag)
                const tags = []
                for (let i = 1; i < arr.length; i++) { // ignore first element '0' as it is the text preceding tag
                  // take the text up until the next "
                  tags.push(arr[i].split('"')[0])
                }
                return tags
              },
              args: ['$response', tag], // $response comes from collection, tag is the variable passed
              lang: 'js'
            }
          }
      }
    }
    ,
    { // stage 3 - merge the new field data into the original collection
      $merge: { into: { db: 'chops', coll: 'conversations' }, on: '_id', whenMatched: 'merge', whenNotMatched: 'insert' }
    }
  ])
  // return [hs, hs.length]
  //console.log(hs, hs.length)
  // return a distinct collection of the newly created field
  const results = await Conversations.distinct(fieldName).sort()
  return [results, results.length]
}

test is…

  it('GET ' + endpoint + ' - Should return 90 distinct values from stages.', async () => {

    const response = await request(app).get(endpoint)
      .query({
        'metric': 'stage=\"',
        'fieldname': 'stages'
      }).send()
    expect(response.statusCode).toBe(200)
    expect(response.body.page * 1).toBe(1)
    expect(response.body.resultsPerPage * 1).toBe(90)
    expect(response.body.totalCount * 1).toBe(90)
    // sample data
    expect(response.body.extractResponseFields[0]).toBe('1_Ask_another_way')
    expect(response.body.extractResponseFields[7]).toBe('Borrowing')
  })

Every time I run I get the same object reference of ‘cov_2nu0vpmwxb’ in the error
How do I find out what ‘cov_2nu0vpmwxb’ is?

Every time I run I get the same object reference of ‘cov_2nu0vpmwxb’ in the error
How do I find out what ‘cov_2nu0vpmwxb’ is?

@Ryan_Moore assuming your code works correctly when executed via one path (Postman) and not another (Jest), the issue likely lies in the failing caller.

Start with basic troubleshooting such as:

  1. Set a breakpoint on extractResponseFields and step through the function
  2. What is the structure of the “passing” (Postman) pipeline being passed to the Conversations collection?
  3. What is the structure of the “failing” (Jest) pipeline being passed to the Conversations collection?
  4. Compare the two to see how they differ

Since the error is a codeName: 'JSInterpreterFailure' it stands to reason the failing is in the $function body or args.

If a simple diff of the two pipelines doesn’t surface the issue, try running the $function body as Javascript via the mongo or mongosh shell. This may help uncover where the error lies using the values being passed in via Jest.

@alexbevi thank you for your prompt reply
I don’t see any difference in the requests…
postman:

{
   "level":30,
   "time":1649062552717,
   "pid":42502,
   "hostname":"<REDACTED>",
   "req":{
      "id":1,
      "method":"GET",
      "url":"/conversations/extractResponseFields?metric=stage=%22&fieldname=stages",
      "query":{
         "metric":"stage=\"",
         "fieldname":"stages"
      },
      "params":{
         
      },
      "headers":{
         "user-agent":"PostmanRuntime/7.29.0",
         "accept":"*/*",
         "postman-token":"044f4763-d29f-4f5c-a006-1a083886f6f9",
         "host":"localhost:4000",
         "connection":"keep-alive"
      },
      "remoteAddress":"::1",
      "remotePort":64699
   },
   "res":{
      "statusCode":200,
      "headers":{
         "access-control-allow-origin":"*",
         "content-security-policy":"default-src 'self';base-uri 'self';block-all-mixed-content;font-src 'self' https: data:;frame-ancestors 'self';img-src 'self' data:;object-src 'none';script-src 'self';script-src-attr 'none';style-src 'self' https: 'unsafe-inline';upgrade-insecure-requests",
         "x-dns-prefetch-control":"off",
         "expect-ct":"max-age=0",
         "x-frame-options":"SAMEORIGIN",
         "strict-transport-security":"max-age=15552000; includeSubDomains",
         "x-download-options":"noopen",
         "x-content-type-options":"nosniff",
         "x-permitted-cross-domain-policies":"none",
         "referrer-policy":"no-referrer",
         "x-xss-protection":"0",
         "cross-origin-resource-policy":"cross-origin",
         "content-type":"application/json; charset=utf-8",
         "content-length":"2060",
         "etag":"W/\"80c-992goIikOSKmfyNi89tGmdI3DEI\""
      }
   },
   "responseTime":60,
   "msg":"request completed"
}

Jest

 {
   "level":30,
   "time":1649061217898,
   "pid":39606,
   "hostname":"<REDACTED>",
   "req":{
      "id":8,
      "method":"GET",
      "url":"/conversations/extractResponseFields?metric=stage=%22&fieldname=stages",
      "query":{
         "metric":"stage=\"",
         "fieldname":"stages"
      },
      "params":{
         
      },
      "headers":{
         "host":"127.0.0.1:63943",
         "accept-encoding":"gzip, deflate",
         "connection":"close"
      },
      "remoteAddress":"::ffff:127.0.0.1",
      "remotePort":63944
   },
   "res":{
      "statusCode":500,
      "headers":{
         "access-control-allow-origin":"*",
         "content-security-policy":"default-src 'self';base-uri 'self';block-all-mixed-content;font-src 'self' https: data:;frame-ancestors 'self';img-src 'self' data:;object-src 'none';script-src 'self';script-src-attr 'none';style-src 'self' https: 'unsafe-inline';upgrade-insecure-requests",
         "x-dns-prefetch-control":"off",
         "expect-ct":"max-age=0",
         "x-frame-options":"SAMEORIGIN",
         "strict-transport-security":"max-age=15552000; includeSubDomains",
         "x-download-options":"noopen",
         "x-content-type-options":"nosniff",
         "x-permitted-cross-domain-policies":"none",
         "referrer-policy":"no-referrer",
         "x-xss-protection":"0",
         "cross-origin-resource-policy":"cross-origin",
         "content-type":"text/html; charset=utf-8",
         "content-length":"109",
         "etag":"W/\"6d-cqSWWe9Li6uA4lFKYwZhI8qvRAw\""
      }
   },
   "err":{
      "type":"Error",
      "message":"failed with status code 500",
      "stack":"Error: failed with status code 500\n    at ServerResponse.onResFinished (/Users/Ryan.Moore/fls/fls/apps/chops-api/src/node_modules/pino-http/logger.js:77:38)\n    at ServerResponse.emit (node:events:532:35)\n    at onFinish (node:_http_outgoing:830:10)\n    at callback (node:internal/streams/writable:553:21)\n    at afterWrite (node:internal/streams/writable:498:5)\n    at afterWriteTick (node:internal/streams/writable:485:10)\n    at processTicksAndRejections (node:internal/process/task_queues:82:21)"
   },
   "responseTime":25,
   "msg":"request errored"
}

Running the function body as javascript …

var tag = 'stage="'

cursor = db.conversations.find({response: {$regex: 'stage="'}},{_id: 0, response: 1})
while(cursor.hasNext()){
  currentRecord = cursor.next()
  // printjson(currentRecord.response)
  // split the function by the tag - creates an array
  var arr = currentRecord.response.split(tag)
  var tags = []
  for (let i = 1; i < arr.length; i++) { // ignore first element '0' as it is the text preceding tag
    // take the text up until the next "
    tags.push(arr[i].split('"')[0])
  }
  print(tags)
}

Gives me what I would expect

complaint_handover,complaints_Web,Handover_in_hrs_Web
IB_Authenticated_click_here_to_get_started
Question_too_long_2,IB_handover_skill_indentification
DD_Indemnity_handover_Web,Web_handover_skill_known
NOT_Start
Freetype_message_true,Payments

@Ryan_Moore my apologies if my comment wasn’t clear. I wanted to see the differences in the pipeline that is being passed to the MongoDB Server; not the output of the function call.

For example:

const extractResponseFields = async (tag, fieldName) => {
  console.log('extractResponseFields:', tag, fieldName, tag.length)
  const pipeline = [
    { // stage 1 - match all records than have '<tag>' in the response field
      $match: {
        response: {
          $regex: tag
        }
      }
    },
    { // stage 2 - create an additional field to hold the tag data
      $set: {
        [fieldName]:
          {
            $function:
            {
              body: function (response, tag) {
                // split the function by the tag - creates an array
                const arr = response.split(tag)
                const tags = []
                for (let i = 1; i < arr.length; i++) { // ignore first element '0' as it is the text preceding tag
                  // take the text up until the next "
                  tags.push(arr[i].split('"')[0])
                }
                return tags
              },
              args: ['$response', tag], // $response comes from collection, tag is the variable passed
              lang: 'js'
            }
          }
      }
    }
    ,
    { // stage 3 - merge the new field data into the original collection
      $merge: { into: { db: 'chops', coll: 'conversations' }, on: '_id', whenMatched: 'merge', whenNotMatched: 'insert' }
    }
  ]
  // XXX added the following to log the pipeline structure
  console.log(JSON.stringify(pipeline))
  // XXX execute the pipeline
  const hs = await Conversations.aggregate(pipeline)
  // return [hs, hs.length]
  //console.log(hs, hs.length)
  // return a distinct collection of the newly created field
  const results = await Conversations.distinct(fieldName).sort()
  return [results, results.length]
}

@alexbevi
so from test run

[{"$match":{"response":{"$regex":"stage=\""}}},{"$set":{"stages":{"$function":{"args":["$response","stage=\""],"lang":"js"}}}},{"$merge":{"into":{"db":"chops","coll":"conversations"},"on":"_id","whenMatched":"merge","whenNotMatched":"insert"}}]

postman

[{"$match":{"response":{"$regex":"stage=\""}}},{"$set":{"stages":{"$function":{"args":["$response","stage=\""],"lang":"js"}}}},{"$merge":{"into":{"db":"chops","coll":"conversations"},"on":"_id","whenMatched":"merge","whenNotMatched":"insert"}}]

so neither seem to show the 2nd stage?? but the postman run results in the documents being modified to have the ‘stages’ field added

@Ryan_Moore you may need to set a breakpoint at console.log(JSON.stringify(pipeline)) to inspect the value of the pipeline variable. I would assume what is being written to the console would be correct, but the issue may be due to body containing a function definition, which JSON.stringify isn’t handling.

Note that you can likely replace the $function logic with pure aggregation logic. I’m guessing at what the “response” field contains, but here’s a rough example:

db.foo.drop(); db.foo.insert({ response: 'stage="complaint_handover",stage="complaints_Web",stage="Handover_in_hrs_Web"' })
db.foo.aggregate([
{ $project: {
  stages: {
    $map: {
      input: { $slice: [ { $split: [ "$response", "stage=" ] }, 1, { $size: { $split: [ "$response", "stage=" ] } } ] },
      as: "tag",
      in: { $arrayElemAt: [ { $split: [ "$$tag", '"' ] }, 1 ] }
    }
  }
}}
]);

@alexbevi
So the aggregation logic you provided does the same as the function and the pipeline stage now gives…

[{"$match":{"response":{"$regex":"stage=\""}}},{"$set":{"stages":{"$map":{"input":{"$slice":[{"$split":["$response","stage="]},1,{"$size":{"$split":["$response","stage="]}}]},"as":"tag","in":{"$arrayElemAt":[{"$split":["$$tag","\""]},1]}}}}},{"$merge":{"into":{"db":"chops","coll":"conversations"},"on":"_id","whenMatched":"merge","whenNotMatched":"insert"}}]

Thanks for your help

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.