Prioritizing the documents from two collections

I want to use Java MongoDB aggregation to combine two collections and prioritize the records based on following rules in two collections.

Please suggest some code snippet on how to achieve this.I’ve used MongoDB play ground toachive the result but looking at some code snippet to improve better.

Prioritization Rules
1.Urgent
2.Aged Days
3.udlStatus
4.contentReceivedTimestamp
5.percentExecuted

LetterTemplateCollection

{
      "lob": "AUTO",
      "templateId": "RT-000004",
      "remediationID": "CR-100002",
      "templateContent": "CUSTOMER_NAME_1",
      "templatePrinter": "ASVS",
      "templateStatus": "Approved",
      "urgent": "Yes",
      "agedDays": 102
    }

ContentCollection
{
      "contentReceivedTimestamp": "2024-01-02T04:48:02.582225",
      "templateId": "RT-000004",
      "lob": "AUTO",
      "remediationID": "CR-100002",
      "remediationTransactionID": "128",
      "remediationType": "Content",
      "contentStatus": "Loaded",
      "udlStatus": "Rejected-Ready to Reprint",
      "percentageExecuted": 95,
      "accountInfo": {
        "accountID": "9230028467",
        "accountType": "AUTOLN"
      },
      "remediationContentInfo": {
        "phaseNumber": "1",
        "remediationActionType": "LTR",
        "partySequenceNumber": "1",
        "checkPayeeName": "John Wright",
        "financialImpactAmount": "100",
        "interestAmount": "300",
        "loufAmount": "0",
        "minPayDiffAmount": "0",
        "rollupAccounts": "4.50"
      },
      "customerInfo": {
        "partyID": "779622269",
        "partyTypeID": "SSN",
        "impactType": "OTHER",
        "ssn": "779622269",
        "ecn": "",
        "epic": "CUST",
        "partySequence": "1",
        "customerDetails": {
          "firstName": "Joesph",
          "middleName": "Ross",
          "lastName": "Biden",
          "suffix": "Test2",
          "dateOfBirth": "06-20-1981",
          "email": "test@gmail.com",
          "phone": "612-382-9723"
        },
        "addressDetails": {
          "addressType": "MAIL",
          "addressLine1": "1234",
          "addressLine2": "Mint Street",
          "addressLine3": "Test2",
          "city": "Bridgville",
          "state": "PA",
          "zip5": "15017",
          "zip4": "1534",
          "country": "US"
        }
      }
    }

Hi @Anil_kumar_Narreddi and welcome to the MongoDB Community forum. :wave:

To help you better, we would need certain information with respect to the above request.

  1. What do you mean when you refer to improve better? What are the metric that you are trying to improve?
  2. What is the aggregation query that you are trying to use ?
  3. What is the collection size for both the above mentioned collections?
  4. Do you have any index defined in the above collections?
  5. Can you also share the explain output for the aggregation query you are using?

Answering the above questions would help us identify the gaps and help the community to triage better.

Regards
Aasawari

2 Likes

Hi,

Please provide a dataset that represents your issue without any un-necessary fields and the expected output.
Without more details, I would say that you are looking for a sort stage at the end of your aggregation pipeline.

Cheers,
Maxime.

Please find my comments inline and the aggregation query below

  1. What do you mean when you refer to improve better? What are the metric that you are trying to improve?
    –To prioritize the records in the collection based on the rules
  2. What is the aggregation query that you are trying to use ?
    Pasted below
  3. What is the collection size for both the above mentioned collections?
    It can be large example to start with 100k records in the collection
  4. Do you have any index defined in the above collections?
    Only one index for now recordId
  5. Can you also share the explain output for the aggregation query you are using?
    [
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2024-01-02T04:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “128”,
    “udlStatus”: “Rejected-Ready to Reprint”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2019-02-22T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “229”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2020-06-14T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “823”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2021-07-10T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “129”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2022-05-10T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “79”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2022-09-10T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “83”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2023-01-15T04:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “135”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2023-07-19T02:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “34”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2023-12-10T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “83”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2024-01-18T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “72”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    },
    {
    “_id”: ObjectId(“5a934e000102030405000000”),
    “agedDays”: 102,
    “computed”: “computed value”,
    “remedTransactionInfo”: [
    {
    “contentReceivedTimestamp”: “2024-02-10T06:48:02.582225”,
    “executionPercentage”: 95,
    “remediationTransactionId”: “159”,
    “udlStatus”: “Ready to Print”
    }
    ],
    “remediationID”: “CR-100002”,
    “templateId”: “RT-000004”,
    “urgent”: “Yes”
    }
    ]

db.ltrTemplateInfo.aggregate([
{
“$match”: {
$expr: {
$and: [
{
$eq: [
“$remediationID”,
“CR-100002”
]
},
{
$eq: [
“$templateId”,
“RT-000004”
]
},
{
$eq: [
“$lob”,
“AUTO”
]
}
]
}
}
},
{
“$lookup”: {
“from”: “remediationContent”,
“localField”: “templateId”,
“foreignField”: “templateId”,
“as”: “mergedRemedTemplateData”
}
},
{
“$unwind”: “$mergedRemedTemplateData”
},
{
“$addFields”: {
“udlStatusValue”: {
“$indexOfArray”: [
[
“Rejected-Ready to Reprint”,
“Ready to Print”
],
“$mergedRemedTemplateData.udlStatus”
]
}
}
},
{
$sort: {
agedDays: 1,
urgent: 1,
“udlStatusValue”: 1,
“mergedRemedTemplateData.contentReceivedTimestamp”: 1
}
},
{
“$project”: {
urgent: 1,
agedDays: 1,
templateId: 1,
computed: “computed value”,
remediationID: “$mergedRemedTemplateData.remediationID”,
remedTransactionInfo: [
{
“remediationTransactionId”: “$mergedRemedTemplateData.remediationTransactionID”,
“udlStatus”: “$mergedRemedTemplateData.udlStatus”,
“executionPercentage”: “$mergedRemedTemplateData.percentageExecuted”,
“contentReceivedTimestamp”: “$mergedRemedTemplateData.contentReceivedTimestamp”
}
]
}
}
])

Yes based on the results prioritized based on rules,need to sort the records once aggregation is finished on the two collections

Hi,

Can you edit your previous post and use markdown notation to format the code correctly? Also make sure to remove the “ and use " instead as it makes it impossible to import in MongoDB if I want to hack and test myself.

Can you explain what is wrong with your pipeline and why it’s not doing what you want?

Thanks,
Maxime.

Do you see any issue in the aggregation pipeline built with sorting the records based on those 4 conditions?

Also In the pipeline, I’m doing match based on conditions, but if I want to run this pipeline across all remediations in the collection(eg:100k records) ,will there be any performance impact in returning the aggregated results?

Can this be tweaked any thing differently by giving some weightage number(Random Integer) to those 4 rules and sort based on the weightage numbers in the aggregated collection?

db.ltrTemplateInfo.aggregate([ { “$match”: { $expr: { $and: [ { $eq: [ “$remediationID”, “CR-100002” ] }, { $eq: [ “$templateId”, “RT-000004” ] }, { $eq: [ “$lob”, “AUTO” ] } ] } } }, { “$lookup”: { “from”: “remediationContent”, “localField”: “templateId”, “foreignField”: “templateId”, “as”: “mergedRemedTemplateData” } }, { “$unwind”: “$mergedRemedTemplateData” }, { “$addFields”: { “udlStatusValue”: { “$indexOfArray”: [ [ “Rejected-Ready to Reprint”, “Ready to Print” ], “$mergedRemedTemplateData.udlStatus” ] } } }, { $sort: { agedDays: 1, urgent: 1, “udlStatusValue”: 1, “mergedRemedTemplateData.contentReceivedTimestamp”: 1 } }, { “$project”: { urgent: 1, agedDays: 1, templateId: 1, computed: “computed value”, remediationID: “$mergedRemedTemplateData.remediationID”, remedTransactionInfo: [ { “remediationTransactionId”: “$mergedRemedTemplateData.remediationTransactionID”, “udlStatus”: “$mergedRemedTemplateData.udlStatus”, “executionPercentage”: “$mergedRemedTemplateData.percentageExecuted”, “contentReceivedTimestamp”: “$mergedRemedTemplateData.contentReceivedTimestamp” } ] } } ])

Posted some additional criteria and aggregation pipeline below.Please check and post any inputs.

As already mentioned and very important if you want us to help you.

Updated the previous post,please check if it’s fine now?

It is not.

Please read

example_of_a_well_formed_document =
{ 
  
   "first_name" : "steeve" ,
   "last_name" : "juneau" ,
   "user_id" : "steevej" ,
}
2 Likes