Prioritizing the documents from two collections

Anil_kumar_Narreddi · February 18, 2024, 3:15am

I want to use Java MongoDB aggregation to combine two collections and prioritize the records based on following rules in two collections.

Please suggest some code snippet on how to achieve this.I’ve used MongoDB play ground toachive the result but looking at some code snippet to improve better.

Prioritization Rules
1.Urgent
2.Aged Days
3.udlStatus
4.contentReceivedTimestamp
5.percentExecuted

LetterTemplateCollection

{
      "lob": "AUTO",
      "templateId": "RT-000004",
      "remediationID": "CR-100002",
      "templateContent": "CUSTOMER_NAME_1",
      "templatePrinter": "ASVS",
      "templateStatus": "Approved",
      "urgent": "Yes",
      "agedDays": 102
    }

ContentCollection
{
      "contentReceivedTimestamp": "2024-01-02T04:48:02.582225",
      "templateId": "RT-000004",
      "lob": "AUTO",
      "remediationID": "CR-100002",
      "remediationTransactionID": "128",
      "remediationType": "Content",
      "contentStatus": "Loaded",
      "udlStatus": "Rejected-Ready to Reprint",
      "percentageExecuted": 95,
      "accountInfo": {
        "accountID": "9230028467",
        "accountType": "AUTOLN"
      },
      "remediationContentInfo": {
        "phaseNumber": "1",
        "remediationActionType": "LTR",
        "partySequenceNumber": "1",
        "checkPayeeName": "John Wright",
        "financialImpactAmount": "100",
        "interestAmount": "300",
        "loufAmount": "0",
        "minPayDiffAmount": "0",
        "rollupAccounts": "4.50"
      },
      "customerInfo": {
        "partyID": "779622269",
        "partyTypeID": "SSN",
        "impactType": "OTHER",
        "ssn": "779622269",
        "ecn": "",
        "epic": "CUST",
        "partySequence": "1",
        "customerDetails": {
          "firstName": "Joesph",
          "middleName": "Ross",
          "lastName": "Biden",
          "suffix": "Test2",
          "dateOfBirth": "06-20-1981",
          "email": "test@gmail.com",
          "phone": "612-382-9723"
        },
        "addressDetails": {
          "addressType": "MAIL",
          "addressLine1": "1234",
          "addressLine2": "Mint Street",
          "addressLine3": "Test2",
          "city": "Bridgville",
          "state": "PA",
          "zip5": "15017",
          "zip4": "1534",
          "country": "US"
        }
      }
    }

Aasawari · February 20, 2024, 9:54am

Hi @Anil_kumar_Narreddi and welcome to the MongoDB Community forum.

To help you better, we would need certain information with respect to the above request.

What do you mean when you refer to improve better? What are the metric that you are trying to improve?
What is the aggregation query that you are trying to use ?
What is the collection size for both the above mentioned collections?
Do you have any index defined in the above collections?
Can you also share the explain output for the aggregation query you are using?

Answering the above questions would help us identify the gaps and help the community to triage better.

Regards
Aasawari

MaBeuLux88_xxx · February 20, 2024, 4:30pm

Hi,

Please provide a dataset that represents your issue without any un-necessary fields and the expected output.
Without more details, I would say that you are looking for a sort stage at the end of your aggregation pipeline.

Cheers,
Maxime.

Anil_kumar_Narreddi · February 20, 2024, 5:52pm

Please find my comments inline and the aggregation query below

What do you mean when you refer to improve better? What are the metric that you are trying to improve?
–To prioritize the records in the collection based on the rules
What is the aggregation query that you are trying to use ?
Pasted below
What is the collection size for both the above mentioned collections?
It can be large example to start with 100k records in the collection
Do you have any index defined in the above collections?
Only one index for now recordId
Can you also share the explain output for the aggregation query you are using?
[
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2024-01-02T04:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “128”,
“udlStatus”: “Rejected-Ready to Reprint”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2019-02-22T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “229”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2020-06-14T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “823”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2021-07-10T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “129”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2022-05-10T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “79”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2022-09-10T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “83”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2023-01-15T04:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “135”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2023-07-19T02:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “34”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2023-12-10T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “83”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2024-01-18T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “72”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
},
{
“_id”: ObjectId(“5a934e000102030405000000”),
“agedDays”: 102,
“computed”: “computed value”,
“remedTransactionInfo”: [
{
“contentReceivedTimestamp”: “2024-02-10T06:48:02.582225”,
“executionPercentage”: 95,
“remediationTransactionId”: “159”,
“udlStatus”: “Ready to Print”
}
],
“remediationID”: “CR-100002”,
“templateId”: “RT-000004”,
“urgent”: “Yes”
}
]

db.ltrTemplateInfo.aggregate([
{
“$match”: {
$expr: {
$and: [
{
$eq: [
“$remediationID”,
“CR-100002”
]
},
{
$eq: [
“$templateId”,
“RT-000004”
]
},
{
$eq: [
“$lob”,
“AUTO”
]
}
]
}
}
},
{
“$lookup”: {
“from”: “remediationContent”,
“localField”: “templateId”,
“foreignField”: “templateId”,
“as”: “mergedRemedTemplateData”
}
},
{
“$unwind”: “$mergedRemedTemplateData”
},
{
“$addFields”: {
“udlStatusValue”: {
“$indexOfArray”: [
[
“Rejected-Ready to Reprint”,
“Ready to Print”
],
“$mergedRemedTemplateData.udlStatus”
]
}
}
},
{
$sort: {
agedDays: 1,
urgent: 1,
“udlStatusValue”: 1,
“mergedRemedTemplateData.contentReceivedTimestamp”: 1
}
},
{
“$project”: {
urgent: 1,
agedDays: 1,
templateId: 1,
computed: “computed value”,
remediationID: “$mergedRemedTemplateData.remediationID”,
remedTransactionInfo: [
{
“remediationTransactionId”: “$mergedRemedTemplateData.remediationTransactionID”,
“udlStatus”: “$mergedRemedTemplateData.udlStatus”,
“executionPercentage”: “$mergedRemedTemplateData.percentageExecuted”,
“contentReceivedTimestamp”: “$mergedRemedTemplateData.contentReceivedTimestamp”
}
]
}
}
])

Anil_kumar_Narreddi · February 20, 2024, 5:56pm

Yes based on the results prioritized based on rules,need to sort the records once aggregation is finished on the two collections

MaBeuLux88_xxx · February 20, 2024, 6:26pm

Hi,

Can you edit your previous post and use markdown notation to format the code correctly? Also make sure to remove the “ and use " instead as it makes it impossible to import in MongoDB if I want to hack and test myself.

Can you explain what is wrong with your pipeline and why it’s not doing what you want?

Thanks,
Maxime.

Anil_kumar_Narreddi · February 21, 2024, 6:15pm

Do you see any issue in the aggregation pipeline built with sorting the records based on those 4 conditions?

Also In the pipeline, I’m doing match based on conditions, but if I want to run this pipeline across all remediations in the collection(eg:100k records) ,will there be any performance impact in returning the aggregated results?

Can this be tweaked any thing differently by giving some weightage number(Random Integer) to those 4 rules and sort based on the weightage numbers in the aggregated collection?

db.ltrTemplateInfo.aggregate([ { “$match”: { $expr: { $and: [ { $eq: [ “$remediationID”, “CR-100002” ] }, { $eq: [ “$templateId”, “RT-000004” ] }, { $eq: [ “$lob”, “AUTO” ] } ] } } }, { “$lookup”: { “from”: “remediationContent”, “localField”: “templateId”, “foreignField”: “templateId”, “as”: “mergedRemedTemplateData” } }, { “$unwind”: “$mergedRemedTemplateData” }, { “$addFields”: { “udlStatusValue”: { “$indexOfArray”: [ [ “Rejected-Ready to Reprint”, “Ready to Print” ], “$mergedRemedTemplateData.udlStatus” ] } } }, { $sort: { agedDays: 1, urgent: 1, “udlStatusValue”: 1, “mergedRemedTemplateData.contentReceivedTimestamp”: 1 } }, { “$project”: { urgent: 1, agedDays: 1, templateId: 1, computed: “computed value”, remediationID: “$mergedRemedTemplateData.remediationID”, remedTransactionInfo: [ { “remediationTransactionId”: “$mergedRemedTemplateData.remediationTransactionID”, “udlStatus”: “$mergedRemedTemplateData.udlStatus”, “executionPercentage”: “$mergedRemedTemplateData.percentageExecuted”, “contentReceivedTimestamp”: “$mergedRemedTemplateData.contentReceivedTimestamp” } ] } } ])

Anil_kumar_Narreddi · February 21, 2024, 6:17pm

Posted some additional criteria and aggregation pipeline below.Please check and post any inputs.

steevej · February 21, 2024, 6:23pm

As already mentioned and very important if you want us to help you.

Anil_kumar_Narreddi · February 21, 2024, 7:57pm

Updated the previous post,please check if it’s fine now?

steevej · February 21, 2024, 9:54pm

It is not.

Please read

example_of_a_well_formed_document =
{ 
  
   "first_name" : "steeve" ,
   "last_name" : "juneau" ,
   "user_id" : "steevej" ,
}