Multiple Field Search to get single matched document

Hi Folks,
Am looking for option to extract a particular document as per user desire , they wish to filter a document with particular condition like skill and country and last_name.

  1. I tried out many aggregation process but nothing is achieved as per my requirement
  2. important thing is user details will share by array of values [“java”,“php”,“Newyork”,“jack”]
  3. how to get these matched document from mongodb
  4. i try this below code
 [
      {
        $search: {
          index: "boolean",
          compound: {
            must: [
              {
                text: {
                  query: ["java","php","Newyork","jack"],
                  path: [
                    "first_name",
                    "last_name",
                    "state",
                    "city",
                    "skillset.skill",
                    "prefered_location",
                    "email",
                    "employment_details.job_role",
                    "employment_details.job_skills",
                  ],
                },
              },
            ],
            should: [
              {
                text: {
                  query: ["java","php","Newyork","jack"],
                  path: [
                    "first_name",
                    "last_name",
                    "state",
                    "city",
                    "skillset.skill",
                    "prefered_location",
                    "email",
                    "employment_details.job_role",
                    "employment_details.job_skills",
                  ],
                },
              },
            ],
          },
        },
      }, { $limit: 15 }
    ]

kindly suggest your thoughts , thanks in Advance

Using a query array of values, each clause generated becomes a “should” (optional) clause. You have that wrapped in a single must clause, so any of those must match, but not necessarily all of them. If you require all the query values to match, separate them individually into other items in your must array.

1 Like
[
  {
    $search: {
      index: "ats_boolean",
      compound: {
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["first_name"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["last_name"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["skillset.skill"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["city"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["state"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["prefered_location"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["email"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["employment_details.job_role"],
          },
        },
        must: {
          text: {
            query: ["java", "php", "chennai"],
            path: ["employment_details.job_skills"],
          },
        },
      },
    },
  },
  {
    $limit: 15,
  },
]

this is also gave a any one or two matches

At first glance, the syntax is not quite right. There should be only one must under compound, which is an array of text clauses, rather than a bunch of musts.

Also, there are two diagnostic tools worth exploring here: scoreDetails to see how the scores are computed and the .explain() on the aggregation call which shows how the query was interpreted. The .explain() is going to be helpful in your case. The must clauses are likely overly restrictive, as at least one of each of those query terms is going to have to match the specified field in order for a document to match. Perhaps make this a should array rather than must and let relevancy pull the best matching documents to the top.

Thanks for your suggestion but I have to find exact matching document not a relevant match document that’s why am looking for suggestion furthermore people are also suggesting SQL DB will achieve this Mongodb have some limitations, is that true ?

You have an array of queries, and you’re trying those against various fields - an exact match is a bit undefinable. If you can have a single query string for each specific field, you’ll be able to get as exact as you like. Where does the array of query strings come from? Is each element specific to a particular field? Also, what do you mean by “exact”? Is “chennai” exactly as it appears in the city field? Or is this case insensitive?

it means perfect match, my requirement I have to fetch a document as per my array of values.

I hear you about wanting a perfect match, but you have an array of values - it’s not specified which field those values should match, so you’re trying several. Do the elements in the array have a particular field they should match? If not, then a “perfect match” does not seem well defined. Can you provide some sample documents that should and should not match? In general, full text search is not about a perfect or exact match but about relevancy, with the best matching documents at the top of the list.

You mentioned considering a SQL DB - how would that query be specified, out of curiosity?

It’s an interesting search challenge you have proposed, so I’m here to help as best I can. Thanks for your patience :slight_smile:

Here is the document for your reference,

{
  "_id": {
    "$oid": "64e5ae1d2682288e398a5c25"
  },
  "CandidateId": "SS46620",
  "first_name": "varathan",
  "last_name": "raja",
  "email": "varathan9005@gmail.com",
  "mobile_number": "9840176815",
  "gender": "Male",
  "state": "Chennai",
  "city": "Chennai",
  "pincode": "627501",
  "current_location": "Chennai",
  "willing_to_relocate": true,
  "prefered_location": "Chennai",
  "expected_ctc": "To be modified",
  "notice_period": "15 Days or less",
  "status": "Active",
  "prefered_mode_of_hire": "To be modified",
  "resume_url": "https://ss-ats-assets.s3.ap-south-1.amazonaws.com/candidate_resume/varathan-Resume_1692773917416.html",
  "skillset": [
    {
      "skill": "JAVA,.NET,PHP,CORELDRAW,SQL,VB",
      "years": 2,
      "months": 6
    }
  ],
  "employment_details": [
    {
      "company_name": "apx solution",
      "start_date": "2018-10-01T18:30:00.000Z",
      "end_date": "",
      "job_role": "chennai",
      "work_model": "To be modified",
      "ctc": "7.0 Lacs",
      "employment_type": "Permanent",
      "industry_type": "Software Product",
      "c2h_payroll": "apx solution",
      "job_skills": null,
      "is_current": true
    },
    {
      "company_name": "THIRIPURA chits p LTD",
      "start_date": "2010-10-01T18:30:00.000Z",
      "end_date": "2018-02-01T18:30:00.000Z",
      "job_role": "Regional Manager",
      "work_model": "To be modified",
      "ctc": "To be modified",
      "employment_type": "Permanent",
      "industry_type": "Software Product",
      "c2h_payroll": "THIRIPURA chits p LTD",
      "job_skills": null,
      "is_current": false
    },
    {
      "company_name": "csc computer center",
      "start_date": "2008-01-01T18:30:00.000Z",
      "end_date": "2018-10-01T18:30:00.000Z",
      "job_role": "kalakad",
      "work_model": "To be modified",
      "ctc": "To be modified",
      "employment_type": "Permanent",
      "industry_type": "Software Product",
      "c2h_payroll": "csc computer center",
      "job_skills": null,
      "is_current": false
    }
  ],
  "created_by": {
    "$oid": "64ccd040cface0ef8be4db92"
  },
  "is_deleted": false,
  "createdAt": {
    "$date": "2023-08-23T06:58:37.685Z"
  },
  "updatedAt": {
    "$date": "2023-08-23T06:58:37.685Z"
  },
  "__v": 0
}

in this document have java, php, chennai, (chennai is a city), first moment i was used wildcard search , wildcard will full fill my requirement but i got unmatched documents, that’s why am reached here, thanks for your support Erik_Hatcher.

Thanks for the clear example. I understand the data. What I’m not quite clear on and I think will help a lot is … where do the query terms come from? Is the user adding three separate terms, and expects that they could/should/must be a match anywhere in the document (or a select subset of the fields as seems to be the case)? Or is the user typing in “java php chennai”? Would “java” match a location that pointed to “Java, VA”, for example? Java, Virginia - Wikipedia - or are the terms “java” and “php” only to be matching in the skill field?

We can do much better than wildcard with a bit of fine tuning of what you’re doing here. :wink: Again, thanks for your patience as it’s an interesting challenge to 1) try to understand what you’re ultimately needing, and 2) how best to configure and query it to get there.

What is your index configuration? Fully dynamic mappings, or overriding some/all fields settings?

After a few minutes of re-reading this entire thread, I am reminded that you did specify in the first message that the user is supplying an array of query terms. Ok, here’s a proposed solution:

  compound: {
    must: [
      text: { path: [ ... ], query: "java" },
      text: { path: [ ... ], query: "php" },
      text: { path: [ ... ], query: "chennai" }
    ]
  }

This is saying that all individual terms must match in one or more of the fields in the path array provided. Use the array that you initially have been using, with the change being to make a clause for each query term.

How does this work for your needs?

2 Likes

I appreciate your approach on this , i have worked on your query in mongodb compass it produced one of my old result, it gives any two of values will be present on document not all of them,

[
  {
    $search: {
      index: "ats_boolean",
      compound: {
        must: [
          {
            text: {
              path: [
                "first_name",
                "last_name",
                "state",
                "city",
                "skillset.skill",
                "prefered_location",
                "email",
                "employment_details.job_role",
                "employment_details.job_skills",
              ],
              query: "java",
            },
            text: {
              path: [
                "first_name",
                "last_name",
                "state",
                "city",
                "skillset.skill",
                "prefered_location",
                "email",
                "employment_details.job_role",
                "employment_details.job_skills",
              ],
              query: "php",
            },
            text: {
              path: [
                "first_name",
                "last_name",
                "state",
                "city",
                "skillset.skill",
                "prefered_location",
                "email",
                "employment_details.job_role",
                "employment_details.job_skills",
              ],
              query: "chennai",
            },
          },
        ],
      },
    },
  },
]

here is the document was i got as a solution

{
  "_id": {
    "$oid": "6463784e5c2e1cf13bb8f78d"
  },
  "CandidateId": "SS39246",
  "first_name": "Sriba",
  "last_name": "",
  "email": "psriba@gmail.com",
  "mobile_number": "9003781504",
  "gender": "Female",
  "state": "Tamil Nadu",
  "city": "Chennai",
  "pincode": "600034",
  "current_location": "Chennai",
  "willing_to_relocate": false,
  "prefered_location": "",
  "expected_ctc": "15 LPA",
  "notice_period": "Immediate",
  "status": "In progress",
  "prefered_mode_of_hire": "C2H (contract to Hire) - Client side",
  "resume_url": "https://ss-ats-assets.s3.ap-south-1.amazonaws.com/candidate_resume/Sriba_Java%20Developer_Chennai.doc_1684240460376.msword",
  "skillset": [
    {
      "skill": "Chennai",
      "exp": 49
    }
  ],
  "employment_details": [
    {
      "company_name": "Mindtree",
      "start_date": "2019-02-03T18:30:00.000Z",
      "end_date": "2022-04-28T18:30:00.000Z",
      "job_role": "Java Developer",
      "work_model": "Remote",
      "ctc": "9 LPA",
      "employment_type": "Permanent",
      "industry_type": "IT",
      "c2h_payroll": "-",
      "job_skills": "Java",
      "is_current": "yes"
    }
  ],
  "created_by": {
    "$oid": "641462a847038cf77ecc7e81"
  },
  "is_deleted": false,
  "createdAt": {
    "$date": "2023-05-16T12:34:22.861Z"
  },
  "updatedAt": {
    "$date": "2023-05-16T12:34:22.861Z"
  },
  "__v": 0
}

why am into this, am working on Boolean search functionality in this case i have completed OR , NOT, except AND gate functionality, my client they will enter text to search their requirement like `Php and java and Berlin not WashingtonDC not Jasper` for this my code will extract AND key Words OR keywords NOT keywords then produce their results, am Succeed on OR , NOT gate but failed in AND condition. so now am working on hardcode functionality to bare this case,
anyway thanks for your effort on this, i never imagine someone will take care of my task to help me out but you did it, thanks for that.