C# Atlas Search: General Questions

I’ve been playing around with Atlas Search and wrote a routine that tests a number of different approaches for its use case. This has raised a few questions that I either cannot find answers for, or that are documented in a way I may not fully understand.

  1. I do not really want a regex, but I do want a partial match. For example: “456” should find “4567”. This seems to work only with Autocomplete (or a creative regex). However, Autocomplete does not support wildcard match or multiple path match. Is it possible to do a partial match on 2-3 fields, which are the only fields my users would ever search on?
  2. I think I need to use compound for this one, but I am struggling with the syntax, which may just be my fault due to how I wrote the code. Regardless, I am using FilterDefinitions or LINQ-style statements in many spots, for example: x => x.CompanyID == “XXXX” or builder.Eq(“ID”, new ObjectId(“XXX”)). I want to combine this with Atlas Search, but it seems Atlas Search wants things in a completely different format. I guess this is OK, but it obviously changes my entire code, so I am being a baby about it :slight_smile: It also makes my much more complex FilterDefinitions challenging to “rewrite” for Atlas Search.

What I am ultimately trying to solve with #2 is avoiding the multi-stage pipeline of .Search + .Match, and of course Match cannot go first in the pipeline.

My use case is that I 100% know I need to filter based on a moderately complex set of natural filters. Let’s say something like “belongs to this group, is not flagged for quarantine, I have access to it, etc.” and then I want to take that subset and search it based on user input, which is where the power of Atlas Search comes in. However, I want to artificially limit the possible subset they are searching on based on a dynamic set of other field restrictions.

FYI, .Search + .Match works perfectly but it is slow, as your documentation indicates it would be. What options do I have here other than finding a way to completely redo all my FilterDefinitions as a collection of .Must(...) clauses? I also noticed that not all of my filters would be supported in that style, though I think it would cover most.

I hope these questions make sense and I would be happy to provide examples upon request, but it is somewhat theoretical at the moment.

Replying to my own post!

I re-read the documentation and discovered that I can set returnStoredSource=true in combination with Stored Source Fields = ALL to make .Search().Match().Sort() run extremely fast and produce the same results. So that works at the cost of a larger index. I think that is the only downside. So this will work as a solution for me until our database reaches a size that forces me to revisit this.
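
For readers following along, the pipeline shape described here can be sketched roughly as below. This is an illustration, not the poster's actual code: it assumes the index is named `TextSearch` (the name used later in this thread) and that its definition sets `storedSource` to `true`, which is one reading of “Stored Source Fields = ALL”. The `text` query, the `IsQuarantined` field and the helper name `buildStoredSourcePipeline` are placeholders.

```javascript
// Sketch only: field names, the query operator and the helper name are
// placeholders, not taken from the original poster's code.
function buildStoredSourcePipeline(userText) {
  return [
    {
      $search: {
        index: "TextSearch",
        text: { query: userText, path: "Fields.Value" },
        // Ask Atlas Search to return the fields stored in the index
        // rather than looking up the full documents in the collection.
        returnStoredSource: true,
      },
    },
    // Later stages can only see fields that the index stores.
    { $match: { IsQuarantined: false } },
    { $sort: { Created: -1 } },
  ];
}
```

The trade-off mentioned above still applies: storing every field makes the index larger.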

However I do still have the following questions:

  1. Is there any way to do a partial match (autocomplete) with multiple fields?
  2. Are there any plans to allow for a stage in the pipeline before search? Then we could use a .Match() to pre-filter or limit the number of records we search on without the complexity of Compound, which isn’t really what I want.

It seems compound is basically adding “things” to my search beyond whatever text I am trying to search for. While I think this works and I probably could have converted my normal filters to a Compound.Must syntax, it is not really what I am looking for. What I am really looking for is to “pre-filter” my collection prior to doing the search.

Hi @Mark_Mann,

I know it may not be ideal, since you mentioned the complexity of compound in your second question, but have you considered doing it within a compound operator? There is more information and there are examples in the Search Across Multiple Fields section of the autocomplete documentation.
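
A minimal sketch of that idea: since `autocomplete` takes a single `path`, one `autocomplete` clause per field can be placed under `compound.should`. The helper name and the example field names are hypothetical, and the sketch assumes each field is mapped with the `autocomplete` type in the index definition.

```javascript
// Build a $search stage that runs the same autocomplete query against
// several fields and matches when at least one of them hits.
// Assumes every path is indexed with type "autocomplete".
function buildMultiFieldAutocomplete(query, paths) {
  return {
    $search: {
      compound: {
        should: paths.map((path) => ({ autocomplete: { query, path } })),
        minimumShouldMatch: 1,
      },
    },
  };
}

// Hypothetical field names, for illustration only.
const stage = buildMultiFieldAutocomplete("456", ["PartNumber", "SerialNumber"]);
```

With the default `edgeGram` tokenization this kind of query matches from the start of a term, so “456” would find “4567”; matching in the middle of a value would need `nGram` tokenization on the field instead.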

Again, I only mention this in case it has not been investigated, but would filter work for you? There’s an example at the start of the same documentation which demonstrates the $match stage being replaced with the use of filter.
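
To illustrate the difference, here is a small sketch of a single `$search` stage in which a `filter` clause inside `compound` does the job that a following `$match` stage would otherwise do. The field names `name` and `company_id` and the helper name are placeholders chosen for the example.

```javascript
// One $search stage: the text clause is scored, the filter clause only
// restricts which documents can match and does not affect the score.
function buildFilteredSearch(userText, companyId) {
  return {
    $search: {
      compound: {
        must: [{ text: { query: userText, path: "name" } }],
        filter: [{ equals: { path: "company_id", value: companyId } }],
      },
    },
  };
}
```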

Look forward to hearing from you.

Regards,
Jason

1.) Thank you. I will research this further

2.) I build my filter statements dynamically, so I cannot say with certainty that all of them would work with the compound $filter, but I believe most will. The primary issue I have is that most of my filters are in the format of LINQ or FilterDefinition, so I would have to write some type of translation script to get them in the format the $filter for Search wants. I know this is not the end of the world, but was hoping search would take the other format for $filter.

Hey @Mark_Mann,

Thanks for getting back to me regarding my previous responses. In response to another part of your 2nd question - to my knowledge and at the moment $match isn’t possible prior to the $search stage.

Just to clarify here, do you have a link to what “other format for $filter” (mentioned above) appears like? I just want to confirm for better understanding :slight_smile:

Look forward to hearing from you.

Regards,
Jason

Thank you for responding Jason. I made another specific thread for this piece of the question as well so this single thread does not get cluttered with my thoughts.

I did read that Search must be the first stage in any pipeline. I also read the documentation of how any Filter/Sort should be included in the search stage or it will have significant performance implications. I was also able to confirm this via empirical testing.

I am specifically looking to pass one of the following to the C# driver for search, and it does not seem possible. Thus I would have to create what I need in the BsonDocument format in order for it to be recognized and executed.

FilterDefinition
or
C# LINQ: Expression<Func<T, bool>>

Apologies as I am not too familiar with the C# driver for search but does either filter() method mentioned in the above link work for you? This is not the same as an additional $match or $sort stage in the pipeline but is the filter clause within the compound operator which works within the single $search stage.

Thanks for raising this thread. I believe a colleague of mine will respond to you here :slight_smile:

Regards,
Jason

Jason,

I am starting to notice the performance difference between Search + Match and using “Filter” (or MUST/MUST NOT, knowing they affect the score).

I was able to get some basic concepts to work, but I still cannot figure out how to make our situation work. What I am really looking for, all code and semantics aside, is a “pre-filter” prior to sorting. I could then make it my own responsibility to properly index whatever I am pre-filtering on, and then only use the text search for those records.

However I understand I cannot do that now(yet?) so I have tried to accomplish my task with the tools available to me.

I am struggling with an “OR” clause against two different fields when attempting to MATCH within a Search pipeline stage.

Example below.

Ignore the fact that it is compound with only one “Should”, I removed a bunch of them for readability and they do not impact this conversation. They only serve to use the mongo search to match other data against other fields. This works just fine and this particular discussion is not about the accuracy or style of the Mongo Atlas Search.

As I’ve mentioned before, my “dream world” would allow me to pass my existing FilterDefinition to the Atlas Search routine somehow. I have already built it, because prior to implementing Atlas Search I would take my FilterDefinition and add a “$regex” match to it. However, since I cannot do that, I am trying to convert my FilterDefinition to what Atlas Search wants.

Let’s say my filter is something fairly basic like:
CreatedByUserID == myUserID OR CreatedByCompanyID == myCompanyID. I cannot figure out how to do this.

I tried something like the below, but obviously it only works if BOTH are true. I also experimented with “QueryString”, which was less effective and only allowed for OR within a single field. How would I go about something very basic like this? I don’t see any way to put an OR clause between two filters or put two “paths” in a single filter (it claims the path must be a string, not an array).

My filter has a few other components to it, but the concept is similar. I am trying to take a giant collection and limit the user’s search to things in their “bucket”. The “bucket” scope can vary depending on the intentions, and thus I build my FilterDefinition accordingly.

"filter" :[
              {
                "in":{
                  "path":"CreatedByUserID",
                  "value":[ObjectID('A'),
                          ObjectID('B')]
                }
              },
              {
                "in":{
                  "path":"CreatedByCompanyID",
                  "value":[ObjectID('C'),
                          ObjectID('D')]
                }
              }
            ]

Basic Pipeline:

{
        "compound": {
            "minimumShouldMatch": 1,
            "should": [
                {
                    "regex": {
                        "allowAnalyzedField": true,
                        "path": "Fields.Value",
                        "query": [
                            ".*0001.*"
                        ]
                    }
                }
            ]
        },
        "highlight": {
            "path": [
                "Fields.Value"
            ]
        },
        "index": "TextSearch",
        "returnStoredSource": true,
        "sort": {
            "Created": -1
        }
    }

Hey Mark,

If you could provide me with the following to help simplify my understanding of the scenario, hopefully I can work something out that suits your use case.

  1. 4-5 sample documents
  2. The two conditions / search terms you’re using for the OR portion based off these sample documents
  3. The expected output based off these sample documents
  4. The index definition currently in use.

I have some idea of what you’re after and the issue of using the filter with both of the conditions but I think with some sample documents I can try recreate something at least for troubleshooting.

Please redact any personal or sensitive information before posting here. Feel free to DM me the sample documents if you believe they are too large to post here.

Wondering if even a nested compound might work here… Is this something you tried with regards to the filter?

Look forward to hearing from you.

Regards,
Jason

Mark - I created a very basic example using the following sample documents to try to understand what you’re after regarding filter and the use of an OR boolean.

Let’s say we have these 5 test documents:

{
	"name": "jason",
	"company_id": 1 /// <--- Doc to be returned
},
{
	"name": "test", /// <--- Doc to be returned
	"company_id": 2
},
{
	"name": "jason",
	"company_id": 2
},
{
	"name": "jason",
	"company_id": 3
},
{
	"name": "test", /// <--- Doc to be returned
	"company_id": 1
}

I want to filter for documents with - {"name":"test"} OR {"company_id":1}. I do so using the following:

db.search.aggregate({
	$search: {
		compound: {
			should: [{
				compound: {
					filter: {
						text: {
							path: "name",
							query: "test"
						}
					}
				}
			},
			{
				compound: {
					filter: {
						equals: {
							path: "company_id",
							value: 1
						}
					}
				}
			}]
		}
	}
})

This returns the 3 documents:

[
  {
    _id: ObjectId("64ed469c820dce241360b7ac"),
    name: 'jason',
    company_id: 1
  },
  {
    _id: ObjectId("64ed469c820dce241360b7ad"),
    name: 'test',
    company_id: 2
  },
  {
    _id: ObjectId("64ed469c820dce241360b7b0"),
    name: 'test',
    company_id: 1
  }
]

Note: I’m using the default index definition in the above tests

Wondering if this is something you were after with specific regards to filter and OR?

If not, I’ll wait for the specifics requested in my previous reply.

Look forward to hearing from you.

Regards,
Jason

Jason,

I think you are on to something here with nested compounds (I did not know I could do that).

What I have now is:
Compound
- Filter
-- Compound
--- Must (things that are mandatory here, like status = good status)
--- Should (min should match of 1, essentially making this an OR)
---- My checks here: made by companyid, made by userid, granted read access to user, public flag == true, etc.
- Should (this is my generic “regex” text search, min match of 1)
-- search properties A/B/C for the text string

So far so good. It was a bit confusing at first, since when I added the above Filter it did not work until I specifically created a field mapping for the involved fields (example below). I am not sure if this is specifically because I was comparing ObjectIDs or whether I would always need to do this… My hunch is that it was because they were ObjectIDs and that is how I was using the “in” operator.

Anyway, it seems like progress has been made. Now I need to go through all of my logic that builds a FilterDefinition and write a matching function that builds the above compound filter as BSON. Not terrible, but less than ideal.
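
The nesting outlined above might look roughly like the following when built programmatically. This is a sketch under assumptions, not the poster's code: `IsQuarantined` and `IsPublic` are hypothetical field names standing in for the mandatory and public-flag checks, `buildScopedSearch` is a made-up helper, and in real use `userId` and `companyId` would be ObjectId values on fields mapped as `objectId`.

```javascript
// Outer compound: "filter" restricts the bucket, "should" is the text search.
// Inner compound: "must" holds mandatory checks, "should" with
// minimumShouldMatch 1 acts as an OR across the access checks.
function buildScopedSearch({ pattern, searchPaths, userId, companyId }) {
  const scope = {
    compound: {
      must: [{ equals: { path: "IsQuarantined", value: false } }],
      should: [
        { equals: { path: "CreatedByUserID", value: userId } },
        { equals: { path: "CreatedByCompanyID", value: companyId } },
        { equals: { path: "IsPublic", value: true } },
      ],
      minimumShouldMatch: 1,
    },
  };
  return {
    $search: {
      index: "TextSearch",
      compound: {
        filter: [scope],
        should: searchPaths.map((path) => ({
          regex: { query: pattern, path, allowAnalyzedField: true },
        })),
        minimumShouldMatch: 1,
      },
    },
  };
}
```

Because the whole scope sits under `filter`, none of the access checks contribute to the relevance score; only the outer `should` clauses do.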

Feature request, at least for the C# driver: a way to pass a FilterDefinition into the Search operator to filter in addition to the desired text search, or just a better way to pre-filter prior to search in general.


{
  "mappings": {
    "dynamic": true,
    "fields": {
      "CreatedByCompany": {
        "fields": {
          "CompanyID": {
            "type": "objectId"
          },

Glad to hear progress has been made :slight_smile:

I’m not 100% sure, since I don’t know what the index definition was before you made the changes, but it’s possible that it is due to CompanyID existing within the CreatedByCompany field. Perhaps the Static and Dynamic Mappings documentation might be of use in this case.

You could raise this as a feedback request here.

Regards,
Jason

Jason,

It was due to me doing an equals/in with ObjectID without specifying the field mapping as an ObjectID.

I actually think I have this working, but I did have to completely rewrite my logic that builds a dynamic LINQ statement so that it instead builds a dynamic BsonDocument matching the specific format required for Atlas Search. This is unfortunate, as it is somewhat hardcoded and I just lost a huge benefit of the C# driver, which is using LINQ to create a “more natural” filter… at least more natural in the sense of matching C# code.
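
As an illustration of the kind of translation layer described here, the sketch below converts a tiny, made-up filter tree into Atlas Search clauses. The input shape (`eq`, `in`, `and`, `or`) and the function name `toSearchClause` are hypothetical and are not part of any driver; the same structure could be built as a BsonDocument in C#. Only equality, membership, AND and OR are covered.

```javascript
// Hypothetical input shape:
//   { eq: [path, value] } | { in: [path, values] }
//   | { and: [filter, ...] } | { or: [filter, ...] }
function toSearchClause(filter) {
  if (filter.eq) {
    const [path, value] = filter.eq;
    return { equals: { path, value } };
  }
  if (filter.in) {
    const [path, value] = filter.in;
    return { in: { path, value } };
  }
  if (filter.and) {
    // Every child must match; "filter" keeps them out of scoring.
    return { compound: { filter: filter.and.map(toSearchClause) } };
  }
  if (filter.or) {
    // At least one child must match.
    return {
      compound: { should: filter.or.map(toSearchClause), minimumShouldMatch: 1 },
    };
  }
  throw new Error(`Unsupported filter: ${JSON.stringify(filter)}`);
}

// The OR from earlier in the thread, placed under the outer filter clause.
const scopeFilter = toSearchClause({
  or: [
    { in: ["CreatedByUserID", ["A", "B"]] },
    { in: ["CreatedByCompanyID", ["C", "D"]] },
  ],
});
```

Anything the translator does not recognise throws, which makes unsupported filters visible early instead of silently widening the search.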

I’ll take you up on that feedback request.
