Atlas Search Index: Filter out docs with a field that has an empty string

Can I set a filter on a $search stage to filter out any results for a field that is an empty string? Maybe there’s a way to do an equals for this?

Hi @Matt_Jones1,

welcome to MongoDB and the Forums!

I would recommend you to go through compound operator in which you can specify must and must not match of the output.

Please let me know if that helps.

Thanks,
Darshan

Darshan,

Thank you for the reply. If I use compound with a must operator, then which operator under must will allow me to specify a match only if a field is an empty string?

Thanks,
Matt

Hello,
Is there any answer to this question?

Thanks,
Prasad

Hi @Prasad_Kini ,

Hope you are doing great.

Can you please share the sample of valid documents, As this query needs tailored query, it will be helpful if I have the sample document and expected output.

Thanks,
Darshan

Hi Darshan,
Please find below some sample documents. My basic requirement is to be able to retrieve all documents with blank/null fields (e.g. catgory, cost_center, users). I will know the field names when building the query.

{
	"projects": [
		{
			"name": "Payment System",
			"description": "Issue payments to clients",
			"category": "Budgeted",
			"estimate": 12,
			"cost_center": 874,
			"budget": 600000,
			"users": [
				"Sales",
				"HR",
				"Technology"
			]
		},
		{
			"name": "Admin System",
			"description": "Admin Portal Enhancements",
			"category": "",
			"estimate": 4,
			"cost_center": null,
			"budget": 100000,
			"users": [
				"Technology"
			]
		},
		{
			"name": "Lights On",
			"description": "Business As Usual",
			"category": "",
			"estimate": 52,
			"cost_center": null,
			"budget": 1000000,
			"users": []
		}
	]
}

Thanks,
Prasad

Hi @DarshanJayarama,
Would you be able to help with this?

Thanks,
Prasad

Hi @Prasad_Kini,

My apologies for the delay. I have created below search index in my sample_mflix movies collection:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "plot": {
        "type": "string"
      },
      "runtime": {
        "type": "number"
      }
    }
  }
}

And with below search query, I got the empty record for the given field:

MongoDB Enterprise mflix-shard-0:PRIMARY> db.movies.aggregate([ {
    $search: {
        compound: {
            must: [{
                text: {
                    query: "titanic",
                    path: "plot"
                }
            }],
            mustNot: [{
                range: {
                    "gt": 1,
                    path: "runtime"
                }
            }]
        }
    }
}, {
    $project: {
        countries: 1,
        _id: 1,
        runtime: 1,
        plot: 1
    }
}])
{ "_id" : ObjectId("573a1394f29313caabce0bcb"), "plot" : "An account of the ill-fated maiden voyage of RMS Titanic in 1912.", "runtime" : "", "countries" : [ "USA" ] }
MongoDB Enterprise mflix-shard-0:PRIMARY>

Above query returned empty string for the runtime field.

I hope this answer your question.

Thanks,
Darshan

Thanks @DarshanJayarama. Could you please shed some light on the inner workings of gt? How does it know to pick up documents with the empty string when the condition is doing a comparison with a numeric value?

Also, is there a similar hack to check for empty arrays as well?

Regards,
Prasad

Hi Prasad,

Thanks for your response.

We are filtering out those records which are not greater than 1 for the runtime field. as runtime containing numeric values, if any field containing null or missing fields will not be adhere to the condition as null is less than 0:

MongoDB Enterprise mflix-shard-0:PRIMARY> null < 0
false
MongoDB Enterprise mflix-shard-0:PRIMARY> null < 1
true
MongoDB Enterprise mflix-shard-0:PRIMARY> "" < 0
false
MongoDB Enterprise mflix-shard-0:PRIMARY> "" < 1
true
MongoDB Enterprise mflix-shard-0:PRIMARY>

In this condition, filtering out those records which gt 1 in number field get us the missing or null field.

I hope this clarify your doubts.

Thanks,
Darshan

1 Like

Hi @DarshanJayarama,

Wouldn’t the query also pickup documents that have runtime<=1? i.e. values like 1, 0.9, 0.72 etc?

Thanks,
Prasad