Search index creation fails when a synonyms list is provided

Hi,

I was recently working on the MongoDB Atlas free tier (M0 cluster) to create a search index for a POC. I needed to include synonyms in the search index to make it easier to search for similar words. But when I try to create the Atlas Search index, I encounter the error below.

  1. The error doesn’t provide any extra information on where exactly my document is wrong.

  2. I followed the documentation and my synonyms collection follows the format (I even queried to check if there were any invalid docs).

  3. I also reduced the collection to the 10,000-document limit as mentioned, but it didn’t work either.

  4. Here is the JSON for the index configuration:

{
	"mappings": {
		"dynamic": false,
		"fields": {
			"title": {
				"fields": {
					"english": {
						"type": "string"
					}
				},
				"type": "document"
			}
		}
	},
	"synonyms": [{
		"analyzer": "lucene.standard",
		"name": "synonym_mapping",
		"source": {
			"collection": "animeSynCollection"
		}
	}],
	"storedSource": {
		"include": [
			"title.english",
			"bannerImage"
		]
	}
}

Hi @Ashik_N_A - Welcome to the community :wave:

Could you clarify the query you used to verify whether there were any invalid docs?

Additionally, could you provide the source synonym collection (redacting any personal or sensitive information) along with a few sample documents that are being searched on so that I can try to reproduce this behaviour?

Regards,
Jason

Hey, thanks for the response. I figured out where the problem was. The query I ran only checked that synonyms was an array and that it contained no empty strings; I forgot to consider symbols.

The document that caused the index build to fail looked like this:

{'_id': ObjectId('6496c27616a431fd95eba537'), 'mappingType': 'equivalent', 'synonyms': ['∞']}

But still, it would be nice if Atlas provided a detailed error, suggesting which document was causing the build to fail (at least the document _id).
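Until then, a check along these lines might help locate such documents before building the index. This is only a rough sketch: the function name find_suspect_synonym_docs and the sample documents are hypothetical, and it assumes that a synonym with no letters or digits (like '∞') is what breaks the build. It approximates that with Python's str.isalnum rather than running the real lucene.standard analyzer, so it may miss some cases or flag harmless ones.

```python
def find_suspect_synonym_docs(docs):
    """Return (_id, reason) pairs for synonym documents that look problematic."""
    suspects = []
    for doc in docs:
        synonyms = doc.get("synonyms")
        # The synonyms field is expected to be a non-empty array of strings.
        if not isinstance(synonyms, list) or not synonyms:
            suspects.append((doc.get("_id"), "synonyms is not a non-empty array"))
            continue
        for term in synonyms:
            if not isinstance(term, str):
                suspects.append((doc.get("_id"), f"non-string synonym: {term!r}"))
                break
            # Assumption: a term without any letter or digit (symbols only,
            # or an empty string) is treated as suspect.
            if not any(ch.isalnum() for ch in term):
                suspects.append((doc.get("_id"), f"no letters or digits in {term!r}"))
                break
    return suspects


# Hypothetical sample data; the second document mirrors the one that failed.
sample_docs = [
    {"_id": 1, "mappingType": "equivalent", "synonyms": ["car", "automobile"]},
    {"_id": 2, "mappingType": "equivalent", "synonyms": ["∞"]},
    {"_id": 3, "mappingType": "equivalent", "synonyms": [""]},
]

suspects = find_suspect_synonym_docs(sample_docs)
for doc_id, reason in suspects:
    print(doc_id, reason)
```

With pymongo, the same function could presumably be fed the result of db["animeSynCollection"].find(), where db is a database handle you have already opened.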

Regards,
Ashik

Glad to hear you found the document causing the error.

Thanks for the feedback above. I agree that it would be helpful if some further information (e.g. the _id value) could be returned in the error to help locate the document(s) causing errors. I’ll raise this with the team internally as feedback.

Regards,
Jason
