Something wrong with MongoDB Atlas Search string facet

I am currently using Atlas with MongoDB 8.0.

Regarding Atlas Search string facets: I understand that fields used for string facets are now supposed to be indexed as token instead of stringFacet, but the query results are wrong when I do that.

Please look at the following example.

documents

{
  "extraction": {
    "keywords": ["a", "b", "c"]
  }
},
{
  "extraction": {
    "keywords": ["b", "c", "d", "e"]
  }
},

search index

wrong case

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "extraction": {
        "fields": {
          "keywords": [
            {
              "type": "token"
            }
          ]
        },
        "type": "document"
      }
    }
  }
}

With the index above, the facet counts are wrong.

right case

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "extraction": {
        "fields": {
          "keywords": [
            {
              "type": "token"
            },
            {
              "type": "stringFacet"
            }
          ]
        },
        "type": "document"
      }
    }
  }
}      

With this index, the facet counts are correct.

In conclusion, it seems that stringFacet must still be mapped for facets to work correctly on a string array field nested inside a document.

Hi @Damon_Kim, can you share the results that you are getting when using stringFacet vs. token?

Hi, @amyjian.
It is difficult to share the actual results directly, because the example in my question is an abstraction of my real data.
Everything works well for me now that the field is indexed as stringFacet.
However, when faceting against an index that defines the field only as token, the counts were significantly lower or missing entirely.
For example,

{
  "$searchMeta": {
    "facet": {
      "operator": ... ,
      "facets": {
        "keywords": {"path": "keywords", "type": "string", "numBuckets": 100}
      }
    }
  }
}

Result

With token indexing only:
{
  "b": 1,
  "d": 1
}

After changing to stringFacet indexing:
{
  "a": 1,
  "b": 2,
  "c": 2,
  "d": 1,
  "e": 1
}
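
For reference, the expected counts for the two sample documents can be checked without Atlas. The sketch below is plain JavaScript and assumes that a facet counts each value once per document; `countKeywords` is a hypothetical helper written for this example, not part of any MongoDB API.

```javascript
// The two sample documents from the question.
const docs = [
  { extraction: { keywords: ["a", "b", "c"] } },
  { extraction: { keywords: ["b", "c", "d", "e"] } },
];

// Hypothetical helper: counts how many documents contain each keyword.
// A Set is used so that a value repeated inside one array counts once.
function countKeywords(documents) {
  const counts = {};
  for (const doc of documents) {
    const keywords = doc.extraction?.keywords ?? [];
    for (const keyword of new Set(keywords)) {
      counts[keyword] = (counts[keyword] ?? 0) + 1;
    }
  }
  return counts;
}

console.log(countKeywords(docs));
// prints { a: 1, b: 2, c: 2, d: 1, e: 1 }
```

This matches the result obtained with stringFacet indexing, which suggests the token-only result is the one that is off.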

I'm having issues with string facets in MongoDB Atlas Search: results aren't grouping as expected. Has anyone else faced this?

Hi, I noticed that the path provided in this query is keywords. Can you confirm that this is the query you ran? I would expect this to fail, as the path should be extraction.keywords. Or do you happen to have a top-level keywords field as well?
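
For illustration, a query against the sample documents with the full dotted path might look like the sketch below. The operator in the original query is elided, so the `exists` operator here is only a stand-in assumption, and `buildKeywordFacetPipeline` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a $searchMeta pipeline that facets on `path`.
// ASSUMPTION: `exists` stands in for the operator elided in the original post.
function buildKeywordFacetPipeline(path) {
  return [
    {
      $searchMeta: {
        facet: {
          operator: { exists: { path } },
          facets: {
            keywords: { type: "string", path, numBuckets: 100 },
          },
        },
      },
    },
  ];
}

// Use the full path to the nested array, not just "keywords".
const pipeline = buildKeywordFacetPipeline("extraction.keywords");
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh the resulting array could be passed to `aggregate()` on the collection. Since no `index` is given, the search index named `default` would be used.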

Facing a similar issue with MongoDB v8. When I add a filter on a field indexed as token, I don't get any results in the response.

Having issues with MongoDB Atlas Search string facet not returning expected results.

Hi @William_Gram , can you provide an example of the documents, index definition, and query you are running?

Hey, having similar issues here (MongoDB 7.0.22). I have a collection with the fields created_at (datetime) and keywords (array of strings), and an Atlas Search index defined as follows:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "created_at": {
        "type": "date"
      },
      "keywords": [
        {
          "type": "token",
          "normalizer": "lowercase"
        },
        {
          "type": "stringFacet"
        }
      ]
    }
  }
}

I then try to count keywords in different time windows with a query similar to this:

[
  {
    "$searchMeta": {
      "index": "index",
      "facet": {
        "operator": {
          "range": {
            "path": "created_at",
            "gte": start_datetime,
            "lt": end_datetime
          }
        },
        "facets": {
          "topTerms": {
            "type": "string",
            "path": "keywords",
            "numBuckets": 100
          }
        }
      }
    }
  }
]

Something really weird happens. If I understand things correctly, token indexing alone should be enough to count keyword frequencies across the arrays of the matching documents, yet a lot of documents do not contribute to the buckets. If the index has no stringFacet mapping, the query above returns the proper document count (total or lowerBound), but the buckets are either empty or show very low counts. When both mappings are present (token and stringFacet), I get seemingly proper counts for the keyword frequencies.
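
One way to sanity-check the bucket counts is to compute the same frequencies with ordinary aggregation stages, which do not touch the search index at all. The sketch below is only an illustration: `buildCrossCheckPipeline` is a hypothetical helper, the field names are the ones from this post, and keyword case is left as stored (no lowercase normalization is applied).

```javascript
// Hypothetical helper: counts documents per keyword within [start, end)
// using plain aggregation stages instead of $searchMeta.
function buildCrossCheckPipeline(start, end, limit = 100) {
  return [
    // Same time window as the range operator in the search query.
    { $match: { created_at: { $gte: start, $lt: end } } },
    // One output document per array element.
    { $unwind: "$keywords" },
    // Collapse duplicates inside a single document so that each
    // document contributes at most once per keyword.
    { $group: { _id: { doc: "$_id", keyword: "$keywords" } } },
    // Group by keyword and sort by descending count.
    { $sortByCount: "$_id.keyword" },
    { $limit: limit },
  ];
}

const crossCheck = buildCrossCheckPipeline(
  new Date("2025-01-01T00:00:00Z"),
  new Date("2025-02-01T00:00:00Z")
);
console.log(JSON.stringify(crossCheck, null, 2));
```

If these counts agree with the buckets returned when stringFacet is mapped, but not with the token-only buckets, that narrows the problem down to the token-based faceting path. The dates above are placeholder values.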

We’re encountering the same issue on our side. Switching from stringFacet to token when using an operator (such as autocomplete) does not return the expected results. I’m happy to provide reproduction steps if that would help with debugging. @amyjian

Hello! I’d love to see your repro steps if you’d be able to provide them (for anyone in this thread), thanks!

After switching from stringFacet to token type, I started experiencing the same issue. The buckets always return empty, even though lowerBound shows that there are matching documents.

Here is my search index:

{
  mappings: {
    dynamic: false,
    fields: {
      field1: {
        type: "string"
      },
      field2: {
        type: "string"
      },
      field3: {
        type: "string"
      },
      facetField: [
        {
          type: "string"
        },
        {
          type: "token"
        }
      ],
    }
  },
  analyzer: "lucene.standard",
  searchAnalyzer: "lucene.standard"
}

And here is my query:

[
  {
    $searchMeta: {
      index: "mySearchIndex",
      facet: {
        operator: {
          compound: {
            filter: [
              {
                text: {
                  query: "{query1}",
                  path: "{field1}"
                }
              },
              {
                text: {
                  query: "{query2}",
                  path: "{field2}"
                }
              },
              {
                text: {
                  query: ["{query3}"],
                  path: "{field3}"
                }
              }
            ]
          }
        },
        facets: {
          facetField: {
            type: "string",
            path: "facetField",
            numBuckets: 250
          }
        }
      }
    }
  }
]

The result of the above query is as follows:

{
  "count": {
    "lowerBound": "4000"
  },
  "facet": {
    "facetField": {
      "buckets": []
    }
  }
}

The strange thing is that the same query works as expected with the $search stage using the query below.

[
  {
    $search: {
      index: "mySearchIndex",
      facet: {
        operator: {
          compound: {
            filter: [
              {
                text: {
                  query: "{query1}",
                  path: "{field1}",
                },
              },
              {
                text: {
                  query: "{query2}",
                  path: "{field2}",
                }
              },
              {
                text: {
                  query: ["{query3}"],
                  path: "{field3}"
                }
              }
            ],
          },
        },
        facets: {
          facetField: {
            type: "string",
            path: "facetField",
            numBuckets: 250,
          },
        },
      },
    },
  },
  {
    $facet: {
      "facetField": [
       { $unwind: "$facetField" },
       { $sortByCount: "$facetField" }
     ]
    }
  }
]

To be clear, the index shown above is my new search index. The old one was exactly the same, except that facetField was mapped as stringFacet where it is now mapped as token. Any ideas?
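
When comparing an old and a new index definition like this, it can help to list exactly which types are mapped for the facet field. The sketch below is a minimal illustration under a few assumptions: `mappedTypes` is a hypothetical helper, it only reads static mappings, and it only descends through `document`-type parents (not `embeddedDocuments`).

```javascript
// Hypothetical helper: returns the index types mapped for a dotted field
// path in an Atlas Search index definition (static mappings only).
function mappedTypes(indexDefinition, path) {
  let fields = indexDefinition.mappings?.fields ?? {};
  const segments = path.split(".");
  let mapping;
  for (const [i, segment] of segments.entries()) {
    mapping = fields[segment];
    if (mapping === undefined) return [];
    if (i < segments.length - 1) {
      // Descend into the "document" mapping of the parent field.
      const parents = Array.isArray(mapping) ? mapping : [mapping];
      const docMapping = parents.find((m) => m.type === "document");
      if (!docMapping) return [];
      fields = docMapping.fields ?? {};
    }
  }
  // A field may be mapped as a single object or as an array of mappings.
  const leaf = Array.isArray(mapping) ? mapping : [mapping];
  return leaf.map((m) => m.type);
}

// The new index from this post, reduced to the facet field.
const newIndex = {
  mappings: {
    dynamic: false,
    fields: { facetField: [{ type: "string" }, { type: "token" }] },
  },
};
console.log(mappedTypes(newIndex, "facetField"));
// prints [ 'string', 'token' ]
```

Running the same helper against the old definition would make the stringFacet vs. token difference explicit before re-testing the query.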