I have modified my document schema. A sample document now looks like this:
{
  "_id" : ObjectId("63cf39df1d7798a846b2eb0e"),
  "articleid" : "9d23e3ab-9b7d-11ed-a650-b0227af59807",
  "headline" : "Microsoft to Put $10b More in ChatGPT Maker OpenAI",
  "subtitle" : "OpenAI needs Microsoft’s funding and cloud-computing power to run increasingly complex models",
  "fulltext" : "\nMS Adds $1()B to Investment in ChatGPT Maker",
  "pubdate" : "2023-01-24",
  "article_type" : "print",
  "date" : ISODate("2023-01-24T00:00:00.000+0000")
}
I am facing two major performance issues.
- I have 10 million records in the collection. I search the headline, subtitle, and fulltext fields for words using the must and should operators, and also with a simple text search.
- Whenever I run a must query, it is fast and returns results within 10 seconds, no matter how big the query is. A should query, or a simple search for a single word, takes more than 50 seconds. Why?
Because the collection is so large, I was thinking that I should filter the data by date first and then run the full-text search, so that results come back much faster.
Below are my sample queries.
Should query:
db.getCollection("article_fulltext").aggregate([
  {
    "$search": {
      "index": "fulltext",
      "compound": {
        "filter": [
          {
            "range": {
              "path": "date",
              "gte": ISODate("2023-01-01T00:00:00.000Z"),
              "lte": ISODate("2023-01-31T00:00:00.000Z")
            }
          }
        ],
        "should": [
          {
            "text": {
              "query": "CHATGPT",
              "path": ["headline", "fulltext", "subtitle"]
            }
          },
          {
            "text": {
              "query": "OPENAI",
              "path": ["headline", "fulltext", "subtitle"]
            }
          }
        ],
        "minimumShouldMatch": 1
      }
    }
  }
])
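The "filter by date first, then search the text" idea can also be written as a small helper that builds the pipeline from a list of terms and two real Date values (not date strings). This is only a sketch: `buildShouldPipeline` is a hypothetical name, and it assumes the `date` field is indexed as a date in the search index so that the range filter can use it.

```javascript
// Hypothetical helper: builds a $search pipeline that restricts documents
// to a date range and requires at least one of the given terms to match.
function buildShouldPipeline(terms, from, to) {
  const paths = ["headline", "fulltext", "subtitle"];
  return [
    {
      $search: {
        index: "fulltext",
        compound: {
          // filter clauses restrict the result set without affecting the score
          filter: [{ range: { path: "date", gte: from, lte: to } }],
          // one text clause per term; any single match is enough
          should: terms.map((term) => ({ text: { query: term, path: paths } })),
          minimumShouldMatch: 1,
        },
      },
    },
  ];
}

const pipeline = buildShouldPipeline(
  ["CHATGPT", "OPENAI"],
  new Date("2023-01-01T00:00:00.000Z"),
  new Date("2023-01-31T00:00:00.000Z")
);
// In mongosh: db.getCollection("article_fulltext").aggregate(pipeline)
```

Passing Date objects keeps the range bounds typed as dates; a bound written as a quoted "ISODate(...)" string would be sent as plain text.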
Simple search:
db.getCollection("article_fulltext").aggregate([
  {
    $search: {
      index: "fulltext",
      text: {
        query: "Microsoft",
        path: ["headline", "fulltext", "subtitle"]
      }
    }
  }
])
Atlas Search index:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "articleid": {
        "type": "string"
      },
      "fulltext": {
        "type": "string"
      },
      "headline": {
        "type": "string"
      },
      "subtitle": {
        "type": "string"
      }
    }
  }
}
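One detail that may matter for the date filter: with "dynamic": false, only the fields listed under "fields" are indexed, and `date` is not among them, while a range clause needs the field it filters on to be indexed. A minimal sketch of the same definition with `date` mapped, written as a JavaScript object so it can be printed and pasted into the index JSON editor. Whether this changes the timings described here is an assumption, not something measured.

```javascript
// Sketch only: the index definition from this post with one extra mapping,
// `date` as type "date", so that a range filter on `date` has an indexed field.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      articleid: { type: "string" },
      date: { type: "date" }, // added mapping (assumption: not in the current index)
      fulltext: { type: "string" },
      headline: { type: "string" },
      subtitle: { type: "string" },
    },
  },
};

// Print as JSON for copy and paste.
console.log(JSON.stringify(indexDefinition, null, 2));
```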
The should and simple search queries above are the slow ones, whereas I get really good performance with a must search.