Can I create an index on a part of one field?

Hi,

My collection has a field named “date” of type “Date” (example: 2021-04-28T09:07:11.092+00:00), and a field named “lineNumber” (integer).

I try to create the best index for my query.

The latter (in Java code) selects documents matching a specific year and sorts them on descending date and ascending lineNumber:

AggregateIterable<Document> logs = logsCollection.aggregate(Arrays.asList(
	project(fields(computed("year", eq("$year", "$date")), 
	include(ROW_ID, ROW_CYCLE, ROW_TYPE, ROW_DATE, ROW_OUTPUT))), 
	match(eq("year", "2021")), 
	sort(orderBy(descending(ROW_DATE), ascending(ROW_LINE_NUMBER)))));

I created a compound index “date_-1_lineNumber_1” on descending date and ascending lineNumber but it is never used (I can see that in MongoDBCompass).

How can I do ? Must I use a partial index ?
Thanks!

A few thoughts.

  1. I think the index cannot be used because you match on a computed field.
  2. The project may prevents the index to be used on the sort.
  3. I would match, sort and then project
  4. I understand that you project to compute the year needed for the match
  5. To be able to match with computing the year you may try to match $gte:(January 1st of the year to match) and $lt:(January 1st of the following year).

Hi Steeve,

Thanks a lot for your answer.
I changed my query as you suggested (points 3 and 5):

AggregateIterable<Document> logs = logsCollection.aggregate(Arrays.asList(
				project(fields(include(MongoDBConstants.ROW_ID, MongoDBConstants.ROW_DATE, MongoDBConstants.ROW_LINE_NUMBER))), 
				match(and(Arrays.asList(
							gte(MongoDBConstants.ROW_DATE, firstDayOfYear), 
							lte(MongoDBConstants.ROW_DATE, lastDayOfYear)))), 
				sort(orderBy(descending(MongoDBConstants.ROW_DATE), ascending(MongoDBConstants.ROW_LINE_NUMBER)))));

Now in MongoDBCompass I can see that the index on date field is used, but not the compound one on date+lineNumber. Don"t undersand why ? …

Try

Your $project being the first stage might prevent the query planner to select the best index.

And in general, you want $match first and then $sort. As it is, you are potentially doing the projection work for all document before narrowing the documents of interest. May be the query planner can reorder, but when I know that a given order is better, I do not take the chance.

Hi Steeve,
Oops yes I forgot to move the project stage at end.
But it does not change anything…
I tried a simple query, with just a sort order:

AggregateIterable logs = logsCollection.aggregate(Arrays.asList(
sort(orderBy(descending(MongoDBConstants.ROW_DATE), ascending(MongoDBConstants.ROW_LINE_NUMBER)))));

My index on descending date and ascending lineNumber is not used in this query also…

I observed the same behaviour while setting up something on the sample_training database of a M0 cluster. I do not know what to say.

Hopefully, someone with a deeper understanding can help us both.

Hi @Helene_ORTIZ

@steevej is absolutely correct that the projection stage prevents MongoDB from using the index. Since an index points to the physical location of a document, I believe once you have a stage that loses this physical connection (e.g. projection, grouping, etc.), indexes generally cannot be used anymore.

So I tried to create some example documents:

{"_id": 1,date: new Date("2021-01-01"), "lineNumber": 1, "id": "x", "cycle": "y", "type": "z"}
{"_id": 2,date: new Date("2021-01-02"), "lineNumber": 2, "id": "x", "cycle": "y", "type": "z"}
{"_id": 3,date: new Date("2021-01-03"), "lineNumber": 3, "id": "x", "cycle": "y", "type": "z"}
{"_id": 4,date: new Date("2022-01-04"), "lineNumber": 4, "id": "x", "cycle": "y", "type": "z"}
{"_id": 5,date: new Date("2022-01-05"), "lineNumber": 5, "id": "x", "cycle": "y", "type": "z"}

Then I tried implementing your aggregation (I think this should be similar to your code):

var agg = Arrays.asList(
    Aggregates.match(
        Filters.and(
            Filters.gte("date", LocalDate.of(2021, 1, 1)),
            Filters.lt("date", LocalDate.of(2022, 1, 1)))),
    Aggregates.sort(
        Sorts.orderBy(
            Sorts.descending("date"), Sorts.ascending("lineNumber"))),
    Aggregates.project(
        Projections.fields(
            Projections.include("id", "cycle", "lineNumber", "date")))
);

coll.aggregate(agg).forEach(doc -> System.out.println(doc.toJson()));

and I found that the index was used in this case.

To ensure that the index is really used, I used the explain output code taken from the java driver documentation:

var explainResult = coll.aggregate(agg).explain();
System.out.println(explainResult.toJson(JsonWriterSettings.builder().indent(true).build()));

The output of explain is discussed in the explain results documentation page, but in a nutshell, if the index is not used, then you should see a COLLSCAN somewhere. If the index is used, you should see IXSCAN instead, without any COLLSCAN anywhere. In my test using the code above, I only see IXSCAN as expected.

If you’re still having issues, could you post the output of your explain output, and some example documents?

Best regards
Kevin

1 Like

Hi Kevin,
The way you suggested to do the request does not change anything, the default index on _id is used instead of my index on descending date and ascending lineNumber.
“var” does not exist in Java and neither the method explain() .

Just a hunch.

Make sure you do not have a typo between the field names used in the index, the real field names in your documents vs the value of your MongoDBConstants.

When things do not work as advertised, more often than not, it is a very simple mistake that we overlook.

Hi @Helene_ORTIZ

The var keyword is available in Java 10 since 2018 (see Simplify Local Variable Type Definition Using the Java 10 var Keyword), but if your Java version does not support this, please modify the code to account for the correct type.

The explain() method is also available in the Java driver I used (version 4.3), due to its compatibility with the latest MongoDB version (5.0) (see Java driver compatibility list). I believe this method is available since Java driver version 4.2.

If you’re using older Java driver that do not have the explain() method, you can still observe the explain plan output in the mongo shell using db.collection.explain().aggregate(…). Basically the Java driver’s explain() is calling exactly the same thing in the server, so you’ll see the same information there.

If you’re still having issues with this, please post the explain(‘executionStats’) output from the mongo shell. It can also be done in the Java driver using the explainVerbosity enum. Please also post your versions of Java, Java driver, and MongoDB server.

Best regards
Kevin

Hi again !
Until now, my project used Java 1.8.
I updated to Java 10 and also MongoDB java driver from 3.12.10 to 4.3.4 (I had to include mongodb-driver-sync, mongodb-bson and mongodb-driver-core).

Following this update some requests does not work any more…

For example this one:

AggregateIterable<Document> parameters = valuesCollection.aggregate(Arrays.asList(
				match(eq(proposno, 14)), 
				group(and(eq(ctype, "$" + ctype),
						eq(cname, "$" + cname),
						eq(pname, "$" + pname),
						eq(alias, "$" + alias),
						eq(unit, "$" + unit)),
						sum(nvalues, "$" + nvalues)
						),
				sort(orderBy(ascending(_id + "." + ctype),
						ascending(_id + "." + cname),
						ascending(_id + "." + alias),
						ascending(_id + "." + pname)))));

no longer returns a set of documents but only one:
Document{{_id=true, nvalues=41834}}

MongoDBCompass now generates the Java code below:

Arrays.asList(new Document("$match", 
    new Document("propid", 54893L)), 
    new Document("$group", 
    new Document("_id", 
    new Document("ctype", "$ctype")
                .append("cname", "$cname")
                .append("pname", "$pname")
                .append("alias", "$alias")
                .append("unit", "$unit")
                .append("nbValuesTotal", 
    new Document("$sum", "$nvalues")))))

Is it the right way to do it now ?
Thanks!

I managed to fix the problem thanks to this post:

I replaced the and operator by 'fields` operator.

1 Like

Hi again Kevin,

Now that I use Java 10 and the last MongoDB java drivers the index is finally used!
Here is an extract of the explain() output:

"queryHash": "D7F7D6F6",
          "planCacheKey": "A44264DE",
          "winningPlan": {
            "stage": "FETCH",
            "inputStage": {
              "stage": "IXSCAN",
              "keyPattern": {
                "date": -1,
                "lineNumber": 1
              },

Thanks a lot !!!

2 Likes