Can't create a unique index that ignores nulls in MongoDB

MongoDB’s semantics of null is not the same as the semantics of null in SQL databases. As a consequent, unique indexes in MongoDB will raise uniqueness violation error when more than one document has null for the field that has unique index. Furthermore, there appears to be no way to create a unique index where nulls are ignored.

MongoDB seems to have adopted the null value semantics of Java where null is a literal value assigned to variables of reference type to indicate that it has no referent. Thus, in Java null == null is true.

However in the database world, null could have many meanings including indicating that the value of an attribute is unknown. Since two unknowns cannot be compared, null == null should be null, i.e., it is unknown if two unknowns are the same. With this semantics, two null values in a unique index are never equal in that null==null is not true. Thus unique indexes will allow multiple null values.

This is an issue in migrating SQL database to MongoDB.

Hi @Rafiul_Ahad ,

You should have a look at sparse indexes and, for even more flexibility, partial indexes.

Regards,
Jeff

1 Like

I did. And nothing worked so far.

From indexing - Unique index in mongoDB 3.2 ignoring null values - Stack Overflow, the following works for unique index for field of type String with nulls
IndexOptions indexOptions = new IndexOptions().unique(true).partialFilterExpression(Filters.eq(“fieldName”,Filters.eq("$type",“string”)));
but the note says:
"Just a note from my testing for anyone else looking for the same solution: although this worked as a mechanism for enforcing uniqueness while allowing nulls, Mongo didn’t use the index for lookup queries using the indexed field and using hint() to force it resulted in very slow performance

Pete S "

Apr 20, 2018 at 10:25

Welcome to the MongoDB Community @Rafiul_Ahad!

Null values are a placeholder for unknown values in both cases, but semantics for comparison and indexing may differ depending on the databases you are comparing. For example, SQL Server used to (and may still) disallow multiple nulls in a unique index by default – you have to create a filtered unique index (analogous to a partial index in MongoDB).

I think the most straightforward way to ignore null values for the purposes of a unique index would be to avoid serialising null property values. For example, if you are using the MongoDB Java driver’s POJO support this happens by default (see Serialization Customization).

For more information on null semantics in MongoDB, please refer to Query for Null or Missing Fields.

It seems like you may have found an acceptable solution for your use case.

If that isn’t the case or you are looking for further advice on this topic, please provide some more information on your environment:

  • version of Java driver (and ODM like Spring data, if applicable)
  • version of MongoDB server
  • how you created the unique index
  • the error message or unexpected behaviour you are observing when trying to insert a document with null values

Per the answer you referenced from Stack Overflow, you can also use a partialFilterExpression matching on type to index a subset of documents based on field type (eg those where the fieldName value is a string ). However, as mentioned in the Partial Index Query Coverage documentation a partial index filter will not be chosen by the query planner unless a query includes the same filters:

MongoDB will not use the partial index for a query or sort operation if using the index results in an incomplete result set.

To use the partial index, a query must contain the filter expression (or a modified filter expression that specifies a subset of the filter expression) as part of its query condition.

Regards.
Stennie

1 Like

Hi Stennie,
Thank your for your response.
My current workaround is just that, a workaround to be able to create unique indexes on fields that have nulls in them. Since it comes with a performance penalty, it is not acceptable.
Let me get on the soapbox for a moment. Indexes are physical database constructs. They should never be seen by the application. Asking an application to include the filter used in the index in their query is not acceptable.
Without divulging much about what I am doing, let’s say that an application is programmatically generating data for insertion into MongoDB or MySQL database. It is also translating queries from a high level language to database query language. For MongoDB, this application will have to deal with this MongoDB idiosyncrasy adding to the complexity of the code.
To answer your questions, I am using MongoDB server version 6.0.0 and the application is using MongoDB Java Sync driver version 4.7.1. Initially the unique indexes were created using the following option:
IndexOptions indexOptions = new IndexOptions().unique(true);
But that caused uniqueness violation exception so I included the partialFilterExpression that I mentioned in this thread.
Thanks.
-Rafiul