Hi @Gorkem_Erdogan
A timeseries collection is quite different from a normal MongoDB collection. Although it superficially behaves like a normal collection, MongoDB treats time series collections as writable non-materialized views on internal collections that automatically organize time series data into an optimized storage format on insert (see Time Series Collections: Behavior).
For this reason, indexing a time series collection involves creating an index in the underlying internal collection, instead of creating it on the visible collection. There are index types that are unsupported at this time: TTL, partial, and unique (see Time Series Collection Limitations: Secondary Indexes).
For example, let’s create a new timeseries collection:
> db.createCollection("test", { timeseries: { timeField: "timestamp" } } )
then create a document to insert:
> doc = {_id:0, timestamp: new Date()}
and let’s insert three of those into the collection:
> db.test.insertOne(doc)
> db.test.insertOne(doc)
> db.test.insertOne(doc)
if you then look at the contents of the collection, all three documents are present with identical content, including the same _id:
> db.test.find()
[
{ timestamp: ISODate("2021-12-03T08:43:50.503Z"), _id: 0 },
{ timestamp: ISODate("2021-12-03T08:43:50.503Z"), _id: 0 },
{ timestamp: ISODate("2021-12-03T08:43:50.503Z"), _id: 0 }
]
however, if you check the collection list, there is a mystery collection there:
> show collections
test [time-series]
system.buckets.test
system.views
if you delve into the mystery collection, you’ll see how the test collection is actually stored:
> db.system.buckets.test.find()
[
{
_id: ObjectId("61a9d8947dfd3e5b32de6144"),
control: {
version: 1,
min: { _id: 0, timestamp: ISODate("2021-12-03T08:43:00.000Z") },
max: { _id: 0, timestamp: ISODate("2021-12-03T08:43:50.503Z") }
},
data: {
_id: { '0': 0, '1': 0, '2': 0 },
timestamp: {
'0': ISODate("2021-12-03T08:43:50.503Z"),
'1': ISODate("2021-12-03T08:43:50.503Z"),
'2': ISODate("2021-12-03T08:43:50.503Z")
}
}
}
]
so the test collection is just a view onto the actual system.buckets.test collection. Inside the underlying collection, the three documents are stored in a single “bucket”. Because indexes are created on this underlying collection, where one bucket holds many of your documents, you currently cannot create a unique index on timeseries data.
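To make the bucket layout more concrete, here is a minimal sketch in plain JavaScript of how several documents could be packed into one column-oriented bucket and unpacked again. This is only an illustration of the shape shown in the output above, not MongoDB's actual implementation: packBucket and unpackBucket are hypothetical helper names, and the sketch skips details such as the rounding of control.min.timestamp.

```javascript
// Simplified illustration only; this is NOT MongoDB's real bucketing code.
// packBucket and unpackBucket are hypothetical helper names.

function packBucket(docs) {
  const control = { version: 1, min: {}, max: {} };
  const data = {};
  docs.forEach((doc, i) => {
    for (const [field, value] of Object.entries(doc)) {
      // Column-oriented storage: one sub-document per field, keyed by position.
      if (!(field in data)) data[field] = {};
      data[field][String(i)] = value;
      // Track the per-field minimum and maximum across the bucket.
      // (The real server also rounds the minimum timestamp down; omitted here.)
      if (!(field in control.min) || value < control.min[field]) control.min[field] = value;
      if (!(field in control.max) || value > control.max[field]) control.max[field] = value;
    }
  });
  return { control, data };
}

function unpackBucket(bucket) {
  // Rebuild one document per position, which is roughly what the view does on read.
  const docs = [];
  for (const [field, column] of Object.entries(bucket.data)) {
    for (const [pos, value] of Object.entries(column)) {
      const i = Number(pos);
      if (!docs[i]) docs[i] = {};
      docs[i][field] = value;
    }
  }
  return docs;
}

const ts = new Date("2021-12-03T08:43:50.503Z");
const doc = { _id: 0, timestamp: ts };
const bucket = packBucket([doc, doc, doc]);

console.log(Object.keys(bucket.data._id).length); // 3 measurements inside one stored document
console.log(unpackBucket(bucket).length);         // 3 documents when read back
```

An index built on the stored side of this sketch would see a single bucket document, not the three measurements inside it, which is the intuition behind the unique index limitation.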
In conclusion, a timeseries collection is a special collection type that is basically a view into a special underlying collection, so it behaves differently from a normal MongoDB collection. This lets MongoDB manage the storage of timeseries documents, which would otherwise be quite expensive to store as regular MongoDB documents. However, this capability also comes with some caveats, namely the unique index limitation that you came across.
Having said that, if you feel that having a secondary unique index is a must, you can create the collection in the normal manner, at the cost of losing the compactness of the timeseries collection storage. If the features you lose by using timeseries are important to your use case, I suggest benchmarking your workload to check whether you can manage with a normal collection to store your data.
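As a rough sketch of that alternative, the snippet below creates a regular collection and puts a unique index on the timestamp field. The collection name "readings" and the function name are hypothetical, and the choice of index key is an assumption; pick whichever field or combination of fields must be unique in your data. It is wrapped in a function that takes the mongosh db object, so you would call it as createRegularWithUniqueIndex(db).

```javascript
// Sketch only: "readings" is a hypothetical collection name, and
// { timestamp: 1 } is an assumed index key; adjust both to your data.
function createRegularWithUniqueIndex(db) {
  // A regular collection: no `timeseries` option, so no bucketing.
  db.createCollection("readings");
  // Unique secondary indexes are supported on regular collections.
  return db.getCollection("readings").createIndex({ timestamp: 1 }, { unique: true });
}
```

With this in place, inserting a second document with the same timestamp is rejected with a duplicate key error, unlike the timeseries example earlier where all three identical documents were accepted.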
Hopefully this is useful.
Best regards,
Kevin