Data Federation: Partition Attribute Types not working as expected

I have an issue with defining S3 partition attributes for Atlas data federation.
Everything works fine when using “string” as the partition attribute type.
However, when using “objectid” or “isodate”, I cannot make the query work at all.
Is there a known issue with these attribute types or am I doing something wrong?

Example:

S3 Path:
s3://mybucket/tenant=mytenant/structureNode=5a3bbd16b6e14f53b0e92da2/day=2023-11-10/archived=2023-12-28T21:44:23.908Z/1.json.gz

Query:
db.cold.find({ tenant: "mytenant", structureNode: ObjectId("5a3bbd16b6e14f53b0e92da2") })

OR

db.cold.find({ tenant: "mytenant", structureNode: ObjectId("5a3bbd16b6e14f53b0e92da2"), day: { "$gt": ISODate("2023-11-09T23:55:22.941Z") } })

Data Federation Config for the queried collection:

{
  "name": "cold",
  "dataSources": [
    {
      "path": "/tenant={tenant string}/structureNode={structureNode objectid}/day={day isodate}/archived={archived isodate}/*",
      "storeName": "xxxxx"
    }
  ]
}
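
To make the intended mapping concrete, here is a minimal sketch in plain JavaScript of how the path template above would line up with the example S3 path. It is not Atlas code: parsePartitions and convert are hypothetical helpers, and the type handling (24-character hex check for objectid, Date parsing for isodate) is an assumption about what the conversion amounts to.

```javascript
// Hypothetical helpers for illustration only; not part of Atlas Data Federation.

// Converts a raw path segment value according to the declared attribute type.
function convert(raw, type) {
  switch (type) {
    case "string":
      return raw;
    case "objectid":
      // Assumption: an objectid attribute must be a 24-character hex string.
      if (!/^[0-9a-f]{24}$/i.test(raw)) {
        throw new Error(`not a 24-char hex ObjectId: ${raw}`);
      }
      return raw.toLowerCase(); // a real driver would wrap this in ObjectId(...)
    case "isodate": {
      const d = new Date(raw);
      if (Number.isNaN(d.getTime())) {
        throw new Error(`not an ISO-8601 date: ${raw}`);
      }
      return d;
    }
    default:
      throw new Error(`unsupported type: ${type}`);
  }
}

// Reads typed partition attributes from a concrete key using a path template.
function parsePartitions(template, key) {
  const attrPattern = /^([^=]+)=\{(\w+) (\w+)\}$/; // e.g. day={day isodate}
  const templateParts = template.split("/").filter(Boolean);
  const keyParts = key.split("/").filter(Boolean);
  const result = {};
  templateParts.forEach((part, i) => {
    const m = attrPattern.exec(part);
    if (!m) return; // skips the trailing "*" wildcard
    const [, prefix, name, type] = m;
    const segment = keyParts[i] ?? "";
    if (!segment.startsWith(prefix + "=")) {
      throw new Error(`segment ${i} does not start with ${prefix}=`);
    }
    result[name] = convert(segment.slice(prefix.length + 1), type);
  });
  return result;
}

const attrs = parsePartitions(
  "/tenant={tenant string}/structureNode={structureNode objectid}/day={day isodate}/archived={archived isodate}/*",
  "tenant=mytenant/structureNode=5a3bbd16b6e14f53b0e92da2/day=2023-11-10/archived=2023-12-28T21:44:23.908Z/1.json.gz"
);
console.log(attrs.day.toISOString()); // 2023-11-10T00:00:00.000Z
```

With this reading, the path itself is well-formed for all four attributes, so the path syntax alone would not explain the failing queries.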

When I change structureNode and day to the string type in the Data Federation config above, the queries work. I read the docs on the topic and cannot find anything wrong with our config.

Edit: I just checked: the issue with the objectid attribute is that the documents contain a field with the same name as the partition attribute. It looks like the document content takes precedence over the partition attribute. After changing the path config to /tenant={tenant string}/structureNode={structureNodeId objectid}/day={day isodate}/archived={archived isodate}/* and using structureNodeId in the query, at least the first query works fine. I also made sure that the other attribute, day, is not part of the documents’ content.

Edit: I suspect there is an issue with the ISODate conversion. I tried the following syntax to make the conversion more explicit: day={day isodate('2023-11-10'):\\d{4}-\\d{2}-\\d{2}}. Unfortunately, that leads to the error: MongoServerError: an internal error occurred.

When I query without the day field, I see that the returned documents have the day field set as expected, i.e. day: ISODate('2023-11-10T00:00:00.000Z'). I just cannot query for it. However, using the day field to narrow down the amount of data scanned from S3 is crucial to my architecture.

I finally solved the issue: the day attribute of the S3 path partition was also shadowed by a day field defined in the documents. After changing the field name in the documents to dayAsStr, the query works.
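
Since both problems came down to a name clash, a quick check of the path template against a sample document can catch it early. The following is a minimal sketch in plain JavaScript; findShadowedAttributes is a hypothetical helper, the sample documents are made up for illustration, and it only handles the simple {name type} attribute form used above.

```javascript
// Hypothetical helper: lists partition attribute names from a path template
// that also occur as top-level keys in a sample document.
function findShadowedAttributes(template, sampleDoc) {
  // Matches the simple "{name type}" form, e.g. {day isodate}.
  const names = [...template.matchAll(/\{(\w+) \w+\}/g)].map((m) => m[1]);
  return names.filter((name) =>
    Object.prototype.hasOwnProperty.call(sampleDoc, name)
  );
}

// Made-up sample document resembling the original situation.
const before = findShadowedAttributes(
  "/tenant={tenant string}/structureNode={structureNode objectid}/day={day isodate}/archived={archived isodate}/*",
  { structureNode: "5a3bbd16b6e14f53b0e92da2", day: "2023-11-10", value: 42 }
);
console.log(before); // [ 'structureNode', 'day' ]

// After renaming the partition attribute and the document field.
const after = findShadowedAttributes(
  "/tenant={tenant string}/structureNode={structureNodeId objectid}/day={day isodate}/archived={archived isodate}/*",
  { structureNode: "5a3bbd16b6e14f53b0e92da2", dayAsStr: "2023-11-10", value: 42 }
);
console.log(after); // []
```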

What still does not work, however, is the data type definition with a format, isodate('2023-11-10'). Every time I set it, Data Federation fails with an “internal error”. I guess this is a minor issue, since the queries work now.

However, if someone could explain to me why the isodate format does not work, I would be very thankful.

After all, according to [1], specifying a format could speed up our queries a little. It reads: "If you wish to specify a format, which improves performance, you must use special values to indicate the exact position of the attributes in the date such as day (02), month (01), year (2006), etc."

[1] https://www.mongodb.com/docs/atlas/data-federation/supported-unsupported/supported-partition-attributes/#supported-partition-attribute-types (key: isodate)
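
The special values in that passage (02, 01, 2006) resemble Go's reference-time layout convention, where each token marks a position in the string rather than being a sample value. The sketch below is plain JavaScript that illustrates only that idea; layoutToRegex and parseWithLayout are hypothetical helpers, and whether Atlas expects something like isodate('2006-01-02') for this path is an untested assumption, not something the quoted docs confirm.

```javascript
// Hypothetical illustration of a reference-date layout; not Atlas code.
const TOKENS = {
  "2006": "(?<year>\\d{4})",
  "01": "(?<month>\\d{2})",
  "02": "(?<day>\\d{2})",
};

// Turns a layout such as "2006-01-02" into an anchored regular expression.
function layoutToRegex(layout) {
  if (!["2006", "01", "02"].every((t) => layout.includes(t))) {
    throw new Error("layout must contain the tokens 2006, 01 and 02");
  }
  // Escape regex metacharacters, then substitute the positional tokens.
  const escaped = layout.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped.replace(/2006|01|02/g, (t) => TOKENS[t]);
  return new RegExp(`^${pattern}$`);
}

// Parses a value with the given layout; returns null if it does not match.
function parseWithLayout(layout, value) {
  const m = layoutToRegex(layout).exec(value);
  if (!m) return null;
  const { year, month, day } = m.groups;
  return new Date(Date.UTC(Number(year), Number(month) - 1, Number(day)));
}

console.log(parseWithLayout("2006-01-02", "2023-11-10").toISOString());
// 2023-11-10T00:00:00.000Z
```

Under this reading, a layout of '2023-11-10' contains none of the positional tokens (the sketch rejects it with an error), which might be related to the internal error, though that is a guess.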

Edit: I have a suggestion for improving the docs mentioned above: make it clearer that partition attribute names can clash with document field names, which can cause confusing, hard-to-debug problems.

Hi Martin,

Apologies for the delayed reply; I’m glad you got it working!

I’ll follow up with our docs team on your suggestion and see if I can get an answer on the isodate question.

Thanks!
Irwin