Attribute Pattern - Should I limit the number of attributes?

When using the attribute pattern should I be mindful of the number of attributes I create per document to ensure good performance? And what impact does a larger number of attributes have on write and query performance?

My use case is that I would like my users to be able to filter data based on averages over the past number of years but balance this with maintaining good performance. Will there be a significant impact on performance between having 3 attributes and 12 attributes in an indexed field given the number of index keys increases?

Existing document (simplified version) - Yearly data

{
    "name": "Product Name",
    "price": {
        "Avg365": 10,
        "Avg730": 5,
        "Avg1095": 15
    }
}

Propose new document (simplified version) - Every 90-days

{
    "name": "Product Name",
    "price": {
        "Avg90": 4,
        "Avg180": 8,
        "Avg270": 9,
        "Avg365": 10,
        "Avg455": 10,
        "Avg545": 6,
        "Avg635": 8,
        "Avg730": 5,
        "Avg820": 10,
        "Avg910": 12,
        "Avg1000": 18,
        "Avg1095": 15
    }
}

In these cases I have an index on “price” with queries such as:
{ price: {$elemMatch: { "k": "Avg365", "v": { $lte: 8, $gte: 1 } } } }

Like any design pattern you should be mindful in apply it. Otherwise you might end up implementing an anti-pattern.

The following thread is still in my bookmarks because I have not had the time to take a serious look at it.

Hi @Callum_Boyd,

The proposed new document is the anti-pattern that leads to the Attribute Pattern.

Better document would be:

{
    "name": "Product Name",
    "average_prices": [
        {"days": 90, "price": 4},
        {"days": 180, "price": 8},
        {"days": 270, "price": 9},
        {"days": 365, "price": 10},
        {"days": 455, "price": 10},
        {"days": 545, "price": 6},
        {"days": 635, "price": 8},
        {"days": 730, "price": 5},
        {"days": 820, "price": 10},
        {"days": 910, "price": 12},
        {"days": 1000, "price": 18},
        {"days": 1095, "price": 15}
    ]
}

Also your query doesn’t work with that document (no “k” and no “v” keys). And $elemMatch only works on an array.

Index on my doc would be {"average_prices.days": 1, "average_prices.price": 1}

This index would be used by this query:

db.coll.find({
  average_prices: {
    $elemMatch: {
      days: 270, 
      price: {$gt: 8, $lt: 10}
    }
  }
})

This is what the winning plan from an explain looks like:

winningPlan: {
      stage: 'FETCH',
      filter: {
        average_prices: {
          '$elemMatch': {
            '$and': [
              { days: { '$eq': 270 } },
              { price: { '$lt': 10 } },
              { price: { '$gt': 8 } }
            ]
          }
        }
      },
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { 'average_prices.days': 1, 'average_prices.price': 1 },
        indexName: 'average_prices.days_1_average_prices.price_1',
        isMultiKey: true,
        multiKeyPaths: {
          'average_prices.days': [ 'average_prices' ],
          'average_prices.price': [ 'average_prices' ]
        },
        isUnique: false,
        isSparse: false,
        isPartial: false,
        indexVersion: 2,
        direction: 'forward',
        indexBounds: {
          'average_prices.days': [ '[270, 270]' ],
          'average_prices.price': [ '(8, 10)' ]
        }
      }
    }

IXSCAN => FETCH is the best we can do here.

About the performances, the index will add one extra entry for each new element in the array (== each new average price). So if you have 1M docs and each have 10 prices => 10M entries in the index.
It’s not really a problem as long as you have enough RAM to support it but it’s going to be a problem if you decide to have 1000 prices in each doc.

This also means that each insert in this collection will now generate 1 write operation in the collection and 10 write ops in the index. Nothing alarming here.

Cheers,
Maxime.

2 Likes

Thanks @MaBeuLux88 !

Apologies for the error in my post, I tried to simplify too far. My document looks more like:

{
    "name": "Product Name",
    "price": [
        {"k": "Avg365", "v": 10},
        {"k": "Avg730", "v": 5},
        {"k": "Avg1095", "v": 15}
    ]
}

Also thank you for your comments on the performance side, its reassuring to know the impact won’t be huge provided the price options do not grow excessively

1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.