Definition
New in version 8.3.
$similarityDotProduct
Returns the dot product of two numeric vectors represented as arrays or
binData values. The dot product equals the sum of the products of corresponding elements.
$similarityDotProduct has two syntax forms.
Concise syntax returns a raw dot product score:
{ $similarityDotProduct: [ <vector1>, <vector2> ] }
Full syntax accepts an optional normalization parameter:
{ $similarityDotProduct: { vectors: [ <vector1>, <vector2> ], score: <boolean> } }
When using the full syntax,
$similarityDotProduct accepts the following fields:

| Field | Type | Necessity | Description |
| --- | --- | --- | --- |
| vectors | Array | Required | Array of exactly two expressions. Each expression must resolve to an array of numeric values or a binData value. Both vectors must have equal length. |
| score | Boolean | Optional | When true, returns a normalized score using the formula (1 + dotProduct) / 2. Defaults to false. |

For more information on expressions, see Expressions.
Behavior
null and Missing Values
If either argument resolves to null or refers to a missing
field, $similarityDotProduct returns null.
Return Value
$similarityDotProduct returns a double. When
score is false (the default), the result is the raw dot
product. The raw value scales with the magnitudes of the input vectors:
for a fixed angle between the vectors, the dot product grows in
proportion to their magnitudes.
When score is true, the result is normalized using the
formula (1 + dotProduct) / 2. This normalization assumes
unit-length (normalized) input vectors. For unit-length vectors,
the raw dot product is in the range [-1, 1] and the normalized
score is in the range [0, 1].
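The relationship between the raw and normalized results can be sketched in plain JavaScript. This is only an illustration of the arithmetic described above, not the server's implementation; the helper names dotProduct and normalizedScore are hypothetical.

```javascript
// Illustrative helpers (hypothetical names); plain JavaScript, not a MongoDB API.

// Raw dot product: sum of the products of corresponding elements.
function dotProduct(v1, v2) {
  let sum = 0;
  for (let i = 0; i < v1.length; i++) {
    sum += v1[i] * v2[i];
  }
  return sum;
}

// Normalized score, using the formula documented for score: true.
function normalizedScore(v1, v2) {
  return (1 + dotProduct(v1, v2)) / 2;
}

// Unit-length vectors: the raw value stays in [-1, 1], the score in [0, 1].
const x = [1, 0];
const y = [0, 1];
const negX = [-1, 0];

console.log(dotProduct(x, x), normalizedScore(x, x));       // 1 1
console.log(dotProduct(x, y), normalizedScore(x, y));       // 0 0.5
console.log(dotProduct(x, negX), normalizedScore(x, negX)); // -1 0
```

Identical unit vectors score 1, orthogonal ones 0.5, and opposite ones 0, which matches the [0, 1] range stated above.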
Errors
$similarityDotProduct returns an error in the
following cases:
Either argument does not resolve to an array or binData value.
Input arrays or binData values have different lengths.
Either array contains non-numeric elements.
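The null-handling and error rules above can be mirrored in a small plain-JavaScript sketch. It covers array inputs only (binData is omitted), the function name and error messages are hypothetical, and the order of the checks is an assumption; it is not the server's implementation.

```javascript
// Hypothetical helper mirroring the documented rules for array inputs.
// Assumption: the null check runs before type validation; the source
// does not state the order.
function similarityDotProductSketch(v1, v2) {
  // A null or missing (undefined here) input yields null.
  if (v1 == null || v2 == null) return null;

  if (!Array.isArray(v1) || !Array.isArray(v2)) {
    throw new TypeError("both arguments must be arrays");
  }
  if (v1.length !== v2.length) {
    throw new RangeError("vectors must have equal length");
  }

  let sum = 0;
  for (let i = 0; i < v1.length; i++) {
    if (typeof v1[i] !== "number" || typeof v2[i] !== "number") {
      throw new TypeError("vectors must contain only numeric elements");
    }
    sum += v1[i] * v2[i];
  }
  return sum;
}

console.log(similarityDotProductSketch([1, 2, 3], [4, 5, 6])); // 32
console.log(similarityDotProductSketch(null, [1, 2, 3]));      // null

try {
  similarityDotProductSketch([1, 2], [1, 2, 3]);
} catch (err) {
  console.log(err.message); // vectors must have equal length
}
```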
Example
The following example uses a vectors collection:
db.vectors.insertMany( [
   { _id: 1, a: [ 1, 2, 3 ], b: [ 1, 2, 3 ] },
   { _id: 2, a: [ 1, 2, 3 ], b: [ 3, 2, 1 ] },
   { _id: 3, a: [ 1, 2, 3 ], b: [ 4, 5, 6 ] }
] )
The following aggregation pipeline computes the dot product between
the a and b fields for each document and returns both the
raw score and the normalized score:
db.vectors.aggregate( [
   {
      $project: {
         raw: { $similarityDotProduct: [ "$a", "$b" ] },
         normalized: {
            $similarityDotProduct: { vectors: [ "$a", "$b" ], score: true }
         }
      }
   }
] )
The operation returns the following results:
{ _id: 1, raw: 14, normalized: 7.5 }
{ _id: 2, raw: 10, normalized: 5.5 }
{ _id: 3, raw: 32, normalized: 16.5 }
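The normalized values in these results fall outside [0, 1] because the sample vectors are not unit length. For document 1, the raw dot product is 1*1 + 2*2 + 3*3 = 14, so (1 + 14) / 2 = 7.5. A minimal plain-JavaScript sketch, assuming each vector is scaled to unit length before scoring, shows how the score then stays in range. The helper names are hypothetical and this is not a MongoDB API.

```javascript
// Hypothetical helpers for illustration; plain JavaScript, not a MongoDB API.

// Scale a vector to unit length (assumes a non-zero vector).
function toUnit(v) {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
}

// Sum of the products of corresponding elements.
function dotProduct(v1, v2) {
  return v1.reduce((sum, x, i) => sum + x * v2[i], 0);
}

const a = [1, 2, 3];
const b = [3, 2, 1];

// Raw values match document 2 in the example: raw 10, normalized 5.5.
const raw = dotProduct(a, b);
console.log(raw, (1 + raw) / 2); // 10 5.5

// After scaling to unit length, the score falls within [0, 1].
const unitRaw = dotProduct(toUnit(a), toUnit(b));
const unitScore = (1 + unitRaw) / 2;
console.log(unitScore >= 0 && unitScore <= 1); // true
```

Whether to scale vectors in the application or store them already normalized is a design choice that depends on how the embeddings are produced.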