Definition
New in version 8.3.
$similarityEuclidean
Returns the Euclidean distance between two numeric vectors represented as arrays or binData values. Euclidean distance measures the straight-line distance between two points in vector space.

$similarityEuclidean has two syntax forms.

Concise syntax returns a raw Euclidean distance:

   { $similarityEuclidean: [ <vector1>, <vector2> ] }

Full syntax accepts an optional normalization parameter:

   { $similarityEuclidean: { vectors: [ <vector1>, <vector2> ], score: <boolean> } }

When using the full syntax, $similarityEuclidean accepts the following fields:

| Field | Type | Necessity | Description |
| vectors | Array | Required | Array of exactly two expressions. Each expression must resolve to an array of numeric values or a binData value. Both vectors must have equal length. |
| score | Boolean | Optional | When true, returns a normalized score in the range (0, 1] using the formula 1 / (1 + distance). Identical vectors produce a score of 1. Defaults to false. |

For more information on expressions, see Expressions.
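For reference, the straight-line distance described above is conventionally written as follows. The symbols u, v, and n are notation assumed here for illustration (two vectors of equal length n); they do not appear in the operator syntax.

```latex
d(u, v) = \sqrt{\sum_{i=1}^{n} (u_i - v_i)^2},
\qquad
\text{score}(u, v) = \frac{1}{1 + d(u, v)}
```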
Behavior
null and Missing Values
If either argument resolves to null or refers to a missing
field, $similarityEuclidean returns null.
Return Value
$similarityEuclidean returns a double. When
score is false (the default), the result is the raw
Euclidean distance, which is always greater than or equal to
0. A distance of 0 means the vectors are identical.
Larger values indicate greater dissimilarity.
When score is true, the result is normalized to the range
(0, 1] using the formula 1 / (1 + distance):
- 1 indicates the vectors are identical (distance is 0).
- Values approaching 0 indicate greater dissimilarity.
Errors
$similarityEuclidean returns an error in the
following cases:
- Either argument does not resolve to an array or binData value.
- Input arrays or binData values have different lengths.
- Either array contains non-numeric elements.
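The null handling and error cases described above can be mirrored in a client-side check, for example to validate vectors before inserting them. The sketch below is an assumption-laden approximation: similarityEuclideanSketch is a hypothetical function, it handles plain arrays only (not binData), and the error types and messages are invented for illustration and will not match the server's.

```javascript
// Hypothetical client-side approximation of the documented rules.
// Handles plain arrays only (no binData); not the server's implementation.
function similarityEuclideanSketch(a, b) {
  // A null or missing (undefined) input yields null rather than an error.
  if (a === null || a === undefined || b === null || b === undefined) {
    return null;
  }
  // Both inputs must be arrays.
  if (!Array.isArray(a) || !Array.isArray(b)) {
    throw new TypeError("both arguments must be arrays");
  }
  // Both vectors must have the same length.
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    // Every element must be numeric.
    if (typeof a[i] !== "number" || typeof b[i] !== "number") {
      throw new TypeError("vectors must contain only numbers");
    }
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

console.log(similarityEuclideanSketch(null, [1, 2, 3])); // null
```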
Example
The following example uses a vectors collection:
db.vectors.insertMany( [
   { _id: 1, a: [ 1, 2, 3 ], b: [ 1, 2, 3 ] },
   { _id: 2, a: [ 1, 2, 3 ], b: [ 3, 2, 1 ] },
   { _id: 3, a: [ 1, 2, 3 ], b: [ 4, 5, 6 ] }
] )
The following aggregation pipeline computes the Euclidean distance
between the a and b fields for each document and returns
both the raw distance and the normalized score:
db.vectors.aggregate( [
   {
      $project: {
         raw: { $similarityEuclidean: [ "$a", "$b" ] },
         normalized: {
            $similarityEuclidean: { vectors: [ "$a", "$b" ], score: true }
         }
      }
   }
] )
The operation returns the following results:
{ _id: 1, raw: 0, normalized: 1 }
{ _id: 2, raw: 2.8284271247461903, normalized: 0.2612038749637415 }
{ _id: 3, raw: 5.196152422706632, normalized: 0.16139702886038895 }
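As a sanity check, the results for the documents with _id 2 and _id 3 can be reproduced by hand from the definition of Euclidean distance and the 1 / (1 + distance) formula. The subscripts d_2, d_3 and s_2, s_3 are labels introduced here for the distance and score of each document; values are rounded to four decimal places.

```latex
d_2 = \sqrt{(1-3)^2 + (2-2)^2 + (3-1)^2} = \sqrt{8} \approx 2.8284,
\qquad
s_2 = \frac{1}{1 + \sqrt{8}} \approx 0.2612

d_3 = \sqrt{(1-4)^2 + (2-5)^2 + (3-6)^2} = \sqrt{27} \approx 5.1962,
\qquad
s_3 = \frac{1}{1 + \sqrt{27}} \approx 0.1614
```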