$similarityEuclidean (expression operator)

Definition

New in version 8.3.

$similarityEuclidean

Returns the Euclidean distance between two numeric vectors represented as arrays or binData values. Euclidean distance measures the straight-line distance between two points in vector space.
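The computation described above can be sketched in plain JavaScript. This is an illustrative reference, not the server implementation: the function name `euclideanDistance` is hypothetical, and the sketch handles array inputs only (binData vectors are not modeled).

```javascript
// Hypothetical reference sketch; not the server's implementation.
// Computes sqrt( sum_i (u[i] - v[i])^2 ) for two equal-length arrays.
function euclideanDistance(u, v) {
  if (u.length !== v.length) {
    throw new RangeError("vectors must have equal length");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < u.length; i++) {
    const diff = u[i] - v[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// A 3-4-5 right triangle: the straight-line distance is exactly 5.
const distance = euclideanDistance([0, 0], [3, 4]);
console.log(distance); // 5
```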

$similarityEuclidean has two syntax forms.

The concise syntax returns the raw Euclidean distance:

{ $similarityEuclidean: [ <vector1>, <vector2> ] }

The full syntax accepts an optional score field that normalizes the result:

{
   $similarityEuclidean: {
      vectors: [ <vector1>, <vector2> ],
      score: <boolean>
   }
}

When using the full syntax, $similarityEuclidean accepts the following fields:

Field	Type	Necessity	Description
`vectors`	Array	Required	Array of exactly two expressions. Each expression must resolve to an array of numeric values or a `binData` value. Both vectors must have equal length.
`score`	Boolean	Optional	When `true`, returns a normalized score in the range `(0, 1]` using the formula `1 / (1 + distance)`. Identical vectors produce a score of `1`. Defaults to `false`.

For more information on expressions, see Expressions.

Behavior

`null` and Missing Values

If either argument resolves to null or refers to a missing field, $similarityEuclidean returns null.
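To illustrate the null propagation described above, the following plain JavaScript sketch emulates it on in-memory documents. The helper `distanceOrNull` is hypothetical and only mimics the documented behavior; a missing field appears as `undefined` in JavaScript, so a loose `== null` comparison covers both cases.

```javascript
// Hypothetical emulation of the documented null handling; array inputs only.
function distanceOrNull(u, v) {
  // `== null` is true for both null and undefined (a missing field).
  if (u == null || v == null) {
    return null;
  }
  let sumOfSquares = 0;
  for (let i = 0; i < u.length; i++) {
    sumOfSquares += (u[i] - v[i]) ** 2;
  }
  return Math.sqrt(sumOfSquares);
}

const sampleDocs = [
  { _id: 1, a: [1, 1], b: [4, 5] },
  { _id: 2, a: [1, 1], b: null },
  { _id: 3, a: [1, 1] } // b is missing
];

const nullResults = sampleDocs.map((doc) => ({
  _id: doc._id,
  raw: distanceOrNull(doc.a, doc.b)
}));
console.log(nullResults);
// [ { _id: 1, raw: 5 }, { _id: 2, raw: null }, { _id: 3, raw: null } ]
```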

Return Value

$similarityEuclidean returns a double. When score is false (the default), the result is the raw Euclidean distance, which is always greater than or equal to 0. A distance of 0 means the vectors are identical. Larger values indicate greater dissimilarity.

When score is true, the result is normalized to the range (0, 1] using the formula 1 / (1 + distance):

1 indicates the vectors are identical (distance is 0).
Values approaching 0 indicate greater dissimilarity.
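The normalization formula can be checked with a short plain JavaScript sketch. The function name `toScore` is hypothetical. It applies `1 / (1 + distance)` to a few sample distances to show how the score falls from 1 toward 0 as the distance grows.

```javascript
// Hypothetical helper: maps a raw distance in [0, Infinity) to a score in (0, 1].
function toScore(distance) {
  return 1 / (1 + distance);
}

// Larger distances give smaller scores; a distance of 0 gives exactly 1.
const scores = [0, 1, 3, 9].map(toScore);
console.log(scores); // [ 1, 0.5, 0.25, 0.1 ]
```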

Errors

$similarityEuclidean returns an error in the following cases:

Either argument does not resolve to an array or binData value.
Input arrays or binData values have different lengths.
Either array contains non-numeric elements.
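The error conditions listed above can be mirrored by a small validator in plain JavaScript, for example to check inputs on the client before running a pipeline. This is a sketch under assumptions: `validateVectors` and `checkResult` are hypothetical names, the error messages are illustrative and are not the server's messages, the order of the checks is a choice made here, and only array inputs are modeled (binData values are not).

```javascript
// Hypothetical client-side validator; messages and check order are assumptions.
function validateVectors(u, v) {
  for (const vec of [u, v]) {
    if (!Array.isArray(vec)) {
      throw new TypeError("argument must be an array");
    }
    if (!vec.every((x) => typeof x === "number")) {
      throw new TypeError("array elements must be numeric");
    }
  }
  if (u.length !== v.length) {
    throw new RangeError("vectors must have equal length");
  }
}

// Returns "ok" or the validation message, so each case can be printed.
function checkResult(u, v) {
  try {
    validateVectors(u, v);
    return "ok";
  } catch (err) {
    return err.message;
  }
}

console.log(checkResult([1, 2, 3], [4, 5, 6])); // ok
console.log(checkResult("abc", [4, 5, 6]));     // argument must be an array
console.log(checkResult([1, 2], [4, 5, 6]));    // vectors must have equal length
console.log(checkResult([1, "x"], [4, 5]));     // array elements must be numeric
```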

Example

The following example uses a vectors collection:

db.vectors.insertMany( [
   { _id: 1, a: [1, 2, 3], b: [1, 2, 3] },
   { _id: 2, a: [1, 2, 3], b: [3, 2, 1] },
   { _id: 3, a: [1, 2, 3], b: [4, 5, 6] }
] )

The following aggregation pipeline computes the Euclidean distance between the a and b fields for each document and returns both the raw distance and the normalized score:

db.vectors.aggregate( [
   {
      $project: {
         raw: { $similarityEuclidean: [ "$a", "$b" ] },
         normalized: {
            $similarityEuclidean: {
               vectors: [ "$a", "$b" ],
               score: true
            }
         }
      }
   }
] )

The operation returns the following results:

{ _id: 1, raw: 0, normalized: 1 }
{ _id: 2, raw: 2.8284271247461903,
  normalized: 0.2612038749637415 }
{ _id: 3, raw: 5.196152422706632,
  normalized: 0.16139047779640892 }
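
As a check on these results, the values for documents 2 and 3 can be worked by hand from the definition of Euclidean distance and the score formula 1 / (1 + distance). The symbols d and s are notation chosen here for the raw distance and the normalized score of the document with the given _id. Values are rounded to four decimal places.

```latex
\begin{align*}
d_2 &= \sqrt{(1-3)^2 + (2-2)^2 + (3-1)^2} = \sqrt{8} \approx 2.8284,
  & s_2 &= \frac{1}{1 + d_2} \approx 0.2612 \\
d_3 &= \sqrt{(1-4)^2 + (2-5)^2 + (3-6)^2} = \sqrt{27} \approx 5.1962,
  & s_3 &= \frac{1}{1 + d_3} \approx 0.1614
\end{align*}
```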
