Calculating similarity the same way as a vector search

I’d like to manually calculate the vectorSearchScore from embeddings. I have a cosine similarity function that I think is correct, but it gives very different results from the search scores returned by vector search on MongoDB. Is MongoDB using a proprietary algorithm or applying some kind of weighting/scaling, or is my function wrong?

The embeddings I’m using are the raw data returned by OpenAI using the text-embedding-3-small model.

  public static cosineSimilarity(A: number[], B: number[]): number {
    let dotProduct = 0.0;
    let normA = 0.0;
    let normB = 0.0;

    for (let i = 0; i < A.length; i++) {
      dotProduct += A[i] * B[i];
      normA += A[i] ** 2;
      normB += B[i] ** 2;
    }

    normA = Math.sqrt(normA);
    normB = Math.sqrt(normB);

    if (normA === 0 || normB === 0) {
      return 0;
    } else {
      return dotProduct / (normA * normB);
    }
  }

Hi Will,

Thanks for submitting the question. We do normalize the cosine/dot product scores following the formula described here.
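
For anyone reading along, here is a minimal sketch of what that normalization could look like in TypeScript. It assumes the cosine formula given in the Atlas Vector Search documentation, score = (1 + cosine(v1, v2)) / 2, which maps the raw [-1, 1] similarity onto [0, 1]. The function names are made up for illustration and are not part of any MongoDB API.

```typescript
// Plain cosine similarity, in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so treat the similarity as 0.
  return denom === 0 ? 0 : dot / denom;
}

// Assumption: cosine scores are normalized as (1 + cosine) / 2,
// which rescales [-1, 1] to [0, 1].
function normalizedCosineScore(a: number[], b: number[]): number {
  return (1 + cosineSimilarity(a, b)) / 2;
}
```

With this scaling, a raw cosine similarity of 0.5 would show up as a score of 0.75, which would explain a manual calculation looking very different from the returned scores.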

Hope this helps!

Perfect. Thanks for that.