Beyond Basics: Enhancing Kotlin Ktor API With Vector Search

Ricardo Mello • 8 min read • Published Mar 21, 2024 • Updated Mar 25, 2024
In this article, we explore advanced MongoDB techniques with a Kotlin Ktor API, building on our previous article, Mastering Kotlin: Creating an API With Ktor and MongoDB Atlas. We will integrate Hugging Face, Atlas Vector Search, and MongoDB Atlas triggers and functions to extend what our API can do.
We will start with an overview of these techniques and the role each one plays in the API. We will then move to the practical implementation, showing how to use Hugging Face to generate text embeddings, use Vector Search to retrieve semantically similar documents, and automate database processing with triggers and functions.

Prerequisites

Demonstration

We'll begin by importing a dataset of fitness exercises into MongoDB Atlas as documents. Then, we'll create a trigger that activates upon insertion. For each document in the dataset, a function will be invoked to call Hugging Face's API. This function will send the exercise description to be converted into an embedding (an array of numbers), which will be saved in the exercises collection as descEmbedding:
Atlas Application architecture
In the second part, we will modify the Kotlin Ktor application to make HTTP client calls to the Hugging Face API. We will also create an /exercises/processRequest endpoint. This endpoint accepts a text input, which is sent to the Hugging Face API to generate an embedding. We then compare this embedding with the descEmbedding values generated in the first part. Using vector search, we return the three closest results (in this case, the fitness exercises most relevant to the search):
Kotlin Application Architecture

MongoDB Setup and Hugging Face Integration

1. Creating exercises collection

The first step in achieving our goal is to create an empty collection called "exercises" that will later store our dataset. Begin by logging in to your MongoDB Atlas account. From the Atlas dashboard, navigate to your cluster and select the database where you want to create the collection. Click on the "Collections" tab to manage your collections within that database and create an empty exercises collection:
Creating exercises collection

2. Creating a trigger and function

Next, we need to create a trigger that will activate whenever a new document is inserted into the exercises collection. Navigate to the Triggers tab and create a trigger named "Trigger_Exercises" as shown in the images below:
Creating exercises Trigger
Remember to choose the "exercises" collection, select "Insert Document" for the operation type, and enable "Full Document."
Creating exercises Trigger
Finally, paste the following function code into the "Function" field and click "Save":
Creating exercises Function
This function serves as a bridge between MongoDB and the Hugging Face API, enriching documents stored in a MongoDB collection with embeddings generated by the API. The function is triggered by a change event in the collection. With the trigger configured above, that happens whenever a new document is inserted.
Now, let's explore the functionality of this function:
  1. Event handling: The function extracts the full document from the MongoDB change event to be processed.
  2. Hugging Face API interaction: It interacts with the Hugging Face API to obtain an embedding for the document's description. This involves sending an HTTP POST request to the API's feature extraction endpoint, with the document's description as input.
  3. MongoDB update: Upon receiving a successful response from the Hugging Face API, the function updates the document in the MongoDB collection with the extracted embedding. This enriches the document with additional information useful for various natural language processing tasks.
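The three steps above can be sketched as a small data flow. The real Atlas function runs as JavaScript inside App Services, so the Kotlin below is only an illustration of the logic, not the function itself. The names `EmbeddingUpdate` and `buildEmbeddingUpdate`, the `description` field name, and the `embed` parameter (a stand-in for the HTTP POST to Hugging Face) are all assumptions made for this sketch.

```kotlin
/** The update to apply once an embedding is available (hypothetical helper type). */
data class EmbeddingUpdate(
    val filter: Map<String, Any?>,
    val update: Map<String, Any?>,
)

/**
 * Mirrors the trigger function's three steps.
 *
 * @param changeEvent simplified change event; only "fullDocument" is read
 * @param embed stand-in for the call to the Hugging Face feature extraction endpoint
 * @return the update to run, or null when there is nothing to embed
 */
fun buildEmbeddingUpdate(
    changeEvent: Map<String, Any?>,
    embed: (String) -> List<Double>,
): EmbeddingUpdate? {
    // 1. Event handling: pull the inserted document out of the change event.
    @Suppress("UNCHECKED_CAST")
    val doc = changeEvent["fullDocument"] as? Map<String, Any?> ?: return null
    // "description" is an assumed field name for the exercise description.
    val description = doc["description"] as? String ?: return null

    // 2. Hugging Face interaction: turn the description into an embedding.
    val embedding = embed(description)
    if (embedding.isEmpty()) return null

    // 3. MongoDB update: store the embedding on the same document.
    return EmbeddingUpdate(
        filter = mapOf("_id" to doc["_id"]),
        update = mapOf("\$set" to mapOf("descEmbedding" to embedding)),
    )
}
```

Passing the embedding call in as a parameter keeps the sketch free of network code, which also makes the flow easy to exercise with a fake embedder.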

3. Renaming the function

To align our environment with the demonstration image, let's change the name of our function to Function_Exercises. To do this, access the "Functions" menu and edit the function:
Selecting App Service Trigger
Creating new exercises Trigger
Then, enter the new name and click “Save”:
Renaming Function

4. Getting the Hugging Face access token

The function we previously created requires a token to access Hugging Face. We need to obtain and configure it in Atlas. To do this, log in to your Hugging Face account, and access the settings to create your key:
Getting Hugging Face Token
After copying your key, let's return to MongoDB Atlas and configure our key for access. Click on the "Values" button in the side menu and select “Create New Value”:
Creating new Application Values
Now, we need to create a secret and a value that will be associated with this secret.
First, create the secret by entering the key from Hugging Face:
Creating Application Secret
Then, create a value named HF_value (which will be used in our function) and associate it with the secret, as shown in the image:
Creating Application Value
If everything has gone perfectly, our values will look like this:
Application Values List
We have finished configuring our environment. To recap:
Creating the empty collection:
  • We created an empty collection named "exercises" in MongoDB Atlas. This collection will receive input data, triggering a process to convert each exercise's description into embedded values.
Setting up triggers and functions:
  • A trigger named "Trigger_Exercises" was created to activate upon document insertion.
  • The trigger calls a function named "Function_Exercises" for each inserted document.
  • The function processes the description using the Hugging Face API to generate embedded values, which are then added to the "exercises" collection.
Final configuration:
  • To complete the setup, we associated a secret and a value with the Hugging Face key in MongoDB Atlas.

5. Importing a dataset

In this step, we will import a dataset of 50 documents containing information about exercises:
Exercises Document Sample
To do this, I will use MongoDB Tools to import the exercises.json file via the command line. After installing MongoDB Tools, copy the "exercises.json" file into the "bin" folder and run the command, as shown in the image below:
Mongo Tools Import
Note: Remember to replace the user, password, and cluster with your own.
If everything goes well, we will see that we have imported 50 exercises.
Dataset imported successfully
Now, let's check the logs of our function to ensure everything went smoothly. To do this, navigate to the "App Services" tab and click on "Logs":
Checking App Services Logs
And now, let's view our collection:
Exercises collection with embedded data
As we can see, we have transformed the descriptions of the 50 exercises into vector values and assigned them to the "descEmbedding" field.
Let's proceed with the changes in our Kotlin application. If you haven't already, you can download the application. Our objective is to create an /exercises/processRequest endpoint that sends an input to Hugging Face, such as:
"I need an exercise for my shoulders and to lose my belly fat."
Postman final demonstration
We will convert this input into an embedding and use Vector Search to return the three exercises that most closely match it. To begin, let's add two dependencies to the build.gradle.kts file that will allow us to make HTTP calls to Hugging Face:
build.gradle.kts
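As a rough guide, a Ktor 2.x project usually gets HTTP client support from the client core artifact plus one engine. The fragment below assumes the CIO engine and a `ktor_version` property declared in gradle.properties (as in Ktor-generated projects). Both are assumptions, and the exact pair of artifacts used in the sample project may differ.

```kotlin
// build.gradle.kts (fragment)
// Assumption: ktor_version is defined in gradle.properties.
val ktor_version: String by project

dependencies {
    // Core HTTP client API
    implementation("io.ktor:ktor-client-core:$ktor_version")
    // Coroutine-based client engine (assumed choice; other engines exist)
    implementation("io.ktor:ktor-client-cio:$ktor_version")
}
```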
In the ports package, we will create a repository that will retrieve exercises from the database:
domain/ports/ExercisesRepository
We will create a response to display some information to the user:
application/response/ExercisesResponse
Now, create the Exercises class:
domain/entity/Exercises
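The repository port, the response, and the entity described above might look roughly like the sketch below. Only `descEmbedding` is a field name taken from this article. The other fields (`title`, `description`, `bodyPart`), the `toResponse` mapper, and the `findSimilarExercises` method name are hypothetical and should be adapted to the actual dataset.

```kotlin
// domain/entity: the stored document. Field names other than descEmbedding are assumed.
data class Exercises(
    val title: String,
    val description: String,
    val bodyPart: String,
    val descEmbedding: List<Double>,
)

// application/response: what the API returns. The embedding is left out on purpose,
// since a long vector of numbers is of no use to the caller.
data class ExercisesResponse(
    val title: String,
    val description: String,
    val bodyPart: String,
)

// Maps the stored entity to the public response.
fun Exercises.toResponse() = ExercisesResponse(title, description, bodyPart)

// domain/ports: the contract the infrastructure layer implements.
interface ExercisesRepository {
    suspend fun findSimilarExercises(embedding: List<Double>): List<Exercises>
}
```

Keeping the response separate from the entity means the stored embedding never leaks into the API payload.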
Next, we will implement our interface that will communicate with the database by executing an aggregate query using the vector search that we will create later.
infrastructure/ExercisesRepositoryImpl
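To make the shape of the aggregate query concrete, here is a minimal sketch that builds a `$vectorSearch` stage as a plain Kotlin map instead of using the MongoDB driver's types. The stage fields (`index`, `path`, `queryVector`, `numCandidates`, `limit`) are those of Atlas Vector Search. The function name, the index name `vector_index`, and the `numCandidates` default are assumptions for illustration.

```kotlin
/**
 * Builds the $vectorSearch aggregation stage as a nested map.
 * A real repository would pass the equivalent document to the driver's aggregate call.
 */
fun vectorSearchStage(
    queryVector: List<Double>,
    indexName: String = "vector_index", // hypothetical index name
    limit: Int = 3,                     // the article returns the three closest exercises
    numCandidates: Int = 50,            // assumed value; must be at least `limit`
): Map<String, Any> {
    require(limit in 1..numCandidates) { "limit must be between 1 and numCandidates" }
    return mapOf(
        // The dollar sign is escaped so Kotlin does not treat it as a string template.
        "\$vectorSearch" to mapOf(
            "index" to indexName,
            "path" to "descEmbedding",
            "queryVector" to queryVector,
            "numCandidates" to numCandidates,
            "limit" to limit,
        )
    )
}
```

The `path` must match the field the trigger function writes to, and the query vector must have the same number of dimensions as the stored embeddings.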
Now, let's create our endpoint to access Hugging Face and then call the method created earlier:
application/routes/ExercisesRoutes
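Stripped of the Ktor routing and HTTP client code, the endpoint's job is a short pipeline: embed the input text, search for similar exercises, and return the top matches. The framework-free sketch below shows only that flow. `processRequest`, `ExerciseMatch`, and the two function parameters are hypothetical names, with `embed` standing in for the Hugging Face call and `search` for the repository method.

```kotlin
// Hypothetical result type for this sketch.
data class ExerciseMatch(val title: String, val description: String)

/**
 * Core logic of the endpoint, without Ktor.
 *
 * @param input free text from the request body
 * @param embed stand-in for the Hugging Face call that returns an embedding
 * @param search stand-in for the repository's vector search
 */
fun processRequest(
    input: String,
    embed: (String) -> List<Double>,
    search: (List<Double>) -> List<ExerciseMatch>,
    maxResults: Int = 3,
): List<ExerciseMatch> {
    require(input.isNotBlank()) { "input must not be blank" }
    // Convert the sentence into a query vector, then look up the closest exercises.
    val queryVector = embed(input.trim())
    return search(queryVector).take(maxResults)
}
```

In the actual route, the handler would read the request body, run this flow with the real HTTP client and repository, and respond with the mapped results.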
Next, let's create the request that we will send to Hugging Face. In addition to the input, this class has a converter that turns the response from a String into a list of Double values:
application/request/SentenceRequest
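A minimal sketch of such a request class follows, assuming the API returns the embedding as a JSON array of numbers in the response body. The method name `convertResponse` is hypothetical. The parsing is plain string handling, which is enough for a flat or nested array of numbers; a JSON library would be the more robust choice.

```kotlin
data class SentenceRequest(val input: String) {
    /**
     * Parses a JSON array of numbers such as "[0.12, -0.5]" into doubles.
     * Nested arrays are flattened because all brackets are removed first.
     */
    fun convertResponse(body: String): List<Double> =
        body.replace("[", "")
            .replace("]", "")
            .split(",")
            .map { it.trim() }
            .filter { it.isNotEmpty() } // an empty array yields no values
            .map { it.toDouble() }
}
```

Note that `toDouble()` throws a `NumberFormatException` on anything that is not a number, so an error payload from the API should be checked for before calling the converter.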
Let's include the route created earlier and a huggingFaceApiUrl method in our Application class. Here's the complete code:
Application.kt
Finally, let's include the Hugging Face endpoint in the application.conf file.
application.conf
Now, we need to go back to Atlas and create our vector search index. Follow the images below:
Creating new Atlas Search Index
Select Atlas Vector Search:
Creating new Atlas Vector Search Index
Creating new Atlas Vector Search Index
If everything is okay, you will see a success message like the one below, indicating that the index was successfully created in MongoDB Atlas:
Creating new Atlas Vector Search Index
This code snippet defines a vector index on the descEmbedding field in our exercises collection. The type field specifies that this is a vector index. The path field indicates the path to the field containing the vector data. In this case, we are using the descEmbedding field. The numDimensions field specifies the number of dimensions of the vectors, which is 384 in this case. Lastly, the similarity field specifies the similarity metric to be used for comparing vectors, which is the Euclidean distance.
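To build intuition for what the Euclidean metric measures, the sketch below computes the distance between two vectors and ranks a handful of candidates by it. This is an exact, brute-force comparison written for illustration. Atlas Vector Search uses an approximate nearest-neighbor index instead, and the function names here are hypothetical.

```kotlin
import kotlin.math.sqrt

/** Straight-line distance between two vectors of equal length. */
fun euclideanDistance(a: List<Double>, b: List<Double>): Double {
    require(a.size == b.size) { "vectors must have the same number of dimensions" }
    return sqrt(a.zip(b).sumOf { (x, y) -> (x - y) * (x - y) })
}

/** Returns the keys of the `limit` candidates closest to the query vector. */
fun <T> nearest(
    query: List<Double>,
    candidates: Map<T, List<Double>>,
    limit: Int = 3,
): List<T> =
    candidates.entries
        .sortedBy { euclideanDistance(query, it.value) } // smaller distance = more similar
        .take(limit)
        .map { it.key }
```

The size check is the same constraint the index enforces through numDimensions: a query vector can only be compared with stored vectors of the same length, 384 in this setup.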
After implementing the latest updates and configurations, it's time to test the application. Let's start by running the application. Open Application.kt and click on the run button:
Running the Application
Once the application is up and running, you can proceed with testing using the following curl command:
Requesting processRequest in

Conclusion

This article showcased how to enrich MongoDB documents with embeddings from the Hugging Face API, leveraging its powerful natural language processing capabilities. The provided function demonstrates handling change events in a MongoDB collection and interacting with an external API. This integration offers developers opportunities to enhance their applications with NLP features, highlighting the potential of combining technologies for more intelligent applications.
The example source code is available on GitHub.
If you have any questions or want to discuss further implementations, feel free to reach out to the MongoDB Developer Community forum for support and guidance.
