Instruction-Following Rerankers: An Unsung Context Engineering Tool

As we've moved from simple large language model (LLM) interactions to building long-running AI agents, prompt engineering has naturally evolved into context engineering—optimizing not just system instructions but any information that lands in the LLM’s context window. This includes external knowledge, reasoning tokens, tool definitions and outcomes, short-term conversational history, long-term memory, and so on. Identifying and prioritizing the most important information at every LLM call is crucial, since all this content competes for the same limited real estate.

This is where rerankers come in. Purpose-built to optimize relevance ordering, rerankers have been used in search and retrieval systems for years. They’re now finding application in retrieval-augmented generation (RAG) and agentic systems, where prioritizing the right information is key to generating accurate, relevant, and coherent responses.

In this blog post, you’ll learn about a relatively new class of rerankers, called instruction-following rerankers, and how to use them in your LLM applications as a context engineering tool.

What are (instruction-following) rerankers?

Rerankers are often used as a refinement step in a two-stage retrieval system. In the first stage, a retrieval algorithm (vector search, BM25, etc.) is used to produce a broad set of initial candidates. The reranker then scores each query-document pair and reorders the results by relevance, surfacing the most pertinent information at the top.

Figure 1. Two-stage retrieval
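
The two-stage flow described above can be sketched in a few lines of plain Python. This is a toy illustration only: the term-overlap retriever and the Jaccard scorer are stand-ins chosen so the example runs without any external service, and the function names (`first_stage_retrieve`, `jaccard_score`, `rerank`) are hypothetical, not part of any library. In a real system, the first stage would be vector search or BM25, and the second stage would be a trained reranking model.

```python
from typing import Callable, List, Tuple


def first_stage_retrieve(query: str, corpus: List[str], k: int) -> List[str]:
    """Toy first stage: keep the k documents sharing the most terms with the query."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & set(doc.lower().split())), doc) for doc in corpus]
    # Drop documents with no shared terms, then sort by overlap (highest first)
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches[:k]]


def jaccard_score(query: str, doc: str) -> float:
    """Toy relevance score: Jaccard overlap between query terms and document terms."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Second stage: score each query-document pair and keep the top_k best."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


corpus = [
    "migraine treatment options include preventive medication",
    "migraine triggers and lifestyle changes",
    "treatment of seasonal allergies",
    "how to bake sourdough bread",
]

# Stage 1 produces a broad candidate set; stage 2 reorders it and trims to top_k
candidates = first_stage_retrieve("migraine treatment", corpus, k=3)
top = rerank("migraine treatment", candidates, jaccard_score, top_k=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Note that the second stage may order candidates differently from the first, since it applies its own scoring to each query-document pair.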

Instruction-following rerankers take this a step further by enabling users to dynamically steer the reranking process through explicit instructions alongside the query. These instructions can be used to specify the desired characteristics of the documents to be retrieved, which can vary across queries, use cases, and applications. The reranker then uses these instructions to adjust relevance scores and the document ordering accordingly.

How to use instruction-following rerankers

In the rest of this blog post, we’ll explore a few different scenarios you might come across when building AI applications and see how instruction-following reranking can improve outcomes. We’ll also show you how to implement them in your applications using Voyage AI’s rerank-2.5. The Jupyter Notebook containing all the code examples below is available on GitHub in our GenAI Showcase repository.

Scenario 1: Incorporating implicit business logic

Different domains and industries operate under implicit rules that shape what information matters most in any given context. For example, peer-reviewed clinical trials carry more weight than general medical websites in healthcare. Legal research must prioritize primary sources like statutes and case law over secondary commentary, with recency and jurisdictional relevance serving as critical signals.

These domain-specific hierarchies, quality indicators, and compliance requirements are rarely explicit in user queries. Instruction-following rerankers can incorporate these implicit operational rules and automatically surface information that aligns with professional standards and regulatory requirements, without forcing users to articulate them in their conversations.

Let’s say you’re building a RAG-based virtual assistant for a healthcare provider. You’ve created the knowledge base for this chatbot using data from several sources on the internet, including medical journals, medical websites, and online community forums. An example document from your knowledge base looks as follows, with the text field containing the content of a webpage or chunk, and the metadata object containing the name of the source and date of publication:

Python

Code Snippet

In a typical RAG workflow, you would extract information from the knowledge base using a technique like vector search. However, vector search treats all sources of information equally, whereas you want to prioritize certain sources over others. You can do this using an instruction-following reranker after the search step:

Python

Code Snippet

import voyageai
import os

# Set Voyage AI API key
os.environ["VOYAGE_API_KEY"] = "your-voyageai-api-key"
# Initialize Voyage AI client
vo = voyageai.Client()

query = "How to treat migraines?"

# Documents retrieved from search
documents = [
    "Source: peer_reviewed_journal, A randomized controlled trial of 600 patients found topiramate reduced migraine frequency by 50% compared to placebo, with common side effects including cognitive slowing.",
    "Source: general_website, Migraine prevention includes lifestyle changes and medications. Common preventive drugs include beta-blockers and antidepressants. Talk to your doctor about which option is right for you.",
    "Source: forum, I started taking magnesium and B2 supplements for my migraines and they've totally disappeared! Haven't needed my prescription in months.",
    "Source: peer_reviewed_journal, Systematic review of 42 studies shows CGRP monoclonal antibodies reduce monthly migraine days by 4-6 days on average, with better tolerability profiles than traditional preventives.",
    "Source: general_website, 10 natural remedies for migraines that actually work! From essential oils to ice packs, these home treatments can help you avoid medication side effects.",
    "Source: healthcare_provider, Preventive migraine medications work by reducing the frequency and severity of attacks. Options include daily pills, monthly injections, or quarterly infusions depending on your needs.",
    "Source: peer_reviewed_journal, Meta-analysis of acupuncture trials demonstrates modest benefit for migraine prevention, with effect size comparable to some pharmacological interventions and minimal adverse events."
]

instructions = "Prioritize peer-reviewed journals, followed by advice from healthcare providers, then general websites and finally forums."

# Use the rerank method to rerank retrieved results
reranking = vo.rerank(query=f"{instructions}\nQuery: {query}", documents=documents, model="rerank-2.5", top_k=3)

# Print the reranking results
for r in reranking.results:
    print(f"Document: {r.document}")
    print(f"Score: {r.relevance_score}")
    print()

In the above example, the reranker is provided instructions outlining the priority order of different medical information sources. The reranker uses these to rank research-backed sources higher than less credible ones, thereby ensuring users receive evidence-based guidance, which is important in healthcare contexts.

A few things to note about the reranking step:

Instructions are prepended to the query using f"{instructions}\nQuery: {query}", similar to how you structure prompts for LLMs.
Structured data is formatted into a single string representation. Any metadata that must influence ranking should be embedded into the string.
The instruction-augmented query and formatted documents are passed to Voyage AI’s rerank method as the query and documents arguments, respectively.
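
As a rough sketch of the first two points, the helpers below flatten a structured document into a string and prepend instructions to a query. The field names (`text`, `metadata`, `source`, `published`), the publication date, and the helper names are assumptions made for illustration; your knowledge base schema may differ.

```python
from typing import Any, Dict

# Hypothetical knowledge base document; field names and date are illustrative assumptions
example_doc: Dict[str, Any] = {
    "text": (
        "A randomized controlled trial of 600 patients found topiramate reduced "
        "migraine frequency by 50% compared to placebo, with common side effects "
        "including cognitive slowing."
    ),
    "metadata": {"source": "peer_reviewed_journal", "published": "2024-03-01"},
}


def format_document(doc: Dict[str, Any]) -> str:
    """Embed the source name in the string so the reranker can take it into account."""
    return f"Source: {doc['metadata']['source']}, {doc['text']}"


def build_instructed_query(instructions: str, query: str) -> str:
    """Prepend the instructions to the query, matching the format used above."""
    return f"{instructions}\nQuery: {query}"


formatted = format_document(example_doc)
instructed_query = build_instructed_query(
    "Prioritize peer-reviewed journals.", "How to treat migraines?"
)
print(formatted)
print(instructed_query)
```

Only metadata that appears in the formatted string is visible to the reranker, so in this sketch the publication date would have no effect on ranking unless it were added to the string as well.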

The reranking results, with and without instruction-following, are as follows:

Query: How to treat migraines?

Instruction: Prioritize peer-reviewed journals, followed by advice from healthcare providers, then general websites, and finally forums.

With instruction-following	Without instruction-following
Source: peer_reviewed_journal, Systematic review of 42 studies shows CGRP monoclonal antibodies reduce monthly migraine days by 4-6 days on average, with better tolerability profiles than traditional preventives.	Source: forum, I started taking magnesium and B2 supplements for my migraines, and they've totally disappeared! Haven't needed my prescription in months.
Source: peer_reviewed_journal, A randomized controlled trial of 600 patients found topiramate reduced migraine frequency by 50% compared to placebo, with common side effects including cognitive slowing.	Source: general_website, 10 natural remedies for migraines that actually work! From essential oils to ice packs, these home treatments can help you avoid medication side effects.
Source: peer_reviewed_journal, Meta-analysis of acupuncture trials demonstrates modest benefit for migraine prevention, with effect size comparable to some pharmacological interventions and minimal adverse events.	Source: healthcare_provider, Preventive migraine medications work by reducing the frequency and severity of attacks. Options include daily pills, monthly injections, or quarterly infusions, depending on your needs.

As seen above, without instruction-following, the reranker prioritizes anecdotal experiences and general advice over clinical evidence.

Scenario 2: Handling different types of queries

Oftentimes, the AI applications you’re designing have a single overarching purpose—for example, documentation assistance, customer support, or coding assistance—but the types of queries users ask can span a broad range of topics and complexity.

Let’s say you’re building a RAG-based documentation assistant. Novice users of the chatbot might ask about basic concepts and terminology, while experienced users need code examples and advanced configuration options. Some users might also be troubleshooting specific issues and need workarounds quickly.

A simple vector search might not be sufficient to handle these different user needs. An effective approach is to categorize incoming queries (e.g., using an LLM), apply metadata filters (e.g., tags and content type) to your search based on that categorization, and then follow up with an instruction-following reranker that prioritizes different types of information based on the query type:

Python

Code Snippet

query = "My query is not using the index I created"

# Documents retrieved from search
documents = [
    # Beginner/Conceptual content about indexes
    "What are MongoDB indexes? Indexes are special data structures that store a small portion of your data in an easy-to-traverse form. Like a book's index helps you find topics without reading every page, MongoDB indexes help find documents without scanning the entire collection.",
    
    "How indexes improve query performance: Without an index, MongoDB must scan every document in a collection (a collection scan) to find matches. With an index, MongoDB can limit the number of documents it must inspect. For a collection with millions of documents, this means the difference between a query taking seconds versus milliseconds.",
    
    "Understanding index types: MongoDB supports several index types. Single field indexes are the simplest, indexing one field. Compound indexes index multiple fields together and are useful when queries filter on multiple fields. Text indexes enable text search capabilities, while geospatial indexes support location-based queries.",
    
    "The tradeoffs of indexing: While indexes speed up read operations, they slow down writes because MongoDB must update indexes whenever documents change. Indexes also consume disk space and memory. It's important to create indexes strategically for your most common queries rather than indexing every field.",
    
    # Implementation content about indexes
    "Creating a single field index: ```javascript\ndb.users.createIndex({ email: 1 })\n// 1 for ascending, -1 for descending\ndb.products.createIndex({ price: -1 })``` The sort order matters for sorting queries but not for equality matches.",
    
    "Building compound indexes: ```javascript\ndb.orders.createIndex({ user_id: 1, created_at: -1 })\n// This supports queries on user_id alone, or user_id + created_at\n// But NOT queries on created_at alone``` Field order matters - most selective field should come first.",
    
    "Creating unique indexes: ```javascript\ndb.users.createIndex({ email: 1 }, { unique: true })\n// Prevents duplicate email addresses\ndb.sessions.createIndex({ token: 1 }, { unique: true, sparse: true })``` Sparse indexes only include documents with the indexed field.",
    
    "Adding indexes with options: ```javascript\ndb.logs.createIndex(\n  { created_at: 1 },\n  { expireAfterSeconds: 86400 } // TTL index, auto-deletes after 24 hours\n)\ndb.large_collection.createIndex({ status: 1 }, { background: true })``` Background builds don't block database operations.",
    
    "Creating text indexes for search: ```javascript\ndb.articles.createIndex({ title: 'text', content: 'text' })\n// Query with:\ndb.articles.find({ $text: { $search: 'mongodb performance' } }).sort({ score: { $meta: 'textScore' } })``` Only one text index per collection allowed.",
    
    # Troubleshooting content about indexes
    "Query not using an index? Use explain() to diagnose: ```javascript\ndb.users.find({ email: 'user@example.com' }).explain('executionStats')``` Look for 'IXSCAN' (good) vs 'COLLSCAN' (bad). If you see COLLSCAN, create an index on the queried field.",
    
    "Index build failing or taking too long? Quick fixes: Use {background: true} option to avoid blocking writes. For large collections, build indexes during low-traffic periods. On replica sets, use rolling index builds: build on secondaries first, then step down primary and build there.",
    
    "Too many indexes slowing down writes? Solution: Audit your indexes with db.collection.getIndexes() and remove unused ones. Use MongoDB's index usage stats: ```javascript\ndb.collection.aggregate([{ $indexStats: {} }])``` Drop indexes with zero or minimal 'ops' count.",
    
    "Index taking too much memory? Check index size with db.collection.stats() and look at 'indexSizes'. Solutions: 1) Drop unused indexes 2) Use partial indexes to index only relevant documents: ```javascript\ndb.orders.createIndex({ status: 1 }, { partialFilterExpression: { status: { $ne: 'archived' } } })```",
    
    "Compound index not being used? Verify your query matches the index prefix. An index on {a: 1, b: 1, c: 1} supports queries on 'a', 'a+b', and 'a+b+c', but NOT 'b' or 'c' alone. Reorder index fields or create additional indexes for different query patterns."
]

# Query type, e.g., as assigned by an upstream LLM classifier (hardcoded here for illustration)
query_type = "troubleshooting"

# Generate instructions based on query type
if query_type == "concepts":
    instructions = "Prioritize explanatory content."
elif query_type == "implementation":
    instructions = "Prioritize code examples."
elif query_type == "troubleshooting":
    instructions = "Prioritize workarounds and step-by-step debugging instructions."

# Use the rerank method to rerank retrieved results
reranking = vo.rerank(query=f"{instructions}\nQuery: {query}", documents=documents, model="rerank-2.5", top_k=3)

# Print the reranking results
for r in reranking.results:
    print(f"Document: {r.document}")
    print(f"Score: {r.relevance_score}")
    print()

In the above code, note how the instructions for the reranker are set based on the query_type.
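
The query type has to come from somewhere. The sketch below uses a simple keyword heuristic as a stand-in for the LLM-based categorization mentioned earlier, and maps each type to the instruction strings from the example. The cue lists and function names are hypothetical and chosen only to keep the example self-contained; a production system would more likely use an LLM or a trained classifier.

```python
from typing import Dict, Tuple

# Instruction text per query type, taken from the example above
INSTRUCTIONS_BY_TYPE: Dict[str, str] = {
    "concepts": "Prioritize explanatory content.",
    "implementation": "Prioritize code examples.",
    "troubleshooting": "Prioritize workarounds and step-by-step debugging instructions.",
}

# Hypothetical keyword cues (assumptions for illustration, not a tuned classifier)
TROUBLESHOOTING_CUES: Tuple[str, ...] = ("not using", "not working", "error", "failing", "slow")
IMPLEMENTATION_CUES: Tuple[str, ...] = ("how do i", "how to", "create", "example")


def classify_query(query: str) -> str:
    """Return a query type; troubleshooting cues are checked first, concepts is the default."""
    text = query.lower()
    if any(cue in text for cue in TROUBLESHOOTING_CUES):
        return "troubleshooting"
    if any(cue in text for cue in IMPLEMENTATION_CUES):
        return "implementation"
    return "concepts"


def instructions_for(query: str) -> str:
    """Look up the reranker instructions for the query's type."""
    return INSTRUCTIONS_BY_TYPE[classify_query(query)]


print(classify_query("My query is not using the index I created"))
print(instructions_for("How do I create a compound index?"))
```

Keeping the instructions in a dictionary makes it easy to add a new query type without touching the control flow.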

The reranking results, with and without instruction-following, are as follows:

Query: My query is not using the index I created

Instruction: Prioritize workarounds and step-by-step debugging instructions.

With instruction-following	Without instruction-following
Query not using an index? Use explain() to diagnose: ```javascript db.users.find({ email: 'user@example.com' }).explain('executionStats')``` Look for 'IXSCAN' (good) vs 'COLLSCAN' (bad). If you see COLLSCAN, create an index on the queried field.	Query not using an index? Use explain() to diagnose: ```javascript db.users.find({ email: 'user@example.com' }).explain('executionStats')``` Look for 'IXSCAN' (good) vs 'COLLSCAN' (bad). If you see COLLSCAN, create an index on the queried field.
Compound index not being used? Verify your query matches the index prefix. An index on {a: 1, b: 1, c: 1} supports queries on 'a', 'a+b', and 'a+b+c', but NOT 'b' or 'c' alone. Reorder index fields or create additional indexes for different query patterns.	Compound index not being used? Verify your query matches the index prefix. An index on {a: 1, b: 1, c: 1} supports queries on 'a', 'a+b', and 'a+b+c', but NOT 'b' or 'c' alone. Reorder index fields or create additional indexes for different query patterns.
Index build failing or taking too long? Quick fixes: Use {background: true} option to avoid blocking writes. For large collections, build indexes during low-traffic periods. On replica sets, use rolling index builds: build on secondaries first, then step down primary and build there.	Too many indexes slowing down writes? Solution: Audit your indexes with db.collection.getIndexes() and remove unused ones. Use MongoDB's index usage stats: ```javascript db.collection.aggregate([{ $indexStats: {} }])``` Drop indexes with zero or minimal 'ops' count.

As seen above, rerank-2.5 is a strong reranker by default, but instructions help it align better with the user’s intent. Without instructions, the reranker’s top three results include a document about removing unused MongoDB indexes. With the troubleshooting instruction, that document is replaced by one about quick fixes for failing index builds, which is more directly relevant to troubleshooting why queries might not be using an existing index.

Scenario 3: Managing long-term memories and state

In AI systems, long-term memory consists of the facts, insights, and instructions that a system learns across multiple conversations. It is critical for maintaining consistency and coherence over time. As these systems accumulate more memories, they need a way to dynamically retrieve the memories that are most relevant to the current conversation.

Once again, semantic search is a common technique for memory retrieval. However, it treats all memories as equally valid and fails to account for changing preferences, recency, or the significance of memories. An instruction-following reranker on top of the semantic search enables you to prioritize memories based on factors beyond just semantic similarity.

Let’s take the example of a travel assistant agent that helps create complete travel itineraries, including making hotel bookings, restaurant reservations, etc. Let’s assume that the agent persists users’ travel, dietary, and budget preferences to a long-term memory store in MongoDB as follows:

Python

Code Snippet

Here’s how you can use instruction-following reranking to prioritize which memories are used by the agent to inform future itineraries:

Python

Code Snippet

query = "Vacation ideas in March for me and my husband"

# Documents retrieved from semantic search, formatted as strings
documents = [
    # Recent luxury bookings (2025)
    "Date: 2025-09-15, Booked Ritz-Carlton Maui, oceanfront suite, $850/night. Anniversary trip with husband David.",
    "Date: 2025-09-20, Complained about noise from pool area at Ritz-Carlton Maui. Requested quiet rooms for future stays.",
    "Date: 2025-07-10, Booked Four Seasons Bora Bora, overwater bungalow, $1200/night. Solo work retreat.",
    # Critical constraint
    "Date: 2025-06-01, Customer has severe shellfish allergy. Avoid destinations where seafood is primary cuisine.",
    # Recent preferences
    "Date: 2025-09-16, Prefers hotels with spa facilities. Mentioned spa treatments are essential for relaxation.",
    "Date: 2025-08-30, Achieved Marriott Bonvoy Gold status. Prefers Marriott properties for loyalty benefits.",
    "Date: 2025-08-05, Inquired about beach destinations in Mexico and Caribbean for March 2025. Budget: up to $1000/night.",
    # General preference
    "Date: 2024-01-15, Prefers warm beach destinations year-round. Enjoys water activities and relaxation.",
    # Older budget bookings and preferences (2022-2023)
    "Date: 2023-11-05, Booked Holiday Inn Orlando, $120/night. Family trip with two children.",
    "Date: 2023-03-10, Booked Hampton Inn Chicago, $95/night. Budget business trip.",
    "Date: 2022-12-01, Booked Motel 6 Las Vegas, $65/night. Road trip with friends.",
    "Date: 2022-06-15, Mentioned preferring budget accommodations to save money for activities."
]

instructions = "Prioritize recent booking patterns (last 12 months), safety concerns, and dietary restrictions."

reranking = vo.rerank(query=f"{instructions}\nQuery: {query}", documents=documents, model="rerank-2.5", top_k=3)

for r in reranking.results:
    print(f"Document: {r.document}")
    print(f"Relevance Score: {r.relevance_score}")
    print()
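
The date-prefixed strings above have to be produced from whatever the memory store returns. The sketch below shows one possible way to do that. The record fields (`created_at`, `memory`) and helper names are assumptions for illustration, not the schema used by the agent in this post. Stating today's date in the instructions is also an untested assumption: it is included because the reranker has no other way to know what "last 12 months" refers to.

```python
from datetime import date
from typing import Any, Dict, List

# Hypothetical memory records; field names are illustrative assumptions
memories: List[Dict[str, Any]] = [
    {
        "created_at": date(2025, 6, 1),
        "memory": "Customer has severe shellfish allergy. Avoid destinations where seafood is primary cuisine.",
    },
    {
        "created_at": date(2022, 6, 15),
        "memory": "Mentioned preferring budget accommodations to save money for activities.",
    },
]


def format_memory(record: Dict[str, Any]) -> str:
    """Embed the date in the string so recency is visible to the reranker."""
    return f"Date: {record['created_at'].isoformat()}, {record['memory']}"


def build_memory_instructions(today: date) -> str:
    """Anchor relative time windows to an explicit date (an assumption, see above)."""
    return (
        f"Today's date is {today.isoformat()}. "
        "Prioritize recent booking patterns (last 12 months), safety concerns, "
        "and dietary restrictions."
    )


documents_from_store = [format_memory(m) for m in memories]
memory_instructions = build_memory_instructions(date(2025, 10, 1))
print(documents_from_store[0])
print(memory_instructions)
```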

The reranking results, with and without instruction-following, are as follows:

Query: Vacation ideas in March for me and my husband

Instruction: Prioritize recent booking patterns (last 12 months), safety concerns, and dietary restrictions.

With instruction-following	Without instruction-following
Date: 2025-08-05, Inquired about beach destinations in Mexico and the Caribbean for March 2025. Budget: up to $1000/night.	Date: 2025-08-05, Inquired about beach destinations in Mexico and the Caribbean for March 2025. Budget: up to $1000/night.
Date: 2025-06-01, Customer has a severe shellfish allergy. Avoid destinations where seafood is primary cuisine.	Date: 2024-01-15, Prefers warm beach destinations year-round. Enjoys water activities and relaxation.
Date: 2025-09-15, Booked Ritz-Carlton Maui, oceanfront suite, $850/night. Anniversary trip with husband David.	Date: 2025-09-15, Booked Ritz-Carlton Maui, oceanfront suite, $850/night. Anniversary trip with husband David.

As seen above, without instructions, the reranker does not prioritize the note about the user’s shellfish allergy, which would be key information for the agent to consider when planning future itineraries. Given the right instructions, the reranker can prioritize both recent preferences and important permanent truths about the user.

Conclusion

In this blog post, we explored how instruction-following rerankers can serve as a powerful context engineering tool in RAG and agentic applications. By enabling you to specify ranking priorities through natural language instructions, these models help ensure that your AI systems are contextually intelligent and surface the right information at the right time.

Next Steps

Register for a Voyage AI account to claim your free 200 million tokens and try out rerank-2.5. To learn more about the model, visit Voyage AI’s API documentation and follow them on X (Twitter) and LinkedIn for more updates.
