Contextual AI Components
State-of-the-art component APIs to enhance every step of your RAG pipeline

CORE RAG BUILDING BLOCKS
Optimize your RAG pipeline end-to-end with APIs for document understanding, reranking, generation, and evaluation

STATE-OF-THE-ART PERFORMANCE
Ground agents in your enterprise knowledge with RAG components that outperform competing tools on leading benchmarks

SIMPLE, FLEXIBLE INTEGRATION
Incorporate components into your existing RAG pipeline without needing to overhaul your entire architecture
Better RAG performance in minutes
Powerful, modular RAG components for when accuracy and simplicity matter

Our multi-stage document understanding pipeline for converting unstructured content into AI-ready formats

The first instruction-following reranker, providing greater control over how retrieved knowledge is prioritized

The most grounded large language model in the world, engineered specifically to minimize hallucinations

Our evaluation-optimized model for preference, direct scoring, and natural language unit test evaluation
Achieve even greater performance with the Contextual AI Platform
Each component API delivers state-of-the-art performance on its own. Our end-to-end platform goes further, jointly optimizing and orchestrating the components as a single, unified system for even greater accuracy and simplicity.
Flexible, modular components

PARSE
Extract complex multimodal content from any document
- Convert unstructured documents into structured output optimized for your RAG pipeline
- Easily process text, charts, tables, code, and other complex modalities
- Infer document hierarchy and add positional metadata to each chunk, enabling agents to connect information across hundreds of pages
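To make the last point concrete, here is a minimal sketch of how hierarchy and positional metadata on each chunk let an agent pull together related content scattered across a long document. The chunk fields used below ("page", "section_path", "text") are hypothetical stand-ins, not the actual output schema of the parse API.

```python
# Illustrative sketch only: "page", "section_path", and "text" are
# hypothetical metadata fields, not the real parse output schema.
def assemble_context(chunks, section_prefix):
    """Collect chunk text from one branch of the document hierarchy,
    ordered by page, so related content reads together."""
    selected = [c for c in chunks if c["section_path"].startswith(section_prefix)]
    selected.sort(key=lambda c: c["page"])
    return "\n\n".join(c["text"] for c in selected)

chunks = [
    {"page": 212, "section_path": "3/3.2", "text": "Table of Q3 revenue."},
    {"page": 7, "section_path": "3/3.2", "text": "Revenue is reported by segment."},
    {"page": 40, "section_path": "5/5.1", "text": "Risk factors."},
]
# Pulls the two section-3.2 chunks, pages 7 and 212, into one passage.
print(assemble_context(chunks, "3/3.2"))
```

Because every chunk carries its position and place in the hierarchy, the agent can reassemble a section that spans hundreds of pages without re-reading the whole document.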

RERANK
Resolve knowledge conflicts with the only instruction-following reranker
- Steer how your reranker prioritizes information with natural language instructions
- Serve the most relevant content to your model’s context—with or without instructions
- Drop into your existing RAG pipeline with just a few lines of code
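The drop-in pattern is simple: retrieve candidates as you already do, rerank them, and pass only the top-k to generation. The sketch below shows just that selection step; the hard-coded scores are hypothetical placeholders for what a reranker would return (in a live pipeline they would come from the rerank API shown later on this page).

```python
# Hypothetical drop-in pattern: rerank retrieved candidates, keep top-k.
def top_k_for_context(documents, scores, k=2):
    """Pair each document with its relevance score, sort descending,
    and keep the k most relevant for the generation step."""
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:k]]

documents = ["refund policy v1 (2021)", "refund policy v2 (2024)", "shipping FAQ"]
scores = [0.41, 0.93, 0.12]  # placeholder scores, not real reranker output
print(top_k_for_context(documents, scores))
```

With an instruction like "prefer the most recent policy document", the reranker itself would push the 2024 policy to the top; the rest of the pipeline is unchanged.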

GENERATE
Maximize accuracy with the world’s most grounded language model
- Generate answers strongly grounded in your retrieved documents
- Minimize hallucinations with response tags that delineate facts from model commentary
- Provide in-line attributions to source documents for end-users to verify responses

LMUnit
Evaluate LLM responses with fine-grained natural language unit tests
- Assess responses for your defined criteria, like conciseness, technical precision, and more
- Test for accuracy with out-of-the-box evaluation tests for equivalence and groundedness
- Reduce your dependence on end-user testing and get to production faster
Simple APIs to improve any RAG pipeline
# Convert unstructured content into structured output
import os
from contextual import ContextualAI

client = ContextualAI(api_key=os.environ.get("CONTEXTUAL_API_KEY"))

with open("your filepath", "rb") as fp:
    response = client.parse.create(
        raw_file=fp,
        parse_mode="standard",
        figure_caption_mode="concise",
        enable_document_hierarchy=True,
        page_range="0-5",
    )

job_id = response.job_id
results = client.parse.job_results(job_id, output_types=["markdown-per-page"])
# Rank retrieved documents using instructions
import os
from contextual import ContextualAI

client = ContextualAI(api_key=os.environ.get("CONTEXTUAL_API_KEY"))

rerank_response = client.rerank.create(
    query="your query",
    instruction="your instructions",
    documents=["your documents"],
    metadata=["your metadata"],
    model="ctxl-rerank-en-v1-instruct",
)
print(rerank_response.to_dict())
# Generate highly grounded responses
import os
from contextual import ContextualAI

client = ContextualAI(api_key=os.environ.get("CONTEXTUAL_API_KEY"))

generate_response = client.generate.create(
    model="v1",
    messages=[{
        "content": "content",
        "role": "user",
    }],
    knowledge=["your knowledge"],
    avoid_commentary=False,
)
print(generate_response.response)
# Evaluate responses with user-defined unit tests
import os
from contextual import ContextualAI

client = ContextualAI(api_key=os.environ.get("CONTEXTUAL_API_KEY"))

response = client.lmunit.create(
    query="your query",
    response="your response",
    unit_test="your unit test",
)
print(response)