Compatibility: Only available on Node.js. You can still create API routes that use MongoDB with Next.js by setting the runtime variable to nodejs like so:
export const runtime = "nodejs";
For more information, see Edge runtimes in the Next.js documentation.
This guide provides a quick overview for getting started with MongoDB Atlas vector stores. For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.
Overview
Integration details
Setup
To use MongoDB Atlas vector stores, you’ll need to configure a MongoDB Atlas cluster and install the @langchain/mongodb integration package.
Initial Cluster Configuration
To create a MongoDB Atlas cluster, navigate to the MongoDB Atlas website and create an account if you don’t already have one.
Create and name a cluster when prompted, then find it under Database. Select Browse Collections and create either a blank collection or one from the provided sample data.
Note: The cluster must be MongoDB 7.0 or higher for manual embedding mode. Automated embedding mode requires MongoDB 8.2 or higher.
Creating a Vector Search Index
After configuring your cluster, create a vector search index on your collection. You can do this in Atlas, Compass, or the MongoDB Shell. The index definition depends on which embedding mode you use.
Manual embedding (MongoDB 7.0+): you embed documents client-side and store the vectors in a field. Use the following definition, adjusting numDimensions to match your embeddings model.
{
  "name": "index_name",
  "type": "vectorSearch",
  "definition": {
    "fields": [
      {
        "numDimensions": 1536,
        "path": "embedding",
        "similarity": "euclidean",
        "type": "vector"
      }
    ]
  }
}
Automated embedding (MongoDB 8.2+): MongoDB generates embeddings server-side using Voyage AI models. Use the autoEmbed field type and specify the model:
{
  "name": "index_name",
  "type": "vectorSearch",
  "definition": {
    "fields": [
      {
        "type": "autoEmbed",
        "modality": "text",
        "path": "textContent",
        "model": "voyage-4"
      }
    ]
  }
}
By default, the vector store reads from a text field named text and (in manual mode) writes vectors to a field named embedding. Set textKey and embeddingKey to match your index.
const vectorStore = new MongoDBAtlasVectorSearch(
  embeddings, // omit in auto embedding mode
  {
    collection,
    indexName: "index_name",
    textKey: "textContent", // document field where raw text is stored
    embeddingKey: "embedding", // matches "path" (omit in auto embedding mode)
  }
);
Proceed to build the index.
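If you prefer to keep index definitions in code, the two JSON shapes above can be generated by small builder functions. The following is a minimal sketch: the type names and the helpers manualIndex and autoEmbedIndex are hypothetical and are not part of @langchain/mongodb or the MongoDB driver.

```typescript
// Field shapes mirror the two JSON index definitions in this section.
type VectorField = {
  type: "vector";
  path: string;
  numDimensions: number;
  similarity: "euclidean" | "cosine" | "dotProduct";
};

type AutoEmbedField = {
  type: "autoEmbed";
  modality: "text";
  path: string;
  model: string;
};

type VectorIndexDescription = {
  name: string;
  type: "vectorSearch";
  definition: { fields: Array<VectorField | AutoEmbedField> };
};

// Hypothetical helper: index description for manual embedding mode.
function manualIndex(
  name: string,
  path: string,
  numDimensions: number
): VectorIndexDescription {
  if (!Number.isInteger(numDimensions) || numDimensions <= 0) {
    throw new Error(`numDimensions must be a positive integer, got ${numDimensions}`);
  }
  return {
    name,
    type: "vectorSearch",
    definition: {
      fields: [{ type: "vector", path, numDimensions, similarity: "euclidean" }],
    },
  };
}

// Hypothetical helper: index description for automated embedding mode.
function autoEmbedIndex(
  name: string,
  path: string,
  model: string
): VectorIndexDescription {
  return {
    name,
    type: "vectorSearch",
    definition: {
      fields: [{ type: "autoEmbed", modality: "text", path, model }],
    },
  };
}
```

For example, manualIndex("index_name", "embedding", 1536) reproduces the manual definition shown earlier. Recent versions of the MongoDB Node.js driver expose collection.createSearchIndex(), which should accept an object of this shape, but check the driver documentation for your version before relying on it.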
Embeddings
In manual embedding mode, you provide an embeddings model and embed documents client-side. This guide uses OpenAI embeddings as an example. You can also use other supported embeddings models.
In automated embedding mode, MongoDB Atlas handles embedding generation server-side. No client-side embeddings package is required.
Installation
Manual embedding: install the core package plus an embeddings provider:
npm install @langchain/mongodb mongodb @langchain/core @langchain/openai
Automated embedding: only the core package and MongoDB driver are required:
npm install @langchain/mongodb mongodb @langchain/core
Credentials
Once you have completed the steps above, set the MONGODB_ATLAS_URI environment variable to the connection string shown by the Connect button in the MongoDB Atlas dashboard. You will also need your database name and collection name:
process.env.MONGODB_ATLAS_URI = "your-atlas-URL";
process.env.MONGODB_ATLAS_COLLECTION_NAME = "your-atlas-collection-name";
process.env.MONGODB_ATLAS_DB_NAME = "your-atlas-db-name";
If you are using manual embedding mode with OpenAI, set your OpenAI key as well:
process.env.OPENAI_API_KEY = "YOUR_API_KEY";
In automated embedding mode, no additional API key is required: MongoDB Atlas handles embedding generation using the model configured in your index.
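Because a missing variable otherwise only surfaces later as a connection or type error, it can help to validate these settings once at startup. A minimal sketch follows; requireEnv, loadAtlasConfig, and AtlasConfig are hypothetical names introduced for illustration, not part of any library.

```typescript
// Connection settings this guide reads from the environment.
type AtlasConfig = { uri: string; dbName: string; collectionName: string };

// Hypothetical helper: return the variable's value, or throw if it is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical helper: collect the three MongoDB settings in one place.
function loadAtlasConfig(env: Record<string, string | undefined>): AtlasConfig {
  return {
    uri: requireEnv(env, "MONGODB_ATLAS_URI"),
    dbName: requireEnv(env, "MONGODB_ATLAS_DB_NAME"),
    collectionName: requireEnv(env, "MONGODB_ATLAS_COLLECTION_NAME"),
  };
}

// Example usage: const config = loadAtlasConfig(process.env);
```

Passing the environment in as an argument keeps the helper easy to exercise with a plain object in tests.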
If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:
// process.env.LANGSMITH_TRACING="true"
// process.env.LANGSMITH_API_KEY="your-api-key"
Instantiation
Once you’ve set up your cluster and index, initialize your vector store. The constructor accepts two forms depending on whether you use manual or automated embedding.
Manual embedding
Automated embedding
In manual embedding mode, pass an embeddings instance as the first argument:
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI!);
const collection = client
  .db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME!);

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
  collection,
  indexName: "vector_index", // Must match your Atlas index name; defaults to "default"
  textKey: "text", // Defaults to "text"
  embeddingKey: "embedding", // Defaults to "embedding"
});
In automated embedding mode, pass only the configuration object (no embeddings argument is needed):
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI!);
const collection = client
  .db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME!);

const vectorStore = new MongoDBAtlasVectorSearch({
  collection,
  indexName: "default", // Must match the index name in your Atlas cluster
});
Manage vector store
Add items to vector store
You can now add documents to your vector store:
import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

await vectorStore.addDocuments(documents, { ids: ["1", "2", "3", "4"] });
After adding documents, there is a delay before they become queryable. In automated embedding mode this delay is longer because MongoDB must generate embeddings server-side after insertion. Wait until the Atlas search index reports that documents are indexed before querying.
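One way to handle this delay in scripts and tests is to poll until a known document becomes searchable. The sketch below is a generic polling helper; waitUntil is a hypothetical name, and the default timeout and interval are arbitrary assumptions rather than values recommended by MongoDB.

```typescript
// Hypothetical helper: call `check` repeatedly until it resolves to true
// or the timeout elapses. Resolves to true on success, false on timeout.
async function waitUntil(
  check: () => Promise<boolean>,
  options: { timeoutMs?: number; intervalMs?: number } = {}
): Promise<boolean> {
  const { timeoutMs = 60000, intervalMs = 2000 } = options;
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    // Stop if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Example usage (assumes the `vectorStore` created in the Instantiation section):
// const ready = await waitUntil(async () => {
//   const hits = await vectorStore.similaritySearch("mitochondria", 1);
//   return hits.length > 0;
// });
```

Polling with a search query checks the thing you actually care about, namely that results come back, without depending on any index-status API.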
Adding a document with the same id as an existing document will update the existing one.
Delete items from vector store
await vectorStore.delete({ ids: ["4"] });
Query vector store
Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.
Query directly
You can perform a simple similarity search as follows:
const similaritySearchResults = await vectorStore.similaritySearch("biology", 2);
for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
Filtering
MongoDB Atlas supports pre-filtering of results on other fields. This requires you to declare the metadata fields you plan to filter on by updating the index you created initially. Here is an example:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}
Above, the first item in fields is the vector index, and the second item is the metadata property you want to filter on. The name of the property is the value of the path key, so the index above lets you filter on a metadata field named source.
Then, in your code, you can use MQL query operators for filtering, as in the following example:
const filter = {
  preFilter: {
    source: {
      $eq: "https://example.com",
    },
  },
};

const filteredResults = await vectorStore.similaritySearch("biology", 2, filter);

for (const doc of filteredResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
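The same preFilter shape works with other MQL operators. As a minimal sketch, the hypothetical helper below builds a filter that matches one source with $eq or any of several sources with $in; it assumes source is declared as a filter field in your index, as in the definition above.

```typescript
// Shape of the filter object passed to similaritySearch in this guide.
type SourceFilter = { preFilter: { source: Record<string, unknown> } };

// Hypothetical helper: match documents whose `source` metadata equals
// the single given value, or any of the given values.
function sourceFilter(sources: string[]): SourceFilter {
  if (sources.length === 0) {
    throw new Error("at least one source is required");
  }
  const condition =
    sources.length === 1 ? { $eq: sources[0] } : { $in: sources };
  return { preFilter: { source: condition } };
}

// Example usage (assumes the `vectorStore` from the Instantiation section):
// const results = await vectorStore.similaritySearch(
//   "biology",
//   2,
//   sourceFilter(["https://example.com", "https://example.org"])
// );
```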
Returning scores
If you want to execute a similarity search and receive the corresponding scores you can run:
const similaritySearchWithScoreResults = await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(`* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`);
}
* [SIM=0.374] The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* [SIM=0.370] Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
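Scores are mostly useful for dropping weak matches before they reach a prompt. The sketch below filters [document, score] pairs by a minimum score, assuming that a higher score means a closer match. The helper name aboveThreshold and the threshold in the usage comment are hypothetical; a sensible cutoff depends on your embeddings model and similarity function.

```typescript
// A [document, score] pair, as returned by similaritySearchWithScore.
type Scored<T> = [T, number];

// Hypothetical helper: keep only results whose score is at least `minScore`.
function aboveThreshold<T>(results: Scored<T>[], minScore: number): Scored<T>[] {
  return results.filter(([, score]) => score >= minScore);
}

// Example usage (assumes `similaritySearchWithScoreResults` from above):
// const strongMatches = aboveThreshold(similaritySearchWithScoreResults, 0.372);
```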
Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains.
const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});

await retriever.invoke("biology");
[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { _id: '1', source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { _id: '3', source: 'https://example.com' },
    id: undefined
  }
]
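In a chain, retrieved documents are usually joined into a single context string before being passed to a model. A minimal sketch of that step follows. RetrievedDoc is a local stand-in for the fields of a LangChain Document that the sketch uses, and formatDocs is a hypothetical helper rather than a LangChain API.

```typescript
// Local stand-in for the Document fields this sketch reads.
type RetrievedDoc = { pageContent: string; metadata: Record<string, unknown> };

// Hypothetical helper: number each document and append its source, if it has one.
function formatDocs(docs: RetrievedDoc[]): string {
  return docs
    .map((doc, i) => {
      const source =
        typeof doc.metadata.source === "string"
          ? ` (source: ${doc.metadata.source})`
          : "";
      return `[${i + 1}] ${doc.pageContent}${source}`;
    })
    .join("\n");
}

// Example usage (assumes the `retriever` defined above):
// const context = formatDocs(await retriever.invoke("biology"));
```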
Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:
Closing connections
Make sure you close the client instance when you are finished to avoid excessive resource consumption:
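With the MongoDB driver, closing the client is a call to its close() method. The sketch below wraps that call in try/finally so the connection is released even if a query throws. Closable and withClient are hypothetical names introduced for illustration; any object with an async close() method, such as the MongoClient created earlier, fits the interface.

```typescript
// Anything with an async close() method, such as a MongoClient.
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: run `work`, then always close the client,
// even if `work` throws.
async function withClient<C extends Closable, R>(
  client: C,
  work: (client: C) => Promise<R>
): Promise<R> {
  try {
    return await work(client);
  } finally {
    await client.close();
  }
}

// Simplest form, assuming the `client` from the Instantiation section:
// await client.close();
```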
API reference
For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.