----------------------
Yes, you're correct. In the BM25Retriever, the nodes are loaded into memory so that similar nodes can be retrieved quickly with the BM25 algorithm. For a large document collection, however, this increases memory overhead and can delay real-time responses while the data is loaded into memory.
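For context, here is the default, fully in-memory pattern the question refers to. This is a minimal sketch; the sample document and chunk size are illustrative:
from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter
from llama_index.retrievers.bm25 import BM25Retriever

# Illustrative input; in practice these come from your own loaders.
documents = [Document(text="LlamaIndex is a data framework for LLM applications.")]

# Chunk the documents into nodes; all nodes are kept in process memory.
splitter = SentenceSplitter(chunk_size=512)
nodes = splitter.get_nodes_from_documents(documents)

# The retriever tokenizes and indexes every node in memory up front.
bm25_retriever = BM25Retriever.from_defaults(nodes=nodes, similarity_top_k=2)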
To mitigate this, LlamaIndex lets you persist the data to disk or keep your nodes in a document store, which can be remote (MongoDB, Redis, etc.), reducing the memory overhead; both approaches are demonstrated in the LlamaIndex documentation. For instance, you can start with a SimpleDocumentStore (an in-memory store that can be persisted) to hold your nodes. Here's an example:
from llama_index.core.storage.docstore import SimpleDocumentStore

# Register the chunked nodes with the docstore.
docstore = SimpleDocumentStore()
docstore.add_documents(nodes)
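If you want the nodes to live outside the Python process entirely, a remote docstore exposes the same add_documents interface. A sketch assuming the llama-index-storage-docstore-mongodb package is installed and a MongoDB instance is reachable (the URI and database name are placeholders):
from llama_index.storage.docstore.mongodb import MongoDocumentStore

# Nodes are written to MongoDB rather than held in local memory.
docstore = MongoDocumentStore.from_uri(
    uri="mongodb://localhost:27017",  # placeholder connection string
    db_name="llama_index_docstore",   # placeholder database name
)
docstore.add_documents(nodes)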
Then, you can create the BM25Retriever using the docstore:
from llama_index.retrievers.bm25 import BM25Retriever
import Stemmer

# Build the retriever over the docstore contents instead of an in-memory node list.
bm25_retriever = BM25Retriever.from_defaults(
    docstore=docstore,
    similarity_top_k=2,
    stemmer=Stemmer.Stemmer("english"),
    language="english",
)
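Once built, the retriever is queried like any other LlamaIndex retriever: retrieve returns NodeWithScore objects ranked by BM25 score (the query string below is just an example):
# Run a query; results come back scored and ranked by BM25.
results = bm25_retriever.retrieve("How does BM25 rank documents?")
for result in results:
    print(result.score, result.node.get_content()[:100])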
You can also persist the BM25Retriever directly to disk:
# Save the built index to a directory, then restore it later without rebuilding.
bm25_retriever.persist("./bm25_retriever")
loaded_bm25_retriever = BM25Retriever.from_persist_dir("./bm25_retriever")
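Reloading restores a ready-to-query retriever, so the BM25 index does not have to be rebuilt from the nodes on every startup. For example (the query string is illustrative):
# The reloaded retriever answers queries immediately, without re-indexing.
results = loaded_bm25_retriever.retrieve("How do I reduce memory overhead?")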
These methods can help manage memory usage and response times when working with large document collections.
For more details, you can refer to the LlamaIndex documentation.
----------------------