I have an index with 170,000 embeddings

I have an index with 170,000 embeddings, and when I use the query engine it takes 15-20 seconds to give me an answer. What is the most appropriate way to handle an index with this many embeddings? I want to get results within 2-3 seconds.
Use a vector db like Qdrant or some other integration
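For context, a minimal sketch of what that looks like with the Qdrant integration, assuming a locally running Qdrant instance; the client URL, collection name, and `nodes` list below are placeholders:

```python
import qdrant_client
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Placeholder connection details; point this at your own Qdrant deployment.
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="products")

storage_context = StorageContext.from_defaults(vector_store=vector_store)
# `nodes` is the list of TextNode objects built from your data.
index = VectorStoreIndex(nodes=nodes, storage_context=storage_context)

query_engine = index.as_query_engine()
```

With a dedicated vector database, top-k retrieval is delegated to the store rather than scanning every embedding in memory, which is usually what brings retrieval latency down at this scale.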
Thanks @Logan M for the reply. I am trying to use a FAISS index now, and I have created the index like this:

```python
# Imports assume a recent llama-index release; older versions expose the same
# classes from the top-level llama_index package instead of llama_index.core.
import json

import faiss
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.core.schema import TextNode
from llama_index.vector_stores.faiss import FaissVectorStore

d = 1024
faiss_index = faiss.IndexFlatL2(d)

vector_store = FaissVectorStore(faiss_index=faiss_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# data is a pandas DataFrame with precomputed embeddings stored as JSON strings
nodes = []
for index, row in data.iterrows():
    text_embedding = json.loads(row["list_embedding"])
    text = row["text"]
    node = TextNode(text=text, metadata={"id": row["id"]}, embedding=text_embedding)
    nodes.append(node)

vector_index = VectorStoreIndex(nodes=nodes, storage_context=storage_context)
# save index to disk
vector_index.storage_context.persist(persist_dir="./vector_store__faiss_index_50000")
```
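As an aside, IndexFlatL2 does exact brute-force search. If exact search over ~170,000 vectors ever becomes the bottleneck, an approximate FAISS index can be swapped in when the store is built; a minimal sketch, assuming the same 1024-dimensional embeddings (the HNSW parameter below is just a common default):

```python
import faiss

# Approximate alternative to IndexFlatL2: HNSW trades a small amount of
# recall for much lower search latency on large collections.
d = 1024
faiss_index = faiss.IndexHNSWFlat(d, 32)  # 32 = graph connectivity (M)
```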

And loaded the index like this:

```python
# load index from disk (uses the imports from the snippet above)
from llama_index.core import load_index_from_storage

vector_store = FaissVectorStore.from_persist_dir("./vector_store__faiss_index_50000")
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, persist_dir="./vector_store__faiss_index_50000"
)
newwindex = load_index_from_storage(storage_context=storage_context)
```

But the problem is, when I try to query with this index:

```python
query_engine = newwindex.as_query_engine()
response = query_engine.query("Can you show me all the products related to bikes?")
```

It is giving me an **AssertionError**.
Hmmm, not sure what's going on there. Is there a full traceback? (FAISS is honestly pretty annoying to use lol)
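For anyone hitting the same error: without the traceback this is only a guess, but one common cause of an AssertionError with a FAISS IndexFlatL2 store is a dimension mismatch, i.e. the configured embedding model returns vectors of a different size than the d=1024 the index was built with (OpenAI's default text-embedding-ada-002, for instance, returns 1536 dimensions). A quick check, assuming the Settings API from recent llama-index versions and the `faiss_index` built above:

```python
from llama_index.core import Settings

# Compare the query-embedding dimension against the FAISS index dimension;
# FAISS asserts when the two do not match.
embed_model = Settings.embed_model  # whichever embedding model is configured
query_embedding = embed_model.get_query_embedding("Can you show me all the products related to bikes?")
print(len(query_embedding), faiss_index.d)
```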
Thanks Logan, I have used the FAISS implementation from their GitHub and it is working well now. Will share the complete traceback in a few hours.