Updated last month

Hey, I need to index a large amount of documents (1 GB), but when I persist the index, Weaviate fails:

    index = GPTVectorStoreIndex.from_documents(
        documents, service_context=service_context, storage_context=storage_context
    )

    index.storage_context.persist(persist_dir=persist_dir)
2 comments
  1. No need to call persist when using Weaviate; the entire index is already stored in Weaviate. To reload the index later, after you've called from_documents, you just need to set up the vector_store object again and call VectorStoreIndex.from_vector_store(vector_store).
  2. Interesting, not sure about this timeout 🤔 I'd have to check how many embeddings it's trying to send in each batch. One quick workaround is to create an empty index (VectorStoreIndex([], service_context=service_context, storage_context=storage_context)) and upload one document at a time with index.insert(document) in a for loop.
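
For anyone landing here later, a rough sketch of what the two replies describe. The helper names insert_one_at_a_time and reload_index are made up for illustration, and the retry count (max_attempts) is my own addition, not something from the replies. The first helper only assumes that index has an insert(document) method; the second assumes llama_index is installed and tries both the newer and the older import path, since the right one depends on your version.

```python
from typing import Any, Iterable, List, Tuple


def insert_one_at_a_time(
    index: Any, documents: Iterable[Any], max_attempts: int = 3
) -> List[Tuple[int, Exception]]:
    """Insert documents individually so each request to the vector store stays small.

    `index` only needs an `insert(document)` method (hypothetical helper, not
    part of llama_index). The retry loop is an illustrative addition.
    Returns (position, error) pairs for documents that failed every attempt.
    """
    failures: List[Tuple[int, Exception]] = []
    for position, document in enumerate(documents):
        for attempt in range(1, max_attempts + 1):
            try:
                index.insert(document)
                break  # this document is stored, move on to the next one
            except Exception as error:
                if attempt == max_attempts:
                    # Give up on this document but keep going with the rest.
                    failures.append((position, error))
    return failures


def reload_index(vector_store: Any) -> Any:
    """Rebuild the index object from an existing vector store (no persist needed)."""
    # Assumption: the import path depends on the installed llama_index version.
    try:
        from llama_index.core import VectorStoreIndex  # newer package layout
    except ImportError:
        from llama_index import VectorStoreIndex  # older package layout
    return VectorStoreIndex.from_vector_store(vector_store)
```

To use it, create the empty index as in the second reply, then call failures = insert_one_at_a_time(index, documents) and check whether failures is empty. Skipping a failed document instead of aborting is a design choice here, so that one bad upload does not throw away the progress on a 1 GB corpus; you can retry the returned positions afterwards.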