Persisted Vector Index

I have a locally persisted Vectorstore (created with VoyageAI embeddings). But every time I query it, my server makes external API calls to the VoyageAI embeddings API. It's very fast, but can someone explain why it needs to do this? I thought once I have the vectorstore kept locally there would be no need for ongoing calls to an external embeddings service.
It depends on your code. The examples on LI docs typically only show the simplest path, which will always run the embed logic. There is another example that shows how to separate and do them conditionally.

Personally, I break them down into separate scripts or services
You need to embed the query text, so that is the embedding call being made
Then, that query embedding is being used against the existing saved vectors, to find relevant text
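A minimal sketch of that flow against an already-persisted local index (the persist directory and the query string are placeholders):

Plain Text
from llama_index import StorageContext, load_index_from_storage

# Reload the persisted index; nothing here re-embeds the documents.
index = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage"))

# Retrieval makes the two steps explicit: the query string is embedded (one API call),
# then that vector is compared against the stored vectors to find the closest chunks.
retriever = index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("What does my data say about X?")
for node in nodes:
    print(node.score, node.node.get_content()[:80])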
yeah, there is that aspect as well. I guess it depends on what @webwrx means by "calls"
is it one or N (for each doc)
Embedding/creating your index runs embeddings for each text chunk.

But once those are saved, you only need to embed each query as it comes in
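To tie that back to the original question: the VoyageAI model still has to be attached when the persisted index is loaded, because it is what embeds each incoming query. A rough sketch, assuming the separate VoyageAI embeddings integration for llama-index is installed (the VoyageEmbedding class and its model_name / voyage_api_key arguments come from that package and may differ between versions):

Plain Text
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.voyageai import VoyageEmbedding  # assumes llama-index-embeddings-voyageai

# Use the same embedding model that built the index, otherwise query vectors won't match.
Settings.embed_model = VoyageEmbedding(model_name="voyage-2", voyage_api_key="...")

index = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage"))

# Only the query text is sent to the VoyageAI API; the document vectors are read from disk.
response = index.as_query_engine().query("What does my data say about X?")
print(response)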
right, but many of the examples don't show how to load a persisted index; the code they show will always run the document embedding process
we really need to see the code in question to understand what is happening
yea fair. Loading is fairly easy

If you are using the default vector db
Plain Text
index.storage_context.persist(persist_dir="./storage")

from llama_index import StorageContext, load_index_from_storage
index = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage"))


If you are using an external db, like qdrant, pinecone, weaviate, etc.

Plain Text
vector_store = ...
index = VectorStoreIndex.from_vector_store(vector_store)


It's a little simpler in the second case, since all the nodes are stored in the vector db, which simplifies storage
There is an example that uses an if statement to show a local index and the two modes in one script
somewhere on the docs
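Roughly, that if-statement pattern looks like this (a minimal sketch; the directory names are placeholders):

Plain Text
import os
from llama_index import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "./storage"

if os.path.exists(PERSIST_DIR):
    # Reload the saved index; no document embeddings are recomputed.
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
else:
    # First run: embed every chunk once, then persist for next time.
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)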
yea that gets used in a few notebooks
copy-pasta style
Need to GAR examples together maybe :]
https://blog.luk.sh/rag-vs-gar
let the user check some boxes to craft custom examples
or perhaps from natural language of the example you desire... that feels more appropriate
Ahh yes this makes sense. It is only making one very quick connection with each query from what I can see. Makes sense it is embedding the query itself. Just didn't understand what's going on under the hood. Thanks for your help!
Just one I believe. As Logan M suggested, I now understand it's the query text itself. 👍
Here's my code that creates it for reference:

Plain Text
import logging

from llama_index import (
  SimpleDirectoryReader,
  StorageContext,
  VectorStoreIndex,
  load_index_from_storage,
)

def create_vectorstore(app):
  ### DATA and VECTORSTORE locations
  data_dir = 'data'
  vectorStore_dir = 'index'

  try:
    ### TRY LOADING PERSISTED INDEX ###
    storage_context = StorageContext.from_defaults(persist_dir=vectorStore_dir)
    vectorStore = load_index_from_storage(storage_context)
    logging.info("Loaded Vector Store OK.")
    return vectorStore

  except Exception as e:
    logging.info(f"Error loading Vector Store: {e}")

    try:
      ### READ and INDEX DOCS ###
      logging.info("Creating Embeddings and Vector Store...")
      documents = SimpleDirectoryReader(data_dir).load_data()
      vectorStore = VectorStoreIndex.from_documents(documents)
      logging.info("Vector Store created OK.")

      ### PERSIST INDEX TO STORAGE ###
      vectorStore.storage_context.persist(persist_dir=vectorStore_dir)
      logging.info("Vector Store persisted OK.")

      return vectorStore

    except Exception as e:
      logging.error(f"Error creating or storing Vector Store: {e}")
I am using Pinecone as my vectorstore - snippet below. Would appreciate any guidance on how I can add new documents to the index and update existing ones?

Plain Text
def get_response(user_query, user):
    api_key = os.environ.get('PINECONE_API_KEY')
    pc = Pinecone(api_key=api_key)

    # Create a new index if it does not exist
    '''
    pc.create_index(
        name="quickstart",
        dimension=1536,
        metric="euclidean",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    '''
    pinecone_index = pc.Index("quickstart")

    # Load documents from the specified directory
    documents = SimpleDirectoryReader("./knowledge").load_data()

    # Connect to the existing Pinecone index as the vector store
    vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

    # Reload the existing index from the vector store
    index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

    # Create a query engine from the index
    model = "gpt-4o"
    Settings.llm = llamaOpenAI(temperature=0, model=model)
    query_engine = index.as_query_engine()
    ...
@Logan M could you pls guide - would really appreciate it.
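For reference, the index classes expose insert and delete helpers that write through to the attached vector store. A minimal sketch, assuming the same "quickstart" Pinecone index as in the snippet above and the newer llama-index package layout (the document text and the kb-article-42 id are just placeholders):

Plain Text
import os
from pinecone import Pinecone
from llama_index.core import Document, VectorStoreIndex
from llama_index.vector_stores.pinecone import PineconeVectorStore

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
vector_store = PineconeVectorStore(pinecone_index=pc.Index("quickstart"))

# Reconnect to the existing index; nothing gets re-embedded at this point.
# Assumes the embedding model is configured (default OpenAI unless Settings.embed_model is set).
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

# Add a new document: it is chunked, embedded, and upserted into Pinecone.
index.insert(Document(text="Some new content.", id_="kb-article-42"))

# Update an existing document: remove the chunks stored under its id,
# then re-insert the revised version under the same id.
index.delete_ref_doc("kb-article-42", delete_from_docstore=True)
index.insert(Document(text="Revised content.", id_="kb-article-42"))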