Load

This has got to be simple and I'm just missing it.
Plain Text
index = VectorStoreIndex(storage_context=storage_context, service_context=service_context)
ValueError: One of nodes or index_struct must be provided.

All the samples show building the index with VectorStoreIndex.from_documents, but what's the right way to take an existing, already-built index and simply create the object so that you can do other things like index.as_query_engine()?
Plain Text
index.storage_context.persist(persist_dir="./storage")

from llama_index import StorageContext, load_index_from_storage 

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
You can also optionally pass the service context as a kwarg when loading, if you had customized it
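Something like this (a minimal sketch; it assumes the same pre-0.10 llama_index API as above, where load_index_from_storage forwards extra kwargs to the index being loaded):
Plain Text
from llama_index import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
# service_context is whatever customized ServiceContext the index was built with
index = load_index_from_storage(storage_context, service_context=service_context)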
Yeah I did, okie, lemme give that a try
Hmmm, not sure I can use this?
No index in storage context, check if you specified the right persist_dir.
What's the play here? My index isn't on disk, I'm using a Weaviate vector store, e.g.
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Book", text_key="content")
oh, you are using a vector db!
set up the vector store
then

Plain Text
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)
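Putting it together, roughly (a sketch, not exact code — it assumes the pre-0.10 imports and a Weaviate instance reachable at the URL shown; adjust the URL, index_name, and ServiceContext to your setup):
Plain Text
import weaviate
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.vector_stores import WeaviateVectorStore

# Connect to the Weaviate instance that already holds the ingested data
client = weaviate.Client("http://localhost:8080")
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Book", text_key="content")

# Wrap the existing vector store in an index object -- nothing is re-ingested here
service_context = ServiceContext.from_defaults()
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)

query_engine = index.as_query_engine()
print(query_engine.query("What is the book about?"))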
Awesome, that got me further.

@Logan M "The new context does not provide any additional information ...." in response to a query I sent. Is there some unseen context window it's maintaining that I need to kill for fresh queries?
Nah that's just the LLM being stubborn, that's a classic response from gpt-3.5 if I had to guess

Internally llama index has prompt templates that take the question and the context retrieved from the index
If all the context doesn't fit into a single LLM call, it gets refined across multiple LLM calls
If the next context is not helpful, it's supposed to repeat the existing answer. But here the LLM decided not to follow instructions
I'm surprised it hit the refine prompt though, unless you've changed the chunk size or top k, or are using a smaller input LLM
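For reference, those knobs live roughly here (a sketch with example values; in some older releases the chunk size kwarg was called chunk_size_limit):
Plain Text
from llama_index import ServiceContext

# Smaller chunks and a larger top-k make it more likely that the retrieved
# text overflows a single LLM call and triggers the refine prompt.
# Pass this service_context when building or loading the index.
service_context = ServiceContext.from_defaults(chunk_size=512)

query_engine = index.as_query_engine(
    similarity_top_k=2,        # how many chunks are retrieved per query
    response_mode="compact",   # pack as much context per LLM call as possible, fewer refine steps
)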
I did change chunk size...
Actually, I initialized the LLM and let it know the max tokens for 3.5 ChatGPT is 4000 tokens
Is there a good way to see debugging output on what it's doing? I'm thinking of modifying how it queries and then prompts over the data
You can set a debug logger:
Plain Text
import logging
import sys

# Send DEBUG-level log output to stdout
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# Reset any existing handlers and attach a fresh stdout handler
logging.getLogger().handlers = []
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))


Or you can try using the debug callback, but its output takes some getting used to
https://github.com/jerryjliu/llama_index/blob/main/docs/examples/callbacks/LlamaDebugHandler.ipynb
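Roughly like this (a sketch based on that notebook; it assumes the callbacks module available in that version of llama_index):
Plain Text
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, LlamaDebugHandler

# Prints a trace of every event (retrieval, synthesis, LLM calls) when a query finishes
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([llama_debug])
)

# Build or load the index with this service_context, run a query, then e.g.
# llama_debug.get_llm_inputs_outputs() shows the exact prompts and responses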

There's also the token counting handler if you just want to see inputs/outputs and token counts
https://github.com/jerryjliu/llama_index/blob/main/docs/examples/callbacks/TokenCountingHandler.ipynb
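Along these lines (again a sketch following that notebook; the tiktoken model name is just an example):
Plain Text
import tiktoken
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([token_counter])
)

# After running queries against an index built with this service_context:
print(token_counter.prompt_llm_token_count)      # tokens sent in prompts
print(token_counter.completion_llm_token_count)  # tokens generated by the LLM
print(token_counter.total_llm_token_count)       # prompt + completion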
This is all really helpful, thank you. Do you have a good understanding of how the GPTSimpleKeywordTableIndex, https://gpt-index.readthedocs.io/en/v0.6.14/examples/composable_indices/ComposableIndices-Weaviate.html, plays a role as the root index in the composability graph? I'm not altogether clear how it's creating such a thing, or leveraging such a thing.
It's basically using keywords from the summaries of each sub-index to pick which sub-index to send the query to
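Something like this (a sketch following that docs page; the sub-index names and summaries are made up, and the imports match the 0.6.x-era API):
Plain Text
from llama_index import GPTSimpleKeywordTableIndex
from llama_index.indices.composability import ComposableGraph

# The root keyword table index extracts keywords from each sub-index summary;
# at query time, keywords in the question route it to the matching sub-index
graph = ComposableGraph.from_indices(
    GPTSimpleKeywordTableIndex,
    [book_index, articles_index],   # existing sub-indices (hypothetical names)
    index_summaries=[
        "Contents of the book stored in Weaviate",
        "A collection of related news articles",
    ],
)

query_engine = graph.as_query_engine()
print(query_engine.query("What does the book say about X?"))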
Ah ok, that makes a lot more sense.