Load

This has got to be simple and I'm just missing it.
Plain Text
index = VectorStoreIndex(storage_context=storage_context, service_context=service_context)
ValueError: One of nodes or index_struct must be provided.

All the samples show building the index with VectorStoreIndex.from_documents, but what's the right way to take an existing, already-built index and simply create the object so that you can do other things like index.as_query_engine()?
Plain Text
index.storage_context.persist(persist_dir="./storage")

from llama_index import StorageContext, load_index_from_storage 

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
You can also optionally pass the service context as a kwarg when loading, if you had customized it
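Something like this (a minimal sketch; it assumes the same pre-0.10 llama_index API as above, where load_index_from_storage forwards extra kwargs to the index being loaded):
Plain Text
from llama_index import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
# service_context is whatever customized ServiceContext the index was built with
index = load_index_from_storage(storage_context, service_context=service_context)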
Yeah I did, okie, lemme give that a try
Hmmm, not sure I can use this?
No index in storage context, check if you specified the right persist_dir.
What's the play here? My index isn't on disk, I'm using a Weaviate vector store, e.g.
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Book", text_key="content")
oh, you are using a vector db!
set up the vector store
then

Plain Text
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)
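Putting it together, roughly (a sketch, not exact code — it assumes the pre-0.10 imports and a Weaviate instance reachable at the URL shown; adjust the URL, index_name, and ServiceContext to your setup):
Plain Text
import weaviate
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.vector_stores import WeaviateVectorStore

# Connect to the Weaviate instance that already holds the ingested data
client = weaviate.Client("http://localhost:8080")
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Book", text_key="content")

# Wrap the existing vector store in an index object -- nothing is re-ingested here
service_context = ServiceContext.from_defaults()
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)

query_engine = index.as_query_engine()
print(query_engine.query("What is the book about?"))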
Awesome, that got me further.

@Logan M "The new context does not provide any additional information ...." in response to a query I sent. Is there some unseen context window it's maintaining that I need to kill for fresh queries?
Nah that's just the LLM being stubborn, that's a classic response from gpt-3.5 if I had to guess

Internally llama index has prompt templates that take the question and the context retrieved from the index
If all the context doesn't fit into a single LLM call, it gets refined across multiple LLM calls
If the next context is not helpful, it's supposed to repeat the existing answer. But here the LLM decided not to follow instructions
I'm surprised it hit the refine prompt though, unless you've changed the chunk size or top k, or are using a smaller input LLM
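For reference, those knobs live roughly here (a sketch with example values; in some older releases the chunk size kwarg was called chunk_size_limit):
Plain Text
from llama_index import ServiceContext

# Smaller chunks and a larger top-k make it more likely that the retrieved
# text overflows a single LLM call and triggers the refine prompt.
# Pass this service_context when building or loading the index.
service_context = ServiceContext.from_defaults(chunk_size=512)

query_engine = index.as_query_engine(
    similarity_top_k=2,        # how many chunks are retrieved per query
    response_mode="compact",   # pack as much context per LLM call as possible, fewer refine steps
)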
I did change chunk size...
Actually, I initialized the LLM and let it know the max tokens for 3.5 ChatGPT is 4000 tokens
Is there a good way to see debugging output on what it's doing? I'm thinking of modifying how it queries and then prompts over the data
You can set a debug logger:
Plain Text
import logging
import sys

# Send DEBUG-level log output to stdout
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# Reset any existing handlers and attach a fresh stdout handler
logging.getLogger().handlers = []
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))


Or you can try using the debug callback, but its output takes some getting used to
https://github.com/jerryjliu/llama_index/blob/main/docs/examples/callbacks/LlamaDebugHandler.ipynb
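Roughly like this (a sketch based on that notebook; it assumes the callbacks module available in that version of llama_index):
Plain Text
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, LlamaDebugHandler

# Prints a trace of every event (retrieval, synthesis, LLM calls) when a query finishes
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([llama_debug])
)

# Build or load the index with this service_context, run a query, then e.g.
# llama_debug.get_llm_inputs_outputs() shows the exact prompts and responses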

There's also the token counting handler if you just want to see inputs/outputs and token counts
https://github.com/jerryjliu/llama_index/blob/main/docs/examples/callbacks/TokenCountingHandler.ipynb
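Along these lines (again a sketch following that notebook; the tiktoken model name is just an example):
Plain Text
import tiktoken
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([token_counter])
)

# After running queries against an index built with this service_context:
print(token_counter.prompt_llm_token_count)      # tokens sent in prompts
print(token_counter.completion_llm_token_count)  # tokens generated by the LLM
print(token_counter.total_llm_token_count)       # prompt + completion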
This is all really helpful, thank you. Do you have a good understanding of how the GPTSimpleKeywordTableIndex, https://gpt-index.readthedocs.io/en/v0.6.14/examples/composable_indices/ComposableIndices-Weaviate.html, plays a role as the root index in the composability graph? I'm not altogether clear how it's creating such a thing, or leveraging such a thing.
It's basically using keywords from the summaries of each sub-index to pick which sub-index to send the query to
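Something like this (a sketch following that docs page; the sub-index names and summaries are made up, and the imports match the 0.6.x-era API):
Plain Text
from llama_index import GPTSimpleKeywordTableIndex
from llama_index.indices.composability import ComposableGraph

# The root keyword table index extracts keywords from each sub-index summary;
# at query time, keywords in the question route it to the matching sub-index
graph = ComposableGraph.from_indices(
    GPTSimpleKeywordTableIndex,
    [book_index, articles_index],   # existing sub-indices (hypothetical names)
    index_summaries=[
        "Contents of the book stored in Weaviate",
        "A collection of related news articles",
    ],
)

query_engine = graph.as_query_engine()
print(query_engine.query("What does the book say about X?"))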
Ah ok, that makes a lot more sense.