Are you using the query_configs
option in your query call?
I was trying to create query_configs but not sure what goes in it when both levels are simple vector stores. Do I only need one object in the json for simple_dict type and put the similarity_top_k setting in the kwargs section? Adding what I described did not change the search behavior and did not increase the number of node sources in the response.
Just took a peek at the code for this lol
So, you can have one query config for both, or you can give each index struct a doc_id and specify different settings for unique ids using index_struct_id in the config
did you try something like this to set the option for all of them?
query_configs = [{
"index_struct_type": "simple_dict",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 3 }
}]
Thanks. I think that's what I tried, but I'll try it again. If I were to create two definitions, what does the struct_id align to? If I have 23 subindices, would each have its own definition with a struct_id, or is there a way for them all to share the same definition?
Is there a place to designate struct id on an index?
I thiiiink each one would need its own definition. But a better setup would be to give the top-level index the id
something like this:
sub_vector_1 = GPTSimpleVectorIndex(documents)
....
top_level_index = GPTSimpleVectorIndex([sub_vector_1, ...])
top_level_index.set_doc_id("top_level")
graph = ComposableGraph.build_from_index(top_level_index)
query_configs = [{
"index_struct_type": "simple_dict",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 3 }
},
{
"index_struct_type": "simple_dict",
"index_struct_id": "top_level",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 1 }
}
]
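For reference, that list then just gets passed into the graph query call, something like this (if I'm remembering the signature right):
response = graph.query("your question here", query_configs=query_configs)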
set_doc_id() is what sets the id
@Logan M Your first config example worked like a charm. I must have done something slightly off when I tried my brute force attempts earlier. You also helped me confirm my shaky understanding of the documentation and expanded on it. The best part is that using this multi-index approach is giving me more accurate responses. THANKS!
You are welcome, happy to help!!
@TylerDurden are you really experiencing better answers with multiple simple vector indices? How many docs are there in your subindices? I ask because I noticed a worsening of responses.
also curious how you set the text for each subindex
@TylerDurden able to share any snippets on how you created the subindices? appreciate any tips! what led you to try adding subindices?
Here's a sample where I create an index for each file in a directory, and build a top index for the collection of indices.
import os
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader
from llama_index.composability import ComposableGraph

indices = []
i = 0
# Iterate over files in the directory
for filename in os.listdir(path):
    # Get the full path of the file
    file_path = os.path.join(path, filename)
    # Check if the file is a regular file (not a directory)
    if os.path.isfile(file_path):
        # Build an index for each file found in the directory
        document = SimpleDirectoryReader('', [file_path]).load_data()
        indices.append(GPTSimpleVectorIndex(document))
        # Summarize each file with the LLM; the summary becomes the text
        # the top-level index matches against
        summary = indices[i].query("Summarize this document.")
        indices[i].set_text(str(summary))
        indices[i].set_doc_id(str(filename))
        i = i + 1

# Build a top-level index over the sub-indices and wrap it in a graph
top_index = GPTSimpleVectorIndex(indices)
graph = ComposableGraph.build_from_index(top_index)
response = graph.query("<enter prompt here to query the graph>")
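If you want to keep that graph around between runs, I believe save_to_disk / load_from_disk work on the graph the same way they do on the plain indices (the filename is just an example):
graph.save_to_disk("graph.json")
# later, in another session:
graph = ComposableGraph.load_from_disk("graph.json")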
@AndreaSel93 @xmalina I currently have 23 documents that each have their own index. I use the LLM to create the summary text for each index. My sample code doesn't have the exact prompt because mine is tailored for a very specific business use case. When I had all the files in a single index, the responses to typed questions just didn't feel right. The answer glommed onto a document node that was just "ok", and if I increased similarity_top_k, the answer just got wordier, not really more accurate. Now, the summarizing text of each index in the top index seems to steer the query to a better document.
You could play with the summary prompt in my sample code to better suit your collection. It's possible that I could have just injected summaries into each of my documents before indexing and it may have worked the same way with a single index.
I still need to see if it holds up with more and more documents. If not, I will probably need to inject the "summaries" into the files instead.
Just a note, in your example, the summary prompt is only summarizing a single node.
If you need to improve the summary, use response_mode="tree_summarize" and use a temp list index (or crank up the top_k for the vector index with the same response mode).
Creating summaries with the vector index allows you to steer the topic due to the similarity matching. Meanwhile, a list index will look at every node.
Care to share the summary fix? Would like to see that ^^
@dagthomas something like this
...
document = SimpleDirectoryReader('',[file_path]).load_data()
indices.append(GPTSimpleVectorIndex(document))
temp_index = GPTListIndex(document)
summary = temp_index.query("Summarize this document.", response_mode='tree_summarize')
indices[i].set_text(str(summary))
indices[i].set_doc_id(str(filename))
...
thanks all! looking forward to digging into that. really helpful. thanks again.
I ultimately ended up using a simpler prompt for my summaries, and surprisingly I now seem to get better answers with top_k=1.
summary = tempIndex.query("Summarize this document:")
However, when I attempted to change the llm_predictor to "gpt-3.5-turbo", I now more often than not get answers that start with, "The context information does not provide any indication..."
I have the temperature set to 0.7 and am using the "gpt-3.5-turbo" model for building the indices, loading the graph from disk, and running any queries.
Thanks, I wasn't sure if I was missing something related to designating the LLM. I just noticed I didn't designate LLM in my ComposableGraph save or load to disk, and wasn't sure if that's a thing, or if it matters.
You'll want to pass in llm_predictor when you do load_from_disk or during the query()
after loading too
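Rough sketch of what I mean, assuming load_from_disk and query both accept the llm_predictor kwarg here the same way the plain index classes do (the filename and question are just placeholders):
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor

llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo"))
graph = ComposableGraph.load_from_disk("graph.json", llm_predictor=llm_predictor)
response = graph.query("your question", query_configs=query_configs, llm_predictor=llm_predictor)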
I tried passing the llm_predictor when I do the load_from_disk and it didn't improve the experience. I had already been passing the LLM in the query command. (The saved ComposableGraph was not saved with the LLM passed in, but it was built using indices that had the LLM designated.) I'm hoping the upcoming PR improves the results with the GPT 3.5 model.
Me too! So far, my experience is it's hard to get it to follow instructions every time.
very stubborn
I'm very happy with davinci-003 results in my proof of concept, but obviously want to start reducing the costs but maintain quality.
OK, so now that nodes are a thing, nothing works. I have muddled my way through patching most of the code, but it looks like you can't build a SimpleVector top index?
top_index = GPTSimpleVectorIndex.from_documents(indices,service_context=service_context)
returns the error:
File "/.../llama_index/indices/base.py", line 96, in from_documents
docstore.set_document_hash(doc.get_doc_id(), doc.get_doc_hash())
AttributeError: 'GPTSimpleVectorIndex' object has no attribute 'get_doc_id'
And from what I can tell, there is no longer a way to set a doc_id for an index?
See the latest guide for composability:
https://gpt-index.readthedocs.io/en/latest/how_to/index_structs/composability.html
There's a small typo in the docs though, your example would look something like this:
graph = ComposableGraph.from_indices(
GPTSimpleVectorIndex,
indices,
index_summaries=[index1_summary, index2_summary, index3_summary, ...],
)
It seems like ids are set like this now?
i.e. GPTSimpleVectorIndex.index_struct.index_id="my_id"
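If that's the case, then presumably the per-index query config would reference it through index_struct_id like before, something along these lines (not verified against the new version):
indices[0].index_struct.index_id = "my_id"
query_configs = [{
"index_struct_type": "simple_dict",
"index_struct_id": "my_id",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 1 }
}]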
I realize there are a ton of breaking changes. However, hopefully going forward llama_index can support some super cool features
Hmm... I'll try that. Looks like I'm cutting out the top_index middleman... I'm a bit nervous about the resulting responses. My POC with the old code was impressing stakeholders. I'll keep my fingers crossed.
Yea if you run into anything else let me know! It should be working the same
Does this stuff work with GPT 3.5? After trying the new graph build, I'm still getting "The context information does not provide a clear answer to this question." for some of my questions when I used to get a great response with davinci-003. I had the same issue trying to get gpt3.5 working with my old code.
Even davinci-003 is not working well with the graph I built using the new method. I'm getting much shorter replies, and some are inaccurate. I feel like my index of indexes may have been doing something this new graph builder isn't doing.
If I set top_k=2, I get more accurate results with davinci-003, but they still seem brief compared to my old code. I can try top_k=3 to see what happens, but I was getting great responses with top_k=1 with the old code.
Are the summaries getting templated?
OpenAI has also updated their models recently (or so I've heard), which could have some effect
ChatGPT is notorious for those responses though
Not sure what you mean here?
When the graph builder is fed the summaries, does it do some llm transformation or prefixing to the summary text? Or does it just inject it verbatim?
My summaries seem a bit stilted when I look at the graph that was saved to disk. "This document..."
I have some old graph saves that don't start that way, but I may have played with the prompt. Or is it possible davinci-003 is replying differently to the same query I use to get the summary?
I think this is the answer here
If you want more control over the prompts sent to chatgpt or davinci (maybe you have more specific instructions) you can also customize the prompt templates
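e.g. something along these lines for the QA prompt (the wording here is just an illustration, and I'm assuming text_qa_template can be passed through query_kwargs the same way similarity_top_k is):
from llama_index import QuestionAnswerPrompt

qa_prompt = QuestionAnswerPrompt(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given that context and no prior knowledge, answer the question in complete sentences: {query_str}\n"
)
# then hand it to the query via the config, e.g.
# "query_kwargs": { "similarity_top_k": 1, "text_qa_template": qa_prompt }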
After deeper investigation, I don't think the query to generate the summary is giving any different response than before. And they often start with "This document..."
Is it possible "summaries" in the new version behave differently?
I see my saved graph has a "summary" field. I don't have a graph saved from the previous version to see if it has "summary".
I think it should be the same. They get treated like a normal document by the top level index as far as I know
So before, I built a vector index of 23 vector indices....
Then built a graph using that "top_index"
Now, I feed the 23 indices and 23 summaries into the graph builder.
graph = ComposableGraph.from_indices(GPTSimpleVectorIndex, indices, summaries)
Maybe the changed behavior has nothing to do with this graph.
I may need to do some side-by-side comparison of the saved graphs and indices built between the two llama_index versions.
Yea if you notice any degradation in performance between llama index versions, do let us know!
Otherwise, I blame OpenAI for any changes in response quality lol
You will notice the saved jsons are much smaller now. The older library duplicated a lot of data
First thing I'll do is confirm the "old" version works the same as it did last week.
...get my baseline well documented.
Here are samples of the same "chat" using a ComposableGraph with llama_index v0.4.24 vs. v0.5.2. The v0.5.2 responses are shorter. Both are using the default LLM.
Example:
You: Can I put a dead rat in the bagster?
Bagster Bot:
No, you cannot put a dead rat in the Bagster. The Bagster does not accept anything toxic or hazardous, such as food waste.
VS.
You: Can I put a dead rat in the bagster?
Bagster Bot:
No, you cannot put a dead rat in the Bagster.
That's the biggest difference I can think of (sorry btw, I just remembered this fix after seeing the differences here)
So if the LLM is seeing multiple nodes with the same text in v0.4.24, that is probably encouraging it to write longer answers?
Would top_k = 1 also be impacted? Was it getting 2 nodes instead of 1? Did default node sizes change since v0.4.24?
I'm new to python, etc... Is there a way to output some debug logging that shows what's happening during the query execution against the ComposableGraph?
Hard to say, I don't have a full understanding of the bug tbh
There's a way! Give me one sec, I'll pull up the example
from llama_index.logger import LlamaLogger
llama_logger = LlamaLogger()
service_context = ServiceContext.from_defaults(..., llama_logger=llama_logger)
....
response = index.query("my query")
print(llama_logger.get_logs()) # prints all logs, which basically includes all LLM inputs and responses
llama_logger.reset() # this clears the log stack
You can also turn on debug logs, but you won't get the same control over the logs as you do with the llama logger
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
Thanks, I probably won't dive deeper into this for a few days. And here I thought I was ready to move on to learning more about langchain to build a bot.
nah you are ready! Do both at the same time lol
One pro tip for when you get to langchain -> you can use llama index as a tool with langchain agents. Furthermore, you can have multiple indexes for different topics or use cases. It'll make sense in due time, trust
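Roughly what that looks like (the tool name, description, and question are made up for the example):
from langchain.agents import Tool, initialize_agent
from langchain.chat_models import ChatOpenAI

tools = [
    Tool(
        name="bagster_docs",
        # wrap the graph query so the agent can call it as a tool
        func=lambda q: str(graph.query(q, query_configs=query_configs)),
        description="Useful for answering questions about the Bagster documents.",
    ),
]
agent = initialize_agent(tools, ChatOpenAI(temperature=0), agent="zero-shot-react-description", verbose=True)
print(agent.run("Can I put a dead rat in the bagster?"))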