Are you using the query_configs
option in your query call?
I was trying to create query_configs but not sure what goes in it when both levels are simple vector stores. Do I only need one object in the json for simple_dict type and put the similarity_top_k setting in the kwargs section? Adding what I described did not change the search behavior and did not increase the number of node sources in the response.
Just took a peek at the code for this lol
So, you can have one query config for both, or you can give each index struct a doc_id and specify different settings for unique ids using index_struct_id in the config
did you try something like this to set the option for all of them?
query_configs = [{
"index_struct_type": "simple_dict",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 3 }
}]
Thanks. I think that's what I tried, but I'll try it again. If I were to create two definitions, what does the struct_id align to? If I have 23 subindices, would each have its own definition with a struct_id, or is there a way for them all to share the same definition?
Is there a place to designate struct id on an index?
I thiiiink each one would need its own definition. But a better setup would be to give the top-level index the id
something like this:
sub_vector_1 = GPTSimpleVectorIndex(documents)
....
top_level_index = GPTSimpleVectorIndex([sub_vector_1, ...])
top_level_index.set_doc_id("top_level")
graph = ComposableGraph.build_from_index(top_level_index)
query_configs = [{
"index_struct_type": "simple_dict",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 3 }
},
{
"index_struct_type": "simple_dict",
"index_struct_id": "top_level",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 1 }
}
]
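For reference, that list then just gets passed into the graph query call, something like this (if I'm remembering the signature right):
response = graph.query("your question here", query_configs=query_configs)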
set_doc_id() is what sets the id
@Logan M Your first config example worked like a charm. I must have done something slightly off when I tried my brute force attempts earlier. You also helped me confirm my shaky understanding of the documentation and expanded on it. The best part is that using this multi-index approach is giving me more accurate responses. THANKS!
You are welcome, happy to help!!
@TylerDurden are you really experiencing better answers with multiple simple vector indices? How many docs are there in your subindices? I ask because I noticed a worsening of responses.
also curious how you set the text for each subindex
@TylerDurden able to share any snippets on how you created the subindices? appreciate any tips! what led you to try adding subindices?
Here's a sample where I create an index for each file in a directory, and build a top index for the collection of indices.
import os
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader
from llama_index.composability import ComposableGraph

indices = []
i = 0
# Iterate over files in the directory
for filename in os.listdir(path):
    # Get the full path of the file
    file_path = os.path.join(path, filename)
    # Check if the file is a regular file (not a directory)
    if os.path.isfile(file_path):
        # Build an index for each file found in the directory
        document = SimpleDirectoryReader('', [file_path]).load_data()
        indices.append(GPTSimpleVectorIndex(document))
        # Summarize each file with the LLM; the summary becomes the text
        # the top-level index matches against
        summary = indices[i].query("Summarize this document.")
        indices[i].set_text(str(summary))
        indices[i].set_doc_id(str(filename))
        i = i + 1

# Build a top-level index over the sub-indices and wrap it in a graph
top_index = GPTSimpleVectorIndex(indices)
graph = ComposableGraph.build_from_index(top_index)
response = graph.query("<enter prompt here to query the graph>")
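If you want to keep that graph around between runs, I believe save_to_disk / load_from_disk work on the graph the same way they do on the plain indices (the filename is just an example):
graph.save_to_disk("graph.json")
# later, in another session:
graph = ComposableGraph.load_from_disk("graph.json")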
@AndreaSel93 @xmalina I currently have 23 documents that each have their own index. I use the LLM to create the summary text for each index. My sample code doesn't have the exact prompt because mine is tailored for a very specific business use case. When I had all the files in a single index, the responses to typed questions just didn't feel right. The answer glommed onto a document node that was just "ok", and if I increased similarity_top_k, the answer just got wordier, not really more accurate. Now, the summarizing text of each index in the top index seems to steer the query to a better document.
You could play with the summary prompt in my sample code to better suit your collection. It's possible that I could have just injected summaries into each of my documents before indexing and it may have worked the same way with a single index.
I still need to see if it holds up with more and more documents. If not, I will probably need to inject the "summaries" into the files instead.
Just a note, in your example, the summary prompt is only summarizing a single node.
If you need to improve the summary, use response_mode="tree_summarize" and use a temp list index (or crank up the top_k for the vector index with the same response mode).
Creating summaries with the vector index allows you to steer the topic due to the similarity matching. Meanwhile, a list index will look at every node.
Care to share the summary fix? Would like to see that ^^
@dagthomas something like this
...
document = SimpleDirectoryReader('',[file_path]).load_data()
indices.append(GPTSimpleVectorIndex(document))
temp_index = GPTListIndex(document)
summary = temp_index.query("Summarize this document.", response_mode='tree_summarize')
indices[i].set_text(str(summary))
indices[i].set_doc_id(str(filename))
...
thanks all! looking forward to digging into that. really helpful. thanks again.
I ultimately ended up using a simpler prompt for my summaries, and surprisingly I now seem to get better answers with top_k=1.
summary = tempIndex.query("Summarize this document:")
However, when I attempted to change the llm_predictor to "gpt-3.5-turbo", I now more often than not get answers that start with, "The context information does not provide any indication..."
I have the temperature set to 0.7 and am using the "gpt-3.5-turbo" model for building the indices, loading the graph from disk, and running any queries.
Thanks, I wasn't sure if I was missing something related to designating the LLM. I just noticed I didn't designate LLM in my ComposableGraph save or load to disk, and wasn't sure if that's a thing, or if it matters.
You'll want to pass in llm_predictor when you do load_from_disk or during the query()
after loading too
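Rough sketch of what I mean, assuming load_from_disk and query both accept the llm_predictor kwarg here the same way the plain index classes do (the filename and question are just placeholders):
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor

llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo"))
graph = ComposableGraph.load_from_disk("graph.json", llm_predictor=llm_predictor)
response = graph.query("your question", query_configs=query_configs, llm_predictor=llm_predictor)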
I tried passing the llm_predictor when I do the load_from_disk and it didn't improve the experience. I had already been passing the LLM in the query command. (The saved ComposableGraph was not saved with the LLM passed in, but it was built using indices that had the LLM designated.) I'm hoping the upcoming PR improves the results with the GPT 3.5 model.
Me too! So far, my experience is it's hard to get it to follow instructions every time.
very stubborn
I'm very happy with davinci-003 results in my proof of concept, but obviously want to start reducing the costs but maintain quality.
OK, so now that nodes are a thing, nothing works. I have muddled my way through patching most of the code, but it looks like you can't build a SimpleVector top index?
top_index = GPTSimpleVectorIndex.from_documents(indices,service_context=service_context)
returns the error:
File "/.../llama_index/indices/base.py", line 96, in from_documents
docstore.set_document_hash(doc.get_doc_id(), doc.get_doc_hash())
AttributeError: 'GPTSimpleVectorIndex' object has no attribute 'get_doc_id'
And from what I can tell, there is no longer a way to set a doc_id for an index?
See the latest guide for composability:
https://gpt-index.readthedocs.io/en/latest/how_to/index_structs/composability.html
There's a small typo in the docs though, your example would look something like this:
graph = ComposableGraph.from_indices(
GPTSimpleVectorIndex,
indices,
index_summaries=[index1_summary, index2_summary, index3_summary, ...],
)
It seems like ids are set like this now?
i.e. GPTSimpleVectorIndex.index_struct.index_id="my_id"
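If that's the case, then presumably the per-index query config would reference it through index_struct_id like before, something along these lines (not verified against the new version):
indices[0].index_struct.index_id = "my_id"
query_configs = [{
"index_struct_type": "simple_dict",
"index_struct_id": "my_id",
"query_mode": "default",
"query_kwargs": { "similarity_top_k": 1 }
}]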
I realize there are a ton of breaking changes. However, hopefully going forward llama_index can support some super cool features
Hmm... I'll try that. Looks like I'm cutting out the top_index middleman... I'm a bit nervous about the resulting responses. My POC with the old code was impressing stakeholders. I'll keep my fingers crossed.
Yea if you run into anything else let me know! It should be working the same
Does this stuff work with GPT 3.5? After trying the new graph build, I'm still getting "The context information does not provide a clear answer to this question." for some of my questions when I used to get a great response with davinci-003. I had the same issue trying to get gpt3.5 working with my old code.
Even davinci-003 is not working well with the graph I built using the new method. I'm getting much shorter replies, and some are inaccurate. I feel like my index of indexes may have been doing something this new graph builder isn't doing.
If I set top_k=2, I get more accurate results with davinci-003, but they still seem brief compared to my old code. I can try top_k=3 to see what happens, but I was getting great responses with top_k=1 with the old code.
Are the summaries getting templated?
OpenAI has also updated their models recently (or so I've heard), which could have some effect
ChatGPT is notorious for those responses though
Not sure what you mean here?
When the graph builder is fed the summaries, does it do some llm transformation or prefixing to the summary text? Or does it just inject it verbatim?
My summaries seem a bit stilted when I look at the graph that was saved to disk. "This document..."
I have some old graph saves that don't start that way, but I may have played with the prompt. Or is it possible davinci-003 is replying differently to the same query I use to get the summary?
I think this is the answer here
If you want more control over the prompts sent to chatgpt or davinci (maybe you have more specific instructions) you can also customize the prompt templates
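e.g. something along these lines for the QA prompt (the wording here is just an illustration, and I'm assuming text_qa_template can be passed through query_kwargs the same way similarity_top_k is):
from llama_index import QuestionAnswerPrompt

qa_prompt = QuestionAnswerPrompt(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given that context and no prior knowledge, answer the question in complete sentences: {query_str}\n"
)
# then hand it to the query via the config, e.g.
# "query_kwargs": { "similarity_top_k": 1, "text_qa_template": qa_prompt }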
After deeper investigation, I don't think the query to generate the summary is giving any different response than before. And they often start with "This document..."
Is it possible "summaries" in the new version behave differently?
I see my saved graph has a "summary" field. I don't have a graph saved from the previous version to see if it has "summary".
I think it should be the same. They get treated like a normal document by the top level index as far as I know
So before, I built a vector index of 23 vector indices....
Then built a graph using that "top_index"
Now, I feed the 23 indices and 23 summaries into the graph builder.
graph = ComposableGraph.from_indices(GPTSimpleVectorIndex, indices, summaries)
Maybe the changed behavior has nothing to do with this graph.
I may need to do some side-by-side comparison of the saved graphs and indices built between the two llama_index versions.
Yea if you notice any degradation in performance between llama index versions, do let us know!
Otherwise, I blame OpenAI for any changes in response quality lol
You will notice the saved jsons are much smaller now. The older library duplicated a lot of data
First thing I'll do is confirm the "old" version works the same as it did last week.
...get my baseline well documented.
Here are samples of the same "chat" using a ComposableGraph with llama_index v0.4.24 vs. v0.5.2. The v0.5.2 responses are shorter. Both are using the default LLM.
Example:
You: Can I put a dead rat in the bagster?
Bagster Bot:
No, you cannot put a dead rat in the Bagster. The Bagster does not accept anything toxic or hazardous, such as food waste.
VS.
You: Can I put a dead rat in the bagster?
Bagster Bot:
No, you cannot put a dead rat in the Bagster.
That's the biggest difference I can think of (sorry btw, I just remembered this fix after seeing the differences here)
So if the LLM is seeing multiple nodes with the same text in v0.4.24, that is probably encouraging it to write longer answers?
Would top_k = 1 also be impacted? Was it getting 2 nodes instead of 1? Did default node sizes change since v0.4.24?
I'm new to python, etc... Is there a way to output some debug logging that shows what's happening during the query execution against the ComposableGraph?
Hard to say, I don't have a full understanding of the bug tbh
There's a way! Give me one sec, I'll pull up the example
from llama_index.logger import LlamaLogger
llama_logger = LlamaLogger()
service_context = ServiceContext.from_defaults(..., llama_logger=llama_logger)
....
response = index.query("my query")
print(llama_logger.get_logs()) # prints all logs, which basically includes all LLM inputs and responses
llama_logger.reset() # this clears the log stack
You can also turn on debug logs, but you won't get the same control over the logs as you do with the llama logger
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
Thanks, I probably won't dive deeper into this for a few days. And here I thought I was ready to move on to learning more about langchain to build a bot.
nah you are ready! Do both at the same time lol
One pro tip for when you get to langchain -> you can use llama index as a tool with langchain agents. Furthermore, you can have multiple indexes for different topics or use cases. It'll make sense in due time, trust
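Roughly what that looks like (the tool name, description, and question are made up for the example):
from langchain.agents import Tool, initialize_agent
from langchain.chat_models import ChatOpenAI

tools = [
    Tool(
        name="bagster_docs",
        # wrap the graph query so the agent can call it as a tool
        func=lambda q: str(graph.query(q, query_configs=query_configs)),
        description="Useful for answering questions about the Bagster documents.",
    ),
]
agent = initialize_agent(tools, ChatOpenAI(temperature=0), agent="zero-shot-react-description", verbose=True)
print(agent.run("Can I put a dead rat in the bagster?"))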