index.query(..., similarity_top_k=3, response_mode="compact")
with a higher top k, a smaller chunk size will help speed up responses (along with setting the response size)
service_context = ServiceContext.from_defaults(chunk_size_limit=512, embed_model=embeddings)
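If you also want to cap the response size explicitly, one way is to configure the LLM predictor that the ServiceContext uses. This is a minimal sketch assuming the legacy llama_index ServiceContext/LLMPredictor API and a LangChain OpenAI LLM; the model name, max_tokens=256, and the OpenAIEmbeddings stand-in for `embeddings` are illustrative, not the original settings:

from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from llama_index import LangchainEmbedding, LLMPredictor, ServiceContext

# Assumed stand-in for the `embeddings` object used elsewhere in this thread.
embeddings = LangchainEmbedding(OpenAIEmbeddings())

# Capping max_tokens bounds the completion length; the value and model name are placeholders.
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=256))

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embeddings,
    chunk_size_limit=512,
)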
github_client = GithubClient(os.getenv("GITHUB_TOKEN"))
loader = GithubRepositoryReader(github_client, **kwargs)
index = GPTChromaIndex.from_documents(docs_content, service_context=service_context, chroma_collection=chroma_collection)
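Putting the pieces together, here is a rough end-to-end sketch. It assumes the llama_hub GithubRepositoryReader (pulled in via download_loader) with owner/repo/branch arguments, an in-memory chromadb client, and the service_context configured above; the repo, collection name, and query string are placeholders:

import os

import chromadb
from llama_index import GPTChromaIndex, download_loader

download_loader("GithubRepositoryReader")  # fetches the loader from llama_hub
from llama_hub.github_repo import GithubClient, GithubRepositoryReader

github_client = GithubClient(os.getenv("GITHUB_TOKEN"))
loader = GithubRepositoryReader(github_client, owner="jerryjliu", repo="llama_index")  # placeholder repo
docs_content = loader.load_data(branch="main")

# In-memory Chroma collection backing the vector index.
chroma_client = chromadb.Client()
chroma_collection = chroma_client.create_collection("github-docs")

# `service_context` is the one configured above (chunk_size_limit=512, etc.).
index = GPTChromaIndex.from_documents(
    docs_content,
    service_context=service_context,
    chroma_collection=chroma_collection,
)

response = index.query("How is the loader configured?", similarity_top_k=3, response_mode="compact")
print(response)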
while total_tokens < token_max: keep appending nodes from the index, which has been sorted by relevance (vector similarity)
response_mode="compact"
in the query should do that. It fills each request with the maximum number of tokens (from the pool of text available after fetching the top k)
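Roughly what that packing looks like, as an illustrative sketch (this is not the library's internal code; `nodes`, `token_max`, and `count_tokens` are stand-in names):

def build_compact_context(nodes, token_max, count_tokens):
    # `nodes` = top-k chunks already sorted by vector similarity (most relevant first);
    # `count_tokens` = any tokenizer-based length function.
    packed, total_tokens = [], 0
    for node in nodes:
        node_tokens = count_tokens(node.text)
        if total_tokens + node_tokens > token_max:
            break  # this request is full; remaining text would go into a follow-up request
        packed.append(node.text)
        total_tokens += node_tokens
    return "\n\n".join(packed)

If the retrieved text doesn't all fit in one request, compact mode falls back to multiple LLM calls and refines the answer across them.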