yourbuddyconner
Joined September 25, 2024
What is the trade-off space between chunk size and LLM tokens?

I have been playing around with optimizing this, and there seems to be a floor on query performance as chunk size varies, with the position of that floor depending on document size. Increasing chunk size, however, also increases the number of LLM tokens sent for each query response.

I am thinking of making chunk size a function of document size and optimizing search queries based on that, but I would appreciate general thoughts to vet the concept.
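
A minimal sketch of the idea, not a tested recommendation: the function names, the square-root scaling, and every constant below are assumptions made purely for illustration. It derives a chunk size from the document's token count, clamps it to a range, and estimates the tokens a query would send.

```python
import math


def chunk_size_for(doc_tokens: int, min_size: int = 256, max_size: int = 2048) -> int:
    """Hypothetical heuristic: scale chunk size with the square root of document size."""
    if doc_tokens <= 0:
        raise ValueError("doc_tokens must be positive")
    # The factor 8 is an arbitrary assumption; tune it against real queries.
    raw = int(math.sqrt(doc_tokens) * 8)
    # Clamp so tiny documents still get usable chunks and huge ones stay bounded.
    return max(min_size, min(max_size, raw))


def prompt_tokens(chunk_size: int, top_k: int = 1, overhead: int = 200) -> int:
    """Rough upper bound on tokens sent per query: retrieved chunks plus prompt overhead."""
    return top_k * chunk_size + overhead


if __name__ == "__main__":
    for doc_tokens in (100, 10_000, 1_000_000):
        size = chunk_size_for(doc_tokens)
        print(doc_tokens, size, prompt_tokens(size, top_k=2))
```

The second function makes the trade-off explicit: the token cost per query grows linearly with both the chunk size and the number of chunks retrieved.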
14 comments
@jerryjliu0 have you thought at all before about caching queries?

I have a cool PoC for semantic query caching via Pinecone right now (it could use the vector index instead), and I feel like there might be a place in gpt_index to slot this in, as opposed to shipping an external library.
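
For readers unfamiliar with the idea, semantic query caching can be sketched roughly as follows. This is not the PoC mentioned above and uses neither Pinecone nor gpt_index: the `SemanticQueryCache` class, the `embed` callable, and the 0.9 threshold are all hypothetical, and the store is a plain in-memory list.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticQueryCache:
    """Hypothetical in-memory cache keyed by query embeddings."""

    def __init__(self, embed: Callable[[str], Sequence[float]], threshold: float = 0.9):
        self.embed = embed          # assumed: any function mapping text to a vector
        self.threshold = threshold  # assumed similarity cutoff for a cache hit
        self.entries: List[Tuple[List[float], str]] = []

    def put(self, query: str, answer: str) -> None:
        """Store an answer under the embedding of its query."""
        self.entries.append((list(self.embed(query)), answer))

    def get(self, query: str) -> Optional[str]:
        """Return the answer of the most similar cached query, or None on a miss."""
        q_vec = self.embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(q_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

On a miss the caller would run the normal LLM query and `put` the result; a real vector store would replace the linear scan in `get` with a nearest-neighbour lookup.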
12 comments
2 comments
Is the recommended approach for using Pinecone (or another vector store) to load the documents into the store with gpt_index as the interface? (i.e. fresh index, create documents, insert into GPTPineconeIndex)

How would one interface with gpt_index when there is a pre-existing vector index in Pinecone?

I am trying to decide whether the flexibility of interfacing with the store directly is better than using the gpt_index abstraction for non-full-document Q/A (e.g. storing previous queries as a Q/A cache so that LLM calls can be limited).
9 comments
Is there any prior art on how to properly summarize a SimpleVectorIndex? I am seeing that it picks out only subsets of my document, and mode="tree_summarize" doesn't seem to be supported on this index type.

I am also getting KeyError: <QueryMode.SUMMARIZE: 'summarize'>.

The goal is to summarize the index so that it can be placed in a TreeIndex for hierarchical organization, and then to use vector querying at query time for efficient retrieval.
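
Independently of any index type, a whole-document summary can be built bottom-up over all chunks. The sketch below shows that general technique only; it is not gpt_index's implementation, and `tree_summarize`, `summarize`, and `fanout` are hypothetical names, with `summarize` standing in for whatever LLM call produces a summary of a piece of text.

```python
from typing import Callable, List


def tree_summarize(chunks: List[str], summarize: Callable[[str], str], fanout: int = 4) -> str:
    """Repeatedly summarize groups of `fanout` texts until a single text remains."""
    if not chunks:
        raise ValueError("chunks must not be empty")
    if fanout < 2:
        raise ValueError("fanout must be at least 2 so each level shrinks")
    level = list(chunks)
    # Each pass replaces every group of up to `fanout` texts with one summary.
    # A single input chunk is returned as-is, without calling `summarize`.
    while len(level) > 1:
        level = [
            summarize("\n".join(level[i:i + fanout]))
            for i in range(0, len(level), fanout)
        ]
    return level[0]
```

Because every chunk feeds into the first level, the result covers the full document rather than only the chunks nearest to a query, and the final string could then serve as the summary text attached to that index in a parent tree.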
22 comments