Hi all, I'm using vector_store = PGVectorStore.from_params(). When doing Retrieval-Augmented Generation (RAG) for Q&A, which is better: sending the full document(s) to the LLM, or running a similarity search with similarity_top_k? How should this be implemented, and what's the most effective approach? Thank you
In case I need to query questions over multiple documents (a different index per document), would a TreeIndex be better, building indices on top of the per-document indices? Would it be a good approach to compose a graph of indices and query the graph? Which would do a better job: an agent, a sub-question query engine, or a router query engine?
If I already have an index created with VectorStoreIndex.from_vector_store, how can I use a SummaryIndex? Do I need to create a new index, and how do I save that new index in the database? Can a VectorStoreIndex be used both with similarity_top_k and as a SummaryIndex?
I think you need to decide whether you want all nodes or only the relevant ones
You can use a router query engine to switch between a summary index and a vector index as needed. Typically you only want to use the summary index for queries that require reading the entire index
You can store the summary index in MongoDB, Redis, S3, a Google Cloud bucket, etc.
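To make the routing idea concrete, here's a minimal plain-Python sketch of the pattern. In LlamaIndex you'd wrap each index's query engine in a QueryEngineTool and hand them to a RouterQueryEngine, which uses an LLM selector instead of the keyword heuristic below; the helper names and heuristic here are purely illustrative, not the library's API.

```python
# Hypothetical sketch of router-style dispatch between a
# "read everything" engine and a "top-k retrieval" engine.

def summary_engine(chunks, question):
    # Summary-index style: build context from *all* nodes.
    context = " | ".join(chunks)
    return f"answer from all {len(chunks)} chunks: {question}"

def vector_engine(chunks, question, top_k=2):
    # Vector-index style: keep only the top_k most relevant nodes.
    # Real retrieval uses embedding similarity; word overlap is a stand-in.
    words = question.lower().split()
    scored = sorted(chunks, key=lambda c: -sum(w in c for w in words))
    context = " | ".join(scored[:top_k])
    return f"answer from top {top_k} chunks: {question}"

def router(chunks, question):
    # Crude selector: send "summarize everything" questions to the
    # summary engine, everything else to the vector engine.
    if any(w in question.lower() for w in ("summarize", "summary", "overall")):
        return summary_engine(chunks, question)
    return vector_engine(chunks, question)

chunks = ["pgvector stores embeddings", "rag retrieves context", "llms generate answers"]
print(router(chunks, "Summarize the document"))
print(router(chunks, "How are answers generated?"))
```

Note that both engines read the same underlying nodes, which matches the earlier point: you don't recreate the data, you just choose whether a query should see all of it or only the relevant slice.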
To reduce the time, is there any way to distribute the load with parallel processing, e.g. sending each chunk as a separate call to the LLM and combining the answers?
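That's essentially a map-reduce pattern: fan each chunk out as its own LLM call in parallel, then run one final call to combine the partial answers. A hedged sketch with a stubbed llm_call (the stub and prompt wording are placeholders; swap in a real client, or note that LlamaIndex's response synthesizers can do this combining for you):

```python
from concurrent.futures import ThreadPoolExecutor

def llm_call(prompt):
    # Stub standing in for a real LLM request (OpenAI, Bedrock, ...).
    # LLM calls are I/O-bound, so threads give real concurrency here.
    return f"partial answer for: {prompt[:30]}"

def map_reduce_query(chunks, question, max_workers=4):
    prompts = [f"Context: {c}\nQuestion: {question}" for c in chunks]
    # Map step: one LLM call per chunk, issued in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(llm_call, prompts))
    # Reduce step: a final call that merges the partial answers.
    combined = "\n".join(partials)
    return llm_call(f"Combine these answers:\n{combined}\nQuestion: {question}")

chunks = ["chunk one text", "chunk two text", "chunk three text"]
print(map_reduce_query(chunks, "What does the document say?"))
```

The tradeoff: latency drops roughly to the time of the slowest chunk call plus the reduce call, but you pay for more total tokens and may hit provider rate limits, so cap max_workers accordingly.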
Bedrock is in limited preview and not yet GA; once it is, there will be more interest in using it. From my point of view it would be good to have native support implemented, so we aren't stuck with limitations.