The community member is seeking advice on how to handle multiple sources of documentation and knowledge. They understand that with LangChain they can only query one vector database at a time, so they are considering consolidating all the data into a single Pinecone index using LlamaIndex's GPTPineconeIndex together with loaders from LlamaHub. They want to know if this approach is correct and whether the source metadata is set automatically or must be set manually.
In the comments, another community member suggests that the community member does not need to limit themselves to a single index. They can use LlamaIndex to create a composable index that wraps multiple indexes, or they can use custom tools for each index in LangChain. The community member acknowledges this but is concerned about the cost of making multiple API calls. They are working on a "simple" app that answers specific documentation questions, and they believe that updating a single Pinecone index with background processes and querying that single source of knowledge is a better approach for their use case.
There is no explicitly marked answer in the comments.
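The two multi-index options raised in the comments (a composable index wrapping several indexes, or one tool per index in LangChain) can be sketched in plain Python. The query functions below are placeholders for real `index.query()` calls, not library APIs; the names are illustrative only:

```python
# Sketch: routing one question across several per-source indexes.
# In LangChain each query function would be wrapped as a Tool; in
# LlamaIndex each index would become a node of a composable graph.

def query_api_docs(question: str) -> str:
    # Placeholder for querying a vector index built over the API docs.
    return f"[api-docs] answer to: {question}"

def query_wiki(question: str) -> str:
    # Placeholder for querying a vector index built over the wiki.
    return f"[wiki] answer to: {question}"

# One "tool" per knowledge source, keyed by a routing hint.
TOOLS = {
    "api": query_api_docs,
    "wiki": query_wiki,
}

def route(question: str, hint: str) -> str:
    """Dispatch the question to the index named by the hint.

    A real agent would pick the tool itself, which is the extra
    LLM API call (and cost) the original poster wants to avoid.
    """
    return TOOLS[hint](question)
```

This makes the trade-off concrete: per-index tools keep sources separate but add a routing step, which is what motivates the single-index approach below.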
Hi guys, can somebody share some knowledge with me? How do you handle multiple sources of documentation/knowledge? As far as I understand, with LangChain you can query exactly one vector database at a time, not several at once... so as far as I understand I need to stick everything into a single index (using Pinecone). Is my assumption correct that I can use GPTPineconeIndex with various loaders from LlamaHub and stuff all the data into a single index? Does it automatically include metadata for the source (or can I set that?). I could have "collectors" running in the background that would update the database at regular intervals...
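On the metadata question: source metadata is generally not inferred automatically; in LlamaIndex you attach it to each document yourself (via `extra_info` in older versions, `metadata` in newer ones). A minimal sketch of that tagging step, with dicts standing in for `Document` objects and the actual Pinecone upsert left as a comment (assumed names, not verified against any specific LlamaIndex version):

```python
# Sketch: tag each document with its source before stuffing everything
# into one shared index. The dict mirrors the shape of
# Document(text=..., extra_info={"source": ...}).

def tag_documents(texts: list[str], source: str) -> list[dict]:
    """Wrap raw loader output with a `source` field so answers can
    still cite where they came from once everything lives in a
    single Pinecone index."""
    return [{"text": t, "extra_info": {"source": source}} for t in texts]

docs = (
    tag_documents(["auth guide", "rate limits"], source="api-docs")
    + tag_documents(["setup page"], source="wiki")
)

# With real LlamaIndex + Pinecone (exact names differ across versions):
#   index = GPTPineconeIndex.from_documents(documents, pinecone_index=...)
```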
Thanks @Logan M. I know about the tools, and was thinking of doing it that way with agents and so on... but that means too many API calls == more cost. I'm working on a "simple" app that answers specific documentation questions; the documentation is just sprawled everywhere rather than concentrated... Using LlamaIndex's built-in in-memory vector DB isn't suitable, since it would need to go and collect all the data and store it in memory for every session a user starts, and depending on the amount and location of the data that would introduce lag... so in my case I think it's better to update one Pinecone index with background processes and have the UI query a single source of knowledge.
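The background-collector idea above can be sketched as a small refresh loop. `fetch` and `upsert` are injected callables (hypothetical names) so the loop works without a live Pinecone connection; in production `upsert` would call the Pinecone client and each `fetch` would run a documentation loader:

```python
# Sketch: a background "collector" that refreshes one shared index at
# regular intervals, so the UI only ever queries a single source.
import time
from typing import Callable

def refresh_once(sources: dict[str, Callable[[], list[str]]],
                 upsert: Callable[[str, list[str]], None]) -> int:
    """Pull every source and push its documents into the shared index.

    Returns the number of documents collected this cycle."""
    count = 0
    for name, fetch in sources.items():
        docs = fetch()
        upsert(name, docs)  # real code: embed + pinecone upsert
        count += len(docs)
    return count

def run_collector(sources: dict[str, Callable[[], list[str]]],
                  upsert: Callable[[str, list[str]], None],
                  interval_s: float, cycles: int) -> None:
    # A cron job or scheduler would normally drive this loop.
    for _ in range(cycles):
        refresh_once(sources, upsert)
        time.sleep(interval_s)
```

Keeping ingestion in a separate process like this is what lets every user session hit an already-populated index instead of re-collecting the data each time.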