Sounds like a general approach could be:
- generate a summary of each document and store the summaries in an index
- in some pipeline, create a Chroma document for each post, and build a separate index just for posts
- build a parent index on top of these two indexes
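
To make the three steps concrete, here is a rough sketch in plain Python. It does not use the Chroma or LlamaIndex APIs: `SimpleIndex`, `summarize`, and the sample data are all hypothetical stand-ins, and the keyword matching is only a placeholder for real embedding search.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SimpleIndex:
    """Toy in-memory stand-in for a vector index (hypothetical, not a library class)."""
    name: str
    entries: Dict[str, str] = field(default_factory=dict)

    def add(self, item_id: str, text: str) -> None:
        self.entries[item_id] = text

    def search(self, query: str) -> List[str]:
        # Naive keyword match; a real index would compare embeddings.
        terms = query.lower().split()
        return [item_id for item_id, text in self.entries.items()
                if any(term in text.lower() for term in terms)]


def summarize(text: str, max_words: int = 8) -> str:
    # Placeholder for an LLM-generated summary: keep the first few words.
    return " ".join(text.split()[:max_words])


# Made-up sample data, only for illustration.
documents = {"doc1": "Chroma is a vector store used to persist embeddings for retrieval."}
posts = {"post1": "How do I route a query to the right index with an agent?"}

# Step 1: one index holding a summary per document.
summary_index = SimpleIndex("summaries")
for doc_id, text in documents.items():
    summary_index.add(doc_id, summarize(text))

# Step 2: a separate index holding one entry per post.
post_index = SimpleIndex("posts")
for post_id, text in posts.items():
    post_index.add(post_id, text)

# Step 3: a parent index that maps each child index name to the child index.
parent_index = {index.name: index for index in (summary_index, post_index)}
```

In a real setup the two child indexes would presumably be Chroma-backed, and the parent would be whatever composition mechanism your framework provides.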
Then wrap each index as a tool, describe it with the QueryToolMetadata class, and let an agent decide which tool (and therefore which index) fits a given query. Some prompt engineering here could help refine the results returned by each one.
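
The routing idea might look roughly like this. It is a library-free sketch: `ToolInfo` and `pick_tool` are hypothetical names, and the word-overlap scoring only stands in for the choice an LLM agent would make from the tool descriptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToolInfo:
    """Hypothetical stand-in for per-tool metadata: a name plus a description."""
    name: str
    description: str
    run: Callable[[str], str]


def pick_tool(query: str, tools: List[ToolInfo]) -> ToolInfo:
    # Stand-in for the agent: score each tool by how many query words
    # also appear in its description, and return the best match.
    query_words = set(query.lower().split())

    def overlap(tool: ToolInfo) -> int:
        return len(query_words & set(tool.description.lower().split()))

    return max(tools, key=overlap)


# The descriptions are where the prompt engineering happens (made-up examples).
tools = [
    ToolInfo("summaries", "high-level summary of a whole document",
             lambda q: f"[summary index] {q}"),
    ToolInfo("posts", "specific details from individual posts",
             lambda q: f"[post index] {q}"),
]

chosen = pick_tool("give me a summary of the document", tools)
answer = chosen.run("give me a summary of the document")
```

The point is that the description text is the only signal the selector sees, so tightening those descriptions is the main lever for getting queries sent to the right index.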