
Updated 4 months ago

Hello folks, I am new here, first of all

At a glance
Hello folks, I am new here. First of all, LlamaIndex is awesome. I have been facing a problem and I believe the answer lies in LlamaIndex (I went through the documentation), but I still need some help.

So suppose I have been given a CSV table of subject metadata covering different subject concepts. Example:

Biology
chem
physics

Now within biology there are also subcategories like

  1. physiology
  2. botany
The same goes for chemistry and physics.

Now when I go to query something, before doing the embedding-based similarity search, I wanted to know whether I can do a keyword-based search first (keyword index or tree, I don't know, I'm a bit confused). Take this scenario: I ask a question, and my LLM is first forced to return JSON metadata for the query, like this:

Plain Text
text: ...
subject: ...
subject_department: [....] # the query can belong to more than one department 


Now I will send it to my index so that my search space becomes:

Plain Text
biology
  - botany
  - genomics


Now I will do a vector search over all the document embeddings belonging to the nodes (botany) and (genomics), and return the answer in some format (with metadata). So how can I do that using llama-index? Any kind of pseudocode or pointers you can help me out with?
4 comments
you could make an index for each path of documents (i.e. biology->botany is one index) and then use a sub question query engine on top of them

If you end up having a ton of sub-indexes, you can use a retriever to fetch the top k most similar indexes before querying
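A rough, library-agnostic sketch of that "fetch the top k most similar indexes" step, assuming each sub-index is represented by one summary embedding. The toy 3-dimensional vectors, the `INDEX_SUMMARIES` dict, and the `top_k_indexes` helper are made up for illustration and are not LlamaIndex API; in practice the embeddings would come from an embedding model and the selected paths would map to real index objects.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# One summary embedding per sub-index path (toy values, assumed).
INDEX_SUMMARIES = {
    "biology->botany": [0.9, 0.1, 0.0],
    "biology->physiology": [0.7, 0.3, 0.1],
    "chemistry->physical chem": [0.1, 0.9, 0.2],
    "chemistry->nuclear chem": [0.0, 0.8, 0.6],
}

def top_k_indexes(query_embedding, summaries, k=2):
    """Return the k index paths whose summaries are closest to the query."""
    ranked = sorted(
        summaries.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [path for path, _ in ranked[:k]]

selected = top_k_indexes([1.0, 0.2, 0.0], INDEX_SUMMARIES, k=2)
print(selected)
```

Only the selected paths then need to be queried, which keeps the cost down when there are many sub-indexes.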

That flow isn't explained very well in the docs, but the key elements are there

Sub question query engine
https://gpt-index.readthedocs.io/en/stable/examples/query_engine/sub_question_query_engine.html

ObjectIndex
https://gpt-index.readthedocs.io/en/stable/core_modules/agent_modules/agents/usage_pattern.html#function-retrieval-agents
Uhm, sorry, but I didn't totally get that. Instead of the sub question query engine, can't I just go to the subject node first, then inside the (main) subject node (like biology) go to the department (botany), which becomes my leaf node, and use vector search there to retrieve results?

Not only that, there could be other possibilities.

Suppose the detected subject is chemistry and there are two leaf nodes where my answer could be,
so my search space now becomes

  1. physical chem
  2. nuclear chem
and now I use vector search to grab answers from both physical and nuclear (top-k separately) and provide results?
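The "top-k separately per leaf" idea could look roughly like the sketch below. It is plain Python with toy 2-dimensional embeddings, not LlamaIndex code: the `LEAF_STORES` layout, the `search_leaves` helper, and the sample document texts are hypothetical stand-ins for per-leaf vector stores.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical leaf stores: each leaf holds (text, embedding) pairs.
LEAF_STORES = {
    ("chemistry", "physical chem"): [
        ("Gibbs free energy predicts spontaneity.", [0.9, 0.1]),
        ("Reaction rates depend on temperature.", [0.6, 0.4]),
        ("Colligative properties of solutions.", [0.2, 0.8]),
    ],
    ("chemistry", "nuclear chem"): [
        ("Half-life describes radioactive decay.", [0.8, 0.3]),
        ("Fission splits heavy nuclei.", [0.1, 0.9]),
    ],
}

def search_leaves(query_embedding, leaves, stores, top_k=2):
    """Run a separate top-k search in each leaf and tag hits with metadata."""
    results = []
    for subject, department in leaves:
        docs = stores.get((subject, department), [])
        ranked = sorted(
            docs,
            key=lambda doc: cosine(query_embedding, doc[1]),
            reverse=True,
        )
        for text, embedding in ranked[:top_k]:
            results.append({
                "subject": subject,
                "department": department,
                "text": text,
                "score": cosine(query_embedding, embedding),
            })
    return results

leaves = [("chemistry", "physical chem"), ("chemistry", "nuclear chem")]
hits = search_leaves([1.0, 0.0], leaves, LEAF_STORES, top_k=1)
for hit in hits:
    print(hit["department"], "|", hit["text"])
```

Searching each leaf separately guarantees every selected department contributes results, whereas one global top-k over the merged documents could return hits from a single department only.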
Yea that works too 🤷‍♂️ anything possible tbh, just gotta code some of it haha
yeah seems like it, Thanks Logan