
Updated 3 months ago

question for this esteemed community

question for this esteemed community: when we split the data from a CSV into chunks, we embed the chunks and do a vector search, but we can't see the full original text from before the split (unless we link it to a database that stores the full original texts).
what if we embed the metadata too? presumably, the metadata for chunks that come from the same original text should be nearly identical, so the similarity score between those metadata vectors will be close to 1. if you need to find all chunks from the same original text, and so show the full text rather than the excerpts, you just run another vector search using this one chunk's metadata as the query. what do you think?
7 comments
I'm not sure if I completely follow.
The metadata attached to your nodes/documents is already used when calculating embeddings 👀
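To illustrate what "metadata is used when calculating embeddings" can mean in practice, here is a rough sketch that assumes the pipeline serializes the metadata and prepends it to the chunk text before sending it to the embedding model. The function name `build_embedding_input` and the metadata keys are hypothetical, not any library's actual API.

```python
def build_embedding_input(chunk_text: str, metadata: dict) -> str:
    """Serialize metadata as 'key: value' lines and prepend it to the chunk.

    Hypothetical helper: the string it returns is what would be passed to
    the embedding model, so the metadata influences the resulting vector.
    """
    meta_block = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    return f"{meta_block}\n\n{chunk_text}" if meta_block else chunk_text


# Example values are made up for illustration.
text_to_embed = build_embedding_input(
    "Revenue grew 12% year over year.",
    {"source_id": "report-2023.csv#row-17", "title": "Annual report"},
)
print(text_to_embed)
```

If the pipeline works this way, chunks from the same source already share part of their embedded text, which is why a separate metadata embedding may be unnecessary.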
i thought only the chunk gets embedded, no?
hmm. my bad then, i did not know. that's not how i upload and use metadata on supabase now.
what's your take on how to pull the full text that the top chunk came from? via a relational database?
I would add some value to the metadata so that you can locate the original full document
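A minimal sketch of that suggestion, assuming each chunk carries a `source_id` in its metadata and the full texts live in some keyed store (a plain dict here; a relational table would play the same role). The names `source_id`, `chunk_index`, `full_text_for`, and `sibling_chunks` are made up for illustration, and the vector search itself is left out.

```python
def split_into_chunks(text: str, size: int) -> list[str]:
    """Naive fixed-width splitter, standing in for a real text splitter."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Stand-in for a table of original documents, keyed by an ID.
full_texts = {
    "row-1": "The first original document is long enough to be split into several chunks.",
    "row-2": "A second, shorter document.",
}

# Each chunk records which document it came from and its position in it.
chunks = []
for source_id, text in full_texts.items():
    for index, piece in enumerate(split_into_chunks(text, 40)):
        chunks.append(
            {"text": piece, "metadata": {"source_id": source_id, "chunk_index": index}}
        )


def full_text_for(chunk: dict) -> str:
    """Look up the original document by the ID stored in the chunk metadata."""
    return full_texts[chunk["metadata"]["source_id"]]


def sibling_chunks(chunk: dict) -> list[dict]:
    """Return all chunks of the same document, in order, via an exact match."""
    source_id = chunk["metadata"]["source_id"]
    matches = [c for c in chunks if c["metadata"]["source_id"] == source_id]
    return sorted(matches, key=lambda c: c["metadata"]["chunk_index"])


top_hit = chunks[1]  # pretend this chunk came back from the vector search
print(full_text_for(top_hit))
```

An exact match on the ID avoids a second vector search: it returns every chunk of the document and nothing else, whereas a similarity threshold on metadata vectors could also pull in chunks from other documents with similar metadata.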
got it, thanks!