Updated 2 months ago

when you use Hugging Face embeddings and download them, how does that work with attention mechanisms? Is there any over-the-wire transaction when I vectorize a document if I use HF?
13 comments
no, once the model is downloaded it runs locally, on your machine
does it ever make sense to use a vector db for a single document?
In the context of information extraction
what would be the pros and cons of, for example, setting up a LlamaIndex Chroma DB to query against a 70-page .pdf vs. just sending that .pdf to the Claude API and leveraging its large context window?
speed, runtime, citeability, cost
if none of those are important enough for the use case, by all means send the pdf to claude
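To make the cost part of that tradeoff concrete, here is a rough back-of-the-envelope sketch. Every number in it (tokens per page, price per million input tokens, chunk size, top-k) is a hypothetical placeholder rather than real pricing or a measurement, and the function names are made up for illustration. It only compares how many input tokens go to the LLM per question in each setup.

```python
# All numbers below are hypothetical placeholders, not real pricing or measurements.
PAGES = 70
TOKENS_PER_PAGE = 500          # assumed average for a text-heavy PDF
PRICE_PER_MILLION_INPUT = 3.0  # assumed USD price; check your provider's actual rate
CHUNK_TOKENS = 512             # assumed chunk size used when indexing
TOP_K = 4                      # assumed number of chunks retrieved per question


def input_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_INPUT) -> float:
    """USD cost of sending `tokens` input tokens to the LLM."""
    return tokens / 1_000_000 * price_per_million


def full_context_tokens(pages: int = PAGES, tokens_per_page: int = TOKENS_PER_PAGE) -> int:
    """Tokens sent per question when the whole PDF goes into the prompt."""
    return pages * tokens_per_page


def retrieval_tokens(top_k: int = TOP_K, chunk_tokens: int = CHUNK_TOKENS) -> int:
    """Tokens sent per question when only the top-k retrieved chunks are sent."""
    return top_k * chunk_tokens


if __name__ == "__main__":
    full = full_context_tokens()
    rag = retrieval_tokens()
    print(f"full context: {full} tokens, ${input_cost(full):.4f} per question")
    print(f"top-k chunks: {rag} tokens, ${input_cost(rag):.4f} per question")
```

This ignores output tokens, the question text itself, and the compute spent embedding the document, so treat it as a starting point for your own measurements.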
consider that it would be a new .pdf each time
would that factor into it?
You'd have to measure the tradeoffs yourself I think 😅
for sure, I appreciate the feedback. I'm just wondering if speed would actually improve if it's a new .pdf each time, and the same for runtime, citeability, and cost
I can't say for sure without trying it myself
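One way to frame the "new .pdf each time" question before measuring: with a fresh document, the indexing step (chunking, embedding, writing to the vector store) is paid again for every PDF, so it only pays off if enough questions are asked per document. The sketch below is a simple model under that assumption. The function names and the timings in the usage lines are hypothetical; you would plug in numbers from your own runs.

```python
import math
from typing import Optional


def rag_total_seconds(n_questions: int, index_seconds: float, rag_query_seconds: float) -> float:
    """Total time for one PDF: index it once, then answer each question via retrieval."""
    return index_seconds + n_questions * rag_query_seconds


def long_context_total_seconds(n_questions: int, long_query_seconds: float) -> float:
    """Total time for one PDF when every question resends the whole document."""
    return n_questions * long_query_seconds


def break_even_questions(
    index_seconds: float, rag_query_seconds: float, long_query_seconds: float
) -> Optional[int]:
    """Smallest number of questions per PDF at which indexing is strictly faster overall.

    Returns None if a retrieval query is not faster than a long-context query,
    because then indexing never pays for itself.
    """
    saved_per_question = long_query_seconds - rag_query_seconds
    if saved_per_question <= 0:
        return None
    return math.floor(index_seconds / saved_per_question) + 1


if __name__ == "__main__":
    # Hypothetical timings: 60 s to index, 5 s per retrieval query, 20 s per long-context query.
    print(break_even_questions(60, 5, 20))
    print(rag_total_seconds(5, 60, 5), long_context_total_seconds(5, 20))
```

The same shape of calculation works for cost if you swap seconds for dollars. If you only ask one or two questions per PDF, the model suggests the indexing overhead is unlikely to pay off, but that is something to confirm by measuring.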