
Updated 5 months ago

when you use Huggingface embeddings and

At a glance

The community members are discussing the use of Hugging Face embeddings and how they work with attention mechanisms. One community member notes that the model runs locally on the user's machine, not over the wire. Another community member asks if it makes sense to use a vector database for a single document, in the context of information extraction.

The discussion then turns to the pros and cons of using a LlamaIndex setup backed by a Chroma database to query a 70-page PDF, versus sending the PDF to the Claude API and leveraging its large context window. The community members mention factors like speed, runtime, citeability, and cost, and suggest that if none of those factors matter for the use case, it may be simpler to just send the PDF to Claude.

The community members also discuss the possibility of a new PDF being used each time, and whether that would affect the tradeoffs. Ultimately, they conclude that the specific tradeoffs would need to be measured and evaluated for the particular use case.

when you use Hugging Face embeddings and download them, how does that work with attention mechanisms? Is there any over-the-wire transaction when I vectorize a document if I use HF?
13 comments
no, the model is running locally, on your machine
does it ever make sense to use a vector db for a single document?
In the context of information extraction
what would be the pros and cons of, for example, setting up a LlamaIndex Chroma db to query against a 70-page .pdf vs. just sending that .pdf to the Claude API and leveraging its large context window?
speed, runtime, citeability, cost
if none of those are important enough for the use case, by all means send the pdf to claude
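To make the "citeability" point concrete, here is a minimal sketch of page-level retrieval that returns page numbers alongside scores. It is an illustration only: the bag-of-words vectors are a toy stand-in for a real embedding model, and the names `bag_of_words`, `cosine`, and `top_pages` are hypothetical helpers, not part of LlamaIndex or Chroma.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    # Toy stand-in for an embedding model (assumption): lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_pages(pages, question, k=2):
    # pages: list of page texts. Returns [(page_number, score), ...], best first.
    q = bag_of_words(question)
    scored = [(i + 1, cosine(bag_of_words(p), q)) for i, p in enumerate(pages)]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up page texts, for illustration only.
pages = [
    "Introduction and scope of the agreement.",
    "Payment terms: invoices are due within thirty days.",
    "Termination clause and notice period.",
]
hits = top_pages(pages, "When are invoices due?", k=1)
print(hits)  # each hit carries the page number it came from
```

Because every retrieved chunk keeps its page number, an answer can point back to where in the PDF it came from. With the whole document in one prompt, the model has to be asked to report that itself.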
consider that it would be a new .pdf each time
would that factor into it?
You'd have to measure the tradeoffs yourself, I think 😅
for sure, I appreciate the feedback. I'm just wondering if speed would actually improve if it's a new pdf each time, same with runtime, citeability and cost
I can't say for sure without trying it myself
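As a starting point for measuring, a back-of-envelope sketch of the input-token side of the cost question might look like this. Every number is an assumption to replace with your own measurements: tokens per page, chunk size, `top_k`, the number of questions per PDF, and the price (a placeholder, not a real Claude rate). The function names are hypothetical.

```python
def full_context_tokens(pages, tokens_per_page, questions):
    # Whole document is sent with every question (no prompt caching assumed).
    return pages * tokens_per_page * questions


def retrieval_tokens(chunk_tokens, top_k, questions):
    # Only the top_k retrieved chunks are sent with each question.
    return chunk_tokens * top_k * questions


def cost_usd(tokens, usd_per_million_tokens):
    # Convert an input-token count into dollars at a given per-million price.
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical inputs: a 70-page PDF, ~500 tokens per page, 5 questions per PDF.
full = full_context_tokens(pages=70, tokens_per_page=500, questions=5)
rag = retrieval_tokens(chunk_tokens=500, top_k=4, questions=5)

PRICE = 3.0  # placeholder USD per million input tokens, not a real quote
print("full context:", full, "tokens,", cost_usd(full, PRICE), "USD")
print("retrieval:   ", rag, "tokens,", cost_usd(rag, PRICE), "USD")
```

This only covers input tokens sent to the LLM. With a new PDF each time, the index also has to be rebuilt for every document, so the chunking and embedding time is paid per PDF. That part is not modeled here and would need to be timed on your own machine.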