Ritually
Offline, last seen 3 months ago
Joined September 25, 2024
I'm creating a FastAPI app that uses a VectorStoreIndex with a HuggingFace LLM and embedding model. When I run the app on my local Mac M1 and send it a request, I get this error:
Plain Text
RuntimeError: MPS backend out of memory
Has anyone dealt with this before?
2 comments
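For context, a minimal sketch of the kind of setup that can trigger this on Apple Silicon, plus one common workaround: relaxing PyTorch's MPS memory cap via the PYTORCH_MPS_HIGH_WATERMARK_RATIO environment variable (the full error message itself suggests this). The model names, data path, and post-0.10 package layout below are assumptions for illustration, not details from the post.

Python
import os

# Workaround: relax (or, at 0.0, disable) PyTorch's MPS memory cap.
# Must be set before torch is imported; disabling the cap can destabilize
# the system, so prefer a smaller model or CPU placement where possible.
os.environ.setdefault("PYTORCH_MPS_HIGH_WATERMARK_RATIO", "0.0")

from fastapi import FastAPI
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM

# Example models: a small embedder fits on MPS easily; a 7B LLM often
# does not, so keep it on CPU (or swap in a smaller / quantized model).
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    device_map="cpu",
)

documents = SimpleDirectoryReader("./data").load_data()  # example data path
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

app = FastAPI()

@app.get("/query")
def query(q: str) -> dict:
    return {"answer": str(query_engine.query(q))}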
I just watched Jerry's talk at the AI User Conference. He recommended building our own versions of agents, layer by layer. Are there any guides to building an agent from scratch without using LlamaIndex, or is the only way to read through the LlamaIndex source code?
1 comment
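Not a full guide, but the layer-by-layer idea fits in a page. Below is a sketch of an agent loop with no LlamaIndex at all: an LLM that repeatedly chooses between calling a tool and answering. It assumes the OpenAI Python SDK's tool-calling API; the model name and the get_weather tool are toy placeholders.

Python
import json

from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    # Toy tool; a real agent would call an actual API here.
    return f"It is always sunny in {city}."

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=TOOLS
        )
        message = response.choices[0].message
        if not message.tool_calls:  # no tool requested: this is the answer
            return message.content
        messages.append(message)  # keep the tool request in the history
        for call in message.tool_calls:
            args = json.loads(call.function.arguments)
            result = get_weather(**args)  # a real agent would dispatch by name
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
    return "Stopped: step limit reached."

print(run_agent("What's the weather in Paris?"))

Everything an agent framework adds (memory, retries, planning, multi-tool routing) is another layer around this loop.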
Agent
I'm brainstorming what kind of agent to build so I can learn how to do it. What's the difference between LlamaIndex's OpenAIAgent and ReActAgent?
1 comment
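The core difference: OpenAIAgent delegates tool selection to the OpenAI function-calling API, so it is tied to OpenAI-style models but tends to be reliable and token-efficient, while ReActAgent drives tool selection through ReAct-style Thought/Action/Observation prompting, so it works with any LLM at the cost of extra prompt tokens. A sketch wiring the same tool into both (0.9-era imports; newer releases move these under llama_index.core):

Python
from llama_index.agent import OpenAIAgent, ReActAgent
from llama_index.llms import OpenAI
from llama_index.tools import FunctionTool

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)

# Tool choice happens inside OpenAI's function-calling API.
openai_agent = OpenAIAgent.from_tools([tool], llm=OpenAI(model="gpt-4"))

# Tool choice happens in the prompt via Thought/Action/Observation steps.
react_agent = ReActAgent.from_tools([tool], llm=OpenAI(model="gpt-4"))

print(openai_agent.chat("What is 7 times 6?"))
print(react_agent.chat("What is 7 times 6?"))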
I am trying to use a mix of Hugging Face APIs and LlamaIndex functions to build a Streamlit-hosted RAG app. I am having trouble understanding why there are extra processing steps in the get_query_embedding step (specifically, https://github.com/run-llama/llama_index/blob/main/llama_index/embeddings/huggingface.py#L139). What's the difference between using get_query_embedding and just getting the embeddings from a Hugging Face model?
4 comments
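Going by the linked code, the extra steps are: prepend a query instruction (instruction-tuned embedders such as BGE and Instructor expect one on queries but not on documents), then tokenize, run the model, pool the token states, and normalize. A raw Hugging Face forward pass gives you per-token hidden states, so you would have to do the pooling and normalizing yourself. A rough re-implementation under those assumptions (model name and instruction string are examples, and mean pooling is used for simplicity where the real code may use CLS pooling):

Python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL = "BAAI/bge-small-en-v1.5"  # example instruction-tuned embedder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)

# The step get_query_embedding adds: queries get an instruction prefix,
# documents do not.
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)  # mean-pool real tokens
    return F.normalize(pooled, dim=-1)

document_style = embed("what is a vector index?")
query_style = embed(QUERY_INSTRUCTION + "what is a vector index?")

The practical upshot: if the model was trained with a query instruction, skipping get_query_embedding and embedding the bare query degrades retrieval quality.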
Hi, just starting to use LlamaIndex - is there a version of the CitationQueryEngine that can cite the URL the text came from, not just the source text itself?
2 comments
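Not as a built-in option that I'm aware of, but CitationQueryEngine cites source nodes, and nodes inherit their source document's metadata, so storing the URL in metadata lets you map each citation number back to a URL. A sketch under that assumption (0.9-era imports; the document text and URL are examples):

Python
from llama_index import Document, VectorStoreIndex
from llama_index.query_engine import CitationQueryEngine

docs = [
    Document(
        text="LlamaIndex is a data framework for LLM applications.",
        metadata={"url": "https://docs.llamaindex.ai"},  # carried into each node
    ),
]

index = VectorStoreIndex.from_documents(docs)
query_engine = CitationQueryEngine.from_args(index, citation_chunk_size=512)

response = query_engine.query("What is LlamaIndex?")
print(response)  # answer text with [1]-style citations

# Map each cited chunk back to the URL stored in its metadata.
for i, node in enumerate(response.source_nodes, start=1):
    print(f"[{i}] {node.node.metadata.get('url')}")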