Ritually
Offline, last seen 3 months ago
Joined September 25, 2024
I'm creating a FastAPI app that uses a VectorStoreIndex with a HuggingFace LLM and embedding model. When I run the app on my local Mac M1 and send it a request, I get this error:
Plain Text
RuntimeError: MPS backend out of memory
Has anyone dealt with this before?
2 comments
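For context, a minimal sketch of the kind of setup that can trigger this on Apple Silicon, plus one common workaround: relaxing PyTorch's MPS memory cap via the PYTORCH_MPS_HIGH_WATERMARK_RATIO environment variable (the full error message itself suggests this). The model names, data path, and post-0.10 package layout below are assumptions for illustration, not details from the post.

Python
import os

# Workaround: relax (or, at 0.0, disable) PyTorch's MPS memory cap.
# Must be set before torch is imported; disabling the cap can destabilize
# the system, so prefer a smaller model or CPU placement where possible.
os.environ.setdefault("PYTORCH_MPS_HIGH_WATERMARK_RATIO", "0.0")

from fastapi import FastAPI
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM

# Example models: a small embedder fits on MPS easily; a 7B LLM often
# does not, so keep it on CPU (or swap in a smaller / quantized model).
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    device_map="cpu",
)

documents = SimpleDirectoryReader("./data").load_data()  # example data path
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

app = FastAPI()

@app.get("/query")
def query(q: str) -> dict:
    return {"answer": str(query_engine.query(q))}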
I just watched Jerry's talk at the AI User Conference. He recommended building our own versions of agents, layer by layer. Are there any guides to building an agent from scratch without using LlamaIndex, or is the only way to read through the LlamaIndex source code?
1 comment
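Not a full guide, but the layer-by-layer idea fits in a page. Below is a sketch of an agent loop with no LlamaIndex at all: an LLM that repeatedly chooses between calling a tool and answering. It assumes the OpenAI Python SDK's tool-calling API; the model name and the get_weather tool are toy placeholders.

Python
import json

from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    # Toy tool; a real agent would call an actual API here.
    return f"It is always sunny in {city}."

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=TOOLS
        )
        message = response.choices[0].message
        if not message.tool_calls:  # no tool requested: this is the answer
            return message.content
        messages.append(message)  # keep the tool request in the history
        for call in message.tool_calls:
            args = json.loads(call.function.arguments)
            result = get_weather(**args)  # a real agent would dispatch by name
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
    return "Stopped: step limit reached."

print(run_agent("What's the weather in Paris?"))

Everything an agent framework adds (memory, retries, planning, multi-tool routing) is another layer around this loop.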
Agent
I'm brainstorming what kind of agent to build so I can learn how to do it. What's the difference between LlamaIndex's OpenAIAgent and ReActAgent?
1 comment
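The core difference: OpenAIAgent delegates tool selection to the OpenAI function-calling API, so it is tied to OpenAI-style models but tends to be reliable and token-efficient, while ReActAgent drives tool selection through ReAct-style Thought/Action/Observation prompting, so it works with any LLM at the cost of extra prompt tokens. A sketch wiring the same tool into both (0.9-era imports; newer releases move these under llama_index.core):

Python
from llama_index.agent import OpenAIAgent, ReActAgent
from llama_index.llms import OpenAI
from llama_index.tools import FunctionTool

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)

# Tool choice happens inside OpenAI's function-calling API.
openai_agent = OpenAIAgent.from_tools([tool], llm=OpenAI(model="gpt-4"))

# Tool choice happens in the prompt via Thought/Action/Observation steps.
react_agent = ReActAgent.from_tools([tool], llm=OpenAI(model="gpt-4"))

print(openai_agent.chat("What is 7 times 6?"))
print(react_agent.chat("What is 7 times 6?"))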
I am trying to use a mix of Hugging Face APIs and LlamaIndex functions to build a Streamlit-hosted RAG app. I am having trouble understanding why there are extra processing steps in the get_query_embedding step (specifically, https://github.com/run-llama/llama_index/blob/main/llama_index/embeddings/huggingface.py#L139). What's the difference between using get_query_embedding and just getting the embeddings from a Hugging Face model?
4 comments
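Going by the linked code, the extra steps are: prepend a query instruction (instruction-tuned embedders such as BGE and Instructor expect one on queries but not on documents), then tokenize, run the model, pool the token states, and normalize. A raw Hugging Face forward pass gives you per-token hidden states, so you would have to do the pooling and normalizing yourself. A rough re-implementation under those assumptions (model name and instruction string are examples, and mean pooling is used for simplicity where the real code may use CLS pooling):

Python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL = "BAAI/bge-small-en-v1.5"  # example instruction-tuned embedder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)

# The step get_query_embedding adds: queries get an instruction prefix,
# documents do not.
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)  # mean-pool real tokens
    return F.normalize(pooled, dim=-1)

document_style = embed("what is a vector index?")
query_style = embed(QUERY_INSTRUCTION + "what is a vector index?")

The practical upshot: if the model was trained with a query instruction, skipping get_query_embedding and embedding the bare query degrades retrieval quality.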
Hi, just starting to use LlamaIndex - is there a version of the CitationQueryEngine that can cite the URL the text came from, not just the source text itself?
2 comments
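Not as a built-in option that I'm aware of, but CitationQueryEngine cites source nodes, and nodes inherit their source document's metadata, so storing the URL in metadata lets you map each citation number back to a URL. A sketch under that assumption (0.9-era imports; the document text and URL are examples):

Python
from llama_index import Document, VectorStoreIndex
from llama_index.query_engine import CitationQueryEngine

docs = [
    Document(
        text="LlamaIndex is a data framework for LLM applications.",
        metadata={"url": "https://docs.llamaindex.ai"},  # carried into each node
    ),
]

index = VectorStoreIndex.from_documents(docs)
query_engine = CitationQueryEngine.from_args(index, citation_chunk_size=512)

response = query_engine.query("What is LlamaIndex?")
print(response)  # answer text with [1]-style citations

# Map each cited chunk back to the URL stored in its metadata.
for i, node in enumerate(response.source_nodes, start=1):
    print(f"[{i}] {node.node.metadata.get('url')}")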