I am trying to build "another" RAG

I am trying to build "another" RAG system, using tools, with data stored in Neo4j, Chroma, and a summary index. The issue I'm facing is that the response time is around 25 minutes. The query is a bit complex too, but is it because of my local system? I am using the OpenAI gpt-4-1106-preview model, running this locally on my MacBook with an M1 chip and 8GB RAM. Can anyone kindly guide me on what I am doing wrong here?
5 comments
25 minutes is wild. If you are using API-based LLMs like OpenAI, system specs don't really matter unless your index is huge.

Usually the biggest slowdown is the number of LLM calls you are making.
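One way to see how many calls a single query fires off is LlamaIndex's debug handler. A minimal sketch, assuming the 0.10+ `Settings` API and package layout (adapt the imports if you're on an older version):

```python
# Minimal sketch (assumed llama_index 0.10+ layout): trace how many LLM
# round trips one query makes, and how long they take in total.
from llama_index.core import Settings
from llama_index.core.callbacks import (
    CallbackManager,
    CBEventType,
    LlamaDebugHandler,
)

llama_debug = LlamaDebugHandler(print_trace_on_end=True)
Settings.callback_manager = CallbackManager([llama_debug])

# ... build your index/agent and run your query as usual ...

# Each event pair is one LLM round trip; dozens of them, each taking
# several seconds against gpt-4-1106-preview, add up fast.
llm_events = llama_debug.get_event_pairs(CBEventType.LLM)
print(f"LLM calls: {len(llm_events)}")
print(llama_debug.get_event_time_info(CBEventType.LLM))
```

If that prints a large number of calls, the fix is usually structural (fewer sub-questions, smaller top-k, a simpler agent loop) rather than hardware.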
Hi Liqvid. GPT-4 is known to be slow (it's so popular that it's almost always jammed up). Would your use case work with GPT-3.5? (Or better, use a locally-served LLM so that you won't have to pay a dime?)
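Swapping the model in to test that is a one-liner. A minimal sketch, assuming the `llama-index-llms-openai` integration package is installed:

```python
# Minimal sketch: point LlamaIndex at a faster, cheaper OpenAI model to
# compare latency. gpt-3.5-turbo's 16k context may still fit small tables,
# and it supports function calling too.
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
```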
On an M1 with 8GB there's not much you can run locally, maybe dolphin-phi.

You'd be better off using any of the hosted Mixtral APIs, I would think…
Thank you guys for the replies.

@Logan M , it's actually the opposite: my index is quite small. It consists of 3 tables from an xls file, each containing approx. 10 rows and 12 columns.

@Vicent W. , I can try using the smaller OpenAI models; I used this specific model due to its larger context window.

@mr.dronie , I specifically used OpenAI due to its function-calling ability, which is very much required for my use case.

I did try using Gemini Pro as well, but the results I'm getting are like two ends of a rainbow!
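For context, the function-calling pattern I rely on looks roughly like this. A simplified sketch, assuming the `llama-index-agent-openai` integration; `lookup_row` is a hypothetical stand-in for my real Chroma/Neo4j tools:

```python
# Simplified sketch: an OpenAI function-calling agent over one tool.
# lookup_row is a hypothetical stand-in for the real query tools.
from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def lookup_row(table: str, key: str) -> str:
    """Look up one row of a table (stand-in implementation)."""
    return f"row for {key!r} in {table!r}"

agent = OpenAIAgent.from_tools(
    [FunctionTool.from_defaults(fn=lookup_row)],
    llm=OpenAI(model="gpt-4-1106-preview"),
    verbose=True,  # prints every tool call, useful for spotting loops
)
print(agent.chat("Which row holds the key 'alpha' in table 'sales'?"))
```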
I feel like something must be fishy with your setup for it to take 25 minutes.