Does anyone have good reading material on using LlamaIndex with llama.cpp (i.e. a local LLM) on low-spec devices for RAG? I'm looking for ways to optimize my requests as much as possible before sending them to the LLM, since in my case the LLM is the bottleneck. Mostly looking into RAG features.
Yeah, I'm experimenting with similarity search: reducing the top-k to find the most-referenced file, then doing a retrieve from the vector store on only that file (haven't got the last part working yet). I'll post my findings here, though; I think there are many like me out there.
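The two-stage idea can be sketched in plain Python with the retrieval results mocked out. With LlamaIndex, the tuples below would come from `retriever.retrieve(query)` and the file name from each node's metadata (the exact metadata key varies by loader, so that part is an assumption); the final list comprehension stands in for the second, file-restricted retrieval:

```python
from collections import Counter

# Mock top-k retrieval results: (chunk_text, source_file, similarity_score).
# In a real pipeline these would come from the vector store's similarity search.
results = [
    ("chunk a", "manual.pdf", 0.82),
    ("chunk b", "notes.txt", 0.79),
    ("chunk c", "manual.pdf", 0.77),
    ("chunk d", "manual.pdf", 0.74),
    ("chunk e", "faq.md", 0.71),
]

# Stage 1: vote for the file referenced most often among the top-k hits.
top_file = Counter(src for _, src, _ in results).most_common(1)[0][0]

# Stage 2: keep only chunks from that file. In practice this would be a
# second retrieve over the vector store filtered by source file, so the
# local LLM only ever sees context from the winning document.
filtered = [(text, score) for text, src, score in results if src == top_file]

print(top_file)       # manual.pdf
print(len(filtered))  # 3
```

The win on low-spec hardware is that stage 2 shrinks the context handed to the LLM to a single coherent source, which is exactly the "optimize before sending to the LLM" goal.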