Could anyone familiar with getting LlamaIndex working with llama.cpp on macOS/Apple Silicon please message me to help me with something? It has to do with getting the GPU to work.
(Edit) Unable to use the GPU on an M1 Max MacBook Pro. I recompiled llama-cpp-python with Metal support as per the docs, but no joy. Any help is appreciated.
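For reference, this is roughly what I have, a minimal sketch assuming the model path is stand-in and that the n_gpu_layers model kwarg is the right way to offload to Metal through LlamaIndex's LlamaCPP wrapper:

```python
# Rebuild llama-cpp-python with Metal first (shell, one line):
#   CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
from llama_index.llms import LlamaCPP  # import path may differ by llama_index version

llm = LlamaCPP(
    model_path="./models/llama-2-13b-chat.Q4_0.gguf",  # hypothetical local model path
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    # Without n_gpu_layers >= 1, the Metal backend is never used even if it was compiled in.
    model_kwargs={"n_gpu_layers": 1},
    verbose=True,  # look for "ggml_metal_init" in the startup log to confirm Metal is active
)
```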
Testing llama.cpp separately does use the GPU.
Hi, just a quick question: does indexing use only the CPU? Is there a way to accelerate it with the GPU? I'm on Apple Silicon and I'm only seeing CPU usage while indexing multiple PDFs, so I'm wondering whether I'm doing something wrong.
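For context, this is the kind of setup I'm imagining, a sketch assuming a local HuggingFace embedding model and that its device kwarg accepts "mps" (both of those are my assumptions):

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings import HuggingFaceEmbedding

# Indexing time is dominated by the embedding model, which runs on CPU by default.
# A local embedding model on PyTorch's "mps" backend should use the Apple GPU.
embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    device="mps",  # assumption: this wrapper forwards device to the underlying model
)

# Pass whatever LLM you already use alongside the embed model.
service_context = ServiceContext.from_defaults(embed_model=embed_model)

documents = SimpleDirectoryReader("./pdfs").load_data()  # hypothetical folder of PDFs
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
```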
Traceback (most recent call last):
  File "llamaIndex/starter5.py", line 3, in <module>
    from llama_index import download_loader
ImportError: cannot import name 'download_loader' from 'llama_index' (unknown location)
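If it helps, the "(unknown location)" part looks like the llama-index >= 0.10 namespace-package change, where the old top-level imports stopped working. This is my guess at the fix, assuming a 0.10+ install (the llama_index.core path and the loader name are assumptions on my part):

```python
# Assumption: on llama-index >= 0.10 the import moved under llama_index.core.
from llama_index.core import download_loader

PDFReader = download_loader("PDFReader")  # hypothetical loader name
```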
Hey everyone! So I've been trying out different text-analysis chats on both llama.cpp and text-generation-webui. In both cases, the 7B model gives me the correct response in the correct context with the correct information. Using the exact same prompt with the 13B model causes it to just ask my own question back to me, or produce a sentence like "working it on it... it'll be done soon", and that's the end of the generation. What am I doing wrong? Why is it hallucinating so much? The bigger model should be better at understanding context, right? Any help is appreciated. (Apologies for the double post.)
Hi everyone, I was wondering where I would add the parameter that would allow me to fetch more than 2 nodes or sources when querying my index. Thank you!
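Here's a sketch of what I'm guessing it should look like, assuming similarity_top_k on as_query_engine is the right knob (the folder and question are placeholders):

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()  # hypothetical folder
index = VectorStoreIndex.from_documents(documents)

# Guess: similarity_top_k controls how many nodes are retrieved per query (default seems to be 2).
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What does the document say about pricing?")  # hypothetical question

print(len(response.source_nodes))  # should now be up to 5
```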
Hi everyone. I was wondering how I might go about querying a PDF and then also displaying the page number where LlamaIndex found the information. I have the initial query part done by following the original LlamaIndex example, but I'm quite stumped on how to get a page number. I don't want to take the output and search the PDF again with Python. Please let me know where I should be looking and how to approach this. Thank you.
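For anyone in the same spot, this is the kind of thing I'm hoping works, a sketch assuming the PDF reader stores the page number under a "page_label" metadata key on the source nodes (that key name and the file path are assumptions):

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# SimpleDirectoryReader splits PDFs per page; I believe the page number ends up
# in node metadata under "page_label" -- that key name is my assumption.
documents = SimpleDirectoryReader(input_files=["./report.pdf"]).load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What were the main findings?")  # hypothetical question

print(response)
for source in response.source_nodes:
    page = source.node.metadata.get("page_label", "unknown")
    print(f"score={source.score:.3f}  page={page}")
```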