Find answers from the community

๐ŸŒฟ ^( O v O )^๐ŸŒฟ
After using pip to install llama_index, I get the following when trying to import download_loader:

In [1]: from llama_index import download_loader
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 from llama_index import download_loader

ImportError: cannot import name 'download_loader' from 'llama_index' (unknown location)
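This import usually breaks on llama-index 0.10+, where the library was split into llama-index-core plus per-integration pip packages and download_loader was deprecated; the "unknown location" part often points to leftovers from an older install clashing with the new namespace packages, so a clean virtualenv reinstall is commonly suggested. A minimal sketch of the replacement pattern, assuming the file-readers integration covers the loader you wanted (the PDF path is a placeholder):

    # A minimal sketch, assuming llama-index >= 0.10, where readers are
    # installed as separate pip packages instead of fetched via download_loader:
    #   pip install llama-index-readers-file
    from pathlib import Path
    from llama_index.readers.file import PDFReader

    reader = PDFReader()
    docs = reader.load_data(file=Path("some_paper.pdf"))  # placeholder path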
3 comments
For some reason, making LLM queries with llama-index VectorStoreIndex seems to be giving worse results than just plain LLM.

For instance, when I do resp = llm.complete("What do you know about the Chord P2P protocol? Talk specifically about the algorithms known as P-Grid and M-Chord") I get a detailed answer that is basically correct, well-written, and describes the algorithms in question in appropriate detail.

However, when I ask exactly the same question with exactly the same LLM (Mistral 7B) via streaming_response = query_engine.query("What do you know about the Chord P2P ..."), I get a useless answer that mostly just tells me the title and authors of the Chord paper I ingested into my VectorStoreIndex, and then says that "details of M-Chord and P-grid cannot be provided without further context or prior knowledge".

I don't understand (a) why it "loses" the prior knowledge the LLM clearly already has about these two algorithms when I ask the question through query_engine, and (b) why it is not pulling any information from the PDF in the index that discusses both M-Chord and P-Grid extensively.
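A common explanation is that the default query-engine prompt instructs the model to answer from the retrieved context only, which suppresses its parametric knowledge, and if retrieval returns poor chunks (e.g. just the title page) the answer degrades further. A minimal debugging sketch, assuming the index from the post is available as index; the names and the top_k value are illustrative:

    # 1) Inspect what retrieval actually returns for the failing question.
    retriever = index.as_retriever(similarity_top_k=5)
    nodes = retriever.retrieve("What do you know about P-Grid and M-Chord?")
    for n in nodes:
        print(n.score, n.node.get_content()[:200])

    # 2) If the right chunks are retrieved, widen top_k on the query engine too.
    #    The default response prompt asks the LLM to answer from the retrieved
    #    context, which is why its prior knowledge can appear to be "lost".
    query_engine = index.as_query_engine(similarity_top_k=5)
    print(query_engine.query("What do you know about P-Grid and M-Chord?"))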
3 comments
If I'm using Settings.llm = Ollama(model="mistral") for my LLM, is there a specific embedding model I need to use when making a VectorStoreIndex from the documents? I was using HuggingFace: Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2") ... does that make sense?
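The embedding model is independent of the generation LLM, so pairing Ollama's mistral with a sentence-transformers embedder is a reasonable setup; the main constraint is that the same embedding model is used when building and when querying the index. A minimal sketch of that pairing (the ./data path is a placeholder):

    from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.ollama import Ollama

    Settings.llm = Ollama(model="mistral", request_timeout=90.0)
    Settings.embed_model = HuggingFaceEmbedding(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )

    docs = SimpleDirectoryReader("./data").load_data()  # placeholder directory
    index = VectorStoreIndex.from_documents(docs)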
6 comments
When I run the following code, I get an OutOfMemory error. I can run Ollama without issue in the terminal, but this script is causing OOM ... what do I need to change here? Can someone also tell me whether I'm using the ServiceContext bit correctly? I'm not really sure I understand what it's supposed to be doing and was honestly just copy-pasting there:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, ServiceContext, VectorStoreIndex
from llama_index.llms.ollama import Ollama

llm = Ollama(model="mistral", request_timeout=90.0)
Settings.llm = llm
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

resp = llm.complete(
    "Suppose that you could prove from first principles that no group of odd "
    "order could compute the majority function. Why would this be a major result?"
)
index = VectorStoreIndex.from_documents(docs, show_progress=True, service_context=service_context)

Why is this causing CUDA to run out of memory when the same model runs fine in the terminal?
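One plausible cause: the snippet loads two embedding models (ServiceContext's embed_model="local" plus the HuggingFace one in Settings) on top of the model Ollama is already serving on the GPU. In recent llama-index versions ServiceContext is deprecated in favor of Settings, so a minimal sketch of a Settings-only version that keeps the embedder on the CPU (assuming HuggingFaceEmbedding accepts a device argument and that docs is already loaded):

    from llama_index.core import Settings, VectorStoreIndex
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.ollama import Ollama

    Settings.llm = Ollama(model="mistral", request_timeout=90.0)
    # Keep the embedder off the GPU so it does not compete with Ollama's model.
    Settings.embed_model = HuggingFaceEmbedding(
        model_name="sentence-transformers/all-MiniLM-L6-v2",
        device="cpu",
    )

    # No service_context needed; the global Settings are picked up automatically.
    index = VectorStoreIndex.from_documents(docs, show_progress=True)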
2 comments
OK, another question for the void: after I create a VectorStoreIndex and then try to run query_engine = index.as_query_engine(), I am getting an error about OpenAI keys (even though I'm using HuggingFace embeddings and Ollama):

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, ServiceContext, VectorStoreIndex
from llama_index.llms.ollama import Ollama

llm = Ollama(model="mistral", request_timeout=30.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
index = VectorStoreIndex.from_documents(docs)

How do I tell it to use Ollama as the LLM?
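The OpenAI-key error usually means the query engine fell back to the default OpenAI LLM because Settings.llm is never set in this snippet (only the embed_model is). A minimal sketch of the two usual ways to wire in Ollama, continuing from the code above:

    # Register the Ollama LLM so the query engine does not fall back to OpenAI.
    Settings.llm = llm                      # option 1: set it globally
    query_engine = index.as_query_engine()

    # option 2: pass it explicitly to just this query engine
    query_engine = index.as_query_engine(llm=llm)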
3 comments
I'm trying to run the following code, and it's only using 1 of 12 CPU cores while reading the PDF files from this directory. Is there a way to have SimpleDirectoryReader use multiprocessing or something to read in and parse multiple files at once?

from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(
    input_dir="/home/ovo/code/datasets/ebooks/compsci/"
)
docs = reader.load_data()
print(f"Loaded {len(docs)} docs")
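Recent llama-index releases expose a num_workers argument on SimpleDirectoryReader.load_data that fans file parsing out over multiple processes. A minimal sketch, assuming a version that supports it; the worker count is illustrative:

    from llama_index.core import SimpleDirectoryReader

    reader = SimpleDirectoryReader(
        input_dir="/home/ovo/code/datasets/ebooks/compsci/"
    )
    # Parse files in parallel across 8 worker processes.
    docs = reader.load_data(num_workers=8)
    print(f"Loaded {len(docs)} docs")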
2 comments
I am working on a project where I will ultimately be processing thousands of PDF files (full length ebooks) with the llama-index PDF reader, and using the contents of these PDF files to augment the capabilities of a Mistral 7B chatbot.

I just ran an initial test where I processed about 50 files (~300MB worth of data), and when I did index.storage_context.persist(persist_dir=".") the resulting JSON took up approximately 1.2GB, roughly 4x the size of the original dataset... is this pretty typical, or is there perhaps something I'm doing wrong that is making it more bloated than it needs to be?
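Some blow-up is expected with the default persistence: docstore.json keeps the full node text plus metadata, and vector_store.json stores every embedding as JSON-encoded floats, which is far less compact than the source PDFs. A small sketch for checking which file dominates in the persist dir (standard library only; "." matches the persist_dir above):

    from pathlib import Path

    # Print the size of each persisted JSON file to see where the 1.2GB goes.
    for f in sorted(Path(".").glob("*.json")):
        print(f"{f.name}: {f.stat().st_size / 1e6:.1f} MB")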
2 comments