Find answers from the community

๐ŸŒฟ ^( O v O )^๐ŸŒฟ
After using pip to install llama_index, I get the following when trying to import download_loader:

In [1]: from llama_index import download_loader
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 from llama_index import download_loader

ImportError: cannot import name 'download_loader' from 'llama_index' (unknown location)
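This import usually breaks on llama-index 0.10+, where the library was split into llama-index-core plus per-integration pip packages and download_loader was deprecated; the "unknown location" part often points to leftovers from an older install clashing with the new namespace packages, so a clean virtualenv reinstall is commonly suggested. A minimal sketch of the replacement pattern, assuming the file-readers integration covers the loader you wanted (the PDF path is a placeholder):

    # A minimal sketch, assuming llama-index >= 0.10, where readers are
    # installed as separate pip packages instead of fetched via download_loader:
    #   pip install llama-index-readers-file
    from pathlib import Path
    from llama_index.readers.file import PDFReader

    reader = PDFReader()
    docs = reader.load_data(file=Path("some_paper.pdf"))  # placeholder path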
3 comments
For some reason, making LLM queries with llama-index VectorStoreIndex seems to be giving worse results than just plain LLM.

For instance, when I do resp = llm.complete("What do you know about the Chord P2P protocol? Talk specifically about the algorithms known as P-Grid and M-Chord") I get a detailed answer that is basically correct, well-written, and describes the algorithms in question in appropriate detail.

However, when I ask exactly the same question with exactly the same LLM (Mistral 7B) via streaming_response = query_engine.query("What do you know about the Chord P2P ..."), I get a useless answer that mostly just tells me the title and authors of the Chord paper I ingested into my VectorStoreIndex, and then says that "details of M-Chord and P-grid cannot be provided without further context or prior knowledge".

I don't understand (a) why it "loses" the prior knowledge the LLM clearly already has about these two algorithms when I ask the question through query_engine, and (b) why it is not pulling any information from the PDF in the index that discusses both M-Chord and P-Grid extensively.
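A common explanation is that the default query-engine prompt instructs the model to answer from the retrieved context only, which suppresses its parametric knowledge, and if retrieval returns poor chunks (e.g. just the title page) the answer degrades further. A minimal debugging sketch, assuming the index from the post is available as index; the names and the top_k value are illustrative:

    # 1) Inspect what retrieval actually returns for the failing question.
    retriever = index.as_retriever(similarity_top_k=5)
    nodes = retriever.retrieve("What do you know about P-Grid and M-Chord?")
    for n in nodes:
        print(n.score, n.node.get_content()[:200])

    # 2) If the right chunks are retrieved, widen top_k on the query engine too.
    #    The default response prompt asks the LLM to answer from the retrieved
    #    context, which is why its prior knowledge can appear to be "lost".
    query_engine = index.as_query_engine(similarity_top_k=5)
    print(query_engine.query("What do you know about P-Grid and M-Chord?"))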
3 comments
If I'm using Settings.llm = Ollama(model="mistral") for my LLM, is there a specific embedding model I need to use when making a VectorStoreIndex from the documents? I was using HuggingFace: Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2") ... does that make sense?
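The embedding model is independent of the generation LLM, so pairing Ollama's mistral with a sentence-transformers embedder is a reasonable setup; the main constraint is that the same embedding model is used when building and when querying the index. A minimal sketch of that pairing (the ./data path is a placeholder):

    from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.ollama import Ollama

    Settings.llm = Ollama(model="mistral", request_timeout=90.0)
    Settings.embed_model = HuggingFaceEmbedding(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )

    docs = SimpleDirectoryReader("./data").load_data()  # placeholder directory
    index = VectorStoreIndex.from_documents(docs)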
6 comments
When I run the following code, I get an OutOfMemory error. I can run Ollama without issue in the terminal, but this script is causing OOM ... what do I need to change here? Can someone also tell me whether I'm using the ServiceContext bit correctly? I'm not really sure I understand what it's supposed to be doing and was honestly just copy-pasting there:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, ServiceContext, VectorStoreIndex
from llama_index.llms.ollama import Ollama

llm = Ollama(model="mistral", request_timeout=90.0)
Settings.llm = llm
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

resp = llm.complete(
    "Suppose that you could prove from first principles that no group of odd "
    "order could compute the majority function. Why would this be a major result?"
)
index = VectorStoreIndex.from_documents(docs, show_progress=True, service_context=service_context)

Why is this causing CUDA to run out of memory when the same model runs fine in the terminal?
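One plausible cause: the snippet loads two embedding models (ServiceContext's embed_model="local" plus the HuggingFace one in Settings) on top of the model Ollama is already serving on the GPU. In recent llama-index versions ServiceContext is deprecated in favor of Settings, so a minimal sketch of a Settings-only version that keeps the embedder on the CPU (assuming HuggingFaceEmbedding accepts a device argument and that docs is already loaded):

    from llama_index.core import Settings, VectorStoreIndex
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.ollama import Ollama

    Settings.llm = Ollama(model="mistral", request_timeout=90.0)
    # Keep the embedder off the GPU so it does not compete with Ollama's model.
    Settings.embed_model = HuggingFaceEmbedding(
        model_name="sentence-transformers/all-MiniLM-L6-v2",
        device="cpu",
    )

    # No service_context needed; the global Settings are picked up automatically.
    index = VectorStoreIndex.from_documents(docs, show_progress=True)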
2 comments
OK, another question for the void: after I create a VectorStoreIndex and then try to run query_engine = index.as_query_engine(), I am getting an error about OpenAI keys (even though I'm using HuggingFace embeddings and Ollama):

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, ServiceContext, VectorStoreIndex
from llama_index.llms.ollama import Ollama

llm = Ollama(model="mistral", request_timeout=30.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
index = VectorStoreIndex.from_documents(docs)

How do I tell it to use Ollama as the LLM?
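The OpenAI-key error usually means the query engine fell back to the default OpenAI LLM because Settings.llm is never set in this snippet (only the embed_model is). A minimal sketch of the two usual ways to wire in Ollama, continuing from the code above:

    # Register the Ollama LLM so the query engine does not fall back to OpenAI.
    Settings.llm = llm                      # option 1: set it globally
    query_engine = index.as_query_engine()

    # option 2: pass it explicitly to just this query engine
    query_engine = index.as_query_engine(llm=llm)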
3 comments
I'm trying to run the following code, and it's only using 1 of 12 CPU cores while reading the PDF files from this directory. Is there a way to have SimpleDirectoryReader use multiprocessing or something to read in and parse multiple files at once?

from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(
    input_dir="/home/ovo/code/datasets/ebooks/compsci/"
)
docs = reader.load_data()
print(f"Loaded {len(docs)} docs")
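Recent llama-index releases expose a num_workers argument on SimpleDirectoryReader.load_data that fans file parsing out over multiple processes. A minimal sketch, assuming a version that supports it; the worker count is illustrative:

    from llama_index.core import SimpleDirectoryReader

    reader = SimpleDirectoryReader(
        input_dir="/home/ovo/code/datasets/ebooks/compsci/"
    )
    # Parse files in parallel across 8 worker processes.
    docs = reader.load_data(num_workers=8)
    print(f"Loaded {len(docs)} docs")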
2 comments
I am working on a project where I will ultimately be processing thousands of PDF files (full length ebooks) with the llama-index PDF reader, and using the contents of these PDF files to augment the capabilities of a Mistral 7B chatbot.

I just ran an initial test where I processed about 50 files (~300MB worth of data), and when I did index.storage_context.persist(persist_dir=".") the resulting JSON took up approximately 1.2GB, roughly 4x the size of the original dataset... is this pretty typical, or is there perhaps something I'm doing wrong that is making it more bloated than it needs to be?
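Some blow-up is expected with the default persistence: docstore.json keeps the full node text plus metadata, and vector_store.json stores every embedding as JSON-encoded floats, which is far less compact than the source PDFs. A small sketch for checking which file dominates in the persist dir (standard library only; "." matches the persist_dir above):

    from pathlib import Path

    # Print the size of each persisted JSON file to see where the 1.2GB goes.
    for f in sorted(Path(".").glob("*.json")):
        print(f"{f.name}: {f.stat().st_size / 1e6:.1f} MB")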
2 comments