In [1]: from llama_index import download_loader
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[1], line 1
----> 1 from llama_index import download_loader
ImportError: cannot import name 'download_loader' from 'llama_index' (unknown location)
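(For reference: in llama_index 0.10+ the library was split into per-integration pip packages, and download_loader moved into llama_index.core. A sketch of the newer import path, assuming a 0.10+ install:)

# llama_index >= 0.10 moved download_loader into the core package;
# individual readers are now installed as separate pip packages.
from llama_index.core import download_loader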
resp = llm.complete("What do you know about the Chord P2P protocol? Talk specifically about the algorithms known as P-Grid and M-Chord")
I get a detailed answer that is basically correct, well-written, and describes the algorithms in question in appropriate detail. But when I ask the query engine the same thing:

streaming_response = query_engine.query("What do you know about the Chord P2P ...)
I get a useless answer that mostly just tells me the title and authors of the Chord paper that I ingested into my VectorStoreIndex, and then says that "details of M-Chord and P-grid cannot be provided without further context or prior knowledge".
So my questions are: (a) why do I get a much better answer from the raw LLM than from the query_engine, and (b) why is it not pulling any information from the PDF in the index that extensively talks about both M-Chord and P-Grid?
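One way to debug (b) is to print what the retriever actually handed to the LLM. A minimal sketch, assuming the default retriever (whose similarity_top_k default of 2 can easily miss the relevant chunks):

# Inspect the chunks the query engine retrieved; raise similarity_top_k
# from its default of 2 if the relevant pages are being missed.
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What do you know about P-Grid and M-Chord?")
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.get_content()[:200])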
Given that I'm using Settings.llm = Ollama(model="mistral") for my LLM, is there a specific embedding model I need to use when I'm trying to make a VectorStoreIndex from the documents? I was using the HuggingFace model sentence-transformers/all-MiniLM-L6-v2 for Settings.embed_model ... does that make sense? Here's the full setup:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, ServiceContext
from llama_index.llms.ollama import Ollama

llm = Ollama(model="mistral", request_timeout=90.0)
Settings.llm = llm
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
resp = llm.complete("Suppose that you could prove from first principles that no group of odd order could compute the majority function. Why would this be a major result?")
index = VectorStoreIndex.from_documents(docs, show_progress=True, service_context=service_context)
query_engine = index.as_query_engine()
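One thing to be aware of: when from_documents is given a service_context, that (deprecated) object takes precedence, so embed_model="local" here may silently override the HuggingFace model set on Settings. A Settings-only sketch for 0.10+, with no ServiceContext in the picture:

# Settings-only equivalent (llama_index >= 0.10); the Settings.llm and
# Settings.embed_model assigned above are picked up automatically.
index = VectorStoreIndex.from_documents(docs, show_progress=True)
query_engine = index.as_query_engine()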
... I am getting an error about OpenAI keys (even though I'm using HuggingFace embeddings and Ollama):

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, VectorStoreIndex
from llama_index.llms.ollama import Ollama
llm = Ollama(model="mistral", request_timeout=30.0)
Settings.embed_model = HuggingFaceEmbedding(
model_name="BAAI/bge-small-en-v1.5"
)
index = VectorStoreIndex.from_documents(docs)
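That error usually means some component is still falling back to LlamaIndex's OpenAI defaults; notice that this snippet creates the Ollama instance but never assigns it to Settings.llm, so the query path would default to OpenAI. A sketch of a fully local setup, assuming llama_index >= 0.10:

# Fully local setup: set BOTH the LLM and the embedding model on
# Settings so nothing falls back to the OpenAI defaults.
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

Settings.llm = Ollama(model="mistral", request_timeout=30.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

index = VectorStoreIndex.from_documents(docs)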
from llama_index.core import SimpleDirectoryReader
reader = SimpleDirectoryReader(
input_dir="/home/ovo/code/datasets/ebooks/compsci/"
)
docs = reader.load_data()
print(f"Loaded {len(docs)} docs")
index.storage_context.persist(persist_dir=".")
The resulting JSON took up approximately 1.2GB - a ~400% increase in storage space compared to the original dataset... is this pretty typical, or is there perhaps something I'm doing wrong that is making it more bloated than it needs to be?
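For context on the size: the default persist writes everything as JSON, which means each embedding vector is serialized as text (a 384-dim float vector runs to several KB) and the full chunk text is stored again in docstore.json, so a 3-5x blowup over the raw corpus is not unusual. One common way to shrink it is to keep the embeddings in a binary vector store instead; a sketch using Chroma, assuming the llama-index-vector-stores-chroma package is installed:

import chromadb
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# Keep embeddings in Chroma's binary on-disk format instead of flat JSON.
db = chromadb.PersistentClient(path="./chroma_db")
collection = db.get_or_create_collection("compsci")
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)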