Hi @Logan M,
Can we use a VectorStore in llama_index with Ollama (LLaMA 3 model), using Qdrant?
My task is to chat with my own document, which is uploaded to the Qdrant server, using Ollama (LLaMA 3 model).
Yes you can:
https://docs.llamaindex.ai/en/stable/examples/llm/ollama/?h=ollama

Once you have the LLM, create the VectorStoreIndex object as usual and either pass the LLM in directly or add it to Settings.

Plain Text
from llama_index.core import VectorStoreIndex, Settings
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.qdrant import QdrantVectorStore
import qdrant_client

# Ollama LLM instance (model name and timeout are examples)
llm = Ollama(model="llama3", request_timeout=120.0)
Settings.llm = llm
# create instance for qdrant (URL and collection name are examples)
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="my_documents")
# pass it to index
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
Thanks @WhiteFang_Jr,
but how can we insert the document file data into our vector DB?
Yes, once you have the Document objects you can simply insert them into the index.
Plain Text
for doc in docs:
  index.insert(doc)

https://docs.llamaindex.ai/en/stable/module_guides/indexing/document_management/?h=insertion#insertion
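If you are starting from files on disk, one way to get those Document objects is SimpleDirectoryReader (a minimal sketch; the folder path is a placeholder):

Plain Text
from llama_index.core import SimpleDirectoryReader

# read files from a local folder into Document objects (path is a placeholder)
docs = SimpleDirectoryReader("./my_docs").load_data()

# insert each document into the index created above; nodes are embedded and stored in Qdrant
for doc in docs:
    index.insert(doc)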
Hi @WhiteFang_Jr
When I execute this code, I encounter an issue with the following error:
ValueError:
**
Could not load OpenAI embedding model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

Consider using embed_model='local'.
Visit our documentation for more embedding options: https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings.html#modules
**
You'll have to define your embedding model. Either pass it down or set it with Settings at the top (I prefer this).
Plain Text
from llama_index.core import Settings
# After defining your embedding model
Settings.embed_model = embed_model # Your embed model instance here
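For example, a local embedding model avoids the OpenAI key entirely (a minimal sketch, assuming the llama-index-embeddings-huggingface package is installed; the model name is just an example):

Plain Text
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# a local embedding model, so no OpenAI key is required (model name is an example)
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.embed_model = embed_model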
Thanks @WhiteFang_Jr, the embeddings were created successfully!
@WhiteFang_Jr
Can you please help me with one more thing?
Now I want to run a query against my uploaded document (the embeddings). When it finds an exact match it should return a result, otherwise it should say nothing. How can I do this? I've tried multiple ways but none of them work.
Did you try modifying the prompt?
Or adding a SimilarityPostprocessor to your query_engine?
It will filter out nodes that score below the similarity threshold you set.
These options will help you limit when the LLM responds.
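A minimal sketch of the second option, reusing the index from above (the 0.75 cutoff is just an example value to tune):

Plain Text
from llama_index.core.postprocessor import SimilarityPostprocessor

# drop retrieved nodes whose similarity score is below the cutoff (example value)
query_engine = index.as_query_engine(
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.75)],
)
# if every node gets filtered out, the engine returns an empty response instead of answering
response = query_engine.query("your question here")
print(response)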
I used this, but it did not provide the correct answer.
And one more thing: why does it take so long to perform all the operations?
query_engine.query takes a lot of time to generate a response.
If you are using a local model, it will be quite slow
So llama3 is a local model?
If llama3 is a local model, which one should I use instead?
How are you running the llama3 model? If you have it running on your machine, then yes, it is running locally. If you are using a service like Replicate or Groq, then it's not.
Yes, so it's local.
I'm running it with ollama run llama3
Yeah that's why it is taking time
Can you please suggest a way to reduce the query response time?
You can try running this on Colab using Ollama. It will respond faster since you get around a 16 GB GPU there.
Do you have any idea how I can do this?
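Not an exact recipe, but roughly (a sketch; the port and model name below are Ollama's defaults): install Ollama inside the Colab runtime, start the server in the background, pull the model, and then point the Ollama LLM at the local server.

Plain Text
# in a Colab cell: install Ollama, start the server in the background, pull the model
!curl -fsSL https://ollama.com/install.sh | sh
!nohup ollama serve > ollama.log 2>&1 &
!ollama pull llama3

# then use it from llama_index as before, pointing at the local server
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
Settings.llm = Ollama(model="llama3", base_url="http://localhost:11434", request_timeout=120.0)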
Hi, @WhiteFang_Jr