Updated 4 months ago

Hi, can anyone point me to an example for a local LLM chat bot with the following steps:
  1. Retrieve documents from a Qdrant store (done)
  2. Rerank the retrieved results with a cross encoder (I saw a Hugging Face example, but I'm not sure how to apply it to the retrieved results)
  3. Create an LLM with chat history and context, with a custom prompt that uses both to answer
  4. Put all of the above into a continuous chat bot experience with Ollama
2 comments
Plain Text
pip install llama-index
pip install llama-index-llms-ollama
pip install llama-index-embeddings-??
pip install llama-index-vector-stores-qdrant

# could use ColBERT to rerank
pip install llama-index-postprocessor-colbert-rerank

# or sentence-transformers
pip install llama-index-postprocessor-sbert-rerank
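If it helps to see what "applying a reranker to retrieved results" actually means, here is a minimal, library-free sketch of the core logic a cross-encoder reranker such as `SentenceTransformerRerank` performs: score every (query, document) pair, sort by score, keep the top few. The `rerank` function and `score_fn` parameter are illustrative names of mine, not LlamaIndex API; in practice the scoring would come from a real cross encoder (e.g. `sentence_transformers.CrossEncoder("BAAI/bge-reranker-base").predict(pairs)`).

```python
# Illustrative sketch only: the core logic a cross-encoder reranker applies
# to retrieved results. score_fn stands in for a real cross encoder that
# scores (query, document) pairs jointly.
def rerank(query, documents, score_fn, top_n=2):
    """Score every (query, document) pair and keep the top_n highest."""
    scored = [(score_fn(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]
```

In LlamaIndex you don't call this yourself; passing the reranker as a `node_postprocessor` makes the chat engine run this step on the retriever's output automatically.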


Plain Text
llm = ...
embed_model = ...

from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.qdrant import QdrantVectorStore

# documents: your already-loaded Document objects
storage_context = StorageContext.from_defaults(vector_store=QdrantVectorStore(...))
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, embed_model=embed_model)

from llama_index.core.chat_engine import CondensePlusContextChatEngine
from llama_index.postprocessor.sbert_rerank import SentenceTransformerRerank

# retrieve more candidates than you need, then let the reranker cut them down
chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(similarity_top_k=8),
    node_postprocessors=[SentenceTransformerRerank(model="BAAI/bge-reranker-base", top_n=2)],
    llm=llm,
)

chat_engine.chat("Hello")
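For the continuous chat experience (step 4), note that LlamaIndex chat engines keep the chat history internally, so looping over `.chat()` is enough; with Ollama you would typically construct the `llm` above via the `llama_index.llms.ollama` integration (e.g. `Ollama(model=...)`). A minimal sketch of such a loop, where `chat_loop`, `get_input`, and `show` are hypothetical names of mine rather than LlamaIndex API:

```python
# Hypothetical helper (not part of LlamaIndex): a minimal REPL around any
# chat engine exposing .chat(message). The engine tracks chat history
# internally, so looping is all that's needed for a continuous conversation.
def chat_loop(chat_engine, get_input=input, show=print):
    while True:
        message = get_input("You: ")
        if message.strip().lower() in {"exit", "quit"}:
            break
        show(f"Bot: {chat_engine.chat(message)}")
```

Injecting `get_input`/`show` instead of hard-coding `input`/`print` keeps the loop testable without a running model.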