I just can't reliably get the right context for the LLM

I just can't reliably get the right context for the LLM. This is my current code:
Python
from langchain.embeddings import HuggingFaceBgeEmbeddings
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
    download_loader,
    set_global_service_context,
)

# llm is defined elsewhere in my setup
embed_model = HuggingFaceBgeEmbeddings(model_name="sentence-transformers/gtr-t5-xxl")
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)

UnstructuredReader = download_loader('UnstructuredReader')
dir_reader = SimpleDirectoryReader('./data', file_extractor={
    ".html": UnstructuredReader(),
})
documents = dir_reader.load_data()
index = VectorStoreIndex.from_documents(documents, show_progress=True)

Sometimes the context is good, sometimes it's completely off-topic.
My entire data set is HTML exported from a company Confluence.
I went through https://huggingface.co/spaces/mteb/leaderboard and tried most of the top multilingual models; some are better, some are worse, but they all fail to find relevant context in 70% of the cases.
What else can I try to improve accuracy?

edit:
just found a benchmark specifically for German, will test the top model there
https://github.com/ClimSocAna/tecb-de
but the question still stands, any ways to improve?
9 comments
You could check @ravitheja's blog on how to improve RAG app accuracy:
https://blog.llamaindex.ai/evaluating-the-ideal-chunk-size-for-a-rag-system-using-llamaindex-6207e5d3fec5


Also, I'm guessing your docs are in German. In that case, OpenAI embeddings or an open-source German embedding model would be a good place to start.
sadly I can't use any non-local service
but I will read the blog, thanks!
I know this feeling πŸ˜…
FaithfulnessEvaluator
RelevancyEvaluator
that exists? how does that work?
very interesting, thank you
I tried increasing the chunk size a little, and I also added this to the prompt:
"Given the context information and not prior knowledge, answer the query. Keep in mind the current date is 27.10.2023 and parts of the context might include historical information\n"
because the model was pulling up outdated info about previous events.
I definitely saw an improvement, but I will do the whole benchmark thing.