Updated 3 months ago

I have two html files

I have two HTML files: one is "Christmas Party 2023" and the other is "Christmas Party 2019".
My prompt template includes this string:
"Given the context information and not prior knowledge, " "answer the query. The current date is 30.10.2023, please select only relevant data when creating the answer.\n" "Query: {query_str}\n" "Answer: "
With this, the LLM correctly identifies that one of the files is outdated and uses only the date given in the 2023 version.
However, it mixes the activities from 2019 with those from 2023.
The HTML files look like this:
Plain Text
<h1 id="title-text" class="with-breadcrumbs">
                                                <a href="/display/IN/Weihnachtsfeier+2023">Weihnachtsfeier 2023</a>
                                    </h1>
Erstellt von Peter Lustig, zuletzt geändert von Peterson Findus am Okt 30, 2023
<p><strong>Die Weihnachtsfeier findet am Freitag, den 15.12.2023 statt.
<description of activities here>

The context the LLM receives looks like this:
Plain Text
Weihnachtsfeier 2023 Erstellt von Peter Lustig, zuletzt geändert von Peterson Findus am Okt 30, 2023

Die Weihnachtsfeier findet am Freitag, den 15.12.2023 statt.
<description of activities here>

The only problem I can see (from the logs) is that the LLM receives both contexts one right after the other, without any separator.

How could I tackle this challenge?
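
One possible way to address the missing separator is to join the retrieved chunks yourself, giving each one a visible header before it goes into the prompt. The following is only a minimal sketch in plain Python: `Chunk`, `build_context`, and the `{context_str}` placeholder are hypothetical names for illustration, not LlamaIndex APIs, and the second chunk's text is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical container for one retrieved piece of context."""
    title: str  # page heading, e.g. "Weihnachtsfeier 2023"
    text: str


def build_context(chunks):
    """Join chunks so each one starts with a clearly visible header line."""
    blocks = []
    for i, chunk in enumerate(chunks, start=1):
        header = f"--- Document {i}: {chunk.title} ---"
        blocks.append(f"{header}\n{chunk.text.strip()}")
    # A blank line between blocks keeps the documents visually apart.
    return "\n\n".join(blocks)


# Assumed prompt layout; the wording after the context is taken from the post.
PROMPT = (
    "Context information is below. Each document starts with a '---' header.\n"
    "{context_str}\n\n"
    "Given the context information and not prior knowledge, "
    "answer the query. The current date is 30.10.2023, please select only "
    "relevant data when creating the answer.\n"
    "Query: {query_str}\n"
    "Answer: "
)

chunks = [
    Chunk("Weihnachtsfeier 2023",
          "Die Weihnachtsfeier findet am Freitag, den 15.12.2023 statt."),
    Chunk("Weihnachtsfeier 2019", "<description of activities here>"),
]
context = build_context(chunks)
prompt = PROMPT.format(context_str=context, query_str="When is the party?")
print(prompt)
```

With explicit headers, the model at least has a chance to attribute each activity to the right year. Whether that is enough on its own depends on the model.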
6 comments
What's your similarity top k? Maybe you could also include the dates as metadata so they're always persistent in the nodes?
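
As a rough illustration of the metadata idea, the dates could be pulled out of the page title and body before indexing. This is a sketch in plain Python under stated assumptions: `extract_metadata` and the keys `event_year` and `event_date` are hypothetical names, and the patterns only cover the `DD.MM.YYYY` format shown in the post. The resulting dict would then have to be attached to each document in whatever way your loader supports (as far as I know, `SimpleDirectoryReader` takes a `file_metadata` callable, but that callable receives a file path, so the file would need to be read inside it).

```python
import re
from datetime import date

# German-style date as in the sample page, e.g. "15.12.2023".
EVENT_DATE_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")
# A four-digit year in the heading, e.g. "Weihnachtsfeier 2023".
TITLE_YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_metadata(title, text):
    """Build a metadata dict with the event year and the first date in the body."""
    meta = {"title": title}
    year_match = TITLE_YEAR_RE.search(title)
    if year_match:
        meta["event_year"] = int(year_match.group(0))
    date_match = EVENT_DATE_RE.search(text)
    if date_match:
        day, month, year = map(int, date_match.groups())
        try:
            meta["event_date"] = date(year, month, day).isoformat()
        except ValueError:
            pass  # not a real calendar date; leave the key out
    return meta


meta = extract_metadata(
    "Weihnachtsfeier 2023",
    "Die Weihnachtsfeier findet am Freitag, den 15.12.2023 statt.",
)
print(meta)
```

Storing the date in ISO format keeps it sortable as a plain string, which makes later filtering simpler.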
I didn't touch top k.
The metadata is added when adding the documents, right? I should probably stop using SimpleDirectoryReader then.
@CHY4E In which log do you see that the LLM receives both contexts?
I have tried this. There is now metadata with "heading", "author" and "lastEditedAt", but the LLM still fails to understand that one of the contexts is outdated.
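
If the model keeps blending the two pages even with metadata present, one alternative is to not rely on the LLM for this decision and instead drop outdated chunks before the prompt is built. The sketch below is plain Python and makes several assumptions: each retrieved chunk is represented as a dict with a `metadata` entry holding an ISO `event_date` (a hypothetical field, not one of the fields listed above), `select_current` is a made-up helper, and the 2019 date in the sample is invented purely for the example.

```python
from datetime import date


def select_current(nodes, today):
    """Keep upcoming events; if none are upcoming, keep only the latest one."""
    dated = [n for n in nodes if "event_date" in n["metadata"]]
    upcoming = [
        n for n in dated
        if date.fromisoformat(n["metadata"]["event_date"]) >= today
    ]
    if upcoming:
        return upcoming
    if dated:
        # ISO date strings sort chronologically, so max() finds the latest.
        return [max(dated, key=lambda n: n["metadata"]["event_date"])]
    return nodes  # nothing to filter on; pass everything through


retrieved = [
    {"metadata": {"event_date": "2023-12-15"}, "text": "Weihnachtsfeier 2023 ..."},
    # Invented date, for illustration only.
    {"metadata": {"event_date": "2019-12-13"}, "text": "Weihnachtsfeier 2019 ..."},
]
kept = select_current(retrieved, today=date(2023, 10, 30))
print([n["text"] for n in kept])
```

The trade-off is that filtering like this is rigid: a question that is actually about the 2019 party would no longer find its context, so the rule would need an exception for queries that name a year.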