
Updated 3 months ago

Anyone have a solution in their RAG app

Does anyone have a solution in their RAG app for smart date parsing, e.g. to filter docs in a date range based on the query? I’m aware of things like spaCy, which can identify date entities, but there is still a missing intermediate step to go from something like “this year” to an actual date/time for a query filter.
8 comments
You could use an LLM to do some preprocessing on the query, using something like a pydantic program.

from typing import List

from pydantic import BaseModel

# Import paths below follow the llama_index layout used in this thread;
# newer releases may expose these classes under different modules.
from llama_index.llms import OpenAI
from llama_index.program import OpenAIPydanticProgram


class Entity(BaseModel):
    """Represents an entity name (date, time, place, etc.) and its associated value."""

    name: str
    value: str


class Entities(BaseModel):
    """A list of entities."""

    entities: List[Entity]


prompt = (
    "Given a query, extract any useful entities. "
    "If none are found, return an empty list.\n"
    "Query: {query_str}"
)

# The program asks the LLM to fill in the Entities schema for each query.
program = OpenAIPydanticProgram.from_defaults(
    output_cls=Entities,
    prompt_template_str=prompt,
    llm=OpenAI(model="gpt-3.5-turbo-1106"),
)

entities = program(query_str="My query")
print(entities.entities)
You could of course tweak that as much as you need. It's kind of annoying to have to call the LLM for this, but I can't think of a better way to catch those edge cases you mentioned.
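For the most common phrases, a small rule table might cover the "this year" to actual dates step without an LLM call, leaving the LLM for everything it misses. This is only a rough sketch: `resolve_relative_range` is a made-up helper, not part of any library, and the handful of phrases it knows is an assumption about what users type.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_relative_range(phrase: str, today: date) -> Optional[Tuple[date, date]]:
    """Map a few relative date phrases to an inclusive (start, end) range."""
    p = phrase.strip().lower()
    if p == "today":
        return today, today
    if p == "yesterday":
        d = today - timedelta(days=1)
        return d, d
    if p == "this year":
        return date(today.year, 1, 1), date(today.year, 12, 31)
    if p == "last year":
        return date(today.year - 1, 1, 1), date(today.year - 1, 12, 31)
    if p == "this month":
        start = today.replace(day=1)
        # First day of next month minus one day is the last day of this month.
        next_month = date(start.year + (start.month == 12), start.month % 12 + 1, 1)
        return start, next_month - timedelta(days=1)
    if p == "last month":
        # Day before the first of this month is the last day of last month.
        end = today.replace(day=1) - timedelta(days=1)
        return end.replace(day=1), end
    return None  # unknown phrase: fall back to the LLM step


print(resolve_relative_range("this year", date(2024, 3, 15)))
```

Passing `today` in explicitly, rather than calling `date.today()` inside, keeps the function easy to test.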
I’m thinking about fine-tuning BERT for this
I have so so many things that this would help with
The prompt would be something like: Today's date is YYYY-MM-DD. The user has entered the query below: xxxx
Extract the relevant date range:
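A rough sketch of how that prompt could be wired up, whichever model ends up answering it. The JSON reply format with `start` and `end` keys is an assumption added here so the output can be validated, and `build_date_prompt` and `parse_date_range` are hypothetical helper names. No model is called in this snippet.

```python
import json
from datetime import date
from typing import Optional, Tuple

# Assumed reply format: a JSON object with "start" and "end", or null.
PROMPT_TEMPLATE = (
    "Today's date is {today}. The user has entered the query below:\n"
    "{query}\n"
    'Extract the relevant date range as JSON with keys "start" and "end" '
    "(YYYY-MM-DD), or null if the query has no date constraint."
)


def build_date_prompt(query: str, today: date) -> str:
    """Fill in today's date so the model can resolve relative phrases."""
    return PROMPT_TEMPLATE.format(today=today.isoformat(), query=query)


def parse_date_range(response_text: str) -> Optional[Tuple[date, date]]:
    """Validate the model's reply; return None for anything malformed."""
    try:
        data = json.loads(response_text)
        start = date.fromisoformat(data["start"])
        end = date.fromisoformat(data["end"])
    except (ValueError, KeyError, TypeError):
        return None
    # Reject inverted ranges rather than silently swapping them.
    return (start, end) if start <= end else None


print(build_date_prompt("sales reports from this year", date(2024, 3, 15)))
print(parse_date_range('{"start": "2024-01-01", "end": "2024-12-31"}'))
```

Validating the reply before using it as a filter means a bad generation just turns into "no date filter" instead of a broken query.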
It would have to be a generative model, not just BERT. T5 is probably a good choice 🤔
Yeah. I feel like this should be a pretty popular thing if done well? Rather than going full-bore text-to-SQL or whatever, no matter what your RAG stack is, date filters are important for narrowing the search space.
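To make "narrowing the search space" concrete, here is a stack-agnostic sketch of applying a date range before retrieval. It assumes each doc is a plain dict carrying an ISO-formatted `date` field, which is an illustrative schema and not any particular vector store's; `filter_by_date` is a hypothetical helper.

```python
from datetime import date
from typing import Dict, List


def filter_by_date(docs: List[Dict], start: date, end: date) -> List[Dict]:
    """Keep docs whose ISO "date" field falls in the inclusive [start, end] range."""
    kept = []
    for doc in docs:
        raw = doc.get("date")
        if raw is None:
            continue  # undated docs are excluded once a date filter is active
        if start <= date.fromisoformat(raw) <= end:
            kept.append(doc)
    return kept


docs = [
    {"id": "a", "date": "2023-11-02"},
    {"id": "b", "date": "2024-02-14"},
    {"id": "c"},  # no date metadata
]
in_range = filter_by_date(docs, date(2024, 1, 1), date(2024, 12, 31))
print([d["id"] for d in in_range])
```

In a real stack the same range would more likely be pushed down as a metadata filter on the vector store, so that only matching docs are scored at all.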
Our auto retriever would technically handle that. It introduces an LLM step to write the filters, top k, and query.