I have been going directly from index.as_query_engine() to query(). With query(), it takes time and costs money. Does it make sense to use the Retriever classes first, to check how well the chunks the index returns can answer the query, or is that prone to so much tweaking that it isn't worth it?
I was going to do that, and then started thinking I don't understand the retriever <-> LLM relationship well enough. So I query the retriever. Most likely it's going to find at least one chunk. What threshold do I choose to say a chunk is "similar enough"? I just don't have a feel for how often the retriever returns zero chunks (I think never), or whether there's some threshold that says "Good luck with that - you're going to get a crappy answer."
If I could do this, then given the time it takes to get a response from the (OpenAI) LLM only to end up with a crappy answer, I wanted to eliminate that experience plus the cost (although the cost is less than a penny from what I can tell...).
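To make the idea concrete, here's a minimal sketch of the pre-flight check I have in mind. It assumes the retriever (e.g. from LlamaIndex's index.as_retriever().retrieve(query), which returns nodes with similarity scores) hands back (text, score) pairs; the 0.75 cutoff is just a guess, not a recommended value, since the right number depends on the embedding model:

```python
# Hypothetical cutoff -- would need tuning against real queries.
SIMILARITY_CUTOFF = 0.75

def worth_querying(nodes_with_scores, cutoff=SIMILARITY_CUTOFF):
    """Given (text, score) pairs from a retriever, decide whether the
    best-matching chunk is similar enough to justify an LLM call."""
    if not nodes_with_scores:
        # Retriever found nothing -- skip the LLM entirely.
        return False
    best_score = max(score for _, score in nodes_with_scores)
    return best_score >= cutoff

# Hypothetical retriever output: cosine similarities in [0, 1].
print(worth_querying([("chunk a", 0.82), ("chunk b", 0.60)]))  # True
print(worth_querying([("chunk a", 0.41)]))                     # False
print(worth_querying([]))                                      # False
```

The question is whether a single static cutoff like this is ever reliable, or whether it drifts so much per query/corpus that tuning it costs more than the LLM calls it saves.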