Hey there, I can see when I do index.query that there is an 'initial response' and a 'refined response'. The initial response is actually what I need — how can I parse that out instead of the refined one? And what are the levers that define what 'refined' means, and how can I optimize that? Thanks!
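For reference, here's roughly the call I'm making (a minimal sketch against the pre-0.6 index.query API; similarity_top_k and response_mode are the kwargs I suspect are the levers, but I may be wrong):

```python
# Sketch of my current call, assuming the pre-0.6 llama_index query API.
response = index.query(
    "my question",
    similarity_top_k=1,       # my guess: 1 chunk -> 1 LLM call -> no refine step?
    response_mode="default",  # does "compact" change when refinement happens?
)
print(response)
```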
For Notion, the problem was that the Reader would pull in the "body" of a page (what the user has written on the page, relevant info, etc.) but none of the metadata (title of the page, when it was created, etc.) was included.
I looped through the Documents once they were created, hit the Notion API for the metadata (which for some reason couldn't retrieve the body), and appended it to each Document as a dict.
That way, when a query comes in, the metadata is included in the vector-similarity search, which made retrieval more accurate (and gave the language model more context to work from when generating a response). As detailed in the docs, a language model can even generate the metadata itself to allow for better querying. A sketch of the loop follows below.
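Roughly what the loop looked like (a sketch, assuming the official notion-client package and the old Document.extra_info dict field; how your reader stores the page id may differ, so treat doc.doc_id below as an assumption):

```python
import os
from notion_client import Client  # official Notion SDK for Python

notion = Client(auth=os.environ["NOTION_TOKEN"])

for doc in documents:
    # Assumes the reader stored the Notion page id on the Document;
    # check doc_id / extra_info in your llama_index version.
    page = notion.pages.retrieve(page_id=doc.doc_id)
    doc.extra_info = {
        # created_time / last_edited_time are top-level fields on a page object
        "created_time": page["created_time"],
        "last_edited_time": page["last_edited_time"],
        # Title extraction varies by page type; this path works for standalone
        # pages whose title property is literally named "title".
        "title": page["properties"]["title"]["title"][0]["plain_text"],
    }
```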
How has everyone dealt with speed issues? Our dataset isn't that large, but querying an index seems to take a while (possibly because of the number of network requests in our API).
The code in my notebook was working before, but now that I've updated my llama_index version I'm getting this error:
```
ImportError: cannot import name 'RESPONSE_TEXT_TYPE' from partially initialized module 'llama_index.indices.response.builder' (most likely due to a circular import)
```