If you have research papers, what's the best way to extract and chunk the data? I want the LLM to reference the journal article when providing answers.
I usually add metadata and increase similarity top k, but keep most other parameters, like chunk size, at their defaults. For completions I use CitationQueryEngine, which provides in-text citations that map back to the retrieved source nodes.
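To make the idea concrete, here is a rough, library-free sketch in Python of what that workflow does. It is not LlamaIndex's API or how CitationQueryEngine is implemented. The names (Chunk, chunk_paper, retrieve, build_cited_context), the word-overlap scoring, and the example title and journal are all made up for illustration. A real setup would use embeddings for similarity and a proper PDF extractor.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    metadata: dict


def chunk_paper(text, metadata, chunk_size=50, overlap=10):
    """Split text into overlapping word windows; each chunk carries the paper's metadata."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        # Copy the paper-level metadata so every chunk can be traced to its article.
        chunks.append(Chunk(" ".join(window), dict(metadata, chunk_index=len(chunks))))
        if start + chunk_size >= len(words):
            break
    return chunks


def score(query, chunk):
    """Toy similarity: number of shared lowercase words (stand-in for embedding similarity)."""
    return len(set(query.lower().split()) & set(chunk.text.lower().split()))


def retrieve(query, chunks, top_k=3):
    """Return the top_k highest-scoring chunks; raising top_k gives the LLM more sources."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:top_k]


def build_cited_context(retrieved):
    """Number each retrieved chunk so an answer can cite it as [1], [2], ..."""
    lines = []
    for i, chunk in enumerate(retrieved, start=1):
        meta = chunk.metadata
        lines.append(f"Source [{i}] ({meta['title']}, {meta['journal']}): {chunk.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical paper: placeholder text and made-up metadata.
    paper = " ".join(f"word{i}" for i in range(120))
    meta = {"title": "Example Paper", "journal": "Example Journal"}
    chunks = chunk_paper(paper, meta)
    print(build_cited_context(retrieve("word85 word86", chunks, top_k=2)))
```

The numbered context string is what you would pass to the LLM along with an instruction to cite sources by number. Because each chunk keeps the journal metadata, a citation like [2] can be mapped back to the article it came from.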