Hi everyone! I've recently been diving into AI/RAG, particularly using LlamaIndex (love the project, amazing work!) and open-source models. I have two questions that might come out of my ignorance, but it would be great if anyone could answer them:
- Why do retrievers return the top-k most similar chunks instead of everything above a similarity threshold? (I've sketched what I mean below the list.)
- Is there a "recommended" maximum size/number of documents for ingestion? As in, past a certain amount, might the model not perform as well as expected?
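For the first question, here's roughly what I was picturing. This is an untested sketch: it assumes the current `llama_index.core` import layout and a default embedding/LLM setup, and the `"data"` folder, cutoff value, and query string are just placeholders:

```python
# Untested sketch; assumes a recent llama_index.core package layout
# and a default embedding/LLM configuration.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.postprocessor import SimilarityPostprocessor

documents = SimpleDirectoryReader("data").load_data()  # placeholder folder
index = VectorStoreIndex.from_documents(documents)

# The usual approach: always retrieve a fixed number of chunks.
top_k_engine = index.as_query_engine(similarity_top_k=3)

# What I had in mind instead: over-fetch, then drop anything below a
# similarity cutoff, so the number of chunks varies per query.
threshold_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.75)],
)

print(top_k_engine.query("What does the report say about Q3 revenue?"))
print(threshold_engine.query("What does the report say about Q3 revenue?"))
```

Is there a reason the fixed top-k version is the default rather than something cutoff-based like the second engine?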
Thanks!