The post shows code for setting up a vector store and indexing documents. In the comments, community members discuss the setup's performance and note that the chunk size and overlap seem suboptimal. They suggest a larger chunk size (e.g., 512) and a higher top k (around 12) for retrieved nodes, followed by a reranker with a top-n of 3 or 4. They also note that the indexed data (4.78 MB across 4 documents) is a lot of text, and that tables in the data could be causing issues with storage and retrieval.
4.78 MB is a looot of text. I would suggest a) using a chunk size of 512, b) increasing the top k (probably 12?), and c) using a reranker with top-n = 3 or 4.
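To make the three knobs concrete, here is a rough, library-free sketch of how they fit together. It is not the code from the post and not any vector store's API: `chunk_words`, `retrieve`, and `rerank` are hypothetical helpers, the chunker counts whitespace-separated words rather than tokens, the overlap of 50 is an assumed value (the thread does not give one), and plain word matching stands in for both the embedding search and the reranker model.

```python
from collections import Counter


def chunk_words(text, chunk_size=512, overlap=50):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words. Words stand in for tokens here.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def overlap_score(query, chunk):
    """Count query words that also occur in the chunk (toy relevance score)."""
    q = Counter(query.lower().split())
    c = Counter(chunk.lower().split())
    return sum(min(q[w], c[w]) for w in q)


def retrieve(query, chunks, top_k=12):
    """First stage: cast a wide net and keep the top_k best-scoring chunks."""
    ranked = sorted(chunks, key=lambda ch: overlap_score(query, ch), reverse=True)
    return ranked[:top_k]


def rerank(query, candidates, top_n=3):
    """Second stage: re-score the candidates and keep only top_n.

    Match density (matches per word) is a stand-in for a real reranker.
    """
    def density(ch):
        n = len(ch.split())
        return overlap_score(query, ch) / n if n else 0.0

    return sorted(candidates, key=density, reverse=True)[:top_n]


if __name__ == "__main__":
    text = " ".join(f"w{i}" for i in range(1200))
    chunks = chunk_words(text, chunk_size=512, overlap=50)
    candidates = retrieve("w10 w700 w1100", chunks, top_k=12)
    print(len(chunks), len(rerank("w10 w700 w1100", candidates, top_n=3)))
```

The point of the two stages is that a high top k improves recall over a large corpus, while the small top-n keeps only the strongest few chunks for the final prompt.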