The embedding mode will only check the closest-matching text chunks. By default this is 1 chunk, but you can override it in the query with something like similarity_top_k=2
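To make the mechanism concrete, here is a toy sketch (plain Python, not the library's actual internals) of what a top-k setting like similarity_top_k controls: only the k highest-scoring chunks are retrieved, and only those ever reach the LLM's context window.

```python
# Toy illustration of top-k retrieval: keep only the k best-scoring chunks.
# The scoring itself (embedding similarity) is faked with fixed numbers here.
def top_k_chunks(scored_chunks, k=1):
    # scored_chunks: list of (similarity_score, chunk_text) pairs
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    return [text for score, text in ranked[:k]]

chunks = [(0.91, "chunk A"), (0.42, "chunk B"), (0.77, "chunk C")]
print(top_k_chunks(chunks, k=1))  # ['chunk A']
print(top_k_chunks(chunks, k=2))  # ['chunk A', 'chunk C']
```

With k=1 the LLM only ever sees the single best chunk, which is why raising the value (e.g. similarity_top_k=2) widens the slice of your data the model can draw on.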
My concern here is that I will be providing the LLM a context no larger than 4096 tokens in a single call. There are probably over 100k tokens in my data, so it will only extract topics from the "sample" given in the context