Refine and Compact are actually very similar (the Compact class extends Refine).
The only difference is that Refine queries the LLM once per node.
Compact stuffs as much text as possible into each LLM call. It may still make more than one call, refining the answer as it goes, if all the text does not fit into the first call.
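
To make the difference in call counts concrete, here is a toy sketch in plain Python. It is not the library's code: refine, pack, compact and CountingLLM are hypothetical names, the budget is measured in characters rather than tokens, and whole chunks are packed without being split.

```python
from typing import Callable, List


def refine(chunks: List[str], llm: Callable[[str], str]) -> str:
    """Call the LLM once per chunk, feeding the previous answer back in."""
    answer = ""
    for chunk in chunks:
        answer = llm(f"Existing answer: {answer}\nNew context: {chunk}")
    return answer


def pack(chunks: List[str], max_chars: int) -> List[str]:
    """Greedily merge chunks so each merged text stays within max_chars.

    A single chunk longer than max_chars is kept on its own (not split).
    """
    packed: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = f"{current}\n{chunk}" if current else chunk
        if current and len(candidate) > max_chars:
            packed.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        packed.append(current)
    return packed


def compact(chunks: List[str], llm: Callable[[str], str], max_chars: int) -> str:
    """Pack first, then run the same refine loop over the packed texts."""
    return refine(pack(chunks, max_chars), llm)


class CountingLLM:
    """Stand-in for a real LLM that only counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return f"answer after call {self.calls}"


if __name__ == "__main__":
    nodes = ["x" * 40 for _ in range(5)]  # five retrieved nodes, 40 chars each

    llm = CountingLLM()
    refine(nodes, llm)
    print("refine calls:", llm.calls)

    llm = CountingLLM()
    compact(nodes, llm, max_chars=100)
    print("compact calls:", llm.calls)
```

With five 40-character nodes and a 100-character budget, refine makes five calls and compact makes three; raise the budget enough and compact gets it done in a single call.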
Please pass the similarity_cutoff and similarity_top_k params along with the response mode param. In the updated version, you need to create a query_engine from the index and run the query through that query_engine.
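
A rough sketch of what that could look like, assuming a recent llama_index release (0.10 or later, where the imports live under llama_index.core). build_query_engine and its default values are made up for illustration, and in these versions the cutoff is applied through SimilarityPostprocessor rather than passed as a direct keyword, so adjust to whatever version you have installed.

```python
def build_query_engine(index, top_k=3, cutoff=0.7, response_mode="compact"):
    """Hypothetical helper: wrap an existing index in a query engine.

    Assumes llama_index >= 0.10; import paths differ in older releases.
    """
    # Imported lazily so this module still loads without llama_index installed.
    from llama_index.core.postprocessor import SimilarityPostprocessor

    return index.as_query_engine(
        similarity_top_k=top_k,        # how many nodes to retrieve
        response_mode=response_mode,   # "refine" or "compact"
        node_postprocessors=[
            # Drops retrieved nodes scoring below the cutoff.
            SimilarityPostprocessor(similarity_cutoff=cutoff)
        ],
    )


# Usage (needs llama_index, your documents, and a configured LLM):
#   from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
#   docs = SimpleDirectoryReader("data").load_data()
#   index = VectorStoreIndex.from_documents(docs)
#   query_engine = build_query_engine(index, response_mode="refine")
#   print(query_engine.query("your question here"))
```

The directory name "data" and the question string are placeholders.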