Sadly, the only "free" ones are the ones available from Hugging Face (i.e. the last two links I sent).
But this assumes you have the hardware needed to run the models, which in most cases is pretty expensive (e.g. you'd need at least a 3090 to run a decent LLM at a good speed).
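To give a rough idea of why the hardware matters, here's a back-of-the-envelope sketch of how much memory the model weights alone take up, compared against the 24 GB of VRAM on a 3090. The function name, the example model sizes, and the 16-bit / 4-bit precisions are just my assumptions for illustration, and it ignores the extra memory needed for the context (KV cache) and activations, so real usage will be higher.

```python
# Rough estimate only: counts the weights, not KV cache or activations.

RTX_3090_VRAM_GB = 24  # VRAM on an RTX 3090


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Hypothetical model sizes, picked only as examples.
for name, n_params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    for bits in (16, 4):
        gb = weight_memory_gb(n_params, bits)
        verdict = "fits" if gb < RTX_3090_VRAM_GB else "does not fit"
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB of weights, {verdict} in 24 GB")
```

So under these assumptions a smaller model fits comfortably, while bigger ones only fit if they're quantized, or not at all.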
One question regarding the response: why does it sometimes cut off, as if it doesn't finish the full sentence? Is there any way around that, so that it finishes the sentence or continues from that point onward?