You can use local embeddings just fine:
ServiceContext.from_defaults(embed_model="local:BAAI/bge-base-en-v1.5")
There are many options for embeddings, actually:
https://docs.llamaindex.ai/en/stable/core_modules/model_modules/embeddings/modules.html
You can limit keyword results with a node postprocessor (like a reranker, which I explained earlier today I think)
Generally, the ideal solution is something that combines all of these (e.g. a router query engine, or a sub-question query engine)