Does anyone have experience using LlamaIndex with Dolly v2? I think the best way is to use HuggingFacePipeline.from_model_id() alongside HuggingFaceEmbeddings(), pass those in via a ServiceContext to GPTVectorStoreIndex.from_documents(), and then call .as_query_engine() on the index. But I'm getting a few lines of sensible responding followed by a bunch of repetition and nonsense. Not sure if I just need to tweak parameters and response length, or if I'm producing Frankenstein's Monster here.
It could definitely be some parameters that need tweaking (temperature, top p/k, repetition penalty, etc.). But tbh, I haven't had a great experience with Dolly and llama index
Actually, we just released a notebook today for using local models. I used camel as the LLM, it seems to do pretty well with llama index (for the most part anyways)