Yeah, open-source LLMs still have a lot of catching up to do, it seems.
With only 16GB, if you can't get StableLM to generate anything better, you could look into using Llama 2 or similar with llama.cpp. If you install llama.cpp with GPU support, it's decently fast and easy to use.
LangChain also has a llama.cpp integration that you can use with LlamaIndex.
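If you go that route, here's a rough sketch of LangChain's LlamaCpp integration, just as an example. The model path and parameter values are placeholders, so swap in whatever fits your setup, and it assumes llama-cpp-python was installed with GPU support:
from langchain.llms import LlamaCpp

# load a local model through llama.cpp (path and params are placeholders)
lc_llm = LlamaCpp(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",
    n_gpu_layers=32,   # offload layers to the GPU; needs a GPU-enabled llama.cpp build
    n_ctx=4096,        # context window size
    temperature=0.1,
)

print(lc_llm("Q: What is the capital of France? A:"))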
You can use any LLM from LangChain, as long as you wrap it with our wrapper:
from llama_index.llms import LangChainLLM
from llama_index import ServiceContext, set_global_service_context
llm = LangChainLLM(<langchain llm>)
ctx = ServiceContext.from_defaults(llm=llm)
set_global_service_context(ctx)
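Once the global service context is set, any index you build picks up that LLM automatically. A quick sketch of what that looks like (the "./data" folder and the query are just placeholders):
from llama_index import VectorStoreIndex, SimpleDirectoryReader

# documents are loaded and indexed using the globally configured service context
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# queries now run through the wrapped LangChain LLM
query_engine = index.as_query_engine()
response = query_engine.query("Summarize these documents.")
print(response)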