```python
program = GuidancePydanticProgram(
    output_cls=Character,
    prompt_template_str=(
        "Give me a character description using"
        " the movie {{game_name}} as inspiration"
    ),
    guidance_llm=outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda"),
    verbose=True,
)
```
and I'm getting an out of memory error running on a 3090 with 32 GB VRAM. Any thoughts on how to optimize it?
Please stand by @andysingal as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Hmm, I don't know enough to give you a confident answer yet. The information extracts provided do not contain specific details on how to optimize the GuidancePydanticProgram to avoid out of memory errors on a GPU with 32GB VRAM.

However, it's worth noting that large language models like the one you're using (mistralai/Mistral-7B-v0.1) can consume a significant amount of memory, especially when processing large inputs or outputs.

You might want to consider the following general strategies for optimizing memory usage:

  1. Reduce the batch size: If you're processing multiple inputs at once, try reducing the number of inputs you process at the same time.
  2. Truncate the input or output: If your inputs or outputs are very long, you could try reducing their length. However, be aware that this could affect the quality of the model's output.
  3. Use a smaller model: If memory usage continues to be an issue, you might need to switch to a smaller model. This could also affect output quality, but it might be necessary if memory constraints are a significant issue. (A related precision-reduction sketch follows this list.)
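One more general note, beyond the extracts above: Mistral-7B has roughly 7 billion parameters, which is about 28 GB of weights in float32 but only around 14 GB in float16, so forcing half precision at load time is often enough to avoid this kind of OOM. Below is a minimal sketch; it assumes your outlines version accepts a `model_kwargs` dict that is forwarded to Hugging Face's `from_pretrained`, so verify that against the version you have installed.

```python
import torch
import outlines

# Sketch only: load the weights in half precision to roughly halve VRAM usage.
# Assumes outlines.models.transformers forwards `model_kwargs` to
# transformers' from_pretrained; check your installed outlines version.
llm = outlines.models.transformers(
    "mistralai/Mistral-7B-v0.1",
    device="cuda",
    model_kwargs={"torch_dtype": torch.float16},
)
```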
For more specific guidance, you might find it helpful to refer to the LlamaIndex documentation or the LlamaIndex GitHub repository.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
32 GB going out for a 16 GB model doesn't seem right. Can you paste a screenshot of your VRAM + RAM usage graph when the error happens, like Nicholas did in their message?
https://discord.com/channels/1059199217496772688/1059200010622873741/1190678757707632731
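If a screenshot is awkward to grab, pasting the raw numbers works too. A minimal way to print them from the same Python process (assuming the model is loaded through PyTorch, which the transformers backend uses):

```python
import torch

# Report current and peak GPU memory held by tensors in this process.
print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.1f} GB")
print(f"peak:      {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")
```

Running `nvidia-smi` in a terminal while the model is loading also gives the per-process view.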