NVIDIA module issues with Nemotron models

At a glance

The community member is using the NVIDIA module from the llama-index-llms-nvidia package, but is encountering issues with the "nemotron" models, which work fine with the native OpenAI library. They are getting a 404 Not Found error when trying to use the "nvidia/nemotron-4-51b-instruct" model.

The comments suggest that the community member should try upgrading to the latest version of the llama-index-llms-nvidia package, which is 0.2.6, as it may fix the issue. Another community member mentions that the model name may be a typo and should be "nvidia/llama-3.1-nemotron-51b-instruct" instead.

The community member also tried using the "nvidia/nemotron-4-340b-instruct" model, but encountered a 400 Bad Request error. They are unable to upgrade to the latest version of the package due to dependency conflicts, but the head of open source for llama-index suggests that the latest version should fix the issue.

The community member also inquires about incorporating NeMo Guardrails into the llama-index workflow, and the head of open source confirms that any LLM can be used in any workflow, and that Llama-Deploy is just one way to host existing workflows as services.

Hi, I'm using the NVIDIA module. It works fine with every model except the Nemotron models (which work smoothly with the native OpenAI library). Any idea what's going on?

Plain Text
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

Settings.llm = NVIDIA(model="nvidia/nemotron-4-51b-instruct", ...)


Plain Text
INFO:httpx:HTTP Request: POST https://integrate.api.nvidia.com/v1/chat/completions "HTTP/1.1 404 Not Found"

...

  File "/home/0/miniconda3/envs/gpu_rag/lib/python3.10/site-packages/openai/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: 404 page not found
8 comments
Not sure, but I can pass this along to the nvidia folks (they maintain this integration)
Do you have the latest version of the package? pip install -U llama-index-llms-nvidia ?
Is that maybe a typo?

nvidia/llama-3.1-nemotron-51b-instruct instead of nvidia/nemotron-4-51b-instruct ?

https://build.nvidia.com/nvidia/llama-3_1-nemotron-51b-instruct?snippet_tab=Shell
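
If it is the typo, a minimal sketch of the fix, assuming the same setup as in the question with only the model id swapped in:

Plain Text
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

# The id in the build.nvidia.com catalog is llama-3.1-nemotron-51b-instruct,
# not nemotron-4-51b-instruct.
Settings.llm = NVIDIA(model="nvidia/llama-3.1-nemotron-51b-instruct")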
Thank you for your reply! The snippet I pasted was sort of amalgamated; I was actually testing "nvidia/nemotron-4-340b-instruct". The exact code and error message is:

Plain Text
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

Settings.llm = NVIDIA(model="nvidia/nemotron-4-340b-instruct", ...)


Plain Text
INFO:httpx:HTTP Request: POST https://integrate.api.nvidia.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
2024-10-30 13:47:25.979 Uncaught app exception
Traceback (most recent call last):

...

  File "/home/0/miniconda3/envs/gpu_rag/lib/python3.10/site-packages/openai/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'type': 'about:blank', 'status': 400, 'title': 'Bad Request', 'detail': 'Inference error'}


I'm using llama-index-llms-nvidia 0.1.4, but I can't upgrade to the latest version in my project due to a dependency conflict. I did test, though, with

Plain Text
pip install -U llama-index-llms-nvidia --no-deps

but it gave the same error.
I'm not sure an upgrade with dependencies would fix the error. Thank you again!!
the latest is 0.2.6 for the nvidia llm, so it will almost certainly fix this I think πŸ€” They've done a lot to maintain this class
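
A quick sanity check on what actually got installed; note that --no-deps skips upgrading transitive packages, so an old llama-index-core could still be pinned (an assumption about the conflict here):

Plain Text
# Print the installed versions to confirm the upgrade took effect.
from importlib.metadata import version

print(version("llama-index-llms-nvidia"))  # expect 0.2.6 after a full upgrade
print(version("llama-index-core"))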
You really seem to be keeping up with their progress. Then, do you happen to know whether a LlamaIndex workflow can incorporate NeMo Guardrails, either as one of multiple agents or via as_query_engine, in a workflow like this:

Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import ToolSelection, ToolOutput
from llama_index.core.workflow import Event


class InputEvent(Event):
    """Carries the chat history into the next step."""
    input: list[ChatMessage]


class ToolCallEvent(Event):
    """Emitted when the LLM requests one or more tool calls."""
    tool_calls: list[ToolSelection]


class FunctionOutputEvent(Event):
    """Wraps the result of executing a single tool."""
    output: ToolOutput

(is the workflow module also deprecated in favor of llama_deploy? please say no... 🥺)
I'm the head of open source for llama-index lol so I hope I keep up with it πŸ˜…

You can use any llm you want in any workflow

Llama-Deploy is just one way to host your existing workflows as services πŸ‘
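
To make "any LLM in any workflow" concrete, here's a minimal sketch, assuming llama-index-llms-nvidia 0.2.x and the current Workflow API; the model id and query are just placeholders:

Plain Text
from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.nvidia import NVIDIA


class OneShotWorkflow(Workflow):
    """A single-step workflow that forwards the incoming query to an LLM."""

    @step
    async def generate(self, ev: StartEvent) -> StopEvent:
        # Any LLM works here; NVIDIA is used since it's the one in question.
        llm = NVIDIA(model="nvidia/llama-3.1-nemotron-51b-instruct")
        response = await llm.acomplete(ev.query)
        return StopEvent(result=str(response))


# Usage (inside an async context):
#     result = await OneShotWorkflow(timeout=60).run(query="Hello!")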
🫣 You're the Logan! I'm stoked 🀩

I just found this one, so I'm gonna give it a try: Building a multi-agent concierge system from scratch

Eager to see it work for me 🧐 Have a wonderful day or night!