Nvidia module issues with nemotron models

Hi, I'm using the NVIDIA module. It works fine with every model except the Nemotron models (which work smoothly with the native OpenAI library). Any idea what might be going on?

Plain Text
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

Settings.llm = NVIDIA(model="nvidia/nemotron-4-51b-instruct", ...)


Plain Text
INFO:httpx:HTTP Request: POST https://integrate.api.nvidia.com/v1/chat/completions "HTTP/1.1 404 Not Found"

...

  File "/home/0/miniconda3/envs/gpu_rag/lib/python3.10/site-packages/openai/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: 404 page not found
8 comments
Not sure, but I can pass this along to the nvidia folks (they maintain this integration)
Do you have the latest version of the package? pip install -U llama-index-llms-nvidia ?
Is that maybe a typo?

nvidia/llama-3.1-nemotron-51b-instruct instead of nvidia/nemotron-4-51b-instruct ?

https://build.nvidia.com/nvidia/llama-3_1-nemotron-51b-instruct?snippet_tab=Shell
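
If the 51B model is the one you're after, a minimal sketch with that catalog id (taken from the link above; any other kwargs are omitted) would be:

Plain Text
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

# Model id as listed on build.nvidia.com for the Llama 3.1 Nemotron 51B card.
Settings.llm = NVIDIA(model="nvidia/llama-3.1-nemotron-51b-instruct")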
Thank you for your reply! The snippet I pasted was sort of amalgamated; I was actually testing "nvidia/nemotron-4-340b-instruct". The exact code and error message are:

Plain Text
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

Settings.llm = NVIDIA(model="nvidia/nemotron-4-340b-instruct", ...)


Plain Text
INFO:httpx:HTTP Request: POST https://integrate.api.nvidia.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
2024-10-30 13:47:25.979 Uncaught app exception
Traceback (most recent call last):

...

  File "/home/0/miniconda3/envs/gpu_rag/lib/python3.10/site-packages/openai/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'type': 'about:blank', 'status': 400, 'title': 'Bad Request', 'detail': 'Inference error'}


I'm using llama-index-llms-nvidia 0.1.4, but I can't upgrade to the latest version in this project due to a dependency conflict. I did test, though, with
Plain Text
pip install -U llama-index-llms-nvidia --no-deps
but it gave the same error.
I'm not sure an upgrade with full dependencies would fix it. Thank you again!!
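
One quick sanity check, sketched below under the assumption that a valid NVIDIA_API_KEY is set and that your installed version exposes the available_models property: list the models the integration itself reports, and see whether the Nemotron id appears at all.

Plain Text
from llama_index.llms.nvidia import NVIDIA

# Assumes NVIDIA_API_KEY is set in the environment; the client talks to
# https://integrate.api.nvidia.com/v1 by default.
llm = NVIDIA()

# If the Nemotron id is absent from this list, the 400/404 responses
# from chat/completions are consistent with an outdated model catalog.
print(sorted(m.id for m in llm.available_models))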
the latest is 0.2.6 for the nvidia llm, so it will almost certainly fix this I think 🤔 They've done a lot to maintain this class
You really seem to keep up with their progress. Then, do you happen to know whether a LlamaIndex workflow can incorporate NeMo Guardrails, either as one of the multi-agents, via as_query_engine, or in a workflow like this:

Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import ToolSelection, ToolOutput
from llama_index.core.workflow import Event


class InputEvent(Event):
    # Chat history to feed to the LLM step.
    input: list[ChatMessage]


class ToolCallEvent(Event):
    # Tool calls the LLM has selected.
    tool_calls: list[ToolSelection]


class FunctionOutputEvent(Event):
    # Output from an executed tool.
    output: ToolOutput

(is the workflow module also deprecated in favor of llama_deploy? please say no... 🥺)
I'm the head of open source for llama-index lol so I hope I keep up with it 😅

You can use any LLM you want in any workflow

Llama-Deploy is just one way to host your existing workflows as services 👍
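
To make "any LLM in any workflow" concrete, here is a minimal sketch; the model id, timeout, and prompt are illustrative assumptions, and it presumes a llama-index-core version with the workflow API plus NVIDIA_API_KEY in the environment:

Plain Text
import asyncio

from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.nvidia import NVIDIA


class OneShotWorkflow(Workflow):
    # Single step: read the query off the StartEvent, ask the LLM,
    # and finish with the text of the response.
    @step
    async def generate(self, ev: StartEvent) -> StopEvent:
        llm = NVIDIA(model="nvidia/llama-3.1-nemotron-51b-instruct")
        response = await llm.acomplete(ev.query)
        return StopEvent(result=str(response))


async def main() -> None:
    workflow = OneShotWorkflow(timeout=60)
    print(await workflow.run(query="Say hello in one sentence."))


asyncio.run(main())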
🫣 You're the Logan! I'm stoked 🤩

I just found this one, so I'm gonna give it a try: Building a multi-agent concierge system from scratch

Eager to see it work for me 🧐 Have a wonderful day or night!