
Creating an Index with an OpenAI-like LLM

Hello People,
I need your guidance.

Plain Text
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai_like import OpenAILike

documents = SimpleDirectoryReader("../data", required_exts=[".txt"]).load_data()

llm = OpenAILike(
    model="model",
    api_key="Key",
    api_base="OpenAI Compatible endpoint",
    context_window=16000,
    is_chat_model=True,
    is_function_calling_model=False,
)
Settings.embed_model = llm

# Create index
index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True,
)

I'm facing an error with this code. Am I doing something wrong?
Plain Text
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[22], line 31
     22 documents = SimpleDirectoryReader("../data", required_exts=[".txt"]).load_data()
     23 #embed_model = llm
     24 
     25 
   (...)
     29 #     api_base="http://tentris-ml.cs.upb.de:8502/v1"
     30 # )
---> 31 Settings.embed_model = llm
     33 # Create index
     34 index = VectorStoreIndex.from_documents(
     35     documents, 
     36     show_progress=True)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\settings.py:74, in _Settings.embed_model(self, embed_model)
     71 @embed_model.setter
     72 def embed_model(self, embed_model: EmbedType) -> None:
     73     """Set the embedding model."""
---> 74     self._embed_model = resolve_embed_model(embed_model)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\embeddings\utils.py:136, in resolve_embed_model(embed_model, callback_manager)
    133     print("Embeddings have been explicitly disabled. Using MockEmbedding.")
    134     embed_model = MockEmbedding(embed_dim=1)
--> 136 assert isinstance(embed_model, BaseEmbedding)
    138 embed_model.callback_manager = callback_manager or Settings.callback_manager
    140 return embed_model

I just need a little of your time. Please help.
30 comments
An LLM cannot be an embedding model.
Did you mean to use OpenAIEmbedding?
I am trying to use an OpenAI-compatible endpoint
https://url/v1 ... It is a self-hosted model exposed through an OpenAI-compatible API.
So, there are two models, LLMs and Embeddings models.

One generates text, the other generates a list of numbers representing a piece of text.

Right now, you did Settings.embed_model = llm which cannot work

You need an embedding model (either hosted on your server, or some huggingface model running locally, or some remote provider)
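
For example, a minimal sketch of the local HuggingFace option (this assumes the optional llama-index-embeddings-huggingface package is installed; the model name is just a common example):

Plain Text
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# An embedding model maps text to a fixed-length vector of floats,
# unlike an LLM, which generates text
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
vector = embed_model.get_text_embedding("hello world")
print(len(vector))  # 384 dimensions for this particular model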
I want to use an embedding model that is remotely hosted on my uni server behind an OpenAI-compatible API. I have the URL endpoints and an API key as well, but I don't know how to use them.
Could you guide me or point me to any resources?
Well, you'll probably need to know the name of the embedding model

But beyond that, it's just

Plain Text
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.embed_model = OpenAIEmbedding(model_name="your model", api_key="fake", api_base="http://localhost:8000/v1")
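Once that's set, you can sanity-check the endpoint before building any index (a quick sketch; it just embeds one string):

Plain Text
# Ask the server for one embedding and make sure a vector comes back
vector = Settings.embed_model.get_text_embedding("test sentence")
print(type(vector), len(vector))  # expect a list of floats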
Plain Text
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
import os
from dotenv import load_dotenv
from llama_index.core.embeddings.utils import EmbedType, resolve_embed_model

load_dotenv()

Settings.embed_model = OpenAIEmbedding(
    api_base=os.getenv("TENTRIS_BASE_URL_EMBEDDINGS"),
    api_key=os.getenv("TENTRIS_API_KEY"),
    model_name=os.getenv("tentris"),
)

# Create index
index = VectorStoreIndex.from_documents(
    documents, 
    show_progress=True)
Plain Text
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[2], line 10
      6 from llama_index.core.embeddings.utils import EmbedType, resolve_embed_model
      8 load_dotenv()
---> 10 Settings.embed_model = OpenAIEmbedding(
     11     api_base=os.getenv("TENTRIS_BASE_URL_EMBEDDINGS"),
     12     api_key=os.getenv("TENTRIS_API_KEY"),
     13     model_name=os.getenv("tentris"),
     14 )
     16 # Create index
     17 index = VectorStoreIndex.from_documents(
     18     documents, 
     19     show_progress=True)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\embeddings\openai\base.py:310, in OpenAIEmbedding.__init__(self, mode, model, embed_batch_size, dimensions, additional_kwargs, api_key, api_base, api_version, max_retries, timeout, reuse_client, callback_manager, default_headers, http_client, async_http_client, num_workers, **kwargs)
    307 else:
    308     model_name = model
--> 310 super().__init__(
    311     embed_batch_size=embed_batch_size,
    312     dimensions=dimensions,
    313     callback_manager=callback_manager,
    314     model_name=model_name,
    315     additional_kwargs=additional_kwargs,
    316     api_key=api_key,
...

ValidationError: 1 validation error for OpenAIEmbedding
model_name
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type
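The cause of this one: os.getenv("tentris") looks up an environment variable literally named tentris; if no such variable is set, it returns None, and that None is what fails pydantic's string validation for model_name. A minimal illustration:

Plain Text
import os

print(os.getenv("tentris"))  # None unless an env var named "tentris" exists
model_name = "tentris"       # pass the model name as a literal string instead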
Thanks a million, my dear brother.
I don't know why I couldn't figure this out myself.
Plain Text
# Set up the OpenAI-like LLM
llm = OpenAILike(
  model='tentris',
  api_key=os.getenv("TENTRIS_API_KEY"),
  api_base=os.getenv("TENTRIS_BASE_URL_CHAT"),
)
chat_engine = index.as_chat_engine()
str(chat_engine.chat("Where is dice group located?"))

chat_engine = index.as_chat_engine(llm)
str(chat_engine.chat("Where is dice group located?"))
You'll need to set model_name="some name" to get rid of that pydantic error

For the llm, you can set Settings.llm = llm, or you can pass in as_chat_engine(llm=llm)
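
In other words (a minimal sketch, reusing the llm and index from above):

Plain Text
# Option 1: set the LLM globally
Settings.llm = llm

# Option 2: pass it per engine
chat_engine = index.as_chat_engine(llm=llm)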
Currently facing an LLM error:

Plain Text
load_dotenv()

# Set up the OpenAI-like LLM
Settings.llm = OpenAILike(
  model='tentris',
  api_key=os.getenv("TENTRIS_API_KEY"),
  api_base=os.getenv("TENTRIS_BASE_URL_CHAT"),
)

Settings.embed_model = OpenAIEmbedding(
    api_base=os.getenv("TENTRIS_BASE_URL_EMBEDDINGS"),
    api_key=os.getenv("TENTRIS_API_KEY"),
    model_name='tentris',
)

# Create index
index = VectorStoreIndex.from_documents(
    documents, 
    show_progress=True)

Plain Text
chat_engine = index.as_chat_engine()
Plain Text
str(chat_engine.chat("Where is dice group located?"))

ValueError                                Traceback (most recent call last)
Cell In[14], line 1
----> 1 str(chat_engine.chat("Where is dice group located?"))

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py:321, in Dispatcher.span.<locals>.wrapper(func, instance, args, kwargs)
    318             _logger.debug(f"Failed to reset active_span_id: {e}")
    320 try:
--> 321     result = func(*args, **kwargs)
    322     if isinstance(result, asyncio.Future):
    323         # If the result is a Future, wrap it
    324         new_future = asyncio.ensure_future(result)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\callbacks\utils.py:41, in trace_method.<locals>.decorator.<locals>.wrapper(self, *args, **kwargs)
     39 callback_manager = cast(CallbackManager, callback_manager)
     40 with callback_manager.as_trace(trace_id):
---> 41     return func(self, *args, **kwargs)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\agent\runner\base.py:692, in AgentRunner.chat(self, message, chat_history, tool_choice)
    687     tool_choice = self.default_tool_choice
    688 with self.callback_manager.event(
    689     CBEventType.AGENT_STEP,
    690     payload={EventPayload.MESSAGES: [message]},
    691 ) as e:
--> 692     chat_response = self._chat(
...
--> 437     raise ValueError("Reached max iterations.")
    439 if isinstance(current_reasoning[-1], ResponseReasoningStep):
    440     response_step = cast(ResponseReasoningStep, current_reasoning[-1])

ValueError: Reached max iterations.
Facing this error after running str(chat_engine.chat("Where is dice group located?"))
Max iterations means the LLM couldn't figure it out.

You can try increasing it, but I don't think that will help

If your LLM is not that smart, consider a different chat engine

index.as_chat_engine(chat_mode="condense_plus_context") for example
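
Putting it together (a sketch reusing the index and llm from above):

Plain Text
# condense_plus_context condenses the question and stuffs retrieved
# context into the prompt instead of relying on agentic tool calling
chat_engine = index.as_chat_engine(chat_mode="condense_plus_context", llm=llm)
print(str(chat_engine.chat("Where is dice group located?")))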
It finally worked. How did you figure it out so quickly? I am new, so do you have any tips?
I'm the core maintainer, I know the entire codebase lol
The docs definitely touch on all this. But there are a lot of docs
Oh shit. ROFL πŸ˜…

But just a quick question: how would a nerd like me go about finding or debugging this? Maybe I didn't follow the correct approach
I mean, it requires some background info on what as_chat_engine() is doing. By default, it's creating an agent with a single tool (your index). But most open-source LLMs suck at being agents (I know this from experience), so instead I got you to change the chat mode to something else
https://docs.llamaindex.ai/en/stable/module_guides/deploying/chat_engines/usage_pattern/#available-chat-modes
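
For the curious, a rough sketch of what the default agentic mode does under the hood (simplified; the real wiring lives inside as_chat_engine, and the tool name and description here are made up):

Plain Text
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

# The index is wrapped as a single query tool...
query_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    name="query_index",  # hypothetical name
    description="Answers questions about the indexed documents.",
)
# ...and an agent decides when to call it, which weaker LLMs handle poorly
agent = ReActAgent.from_tools([query_tool], llm=Settings.llm)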
Understood. Some of this requires in-depth background knowledge, and some comes with experience.
Thanks a lot for the help.