Darthus
Joined September 25, 2024

Darthus · Model
I'm trying to use gpt-4o-mini, and every time I make a call I get a response back, but also this error: "unknown field: parameter model is not a valid field"
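One hedged possibility: this kind of "unknown field" error can appear when a model= kwarg is forwarded to a component that doesn't accept it. A minimal sketch of that theory, assuming llama-index 0.10.x with an up-to-date llama-index-llms-openai package: pin the model on the LLM object itself rather than passing model= through an engine or query call.

Python
# Hedged sketch, not a confirmed fix: set the model on the LLM object.
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)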
1 comment
Is anyone using Claude 3 (Sonnet in particular)? I'm finding it cuts the output off mid-sentence at a certain point, every time, whenever a response is longer than X length.
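In case it helps: Claude defaults to a fairly low max_tokens, which truncates long completions mid-sentence. A minimal sketch, assuming the llama-index-llms-anthropic integration:

Python
# Sketch: raise max_tokens so long completions aren't truncated.
from llama_index.llms.anthropic import Anthropic

llm = Anthropic(model="claude-3-sonnet-20240229", max_tokens=4096)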
3 comments
I've "fixed" this, by implementing a TextClear to the ingestion pipeline to remove all non-text charcaters: https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/transformations.html. Now however, it's returning a lot of useless entries (ie ones that from a PDF page with just an image and page number, so the excerpt is just a number.
4 comments
One clue: the text for 106 is
Plain Text
breached in a single  turn, at least in theory. Fortification : In essence, Fortification is the Armor   of the building. Weapons with the quality Wrecking  ignore the Fortification value; other weapons must  first penetrate the Fortification value before damaging  the structure. Abilities with an armor penetrating  effect have no such effect on buildings. Set Buildings on Fire Setting wooden buildings on fire is an often used  tactic during sieges. This requires some kind of  flammable concoction \u2013 it can be an alchemical  grenade, a flaming oil canister or a simple fire made  from dry twigs and tinder. When the building has  been exposed to the flames, a test against  [Cunning  \u2013Fortification]  is rolled to see if it catches fire; if so,  the flames deal 1d4  damage and count as having the quality Wrecking. Note that whoever lights the fire will become the target of ranged attacks, provided that the building in question has windows  or archers on the roof. Those inside a burning building are at risk  of suffering damage each turn the building con - tinues to burn \u2013 a passed Strong  test per turn is Pact-making Beings So, what beings may be  interested in forging a  pact with the charac - ters? In general, they  are creatures of the  categories Beasts,  Undead or Abomina - tions that provide at  least Strong resistance.  A couple of perfect ex - amples of such beings  appear in the adventu - res already published  by J\u00e4rnringen, but since  this is a player\u2019s guide  we cannot specify ex - actly who they are\u2026  106
, and I see there's a Unicode-style escape character (\u2026) right before that 106; it seems to be discarding all the text before it...
3 comments
Hey all, I'm having a heck of a time managing token limits for GPT-3.5 Turbo in llama-index 0.10.7 (I keep overrunning). I'm using the Chat Engine, and after diving into the code I've implemented some token counters to monitor usage, but still no luck. For example, my most recent chat counted 15875 tokens in the "all messages" variable (which I think is what eventually gets sent), and then I got an overrun error from OpenAI: their limit is 16385 and my total was 16443 (I think including their response). Auditing what's being sent, I see a huge amount of chat history going out along with my RAG data, system prompt, etc. Sometimes it truncates things to fit under the limit, but other times it doesn't seem to, and it will either give me the "initial token limit" error or go through to OpenAI and error there. Is there a better, more reliable way to make sure the token limit doesn't overrun, as opposed to just "doing smaller/fewer chunks"?
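A minimal sketch of the usual approach, assuming an existing index: cap the chat history explicitly with a memory buffer rather than relying on the limit derived from model metadata. The 3000 figure is illustrative, leaving headroom for retrieved context and the system prompt.

Python
# Sketch: cap history explicitly; `index` is assumed to already exist.
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)  # illustrative
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)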
16 comments
Hey all, I'm trying to use a custom open-source LLM via LiteLLM. I can get it to work with basic sample code, but when I load it into the Chat Engine, everything loads fine, and then on a simple, brief chat with no system prompt I always get "ValueError: Initial token count exceeds token limit". The only code I changed from my working original was swapping OpenAI for: llm = LiteLLM(model="together_ai/mistralai/Mixtral-8x7B-Instruct-v0.1", temperature=0). Do I have to set the context length manually or something?
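One hedged possibility: if LlamaIndex can't infer the context window for a custom model, the memory limit it derives can come out wrong, so set the memory's token limit explicitly. A sketch, assuming an existing index; 30000 is an illustrative number for Mixtral's 32k window.

Python
# Sketch: set the history limit explicitly instead of relying on inferred
# model metadata. `index` is assumed to already exist.
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.llms.litellm import LiteLLM

llm = LiteLLM(model="together_ai/mistralai/Mixtral-8x7B-Instruct-v0.1", temperature=0)
memory = ChatMemoryBuffer.from_defaults(token_limit=30000)  # illustrative for a 32k window
chat_engine = index.as_chat_engine(chat_mode="context", llm=llm, memory=memory)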
10 comments
Did you ever get this working? I am trying to use LiteLLM according to the examples in the documentation and getting an error with "import litellm" in the base utils code.
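For what it's worth, in llama-index 0.10.x the LiteLLM wrapper and the underlying litellm library are separate installs, so "import litellm" fails until both are present. A minimal sketch of the assumed setup:

Python
# Assumed setup for the 0.10.x split packages (run the pip line in a shell):
#   pip install litellm llama-index-llms-litellm
from llama_index.llms.litellm import LiteLLM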
7 comments
I'm overrunning the token count after increasing my chunk size and adding more system prompts. Can I see an exact output of the request to OpenAI and total token count so I can troubleshoot?
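A minimal sketch using TokenCountingHandler, which records each LLM call's payload and token counts, assuming the handler is registered on the callback manager before the chat runs:

Python
# Sketch: inspect exactly what each LLM call sent and what it cost.
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler()
Settings.callback_manager = CallbackManager([token_counter])

# ...run the chat, then:
for event in token_counter.llm_token_counts:
    print(event.prompt_token_count, event.completion_token_count)
    print(event.prompt)  # the full prompt payload that went to OpenAI
print("total:", token_counter.total_llm_token_count)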
1 comment
Getting a weird error. For some reason, setting chunk_size and chunk_overlap directly on the service_context and passing it to VectorStoreIndex.from_documents() makes no difference; it always chunks the same way. So now I'm trying to pass a text splitter to force different chunking, using the code below, and I'm getting a weird error about providing both a text_splitter and a node_parser (ValueError: Cannot specify both text_splitter and node_parser) when I'm only specifying one (a possible workaround is sketched after the snippet):

Plain Text
# Imports assume the llama-index 0.10.x package layout
import tiktoken
from llama_index.core import ServiceContext
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

embed_model = OpenAIEmbedding(embed_batch_size=10)
llm = OpenAI(model="gpt-3.5-turbo-16k", temperature=0)

# Count tokens with the same encoding the target model uses
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo-16k").encode
)
callback_manager = CallbackManager([token_counter])

# Force smaller chunks than the default splitter would produce
text_splitter = SentenceSplitter(chunk_size=128, chunk_overlap=15)
service_context = ServiceContext.from_defaults(
    llm=llm,
    text_splitter=text_splitter,
    callback_manager=callback_manager,
    embed_model=embed_model,
)
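A hedged workaround sketch: in recent versions SentenceSplitter is itself a node parser, so passing it as node_parser instead of text_splitter sidesteps the "cannot specify both" check (same llm, callback_manager, and embed_model as above):

Python
# Hedged workaround: pass the splitter as node_parser rather than text_splitter.
service_context = ServiceContext.from_defaults(
    llm=llm,
    node_parser=text_splitter,
    callback_manager=callback_manager,
    embed_model=embed_model,
)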
10 comments
Hey all, I'm trying to wrap my head around how to manage simultaneous user chats in LlamaIndex and keep their chat histories and contexts separate. Is it as simple as instantiating a new chat_engine object every time there's a new chat from a different user? And how do you keep them apart in code, just keep an in-memory list of all the objects and call the chat function on the right one based on an ID or something? Is anyone aware of any examples of this? It seems like a core use case for any production setting: a single instance of a LlamaIndex program should be able to spin up and manage different chats at the same time and keep them separate. (One possible pattern is sketched below.)
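A minimal sketch of one common pattern, not the only one: share the index and key each user's history in a chat store, so every session gets its own memory. get_engine_for_user and user_id are hypothetical names.

Python
# Hypothetical pattern: one shared index, per-user memory keyed in a chat store.
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.storage.chat_store import SimpleChatStore

chat_store = SimpleChatStore()

def get_engine_for_user(index, user_id: str):
    memory = ChatMemoryBuffer.from_defaults(
        token_limit=3000,  # illustrative
        chat_store=chat_store,
        chat_store_key=user_id,  # isolates this user's history
    )
    return index.as_chat_engine(chat_mode="context", memory=memory)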
13 comments
Darthus · Run
Oddly, it ran once OK; then I changed which LLM I was using, ran it again, and started getting this.
33 comments
Are there any plans to support multi-modal models in the ContextChatEngine?
1 comment
Running into this issue on llama-index-core 0.10.21.post1 when trying to call .chat on my ContextChatEngine. Any ideas?

Plain Text
AttributeError                            Traceback (most recent call last)
Cell In[24], line 3
      1 global chat_engine
----> 3 response = chat_engine.chat("What do you know about Darthus?")

File ~\miniconda3\lib\site-packages\llama_index\core\callbacks\utils.py:41, in trace_method.<locals>.decorator.<locals>.wrapper(self, *args, **kwargs)
     39 callback_manager = cast(CallbackManager, callback_manager)
     40 with callback_manager.as_trace(trace_id):
---> 41     return func(self, *args, **kwargs)

File ~\miniconda3\lib\site-packages\llama_index\core\chat_engine\context.py:160, in ContextChatEngine.chat(self, message, chat_history)
    158 if chat_history is not None:
    159     self._memory.set(chat_history)
--> 160 self._memory.put(ChatMessage(content=message, role="user"))
    162 context_str_template, nodes = self._generate_context(message)
    163 prefix_messages = self._get_prefix_messages_with_context(context_str_template)

File ~\miniconda3\lib\site-packages\llama_index\core\memory\chat_memory_buffer.py:140, in ChatMemoryBuffer.put(self, message)
    138 def put(self, message: ChatMessage) -> None:
    139     """Put chat history."""
--> 140     self.chat_store.add_message(self.chat_store_key, message)

File ~\miniconda3\lib\site-packages\llama_index\core\storage\chat_store\simple_chat_store.py:34, in SimpleChatStore.add_message(self, key, message, idx)
     32 """Add a message for a key."""
     33 if idx is None:
---> 34     self.store.setdefault(key, []).append(message)
     35 else:
     36     self.store.setdefault(key, []).insert(idx, message)

AttributeError: 'ContextChatEngine' object has no attribute 'append'
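The traceback shows the chat store holding the engine itself where a message list should be, which suggests the memory or chat store got wired to the wrong object somewhere. A minimal sketch of the expected wiring, with retriever and llm assumed to already exist:

Python
# Sketch of the expected wiring; `retriever` and `llm` assumed to exist.
from llama_index.core.chat_engine import ContextChatEngine
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)  # illustrative limit
chat_engine = ContextChatEngine.from_defaults(
    retriever=retriever,
    llm=llm,
    memory=memory,  # the engine takes the memory, never the other way around
)
response = chat_engine.chat("What do you know about Darthus?")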
2 comments
Darthus · Update
I just updated llama-index, and am now getting this error on existing code: "ImportError: cannot import name 'ChatMessage' from 'llama_index.core.llms' (unknown location)"
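A hedged guess: the "(unknown location)" usually points at mismatched or half-upgraded llama-index packages rather than a renamed class. A sketch of the assumed recovery:

Python
# Assumed recovery steps (run the pip lines in a shell, then retry):
#   pip uninstall llama-index llama-index-core
#   pip install -U llama-index
from llama_index.core.llms import ChatMessage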
4 comments
Has anyone experimented with Gemini? I've gotten it to work, but either the "embedding-001" embeddings are trash or something is off. I search for something like "What is the name of the game?" and it returns 20 documents that basically have no text (like pages with images in the source) or just small snippets, even though the game name, of course, is mentioned all over the place; with OpenAI, the same search returns fine.
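A minimal sketch of one mitigation, assuming image-only pages are polluting retrieval: drop near-empty page documents before embedding. The 50-character threshold is an illustrative guess.

Python
# Hypothetical mitigation: filter out near-empty page documents first.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder path
docs = [d for d in docs if len(d.text.strip()) > 50]  # threshold is a guess
index = VectorStoreIndex.from_documents(docs)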
3 comments
Darthus · System prompts
Hey all, I'm trying to use Gemini via the Vertex API with LlamaIndex and getting this error, even though I've commented out all system prompt assignments in the code: "Gemini model don't support system messages". https://docs.llamaindex.ai/en/stable/examples/llm/vertex.html Any thoughts?
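A hedged workaround sketch: the context chat engine injects retrieved context as a system message, which would trigger this error even with your own system prompts commented out; condense_question mode sends the templated query as a user message instead. index and llm are assumed to already exist.

Python
# Hedged workaround: avoid system-role messages entirely.
# `index` and `llm` are assumed to already exist.
chat_engine = index.as_chat_engine(chat_mode="condense_question", llm=llm)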
5 comments
I'm looking through the nodes coming from my PDF documents. It seems like PDFs loaded via SimpleDirectoryReader always get split into separate Document objects by page, is that right? So when they get further chunked into nodes, is it the case that a node will never span multiple pages? If so, that seems to limit the flexibility of the nodes themselves, for example to capture an entire concept in one node if that concept happens to span pages.
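A minimal sketch of one way around the page boundary, if needed: merge the per-page documents into a single Document before chunking. Page-number metadata is lost in this naive version.

Python
# Hypothetical approach: collapse pages so chunks can span page breaks.
from llama_index.core import Document, SimpleDirectoryReader, VectorStoreIndex

pages = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder
merged = Document(text="\n".join(p.text for p in pages))
index = VectorStoreIndex.from_documents([merged])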
3 comments