Find answers from the community

I am having a problem where the LLM returns "I am handing off to AgentXYZ" as a message instead of actually sending empty "content" with tool calls. Is there a way to instruct the LLM to send the tool call instead of the message? I am using a multi-agent workflow.
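
In case it helps, what usually makes a difference is using a model with native function calling and telling the agent explicitly in its system prompt to call the handoff tool rather than describe the handoff. A minimal sketch, assuming a recent llama-index version with AgentWorkflow-style agents and a hypothetical agent name:
Plain Text
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

# Hypothetical triage agent; the key points are a function-calling model and a
# system prompt that forbids announcing the handoff in plain text.
triage_agent = FunctionAgent(
    name="TriageAgent",
    description="Routes questions to the right specialist",
    system_prompt=(
        "You are a triage agent. To delegate, call the handoff tool with the "
        "target agent's name. Never announce a handoff in plain text."
    ),
    llm=OpenAI(model="gpt-4o"),  # a model with native tool calling
    can_handoff_to=["AgentXYZ"],
)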
1 comment
mxtt v4

Crypto

You guys aren't planning on making a coin on Solana?
2 comments
Uh, it's not just me, right? LlamaCloud is down (or LlamaParse at least).
1 comment
Hello there, I am looking for a way to query a multimodal LLM continuously while retaining memory, much like simulating a user giving ChatGPT an image with a first prompt and then follow-up prompts asking questions about the image. Is there a cookbook for this in LlamaIndex?
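
Not sure about a dedicated cookbook, but here is a minimal sketch of the pattern, assuming a recent llama-index version where ChatMessage supports content blocks and an OpenAI multimodal model (the image path is a placeholder):
Plain Text
from llama_index.core.llms import ChatMessage, ImageBlock, TextBlock
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o")  # any multimodal chat model

# Keep the full history yourself so follow-up questions retain the image context
history = [
    ChatMessage(
        role="user",
        blocks=[
            ImageBlock(path="chart.png"),  # hypothetical local image
            TextBlock(text="What does this chart show?"),
        ],
    )
]
response = llm.chat(history)
history.append(response.message)

# Follow-up question about the same image
history.append(ChatMessage(role="user", content="Which month had the highest value?"))
response = llm.chat(history)
print(response.message.content)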
4 comments
Ariel

Checkpoint

I'm currently developing an agent workflow with human-in-the-loop interaction and function calling. The workflow works great if the user stays in the session to complete it. I've tried both context serialization and checkpoints in order to persist context state, with no success. I save the context/checkpoint after each iteration and load it back when starting the workflow, as suggested in the documentation. I think the problem is with tool calling: right after loading the checkpoint and adding the new user input, the agent gets stuck "thinking", as if it didn't know what step comes next.
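
For reference, the pattern I'd expect to work is serializing the Context between sessions and passing it back in on resume; a minimal sketch, assuming an AgentWorkflow-style workflow and the JsonSerializer (swap in JsonPickleSerializer if your state isn't plain-JSON serializable):
Plain Text
from llama_index.core.workflow import Context, JsonSerializer

# After the run pauses for human input, persist the context
ctx_dict = handler.ctx.to_dict(serializer=JsonSerializer())
# ... store ctx_dict (e.g. as JSON in a database) between sessions ...

# Later, restore it and resume with the new user input
restored_ctx = Context.from_dict(workflow, ctx_dict, serializer=JsonSerializer())
handler = workflow.run(user_msg=new_user_input, ctx=restored_ctx)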
4 comments
vLLM Structured Outputs.

Hi, I'm trying to do the same thing as this person (Issue #17677 on GitHub) but I'm running into errors. If I do sllm.complete(prompt) or sllm.chat(ChatMessage[]), I get 'tool_choice must either be a named tool, "auto" or "none".'

If I put tool_choice = auto or none, I get "Expected at least one tool call, but got 0 tool calls."

I copied the code from the documentation as well as the version recommended in the GitHub issue. What could be the problem?

I also tried it with is_function_calling_model=True and False.
5 comments
Hello there!
I'm experiencing an issue: I want to retrieve chunks in the form of ChatResponse objects from an agent.
I did the following:
Plain Text
response_generator = self.agent.stream_chat(message=messages[-1].content, chat_history=messages[:-1]).chat_stream
for token in response_generator:
    yield token

But I'm getting:
Plain Text
ValueError: generator already executing

When using response_gen instead of chat_stream it works flawlessly. However, I really need those ChatResponse objects.
9 comments
I tried importing the QdrantVectorStore like this: from llama_index.vector_stores.qdrant import QdrantVectorStore

but it says it can't be resolved. It's written exactly like that in the docs.
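
That import usually fails when the integration package isn't installed; since v0.10 the vector store integrations live in separate packages. If that's the case here, installing it should resolve it:
Plain Text
pip install llama-index-vector-stores-qdrant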
4 comments
Hi, how can I increase the speed of VectorStoreIndex.from_documents()? I have a single JSON file (~140 MB) and it had spent 50 minutes generating the index on Google Colab before I interrupted it. Now I'm trying to run it locally and it has already been executing for 10 minutes. How long does it usually take to generate an index?
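
The time depends almost entirely on how many chunks get embedded, so a 140 MB JSON can legitimately take a long time. Two things that usually help are batching the embedding calls and running them asynchronously; a minimal sketch, assuming OpenAI embeddings (any embedding model with an embed_batch_size works similarly, and `documents` is whatever your reader returned):
Plain Text
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding

# Larger batches cut the number of embedding round trips
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    embed_batch_size=100,
)

index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True,
    use_async=True,  # embed chunks concurrently
)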
65 comments
Hi guys, my handoff is not working in my multi-agent workflow; it's not able to respond. The root agent works and hands off to another agent, but I never get a response from it. Any suggestions or help on how to fix it? There are no errors, but it never works. https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_multi/
3 comments
@Icksir, I frequently use the LlamaParse() method. When you receive a response from that method, it contains result (which has the text response), input_tokens, and output_tokens, which indicate how many tokens were used for the job ID. I hope this helps.
gamecode8

Tools

Hello, I'm trying out the AgentWorkflow feature and have noticed that the tool outputs aren't being captured.

To start, I've made a simple single-agent AgentWorkflow. While the responses are generated, the response.tool_calls list is empty, and when listening to the stream of events, I never see a ToolCallResult being emitted.

My goal is to get the source nodes used by the query engine tool. I'm not sure if it's an issue or if I have misunderstood something. I'm following https://docs.llamaindex.ai/en/stable/understanding/agent/multi_agents/

See basic example below.

topic_a_agent = FunctionAgent(
    name="topic_a_expert",
    description="Answers questions about topic A",
    system_prompt="You are a retrieval assistant.",
    tools=[QueryEngineTool(...)],  # tool arguments elided
    llm=OpenAI(model="gpt-4"),
)


workflow = AgentWorkflow(
    agents=[topic_a_agent],
    root_agent="topic_a_expert",
)

response = await workflow.run(user_msg="......")
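
For what it's worth, the tool results are usually visible on the event stream from the handler rather than only on the final response; a minimal sketch of listening for them, assuming the same workflow as above:
Plain Text
from llama_index.core.agent.workflow import ToolCallResult

handler = workflow.run(user_msg="...")
async for event in handler.stream_events():
    if isinstance(event, ToolCallResult):
        # tool_output is a ToolOutput; for a QueryEngineTool its raw_output
        # carries the query response, including source_nodes
        print(event.tool_name, event.tool_output)
response = await handler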
7 comments
OUTYUA

Changelog

Where can I find the release notes?
1 comment
Willi

Agent

Hey LlamaIndex community! 👋 I'm an experienced dev looking to build a project to learn the framework in-depth.

I want to create an agent that helps create and refine business offers using natural language. Here's what I want it to do:

  • Take an initial offer description and repeatedly prompt for missing info, if any (customer name, products, quantities)
  • Validate inputs (e.g., no negative quantities, verify products/customers exist)
  • Generate the offer in LaTeX format
  • Allow questions about the offer
  • Allow refinements (change or swap products, ...)
  • Support offer finalization as an end state
I have some architecture/design questions I'd love input on:

  1. What's the best way to handle conditional LaTeX output vs Q&A responses? (Evaluation?)
  2. For offer creation/updates - should these be handled via response generation or function calls?
  3. How to properly manage state transitions (no offer → draft → finalized)?
  4. Should validation be its own LLM workflow step?
  5. Best practice for product/customer data - prompt injection, vector store, or filtered function calls?
Any guidance would be much appreciated! 🙏
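
Not an answer to all five questions, but for the state-transition part (question 3) a custom Workflow with typed events is one way to make the no-offer → draft → finalized flow explicit; a minimal sketch with hypothetical event and step names:
Plain Text
from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

class DraftReady(Event):
    offer_text: str

class OfferWorkflow(Workflow):
    @step
    async def collect_and_validate(self, ev: StartEvent) -> DraftReady:
        # hypothetical: prompt for missing fields and validate quantities/customers here
        return DraftReady(offer_text="validated draft goes here")

    @step
    async def render_latex(self, ev: DraftReady) -> StopEvent:
        # hypothetical: turn the validated draft into LaTeX
        return StopEvent(result=ev.offer_text)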
1 comment
Hi! I am using LlamaIndex (Python) with MongoDB Atlas as my persistent storage. I had an initial successful implementation using MongoDBAtlasVectorSearch as my vector store, and RAG is working 🙂

But now I am exploring the document summary index, and I'm struggling to understand the concepts of the docstore and index store.
  1. Am I able to create a document summary index from my existing vector store?
  2. Does anyone have a copy of the data in their docstore, vector store, and index store? I would like to see the data to understand how they all relate to each other.
Any help pointing me in the right direction is greatly appreciated! T.T
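
On question 1: as far as I know, a DocumentSummaryIndex is built from documents/nodes (it generates a summary per document), not from an existing vector store, so re-ingesting is the straightforward path. A minimal sketch that also makes it easy to inspect the docstore and index store on disk (`documents` is whatever your reader returned):
Plain Text
from llama_index.core import DocumentSummaryIndex, StorageContext

storage_context = StorageContext.from_defaults()  # in-memory docstore + index store
doc_summary_index = DocumentSummaryIndex.from_documents(
    documents,
    storage_context=storage_context,
)

# Writes docstore.json, index_store.json, etc., so you can see how they relate
storage_context.persist(persist_dir="./storage")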
8 comments
Hello @Logan M.
I hope you're doing well.

I have a question about using the parsing_instruction parameter in the LlamaParse method. Is it possible to reuse the same job_id to perform multiple parsing iterations by sending different parsing_instruction values? Or does each modification require creating a new job?

If this functionality is not currently supported, is there an alternative approach you would recommend to achieve similar behavior?

I appreciate your time and any insights you can provide.

Thanks in advance! Cheers! 😁
1 comment
Hi there! I observed that ChatSummaryMemoryBuffer crashes for Anthropic because it sends a single message with role system and no message with role user. Reading the code here: https://github.com/run-llama/llama_index/blob/90761a9f789bb7628d4faf40ae900d93f16065b7/llama-index-core/llama_index/core/memory/chat_summary_memory_buffer.py#L272 I'm seeing that it sends the context with role system but doesn't send the system prompt instruction to summarize the context. In the attached image I fixed it and it works perfectly for me. Is the current implementation bugged?
3 comments
Hey, I'm running a Chroma vector store and a very basic StorageContext and docstore setup. For some reason, whenever I try to peek into my ChromaDB after indexing a handful of sample documents, it returns an empty dict as follows: {'ids': [], 'embeddings': array([], dtype=float64), 'documents': [], 'uris': None, 'data': None, 'metadatas': [], 'included': [<IncludeEnum.embeddings: 'embeddings'>, <IncludeEnum.documents: 'documents'>, <IncludeEnum.metadatas: 'metadatas'>]}. When I print my docstore from index.docstore.docs, it states that I do have documents. I've been debugging it for a bit, playing around with persist paths and other configs, but I can't seem to find where the problem resides.
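
In case it helps narrow it down, the pattern I'd expect to work is pointing both indexing and the later peek at the same persistent collection; a minimal sketch (collection name and path are placeholders, `documents` comes from your reader):
Plain Text
import chromadb
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# Persistent client so the same collection is read back after indexing
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("docs")

vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
print(collection.count())  # should be > 0 once the nodes are embedded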
21 comments
Hey! I am currently facing issues with the use of memory inside a workflow, and I don't know where else to ask. I am creating a chatbot to chat with multiple documents, and my workflow now looks like the image attached to this message. The "ingest" path just creates the top agent to retrieve the documents, and the "ask" path is meant to query the LLM with the indexes.

My ask step looks like this, but the chat store just overwrites itself after the top agent call. It doesn't remember the chat history, and I don't know if I am doing something wrong or if I simply shouldn't use SimpleChatStore (I just wanted to do a proof of concept).

Any advice is welcome.

Plain Text
@step
async def ask(self, ev: StartEvent) -> StopEvent | None:

    obj_index = ev.get("obj_index")
    query = ev.get("query")
    chat_store = ev.get("chat_store")
    user = ev.get("user")
    if not obj_index or not query:
        return None
    
    user_file = f"./conversations/{user}.json"

    if not os.path.exists(user_file):
        chat_store = SimpleChatStore()
    else:
        chat_store = SimpleChatStore.from_persist_path(persist_path=user_file)
    
    chat_memory = ChatMemoryBuffer.from_defaults(
        token_limit=3000,
        chat_store=chat_store,
        chat_store_key=user,
    )

    top_agent = OpenAIAgent.from_tools(
        tool_retriever=obj_index.as_retriever(similarity_top_k=3),
        system_prompt=PROMPT,
        memory=chat_memory,
        verbose=True,
    )
    
    response = top_agent.query(query)
    chat_store.persist(persist_path=user_file)

    return StopEvent(result={"response": response, "source_nodes": response.source_nodes})
13 comments
Is it possible to use metadata filters to filter by dates? Looking at the schema, there doesn't seem to be a way to easily match years in a timestamp.
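
As far as I can tell there is no dedicated date type, but if the dates were stored as sortable strings or numbers at ingestion time, range filters can do it (support for GTE/LT depends on the vector store backend). A minimal sketch with an assumed "date" metadata key:
Plain Text
from llama_index.core.vector_stores import (
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)

# Assumes nodes were ingested with metadata like {"date": "2024-06-01"}
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="date", operator=FilterOperator.GTE, value="2024-01-01"),
        MetadataFilter(key="date", operator=FilterOperator.LT, value="2025-01-01"),
    ]
)
retriever = index.as_retriever(filters=filters)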
6 comments
Hello everyone! I’m working on improving my RAG pipeline by extracting images from my PDF files. While I haven’t encountered significant challenges in the ingestion and indexing phases, I’m a bit uncertain when it comes to retrieval.

Currently, retrieval is handled through tool calls, allowing the model to determine when additional information is needed to answer a user’s query. I’m using GPT-4o via OpenAI’s API, but since the output can only contain text and not images, I’m facing a limitation. My goal is to pass images—if present in the retrieved chunks—to enhance the quality of responses.

What would be the best way to overcome OpenAI’s API constraints? Has anyone else faced a similar issue? If so, how did you resolve it?

I've also attached an example of an API call I attempted, but it didn’t work as expected.
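
One approach is to take over the final response step yourself: retrieve via the tool as usual, then build a chat message whose content blocks include any images referenced by the retrieved chunks. A minimal sketch, assuming a recent llama-index version with content blocks and an "image_path" metadata key set at ingestion (both are assumptions; `query`, `retrieved_nodes`, and `llm` come from your existing pipeline):
Plain Text
from llama_index.core.llms import ChatMessage, ImageBlock, TextBlock

blocks = [TextBlock(text=query)]
for node in retrieved_nodes:  # nodes returned by the retriever / tool call
    image_path = node.metadata.get("image_path")  # assumed metadata key
    if image_path:
        blocks.append(ImageBlock(path=image_path))

response = llm.chat([ChatMessage(role="user", blocks=blocks)])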
6 comments
Hello!
I have a very quick question about retriever query engines, since I didn't find anything about this in the documentation.
Simply put, I have a list of files and multiple questions I want answered, considering these files as context. I know that I can use as_query_engine after indexing my content; however, I can only query one question at a time. Do you know if there is any built-in library support for parallel querying on the same vector store? The alternative would be to parallelize using Python processes, but it would be nice if something similar is already implemented in llama-index.
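
There is no batch query call that I know of, but the async methods let you fan out multiple questions over the same index without extra processes; a minimal sketch:
Plain Text
import asyncio

query_engine = index.as_query_engine()

async def answer_all(questions):
    # aquery runs the questions concurrently against the same vector store
    return await asyncio.gather(*(query_engine.aquery(q) for q in questions))

answers = asyncio.run(answer_all(["question 1", "question 2", "question 3"]))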
1 comment
Hi, I am using AzStorageBlobReader, and when reinitializing my RAG pipeline it ingests duplicate document chunks. I think it's because the files are put in a temporary directory: the doc_hash is not changing, whereas the doc_id seems to change. Any suggestions?
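
One way to make re-runs idempotent is to give the documents stable ids and let an IngestionPipeline with a docstore de-duplicate via upserts; a minimal sketch (the file_name metadata key and `vector_store` are assumptions about your setup):
Plain Text
from llama_index.core.ingestion import DocstoreStrategy, IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.storage.docstore import SimpleDocumentStore

# Stable ids (e.g. the blob name) so re-runs are recognised as the same document
for doc in documents:
    doc.id_ = doc.metadata.get("file_name", doc.id_)  # assumed metadata key

pipeline = IngestionPipeline(
    transformations=[SentenceSplitter()],
    docstore=SimpleDocumentStore(),
    docstore_strategy=DocstoreStrategy.UPSERTS,
    vector_store=vector_store,  # your existing vector store
)
pipeline.run(documents=documents)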
10 comments
Hello People,
I need your guidance.

Plain Text
from llama_index.core import Settings, VectorStoreIndex
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="model",
    api_key="Key",
    api_base="OpenAI Compatible endpoint",
    context_window=16000,
    is_chat_model=True,
    is_function_calling_model=False,
)
Settings.embed_model = llm

# Create index
index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True,
)

I'm facing an error with this code. Am I doing something wrong?
Plain Text
AssertionError                            Traceback (most recent call last)
Cell In[22], line 31
     22 documents = SimpleDirectoryReader("../data", required_exts=[".txt"]).load_data()
     23 #embed_model = llm
     24 
     25 
   (...)
     29 #     api_base="http://tentris-ml.cs.upb.de:8502/v1"
     30 # )
---> 31 Settings.embed_model = llm
     33 # Create index
     34 index = VectorStoreIndex.from_documents(
     35     documents, 
     36     show_progress=True)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\settings.py:74, in _Settings.embed_model(self, embed_model)
     71 @embed_model.setter
     72 def embed_model(self, embed_model: EmbedType) -> None:
     73     """Set the embedding model."""
---> 74     self._embed_model = resolve_embed_model(embed_model)

File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\embeddings\utils.py:136, in resolve_embed_model(embed_model, callback_manager)
    133     print("Embeddings have been explicitly disabled. Using MockEmbedding.")
    134     embed_model = MockEmbedding(embed_dim=1)
--> 136 assert isinstance(embed_model, BaseEmbedding)
    138 embed_model.callback_manager = callback_manager or Settings.callback_manager
    140 return embed_model

I'd appreciate a little of your time. Please help!
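
For what it's worth, the assertion fires because Settings.embed_model expects an embedding model (a BaseEmbedding), and an OpenAILike LLM is not one. A minimal sketch of the usual fix, assuming a local HuggingFace embedding model since the OpenAI-compatible endpoint may not expose an embeddings route:
Plain Text
# pip install llama-index-embeddings-huggingface
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.llm = llm  # the OpenAILike instance from above
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

index = VectorStoreIndex.from_documents(documents, show_progress=True)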
30 comments
cmosguy

O3

@Logan M how do I use o1 or o3 as an agent in the agent workflow system? Do you guys have an example?
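
Not official guidance, but in principle any function-calling-capable model string that your installed OpenAI integration accepts (o3-mini, for example) can drive a FunctionAgent; a minimal sketch with a hypothetical tool:
Plain Text
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

agent = FunctionAgent(
    tools=[my_tool],  # hypothetical tool defined elsewhere
    llm=OpenAI(model="o3-mini"),
    system_prompt="You are a helpful assistant.",
)
response = await agent.run("...")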
8 comments