Find answers from the community

mewtoo
Offline, last seen 2 months ago
Joined September 25, 2024
mewtoo · Tool calls

Curious if anyone has experienced the same - I'm using a custom workflow agent, and I noticed when testing with models served in the OpenAI-compatible format (instead of OpenAI itself) that the step in the workflow that intercepts a streaming response when there is a tool call doesn't work as well as it does with OpenAI. OpenAI's responses seem to flag a tool call almost immediately when the stream starts, while some of the other models I'm testing with only surface it after the first few chunks
3 comments
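For comparison's sake, a minimal sketch of the interception pattern described above, using the raw openai client directly; the get_weather tool and the early break are illustrative assumptions, not the poster's actual workflow step:

```python
from openai import OpenAI

# Hypothetical tool definition, just to trigger tool-calling behavior.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Point base_url at the OpenAI-compatible server when testing other models.
client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # OpenAI usually populates tool_calls in the very first deltas; some
    # OpenAI-compatible servers only do so after several content chunks,
    # which matches the behavior described in the question.
    if delta.tool_calls:
        print("tool call detected:", delta.tool_calls[0].function.name)
        break
```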
Is there a good chat UI that people really like/would recommend that plugs in well with LlamaIndex? I think it'd be cool to have different chatbots based on metadata from ingestion
8 comments
I used the ingestion pipeline to store embeddings (with Postgres), and when I try to load the index using VectorStoreIndex.from_vector_store, the nodes are empty. Does anyone know how to resolve this?

I'm doing index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
7 comments
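A common culprit is a mismatch between the store the pipeline wrote to and the store being loaded (table name, embed dim), plus the fact that the in-memory docstore is expected to stay empty when nodes live in the vector store. A minimal sketch, with placeholder connection details:

```python
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore

# Placeholder connection details -- these must match exactly what the
# ingestion pipeline wrote to (same table, same embedding dimension).
vector_store = PGVectorStore.from_params(
    database="vectordb",
    host="localhost",
    port="5432",
    user="postgres",
    password="password",
    table_name="my_embeddings",
    embed_dim=1536,  # must match the embedding model used at ingest time
)

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

# The in-memory docstore is expected to be empty here -- the nodes live
# in Postgres. Retrieve through the index instead of inspecting docstore:
retriever = index.as_retriever(similarity_top_k=5)
print(len(retriever.retrieve("test query")))
```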
Does LlamaIndex support structured outputs with OpenAI (released 8/6), or have any plans to?
2 comments
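For reference, a minimal sketch of the OpenAI-side feature the question refers to, using the raw openai client with a Pydantic schema (this is not a confirmed LlamaIndex integration):

```python
from openai import OpenAI
from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "user", "content": "Alice and Bob meet for lunch on Friday."},
    ],
    response_format=CalendarEvent,  # enforced via a strict JSON schema
)

event = completion.choices[0].message.parsed
print(event.participants)
```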
I’m noticing that switching from a chat engine to an agent adds a ton of latency. Has anyone else experienced this?
3 comments
I am using an agent with 1 QueryEngineTool and 1 FunctionTool. It seems that the agent only checks the docs (with the QueryEngineTool) if I specifically say "check docs" in my question to the agent. Is there a way to enforce that the QueryEngineTool is used for every single chat, or is this fully prompt-based?
5 comments
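One hedged option, assuming an OpenAIAgent: its chat call accepts a tool_choice argument that can force a specific tool for that turn. The tool names and query engine below are placeholders:

```python
from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import FunctionTool, QueryEngineTool

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# `query_engine` is a placeholder for the existing docs query engine.
query_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="docs",
    description="Answers questions from the ingested documentation.",
)
fn_tool = FunctionTool.from_defaults(fn=multiply)

agent = OpenAIAgent.from_tools([query_tool, fn_tool])

# Force the docs tool on this turn instead of hoping the prompt steers it:
response = agent.chat("How do I configure X?", tool_choice="docs")
```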
What’s the best way to implement function calling with a chat engine and OpenAI? I’m using astream_chat with a chat engine
27 comments
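Chat engines generally don't take tools, so one common route is swapping in an agent, which keeps the same astream_chat interface. A minimal sketch, with a placeholder tool:

```python
import asyncio
from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import FunctionTool

def get_weather(city: str) -> str:
    """Return a canned weather report for a city (placeholder tool)."""
    return f"It is sunny in {city}."

agent = OpenAIAgent.from_tools([FunctionTool.from_defaults(fn=get_weather)])

async def main() -> None:
    # Same astream_chat call shape as a chat engine, but tool-aware:
    response = await agent.astream_chat("What's the weather in Paris?")
    async for token in response.async_response_gen():
        print(token, end="", flush=True)

asyncio.run(main())
```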
mewtoo · State

I'm building a backend (with FastAPI) on top of LlamaIndex and trying to understand, in general, how to handle multiple users at once. I'm using chat_engine. I am not using websockets, but will have some kind of chat id for each interaction.

What would be the best way to build this?
Option 1: Store a dictionary of chat engines (with chat ids as keys) and then, on each request, get the chat engine for that user.
Option 2: Create a SimpleChatStore and store the chat history for each interaction (keyed by chat id), and on each request initialize a chat engine with the history for that user.

I've looked through the documentation, and it still isn't clear to me what best practice is
5 comments
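A minimal sketch of Option 2, assuming FastAPI and an index built once at startup; the route shape and names are illustrative. The per-chat state lives in the chat store rather than in a dict of live engine objects, which also makes swapping in a persistent store (e.g. Redis) straightforward later:

```python
from fastapi import FastAPI
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.storage.chat_store import SimpleChatStore

app = FastAPI()
chat_store = SimpleChatStore()  # in-memory; swap for a persistent store later

@app.post("/chat/{chat_id}")
async def chat(chat_id: str, message: str):
    # One memory buffer per conversation, keyed by chat id.
    memory = ChatMemoryBuffer.from_defaults(
        chat_store=chat_store,
        chat_store_key=chat_id,
        token_limit=3000,
    )
    # The chat engine is rebuilt per request and stays stateless;
    # `index` is assumed to be built once at startup.
    chat_engine = index.as_chat_engine(memory=memory)
    response = await chat_engine.achat(message)
    return {"response": str(response)}
```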
I’m defining a chat engine and then passing in a query engine with similarity_top_k=10, but it still falls back to the default of 2 retrieved nodes. I am using pgvector
4 comments
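A minimal sketch of making sure the top_k actually reaches the retriever the chat engine uses; `index` is assumed to be the existing pgvector-backed index:

```python
from llama_index.core.chat_engine import CondenseQuestionChatEngine

# Explicitly build the query engine the chat engine will delegate to,
# so similarity_top_k is set on the retriever that actually runs:
query_engine = index.as_query_engine(similarity_top_k=10)
chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=query_engine,
)

# Equivalent shortcut -- as_chat_engine forwards retriever kwargs:
chat_engine = index.as_chat_engine(
    chat_mode="condense_question",
    similarity_top_k=10,
)
```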