Find answers from the community

Vish
Joined September 25, 2024
Does anyone know how llama index uses UnstructuredReader's elements to do RAG? i.e. how are table and image elements treated? Any pointers to code I can read would be fine too
4 comments
I've updated the PR for Perplexity's API integration as an LLM - https://github.com/run-llama/llama_index/pull/8734

Please let me know if you get a chance to take a look and lmk if any other items need to be done from my end to help make it ready to merge
2 comments
I recall there being a section on the documentation page showing us how to build RAG from the ground up, but I can't seem to find it now. Was it removed or am I just blind?
4 comments
Vish · RAG bot

does kapa.ai have access to the discord data here? if not, thoughts on making a RAG bot that can ingest all the questions and threads in this channel and answer questions?
6 comments
What do you think of adding some way for agents and tools to have a debug call or an error call that we can customize? In my case, I have a master OpenAI agent controlling sub-agents and some query engine tools. Sometimes the model hallucinates a function call or a tool name that doesn't exist. One way to counter this is to make tool descriptions better, but what I'm looking for is smooth execution, i.e. for the sub-agent to not trigger an exception that the tool name doesn't exist, and instead send a message back to the master agent saying it doesn't know the answer. Basically, have an exception handler that we can customize for these intermediate parts of agent execution.

If a sub-agent is able to return a cohesive answer to the master agent when a tool isn't found, my workflow won't just come to a halt when there is tool hallucination, if that makes sense.
7 comments
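The fallback described above can be sketched framework-agnostically: instead of raising when the model names a tool that doesn't exist, the dispatcher returns an ordinary message the parent agent can consume. The `dispatch_tool` helper and the `lookup_contract` tool below are hypothetical illustrations, not llama_index API.

```python
# Sketch of a customizable "unknown tool" handler (hypothetical helper,
# not llama_index API): return a graceful message instead of raising.

def dispatch_tool(tool_name, tools, **kwargs):
    """Call the named tool, or return a fallback message if it doesn't exist."""
    if tool_name not in tools:
        # No exception: the sub-agent reports "I don't know" upstream,
        # so the master agent's loop keeps running.
        return f"Tool '{tool_name}' is not available; I don't know the answer."
    return tools[tool_name](**kwargs)

tools = {"lookup_contract": lambda query: f"clauses matching {query!r}"}

print(dispatch_tool("lookup_contract", tools, query="termination"))
print(dispatch_tool("summarize_contarct", tools, query="termination"))  # hallucinated name
```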
Vish · Eval

Hey, does the OpenAI agent work with evaluation modules? Wondering how one would evaluate agents without triggering memory, and whether the batch eval runner works with agents
4 comments
I get JSON decoder errors (json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 73 (char 72)) when I try to load metadata of source nodes from a response via response.source_nodes[i].node.metadata. How can I fix this? .-.
2 comments
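That "Expecting ',' delimiter" error usually means the string being parsed isn't valid JSON, often because the metadata value was stringified with `str()` (Python repr, unescaped inner quotes) instead of `json.dumps()`. A minimal stdlib illustration of the failure and the fix:

```python
import json

# Metadata serialized naively: the unescaped inner quotes break the parser.
bad = '{"title": "Q3 "draft" report", "pages": 400}'
try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print("decode failed:", e.msg)  # Expecting ',' delimiter

# Serializing with json.dumps escapes the quotes, so round-tripping works:
good = json.dumps({"title": 'Q3 "draft" report', "pages": 400})
print(json.loads(good)["title"])
```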
Few questions about prompts in query engines and agents:
  • text_qa_template - if I'm using an OpenAI chat model, can I use ChatPromptTemplate.format_messages as input to this? The examples I see just use a single huge string for the prompt, with context_str and query_str variables. I would like to use system and user messages with custom variables (which I format and add before passing the formatted prompt messages to text_qa_template when creating a query engine and its tool), apart from the context and query variables.
  • How do I add prompts to agents? I see there are prefix_messages and system_prompt, and I'm not sure which to use. Does prefix_messages basically get added before every call?
15 comments
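The "fill my custom variables now, leave `{context_str}`/`{query_str}` for the engine later" part of the first bullet can be sketched with plain string formatting (llama_index's prompt templates support a similar partial-fill pattern, though exact class and method names depend on the version you're on). The `KeepMissing` helper and the template strings below are illustrative assumptions:

```python
# Sketch: pre-fill custom variables, leaving {context_str}/{query_str}
# untouched for the query engine to fill later (stdlib only).

class KeepMissing(dict):
    """format_map helper: leave unknown placeholders intact."""
    def __missing__(self, key):
        return "{" + key + "}"

system_tmpl = "You are a {persona} answering from contract excerpts."
user_tmpl = "Context:\n{context_str}\n---\nQuestion: {query_str}"

# Fill only the custom variable; the engine's variables survive.
system_msg = system_tmpl.format_map(KeepMissing(persona="legal analyst"))
print(system_msg)
print(user_tmpl.format_map(KeepMissing()))
```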
Invalid Request Error: 'X' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name'
91 comments
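This error means a tool/function name violates OpenAI's function-name pattern (letters, digits, underscores, hyphens, 1 to 64 characters). A hedged sketch of a sanitizer you could run over tool names before registering them (the `sanitize_tool_name` helper is a hypothetical example, not library API):

```python
import re

# OpenAI's function-name constraint, as quoted in the error above.
VALID = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")

def sanitize_tool_name(name: str) -> str:
    """Replace disallowed characters with underscores and clamp to 64 chars."""
    cleaned = re.sub(r"[^a-zA-Z0-9_-]", "_", name)[:64]
    return cleaned or "tool"

name = sanitize_tool_name("contract lookup (v2)")
print(name, bool(VALID.match(name)))
```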
Vish · Tools

If I have too many query engine tools and I want to still use OpenAIAgent, what options do I have that don't involve using FnRetrieverOpenAIAgent?
10 comments
How would I use a simple LangChain LLM with a simple chat engine and handle memory and token limits in llama index? All I'd like to do is make a chat interface using llama index and a custom LangChain LLM. Docs seem to suggest ChatMemoryBuffer, but I can't really understand how it works (does it summarize when the token limit is reached, or does it just remove the least recent messages?)
8 comments
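To my understanding, ChatMemoryBuffer does not summarize: once the history exceeds the token limit it drops the oldest messages until what remains fits. The policy can be sketched in plain Python (the whitespace-split token count below is a crude stand-in for a real tokenizer):

```python
# Sketch of a ChatMemoryBuffer-style policy: no summarization, just drop
# the least recent messages until the history fits the token budget.

def trim_history(messages, token_limit):
    def tokens(m):
        return len(m.split())  # crude stand-in for a real tokenizer
    history = list(messages)
    while history and sum(tokens(m) for m in history) > token_limit:
        history.pop(0)  # oldest message goes first
    return history

msgs = ["hello there", "how are you today", "fine thanks", "tell me about clause 7"]
print(trim_history(msgs, token_limit=10))
```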
I have an Azure deployment of ChatGPT that uses GPT-4. I want to use it with OpenAI Agent, but I keep getting errors. For one, if I try using LangChain's ChatOpenAI or AzureChatOpenAI model while specifying an engine, I get the error LLM must be of type "OpenAI". If I try using AzureOpenAI from llama_index.llms, I get must provide engine or deployment_id parameter to create a <class 'openai.api_resources.chat_completion.ChatCompletion'>. Does llama index have a ChatOpenAI, or is there any way to use ChatOpenAI from LangChain with OpenAIAgent?
11 comments
I have this use case: I create an index of documents, and I want to be able to support two types of queries. One queries the entire index of documents and gives a response via semantic search; the other filters the index to use the embeddings of only one document to answer. How can I accomplish this?
8 comments
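The usual pattern for this is to tag every node with a document id in its metadata and apply a metadata filter at query time (llama_index exposes this via metadata filters on the retriever/query engine, with names varying by version). The concept, stripped to stdlib with made-up node records:

```python
# Concept sketch: one node store, two query modes - global semantic search
# vs. search restricted to a single document via a metadata filter.
# The node dicts and scores below are fabricated for illustration.

nodes = [
    {"text": "termination clause ...", "doc_id": "contract_a", "score": 0.91},
    {"text": "payment terms ...", "doc_id": "contract_b", "score": 0.88},
    {"text": "liability cap ...", "doc_id": "contract_a", "score": 0.75},
]

def retrieve(nodes, top_k=2, doc_id=None):
    """Rank by score; optionally restrict the pool to one document first."""
    pool = [n for n in nodes if doc_id is None or n["doc_id"] == doc_id]
    return sorted(pool, key=lambda n: n["score"], reverse=True)[:top_k]

print([n["doc_id"] for n in retrieve(nodes)])                     # whole index
print([n["text"] for n in retrieve(nodes, doc_id="contract_a")])  # one document
```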
I see responses in the retrieve step of a ReAct chat engine. But when I attempt to see the sources via response.sources or response.source_nodes, I get empty lists. Any specific reason this happens, or insights?
10 comments
Vish · LLM calls

Can anyone here help me understand how many LLM calls happen when we use a single query engine query method? (Say a retriever query engine with default params.)
4 comments
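As a rough rule (hedged, since defaults vary by version): with compact-and-refine style synthesis, the retrieved chunks are packed into as few prompts as fit, one LLM call per packed prompt, and every prompt after the first is a refine call. The arithmetic:

```python
import math

def estimate_llm_calls(num_chunks, chunks_per_prompt):
    """Rough call count for compact-and-refine synthesis: one call per
    packed prompt; prompts after the first are refine calls."""
    if num_chunks == 0:
        return 0
    return math.ceil(num_chunks / chunks_per_prompt)

# top_k=2 chunks that both fit in one prompt -> a single LLM call
print(estimate_llm_calls(num_chunks=2, chunks_per_prompt=2))
# 6 chunks at 2 per prompt -> 1 initial call + 2 refine calls = 3
print(estimate_llm_calls(num_chunks=6, chunks_per_prompt=2))
```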
Vish · Graphrag

Off topic, but has anyone here managed to grab a git clone or fork of https://microsoft.github.io/graphrag/ ? ;-; Really wish I did it before they removed it, and their documentation is gone, it was such a goldmine
8 comments
Vish · Pprint

@Logan M Is there a way to pretty print a response object? It's too long horizontally to read.
3 comments
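llama_index ships a `pprint_response` helper in its pprint utils (module path varies by version), but the horizontal-overflow problem can also be solved with a few lines of stdlib `textwrap`. The `pretty` helper below is a hypothetical fallback, not library API:

```python
import textwrap

def pretty(text, width=80):
    """Wrap long single-line response text into readable lines."""
    return "\n".join(
        textwrap.fill(paragraph, width=width)
        for paragraph in text.split("\n")
    )

# Fabricated stand-in for str(response), which prints as one long line.
response_text = "The agreement may be terminated by either party " * 5
print(pretty(response_text, width=60))
```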
@Logan M I noticed the agent implementations have been refactored by @jerryjliu0, and when I try the usual OpenAI Agent implementation example from the docs, I can't get it to work with astream_chat: it just says STARTING TURN 1 and doesn't print anything beyond that. Am I doing something wrong, or is there a new way to interact with things?
14 comments
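One common gotcha with streaming chat calls is that nothing prints until you iterate the returned stream; in the llama_index versions I've seen, the streaming response exposes an async token generator you have to consume. The consumption pattern, sketched with a stand-in generator (`fake_token_gen` is fabricated, not the library's API):

```python
import asyncio

# Stand-in for the token stream an astream_chat-style call returns.
async def fake_token_gen():
    for tok in ["STARTING ", "TURN ", "1 ", "... ", "answer"]:
        await asyncio.sleep(0)  # yield control, like a real network stream
        yield tok

async def main():
    chunks = []
    # The key step: the stream produces nothing until you iterate it.
    async for tok in fake_token_gen():
        chunks.append(tok)
    return "".join(chunks)

print(asyncio.run(main()))
```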
@Logan M I was curious, there's a Multimodal RAG with Unstructured cookbook for LangChain that I found very useful - https://github.com/langchain-ai/langchain/blob/master/cookbook/Multi_modal_RAG.ipynb

Is there any such tutorial for llama index? If not, maybe I can try making one
1 comment
Vish · Optimum

@Logan M I'm trying to use the Optimum ONNX embedding for bge as shown in the documentation examples, on my MacBook with M1 Pro. I get this error when I try to test the model with get_text_embedding: InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid Feed Input Name:token_type_ids. No clue how to debug and fix this; I tried looking it up online but understood little to nothing about it.
8 comments
Vish · Return

@Logan M So the OpenAI Agent routes a query to a query engine tool via function calling, but once it gets the output, it also does some response synthesizing of its own. Is there any way to skip the response-synthesis step and just spit out the output of the tool? I understand something like RouterQueryEngine can help here, but I suppose the caveat is that it doesn't use OpenAI function calling directly like an agent does, and it normally doesn't have memory like an agent. What are your suggestions? Trying to cut down on latency as much as I can, with a system of agent -> sub-agents -> tools.

I tried avoiding the agent -> sub-agents hop using an ObjectIndex for tools too, but even that still has a response step after getting output from a tool, which seems to take a good amount of time.
10 comments
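Recent llama_index versions expose a return-direct flag on tool metadata intended for exactly this (availability depends on your version, so treat the name as an assumption). The policy it implements can be sketched in plain Python; `run_agent_step` and the tool dict are fabricated for illustration:

```python
# Sketch of a "return the tool output verbatim" policy: when a tool is
# marked return_direct, skip the agent's extra synthesis LLM call.

def run_agent_step(tool, tool_output, synthesize):
    if tool.get("return_direct"):
        return tool_output           # verbatim: no synthesis call, less latency
    return synthesize(tool_output)   # normal path: agent rewrites the output

tool = {"name": "contract_qa", "return_direct": True}
out = run_agent_step(tool, "Clause 7 caps liability at fees paid.",
                     synthesize=lambda s: "Summary: " + s)
print(out)
```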
@Logan M When I'm using multi-hop agent calls, response times increase for queries, especially when calling multiple tools across agents. Is there a caching solution I can use to help reduce response times?
8 comments
@Logan M I have this use case for extracting differences between 400-page legal contracts. A client asks for the most similar contract to one they specify, and the goal is to find differences at a clause level, since the contracts have broadly similar section structure. To first fetch the most similar documents, we would have to construct some structure around the chunked documents and establish a similarity comparison method: we can't just compare individual chunks, we would have to compare sets of chunks between two contract files, arranged hierarchically I suppose? Not sure how to do that comparison across large documents in one go. Have you encountered this before, and do you have any ideas from the documentation that I can explore straight away?
13 comments
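One way to frame the clause-level comparison: align sections of the two contracts by heading, then diff the clause text within each aligned pair. The stdlib `difflib` module can score and localize those differences; the two contract dicts below are fabricated toy data:

```python
import difflib

# Toy contracts, already split into sections keyed by heading.
contract_a = {
    "Termination": "Either party may terminate with 30 days notice.",
    "Liability": "Liability is capped at fees paid in the prior 12 months.",
}
contract_b = {
    "Termination": "Either party may terminate with 60 days notice.",
    "Liability": "Liability is capped at fees paid in the prior 12 months.",
}

def clause_diffs(a, b):
    """Compare sections present in both contracts; report the similarity
    ratio for each section whose text actually changed."""
    report = {}
    for section in sorted(set(a) & set(b)):
        ratio = difflib.SequenceMatcher(None, a[section], b[section]).ratio()
        if ratio < 1.0:
            report[section] = ratio
    return report

print(clause_diffs(contract_a, contract_b))
```

For real 400-page documents this per-section diff would sit on top of the retrieval step that first finds the most similar contract, so the expensive comparison only runs on one candidate pair.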
@Logan M I'm using an OpenAI Agent that has more sub-agents as tools, which in turn have a lot of query engines as tools. How would I go about disabling the refinement step for all of these? It adds to latency, and I would like to make sure I don't execute any refinement step for the master agent, sub-agents, and query engine tools.
1 comment
@Logan M Question: Can I use an ObjectIndex like a regular index? i.e. persist it in storage, and use load_index_from_storage?
16 comments