Find answers from the community

Saltuk
Isn't that kinda stupid^^ Shouldn't it only remove user/assistant/tool messages?
5 comments
One question regarding refreshing an existing vector store.
As far as I can see, the normal VectorStoreIndex allows passing the show_progress flag, which shows the nice tqdm-like bar when generating the chunks.
I don't see such an option for the refresh_index method. Is there a way to show progress there? It would be especially interesting to see how many chunks, and what kinds of chunks, are being modified/refreshed.
Currently I'm executing it like this and unfortunately I don't see any indication:

Plain Text
storage_context = StorageContext.from_defaults(persist_dir=storage_dir)
index: VectorStoreIndex = load_index_from_storage(storage_context=storage_context)
index.refresh_ref_docs(documents)
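One workaround, since refresh_ref_docs does not expose a show_progress flag, is to refresh documents one at a time and report progress yourself (refresh_ref_docs returns one boolean per document indicating whether it was refreshed). A minimal, stdlib-only sketch; `FakeIndex`, `refresh_with_progress`, and the tuple-shaped documents are stand-ins for illustration, not real llama_index APIs — with a real VectorStoreIndex you would call `index.refresh_ref_docs([doc])` directly:

```python
# Sketch: per-document refresh loop with manual progress reporting.
# FakeIndex is a hypothetical stand-in for a loaded VectorStoreIndex;
# it mimics the hash-comparison that decides whether a doc is refreshed.

class FakeIndex:
    def __init__(self, known_hashes):
        self.known_hashes = known_hashes  # doc_id -> content hash

    def refresh_ref_docs(self, docs):
        # Returns a list of booleans, one per document, like the real method.
        results = []
        for doc_id, content in docs:
            changed = self.known_hashes.get(doc_id) != hash(content)
            if changed:
                self.known_hashes[doc_id] = hash(content)
            results.append(changed)
        return results


def refresh_with_progress(index, documents):
    """Refresh one document at a time so progress can be printed."""
    refreshed = 0
    for i, doc in enumerate(documents, 1):
        (changed,) = index.refresh_ref_docs([doc])
        refreshed += changed
        print(f"[{i}/{len(documents)}] refreshed={changed}")
    return refreshed


index = FakeIndex({"a": hash("old")})
docs = [("a", "new text"), ("b", "brand new"), ("c", "another")]
print(refresh_with_progress(index, docs))  # all three are new or changed -> 3
```

A second call with the same documents would report 0 refreshed, since the stored hashes now match.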
3 comments
Saltuk · Tool calls

Hey, so how would I manage to estimate costs for my RAG pipeline if I use the OpenAIAgentRunner?
Are tool calls billed differently than completion calls? And also, how do I have to calculate the context?
What I mean here is that, for example, if I have a chunk size of 512 and retrieve 30 nodes, do I have to calculate 512*30 + len(tokenize(message)) as the average token count per message?
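The arithmetic in the question can be sketched directly. This is a back-of-envelope estimate only: `estimate_prompt_tokens` is a hypothetical helper, and the 4-characters-per-token ratio is a rough heuristic — for billing-grade numbers you would run the message through a real tokenizer (e.g. tiktoken for OpenAI models):

```python
# Rough per-message prompt-token estimate for a RAG call:
# retrieved context (chunk_size * top_k nodes) plus the user message itself.
# The 4-chars-per-token ratio is a common rule of thumb, not an exact count.

def estimate_prompt_tokens(chunk_size: int, num_nodes: int, message: str) -> int:
    message_tokens = len(message) // 4  # crude heuristic; use a real tokenizer for billing
    return chunk_size * num_nodes + message_tokens


# The example from the question: 512-token chunks, 30 retrieved nodes.
print(estimate_prompt_tokens(512, 30, ""))  # 15360 context tokens before the message
```

Note this counts only the prompt side; completion tokens (and any intermediate agent/tool-call turns, which are billed as ordinary completions) come on top.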
3 comments
I set similarity_top_k to 100 but only get back 20-22 nodes. Why is that? I'm using the AgentRunner and a basic knowledge tool.
15 comments
How can I get all nodes from a VectorStore previously saved on disk?
2 comments
Is it possible to instruct LlamaParse not to treat the top header as a headline? I want to use the MarkdownParser, but it fails horribly because of these headers in my document.
24 comments
I implemented LlamaParse like this, but for some reason it always reparses the document. I would have expected the document to only be parsed once. @Logan M Can you maybe tell me what I am doing wrong here? It tries to reparse even before the 48h breakpoint.

Plain Text
import os  # needed for the path handling below; llama_parse_parser,
           # fetch_file_list, download_file, logger and FileLoaderConfig
           # are defined elsewhere in the project

def get_file_documents(config: FileLoaderConfig):
    parser = llama_parse_parser()
    files_info = fetch_file_list()
    logger.info(
        f"List of files ready for download. Number of files to download: {len(files_info)}"
    )

    if config.use_llama_parse:
        file_paths = []
        for file_info in files_info:
            resource_url = file_info["resourceURL"]
            file_name = file_info["fileName"]
            file_path = os.path.join(config.data_dir, file_name)
            if not os.path.exists(file_path):
                download_file(resource_url, file_path)
                logger.info(
                    f"Successfully downloaded file: {file_name} and saved it on the server."
                )
            file_paths.append(file_path)

        documents = []
        for file_number, file_path in enumerate(file_paths, 1):
            file_name = os.path.basename(file_path)
            json_representation = parser.get_json_result(file_path)
            document = parser.load_data(
                file_path=file_path,
                extra_info={
                    "file_name": file_name,
                    "file_number": file_number,
                    "pages": json_representation[0]["pages"],
                },
            )
            # load_data returns a list of Document objects, so extend
            # instead of append to avoid building a nested list
            documents.extend(document)
        return documents
5 comments
Saltuk · Finetune

AgentRunner.from_llm has a bug: in is_function_calling_model you are not testing for finetuned models, which leads to the ReActAgent being used automatically even if your finetuned model is an OpenAI one. @Logan M
1 comment
How can I use the finetuning engine in a non-blocking way? I get an error when I try to finetune and resolve the engine. Is there a way to just read existing finetuning engines based on your OpenAI key?
2 comments
Saltuk · Context

But I don't really understand the event pipeline in detail: where the event handlers are registered and how exactly the dispatcher works. So maybe someone who actually knows what is going on there could have a look...
13 comments
I wanted to use my knowledge base, which is why I'm calling my own chat_engine function. I just refactored the code to use a query engine instead of an Agent-based approach for the finetuning_events. That should work, but it's still weird that it fails for such a reason. I think some fixing needs to be done to the ChatCompletionMessageToolCall class?
22 comments
Why does LlamaParse not support JSON mode anymore?
3 comments
Hi, I don't know if I'm in the right place here, but I get an error that my context is exceeding the maximum context window:
5 comments
What is the advantage of the OpenAIAgent over a ContextChatEngine (besides function calling)? If I just want to perform plain RAG, i.e. only ask questions against a set of data, does it make sense to use the OpenAIAgent?
I added Langfuse for tracing, and what I can see is that running the OpenAIAgent with a single tool for querying the knowledge base has around 2x the latency of the pure ContextChatEngine.
13 comments
Currently I'm not using any vector DB; I use the storage context to just save the vector store as JSON (which is the LlamaIndex default, I think). I don't know how to compare the information in default_vector_store.json with the new set of Documents I'm trying to embed. I had hoped that VectorStoreIndex.from_documents would give me an API for this, but unfortunately I don't see one.
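For comparison's sake, refresh_ref_docs does this kind of diff internally by comparing per-document hashes kept in the index (the stored entries are exposed via index.ref_doc_info, if I recall correctly). A stdlib-only sketch of the same hash-diff idea, where `doc_hash`, `diff_documents`, and the plain dict standing in for the persisted store are all hypothetical illustrations, not llama_index APIs:

```python
import hashlib

# Sketch of the hash comparison that decides which documents need re-embedding.
# `stored` stands in for the doc_id -> hash mapping a persisted index keeps.

def doc_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def diff_documents(stored: dict, new_docs: dict):
    """Split incoming docs into added / changed / unchanged vs. the stored hashes."""
    added = [d for d in new_docs if d not in stored]
    changed = [d for d in new_docs if d in stored and stored[d] != doc_hash(new_docs[d])]
    unchanged = [d for d in new_docs if d in stored and stored[d] == doc_hash(new_docs[d])]
    return added, changed, unchanged

stored = {"doc1": doc_hash("hello"), "doc2": doc_hash("world")}
new_docs = {"doc1": "hello", "doc2": "world!", "doc3": "fresh"}
print(diff_documents(stored, new_docs))  # (['doc3'], ['doc2'], ['doc1'])
```

The added and changed buckets are the ones that would need (re-)embedding; the unchanged bucket can be skipped.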
7 comments
Hi guys, I get this error when I try to run poetry run generate. Any idea what I can do about it?

Plain Text

2 comments
Saltuk · Dispatch

What can I do about this? Why does it happen?
3 comments
Hey, do I need a GPU to use the FlagEmbeddingReranker (which uses the BAAI models)? I tried to execute it but got an error... If yes, what kind of GPU do I need at minimum? Thanks for any input.
2 comments
Saltuk · Hey,

I built a small chatbot application for a university project with LlamaIndex and it went pretty well. Thanks for contributing and building LlamaIndex ❤️
I have one question, however: what can I as a developer do about the response times of the API? I use the Vercel chat API for the frontend, and the loading times can be 20-30 seconds, which a lot of users complained about.
11 comments