SeaBerg
Offline, last seen 4 weeks ago
Joined September 25, 2024
Hi everyone. I'm working on taking my RAG app to production and want to convert it to async. My pipeline is two query engine tools called by a RouterQueryEngine.
One of the tools is a summary_index.as_query_engine, and I'm using use_async=True in it. The other tool is a QueryEngineTool with query_engine=vector_query_engine, where vector_query_engine is a RetrieverQueryEngine with a VectorIndexRetriever.
Both query engines include a node_postprocessor that uses CohereRerank.

I'm unsure whether the QueryEngineTool part will support async. I'm also not sure whether the RouterQueryEngine itself needs to be set up for async, or whether only the tools within it need to be async. It's really difficult to find clear async instructions for all of the different functions and query engines.

Any tips or info would be appreciated!
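To make the question concrete, here is a minimal sketch of the general pattern, using only asyncio. FakeEngine and FakeRouter are hypothetical stand-ins, not LlamaIndex classes, and the keyword-based selection rule is a toy replacement for an LLM selector. The point it illustrates: the chain stays non-blocking only if every layer (router and tools) exposes an awaitable path. Whether RouterQueryEngine and QueryEngineTool do so should be checked against the installed LlamaIndex version.

```python
import asyncio


class FakeEngine:
    """Hypothetical stand-in for a query engine with an async entry point."""

    def __init__(self, name: str, delay: float) -> None:
        self.name = name
        self.delay = delay

    async def aquery(self, question: str) -> str:
        # Simulate non-blocking I/O such as an LLM or reranker call.
        await asyncio.sleep(self.delay)
        return f"{self.name}: {question}"


class FakeRouter:
    """Hypothetical router: picks one engine, then awaits its async method."""

    def __init__(self, engines: dict) -> None:
        self.engines = engines

    async def aquery(self, question: str) -> str:
        # Toy selection rule standing in for an LLM-based selector.
        key = "summary" if "summar" in question.lower() else "vector"
        return await self.engines[key].aquery(question)


async def main() -> list:
    router = FakeRouter(
        {
            "summary": FakeEngine("summary", 0.05),
            "vector": FakeEngine("vector", 0.05),
        }
    )
    # Both queries overlap in time because every layer is awaitable.
    return await asyncio.gather(
        router.aquery("Summarize the report"),
        router.aquery("What is the refund policy?"),
    )


results = asyncio.run(main())
```

If any layer only offered a blocking method, the event loop would stall for the duration of that call, which is why each component in the pipeline matters.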
6 comments
Has anyone gotten Chainlit to work after the LlamaIndex v0.10 upgrade? I see there was a Chainlit repo update that was supposed to transition from service_context to Settings, but I haven't been able to find any updated examples. I have spent hours on it with no success so far. It would be great if there were a functioning example somewhere. I'm about to give up and move on to something else that might have more up-to-date examples for the newer LlamaIndex codebase.
8 comments
Has anyone tried a LlamaIndex query engine that includes a function call with o1 yet? I just saw the API doesn’t support tool usage. I am planning to try my sub-question query engine pipeline if my enterprise API token has access to o1. Hopefully will know tomorrow.
1 comment
I have been given access to the Cohere reranker through my company's Azure AI Studio. I see that I can do inference in LlamaIndex with LLMs in Azure, but can I use a reranking model in Azure as a node postprocessor, as I would with 'from llama_index.postprocessor.cohere_rerank import CohereRerank'? A GitHub support inquiry mentioned using 'from llama_index.core.postprocessor import AzureRerank', but it doesn't work. Maybe it was an LLM hallucination...
7 comments
I'm using SubQuestionQueryEngine.from_defaults. Is it possible to stream the final response from the LLM? I was hoping to reduce the apparent latency by streaming, but haven't figured out how to do it yet.
1 comment
@kapa.ai please tell me about query response modes that are available:

  • refine
  • compact
  • tree_summarize
  • accumulate
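As a rough illustration of how these four modes differ, here is a simplified sketch in plain Python. It is not the LlamaIndex implementation: the prompts, the character-based packing budget (max_chars), the fanout parameter, and the make_counting_llm helper are all assumptions made for the example. The llm argument is any function that maps a prompt string to a response string.

```python
def refine(llm, query, chunks):
    """One call per chunk; each call revises the previous answer."""
    answer = ""
    for chunk in chunks:
        answer = llm(f"Q: {query}\nPrevious: {answer}\nContext: {chunk}")
    return answer


def compact(llm, query, chunks, max_chars):
    """Like refine, but first packs chunks together up to a size budget."""
    packed, current = [], ""
    for chunk in chunks:
        if current and len(current) + len(chunk) + 1 > max_chars:
            packed.append(current)
            current = chunk
        else:
            current = f"{current} {chunk}".strip()
    if current:
        packed.append(current)
    return refine(llm, query, packed)


def accumulate(llm, query, chunks):
    """One independent call per chunk; the answers are concatenated."""
    return "\n".join(llm(f"Q: {query}\nContext: {c}") for c in chunks)


def tree_summarize(llm, query, chunks, fanout=2):
    """Answer over groups of chunks, then over those answers, until one is left.

    fanout must be at least 2, otherwise the tree never shrinks.
    """
    level = list(chunks)
    while True:
        level = [
            llm(f"Q: {query}\nContext: {' '.join(level[i:i + fanout])}")
            for i in range(0, len(level), fanout)
        ]
        if len(level) <= 1:
            return level[0] if level else ""


def make_counting_llm():
    """Hypothetical stub LLM that records every prompt it receives."""
    calls = []

    def llm(prompt):
        calls.append(prompt)
        return f"answer{len(calls)}"

    return llm, calls


chunks = ["aaaa", "bbbb", "cccc", "dddd"]
```

With the four toy chunks above, refine and accumulate each make four calls, compact with max_chars=10 makes two, and tree_summarize with the default fanout makes three. The trade-off is between the number of LLM calls and how much context each call sees.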
3 comments
Is anyone aware of a convenient way to manually edit the Documents object? I’m still getting some errors in pdf extraction, and it would be nice to do a few edits prior to running it through the node parser.
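One possible approach, sketched here with a hypothetical Doc dataclass standing in for the real document class: run each document's text through a small cleanup function and build new documents from the result before handing them to the node parser. The specific fixes (ligatures, words hyphenated across line breaks, repeated spaces) are assumed examples of common PDF extraction errors, not ones taken from the question.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a loaded document; not the LlamaIndex class."""

    text: str


# Typographic ligatures that PDF extraction often leaves in place.
LIGATURES = {"\ufb01": "fi", "\ufb02": "fl"}


def clean_text(text: str) -> str:
    for ligature, plain in LIGATURES.items():
        text = text.replace(ligature, plain)
    # Rejoin words split across a line break: "extrac-\ntion" -> "extraction".
    # Note: this also joins genuine hyphenated compounds broken at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs, leaving newlines alone.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def clean_docs(docs):
    # Build new objects rather than mutating the originals.
    return [replace(doc, text=clean_text(doc.text)) for doc in docs]


docs = [Doc("PDF extrac-\ntion  often splits \ufb01les.")]
cleaned = clean_docs(docs)
```

For one-off manual fixes, the same idea applies: read the text out, edit it, and write it back before parsing.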
3 comments
Anyone know how to set llm to gpt-4o?
I updated the llm package: Successfully installed llama-index-llms-openai-0.1.21

llm = OpenAI(model="gpt-4o")
node_parser = MarkdownElementNodeParser(llm=llm)

ValueError: Unknown model 'gpt-4o'. Please provide a valid OpenAI model name in: gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4-1106-vision-preview, gpt-4-turbo-2024-04-09, gpt-4-turbo, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-0125, gpt-35-turbo-1106, gpt-35-turbo-0613, gpt-35-turbo-16k-0613
2 comments
I'm trying to reload a finetuned embedding model using one of the llama-index examples. https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/finetuning/embeddings/finetune_embedding_adapter.ipynb
Running into an issue with an import, so I'm assuming the source has changed or library name has changed.

Anyone know the current way to accomplish this:
from llama_index.core.embeddings import LinearAdapterEmbeddingModel
2 comments
Any tips for getting 'from llama_index.core.llms.generic_utils import messages_to_prompt' to work?
I'm getting this error: ModuleNotFoundError: No module named 'llama_index.core.llms.generic_utils'.

I ran pip install -U llama-index-llms-openai llama-index-embeddings-openai llama-index-core, which I saw recommended for someone else, but it didn't help. I can't find which package I need to install for this import. Thanks!
1 comment