The problem is with the `.complete` method. Let's address this problem:

First, make sure the Azure OpenAI integration package is installed:

```bash
pip install llama-index-llms-azure-openai
```

Then configure the `AzureOpenAI` LLM explicitly:

```python
from llama_index.llms.azure_openai import AzureOpenAI

llm = AzureOpenAI(
    engine="your-deployment-name",
    model="gpt-35-turbo-16k",  # or your specific model
    temperature=0.0,
    azure_endpoint="https://your-resource-name.openai.azure.com/",
    api_key="your-api-key",
    api_version="2023-07-01-preview",
)
```

Alternatively, you can set the credentials through environment variables:

```python
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://your-resource-name.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2023-07-01-preview"
```
After this, both the `.chat` and `.complete` methods should work without asking for an API key.

Now, regarding the difference between the `.complete` and `.chat` methods in the context of LlamaIndex and Azure OpenAI integration:

- `.complete` takes a single string prompt; `.chat` takes a list of `ChatMessage` objects, allowing for more structured conversations.
- `.complete` is better for single-turn interactions or text generation tasks; `.chat` is more suitable for multi-turn conversations or when you need to provide system instructions and maintain context.
- The `.complete` method is still available for models of type `chat-completions`. In these cases, your input text is converted to a message with `role="user"`.
- `.complete` typically returns a `CompletionResponse` object; `.chat` returns a `ChatResponse` object.

It's worth noting that both methods also have streaming variants, `stream_complete` and `stream_chat`, which allow for incremental generation of responses.
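To make the difference concrete, here is a minimal sketch using the `llm` configured above (the prompt strings are just placeholders):

```python
from llama_index.core.llms import ChatMessage

# .complete: a single string in, a CompletionResponse out
completion = llm.complete("Summarize what LlamaIndex does in one sentence.")
print(completion.text)

# .chat: a list of ChatMessage objects in, a ChatResponse out
messages = [
    ChatMessage(role="system", content="You are a concise assistant."),
    ChatMessage(role="user", content="Summarize what LlamaIndex does in one sentence."),
]
chat_response = llm.chat(messages)
print(chat_response.message.content)

# Streaming variants yield the response incrementally
for chunk in llm.stream_complete("Summarize what LlamaIndex does in one sentence."):
    print(chunk.delta, end="", flush=True)
```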
You mention that the `.chat` method works fine, but the `.complete` method is asking for an API key. This behavior is not typical, as both methods should use the same authentication mechanism. However, I can offer some insights and potential solutions:

1. Make sure the API key is set in your environment:

   ```python
   import os

   os.environ["OPENAI_API_KEY"] = "your-api-key"
   ```
2. Or pass the API key explicitly when constructing the LLM:

   ```python
   from llama_index.llms.azure_openai import AzureOpenAI

   llm = AzureOpenAI(
       engine="your-deployment-name",
       model="your-model-name",
       api_key="your-api-key",
       azure_endpoint="https://your-resource-name.openai.azure.com/",
   )
   ```
3. Remember that the `.complete` method is still available for models of type `chat-completions`. In these cases, your input text is converted to a message with `role="user"`. This conversion might be causing issues if not handled correctly.

4. Double-check that nothing in your code overrides the `.complete` method or its authentication mechanism.

5. As a workaround, try passing the API key directly in the `.complete` method call:

   ```python
   response = llm.complete("Your prompt here", api_key="your-api-key")
   ```

6. Alternatively, since `.chat` is working, you can use it instead of `.complete` (see the sketch after this list):

   ```python
   from llama_index.core.llms import ChatMessage

   response = llm.chat([ChatMessage(role="user", content="Your prompt here")])
   ```
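If other parts of your code expect a completion-style call but only `.chat` authenticates correctly, a small helper such as the hypothetical `complete_via_chat` below can bridge the gap (a sketch, not part of the LlamaIndex API):

```python
from llama_index.core.llms import ChatMessage


def complete_via_chat(llm, prompt: str) -> str:
    """Hypothetical helper: emulate `.complete` by sending the prompt as a user message."""
    response = llm.chat([ChatMessage(role="user", content=prompt)])
    return response.message.content


# Usage
text = complete_via_chat(llm, "Your prompt here")
print(text)
```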
The current agent setup looks like this:

```python
agent = ReActAgent.from_tools(
    await ctx.get("tools"),
    llm=llm_4o_2,
    verbose=True,
)
response = agent.chat(ev.question)
```
To keep the amount of context passed to the model manageable, you can limit the global context window:

```python
from llama_index.core import Settings

Settings.context_window = 3900  # or any other suitable value
```

You can also retrieve fewer chunks per query:

```python
query_engine = index.as_query_engine(similarity_top_k=2)
```

Another option is to parse your documents into smaller chunks, for example with the topic node parser:

```python
from llama_index.node_parser.topic import TopicNodeParser

parser = TopicNodeParser(chunk_size=1000)  # Adjust chunk_size as needed
```
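To tie these pieces together, here is a minimal sketch that assumes `documents` is your already-loaded document list; it builds the index from the parser's output so each retrieved node stays small:

```python
from llama_index.core import VectorStoreIndex

# Hypothetical wiring: parse `documents` into smaller topic-based nodes,
# then build the index from those nodes so each retrieved chunk stays small.
nodes = parser.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes)
query_engine = index.as_query_engine(similarity_top_k=2)
```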
Or combine these adjustments when creating the agent:

```python
from llama_index.core import Settings
from llama_index.core.agent import ReActAgent

# Limit the global context window
Settings.context_window = 100000  # Adjust as needed

# Create the agent with limited tools and adjusted configuration
agent = ReActAgent.from_tools(
    await ctx.get("tools"),
    llm=llm_4o_2,
    verbose=True,
    max_iterations=5,  # Cap the number of reasoning iterations; adjust as needed
)

# Use the agent with potentially chunked or limited input
response = agent.chat(ev.question)
```
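One more lever, offered as an assumption rather than something from your setup: cap the agent's conversation memory with LlamaIndex's `ChatMemoryBuffer`, so accumulated turns don't keep growing the prompt:

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.memory import ChatMemoryBuffer

# Cap the conversation history the agent keeps between turns.
memory = ChatMemoryBuffer.from_defaults(token_limit=4000)

agent = ReActAgent.from_tools(
    await ctx.get("tools"),
    llm=llm_4o_2,
    memory=memory,
    verbose=True,
)
```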
To check whether reranking is actually changing anything, print the retrieved nodes before and after reranking:

```python
# Assuming you have a retriever and a reranker set up
retrieved_nodes = retriever.retrieve(query)

print("Before reranking:")
for node in retrieved_nodes:
    print(f"Node ID: {node.node.node_id}, Score: {node.score}")

reranked_nodes = reranker.postprocess_nodes(retrieved_nodes, query_str=query)

print("\nAfter reranking:")
for node in reranked_nodes:
    print(f"Node ID: {node.node.node_id}, Score: {node.score}")
```
To use the Jina reranker with a query engine, first configure it:

```python
from llama_index.postprocessor.jinaai_rerank import JinaRerank

jina_rerank = JinaRerank(api_key=api_key, top_n=2, verbose=True)
```

Then attach it to the query engine and run a query:

```python
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[jina_rerank],
)

response = query_engine.query("Your query here")
print(f"Number of retrieved nodes after reranking: {len(response.source_nodes)}")
```

Finally, inspect the top-ranked nodes:

```python
for i, node in enumerate(response.source_nodes[:2]):  # Check top 2 nodes
    print(f"Node {i + 1}:")
    print(f"Text: {node.text}")
    print(f"Score: {node.score}")
    print("\n")
```
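To confirm the reranker is actually reordering results rather than just truncating them, you could compare against the same query engine without the postprocessor. This sketch assumes the `index` and query text used above:

```python
# Same retrieval settings, but without the reranking postprocessor.
baseline_engine = index.as_query_engine(similarity_top_k=10)
baseline_response = baseline_engine.query("Your query here")

baseline_ids = [node.node.node_id for node in baseline_response.source_nodes]
reranked_ids = [node.node.node_id for node in response.source_nodes]

print("Top nodes without reranking:", baseline_ids[:2])
print("Top nodes with reranking:   ", reranked_ids[:2])
```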