For vector stores on Supabase created through VectorStoreIndex, how can we specify the LLM and embedding models to point to AzureOpenAI instead of OpenAI, when the index was fetched via VectorStoreIndex.from_vector_store(vector_store=vector_store) on a separate server and used as a query engine tool?

Besides, the vecs table currently has around 500 nodes of about 500 characters each, and top_k is set to 12. We observed 12 calls to the OpenAI embedding model spanning over 6 seconds. Is that normal? And why was the embedding model called 12 times for the same query text during the similarity calculation?
23 comments
You can set the embed model and LLM in the service context to use Azure. Then you can do

VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)

Docs on setting up azure:
https://docs.llamaindex.ai/en/stable/examples/customization/llms/AzureOpenAI.html
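
For example, a minimal sketch with the v0.9 API (the deployment names, endpoint, and key are placeholders for your own Azure values):

```
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding

# both the LLM and the embed model point at Azure deployments
llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="my-llm-deployment",        # placeholder
    api_key="<azure-api-key>",
    azure_endpoint="https://<resource>.openai.azure.com/",
    api_version="2023-07-01-preview",
)
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="my-embedding-deployment",  # placeholder
    api_key="<azure-api-key>",
    azure_endpoint="https://<resource>.openai.azure.com/",
    api_version="2023-07-01-preview",
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, service_context=service_context
)
```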

The embedding model is only called once during queries, but depending on the query engine you have set up, the LLM could be called up to 12 times per query, yeah
we set the response mode to be compact already
here's the code:

```
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
print("loaded vector store index")

response_synthesizer = get_response_synthesizer(response_mode="compact")

custom_qa_prompt = some_text
qa_prompt_tmpl = PromptTemplate(custom_qa_prompt)

node_postprocessors = [
    SentenceEmbeddingOptimizer(percentile_cutoff=0.5)
]

query_engine = index.as_query_engine(
    similarity_top_k=rag_nodes_top_k,
    response_synthesizer=response_synthesizer,
    node_postprocessors=node_postprocessors,
    verbose=True,
)

query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": qa_prompt_tmpl}
)
```
oh, the SentenceEmbeddingOptimizer will call the embed model for each node
it removes sentences that are under a similarity score
need embeddings to do that
You can also pass the embed_model to it if you are using azure

SentenceEmbeddingOptimizer(percentile_cutoff=0.5, embed_model=embed_model)
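
For example, a quick sketch (the embed_model variable here is assumed to be the AzureOpenAIEmbedding from the setup above):

```
from llama_index.postprocessor import SentenceEmbeddingOptimizer

# route the optimizer's per-node sentence embeddings to Azure as well,
# instead of the default OpenAI embed model
node_postprocessors = [
    SentenceEmbeddingOptimizer(percentile_cutoff=0.5, embed_model=embed_model)
]
```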
so may I interpret that as: SentenceEmbeddingOptimizer helps trim down the size of each node before the "compact" step stuffs it into context_str, which increases runtime?
Since the LLM is the majority of runtime, in cases with a smaller top k it will probably be faster. But it also reduces token cost

(but tbh, I never use it lol)
For the AzureOpenAI object, is api_version essentially the model version on their portal?
[Attachment: Screenshot_2023-12-22_at_11.29.19.png]
We have set up the service context, but the app is still trying to call the chat_completion endpoint from OpenAI instead of Azure:

```
service_context = ServiceContext.from_defaults(llm=LangChainLLM(rag_llm))
print("got service context")
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
```
```
2023-12-22 12:27:36,256:INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2023-12-22 12:27:46,651:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
```
Model version is not the api_version
thanks, so as you can see I am using LangChainLLM() to wrap the whole LangChain LLM object, and there's still one call to the OpenAI endpoint
Embeddings will probably be using openai, unless you changed the embed model
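
In other words, something like this sketch (the embed_model here is assumed to be the AzureOpenAIEmbedding from earlier):

```
# llm alone is not enough; without embed_model, embeddings default to OpenAI
service_context = ServiceContext.from_defaults(
    llm=LangChainLLM(rag_llm),
    embed_model=embed_model,
)
```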
yeah but chat/completions is still there
does it have anything to do with langchain? because this is a query engine served as a LangChain tool
but the langchain agent was set to use Azure endpoints anyway
hey @Logan M I think I have figured it out: the service_context also has to be passed to the response_synthesizer. Perhaps I can help update the docs and create a PR?
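
For reference, a minimal sketch of that fix (reusing the Azure-configured service_context from above):

```
# the synthesizer falls back to the default (OpenAI) service context
# unless the Azure-configured one is passed in explicitly
response_synthesizer = get_response_synthesizer(
    response_mode="compact",
    service_context=service_context,
)
```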
@Logan M when I try from llama_index.embeddings import AzureOpenAIEmbedding, I get this error:

```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[5], line 1
----> 1 from llama_index.embeddings import AzureOpenAIEmbedding

ImportError: cannot import name 'AzureOpenAIEmbedding' from 'llama_index.embeddings'
```
I think I am using the latest: Successfully installed llama-index-0.9.21
that import definitely exists. Maybe try from a fresh venv? from llama_index.embeddings import AzureOpenAIEmbedding worked for me πŸ€”
thanks, that worked