Updated 7 months ago

llamaindex_azurecosmosdb_help

Hi,
Using LlamaIndex, I am trying to read a PDF and, after chunking and embedding, load the document into Azure Cosmos DB, but I am getting the following error:
pymongo.errors.OperationFailure: cosmosSearchOptions, full error: {'ok': 0.0, 'errmsg': 'cosmosSearchOptions', 'code': 197, 'codeName': 'InvalidIndexSpecificationOption'}
My code is below:
import openai
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, get_response_synthesizer
from llama_index.core import StorageContext, ServiceContext, load_index_from_storage
from llama_index.embeddings.openai import OpenAIEmbedding
import pymongo
from llama_index.vector_stores.azurecosmosmongo import (
    AzureCosmosDBMongoDBVectorSearch,
)
import json
import certifi

mounted_fldr = "Users/vivek/Documents/pycharmprojects/docs_process_llamaindex"
config_file = f"/{mounted_fldr}/config/config1.json"
src_data = f"/{mounted_fldr}/src_data"

index_dir = f"/{mounted_fldr}/index_data"

# Set up your OpenAI API key

with open(config_file) as f:
    config = json.load(f)

# open_ai Access API keys

key = config['openai_api']['api_key']
openai.api_key = key
Settings.llm = OpenAI(temperature=0, model="gpt-4-turbo")
Settings.embed_model = OpenAIEmbedding(model='text-embedding-ada-002')
documents = SimpleDirectoryReader(src_data).load_data()
# u_name, passwd and host are not defined in the posted snippet
connection_string = f'mongodb://{u_name}:{passwd}@{host}:10255/?ssl=true'
mongodb_client = pymongo.MongoClient(connection_string, tlsCAFile=certifi.where())
print(mongodb_client.HOST)
store = AzureCosmosDBMongoDBVectorSearch(
    mongodb_client=mongodb_client,
    db_name="db_llama",
    collection_name="test_db_pdf",
    index_name="test_index",
)
storage_context = StorageContext.from_defaults(vector_store=store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context,
)
13 comments
Team, please help with the above.
Hi, I would suggest you update all the requirements to the latest version and try one more time by following this: https://docs.llamaindex.ai/en/stable/api_reference/storage/vector_store/azurecosmosmongo/?h=azure
So @WhiteFang_Jr, you want me to run pip install -r requirement.txt again and then run the same script?
You need to upgrade the requirements: pip install -U package_name
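Before and after upgrading, it can help to confirm which versions are actually installed in the environment that runs the script. The snippet below is a minimal sketch using only the standard library; the distribution names in `PACKAGES` are assumptions inferred from the imports in the posted script, so adjust them to match your own requirements file.

```python
from importlib import metadata

# Distribution names are assumptions based on the imports in the posted script.
PACKAGES = (
    "llama-index-core",
    "llama-index-vector-stores-azurecosmosmongo",
    "pymongo",
)


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If the versions do not change after `pip install -U`, the script is probably running in a different interpreter or virtual environment than the one pip updated.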
Still getting the same error.
@WhiteFang_Jr still getting the same error. Is there any other solution you can think of?
You'll have to debug where exactly the error is popping up. These are third-party vector stores, mostly made by the community.
Based on the error, it looks like it has something to do with the index.
Explore further in this direction.
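One way to narrow this down is to issue the vector index creation command yourself, outside LlamaIndex, and see whether the server rejects it. The sketch below builds a `createIndexes` command in the shape Microsoft documents for `cosmosSearch` vector indexes and runs it through pymongo's `Database.command`. The helper names are hypothetical, and the `embedding_key` default (`content_vector`), `num_lists`, and `similarity` values are assumptions for illustration; 1536 is the output size of `text-embedding-ada-002`.

```python
def build_vector_index_command(collection_name, index_name,
                               embedding_key="content_vector",
                               dimensions=1536, num_lists=1,
                               similarity="COS"):
    """Build a createIndexes command for a cosmosSearch vector index.

    embedding_key, num_lists and similarity are illustrative assumptions.
    """
    return {
        "createIndexes": collection_name,
        "indexes": [
            {
                "name": index_name,
                "key": {embedding_key: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": "vector-ivf",
                    "numLists": num_lists,
                    "similarity": similarity,
                    "dimensions": dimensions,
                },
            }
        ],
    }


def try_create_vector_index(db, command):
    """Run the command on a pymongo Database; return (ok, error_detail)."""
    try:
        db.command(command)
    except Exception as exc:  # pymongo raises OperationFailure on rejection
        # OperationFailure carries the server reply in .details
        return False, getattr(exc, "details", None) or str(exc)
    return True, None
```

For example, `try_create_vector_index(mongodb_client["db_llama"], build_vector_index_command("test_db_pdf", "test_index"))` should return the same code 197 reply if the server itself does not accept `cosmosSearchOptions`. One thing possibly worth checking, although it is not confirmed anywhere in this thread: Microsoft documents this index type for Azure Cosmos DB for MongoDB vCore, while the connection string in the question uses port 10255, which is commonly associated with the RU-based API for MongoDB.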
@WhiteFang_Jr can you tell me how I can retrieve the indexes saved in Azure Cosmos DB and then run a query? I am not able to find the page with the required packages.
I have saved the documents, indexes and embeddings with the code below:
vector_store = AzureCosmosDBMongoDBVectorSearch(
    mongodb_client=mongodb_client,
    db_name="db_llama",
    collection_name="test_db_pdf",
    index_name="pdf_index",
)

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context,
)
You can follow this doc: https://docs.llamaindex.ai/en/stable/api_reference/storage/vector_store/azurecosmosmongo/?h=azure

You can create a vector store and create a vector index over it. Then, based on your query, it will retrieve the related nodes.
@WhiteFang_Jr creating the vector store and loading the data will be done by a pipeline, and there is another service that will access the loaded index in the vector store. The page above talks about loading data into the vector store; how to retrieve that saved index is not mentioned anywhere.
Check this doc:
https://docs.llamaindex.ai/en/stable/examples/vector_stores/AzureCosmosDBMongoDBvCoreDemo/#create-the-index


For loading: you need to change just one line:
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
I don't think it will fetch the index; it will only fetch the vector store instance,
which will help you query your saved data based on your query.
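Putting the one-line change into context, the querying service could look roughly like the sketch below. It reuses the database, collection, and index names from the ingestion code earlier in the thread; the function name `query_saved_index` is hypothetical, and the sketch assumes `Settings.embed_model` is configured with the same embedding model that was used at ingestion time so that query vectors match the stored ones.

```python
def query_saved_index(mongodb_client, question,
                      db_name="db_llama",
                      collection_name="test_db_pdf",
                      index_name="pdf_index"):
    """Attach to an already populated collection and answer one question."""
    # Imported here so this module can be loaded without llama-index installed.
    from llama_index.core import VectorStoreIndex
    from llama_index.vector_stores.azurecosmosmongo import (
        AzureCosmosDBMongoDBVectorSearch,
    )

    # Point the vector store at the existing collection; nothing is re-ingested.
    vector_store = AzureCosmosDBMongoDBVectorSearch(
        mongodb_client=mongodb_client,
        db_name=db_name,
        collection_name=collection_name,
        index_name=index_name,
    )
    # Wrap the store in an index object instead of calling from_documents.
    index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
    query_engine = index.as_query_engine()
    return query_engine.query(question)
```

A call such as `print(query_saved_index(mongodb_client, "What is this PDF about?"))` would then embed the question, retrieve the most similar stored nodes, and synthesize an answer with the configured LLM.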