The community members are discussing how to find information about a vector index, such as the number of documents, the model used, dimensionality, and other basic details. One community member suggests storing this information in the metadata, using document.metadata['llm_used'] = LLM name to store the language model used. They also mention that the number of nodes in the index can be checked using print(len(index.docstore.docs)).
Another community member is having issues with the OpenAI embedding model, where they only see "ada" usage even after upgrading to version 0.10 and using the "text-embedding-3-large" model. They ask if this is normal or if there might be a bug. Other community members suggest checking the Settings.embed_model information to ensure the new model is being used, and to interact with the OpenAI embedding directly to verify the model name.
The community members also discuss whether it's possible to store the embedding model information in the vector_index metadata, rather than just the individual document metadata. They ask if they can find the embedding model used to create the vector index after it has been created, or how to save the model name when creating the index.
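For reference, a minimal sketch of the two suggestions from the summary above, assuming llama-index 0.10-style imports; the "./data" directory and the "gpt-4" model name are placeholders:

    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    documents = SimpleDirectoryReader("./data").load_data()

    # Tag each document with the model name before indexing, so it can be
    # recovered later from the node metadata.
    for document in documents:
        document.metadata["llm_used"] = "gpt-4"  # placeholder model name

    index = VectorStoreIndex.from_documents(documents)

    # Number of nodes currently stored in the index's docstore.
    print(len(index.docstore.docs))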
I upgraded to 0.10 and started using "text-embedding-3-large", setting it via the new Settings approach: Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large", embed_batch_size=10). But under OpenAI usage, I only see ada usage, and it's been at least 3 hours now. Does it normally take this long to update, or might there be a bug?
Check the embed_model info: print(Settings.embed_model). It should reflect the new model name. You can check this with a small Python script, just to be sure the model isn't being replaced under the hood.
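A quick check along those lines might look like this (a sketch assuming llama-index 0.10-style imports; model_name is the attribute LlamaIndex embedding classes expose for the underlying model):

    from llama_index.core import Settings
    from llama_index.embeddings.openai import OpenAIEmbedding

    Settings.embed_model = OpenAIEmbedding(
        model="text-embedding-3-large", embed_batch_size=10
    )

    # Both of these should mention text-embedding-3-large if the new
    # model is actually the one being used.
    print(Settings.embed_model)
    print(Settings.embed_model.model_name)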
Create a Python script, interact with the OpenAI embedding directly, and then check whether it shows the new model name or not.
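For example, a small script that calls the OpenAI embeddings endpoint directly (openai>=1.0 client; assumes OPENAI_API_KEY is set in the environment) and checks the model name echoed back in the response:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.embeddings.create(
        model="text-embedding-3-large",
        input="hello world",
    )
    print(response.model)                   # model name echoed by the API
    print(len(response.data[0].embedding))  # 3072 dimensions for -3-large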
Sorry to check in again; that is actually really helpful for later evaluation with multiple indexes. One question, however, about this: "You can define all this in the metadata if you want.
document.metadata['llm_used'] = LLM name"
Can I do that for the vector_index = load_index_from_storage(storage_context), so that I can do vector_index.metadata instead of document.metadata?
To be specific: can I find out the embedding model used to create the vector_index after it has been created? (Or how can I save the LLM_NAME in the vector_index when creating it?)
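One possible workaround (not presented here as a built-in LlamaIndex feature) is to persist a small sidecar file with the model name next to the index when you create it, and read it back after reloading. A sketch, with a hypothetical ./storage directory and index_info.json filename:

    import json
    from pathlib import Path

    from llama_index.core import (
        Settings,
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    PERSIST_DIR = "./storage"  # hypothetical location

    # At index-creation time: persist the index and write the embedding
    # model name alongside it in a sidecar JSON file.
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    Path(PERSIST_DIR, "index_info.json").write_text(
        json.dumps({"embed_model": Settings.embed_model.model_name})
    )

    # Later: reload the index and read the sidecar back.
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    vector_index = load_index_from_storage(storage_context)
    index_info = json.loads(Path(PERSIST_DIR, "index_info.json").read_text())
    print(index_info["embed_model"])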