Hello,
I just found that when I build a MultiModalVectorStoreIndex from a ChromaDB collection containing text and image nodes, it ignores the ImageNodes. However, if I instantiate a VectorStoreIndex using the same nodes or documents, it correctly retrieves the image nodes. I suppose this is a bug?
6 comments
what was the code?
Here you can find my code using MultiModalVectorStoreIndex:
Plain Text
# Imports (assuming llama_index >= 0.10 style import paths)
import chromadb
from chromadb.config import Settings
from chromadb.utils.data_loaders import ImageLoader

from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.core.indices import MultiModalVectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

client = chromadb.EphemeralClient(
    Settings(anonymized_telemetry=False, allow_reset=True)
)
chroma_collection = client.create_collection(
    name="multimodal_collection_test",
    embedding_function=embedding_function,
    data_loader=ImageLoader(),
)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)


documents = SimpleDirectoryReader(mm_rag_helper.rag_folder_path).load_data()

# Node processing which returns a List[Union[TextNode, ImageNode]],
# where each ImageNode has its text field filled with a description.
nodes_lst = compute_nodes(
    documents=documents,
    lmm=lmm,  # GPT-4o
    prompt_template=useful_prompt_template,
    embedding_model=embedding_model,
)

index = MultiModalVectorStoreIndex(
    nodes=nodes_lst,
    embed_model=embedding_model,
    storage_context=storage_context,
    is_image_to_text=True,
)
And here is the code for the models:
Plain Text
# Imports (assuming these package/module paths)
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.multi_modal_llms.azure_openai import AzureOpenAIMultiModal

embedding_function = OpenAIEmbeddingFunction(
    api_key=app_settings.azure.embedding_api_key,
    model_name=app_settings.azure.embedding_model,
    deployment_id=app_settings.azure.embedding_model,
    api_type="azure",
    api_base=app_settings.azure.embedding_uri,
    api_version=app_settings.openai.api_version,
)
lmm = AzureOpenAIMultiModal(
    engine=app_settings.openai.deployment_id,
    model=app_settings.openai.deployment_id,
    temperature=app_settings.rag.llm_config.temperature,
    azure_endpoint=app_settings.openai.endpoint,
    api_key=app_settings.openai.api_key,
    api_version=app_settings.openai.api_version,
    image_detail="high",
    max_new_tokens=700,
)
embedding_model = AzureOpenAIEmbedding(
    model=app_settings.azure.embedding_model,
    deployment_name=app_settings.azure.embedding_model,
    api_key=app_settings.azure.embedding_api_key,
    azure_endpoint=app_settings.azure.embedding_uri,
    api_version=app_settings.openai.api_version,
)
It didn't ignore the image nodes, but you didn't provide an image vector store. (Text and images need separate vector stores because they use different embeddings.)
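For example, the usual multimodal setup looks roughly like this (an untested sketch reusing your variables; the collection names are just examples, and by default the image side is embedded with a CLIP model, which is why it needs its own store):
Plain Text
# Sketch: give the index a separate store for images
# (collection names here are only examples)
text_collection = client.create_collection(name="text_collection")
image_collection = client.create_collection(name="image_collection")

text_store = ChromaVectorStore(chroma_collection=text_collection)
image_store = ChromaVectorStore(chroma_collection=image_collection)

storage_context = StorageContext.from_defaults(
    vector_store=text_store,
    image_store=image_store,
)

index = MultiModalVectorStoreIndex(
    nodes=nodes_lst,
    embed_model=embedding_model,
    storage_context=storage_context,
    is_image_to_text=True,
)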
I've also tested this solution with ChromaDB, but it didn't retrieve any ImageNodes either. By the way, I wanted to use only a text embedding model, applied to the image summaries, to retrieve the original images.
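To make what I'm aiming for concrete, here is a rough sketch (the node text, image path and query are only placeholders): index the image descriptions as plain TextNodes that keep the original image path in metadata, then load the images after retrieval.
Plain Text
# Sketch of the text-embedding-only approach: index the image
# descriptions as TextNodes, keep the image path in metadata,
# and resolve the original image after retrieval.
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

summary_node = TextNode(
    text="Description of the figure produced by GPT-4o...",  # placeholder
    metadata={"image_path": "/path/to/original_image.png"},  # placeholder
)

index = VectorStoreIndex(
    nodes=[summary_node],
    embed_model=embedding_model,
    storage_context=storage_context,
)

retrieved = index.as_retriever(similarity_top_k=3).retrieve("my query")
image_paths = [n.node.metadata.get("image_path") for n in retrieved]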