Updated 2 months ago

Azure AI Search vector store issue with metadata fields mapping

I'm running into an issue with AzureAISearchVectorStore when I try to use it instead of the default in-memory vector store:
# Define metadata fields mapping
metadata_fields = {
    "doc_id": ("doc_id", MetadataIndexFieldType.STRING),
    "page_num": ("page_num", MetadataIndexFieldType.INT64),
    "image_path": ("image_path", MetadataIndexFieldType.STRING),
    "parsed_text_markdown": ("parsed_text_markdown", MetadataIndexFieldType.STRING),
    "context": ("context", MetadataIndexFieldType.STRING),
}

# Initialize Azure AI Search vector store
vector_store = AzureAISearchVectorStore(
    search_or_index_client=index_client,
    index_name="llamaindex-multimodal-contextual-retreival",
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="parsed_text_markdown",  
    embedding_field_key="embedding",
    embedding_dimensionality=1536,  # Based on embedding model
    metadata_string_field_key="metadata",  # Stores all metadata as a JSON string
    doc_id_field_key="doc_id",
    filterable_metadata_field_keys=metadata_fields,
    language_analyzer="en.lucene",
    vector_algorithm_type="exhaustiveKnn",
)

# Create storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build the index
index = VectorStoreIndex.from_documents(
    new_text_nodes,
    storage_context=storage_context,
    llm=llm,
    embed_model=embed_model,
)

Error: AttributeError: 'TextNode' object has no attribute 'get_doc_id'
Here is the output of new_text_nodes[0]:
TextNode(id_='d2edfe0f-83e8-4849-b7c9-24348724df68', embedding=None, metadata={'doc_id': 'doc_1', 'page_num': 1, 'image_path': 'data_images_state_of_ai_report_2024\58f8dea9-e908-4db0-82dc-f1747fc8abb3-page_1.jpg', 'parsed_text_markdown': '# STATE OF AI REPORT\n\nOctober 10, 2024\n\nNathan Benaich\n\n## AIR STREET CAPITAL\n\nstateof airstreet.', 'context': 'assistant: This chunk serves as the title page of the "State of AI Report 2024," authored by Nathan Benaich from Air Street Capital, introducing the report's focus on the current state and future predictions of artificial intelligence.'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n')
@Logan M can you advise? 🙏
This usually happens when you call from_documents() and pass in nodes; that method expects Document objects.

If you already have nodes, pass them to the constructor directly: VectorStoreIndex(nodes=nodes, ...)
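
To make the failure mode concrete, here is a minimal sketch that reproduces it without LlamaIndex installed. FakeDocument, FakeTextNode, and FakeIndex are hypothetical stand-ins, not the library's classes; the only behavior taken from the thread is that from_documents() asks each input for get_doc_id(), which nodes do not provide.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a Document: exposes get_doc_id()."""
    text: str
    doc_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def get_doc_id(self) -> str:
        return self.doc_id


@dataclass
class FakeTextNode:
    """Hypothetical stand-in for a TextNode: has an id but no get_doc_id()."""
    text: str
    id_: str = field(default_factory=lambda: str(uuid.uuid4()))


class FakeIndex:
    """Hypothetical stand-in for an index with the two entry points from the thread."""

    def __init__(self, nodes):
        # The constructor takes nodes as they are.
        self.nodes = list(nodes)

    @classmethod
    def from_documents(cls, documents):
        # Assumed to mirror the failing step: every input is asked for its document id.
        doc_ids = [doc.get_doc_id() for doc in documents]
        # A real index would chunk documents into nodes; here we wrap them one-to-one.
        nodes = [FakeTextNode(text=d.text, id_=i) for d, i in zip(documents, doc_ids)]
        return cls(nodes=nodes)


nodes = [FakeTextNode(text="# STATE OF AI REPORT")]

# Passing nodes to from_documents() raises the same kind of AttributeError.
try:
    FakeIndex.from_documents(nodes)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Passing the same nodes to the constructor works.
index = FakeIndex(nodes=nodes)
print(error_message)
print(len(index.nodes))
```

With the real library, the equivalent change to the snippet in the question would presumably be VectorStoreIndex(nodes=new_text_nodes, storage_context=storage_context, embed_model=embed_model) in place of the from_documents() call.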
you are the goat thanks