Hello indexers,

At a glance

The post is about a community member wanting to use the Weaviate client with the LlamaIndex library. They have two questions: how to connect to a Weaviate client hosted in a Docker service, and how to implement RAG (Retrieval Augmented Generation) when the index and RAG are hosted in different services.

The comments provide some suggestions, such as checking the Weaviate documentation for connecting the client, and setting up the Weaviate client and passing it to the WeaviateVectorStore. However, the community member encounters an error related to the "llama-index-embeddings-huggingface" package not being found, and another error related to the "metadata" field in a Weaviate vector store query.

The community members and the assistant discuss the GraphQL query structure required to handle the metadata field issue, providing example queries and troubleshooting steps. However, there is no explicitly marked answer to the original questions in the post.

isaac47

Hello indexers,

I want to use the Weaviate client with llama-index. I have 2 questions:

I have a Docker service hosting my Weaviate indexes. How do I connect to this client?
I want to implement RAG via LlamaIndex, but I don't understand how to achieve it in my use case, where the index and the RAG will be hosted in different services.

23 comments

WhiteFang_Jr

https://weaviate.io/developers/weaviate/installation/docker-compose

I think if this works, you may or may not have to make some changes in the Weaviate vector store class, and then it should work.

isaac47

👍🏽

isaac47

Thanks. So how do I connect llama-index?

WhiteFang_Jr

You'll have to check how they allow connecting to the Docker image. If that requires some changes, you can implement them in the Weaviate vector store class for the connection.

Logan M

You can just set up the Weaviate client and pass it into WeaviateVectorStore.
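
A minimal sketch of that suggestion, assuming the v3-style `weaviate.Client` used elsewhere in this thread and a Weaviate container reachable over HTTP. The `WEAVIATE_URL` variable and the `weaviate_url` and `build_retriever` helpers are hypothetical names introduced for illustration. When the RAG code runs in a different service on the same Docker Compose network, the host would typically be the Compose service name rather than `localhost`.

```python
import os


def weaviate_url(default: str = "http://localhost:8080") -> str:
    """Return the Weaviate endpoint, read from the (hypothetical) WEAVIATE_URL variable."""
    return os.environ.get("WEAVIATE_URL", default).rstrip("/")


def build_retriever(index_name: str = "Test1", text_key: str = "text", top_k: int = 2):
    """Connect to a running Weaviate instance and wrap it as a LlamaIndex retriever."""
    # Imported lazily so this module still loads where the packages are missing.
    import weaviate
    from llama_index.core import VectorStoreIndex
    from llama_index.vector_stores.weaviate import WeaviateVectorStore

    # The client only needs the HTTP address of the Weaviate service.
    client = weaviate.Client(weaviate_url())
    vector_store = WeaviateVectorStore(
        weaviate_client=client, index_name=index_name, text_key=text_key
    )
    # No documents are re-indexed here; the index is built on top of the existing store.
    index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
    return index.as_retriever(similarity_top_k=top_k)
```

With this layout the RAG service needs nothing from the indexing service except the URL, for example `WEAVIATE_URL=http://weaviate:8080` if the Compose service happens to be called `weaviate`.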

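isaac47

Hello, here is my code:

Plain Text

import os
import weaviate
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.weaviate import WeaviateVectorStore

# Connect to the Weaviate instance exposed by the Docker service
client = weaviate.Client("http://localhost:8080")

# Wrap the existing "Test1" index as a LlamaIndex vector store
vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="Test1", text_key="text"
)

# embed_model='local' asks LlamaIndex to load a local HuggingFace embedding model
retriever = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model='local'
).as_retriever(similarity_top_k=2)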

isaac47

I got this error:

Plain Text

    100 kwargs.pop("storage_context", None)
    101 storage_context = StorageContext.from_defaults(vector_store=vector_store)
--> 103 return cls(
    104     nodes=[],
    105     embed_model=embed_model,
...
    115         )
    117 if LCEmbeddings is not None and isinstance(embed_model, LCEmbeddings):
    118     try:

ImportError: `llama-index-embeddings-huggingface` package not found, please run `pip install llama-index-embeddings-huggingface`

isaac47

My Weaviate client already has embeddings. How do I fix this issue, please?
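
For context, `embed_model='local'` tells LlamaIndex to load a local HuggingFace embedding model, which is why the missing package is reported even though the vectors already live in Weaviate. A small sketch for checking the dependency up front, using only the standard library; `is_importable` is a hypothetical helper name, not part of LlamaIndex.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `llama_index`) is itself missing.
        return False


# Module installed by `pip install llama-index-embeddings-huggingface`
REQUIRED = "llama_index.embeddings.huggingface"

if not is_importable(REQUIRED):
    print("Missing package: run `pip install llama-index-embeddings-huggingface`")
```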

isaac47

@kapa.ai How do I fix this bug?

import os
import weaviate
from llama_index.vector_stores.weaviate import WeaviateVectorStore
from llama_index.core import VectorStoreIndex
from llama_index.core.response.pprint_utils import pprint_source_node
from llama_index.core import Settings
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

# Disable LlamaIndex's own embedding model
Settings.embed_model = None

client = weaviate.Client("http://localhost:8080")

vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="Test1", text_key="text"
)

retriever = VectorStoreIndex.from_vector_store(vector_store=vector_store)

filters = MetadataFilters(
    filters=[ExactMatchFilter(key="page_number", value="1")]
)

retriever_query = retriever.as_retriever(
    similarity_top_k=2,
    # filters=filters,
    verbose=True
)

source_nodes = retriever_query.retrieve("What did the author do growing up?")

I get this error:

ValueError: Invalid query, got errors: [{'locations': [{'column': 66, 'line': 1}], 'message': 'Field "metadata" of type "Test1_metadata_object" must have a sub selection.', 'path': None}]

kapa.ai

Let me check the knowledge base for you. One moment! :books:

kapa.ai

----------------------
Based on the information provided in the extracts, there isn't a direct solution to the specific bug you're encountering with the ValueError related to the metadata field in a Weaviate vector store query. The extracts do not cover troubleshooting or fixing errors related to querying vector stores, especially with the metadata field requirements in Weaviate.

However, the error message suggests that the query to the Weaviate vector store is expecting a sub-selection for the metadata field of type Test1_metadata_object. This typically means that when querying a GraphQL API (which Weaviate uses for queries), you must specify which fields of the metadata object you want to retrieve in your query.

While the extracts do not provide a direct fix, they do show examples of setting up and querying vector stores with LlamaIndex, including using metadata filters. If you're encountering this error during a retrieval operation with LlamaIndex and Weaviate, ensure that your query or the way LlamaIndex constructs the query for Weaviate correctly specifies the necessary sub-selections for the metadata field.

For more detailed guidance on constructing queries with Weaviate or troubleshooting this specific error, I recommend consulting the Weaviate documentation and the LlamaIndex documentation. These resources might provide more insights into the correct format for queries and any additional configuration that might be required when working with metadata in Weaviate.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

isaac47

@kapa.ai can you write the expected graphql query

kapa.ai

Let me check the knowledge base for you. One moment! :books:

kapa.ai

----------------------
Based on the information provided in the extracts, there isn't a specific example of a GraphQL query for Weaviate that directly corresponds to the error message you're encountering. The extracts focus on using LlamaIndex with Weaviate and other vector stores, including how to set up clients, insert documents, and perform retrievals using LlamaIndex's Python interface. They do not include raw GraphQL queries or details on how to construct them for Weaviate.

However, based on the error message you've shared and general GraphQL query structure, I can provide a hypothetical example of what a GraphQL query might look like when querying a Weaviate vector store and needing to specify a sub-selection for a metadata field. Please note, this is a general example and not extracted from the provided documentation:

Plain Text

{
  Get {
    Test1(where: {operator: Equal, path: ["page_number"], valueInt: 1}) {
      text
      metadata {
        page_number
        # Include other metadata fields you need here
      }
    }
  }
}

This example assumes you're querying a class Test1 and you're interested in entries where the page_number is 1. It specifies that you want to retrieve the text field and a sub-selection of the metadata field, including page_number and potentially other metadata fields.

For constructing and troubleshooting GraphQL queries in Weaviate, I strongly recommend consulting the Weaviate documentation for the most accurate and detailed guidance. The documentation will provide insights into the query structure, filtering options, and how to specify sub-selections for objects like metadata.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

isaac47

@kapa.ai It is not working. Here is my Weaviate client schema:

collection_schema = {
    "properties": [
        {
            "name": "text",
            "data_type": "TEXT",
            "index_filterable": True,
            "index_searchable": True
        },
        {
            "name": "metadata",
            "data_type": "OBJECT",
            "index_filterable": True,
            "nested_properties": [
                {
                    "name": "embed_text",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                },
                {
                    "name": "chunk_id",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                },
                {
                    "name": "previous_chunk_id",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                },
                {
                    "name": "next_chunk_id",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                },
                {
                    "name": "page_number",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                },
                {
                    "name": "file_name",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                },
                {
                    "name": "file_id",
                    "data_type": "TEXT",
                    "index_filterable": True,
                    "index_searchable": True
                }
            ]
        }
    ]
}

kapa.ai

Let me check the knowledge base for you. One moment! :books:

kapa.ai

----------------------
Given the schema details you've provided for your Weaviate client, it appears you have a complex structure with a metadata object containing nested properties. To query such a structure, especially when dealing with an error that requires specifying a sub-selection for the metadata field, you need to adjust your GraphQL query accordingly.

Based on the schema, a more accurate GraphQL query example to retrieve documents from your Weaviate vector store, including specific metadata fields, would look something like this:

Plain Text

{
  Get {
    Test1(where: {path: ["metadata", "page_number"], operator: Equal, valueString: "1"}) {
      text
      metadata {
        embed_text
        chunk_id
        previous_chunk_id
        next_chunk_id
        page_number
        file_name
        file_id
      }
    }
  }
}

This query attempts to retrieve documents from the class Test1 where the page_number within the metadata object equals "1". It specifies that you want to retrieve the text field of the document as well as various fields within the metadata object.
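
The selection part of such a query can also be generated from the schema instead of being typed by hand. A rough sketch, assuming a schema dict shaped like the `collection_schema` posted above; `build_get_query` is a hypothetical helper, and it leaves out the `where` filter entirely.

```python
def build_get_query(class_name: str, schema: dict) -> str:
    """Build a Weaviate `Get` query, expanding OBJECT properties into sub-selections."""
    fields = []
    for prop in schema["properties"]:
        nested = prop.get("nested_properties")
        if nested:
            # OBJECT fields must list the nested fields to return.
            inner = " ".join(p["name"] for p in nested)
            fields.append(f'{prop["name"]} {{ {inner} }}')
        else:
            fields.append(prop["name"])
    selection = " ".join(fields)
    return f"{{ Get {{ {class_name} {{ {selection} }} }} }}"


example_schema = {
    "properties": [
        {"name": "text", "data_type": "TEXT"},
        {
            "name": "metadata",
            "data_type": "OBJECT",
            "nested_properties": [{"name": "page_number"}, {"name": "file_name"}],
        },
    ]
}
print(build_get_query("Test1", example_schema))
# { Get { Test1 { text metadata { page_number file_name } } } }
```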

Please note, the where clause in the query uses valueString to match the page_number since your schema indicates that page_number is of data type TEXT. Adjust the valueString to valueInt or another appropriate type if the actual data type of your page_number or any other field differs.

Remember, the exact structure of your query might need to be adjusted based on the specific requirements of your application and the Weaviate version you are using. Always refer to the Weaviate documentation for the most accurate and up-to-date information on constructing GraphQL queries.

kapa.ai

Since the provided extracts do not include direct examples of GraphQL queries for complex schemas or troubleshooting specific errors, the example above is constructed based on general GraphQL principles and the schema information you've shared.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

isaac47

@kapa.ai I got an error:

{
  "data": {
    "Get": {
      "Test1": null
    }
  },
  "errors": [
    {
      "locations": [
        {
          "column": 5,
          "line": 3
        }
      ],
      "message": "could not extract filters: invalid where filter: missing an argument after 'page_number'",
      "path": [
        "Get",
        "Test1"
      ]
    }
  ]
}

kapa.ai

Let me check the knowledge base for you. One moment! :books:

kapa.ai

----------------------
Based on the error message you've received, it seems there's an issue with the where filter in your GraphQL query, specifically related to the page_number argument. The error suggests that the query is missing an argument after page_number, which likely means there's a formatting issue with how the filter is specified.

Given the schema you've shared earlier and the nature of the error, it's important to ensure that the where filter in your GraphQL query is correctly formatted according to Weaviate's requirements for filtering on nested properties within an OBJECT type.

Here's a corrected example of how your GraphQL query might be structured, ensuring that the where filter is properly formatted. This example assumes you're trying to filter based on the page_number within the metadata object:

Plain Text

{
  Get {
    Test1(where: {path: ["metadata", "page_number"], operator: Equal, valueString: "1"}) {
      text
      metadata {
        embed_text
        chunk_id
        previous_chunk_id
        next_chunk_id
        page_number
        file_name
        file_id
      }
    }
  }
}

Key points to check in your query based on the error message:

Path Specification: Ensure the path is correctly specified as an array, including the object name (metadata) and the property name (page_number).
Operator: Confirm you're using the correct operator (Equal in this case) for your filter condition.
Value Type: Make sure you're using the correct value type (valueString in this example) that matches the data type of the field you're filtering on. Since page_number is of type TEXT in your schema, valueString is used.
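
To try variations of the filter against these points quickly, the raw query can be posted to Weaviate's `/v1/graphql` endpoint and the `errors` array inspected. A minimal sketch using only the standard library; `run_graphql` and `graphql_errors` are hypothetical helper names, and the base URL is assumed to be the one used earlier in the thread.

```python
import json
import urllib.request


def graphql_errors(response: dict) -> list:
    """Collect the `message` of each entry in a GraphQL response's `errors` array."""
    return [err.get("message", "") for err in response.get("errors") or []]


def run_graphql(base_url: str, query: str) -> dict:
    """POST a raw GraphQL query to Weaviate's /v1/graphql endpoint and decode the reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/graphql",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Usage would look like `graphql_errors(run_graphql("http://localhost:8080", query))`, which returns an empty list when the query is accepted.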

kapa.ai

If you continue to encounter errors, I recommend reviewing the Weaviate documentation for more detailed guidance on constructing where filters, especially for nested properties. The documentation will provide insights into the correct syntax and available operators for filtering your queries effectively.

Remember, the provided code and suggestions are based on the information shared in your schema and error message. Adjustments may be necessary based on the specific details of your implementation and the version of Weaviate you are using.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
