index = VectorStoreIndex.from_documents(
    documents, transformations=[splitter], embed_model=embed_model
)
query_engine = index.as_query_engine(llm=llm)
%pip install llama-index-llms-openai
%pip install llama-index-vector-stores-weaviate
metadata key of each document. Here's an example of how you can do this:

# load your documents normally, then add your metadata
documents = SimpleDirectoryReader("../data/paul_graham").load_data()
for document in documents:
    document.metadata = {"unique_text": "Your unique text here"}

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)
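One thing to watch in the loop above: assigning a new dictionary to document.metadata replaces whatever metadata the reader had already stored there. The sketch below shows one way to keep the existing keys and give each document its own value. It runs without LlamaIndex by using a small stand-in class; FakeDocument, tag_documents and the "doc-N" labels are all hypothetical names chosen for illustration, not part of the library.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LlamaIndex Document (text + metadata only)."""
    text: str
    metadata: dict = field(default_factory=dict)


def tag_documents(documents, key="unique_text"):
    """Add a distinct value under `key` without discarding existing metadata."""
    for i, document in enumerate(documents):
        # Item assignment keeps the other keys; `document.metadata = {...}` would drop them.
        document.metadata[key] = f"doc-{i}"
    return documents


docs = [
    FakeDocument("first essay", {"file_name": "a.txt"}),
    FakeDocument("second essay", {"file_name": "b.txt"}),
]
tag_documents(docs)
print(docs[0].metadata)
```

With real documents the same idea would be document.metadata["unique_text"] = ... inside the loop, assuming document.metadata is an ordinary dict as in the example above.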
"Your unique text here" is the text that gets added as metadata; replace it with whatever text you want to attach to each document. SimpleDirectoryReader loads the documents, and the loop then adds the metadata to each one.

If you see embedding=None for your documents, it means that their embeddings have not been generated or set. In LlamaIndex, embeddings are vector representations of your documents that are used for similarity search in the vector store. Here is an example that generates an embedding with the DashScope embedding model:

from llama_index.embeddings.dashscope import (
    DashScopeEmbedding,
    DashScopeTextEmbeddingModels,
    DashScopeTextEmbeddingType,
)

# Create embeddings
embedder = DashScopeEmbedding(
    model_name=DashScopeTextEmbeddingModels.TEXT_EMBEDDING_V2,
    text_type=DashScopeTextEmbeddingType.TEXT_TYPE_QUERY,
)

# Call text Embedding
embedding = embedder.get_text_embedding("Your text here")
The DashScopeEmbedding model is used to generate an embedding for a piece of text. You can replace "Your text here" with the text of your document. To set the embedding on every document, loop over them:

for document in documents:
    document.embedding = embedder.get_text_embedding(document.text)
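To make the role of embeddings in similarity search concrete, here is a toy sketch that ranks stored vectors against a query vector by cosine similarity. The vectors are made up and three-dimensional, and the function names are hypothetical; this illustrates the general idea only and is not how LlamaIndex or any particular vector store implements search.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_match(query_embedding, doc_embeddings):
    """Return the id of the stored embedding most similar to the query."""
    return max(
        doc_embeddings,
        key=lambda doc_id: cosine_similarity(query_embedding, doc_embeddings[doc_id]),
    )


# Made-up embeddings for illustration; real models return far more dimensions.
doc_embeddings = {
    "essay_a": [0.9, 0.1, 0.0],
    "essay_b": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]

best = top_match(query_embedding, doc_embeddings)
print(best)
```

A document whose embedding is None has no vector to compare against, which is why it cannot take part in this kind of ranking until an embedding is set.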
from pathlib import Path

# load your documents normally, then add your metadata
documents = SimpleDirectoryReader("../data/paul_graham").load_data()
for document in documents:
    file_path = Path(document.metadata['filename'])
    document.metadata['filename_without_extension'] = file_path.stem
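The stem lookup can be tried in isolation on plain dictionaries standing in for document.metadata, which also makes it easy to cope with a missing key. This is a minimal sketch: the helper name add_stem and the fallback keys "file_name" and "file_path" are assumptions for illustration, so check which keys your reader actually sets.

```python
from pathlib import Path


def add_stem(metadata, keys=("filename", "file_name", "file_path")):
    """Store the extension-free file name under 'filename_without_extension'.

    `metadata` is a plain dict standing in for document.metadata. The first key
    from `keys` that is present is used; if none is present, nothing is added.
    """
    for key in keys:
        if key in metadata:
            metadata["filename_without_extension"] = Path(metadata[key]).stem
            break
    return metadata


print(add_stem({"filename": "paul_graham_essay.txt"}))
print(add_stem({"file_path": "../data/paul_graham/essay.tar.gz"}))
print(add_stem({"author": "pg"}))
```

Note that stem strips only the last suffix, so "essay.tar.gz" becomes "essay.tar" rather than "essay".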
The Path class from the pathlib module is used to get the file name without its extension (also known as the stem of the file path): the stem attribute of a Path object returns the file name with the final extension removed.

This assumes that the filename metadata is already set for each document and contains the full file name including the extension. If this is not the case, you will need to adjust the code accordingly. Replace "../data/paul_graham" with the path to your actual data.

If you create your index with a prefix but later change that prefix in your code, your query will come back empty. Redis saves the prefix you originally created your index with and expects it to be consistent.

vector_store.delete_index()
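Why a changed prefix yields empty results can be illustrated with a toy in-memory store that namespaces its keys by prefix. This is a simulation under simplifying assumptions, not the Redis or LlamaIndex API; ToyPrefixStore and the prefix names "essays_v1" and "essays_v2" are hypothetical.

```python
class ToyPrefixStore:
    """In-memory stand-in for a key-value store that namespaces keys by prefix."""

    def __init__(self):
        self._data = {}

    def add(self, prefix, doc_id, text):
        # Keys are written as "<prefix>:<doc_id>", so the prefix is baked into storage.
        self._data[f"{prefix}:{doc_id}"] = text

    def query(self, prefix):
        # Only keys written under the same prefix are visible to this lookup.
        return [v for k, v in self._data.items() if k.startswith(f"{prefix}:")]


store = ToyPrefixStore()
store.add("essays_v1", "doc1", "What I Worked On")

same_prefix = store.query("essays_v1")
changed_prefix = store.query("essays_v2")
print(same_prefix, changed_prefix)
```

The data is still in the store in the second case; it is simply not reachable under the new prefix, which matches the empty-query symptom described above.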