metadata.theme as well.

```python
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="theme", operator=FilterOperator.EQ, value="Fiction"),
    ]
)
```
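To illustrate what an EQ filter does, here is a minimal pure-Python sketch; the `filter_nodes` helper and the node dicts are hypothetical stand-ins, not LlamaIndex APIs:

```python
# Toy illustration of an equality (EQ) metadata filter.
# `filter_nodes` and the node dicts are hypothetical, not LlamaIndex APIs.

def filter_nodes(nodes, key, value):
    """Keep only nodes whose metadata[key] equals value."""
    return [n for n in nodes if n.get("metadata", {}).get(key) == value]

nodes = [
    {"text": "A novel about space travel", "metadata": {"theme": "Fiction"}},
    {"text": "A history of computing", "metadata": {"theme": "Nonfiction"}},
]

fiction = filter_nodes(nodes, key="theme", value="Fiction")
print([n["text"] for n in fiction])  # only the Fiction node survives
```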
Does QueryFusionRetriever generate num_queries queries for each retriever defined? In the example here 3 are generated, but for some reason I get 6 (3 for BM25 and 3 for vector?). I would prefer just the 3, since they're almost always the same...

Generated queries:
1. What were the major events or milestones in the history of Interleafe and Viaweb?
2. Can you provide a timeline of the key developments and achievements of Interleafe and Viaweb?
3. What were the successes and failures of Interleafe and Viaweb as companies?
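One way the counts multiply: each generated query is typically run against every retriever, so 3 queries over 2 retrievers means 6 retrieval calls before the results are fused. A toy sketch of that fan-out (all names here are hypothetical, not the QueryFusionRetriever internals):

```python
# Toy sketch of query-fusion fan-out: N generated queries run against
# M retrievers yields N * M retrieval calls, whose results are merged.
# All names are hypothetical, not QueryFusionRetriever internals.

def fuse(retrievers, queries, top_k=2):
    """Run every query against every retriever, then merge by best score."""
    results = {}
    for retrieve in retrievers:
        for q in queries:
            for doc, score in retrieve(q):
                # Keep the best score seen for each document.
                results[doc] = max(score, results.get(doc, 0.0))
    ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

# Two stand-in retrievers (think "bm25" and "vector").
bm25 = lambda q: [("doc_a", 0.9), ("doc_b", 0.5)]
vector = lambda q: [("doc_b", 0.8), ("doc_c", 0.4)]

queries = ["q1", "q2", "q3"]         # 3 generated queries
top = fuse([bm25, vector], queries)  # 3 queries x 2 retrievers = 6 calls
print(top)  # [('doc_a', 0.9), ('doc_b', 0.8)]
```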
```python
DEFAULT = "default"
SPARSE = "sparse"
HYBRID = "hybrid"
TEXT_SEARCH = "text_search"
SEMANTIC_HYBRID = "semantic_hybrid"
```
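For reference, those values behave like a standard Python string-valued Enum; a sketch mirroring the names above (not the actual LlamaIndex class definition):

```python
from enum import Enum

class VectorStoreQueryMode(str, Enum):
    """Query modes as listed above (a sketch, not the real class)."""
    DEFAULT = "default"
    SPARSE = "sparse"
    HYBRID = "hybrid"
    TEXT_SEARCH = "text_search"
    SEMANTIC_HYBRID = "semantic_hybrid"

# Look a mode up by its string value, e.g. when reading it from config:
mode = VectorStoreQueryMode("hybrid")
print(mode is VectorStoreQueryMode.HYBRID)  # True
```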
```python
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# loads BAAI/bge-small-en
# embed_model = HuggingFaceEmbedding()

# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

index = VectorStoreIndex.from_documents(documents)
```
I can turn on show_progress, but that just shows the nodes' progress. Is it possible to print one of the fields/metadata fields so I can see which file/node is being processed? I have a file that is causing failures, but I'm not sure which one, and even with full logging turned on I couldn't find it.

create-llama
for a starting point? I did for a quick test of making a backend and it worked great! I'm looking for some advice: what's the easiest way to add the nodes and corresponding metadata that were returned? I want to maintain the streaming response, and I don't think I can modify index.as_chat_engine... Just build a custom chat message object like this?

SimpleDirectoryReader
chunks a PDF into nodes by default (one node per page). How do I control the node sizing? Do I need to call out each file type explicitly, or can I use something like SimpleNodeParser to override the defaults?
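Node size is usually governed by the parser's chunk_size and chunk_overlap settings rather than by file type. A toy character-level splitter (not the LlamaIndex implementation, just the idea behind those two knobs):

```python
# Toy character-level splitter illustrating chunk_size / chunk_overlap,
# the two knobs a node parser typically exposes.
# This is a sketch, not how LlamaIndex actually splits text.

def split_text(text, chunk_size=16, chunk_overlap=4):
    """Slice text into overlapping chunks of at most chunk_size chars."""
    step = chunk_size - chunk_overlap  # advance this far per chunk
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=2)
print(chunks)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz', 'yz']
```

Real splitters work on tokens or sentences rather than raw characters, but the overlap mechanism is the same: each chunk repeats the tail of the previous one so context isn't lost at the boundary.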