asti009asti
Joined September 25, 2024

Nodes

Hi, I have a couple of questions on nodes' metadata:

  1. Is there a way to evaluate how different metadata variables affect the cosine similarity scores of nodes? I was thinking of building a correlation matrix for this, but if something is already available, I would appreciate a hint.
  2. Can somebody please explain how LLM-visible metadata works? Is it sent along with the query and contexts, like contexts = [[node1_text, node1_metadata], [node2_text, node2_metadata], etc.]? I wonder how the LLM decides on the 'weight' of the metadata provided to help improve the answer.
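
The correlation idea in the first question can be sketched roughly as follows. This assumes that, for each retrieved node, the cosine similarity score and the metadata variables have been recorded as numbers. The record layout, the variable names `has_summary` and `num_keywords`, and all values are made up for illustration; this is not a built-in LlamaIndex feature.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable carries no correlation signal
    return cov / sqrt(var_x * var_y)


def metadata_score_correlations(records, variables):
    """Correlate each numeric metadata variable with the similarity score."""
    scores = [r["score"] for r in records]
    return {v: pearson([float(r[v]) for r in records], scores) for v in variables}


# Hypothetical retrieval log: one record per retrieved node.
records = [
    {"score": 0.82, "has_summary": 1, "num_keywords": 5},
    {"score": 0.78, "has_summary": 1, "num_keywords": 2},
    {"score": 0.61, "has_summary": 0, "num_keywords": 4},
    {"score": 0.55, "has_summary": 0, "num_keywords": 1},
]
correlations = metadata_score_correlations(records, ["has_summary", "num_keywords"])
```

Correlation only shows association. A more direct check would be to embed the same nodes with one metadata key excluded (for example via the node's `excluded_embed_metadata_keys`) and compare the resulting scores.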
2 comments
Hi, does anybody have a working example of PydanticProgramExtractor? There might be something obvious missing. I was following an example in the docs for 0.10.15 and am getting a Field required [type=missing, input_value={}, input_type=dict] error.

def run_metadata_pipeline(nodes, node_parser):
    openai_program = OpenAIPydanticProgram.from_defaults(
        output_cls=NodeMetadata,
        prompt_template_str="{input}",
        # extract_template_str=EXTRACT_TEMPLATE_STR
    )
    program_extractor = PydanticProgramExtractor(
        program=openai_program,
        input_key="input",
        show_progress=True,
        metadata_mode=MetadataMode.EMBED,
        workers=3,
    )
    extractor = [
        QuestionsAnsweredExtractor(questions=3, metadata_mode=MetadataMode.EMBED, workers=3),
        SummaryExtractor(summaries=["prev", "self", "next"], workers=3),
    ]
    pipeline = IngestionPipeline(transformations=[program_extractor])
    return pipeline.run(nodes=nodes, in_place=False, show_progress=True)


Here's my NodeMetadata class:

class NodeMetadata(BaseModel):
    """Node metadata."""

    description: str = Field(
        ..., description="A concise one sentence description of what this text chunk useful for."
    )
    terms: List[str] = Field(
        ..., description="a list of keywords used in this text chunk"
    )

The thing is that it works for a while, but the error is thrown while processing about 1-2% of the nodes. If you are aware of a bug in this regard, please let me know. Thanks.
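
The error text reports `input_value={}`, which suggests that for those nodes validation received an empty result with no fields at all. One possible workaround, sketched here under that assumption and not confirmed as the fix, is to give every field a default so an empty result validates, then find and retry the affected nodes afterwards. `LenientNodeMetadata` and `is_empty` are hypothetical names; only pydantic is needed to run the sketch.

```python
from typing import List

from pydantic import BaseModel, Field


class LenientNodeMetadata(BaseModel):
    """Node metadata whose fields fall back to empty values."""

    description: str = Field(
        default="",
        description="A concise one sentence description of what this text chunk is useful for.",
    )
    terms: List[str] = Field(
        default_factory=list,
        description="A list of keywords used in this text chunk.",
    )


def is_empty(meta: LenientNodeMetadata) -> bool:
    """True when extraction returned nothing usable for a node."""
    return not meta.description and not meta.terms


# An empty payload, as in the reported error, now validates instead of raising.
empty = LenientNodeMetadata(**{})
filled = LenientNodeMetadata(description="Explains node metadata.", terms=["metadata"])
```

With the required fields (`...`) of the original class, the first construction above would raise the same "Field required" error; with defaults, `is_empty` can flag those nodes for a second pass instead.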
7 comments
Hi, does anyone know if LlamaIndex supports metadata matching by semantic similarity? KeywordTableIndex does this by checking strings for equality, while in certain cases cosine similarity works better. I would appreciate a couple of ideas. I currently do this by building a separate index from the metadata items and doing both the similarity check and the node matching in a custom retriever, but the complexity grows quickly when there are more items to match.
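
To make the contrast with string equality concrete, here is a minimal sketch of matching a query against metadata values by cosine similarity. It assumes an embedding for each metadata value has already been computed by some embedding model; the three-dimensional vectors, the `match_metadata` helper, and the 0.8 threshold are made up for illustration and are not part of LlamaIndex.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_metadata(query_vec, value_vecs, threshold=0.8):
    """Return (value, similarity) pairs at or above the threshold, best first."""
    hits = [(value, cosine_similarity(query_vec, vec)) for value, vec in value_vecs.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy vectors standing in for real embeddings of metadata values.
value_vecs = {
    "automobile": [0.9, 0.1, 0.0],
    "car": [0.85, 0.2, 0.05],
    "banana": [0.0, 0.1, 0.95],
}
query_vec = [0.88, 0.15, 0.02]  # pretend embedding of the query term "vehicle"
matches = match_metadata(query_vec, value_vecs)
matched_values = [value for value, _ in matches]
```

An equality check on "vehicle" would match none of these values, while the similarity check keeps the two related ones. The matched values could then drive node selection in a custom retriever, as described in the post.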
1 comment