How are you loading your docs?
Which has a metadata field
But, for whatever reason, metadata is not a field in the TextNode schema
Rather, extra_info, the deprecated field, is showing up and is populated in responses
It's weird too because clearly the metadata is there
Are you passing metadata in as input when loading docs?
no, this is from a vector db
the metadata is quite literally there
and yet when serializing to json, gone
TextNode comes from: from llama_index.core.schema import TextNode, NodeWithScore
is the answer
part of your metadata then?
that's one of the keys in the metadata object yep
I'm not having the same issue and i'm using the same version
let me try manually serialize before fastapi does it
def load_data(self, metadata, text):
    return Document(text=text, metadata=metadata)
...
for child_node in child_nodes:
    print(child_node.metadata)
all_child_nodes.extend(child_nodes)
{'date': '2024-02-18 w07', 'url': 'hxxps://www.example', 'section': 'threat intelligence/hunting', 'source': 'phishlabs', 'title': 'phishingasaservice profile labhost threat actor group'}
or try passing child_node.to_json()
to serialize it to json before returning it
so i serialized it and then returned a dict instead and same issue
and it still is not in the schema, but it is in the model
seems to be the same deal
print(child_node.metadata)
data_json = child_node.to_json() # str
data = json.loads(data_json) # dictionary
print(json.dumps(data['metadata']))
response: IndexResponse = await group.search(config)
child_node = response.nodes[0].node
print(child_node.metadata)
data_json = child_node.to_json() # str
print("got", data_json)
data = json.loads(data_json) # dictionary
print("got2", data)
print(json.dumps(data['metadata']))
ok, so, maybe a fastapi issue...
can you print:
print(type(response), response)
print(type(response.nodes), response.nodes)
print(type(response.nodes[0]), response.nodes[0])
print(type(response.nodes[0].node), response.nodes[0].node)
print(type(response.nodes[0].node.metadata), response.nodes[0].node.metadata)
response: IndexResponse = await group.search(config)
print(type(response), response)
print(type(response.nodes), response.nodes)
print(type(response.nodes[0]), response.nodes[0])
print(type(response.nodes[0].node), response.nodes[0].node)
print(type(response.nodes[0].node.metadata), response.nodes[0].node.metadata)
and this won't work?
meh = json.dumps(response.nodes[0].node.metadata)
for reference, here is a response generated by self.ResponseModel() and returned
and that was produced with ResponseModel
# Success!
d = self.ResponseModel(
    code="SUCCESS",
    status=200,
    message="Successfully retrieved the index group info.",
    response=response.dict()
)
print(d.json())
return d
this seems like it simply must be a fastapi-related issue and not a llama issue directly, but it may be how they are interacting
i've gone and inserted a FastAPI middleware and just before the response i can see that it is producing the incorrect dict missing just the metadata field, but containing the extra_info field
weird... yeah, sorry. not sure I can help with that part. maybe someone else will chime in here for you
could maybe try adding a field:
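# client side: send the metadata as its own field in the JSON body
response = client.post(
    'hxxp://meh[.]com/apistuff',
    json={"metadata": metadata}
)
# server side: PydanticModel.. stands for your own request model with a metadata field
async def your_thing(metadata: PydanticModel..):
    metadata = metadata.metadata
    meh = await do_stuff_with_metadata(metadata)
    return meh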
long thread lol not sure what the issue is here.
You want the metadata from the response object? From the source nodes?
when serializing from a TextNode to JSON, the metadata field is...just gone from the object
I have an extra_info
object which is strange because it's deprecated
And yet even if I print right before I return to FastAPI, it serialized correctly
i thought it was a llamaindex issue, but it may just be a fastapi one
either way--frustrating
well...by un-typing it...i got my metadata back
ok, well, i'm going to consider this resolved
no idea why this works but whatever
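One possible explanation, not confirmed anywhere in this thread: if the metadata field on TextNode is declared with a pydantic alias of extra_info, then anything that serializes the model by alias will emit extra_info and drop the metadata key, while printing or dumping by field name still shows metadata. FastAPI serializes typed responses by alias by default (response_model_by_alias defaults to True), which would fit "prints fine, comes back wrong" and also fit un-typing making it go away. A minimal sketch of that behaviour with plain pydantic follows. AliasedNode and dump are hypothetical names for illustration, and the alias on the real TextNode is an assumption here.

```python
from typing import Any, Dict

from pydantic import BaseModel, Field


class AliasedNode(BaseModel):
    """Hypothetical stand-in for TextNode (not the real LlamaIndex class)."""

    text: str = ""
    # Assumption: the real field is declared roughly like this, i.e. the
    # Python attribute is `metadata` but its alias is `extra_info`.
    metadata: Dict[str, Any] = Field(default_factory=dict, alias="extra_info")


def dump(model: BaseModel, **kwargs: Any) -> Dict[str, Any]:
    """Dump a model to a dict on pydantic v2 (model_dump) or v1 (dict)."""
    dumper = getattr(model, "model_dump", None) or model.dict
    return dumper(**kwargs)


# Populate through the alias, which pydantic accepts by default.
node = AliasedNode(text="hello", extra_info={"source": "phishlabs"})

by_name = dump(node)                  # keys use the attribute names
by_alias = dump(node, by_alias=True)  # keys use the aliases

print(node.metadata)     # attribute access always uses the field name
print(sorted(by_name))   # includes "metadata"
print(sorted(by_alias))  # includes "extra_info" instead of "metadata"
```

If that is what is going on, keeping the typed response and passing response_model_by_alias=False on the route decorator might be another thing to try, instead of un-typing.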