Updated 11 months ago

Spraying a Thread

At a glance
Having a weird problem, was wondering if this is a bug or something
68 comments
How are you loading your docs?
Which has a field metadata
But, for whatever reason, metadata is not a field in the TextNode schema
Rather, extra_info, the deprecated field, is showing up and is populated in responses
It's weird too because clearly the metadata is there
Are you passing metadata in as input when loading docs?
no, this is from a vector db
the metadata is quite literally there
and yet when serializing to json, gone
TextNode comes from: from llama_index.core.schema import TextNode, NodeWithScore
is the answer part of your metadata then?
that's one of the keys in the metadata object yep
I'm not having the same issue and i'm using the same version
let me try manually serializing it before fastapi does
Plain Text
    def load_data(self, metadata, text):
        return Document(text=text, metadata=metadata)


...

Plain Text
            for child_node in child_nodes:
                print(child_node.metadata)

            all_child_nodes.extend(child_nodes)


Plain Text
{'date': '2024-02-18 w07', 'url': 'hxxps://www.example', 'section': 'threat intelligence/hunting', 'source': 'phishlabs', 'title': 'phishingasaservice profile labhost threat actor group'}
or try passing child_node.to_json()
to serialize it to json before returning it
Plain Text
print(response.json())
and yet not in the reply
so i serialized it and then returned a dict instead and same issue
and it still is not in the schema, but it is in the model
seems to be the same deal
print(child_node.metadata)
data_json = child_node.to_json() # str
data = json.loads(data_json) # dictionary
print(json.dumps(data['metadata']))
as i think you suspected
Plain Text
        response: IndexResponse = await group.search(config)
        child_node = response.nodes[0].node
        print(child_node.metadata)
        data_json = child_node.to_json()  # str
        print("got", data_json)
        data = json.loads(data_json)  # dictionary
        print("got2", data)
        print(json.dumps(data['metadata']))
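The manual round trip above can be wrapped in a small helper. This is only a sketch: `FakeNode` is a hypothetical stand-in for a TextNode so the snippet runs without LlamaIndex, and `node_to_plain_dict` is a made-up name. The only thing assumed of the real node is the `to_json()` method already used in this thread.

```python
import json


class FakeNode:
    """Hypothetical stand-in for a TextNode; it only mimics to_json()."""

    def __init__(self, text, metadata):
        self.text = text
        self.metadata = metadata

    def to_json(self):
        # Returns a JSON string, like child_node.to_json() in the thread.
        return json.dumps({"text": self.text, "metadata": self.metadata})


def node_to_plain_dict(node):
    """Round-trip a node through JSON so only plain Python types remain."""
    data = json.loads(node.to_json())
    if "metadata" not in data:
        # Fail loudly here instead of finding the key missing in the HTTP reply.
        raise KeyError("metadata missing after serialization")
    return data


node = FakeNode("some text", {"source": "phishlabs", "section": "threat intelligence/hunting"})
plain = node_to_plain_dict(node)
print(plain["metadata"])
```

Returning a plain dict like this means whatever serializes the HTTP response has no model schema to consult for that value.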
ok, so, maybe a fastapi issue...
can you print:
Plain Text
print(type(response), response)
print(type(response.nodes), response.nodes)
print(type(response.nodes[0]), response.nodes[0])
print(type(response.nodes[0].node), response.nodes[0].node)
print(type(response.nodes[0].node.metadata), response.nodes[0].node.metadata)
Plain Text
        response: IndexResponse = await group.search(config)

        print(type(response), response)
        print(type(response.nodes), response.nodes)
        print(type(response.nodes[0]), response.nodes[0])
        print(type(response.nodes[0].node), response.nodes[0].node)
        print(type(response.nodes[0].node.metadata), response.nodes[0].node.metadata)
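The same type check can be done with a loop instead of five print lines. A minimal sketch, assuming nothing about the real IndexResponse: `describe` is a hypothetical helper and the `SimpleNamespace` object is just a stand-in with the same shape as `response.nodes[0].node.metadata`.

```python
from types import SimpleNamespace


def describe(obj, path):
    """Walk an attribute/index path and return (step, type name) for each hop."""
    current = obj
    report = [("<root>", type(current).__name__)]
    for step in path:
        # Integers index into sequences; strings are attribute names.
        current = current[step] if isinstance(step, int) else getattr(current, step)
        report.append((str(step), type(current).__name__))
    return report


# Hypothetical stand-in for the response object in this thread.
response = SimpleNamespace(
    nodes=[SimpleNamespace(node=SimpleNamespace(metadata={"source": "phishlabs"}))]
)

for step, type_name in describe(response, ["nodes", 0, "node", "metadata"]):
    print(step, type_name)
```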
and this won't work?

meh = json.dumps(response.nodes[0].node.metadata)
for reference, here is a response generated by self.ResponseModel() and returned
and that was produced with ResponseModel
when i do the following:
Plain Text
        # Success!
        d = self.ResponseModel(
            code="SUCCESS",
            status=200,
            message="Successfully retrieved the index group info.",
            response=response.dict()
        )

        print(d.json())
        return d
this seems like it simply must be a fastapi-related issue and not a llama issue directly, but it may be how they are interacting
i've gone and inserted a FastAPI middleware and just before the response i can see that it is producing the incorrect dict missing just the metadata field, but containing the extra_info field
weird... yeah, sorry. not sure I can help with that part. maybe someone else will chime in here for you
could maybe try adding a field:

Plain Text
response = client.post(
    'hxxp://meh[.]com/apistuff',
    json={"metadata": metadata},
)
Plain Text
# PydanticModel is a placeholder for your own model with a `metadata` field
async def your_thing(metadata: PydanticModel):
    metadata = metadata.metadata

    meh = await do_stuff_with_metadata(metadata)

    return meh
long thread lol not sure what the issue is here.

You want the metadata from the response object? From the source nodes?
when serializing from a TextNode to JSON, the metadata field is...just gone from the object
it isn't in the schema
I have an extra_info object which is strange because it's deprecated
And yet even if I print right before I return to FastAPI, it serializes correctly
i thought it was a llamaindex issue, but it may just be a fastapi one
either way--frustrating 😂
well...by un-typing it...i got my metadata back
[Attachment: image.png]
ok, well, i'm going to consider this resolved
no idea why this works but whatever
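One possible explanation, which nobody in the thread confirmed: if the node model declares `metadata` with an alias of `extra_info`, and the typed response is dumped by alias, the key would be renamed on the way out, while an un-typed return would keep the field name. The toy below only illustrates that effect with the standard library. It is not LlamaIndex or FastAPI code; `ToyNode` and `dump` are hypothetical, and whether the real TextNode has such an alias or FastAPI dumps by alias in this setup are assumptions to check against the installed versions.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ToyNode:
    """Hypothetical model: the 'metadata' field carries an alias of 'extra_info'."""

    text: str
    metadata: dict = field(default_factory=dict, metadata={"alias": "extra_info"})


def dump(obj, by_alias):
    """Serialize to a dict, using each field's alias as the key when by_alias is True."""
    out = {}
    for f in fields(obj):
        key = f.metadata.get("alias", f.name) if by_alias else f.name
        out[key] = getattr(obj, f.name)
    return out


node = ToyNode(text="hello", metadata={"source": "phishlabs"})
typed_style = dump(node, by_alias=True)     # key comes out as 'extra_info'
untyped_style = dump(node, by_alias=False)  # key stays 'metadata'
print(sorted(typed_style), sorted(untyped_style))
```

If that is what is happening, the data was never lost, only renamed, which would match the reply that had extra_info populated and metadata absent.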