@Logan M, I tried your method to exclude the metadata from the document embeddings.
However, when I try to embed my sub-nodes (I'm trying to build a small-to-big retriever), I still get this message:
Metadata length (108) is close to chunk size (128). Resulting chunks are less than 50 tokens. Consider increasing the chunk size or decreasing the size of your metadata to avoid this.
Any idea why, and what I can do about it, please?
I even tried excluding all the metadata keys from the embedding, see:
sub_nodes[0].excluded_embed_metadata_keys
Out[50]:
['reference',
'url',
'issu_jurisprudence',
'cour',
'type_chambre',
'chambre',
'date_execution',
'reference_jurisprudence',
'publie',
'origine',
'date_execution_date_format',
'url']
sub_nodes[0].metadata
Out[51]:
{'reference': 'Cour de cassation, civile, Chambre civile 2, 6 janvier 2022, 20-12.220, Inédit',
'url': 'https://www.legifrance.gouv.fr/juri/id/JURITEXT000045009675',
'issu_jurisprudence': 'rejet',
'cour': 'cour de cassation',
'type_chambre': 'civile 2',
'chambre': 'civile',
'date_execution': '6 janvier 2022',
'reference_jurisprudence': '20-12.220',
'publie': 'Inédit',
'origine': 'jurisprudence judiciaire',
'date_execution_date_format': '2022-01-06T00:00:00'}
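To make clear what I expected, here is a minimal plain-Python sketch of the filtering I assumed would happen. This is not LlamaIndex's actual code: `metadata_for_embedding` is a hypothetical helper, and the `key: value` line format is an assumption. With every key excluded, I would expect the metadata text used for embedding to be empty, so the warning about a metadata length of 108 surprises me.

```python
def metadata_for_embedding(metadata, excluded_keys):
    """Hypothetical helper: build the metadata text left after exclusions."""
    excluded = set(excluded_keys)  # duplicates such as a repeated 'url' are harmless
    return "\n".join(
        f"{key}: {value}"
        for key, value in metadata.items()
        if key not in excluded
    )


# A shortened version of the metadata shown above.
metadata = {
    "reference_jurisprudence": "20-12.220",
    "cour": "cour de cassation",
    "url": "https://www.legifrance.gouv.fr/juri/id/JURITEXT000045009675",
}
excluded_embed_keys = ["reference_jurisprudence", "cour", "url", "url"]

embed_text = metadata_for_embedding(metadata, excluded_embed_keys)
print(repr(embed_text))  # nothing is left once every key is excluded
```

If that assumption is right, the length reported in the warning must be computed from something other than the embedding view of the metadata.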