Updated 5 months ago

👋 Is there a way to change the default

At a glance

The community members discuss how to customize the default templates in schema.py, specifically the DEFAULT_TEXT_NODE_TMPL. A community member provides an example of how to create a customized Document object with specific metadata and templates. The discussion then explores the differences between Document and Node objects, and the community members conclude that there is nearly no difference between them. They suggest that the community member can either instantiate new Node objects with the desired customization or modify the existing nodes. Finally, the community members provide guidance on how to use the VectorStoreIndex class to build an index from the customized nodes.

👋 Is there a way to change the default templates in schema.py (specifically DEFAULT_TEXT_NODE_TMPL)? The template is used by the get_content method on the TextNode.
20 comments
document = Document(
    text="This is a super-customized document",
    metadata={
        "file_name": "super_secret_document.txt",
        "category": "finance",
        "author": "LlamaIndex",
    },
    excluded_llm_metadata_keys=["file_name"],
    metadata_seperator="::",
    metadata_template="{key}=>{value}",
    text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
)
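For anyone wondering what those template arguments actually produce, here is a rough plain-Python sketch of how the pieces appear to compose. It mimics the behavior rather than calling LlamaIndex: `render_content` is a hypothetical helper, its default values are assumptions, and the real logic lives in the library's get_content method.

```python
# Hypothetical stand-in for how a node's templates are combined.
# This does NOT call LlamaIndex; names and defaults here are assumptions.
def render_content(
    text,
    metadata,
    excluded_keys=(),
    metadata_separator="\n",
    metadata_template="{key}: {value}",
    text_template="{metadata_str}\n\n{content}",
):
    # Drop metadata keys that should be hidden from the LLM.
    visible = {k: v for k, v in metadata.items() if k not in excluded_keys}
    # Format each key/value pair, then join the pairs with the separator.
    metadata_str = metadata_separator.join(
        metadata_template.format(key=k, value=v) for k, v in visible.items()
    )
    # Slot the metadata string and the raw text into the outer template.
    return text_template.format(metadata_str=metadata_str, content=text)


rendered = render_content(
    text="This is a super-customized document",
    metadata={
        "file_name": "super_secret_document.txt",
        "category": "finance",
        "author": "LlamaIndex",
    },
    excluded_keys=["file_name"],
    metadata_separator="::",
    metadata_template="{key}=>{value}",
    text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
)
print(rendered)
```

With the values from the example above, file_name is filtered out and the remaining pairs are joined with "::" before being placed into the text template.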
Perfect! But wait, what's the difference between a document and a node? I thought I was working with nodes here.
oh..... I think I see. So is this something I'd have to connect to SimpleDirectoryReader?
there is nearly zero difference between a document and node
mostly just naming/perception lol
the classes are nearly identical
so when I iterate through all of my nodes after pulling them out of a PDF, should I just instantiate new Nodes with all of this customization added? I'm guessing I could do:
node = TextNode(
  text="blah blah",
...
)
Yea you can do that! Or you can just modify the existing nodes if you have them
node.text_template = "..."
of course I can. How did I miss that! Thanks!
Ok haha followup question! VectorStoreIndex().build_index_from_nodes(nodes) returns an IndexDict type object but I really need the VectorStoreIndex. Should I just be passing nodes to .from_documents(nodes) instead?
Except no that doesn't work because .from_documents expects the list items to have .get_doc_id
Use VectorStoreIndex(nodes, ...)
oh ok I thought I'd still need to call some sort of processing method but I guess not
build_index_from_nodes() is actually called from the base constructor 👍
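To illustrate why no extra processing call is needed, here is a toy sketch of the pattern being described: a base class whose constructor calls the build method itself. The `ToyBaseIndex` and `ToyVectorIndex` classes are hypothetical stand-ins, not LlamaIndex code, and the dict they build is only a placeholder for a real index structure.

```python
# Toy illustration of "the base constructor calls the build method".
# These classes are hypothetical and are not part of LlamaIndex.
class ToyBaseIndex:
    def __init__(self, nodes):
        # The constructor does the building, so callers never invoke
        # build_index_from_nodes() themselves.
        self.index_struct = self.build_index_from_nodes(nodes)

    def build_index_from_nodes(self, nodes):
        raise NotImplementedError


class ToyVectorIndex(ToyBaseIndex):
    def build_index_from_nodes(self, nodes):
        # Placeholder structure: map a position to each node.
        return {i: node for i, node in enumerate(nodes)}


index = ToyVectorIndex(["node a", "node b"])
print(type(index).__name__)  # the index object itself, not the inner struct
print(index.index_struct)
```

Calling the build method directly returns only the inner structure, which matches the IndexDict confusion above; constructing the index hands back the index object with that structure already attached.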