The community member is asking whether it is necessary to create Documents if their preprocessing pipeline already outputs chunks, or whether they can instead create Nodes directly and insert them into the index, or create a Document and put Nodes inside it.
The comments indicate that Documents can be customized, and that the community member can also create the Nodes themselves and insert them. The comments suggest making sure the Nodes are short enough to fit into the LLM's context window.
Is it necessary to create Documents if my preprocessing pipeline outputs chunks? i.e., I have some unique data type that is input into my preprocessing pipeline, and that pipeline outputs chunks of each data sample with associated metadata for each chunk. Can I just create Nodes and insert those Nodes into my index? Better yet, can I create a Document and put Nodes inside of it?
In addition, you can create the nodes yourself and insert them. You'll get better results if you make sure the nodes are short enough to fit into the LLM's context window, though
Python
from llama_index.schema import TextNode
node = TextNode(text="..")