
Chapterwise nodes

At a glance

The community members are discussing how to create a book embedding where each chapter is a separate node, so that a user can query for a summary of a specific chapter. The main points are:

- Creating a single node per chapter may not be feasible due to token limits, so a set of nodes with defined relations for each chapter is suggested instead.

- The community members discuss using metadata or the DocumentSummaryIndex to help with queries like "summarize chapter 3".

- There is a discussion around using a vector database and a retriever to get the relevant nodes based on metadata filters, but there are concerns about the token limit still being an issue.

There is no explicitly marked answer in the comments.

Hi all, if I am embedding a book, is there a way to make each chapter a node, so that when I ask "summarize chapter x" it works through this node only? Would appreciate guidance. Thank you!
11 comments
I wouldn't recommend a single node for each chapter, as chapters can be very long and there's a high chance the LLM token limit will be exceeded.

You can create a set of nodes for each chapter and define relationships between them. Still, it may not be possible for all of the nodes to be used when creating the summary.

If the chapters are small, you can follow this link and customise the nodes or Document objects as per your requirement:
https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/usage_pattern.html#basic-usage-pattern
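As a rough sketch of that pattern: one Document per chapter, with the chapter name in its metadata, split into smaller nodes. The chapters dict, the "chapter" key, and the chunk size are placeholders, and exact import paths and parameter names vary between llama_index releases.
Python
from llama_index import Document, VectorStoreIndex
from llama_index.node_parser import SimpleNodeParser

# Placeholder: in practice each value is the full text of one chapter.
chapters = {
    "Chapter 1": "Text of chapter 1 ...",
    "Chapter 2": "Text of chapter 2 ...",
}

# One Document per chapter, tagged with the chapter name in its metadata.
documents = [
    Document(text=text, metadata={"chapter": name})
    for name, text in chapters.items()
]

# Split each chapter into smaller nodes; every node inherits the chapter
# metadata, so a later query can be restricted to a single chapter.
parser = SimpleNodeParser.from_defaults(chunk_size=512, chunk_overlap=50)
nodes = parser.get_nodes_from_documents(documents)

index = VectorStoreIndex(nodes)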
Thank you, the chapters are not small, 3-4 pages of text each. Will the basic usage pattern be enough to address queries like "summarize chapter 3", for example?
Not entirely. You can try adding metadata to each chapter, or try DocumentSummaryIndex along with metadata. That may help with queries like these.
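A minimal sketch of what the DocumentSummaryIndex route could look like, reusing the per-chapter documents from above; tree_summarize is used so a chapter does not have to fit into a single LLM call (class and parameter names may differ between llama_index versions).
Python
from llama_index import DocumentSummaryIndex
from llama_index.response_synthesizers import get_response_synthesizer

# Build one summary per document (here, per chapter). tree_summarize works
# hierarchically, so a chapter longer than the context window is handled in pieces.
response_synthesizer = get_response_synthesizer(response_mode="tree_summarize")
doc_summary_index = DocumentSummaryIndex.from_documents(
    documents,  # the per-chapter Documents with chapter metadata
    response_synthesizer=response_synthesizer,
)

query_engine = doc_summary_index.as_query_engine(response_mode="tree_summarize")
print(query_engine.query("Summarize chapter 3"))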
Are you using a vector db, @MitchMcD?
Will it help @bmax to tackle this scenario?
I was just wondering. I was thinking that if he's using a vector db, he could do what you said: put the chapter in each node's metadata, use a retriever to get all of those nodes, and then pass them into DocumentSummaryIndex.
trying to figure out how to do it myself lol
like can you just do
Python
from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(filters=[ExactMatchFilter(key="name", value="Chapter 1")])
retriever = index.as_retriever(filters=filters)
retriever.retrieve()

to get all nodes
but you can't do an empty retrieve(), so wondering how
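One possible workaround, as a sketch only: skip the retriever and pull the chapter's nodes straight out of the index's docstore by their metadata, then summarize them with tree_summarize. The "name" key matches the filter above; docstore access and metadata attributes vary a bit between llama_index versions.
Python
from llama_index.response_synthesizers import get_response_synthesizer
from llama_index.schema import NodeWithScore

# Instead of an empty retrieve(), take every node in the docstore whose
# metadata marks it as belonging to the chapter.
chapter_nodes = [
    node
    for node in index.docstore.docs.values()
    if node.metadata.get("name") == "Chapter 1"
]

# tree_summarize condenses the nodes hierarchically, so the whole chapter
# does not have to fit into a single LLM context window.
synthesizer = get_response_synthesizer(response_mode="tree_summarize")
response = synthesizer.synthesize(
    "Summarize this chapter",
    nodes=[NodeWithScore(node=n, score=1.0) for n in chapter_nodes],
)
print(response)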
Yeah, but during response generation all of the nodes may not get used if the combined node length crosses the token limit, that's what I'm thinking. So the response may end up looking half-cooked.
that's the plan