----------------------
To generate questions for your nodes under the metadata key "questions_this_excerpt_can_answer", you can use the QuestionsAnsweredExtractor together with the MetadataExtractor from the node parser extractor modules in LlamaIndex. Here is a step-by-step guide:
- Import the necessary modules:
from llama_index.node_parser import SimpleNodeParser
from llama_index.node_parser.extractors import (
    MetadataExtractor,
    QuestionsAnsweredExtractor,
)
from llama_index.text_splitter import TokenTextSplitter
- Initialize the TokenTextSplitter and MetadataExtractor:
text_splitter = TokenTextSplitter(separator=" ", chunk_size=512, chunk_overlap=128)
metadata_extractor = MetadataExtractor(
    extractors=[
        QuestionsAnsweredExtractor(questions=3),
    ],
)
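As a rough intuition for what chunk_size=512 and chunk_overlap=128 mean, here is a minimal pure-Python sketch of overlapping windows. It is a simplification, not LlamaIndex's implementation: it treats whitespace-separated words as tokens, whereas the real TokenTextSplitter counts tokenizer tokens, and the helper name sliding_window_chunks is hypothetical.

```python
def sliding_window_chunks(text, chunk_size=512, chunk_overlap=128, separator=" "):
    """Split text into overlapping chunks of at most chunk_size pieces.

    Simplifying assumption: a "token" is a separator-delimited word.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    tokens = text.split(separator)
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(separator.join(window))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# 1000 synthetic words: windows start at 0, 384 and 768.
words = " ".join(f"w{i}" for i in range(1000))
chunks = sliding_window_chunks(words)
```

Each chunk shares its first 128 words with the tail of the previous one, so a sentence that straddles a chunk boundary still appears whole in at least one chunk.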
- Create a SimpleNodeParser and extract nodes from your documents:
node_parser = SimpleNodeParser.from_defaults(
    text_splitter=text_splitter,
    metadata_extractor=metadata_extractor,
)
# assume documents are defined -> extract nodes
nodes = node_parser.get_nodes_from_documents(documents)
This will generate questions for each node and store them in the metadata field "questions_this_excerpt_can_answer".
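To check the result, you can read that key back from each node. The sketch below assumes each node exposes a metadata dict and that the generated questions are stored as a single string; StubNode and collect_questions are hypothetical stand-ins so the example runs without LlamaIndex or an LLM call.

```python
from dataclasses import dataclass, field

QUESTIONS_KEY = "questions_this_excerpt_can_answer"


@dataclass
class StubNode:
    """Hypothetical stand-in for a node: text plus a metadata dict."""
    text: str
    metadata: dict = field(default_factory=dict)


def collect_questions(nodes):
    """Map each node's position to its questions string ("" if the key is missing)."""
    return {i: node.metadata.get(QUESTIONS_KEY, "") for i, node in enumerate(nodes)}


# Made-up example data; real values would come from the extractor.
nodes = [
    StubNode("Chunk about text splitting.",
             metadata={QUESTIONS_KEY: "1. What is chunk overlap?\n2. Why split text?"}),
    StubNode("Chunk with no extracted questions."),
]

for index, questions in collect_questions(nodes).items():
    print(f"node {index}: {questions or '(no questions found)'}")
```

With real nodes, the same loop over node.metadata should work as long as the extractor ran; an empty result usually means the metadata extractor was not passed to the node parser.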
For more details, you can refer to the LlamaIndex documentation.
----------------------