To parse documents into nodes in llama-index (formerly gpt_index), you can use the NodeParser classes. Here's an example using the SimpleNodeParser:

```python
from llama_index.node_parser import SimpleNodeParser

parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)
```
Here, documents is a list of Document objects, and get_nodes_from_documents parses them into Node objects. You can also construct Node objects manually and set relationships between them:

```python
from llama_index.data_structs.node_v2 import Node, DocumentRelationship

node1 = Node(text="<text_chunk>", doc_id="<node_id>")
node2 = Node(text="<text_chunk>", doc_id="<node_id>")

# set relationships
node1.relationships[DocumentRelationship.NEXT] = node2.get_doc_id()
node2.relationships[DocumentRelationship.PREVIOUS] = node1.get_doc_id()
```
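Conceptually, a node parser splits each document's text into chunks and wraps each chunk as a node. Here is a rough plain-Python sketch of that splitting step (this is an illustrative stand-in, not the library's actual implementation):

```python
def split_into_chunks(text, chunk_size=20):
    """Greedily pack whole words into chunks of at most chunk_size characters."""
    chunks, current = [], ""
    for word in text.split():
        candidate = (current + " " + word).strip()
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks

# Each resulting chunk would become the text of one node
chunks = split_into_chunks("one two three four five six", chunk_size=13)
```

The real parsers also handle token-based sizing and chunk overlap, but the chunk-then-wrap pattern is the same.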
Note: you can ask me a follow up question by @ mentioning me again
If node1 is the first node, you don't need to assign a PREVIOUS relationship, since there is no previous node. Similarly, if node2 is the last node, you don't need to assign a NEXT relationship, since there is no next node. Relationships are only assigned when they exist. Here's an example:

```python
from llama_index.data_structs.node_v2 import Node, DocumentRelationship

node1 = Node(text="<text_chunk>", doc_id="<node1_id>")
node2 = Node(text="<text_chunk>", doc_id="<node2_id>")

# Set relationships only when they exist

# node1 is the first node, so no PREVIOUS relationship is assigned
node1.relationships[DocumentRelationship.NEXT] = node2.get_doc_id()

# node2 is the last node, so no NEXT relationship is assigned
node2.relationships[DocumentRelationship.PREVIOUS] = node1.get_doc_id()
```
Here, node1 is the first node, so it only has a NEXT relationship pointing to node2, and node2 is the last node, so it only has a PREVIOUS relationship pointing to node1.
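For longer documents, the same pattern generalizes to a loop over the chunk list. A minimal plain-Python sketch of the chaining logic (using a hypothetical stand-in Node class rather than the llama_index one, so the wiring is easy to see):

```python
class Node:
    """Stand-in for llama_index's Node, just for illustration."""
    def __init__(self, text, doc_id):
        self.text = text
        self.doc_id = doc_id
        self.relationships = {}  # e.g. {"next": ..., "previous": ...}

def chain_nodes(nodes):
    """Link each node to its neighbours; endpoint nodes get only one link."""
    for prev, nxt in zip(nodes, nodes[1:]):
        prev.relationships["next"] = nxt.doc_id
        nxt.relationships["previous"] = prev.doc_id
    return nodes

nodes = chain_nodes([Node(f"chunk {i}", f"id_{i}") for i in range(3)])
```

The first node ends up with only a "next" link and the last with only a "previous" link, matching the rule that relationships are assigned only when they exist.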
You can use a None value for a reference document ID when working with the GPT index.
You can pass required_keywords and exclude_keywords when querying the index. This preemptively filters out nodes that do not contain the required keywords or that do contain the excluded keywords, reducing the search space and hence the time, number of LLM calls, and cost:

```python
index.query(
    "What did the author do after Y Combinator?",
    required_keywords=["Combinator"],
    exclude_keywords=["Italy"],
)
```
You can create a custom ServiceContext without an llm_predictor. This prevents token usage information from being printed:

```python
from llama_index import ServiceContext

# Create a custom ServiceContext without an llm_predictor
service_context = ServiceContext.from_defaults(llm_predictor=None)

# Query the index using the custom ServiceContext
response = index.query(
    "What did the author do after his time at Y Combinator?",
    service_context=service_context,
)
```

Here, the ServiceContext is created without an llm_predictor, and the query is performed using this custom ServiceContext, so no token usage information is printed during the query.