Updated 3 months ago

When processing a JSON file https://docs.llamaindex.ai/en/stable/module_guides/loading/node_parsers/modules.html

Can JSONNodeParser be combined with SentenceWindowNodeParser or SemanticSplitterNodeParser?

I'd like to maintain the metadata of the original JSON structure and be able to run vector queries at a low level, but return a larger contextual window for processing with the LLM.
You can definitely chain node parsers, and metadata should be maintained.
Thanks Logan, still having some trouble (just getting my head wrapped around the abstraction here).

Example JSON:
[
{
"start": 0.0,
"end": 3.16,
"speaker": "SPEAKER_01",
"text": " D\u00e4r, nu verkar det vara ig\u00e5ng och du h\u00f6r vad jag s\u00e4ger."
},
{
"start": 3.56,
"end": 8.34,
"speaker": "SPEAKER_01",
"text": " Bra, vi kan v\u00e4l b\u00f6rja lite snabbt med, vi kan bara presentera lite snabbt kanske."
}, ... and so on for thousands of lines

I load the JSON like this:

## Load Option 1 (JSON)
from llama_index.core import (
    VectorStoreIndex,
    ServiceContext,
)

from llama_index.readers.json import JSONReader

documents = JSONReader(ensure_ascii=False).load_data("/content/data/Lajla - Buttle Hembygdsförening_final_results.json")

The loaded docs look good, but there is just a single document.


Then I try to parse to nodes like this:


from llama_index.core.node_parser import JSONNodeParser

parser = JSONNodeParser()

nodes = parser.get_nodes_from_documents(documents)

print(nodes)

## But nodes is just an empty list.

What is the intended workflow for an input JSON like that, where I'd like to maintain some sort of metadata/relationship info?
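
To make the goal concrete, here is a minimal sketch in plain Python (standard library only, so no LlamaIndex API is assumed) of the "search small, return a larger window" idea applied to a transcript shaped like the example above. Each segment becomes one small record that keeps its `start`, `end`, and `speaker` as metadata and also carries the text of its neighbours. The function name `build_window_records`, the `window_size` parameter, the `window` metadata key, and the three sample segments are all placeholders made up for illustration, not part of any library or of the original file.

```python
import json


def build_window_records(segments, window_size=1):
    """Turn transcript segments into small records that keep their own
    metadata and also carry the text of neighbouring segments."""
    records = []
    for i, seg in enumerate(segments):
        # Clamp the window to the bounds of the transcript.
        lo = max(0, i - window_size)
        hi = min(len(segments), i + window_size + 1)
        window = " ".join(s["text"].strip() for s in segments[lo:hi])
        records.append({
            # Small unit of text: this is what would be embedded and searched.
            "text": seg["text"].strip(),
            "metadata": {
                "start": seg["start"],
                "end": seg["end"],
                "speaker": seg["speaker"],
                # Larger context: this is what would be handed to the LLM.
                "window": window,
            },
        })
    return records


# Placeholder data with the same shape as the example JSON (not the real file).
raw = """[
  {"start": 0.0, "end": 3.16, "speaker": "SPEAKER_01", "text": " First line."},
  {"start": 3.56, "end": 8.34, "speaker": "SPEAKER_01", "text": " Second line."},
  {"start": 8.5, "end": 10.0, "speaker": "SPEAKER_02", "text": " Third line."}
]"""

segments = json.loads(raw)
records = build_window_records(segments, window_size=1)

for rec in records:
    print(rec["metadata"]["speaker"], "|", rec["text"], "|", rec["metadata"]["window"])
```

Each record could then be wrapped in whatever document or node object the indexing library expects, with the short `text` used for the vector query and the `window` metadata swapped in before the LLM call. This is only one way to structure it, and whether the built-in node parsers can be chained to get the same result from this JSON is still the open question here.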