Question regarding JSON LOADER / WHISPER

At a glance

Question regarding JSON LOADER / WHISPER TRANSCRIPTS

The goal may be something as simple as retrieving segments (chunks) of the interview/transcript based on thematic codes (Drivers, Solutions, Barriers), summarization tasks, or additional Q&A.

The question revolves around the best doc loading / chunking / embedding strategy given the structure of whisper transcripts. If one wanted to maintain metadata in the doc (segment) level such as "speaker" "confidence" "timestamps", how would one then structure the chunks and embedding to maintain semantic cohesion?

IE we may have 15 lines from a 5000 line interview (whisper JSON file) which should be grouped together:
...
[
Speaker1: Asking a question
Speaker1: Continuing the same question
Speaker1: filler word
Speaker2: Asks clarifying question
Speaker 1: quick answer:
Speaker2: begins answering...
Speaker2: Continues....
Speaker2: Continues...
Speaker1: interrupts with quick clarifier
Speaker 2: continues...
(end of answer)
]
...

What are some methods to employ to isolate these high level question answer pairs from a whisper transcript? How can the JSON loader be employed, or are there best practices around whisper transcript RAG in general?

Thanks 🙂👀

5 comments

LLogan M

To isolate question/answer pairs, you basically need to use an LLM to process the transcript I think

ccablecutter

Maybe I'm an idiot, but I'm having a hard ass time figuring out how nodes / documents can be built without using the abstracted "loaders" and "parsers" and the loaders and parsers are not working the way I'd like them to for my use case.

LLogan M

Plain Text

from llama_index.core.schema import TextNode

node = TextNode(text="text", metadata={"key": "val"})

LLogan M

https://docs.llamaindex.ai/en/stable/module_guides/loading/documents_and_nodes/root.html

LLogan M

Documents and nodes have basically the same API, just a semantic difference in what they represent

Add a reply

Find answers from the community

Question regarding JSON LOADER / WHISPER