
Updated 2 years ago


At a glance
A beginner, so perhaps a silly question. I'm getting it to run on a book I had, and it seems to be copy pasting sentences in full in response to any question. Why would that be?
20 comments
Depends on the prompt; you might need to ask it to summarize or rephrase
if you're asking a summarization question, consider using a GPTListIndex instead of a GPTSimpleVectorIndex (which by default only returns 1 chunk)
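To make the difference concrete, here is a toy sketch (plain Python, not the real gpt_index API) of why the two index types behave differently: a vector index retrieves only the top-k most similar chunks (k=1 by default), so answers tend to quote one chunk back verbatim, while a list index visits every chunk, which suits summaries but costs more time and tokens.

```python
# Toy model of the two query strategies; the function names and the
# length-based "similarity" are made up for illustration.

def vector_index_query(chunks, score, top_k=1):
    """Return only the top_k highest-scoring chunks (default: one)."""
    ranked = sorted(chunks, key=score, reverse=True)
    return ranked[:top_k]

def list_index_query(chunks):
    """Visit every chunk, as a summary query over a list index does."""
    return list(chunks)

chunks = ["intro", "chapter one", "chapter two", "conclusion"]
score = lambda c: len(c)  # stand-in for embedding similarity

print(vector_index_query(chunks, score))  # one chunk -> verbatim-feeling answer
print(len(list_index_query(chunks)))      # all chunks -> slower, more tokens
```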
Cool! Will try both out
Trying the listindex out, though it's taking a looong time!
yeah it will go over all your documents :/ since it's a summary query
Ah fair. Is that gonna take like a shitton of tokens for querying each chunk? Wondering
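Roughly, yes: a list-index summary query makes an LLM call per chunk, so prompt tokens scale linearly with chunk count. A back-of-envelope estimate (all numbers below are made-up assumptions, not measurements):

```python
# Prompt-token estimate for a per-chunk summary pass. Every figure here is an
# assumed example value for illustration only.

num_chunks = 50          # assumed: book split into 50 chunks
tokens_per_chunk = 1024  # assumed chunk size in tokens
prompt_overhead = 100    # assumed prompt-template tokens per call

total_prompt_tokens = num_chunks * (tokens_per_chunk + prompt_overhead)
print(total_prompt_tokens)  # 56200 prompt tokens, before any completion tokens
```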
Is this something that could be added to GPT-Index and makes sense to implement? If it can, I wonder how it'll fare against regular summarization as far as speed of summarization goes.

https://github.com/helliun/targetedSummarization
Need the ability to make a thread within a thread in Discord 🙂
ooh this is interesting. i'll take a look!
Vereeery interesting indeed
Being able to do a first pass in memory before a call to an LLM is miiiiint
Update: this works surprisingly well on the ToS documents I am working with right now.
The reducer is pretty lit at extracting relevant information from a given named section of the document such that it can then be passed to the LLM for inference, though I obviously need to vet it more for accuracy.
I'm trying to think about how this could potentially work with gpt index. You can create targeted summaries in gpt index as well, right? It's just that TextReducer can extract exact sentences from the text, and is maybe faster at summarization since it uses SentenceTransformer
I think the big thing is reducing LLM calls to a minimal level.

That said, ada and curie are real cheap, but if the difference is "free vs cheap" it is just another option in the cost-benefit matrix.

This library's reducer is subtractive, leaving the original text in place and attempting to remove text that is irrelevant to the query, which is theoretically a task that ada could do, at the expense of not necessarily being able to do the whole task in one shot.
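The subtractive idea can be sketched in a few lines. This is not the targetedSummarization library's code or API; plain word overlap stands in for the embedding similarity the real library computes with SentenceTransformer, and every name below is invented for illustration.

```python
# Subtractive reduction: original sentences either survive whole or are
# dropped whole; nothing is paraphrased.

def relevance(sentence, query):
    """Crude stand-in for embedding similarity: fraction of shared words."""
    s = set(sentence.lower().split())
    q = set(query.lower().split())
    return len(s & q) / max(len(q), 1)

def reduce_text(sentences, query, threshold=0.25):
    """Keep only sentences scoring above the threshold for this query."""
    return [s for s in sentences if relevance(s, query) >= threshold]

doc = [
    "Users may terminate the account at any time.",
    "The company was founded in 2010.",
    "Termination of the account deletes stored data.",
]
kept = reduce_text(doc, "account termination")
print(kept)  # the founding-date sentence is dropped as irrelevant
```

The reduced list could then be passed to the LLM, which is the "first pass in memory before a call to an LLM" mentioned above.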
got it, thanks! maybe one application of this is as a "preprocessor" of sorts on the underlying data before it gets fed into the LLM?
Would this make more sense to implement as part of LangChain, instead of GPT-Index? It'd be available in both then, no? @hwchase17
regardless of where it's implemented, having an interface for you to plug that into gpt index as part of a "preprocessing" step for data structures during query-time is a TODO specific to gpt index
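A possible shape for that query-time hook, sketched as plain Python rather than any actual gpt_index interface (every name here is hypothetical): the preprocessor is just a callable that shrinks the retrieved text before the LLM sees it.

```python
# Hypothetical "preprocess before the LLM" hook; not a real gpt_index API.
from typing import Callable, List

Preprocessor = Callable[[List[str], str], List[str]]

def query_with_preprocessor(chunks: List[str], query: str,
                            preprocess: Preprocessor,
                            llm_call: Callable[[str], str]) -> str:
    """Run the reducer first, then send only the survivors to the LLM."""
    kept = preprocess(chunks, query)
    prompt = f"Answer '{query}' using:\n" + "\n".join(kept)
    return llm_call(prompt)

# Usage with trivial stand-ins for the reducer and the LLM:
drop_short = lambda chunks, q: [c for c in chunks if len(c) > 10]
echo_llm = lambda prompt: prompt.splitlines()[-1]

print(query_with_preprocessor(["short", "a much longer chunk"],
                              "what matters?", drop_short, echo_llm))
```

Anything matching the `Preprocessor` signature, including a TextReducer-style subtractive pass, would slot in without the index needing to know how the reduction works.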
gotcha. i think i struggle, like many, in really understanding the intersections between yours and Harrison's apps. hrmm... a venn diagram would be sweet
imo would be awesome to add this (or some version) as a chain in langchain! and then also would be awesome to make the "preprocessing" step chains (or at least support chains)