Has anyone here used the LongLLMLinguaPostprocessor? It definitely increases the density and quality of information that the LLM retains from the context, but it seems to clobber the metadata, so I can't cite any sources.

Just wondering if anyone is having a similar issue w/ the metadata.
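For reference, here's roughly how I'm wiring it up (a sketch of my setup; import paths move around between llama_index versions, so adjust for yours):

Python
from llama_index import QueryBundle
from llama_index.postprocessor import LongLLMLinguaPostprocessor

node_postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=300,
    rank_method="longllmlingua",
)

# `retriever` and `query_str` are whatever your pipeline already has
retrieved_nodes = retriever.retrieve(query_str)
compressed_nodes = node_postprocessor.postprocess_nodes(
    retrieved_nodes, query_bundle=QueryBundle(query_str=query_str)
)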
I think that feature is a tad experimental, but glad to hear it's working for you! Would probably be an easy change to not clobber the metadata πŸ™‚
Yeah, I took a look but the solution wasn't too obvious
I'm using the default metadata mode of MetadataMode.ALL, but all I get back at the end of the query for Metadata is

Plain Text
{'05603d6e-02fd-4ef3-bfa2-068961df84cf': {},
 '40c874f6-edd5-4747-87b1-eda8ae72f2cb': {},
 '4acb82e9-9f62-4795-a14e-d94f7bf2c05a': {},
 '632a3505-0bfb-4d73-b792-71e436e5a916': {},
 '9d0ee846-bce2-4374-bcf7-6cc2afe7ee6e': {},
 'ae9c50eb-548d-499f-bfc9-7070087df97a': {},
 'f35a3639-0319-467c-b1da-9fc74ce1728b': {},
 'f41aed07-f230-40e2-892f-46c926a7ecb5': {},
 'fb943654-6449-40e2-adf3-8c946d990c7f': {}}
on the plus side, the improvement in detail from the responses is almost magical
oh wait, I think I see the issue
we're returning:

Plain Text
return [
  NodeWithScore(node=TextNode(text=t)) for t in compressed_prompt_txt_list
]
seems to me that would do it πŸ˜„
hmmm, the question is: how do you correlate the proper metadata back after it's compressed? I guess that's why it wasn't done originally... probably a "handle this another day" sorta issue
hmmm yea, a tricky problem πŸ˜… Once it compresses, there's no way to associate back to an original source
haha, well, good to know that we came to the same conclusion at least
the real problem is that LLMLingua also reorders the context chunks to improve compressibility (and that's part of the secret sauce for information retention downstream)
there should be the same number of chunks out as in, though... so I guess if you don't care if the metadata matches up chunk to chunk, then that's fine
I wonder, though, if it would be better to make a "metadata batch" so to speak...

"These 3 documents are were batched together and transformed, here's the metadata for those original 3 documents"
or maybe a better way to say it is "This textnode was synthesized from some or all of these 3 documents --- here is the corresponding metadata for those 3 documents"
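in code, the idea would be something like this (pure sketch, not a tested patch; `nodes` is the postprocessor's input list and `compressed_prompt_txt_list` is from the snippet above):

Python
# sketch: per-chunk attribution is lost after compression and reordering,
# so attach the whole batch's metadata to every compressed node instead
batch_metadata = {n.node.node_id: n.node.metadata for n in nodes}

return [
    NodeWithScore(
        node=TextNode(text=t, metadata={"source_batch": batch_metadata})
    )
    for t in compressed_prompt_txt_list
]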
Whew... so the LLMLingua module is pretty dang amazing... it virtually eliminates lost-in-the-middle problems
right now it's only being used as a postprocessor for document retrieval, but I think it would also be effective for the refine step in the SubQuestionQueryEngine
I'm blown away right now... even a 47x compression yields great info retrieval from the context... :mindexplosion:
wow that's kind of wild lol
I should look into trying that module again... maybe we can make it better πŸ™‚
lol, it totally is... when it compresses down that much, the majority of the compressed output is just gobbledygook, but apparently the LLM is able to make sense of it
My retrieval stack is splitting documents using the sentence splitter, stored in Elasticsearch with bge-small for embeddings. I retrieve the top 30 (lol) using hybrid search, expand the window by 3 sentences, re-rank with bge-reranker-large to pick the top 5, then use LLMLingua to reduce it to a 100-token target... It's, amazingly, able to compress all the relevant info down to 100ish tokens while throwing out the irrelevant stuff that remains after reranking...

All three of the models involved in that retrieval and reranking fit comfortably on an RTX 3060 with room to spare
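if anyone wants to replicate it, the wiring looks roughly like this (a sketch from memory using the monorepo-era import paths; exact constructor args and the Elasticsearch hybrid setup may differ on your version):

Python
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index.postprocessor import (
    LongLLMLinguaPostprocessor,
    MetadataReplacementPostProcessor,
    SentenceTransformerRerank,
)
from llama_index.vector_stores import ElasticsearchStore

# sentence-level splits, each node remembering a 3-sentence window around it
node_parser = SentenceWindowNodeParser.from_defaults(window_size=3)

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
vector_store = ElasticsearchStore(index_name="docs", es_url="http://localhost:9200")

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=StorageContext.from_defaults(vector_store=vector_store),
    service_context=ServiceContext.from_defaults(
        embed_model=embed_model, node_parser=node_parser
    ),
)

query_engine = index.as_query_engine(
    similarity_top_k=30,               # retrieve wide...
    vector_store_query_mode="hybrid",  # ...with hybrid search
    node_postprocessors=[
        # swap each sentence for its surrounding 3-sentence window
        MetadataReplacementPostProcessor(target_metadata_key="window"),
        # re-rank down to the best 5
        SentenceTransformerRerank(model="BAAI/bge-reranker-large", top_n=5),
        # compress what's left to ~100 tokens
        LongLLMLinguaPostprocessor(target_token=100),
    ],
)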
In the Contribution Guide, I noticed that there is the following section
(Attachment: image.png)
however, that link is dead
that seems like an ideal place to put LLM Lingua, though
oh, wow... you were quick to comment on my PR
I was sitting here trying to figure out how to test it πŸ˜„
tbh I'm a little lost on how to mock the embeddings so that we can get a valid test
my "test" so far has been a directory of 380 medical pdfs that it handled well πŸ˜„
Check out the test here for an example of mocking embeddings:
https://github.com/run-llama/llama_index/blob/d62f84c29471a1e77c24bbee8f598baf92f55c2d/tests/playground/test_base.py#L15

Alternatively, refactoring to use more explicit functions would make testing easier without needing the embedding model πŸ‘€
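Boiled down, the pattern in that file is a BaseEmbedding subclass returning canned vectors (sketch below; the real test returns a different hand-made vector per known input string):

Python
from typing import List

from llama_index.embeddings.base import BaseEmbedding


class MockEmbedding(BaseEmbedding):
    """Returns canned vectors so tests never load a real model."""

    @classmethod
    def class_name(cls) -> str:
        return "MockEmbedding"

    async def _aget_query_embedding(self, query: str) -> List[float]:
        return self._get_query_embedding(query)

    async def _aget_text_embedding(self, text: str) -> List[float]:
        return self._get_text_embedding(text)

    def _get_query_embedding(self, query: str) -> List[float]:
        return [1.0, 0.0, 0.0]

    def _get_text_embedding(self, text: str) -> List[float]:
        return [1.0, 0.0, 0.0]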
(also, very cool PR!)
yup! I'm doing a bit of refactoring now
yeah... tried it out with LLM Lingua in mind... I'm honestly blown away
This is the result of a chatbot I'm building with LlamaIndex... the LLM is Mistral 7B Instruct
all of that information is correct, and the references that it plopped down in its response are actual real references... correctly notated in its response
the "Context Sources" are plucked from the metadata, multiple of which were originally 4000+ tokens compressed down to 100 ~ 150 with LLM Lingua
a big problem I was having with small-to-big was: if the window was too small, it would leave out lots of details; if the window was too big, it would hallucinate because there might be irrelevant data in there
splitting the document semantically tends to group all the related text together... sometimes a single paragraph, sometimes multiple related paragraphs
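(for the curious, that semantic splitting looks roughly like this; newer llama_index versions ship a SemanticSplitterNodeParser for it, though parameter names may vary)

Python
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.node_parser import SemanticSplitterNodeParser

# merge adjacent sentences into one node until embedding similarity drops,
# so related paragraphs tend to land in the same chunk
splitter = SemanticSplitterNodeParser(
    embed_model=HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"),
    buffer_size=1,
    breakpoint_percentile_threshold=95,
)
nodes = splitter.get_nodes_from_documents(documents)  # `documents` from your loader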
lol at the lotr quotes in the mock embedding
awesome, got a couple basic tests written
that being said... make test didn't seem to pick them up
I had to pytest /path/to/test.py to get it to run
I feel like I'm missing something to register the test...

Plain Text
tests/multi_modal_llms/test_replicate_multi_modal.py .                        [ 49%]
tests/node_parser/test_html.py .....                                          [ 50%]
tests/node_parser/test_json.py .....                                          [ 51%]
tests/node_parser/test_markdown.py ....                                       [ 51%]
tests/node_parser/test_markdown_element.py ...                                [ 52%]
tests/node_parser/test_unstructured.py s                                      [ 52%]
tests/objects/test_base.py ...                                                [ 52%]
tests/objects/test_node_mapping.py ....                                       [ 53%]
oh, I'm a dummy... I didn't prefix the filename with test_
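(for anyone else who hits this: pytest only collects files matching its default globs, which you can override with python_files in pytest.ini)

Plain Text
# pytest's default collection patterns
test_*.py
*_test.py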
Yea that will do it! πŸ‘ Awesome we got tests now πŸ’ͺ
first contribution πŸ˜„