i see in the tgis server logs what looks like a request to summarize the summaries
It builds a bottom up tree of summaries
well i thought the purpose of this summary exercise was to generate summaries of each document to help improve index retrieval or whatever
Correct -- and tree summarize is the best way to do this
A bottom-up tree will do the best job of capturing details from across the entire document, especially if the document/folder is large and doesn't fit into the LLM's context window
So if we create summaries of each folder, we can do recursive retrieval based on those summaries
As a way of narrowing the retrieval scope when answering a query
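e.g. a minimal sketch with llama_index's DocumentSummaryIndex (the ./data/troubleshooting path is a stand-in for your layout):

from llama_index import (
    DocumentSummaryIndex,
    SimpleDirectoryReader,
    get_response_synthesizer,
)

# load one folder's worth of docs
documents = SimpleDirectoryReader('./data/troubleshooting').load_data()

# tree_summarize is the bottom-up tree you're seeing in the TGIS logs:
# summarize the chunks, then summarize the summaries
response_synthesizer = get_response_synthesizer(response_mode='tree_summarize')

# one summary per document; at query time retrieval matches the query
# against the summaries first, then drills into the matching docs' chunks
index = DocumentSummaryIndex.from_documents(
    documents, response_synthesizer=response_synthesizer
)
query_engine = index.as_query_engine()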
ah, yeah, that's a problem, because the summaries are all going to be "this is troubleshooting information"
and/or each single document would need to be in its own folder
does the summarization happen at the directory level and not the individual file level? even so, these summaries are pretty weaksauce
basically it takes all the documents you gave it and attempts to summarize them. So I think that should be a whole folder?
My bet is indeed falcon sucks haha
I would expect the summaries to say something like "Information on troubleshooting X, Y, and Z" with some specifics, in order to be helpful
is 7B the biggest model you can run?
i only have access to a ~20G GPU at present
The only other Falcon option is 40b which didn't fit
unless there's some FP option I can pass to the TGIS server
but like I said, this whole folder is troubleshooting stuff
so if you summarize a folder of 50 troubleshooting documents, the summary is just "troubleshooting documents"
The text is a general troubleshooting guide for troubleshooting issues with Red Hat OpenShift containers in a cluster environment. It is most useful for understanding the resolution process and the steps taken to address the issue.
^^ yes, I already knew that, because this entire repo is "a troubleshooting guide for OpenShift"
so this type of folder-level summarization isn't helpful (with the current organization)
if I have to go read all the docs and reorganize them into different folders, at that point i've made these documents easy to browse, so what value does llm/chat serve then?
i'm really just trying to understand here
oh, I thought this already had some folder organization
Thus far I have failed at:
- whole documents
- sentence window
- summarization
😅
there are 32 documents in the troubleshooting folder
hey, great, i already know the troubleshooting folder contains troubleshooting documents 😂
well if there's only 32, then we could do it per-document then 🤔
the entire repo looks like it has ~400 docs. 32 are troubleshooting. ~60 are general knowledgebase, etc etc
but even when i tried full docs on ONLY the troubleshooting folder, the answers were bad
i'm about to try mpt-7b-instruct for giggles
But I guess it's good to narrow down the cause of the bad answers
- falcon is 💩?
- Should we be using a better embedding model?
these documents are also "weird"
code/yaml/cli samples, markdown, just all over the place
i tried to get some sample "questions" from the SRE team
this looks like it should be easy to answer
How do I get the SSH key for a cluster from XX?
i'll just try it against different models
maybe we need better parsers for these file types too, to help with ingestion 🙂
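(on the parser side: SimpleDirectoryReader has a file_extractor hook to route extensions to format-aware readers -- rough sketch below, and note the MarkdownReader import path varies across llama_index versions:)

from llama_index import SimpleDirectoryReader
# NOTE: import path differs across llama_index versions
from llama_index.readers.file.markdown_reader import MarkdownReader

# parse .md files with a markdown-aware reader instead of as plain text
documents = SimpleDirectoryReader(
    './data',
    file_extractor={'.md': MarkdownReader()},
).load_data()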
On the embedding side, I think I saw you were using mpnet-base-v2, which tbh is a bad model
Could try setting something like embed_model='local:BAAI/bge-base-en' in the service context
embed_model is set to the HF default:
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
are you saying the default is that mpnet and it's 💩?
it's like... bottom of the leaderboard 😅
bge-base is pretty good. The jump to large is not worth the increased model size imo
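concretely, something like this (a sketch -- wire the service_context into however you're building the index):

from llama_index import ServiceContext, VectorStoreIndex

# the 'local:' prefix tells llama_index to run the HF embedding model in-process
service_context = ServiceContext.from_defaults(
    embed_model='local:BAAI/bge-base-en'
)
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)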
so this is a weird question that i don't know how to phrase correctly -- is there a way to make the embedding process run "over there" on the GPU (via TGIS?) or does that always run "locally"
ah none of these are that huge tho
yea embeddings are pretty tiny -- TGIS miiight have embedding support, but tbh I haven't looked into it
python will gladly completely destroy my computer when it tries to run the llm on cpu
like legit totally locks the machine
There is also a smaller version of bge too if it is also locking your machine 🙂
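or pin the local embedder to the GPU -- langchain's wrapper passes model_kwargs through to sentence-transformers (a sketch, assuming the same wrapper you already have):

from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding

embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(
        model_name='BAAI/bge-base-en',
        # run the embedding model on the GPU instead of locking up the CPU
        model_kwargs={'device': 'cuda'},
    )
)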
nah this baai/bge is working
INFO:llama_index.vector_stores.redis:Creating index pg_essays
Creating index pg_essays
INFO:llama_index.vector_stores.redis:Added 287 documents to index pg_essays
Added 287 documents to index pg_essays
Done indexing!
Nice! The embeddings are helping then 🙂
mpt/embeddings seems to be working better
answers are curt but that's ok
just need to figure out how to store the filename in the index so that i can display the "sourced" files back to the user
that's probably the metadata stuff
Yup exactly. If you are using SimpleDirectoryReader, there's a neat trick for this
from llama_index import SimpleDirectoryReader
filename_fn = lambda filename: {'file_name': filename}
# automatically sets the metadata of each document according to filename_fn
documents = SimpleDirectoryReader('./data', file_metadata=filename_fn).load_data()
Then response.source_nodes[0].node.metadata
will have it for example
basically inserts a metadata hook based on the filename
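so displaying the sources back to the user is just (sketch; query_engine being whatever engine you built over the index):

response = query_engine.query('How do I get the SSH key for a cluster?')

# each source node carries the metadata dict that filename_fn set at ingest time
for source in response.source_nodes:
    print(source.node.metadata['file_name'])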
reindexing and trying again