My setup is basically the same as the tutorial, but the service context is:
from llama_index import ServiceContext  # legacy (pre-0.10) import

service_context = ServiceContext.from_defaults(
    llm=self.llm,
    embed_model=self.embed_model,
    callback_manager=self.callback_manager,
)

llm: gpt-3.5
embed_model: ada-002
callbacks: CallbackManager([self.llama_debug, self.token_counting_handler])

The problem is that when I run len(nodes) and len(base_nodes), the numbers are the same, and the retrieved nodes are the same too. The auto-merging retriever isn't merging, or something else is wrong. Can someone help me please?
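For reference, here is a minimal self-contained sketch of the kind of pipeline described above, using the legacy llama_index (v0.9.x) imports that ServiceContext comes from. The data directory, handler variables, and query string are placeholders, and the concrete LLM/embedding constructors are assumptions based on the gpt-3.5 / ada-002 description:

from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.callbacks import CallbackManager, LlamaDebugHandler, TokenCountingHandler
from llama_index.embeddings import OpenAIEmbedding
from llama_index.llms import OpenAI
from llama_index.node_parser import HierarchicalNodeParser, get_leaf_nodes
from llama_index.retrievers import AutoMergingRetriever
from llama_index.storage.docstore import SimpleDocumentStore

# Service context matching the configuration above: gpt-3.5, ada-002, debug + token callbacks
llama_debug = LlamaDebugHandler()
token_counter = TokenCountingHandler()
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo"),
    embed_model=OpenAIEmbedding(),  # defaults to text-embedding-ada-002
    callback_manager=CallbackManager([llama_debug, token_counter]),
)

# Hierarchical chunking: all nodes go into the docstore, only leaf nodes are embedded/indexed
documents = SimpleDirectoryReader("./data").load_data()  # placeholder data directory
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[2048, 512, 128])
all_nodes = node_parser.get_nodes_from_documents(documents)
leaf_nodes = get_leaf_nodes(all_nodes)

docstore = SimpleDocumentStore()
docstore.add_documents(all_nodes)
storage_context = StorageContext.from_defaults(docstore=docstore)

index = VectorStoreIndex(
    leaf_nodes, storage_context=storage_context, service_context=service_context
)

# Base retriever vs. auto-merging retriever, both using top-k 6 as in the tutorial
base_retriever = index.as_retriever(similarity_top_k=6)
retriever = AutoMergingRetriever(base_retriever, storage_context, verbose=True)

query = "..."  # placeholder query
nodes = retriever.retrieve(query)
base_nodes = base_retriever.retrieve(query)
print(len(nodes), len(base_nodes))  # equal counts here are what prompted the question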
7 comments
The number of nodes retrieved will be the same since top-k is 6 for both.

Merging only happens when neighboring nodes get retrieved.

Try bumping the top-k to like 12 or more for the auto-merging retriever
You may need a reranker to filter down the auto-merging retriever output, since 12 is pretty high if it doesn't do any merging
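A sketch of that suggestion, continuing from the setup in the question; the reranker model name is just one common choice and assumes sentence-transformers is installed:

from llama_index.postprocessor import SentenceTransformerRerank
from llama_index.query_engine import RetrieverQueryEngine

# A larger top-k makes it more likely that sibling leaf nodes land in the same
# retrieval batch, which is what lets the AutoMergingRetriever merge them into a parent.
base_retriever = index.as_retriever(similarity_top_k=12)
retriever = AutoMergingRetriever(base_retriever, storage_context, verbose=True)

# Rerank the (possibly merged) results back down to a handful of nodes
rerank = SentenceTransformerRerank(top_n=6, model="BAAI/bge-reranker-base")
query_engine = RetrieverQueryEngine.from_args(
    retriever,
    node_postprocessors=[rerank],
    service_context=service_context,
)
response = query_engine.query("...")  # placeholder query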
Ahh okay, that makes sense. I was just following the tutorial, and there the top-k is 6 for both, but my document is different, so probably with just 6 the auto-merging doesn't get the neighboring nodes.
I'm trying your suggestion now.
It works! Thank you!!
If I try changing the chunk sizes of the hierarchy, can I maybe get merging with a lower top_k?
chunk_sizes = [2048, 512, 128]
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_sizes)
Change it to [1024, 512, 128], or is there another way?
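For reference, a sketch of swapping in the alternative hierarchy. Merging is triggered when enough of a parent's children appear in the retrieved set (a ratio threshold that defaults to 0.5), so the child-to-parent chunk-size ratio matters more than the top-level size:

from llama_index.node_parser import HierarchicalNodeParser, get_leaf_nodes

# Alternative hierarchy from the question; the leaves are still 128 tokens,
# so each 512-token parent still has roughly four children, and about half
# of them need to be retrieved before a merge happens.
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[1024, 512, 128])
all_nodes = node_parser.get_nodes_from_documents(documents)  # `documents` from the earlier sketch
leaf_nodes = get_leaf_nodes(all_nodes)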