
Updated 9 months ago

Anyone? Please help?

The token counter is meant to track LLM and embedding tokens. Neither is used in a sentence splitter.
Where should I pass the callback_manager then?
probably to your LLM/embedding model and your index?
And it's only for indexing, not for querying
Let me try the embedding model.
Still can't make it work. This is my code:
Plain Text
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model('text-embedding-ada-002').encode,
    verbose=False  # set to True to see usage printed to the console
)
token_counter_callback_manager = CallbackManager([token_counter])

embed_model = OpenAIEmbedding(
    mode='similarity',
    embed_batch_size=2000,
    api_key=openai_key,
    callback_manager=token_counter_callback_manager,
)

splitter = TokenTextSplitter(
    chunk_size=project_chunk_size,
    chunk_overlap=20
)
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, embed_model=embed_model)

## Then indexing documents:

content_nodes = splitter.get_nodes_from_documents([document]) 
index.insert_nodes(content_nodes)
tokens_used += token_counter.total_embedding_token_count # 0!

What am I doing wrong?
Try this instead: index = VectorStoreIndex.from_vector_store(vector_store=vector_store, embed_model=embed_model, callback_manager=token_counter_callback_manager)
Great, let me try this.
It worked!! Thanks a lot!
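For reference, the fix works because a token-counting handler only sees events from components whose callback manager actually includes it; attaching it to the embedding model alone misses the embedding calls the index makes. Here is a minimal pure-Python sketch of that callback pattern (all class and method names here are illustrative stand-ins, not LlamaIndex APIs):

```python
class TokenCounter:
    """Stand-in for a token-counting handler: counts tokens reported to it."""
    def __init__(self, tokenizer=str.split):
        self.tokenizer = tokenizer  # naive whitespace tokenizer by default
        self.total_embedding_token_count = 0

    def on_embedding(self, text):
        self.total_embedding_token_count += len(self.tokenizer(text))


class EmbedModel:
    """Stand-in embedding model: reports usage only to its own handlers."""
    def __init__(self, handlers=()):
        self.handlers = list(handlers)

    def embed(self, text):
        for handler in self.handlers:
            handler.on_embedding(text)
        return [0.0]  # dummy vector


counter = TokenCounter()
attached = EmbedModel(handlers=[counter])  # handler attached -> calls are counted
detached = EmbedModel()                    # no handler -> counter never sees these calls

attached.embed("some text to embed")
detached.embed("these tokens are invisible to the counter")
print(counter.total_embedding_token_count)  # only the attached model's call is counted
```

If a component (like the index above) embeds text through its own, separate callback manager, the counter stays at 0 for those calls, which is exactly the symptom in this thread.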
One additional question on the indexing, if you don't mind. I found that the chunk size is a critical parameter. If it's not fitting the text, the result of querying later could be pretty bad. Is there any way to predict which chunk size could be the best for a specific text? Are there some rules, what does exactly the chunk size depend on?
nope -- mostly you pick a size + overlap, and pray it works out well. Usually adding reranking is helpful as well
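As a rough illustration of how those two knobs interact, here is a simplified splitter in plain Python (whitespace tokens stand in for real tokenizer output; TokenTextSplitter's actual implementation differs):

```python
def split_tokens(tokens, chunk_size, chunk_overlap):
    """Split a token list into overlapping chunks.

    Each chunk holds up to chunk_size tokens; consecutive chunks share
    chunk_overlap tokens, so the window advances chunk_size - chunk_overlap
    tokens per step.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the rest is already covered by this chunk
    return chunks


tokens = "one two three four five six seven eight nine ten".split()
for chunk in split_tokens(tokens, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

A larger chunk_size keeps more context together per chunk (fewer, bigger chunks); more overlap reduces the chance a relevant passage is cut in half at a boundary, at the cost of some duplicated tokens.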
What is reranking?
reranking is a step in a query engine where you use a specialized model to rerank/re-order the retrieved nodes

This is useful for example when you set the top-k to be like 10, but then rerank and only return the top 3
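That two-stage retrieve-then-rerank flow can be sketched in plain Python; cheap_score and rerank_score here are hypothetical stand-ins for the fast embedding similarity and the slower, more accurate reranker model:

```python
def retrieve_and_rerank(query, docs, cheap_score, rerank_score, top_k=10, top_n=3):
    """Two-stage retrieval: a cheap score narrows docs to top_k candidates,
    then an expensive score re-orders them and keeps only top_n."""
    # Stage 1: fast retrieval over the whole corpus
    candidates = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:top_k]
    # Stage 2: rerank just the candidates with the expensive model
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:top_n]
```

The point is cost: the expensive model only ever scores top_k documents, not the whole corpus, while the final answer is built from the top_n it ranks highest.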
An interesting concept! Are there any documentation to know about the details?
There are a few rankers in the docs
  • FlagEmbeddingRerank
  • CohereRerank
  • SentenceTransformersRerank
Typically you set one up and use it in your engine, or standalone

Plain Text
index.as_chat_engine(similarity_top_k=6, node_postprocessors=[FlagEmbeddingRerank(top_n=2)])

# or
rerank = FlagEmbeddingRerank(top_n=2)
nodes = retriever.retrieve("query")
nodes = rerank.postprocess_nodes(nodes)
I will try to use them!