Find answers from the community

oedemis
Joined September 25, 2024
Why doesn't this work?

```python
transformations = []
transformations.append(TitleExtractor(nodes=5))
transformations.append(QuestionsAnsweredExtractor(questions=3))
transformations.append(Settings.embed_model)
transformations.append(splitter)
transformations.append(node_parser)

transformations = []

pipeline = IngestionPipeline(
    transformations=transformations,
    vector_store=vector_store,
)
nodes = pipeline.run(documents=docs)
```

It should first parse, then split, then do the title extraction etc., but it doesn't split the doc.
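The likely culprit is the second `transformations = []`: it rebinds the name to a brand-new empty list just before the pipeline is constructed, so everything appended above is discarded and the pipeline runs with no transformations at all. A plain-Python illustration (the strings below stand in for the real transformation objects):

```python
# The SECOND `transformations = []` rebinds the name to a fresh empty
# list, throwing away everything appended to the old one.
transformations = []
transformations.append("node_parser")
transformations.append("splitter")
transformations.append("TitleExtractor")

transformations = []  # <-- the bug: a brand-new empty list

# The pipeline would now receive zero transformations.
print(len(transformations))  # 0
```

Once the reset line is removed, order also matters: transformations run in list order, so a plausible ordering would be `[node_parser, splitter, TitleExtractor(nodes=5), QuestionsAnsweredExtractor(questions=3), Settings.embed_model]` — parse and split first, metadata extractors next, the embedding model last.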
4 comments
oedemis

Embeddings

General question: according to the documentation, the default chunking strategy is automatically enabled with chunk_size=1024 and chunk_overlap=20. If I parse with

```python
node_parser = MarkdownNodeParser()
transformations = [node_parser]
```

does each node then contain 1024 tokens — is this assumption correct? If yes, the next step is vectorization. I want to leverage a multilingual embedder like sentence-transformers/paraphrase-multilingual-mpnet-base-v2, which I believe has a max sequence length of 128 tokens. The vectorization runs fine, but does this mean that each node, containing 1024 tokens, is captured as a vector from only its first 128 tokens?
3 comments
Hello everyone, I am inferring via watsonx. Is there a way to track the input token count? I set up Arize Phoenix for logging/observability:

```python
# set up Arize Phoenix for logging/observability
import phoenix as px

px.launch_app()

import llama_index.core

llama_index.core.set_global_handler("arize_phoenix")
```

but it does not detect the input token count sent to the LLM.
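One way to get token counts regardless of what the tracing backend surfaces (a sketch, assuming llama-index is installed) is to attach a `TokenCountingHandler` to the global callback manager. The tiktoken tokenizer below is an assumption for illustration — watsonx models are not tiktoken-based, so the counts would be approximate:

```python
import tiktoken
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# Counts tokens on every LLM/embedding call made through llama-index.
token_counter = TokenCountingHandler(
    # Approximation: cl100k_base is not the watsonx tokenizer.
    tokenizer=tiktoken.get_encoding("cl100k_base").encode
)
Settings.callback_manager = CallbackManager([token_counter])

# After running queries, inspect the accumulated counts:
# token_counter.prompt_llm_token_count      # input tokens to the LLM
# token_counter.completion_llm_token_count  # output tokens
# token_counter.total_llm_token_count
```

For exact numbers, the watsonx API response itself usually reports generated/input token counts, which would be authoritative over any client-side tokenizer.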
2 comments
Why does MarkdownElementNodeParser take so long compared to MarkdownNodeParser, even with llm set to None? And what role does the llm play here?
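For context, the two parsers do different amounts of work: `MarkdownNodeParser` only splits on markdown structure, while `MarkdownElementNodeParser` additionally detects tables and other elements and normally summarises each table with an LLM, which is where most of the runtime goes. Depending on the llama-index version, `llm=None` may still resolve to a default LLM rather than disabling summarisation. A minimal sketch, assuming llama-index is installed:

```python
from llama_index.core import Document
from llama_index.core.node_parser import (
    MarkdownElementNodeParser,
    MarkdownNodeParser,
)

docs = [Document(text="# Title\n\nSome text\n\n|a|b|\n|-|-|\n|1|2|\n")]

# Fast: a purely structural split on markdown headers.
plain_nodes = MarkdownNodeParser().get_nodes_from_documents(docs)

# Slower: also extracts table elements and builds summaries/index nodes
# for them; per-table summarisation is the LLM-dependent (and slow) step.
element_parser = MarkdownElementNodeParser(llm=None)
# element_nodes = element_parser.get_nodes_from_documents(docs)
```

If you do not need table-aware retrieval, `MarkdownNodeParser` is the cheaper choice.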
3 comments
Hello community, I'm doing simple RAG over my PDF. I need page references in the response part; the answers are correctly grounded, but the page numbers are always wrong. Can you give some hints or samples? I've tried many approaches, including directly prompting the model to give page numbers, but nothing helps.
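Asking the LLM to emit page numbers is unreliable, since it has no ground truth for them. A more robust pattern is to read the page from the retrieved source-node metadata. A sketch, assuming llama-index is installed, `query_engine` is your existing RAG query engine, and the PDF was loaded with `SimpleDirectoryReader` (whose default PDF reader stores one node per page with a `"page_label"` metadata entry); the query string is a made-up example:

```python
# Take page numbers from the retrieved source nodes instead of asking
# the LLM to produce them.
response = query_engine.query("What does the document say about renewal?")
print(response.response)

for source in response.source_nodes:
    # "page_label" is set by the default PDF reader; other loaders may
    # use a different metadata key.
    page = source.node.metadata.get("page_label", "?")
    print(f"page {page} (score {source.score:.2f})")
```

You can then append these page labels to the answer yourself, which keeps the citation exact no matter what the LLM writes.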
2 comments