It isn't clear to me the default chunking and tokenization under VectorStoreIndex.from_documents()
nickjtay
12 months ago
It isn't clear to me what default chunking and tokenization are being performed under VectorStoreIndex.from_documents(). Usually I can figure this out on my own, but I'm having difficulty here. Is this documented somewhere?
2 comments
Logan M
12 months ago
SentenceSplitter() with chunk_size=1024 and the gpt-3.5 tokenizer
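
For illustration, a minimal sketch of spelling that default out explicitly rather than relying on it, assuming the llama_index.core package layout (0.10+). The chunk_overlap=200 value and the ./data path are assumptions, not from this thread:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Load some documents (the path is illustrative).
documents = SimpleDirectoryReader("./data").load_data()

# Spell out the splitter that from_documents() would otherwise apply
# implicitly. chunk_size=1024 is per the answer above; chunk_overlap=200
# is the library's documented default at the time of writing (assumption).
splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=200)

# Passing transformations overrides the implicit default chunking.
index = VectorStoreIndex.from_documents(documents, transformations=[splitter])
```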
I agree it's a bit opaque -- the ingestion pipeline is generally preferred, since it's much more transparent about what's happening:
https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/root.html
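
As a rough sketch of that recommendation, the same chunking step can be made an explicit IngestionPipeline stage, so the produced chunks are inspectable before anything is indexed. Again, the llama_index.core imports and the chunk_overlap value are assumptions:

```python
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter

# The same chunking step as an explicit, inspectable pipeline stage.
pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(chunk_size=1024, chunk_overlap=200)]
)

# run() returns the parsed nodes, so the chunks can be examined
# before indexing.
nodes = pipeline.run(documents=[Document(text="Some long text ...")])
print(len(nodes), nodes[0].text[:80])

# Building the index embeds the nodes (an embedding model, e.g. an
# OpenAI API key, must be configured).
index = VectorStoreIndex(nodes)
```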
nickjtay
12 months ago
Awesome, thank you @Logan M