Updated 12 months ago
Retrieval
AmitKhandey
12 months ago
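But this works very well:
# Imports assume llama-index >= 0.10 (the llama_index.core package layout)
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Chroma-backed vector store for this collection (setup_chroma is the poster's own helper)
vector_store = self.setup_chroma(collectionName)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
# Load the input files, split them into small overlapping chunks, and index them
documents = SimpleDirectoryReader(input_files=files).load_data()
node_parser = SentenceSplitter(chunk_size=180, chunk_overlap=80)
nodes = node_parser.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes, storage_context=storage_context)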
Logan M
12 months ago
Why is your chunk size so small? (And overlap is very large here compared to chunk size)
Logan M
12 months ago
If you aren't changing the top k, it's only retrieving the top 2 nodes -- and both those nodes are tiny
Logan M
12 months ago
So retrieval will probably not be good with these settings
AmitKhandey
12 months ago
What would be the ideal chunk size and chunk_overlap?
AmitKhandey
12 months ago
I am using gpt-3.5-turbo and text-embedding-ada-002.
Logan M
12 months ago
The default is a good choice (1024)
512 is also not bad.
Chunk overlap depends on the chunk size generally. I usually go with 20, but it's less important
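To put rough numbers on the settings discussed above, here is a minimal sketch that models chunking as a plain fixed-size sliding window over tokens. That is a simplification: SentenceSplitter respects sentence boundaries, so real chunk counts will differ. The function name `window_stats` and the 100,000-token corpus size are illustrative assumptions, and `top_k=2` reflects the "top 2 nodes" mentioned earlier in the thread.

```python
import math


def window_stats(total_tokens, chunk_size, chunk_overlap, top_k=2):
    """Estimate chunk count and retrieved context for a fixed-size sliding window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    # Each chunk after the first contributes only `stride` new tokens.
    stride = chunk_size - chunk_overlap
    if total_tokens <= chunk_size:
        n_chunks = 1
    else:
        n_chunks = 1 + math.ceil((total_tokens - chunk_size) / stride)
    return {
        "stride": stride,
        "overlap_ratio": round(chunk_overlap / chunk_size, 3),
        "n_chunks": n_chunks,
        # Upper bound on the text handed to the LLM when top_k chunks are retrieved.
        "max_retrieved_tokens": top_k * chunk_size,
    }


# 100_000 tokens is an arbitrary, illustrative corpus size.
for size, overlap in [(180, 80), (512, 20), (1024, 20)]:
    print(size, overlap, window_stats(100_000, size, overlap))
```

With 180/80, almost half of every chunk repeats its neighbor and two retrieved chunks carry at most 360 tokens of context, which is consistent with the concern that retrieval will struggle at those settings.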
AmitKhandey
12 months ago
How can I improve performance?
AmitKhandey
12 months ago
RAG is unable to give proper answers to my questions. I chunked the documents before storing them in Chroma.
AmitKhandey
12 months ago
What else can be done?
Logan M
12 months ago
Did you change the chunk size? How much data are you indexing?
Logan M
12 months ago
What kinds of questions are you asking? You can debug retrieval somewhat by checking the source nodes:
response = query_engine.query("...")
for node in response.source_nodes:
    print(node.text)
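When scanning many retrieved nodes, a compact one-line summary per node can be easier to read than the full text. The sketch below is one possible helper, not part of LlamaIndex: `summarize_source_nodes` is a hypothetical name, and it only assumes each item has a `.text` attribute and an optional `.score` attribute, as the entries in `response.source_nodes` do. The `SimpleNamespace` objects are stand-ins so the example runs without an index.

```python
from types import SimpleNamespace


def summarize_source_nodes(source_nodes, preview_chars=80):
    """Return one summary line per retrieved node: rank, score, length, preview."""
    lines = []
    for rank, node in enumerate(source_nodes, start=1):
        score = getattr(node, "score", None)
        score_text = "n/a" if score is None else f"{score:.3f}"
        # Collapse newlines and repeated spaces so each preview stays on one line.
        preview = " ".join(node.text.split())[:preview_chars]
        lines.append(f"{rank}. score={score_text} chars={len(node.text)} | {preview}")
    return lines


# Hypothetical stand-ins for retrieved nodes; with a real query engine you
# would pass response.source_nodes instead.
fake_nodes = [
    SimpleNamespace(text="Quarterly revenue by region:\nNorth 1.2M, South 0.9M", score=0.8123),
    SimpleNamespace(text="Table of contents", score=None),
]
for line in summarize_source_nodes(fake_nodes):
    print(line)
```

Very short nodes or low scores near the top of this listing are a hint that the chunking or top k settings, rather than the LLM, are limiting answer quality.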
AmitKhandey
12 months ago
Yes, I did change the size. The data is 4.78 MB in total, across 4 documents.
AmitKhandey
12 months ago
Most of the data is in the form of tables in Word documents and Excel files.
AmitKhandey
12 months ago
I think it is having trouble storing and retrieving the data in tables.
Logan M
12 months ago
4.78 MB is a looot of text. I would suggest:
a) using a chunk size of 512
b) increasing the top k -- probably 12?
c) using a reranker with top-n = 3 or 4
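The idea behind (b) and (c) is a two-stage pipeline: cast a wide net with a cheap score, then let a more careful scorer keep only the best few. Here is a minimal, library-free sketch of that pattern. The scoring functions are toy stand-ins (word overlap instead of embedding similarity, ordered word pairs instead of a reranker model), and `retrieve_then_rerank` is a hypothetical name. In LlamaIndex the equivalent knobs would presumably be `similarity_top_k` on the query engine plus a reranking node postprocessor with `top_n`.

```python
def tokenize(text):
    return text.lower().split()


def overlap_score(query, chunk):
    """Toy first-stage score: distinct query words that appear in the chunk."""
    return len(set(tokenize(query)) & set(tokenize(chunk)))


def bigram_score(query, chunk):
    """Toy rerank score: adjacent query word pairs that appear, in order, in the chunk."""
    def bigrams(words):
        return set(zip(words, words[1:]))
    return len(bigrams(tokenize(query)) & bigrams(tokenize(chunk)))


def retrieve_then_rerank(query, chunks, top_k=12, top_n=3):
    # Stage 1: keep the top_k candidates by the cheap score.
    candidates = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)[:top_k]
    # Stage 2: reorder those candidates with the stricter score and keep top_n.
    return sorted(candidates, key=lambda c: bigram_score(query, c), reverse=True)[:top_n]


chunks = [
    "size chunk matters a lot",
    "the default chunk size is 1024",
    "rerankers reorder candidates",
    "chunk overlap is less important",
]
print(retrieve_then_rerank("chunk size", chunks, top_k=3, top_n=1))
```

The first stage ranks the first two chunks equally because both contain the words "chunk" and "size"; only the second stage notices that one of them has the phrase in the right order. A real reranker plays the same role with a far better notion of relevance.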