The community member is experiencing slow response times when querying a Pinecone index and is looking for ways to speed it up. The comments suggest that the default chunk size of ~4k tokens may be too large and that reducing it to 512 tokens improves response time; the similarity_top_k parameter, which controls how many chunks are retrieved per query, can also be tuned. The embedding dimension is discussed as another factor, but the community members note that it is fixed by the underlying embedding model (1536 for text-embedding-ada-002). There is also a discussion of restricting the number of tokens sent to the language model, with examples ranging from around 400 max tokens up to 15000 tokens.
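For context, here is a minimal sketch of how the chunk size and similarity_top_k can be set when building a Pinecone-backed index with llama_index. Exact import paths and class names vary across llama_index and pinecone-client versions, so treat this as illustrative rather than the exact code from the thread; the index name, data directory, and API keys are placeholders.

```python
# Illustrative sketch (llama_index ~0.9-era APIs; names and keys are placeholders).
import pinecone
from llama_index import (
    VectorStoreIndex,
    ServiceContext,
    StorageContext,
    SimpleDirectoryReader,
)
from llama_index.vector_stores import PineconeVectorStore

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
pinecone_index = pinecone.Index("quickstart")  # existing index, dimension 1536 for ada-002

# Smaller chunks (512 tokens instead of the old ~4k default) keep each retrieved
# piece of context short, which shortens the LLM prompt and the response time.
service_context = ServiceContext.from_defaults(chunk_size=512)

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)

# similarity_top_k controls how many chunks are retrieved per query:
# more chunks means more context but a longer prompt and a slower answer.
query_engine = index.as_query_engine(similarity_top_k=2)
print(query_engine.query("What does the document say about X?"))
```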
Hi guys, I was trying the Pinecone index example, but response generation takes a while when querying the index. Is there any way to speed it up, or is there another index that could be useful? I want to build an index over a large set of documents and keep the query time low for a good user experience. Thanks in advance.
Thanks, yes, it impacts the response time. Is it also possible to change the dimension in the create_index method? I tried changing it from 1536 to 728, but I'm getting an error while adding chunks. IMO the dimension of the embeddings should also affect performance.
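The dimension mismatch is the likely cause of that error: the Pinecone index dimension must equal the length of the vectors you upsert, and text-embedding-ada-002 always returns 1536-dimensional vectors. A rough sketch, using the older pinecone-client v2 style API with placeholder names and keys:

```python
# Sketch: the Pinecone index dimension must match the embedding length you upsert.
# Older pinecone-client v2 style API; keys and index name are placeholders.
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")

# text-embedding-ada-002 always emits 1536-dimensional vectors, so the index
# has to be created with dimension=1536. Creating it with dimension=728 makes
# every upsert of an ada-002 embedding fail with a dimension-mismatch error.
pinecone.create_index("quickstart", dimension=1536, metric="cosine")

# To actually use a smaller dimension you need an embedding model that produces
# vectors of that size (or an explicit dimensionality-reduction step); the
# dimension is not a free parameter you can tune independently of the model.
```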
So, I have been playing around with the examples in Pinecone's documentation, and they use around 400 max tokens. In the case of GPT Index, the number of tokens is a function of the chunk size and the number of chunks selected.
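A rough back-of-the-envelope to make that relationship concrete (the numbers are illustrative, not taken from the thread):

```python
# Rough illustration of how the prompt size scales in GPT Index / llama_index.
chunk_size = 512          # tokens per retrieved chunk
similarity_top_k = 2      # chunks retrieved per query
context_tokens = chunk_size * similarity_top_k   # ~1024 tokens of retrieved context
prompt_overhead = 200     # rough allowance for the question and prompt template
max_output_tokens = 400   # cap on the completion, like Pinecone's examples

total_tokens = context_tokens + prompt_overhead + max_output_tokens
print(total_tokens)       # ~1624 tokens per call in this configuration
```

The completion cap itself is set on the LLM (e.g. a max_tokens setting on the OpenAI model), while chunk_size and similarity_top_k determine how much retrieved context ends up in the prompt.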