The community member is experiencing slow response times when querying a Pinecone index and is looking for ways to speed it up. The comments suggest that the default chunk size of ~4k tokens may be too large and that reducing it to 512 tokens improves response time; the similarity_top_k parameter, which controls how many chunks are retrieved per query, can also be tuned. The embedding dimension is discussed as another factor, but the community members note that it is fixed by the underlying embedding model (1536 for text-embedding-ada-002). There is also a discussion of restricting the number of tokens sent to the language model, with examples ranging from around 400 max tokens up to 15000 tokens.
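For context, here is a minimal sketch of how the chunk size and similarity_top_k can be set when building a Pinecone-backed index with llama_index. Exact import paths and class names vary across llama_index and pinecone-client versions, so treat this as illustrative rather than the exact code from the thread; the index name, data directory, and API keys are placeholders.

```python
# Illustrative sketch (llama_index ~0.9-era APIs; names and keys are placeholders).
import pinecone
from llama_index import (
    VectorStoreIndex,
    ServiceContext,
    StorageContext,
    SimpleDirectoryReader,
)
from llama_index.vector_stores import PineconeVectorStore

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
pinecone_index = pinecone.Index("quickstart")  # existing index, dimension 1536 for ada-002

# Smaller chunks (512 tokens instead of the old ~4k default) keep each retrieved
# piece of context short, which shortens the LLM prompt and the response time.
service_context = ServiceContext.from_defaults(chunk_size=512)

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)

# similarity_top_k controls how many chunks are retrieved per query:
# more chunks means more context but a longer prompt and a slower answer.
query_engine = index.as_query_engine(similarity_top_k=2)
print(query_engine.query("What does the document say about X?"))
```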
Hi guys, I was trying the Pinecone index example, but response generation takes a while when querying the index. Is there any way to speed it up, or is there another index that could be useful? I want to build an index over a large set of documents and keep the query time low for a good user experience. Thanks in advance.
Thanks, yes, it impacts the response time. Is it also possible to change the dimension in the create_index method? I tried changing it from 1536 to 728, but I'm getting an error while adding chunks. IMO the dimension of the embeddings should also affect performance.
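The dimension mismatch is the likely cause of that error: the Pinecone index dimension must equal the length of the vectors you upsert, and text-embedding-ada-002 always returns 1536-dimensional vectors. A rough sketch, using the older pinecone-client v2 style API with placeholder names and keys:

```python
# Sketch: the Pinecone index dimension must match the embedding length you upsert.
# Older pinecone-client v2 style API; keys and index name are placeholders.
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")

# text-embedding-ada-002 always emits 1536-dimensional vectors, so the index
# has to be created with dimension=1536. Creating it with dimension=728 makes
# every upsert of an ada-002 embedding fail with a dimension-mismatch error.
pinecone.create_index("quickstart", dimension=1536, metric="cosine")

# To actually use a smaller dimension you need an embedding model that produces
# vectors of that size (or an explicit dimensionality-reduction step); the
# dimension is not a free parameter you can tune independently of the model.
```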
So, I have been playing around with the examples in Pinecone's documentation, and they use around 400 max tokens. In the case of GPT Index, the number of tokens is a function of the chunk size and the number of chunks selected.
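A rough back-of-the-envelope to make that relationship concrete (the numbers are illustrative, not taken from the thread):

```python
# Rough illustration of how the prompt size scales in GPT Index / llama_index.
chunk_size = 512          # tokens per retrieved chunk
similarity_top_k = 2      # chunks retrieved per query
context_tokens = chunk_size * similarity_top_k   # ~1024 tokens of retrieved context
prompt_overhead = 200     # rough allowance for the question and prompt template
max_output_tokens = 400   # cap on the completion, like Pinecone's examples

total_tokens = context_tokens + prompt_overhead + max_output_tokens
print(total_tokens)       # ~1624 tokens per call in this configuration
```

The completion cap itself is set on the LLM (e.g. a max_tokens setting on the OpenAI model), while chunk_size and similarity_top_k determine how much retrieved context ends up in the prompt.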