hi all, just a short (hopefully not completely stupid) question. I have roughly the following short application:
import faiss
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext, StorageContext, VectorStoreIndex
from llama_index.vector_stores import FaissVectorStore

d = 1536  # dimension of OpenAI's text-embedding-ada-002 embeddings
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
llm_predictor2 = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor2)
# initialize an index using our sample data and the client we just created
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context,
                                        service_context=service_context)
query_engine = index.as_query_engine()
query_engine = index.as_query_engine()
From what I understand, with gpt-4 the maximum context size is 8192 tokens, so I want to make sure I only retrieve k vectors of size 1536 so that k*1536 < 8192 (roughly). So my question is: do I have to set k manually somewhere, or is there a fundamental misunderstanding on my part?
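In case it helps, here is the back-of-envelope budgeting I currently have in mind. All the concrete numbers are my own assumptions (chunk size and prompt overhead are guesses, not measured), and the commented-out `similarity_top_k` line is just where I suspect k would be set:

```python
# Rough token budget: top_k retrieved chunks of roughly chunk_size tokens
# each, plus some overhead for the question and the generated answer,
# should fit inside gpt-4's context window.
context_window = 8192   # gpt-4 context size in tokens
chunk_size = 1024       # assumed tokens per retrieved chunk
prompt_overhead = 512   # assumed room for the question + answer
top_k = (context_window - prompt_overhead) // chunk_size
print(top_k)  # -> 7

# and then, presumably, something like:
# query_engine = index.as_query_engine(similarity_top_k=top_k)
```

Not sure whether `similarity_top_k` is the right knob, or whether the library handles this budgeting for me.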