
Two questions for Qdrant as the vector store

Two questions: with Qdrant as the vector store, does creating the index also pass the extra_info from the documents into Qdrant as the payload? And for PromptHelper, what configuration should I use if I want the output to use as many of the tokens not taken up by the prompt as possible?
4 comments
  1. It should be, yeah 🙂
  2. Mmmm, if you set max_tokens on the OpenAI LLM to -1, the LLM will output as much as it has room for (see the sketch below).
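For reference, a minimal sketch of that setup with the LlamaIndex API of that era. The import paths, the LangChain OpenAI wrapper, the data path, and the collection name are assumptions here and shifted between versions, so treat this as a sketch rather than the exact API:

```python
import qdrant_client
from langchain.llms import OpenAI
from llama_index import GPTQdrantIndex, LLMPredictor, ServiceContext, SimpleDirectoryReader

# Each document's extra_info should be carried into the Qdrant payload
# alongside the node text when the index is built.
documents = SimpleDirectoryReader("data").load_data()  # "data" is a placeholder path

# max_tokens=-1 asks the completion-style OpenAI model to use all of the
# context space not consumed by the prompt for its answer.
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, max_tokens=-1))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

client = qdrant_client.QdrantClient(host="localhost", port=6333)  # assumed local instance
index = GPTQdrantIndex.from_documents(
    documents,
    client=client,
    collection_name="my_collection",  # hypothetical collection name
    service_context=service_context,
)
```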
Hmm, so for PromptHelper I see it calculates the token space as context_length - num_output - num_input. What is this used for? Is it just to decide whether to refine the given text? And what happens if I set num_output to 0, if I want it to use as much of the available space as possible? xD
So, most LLMs these days are decoder models. This means they generate one token at a time, add it to the input, and generate the next token.
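A toy loop makes that decoder behavior concrete. Everything here is a hypothetical stand-in for a real model; the fake next_token just echoes a canned sequence:

```python
# Toy autoregressive decoding loop. `next_token` stands in for a real
# decoder's forward pass; here it returns a canned continuation.
CONTEXT_LENGTH = 8
EOS = -1

def next_token(tokens: list[int]) -> int:
    canned = [10, 11, EOS]
    return canned[len(tokens) - 3]  # pretend the prompt was 3 tokens long

tokens = [1, 2, 3]                  # prompt tokens
while len(tokens) < CONTEXT_LENGTH:
    tok = next_token(tokens)        # predict one token from everything so far
    tokens.append(tok)              # the output becomes part of the next input
    if tok == EOS:
        break

print(tokens)  # [1, 2, 3, 10, 11, -1]
```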

num_output ensures there is a minimum amount of space left to generate an answer when we prompt the LLM 🙂
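To make the budget concrete, here is the arithmetic PromptHelper is doing, with illustrative numbers assuming a 4096-token model:

```python
# Illustrative token budget for a single LLM call.
context_length = 4096   # model's total context window
num_output = 256        # space PromptHelper reserves so the answer isn't cut off
num_prompt = 300        # tokens used by the prompt template + query (example value)

# What's left determines how much retrieved text fits in one call;
# text that doesn't fit gets split and handled via refine calls.
available_for_chunks = context_length - num_output - num_prompt
print(available_for_chunks)  # 3540
```

So setting num_output to 0 would cram more retrieved text into each call, but it would leave no reserved room for the answer, which is why max_tokens=-1 on the LLM is the better lever here.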
Makes sense, thanks!