The maximum response size is determined by the `num_output` parameter in the `PromptHelper` configuration. In your code, you've set `tokens` to 512, which is passed to both the `LLMPredictor` and the `PromptHelper`, so the maximum response size would be 512 tokens.

```python
tokens = 512
...
llmPredictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=tokens))
...
promptHelper = PromptHelper(max_input, tokens, max_chunk_overlap, chunk_size_limit=chunk_size)
```
Here, `tokens` is used for the `num_output` parameter of `PromptHelper`, which determines the maximum number of output tokens from the LLM.
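For context, here is a minimal sketch of how those two objects are typically wired together in a legacy (0.x) LlamaIndex setup; the placeholder values for `max_input`, `max_chunk_overlap`, and `chunk_size`, and the `data` directory, are assumptions rather than values from your code, and the exact `PromptHelper` signature depends on your LlamaIndex version:

```python
from langchain.llms import OpenAI
from llama_index import (
    GPTVectorStoreIndex,
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    SimpleDirectoryReader,
)

# Assumed placeholder values; adjust to your own configuration.
max_input = 4096          # total context window of text-davinci-003
tokens = 512              # reserved for the model's output (num_output)
max_chunk_overlap = 20
chunk_size = 600

llmPredictor = LLMPredictor(
    llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=tokens)
)
promptHelper = PromptHelper(max_input, tokens, max_chunk_overlap, chunk_size_limit=chunk_size)

# Both objects feed into the service context used to build and query the index.
service_context = ServiceContext.from_defaults(
    llm_predictor=llmPredictor, prompt_helper=promptHelper
)
documents = SimpleDirectoryReader("data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
```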
The right value for `num_output` (or `tokens` in your code) depends on the specific LLM you are using; in your case, that is OpenAI's `text-davinci-003` model. The `num_output` parameter is used to reserve a certain amount of "space" for the output when computing the available context window size for packing text from retrieved Nodes. It's important to note that this parameter does not actually limit the model output. If that reservation leaves too little room for retrieved text, you can adjust the `context_window` or `chunk_size`, or consider using a different LLM that can handle larger context sizes.
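As a hedged sketch (assuming a LlamaIndex 0.x version where `ServiceContext.from_defaults` accepts `context_window`, `num_output`, and `chunk_size` directly; the numbers are placeholders), overriding those values could look like this:

```python
from langchain.llms import OpenAI
from llama_index import LLMPredictor, ServiceContext

llm_predictor = LLMPredictor(
    llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=512)
)

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    context_window=4096,  # total tokens the model can handle
    num_output=512,       # tokens reserved for the generated answer
    chunk_size=1024,      # size of each retrieved chunk
)
```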
The default number of output tokens (`num_output`) for the OpenAI `text-davinci-003` model is typically a low number; with OpenAI the default is 256. If you need longer responses, you can adjust the `context_window` or `chunk_size`, or consider using a different LLM that can handle larger context sizes.
To get a structured response, you can use the `output_parser` module in LlamaIndex. It allows you to format each prompt with output-parser instructions and obtain a structured response.

```python
# format each prompt with output parser instructions
fmt_qa_tmpl = output_parser.format(DEFAULT_TEXT_QA_PROMPT_TMPL)
fmt_refine_tmpl = output_parser.format(DEFAULT_REFINE_PROMPT_TMPL)
qa_prompt = QuestionAnswerPrompt(fmt_qa_tmpl, output_parser=output_parser)
refine_prompt = RefinePrompt(fmt_refine_tmpl, output_parser=output_parser)

# obtain a structured response
query_engine = index.as_query_engine(
    service_context=ServiceContext.from_defaults(llm_predictor=llm_predictor),
    text_qa_template=qa_prompt,
    refine_template=refine_prompt,
)
response = query_engine.query(
    "What are the three items the author did growing up?",
)
print(response)
```

This prints a structured response such as:

```python
{'points': [{'explanation': 'Writing short stories', 'explanation2': 'Programming on an IBM 1401', 'explanation3': 'Using microcomputers'}]}
```
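The snippet above assumes `output_parser`, `llm_predictor`, and the default prompt templates are already defined. As a minimal sketch of one way to set that up (the imports follow the legacy 0.x LlamaIndex layout, and the response-schema field is an illustrative assumption, not necessarily the configuration that produced the example output above):

```python
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
from llama_index.llm_predictor import StructuredLLMPredictor
from llama_index.output_parsers import LangchainOutputParser
from llama_index.prompts.default_prompts import (
    DEFAULT_REFINE_PROMPT_TMPL,
    DEFAULT_TEXT_QA_PROMPT_TMPL,
)
from llama_index.prompts.prompts import QuestionAnswerPrompt, RefinePrompt

# Predictor that works with structured (parsed) outputs
llm_predictor = StructuredLLMPredictor()

# Illustrative schema; replace the field name/description with what you need.
response_schemas = [
    ResponseSchema(
        name="points",
        description="List of items the author did growing up.",
    ),
]
lc_output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
output_parser = LangchainOutputParser(lc_output_parser)
```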
You can also use `GuidancePydanticProgram` to generate structured data with guidance via LlamaIndex. This is especially helpful when you are using lower-capacity models, which would otherwise struggle to generate valid output that fits the desired output schema. Here's an example from the documentation:

```python
from pydantic import BaseModel
from typing import List
from guidance.llms import OpenAI
from llama_index.program import GuidancePydanticProgram

# Define output schema
class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]

# Define guidance pydantic program
program = GuidancePydanticProgram(
    output_cls=Album,
    prompt_template_str="Generate an example album, with an artist and a list of songs. Using the movie {{movie_name}} as inspiration",
    guidance_llm=OpenAI("text-davinci-003"),
    verbose=True,
)

# Run program to get structured output
output = program(movie_name="The Shining")
```
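Because `output_cls=Album`, the returned `output` should be a pydantic `Album` instance, so (as a small usage sketch) its fields can be accessed directly:

```python
# Inspect the structured result (an Album validated against the schema above)
print(output.name, "-", output.artist)
for song in output.songs:
    print(f"{song.title} ({song.length_seconds}s)")
```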
To break that down: `GuidancePydanticProgram` generates structured data with guidance via LlamaIndex, which is especially helpful when you are using lower-capacity models that would otherwise struggle to produce valid output matching the desired schema. First, define your output schema with pydantic:

```python
from pydantic import BaseModel
from typing import List

class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]
```

Then define a `GuidancePydanticProgram` with your output schema:

```python
from guidance.llms import OpenAI
from llama_index.program import GuidancePydanticProgram

program = GuidancePydanticProgram(
    output_cls=Album,
    prompt_template_str="Generate an example album, with an artist and a list of songs. Using the movie {{movie_name}} as inspiration",
    guidance_llm=OpenAI("text-davinci-003"),
    verbose=True,
)
```

Finally, run the program to get structured output:

```python
output = program(movie_name="The Shining")
```
You can then integrate this into your `answerMe` function to generate structured responses. Here's an example:

```python
import re

from llama_index import StorageContext, load_index_from_storage

def answerMe(question):
    storage_context = StorageContext.from_defaults(persist_dir='index_store')
    index = load_index_from_storage(storage_context)
    query_engine = index.as_query_engine(response_mode='refine')

    # `rewriting` is assumed to be a list of trigger phrases defined elsewhere
    for sent in rewriting:
        if sent in question:
            qq = "Write a lengthy response to this query"
            response = query_engine.query(qq + '\n\n' + question)
            return response

    response = query_engine.query(question)

    # `phrases` and `agent` are assumed to be defined elsewhere in your code
    for phrase in phrases:
        if phrase in str(response):
            content = agent({"input": question})
            actualcontent = content['output']
            # Convert markdown links in the agent output to HTML anchors
            response = re.sub(
                r'\[(.*?)\]\((.*?)\)',
                r'<a href="\2" target="_blank">\1</a>',
                actualcontent,
            )

    # Generate structured output with the guidance program defined above
    output = program(movie_name=response)
    return output
```
Remember to adjust the `prompt_template_str` and the arguments to the `program` call to match your specific use case.
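For instance, passing a free-form answer in as `movie_name` doesn't fit the album-themed template above. Purely as an illustrative sketch (the `Answer` schema, template, and `answer_program` name below are assumptions, not from your code or the documentation), an adjusted program might look like:

```python
from typing import List
from pydantic import BaseModel
from guidance.llms import OpenAI
from llama_index.program import GuidancePydanticProgram

# Hypothetical schema for a structured answer; adjust the fields to your needs.
class Answer(BaseModel):
    summary: str
    key_points: List[str]

answer_program = GuidancePydanticProgram(
    output_cls=Answer,
    prompt_template_str=(
        "Rewrite the following draft answer as a short summary plus key points. "
        "Draft: {{draft}}"
    ),
    guidance_llm=OpenAI("text-davinci-003"),
    verbose=True,
)

# Inside answerMe you would then call:
# output = answer_program(draft=str(response))
```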