Mike

Switching from GPT-3.5 to GPT-4o mini

Does anyone have any findings on switching from GPT-3.5 to GPT-4o mini? I'm finding that structured output is quite a bit worse with GPT-4o mini than with GPT-3.5: it often fails to return the values in the correct format. It is also noticeably slower. But I feel we have to switch, since the pricing is so much better.

5 comments

Mike

We use PydanticProgramExtractor to get a list of tags as well as a summary, and we see a strange error where content is repeated endlessly. This then causes the validation to fail.

This is our code:

Plain Text

EXTRACT_TEMPLATE_STR = """\
Here is the content of a section:
----------------
{context_str}
----------------
Given the contextual information, extract out a {class_name} object.\
"""

openai_program_summary = OpenAIPydanticProgram.from_defaults(
    llm=get_llm(model=MODEL_BASIC),
    output_cls=NodeSummaryMetadata,
    prompt_template_str="You must answer in the same language as the context given. {input}",
    extract_template_str=EXTRACT_TEMPLATE_STR,
)

openai_program_keywords = OpenAIPydanticProgram.from_defaults(
    llm=get_llm(model=MODEL_BASIC),
    output_cls=NodeKeywordsMetadata,
    prompt_template_str="You must answer in the same language as the context given. {input}",
    extract_template_str=EXTRACT_TEMPLATE_STR,
)

summary_extractor = PydanticProgramExtractor(program=openai_program_summary, input_key="input", num_workers=12)
keywords_extractor = PydanticProgramExtractor(program=openai_program_keywords, input_key="input", num_workers=12)

8 comments

Mike

Anyone have any information about the speed differences between Flask and FastAPI? We were thinking about switching from Flask to FastAPI but it seems to be quite a bit slower. A basic request where I use the chat engine seems to be almost twice as slow...

21 comments

Mike

Is there a way to retry matching a certain JSON format? Right now I just try to parse the output and re-prompt if it fails. But I feel it would be better to have the model correct its mistake instead of retrying entirely, right? Is this possible?
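
One way to get "correct the mistake" behaviour rather than a blind retry is to send the failed output and the parse error back to the model on the next attempt. Below is a minimal sketch of that idea using only the standard library. `call_llm` is a hypothetical callable (prompt in, raw text out) standing in for whatever client is in use, and `parse_with_repair` and `required_keys` are illustrative names, not part of LlamaIndex.

```python
import json
from typing import Callable, Optional, Sequence


def parse_with_repair(
    prompt: str,
    call_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns raw model text
    required_keys: Sequence[str] = (),
    max_attempts: int = 3,
) -> dict:
    """Ask for JSON; on failure, show the model its bad reply and the error."""
    current_prompt = prompt
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_llm(current_prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("top-level JSON value must be an object")
            missing = [key for key in required_keys if key not in data]
            if missing:
                raise ValueError(f"missing keys: {missing}")
            return data
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            # Build a corrective prompt instead of repeating the original one.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was:\n{raw}\n\n"
                f"It could not be used because: {exc}\n"
                "Reply again with corrected JSON only."
            )
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error
```

The corrective prompt always restates the original request, so the model keeps the full task in view while it repairs its previous reply. Whether this beats a plain retry likely depends on the model and on how the output went wrong.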

6 comments

Mike

We're seeing this issue when running our project and doing a fresh install of the dependencies. Anyone else experiencing this or know what could cause this?

"Resource wordnet not found. Please use the NLTK Downloader to obtain the resource:"

14 comments

Mike

Chat history with index not working

How can I use chat history in combination with an index? In this example the AI does not know anything about the chat history and just seems to try and query the index.

Plain Text

@app.route("/history")
def history():
    # Load data
    documents = SimpleDirectoryReader("./src/data/paul_graham").load_data()

    # create index
    index: VectorStoreIndex = VectorStoreIndex.from_documents(documents)

    custom_prompt = PromptTemplate(
        """\
    Given a conversation (between Human and Assistant) and a follow up message from Human, \
    rewrite the message to be a standalone question that captures all relevant context \
    from the conversation.

    <Chat History>
    {chat_history}

    <Follow Up Message>
    {question}

    <Standalone question>
    """
    )

    # list of `ChatMessage` objects
    custom_chat_history = [
        ChatMessage(
            role=MessageRole.USER,
            content="Remember that John Doe is wearing a blue shirt.",
        ),
        ChatMessage(role=MessageRole.ASSISTANT,
                    content=(
                        "Certainly, I'll remember that John Doe is wearing a blue shirt."
                    )
                    ),
    ]

    query_engine = index.as_query_engine()

    chat_engine = CondenseQuestionChatEngine.from_defaults(
        query_engine=query_engine,
        condense_question_prompt=custom_prompt,
        chat_history=custom_chat_history,
        verbose=True,
    )

    chat_response: AgentChatResponse = chat_engine.chat(
        "What color shirt is John Doe wearing?",
        # tool_choice="query_engine_tool"
    )

    pprint("answer:")
    pprint(chat_response.response)

    return "Done."

Response:

Plain Text

I'm sorry, but I cannot answer that question based on the given context information.

3 comments

Mike

JSON

Is there an easy way to turn a llama_index.response.schema.PydanticResponse into JSON?
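
A possible approach, sketched under the assumption that the `PydanticResponse` object exposes the parsed Pydantic model on its `.response` attribute: serialize that inner model directly. The helper name `pydantic_response_to_json` is made up for illustration. It tries the Pydantic v2 method first and falls back to the v1 one.

```python
import json
from typing import Any


def pydantic_response_to_json(response: Any) -> str:
    """Serialize the Pydantic model held in `response.response` to a JSON string."""
    model = response.response  # assumption: the parsed model lives on `.response`
    if model is None:
        raise ValueError("response contains no parsed model")
    if hasattr(model, "model_dump_json"):  # Pydantic v2
        return model.model_dump_json()
    return model.json()  # Pydantic v1
```

If a plain dictionary is needed instead of a string, `json.loads` on the result works with either Pydantic version.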

2 comments

Mike

When using Pydantic classes with OpenAI, I sometimes get a validation error. Is there anything I can do about this? Maybe retry?

Plain Text

Expecting value: line 1 column 1 (char 0) (type=value_error.jsondecode; msg=Expecting value; doc=Empty Response; pos=0; lineno=1; colno=1)
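
Since the error reports an empty response, simply repeating the call a few times with a growing delay may be enough. Here is a minimal sketch of such a retry wrapper. `call_with_retry` and its parameters are illustrative names, and `fn` stands for whatever zero-argument callable runs the program. `json.JSONDecodeError` is a subclass of `ValueError`, and Pydantic's `ValidationError` is, as far as I know, one too, so catching `ValueError` is the default here.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ValueError,),
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying with exponential backoff when it raises `retry_on`."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise RuntimeError("max_attempts must be at least 1")
```

Usage would look like `call_with_retry(lambda: program(input=text))`, where `program` and `text` are placeholders for the existing extraction call.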

12 comments
