I'm having trouble finding the list of (I'm going to call them:) "levers" llama provides for chat_history. Like, how much history is used... which parts of history are used (such as... sentence similarity possibly? πŸ€·β€β™‚οΈ)... how long ago and/or how many tokens ago do I start forgetting things... etc. -- just, what functions/features/etc are provided that I can leverage (🀭) to reduce/limit/optimize token usage costs.
The chat history right now is super basic. It's just a list of messages πŸ˜…

Working on better "memory" abstractions though! Should be ready Soon ℒ️
So what happens when the history is >16k tokens?
And... is there a hook/callback or something we can use to filter-through/limit the history ourselves? Or no?
you can control the chat history pretty easily, as it's just a list of ChatMessage objects

On the chat engine, you can access it directly using chat_engine._chat_history, and same with the agent -> agent._chat_history
These are all pretty new -- the new memory objects we have in the pipeline should make this easier. But handling this manually is also not too bad
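Until those memory abstractions land, one way to cap token costs is to trim the message list yourself before handing it back to the engine. A minimal sketch in plain Python (not a LlamaIndex API — the `ChatMessage` stand-in and the rough 4-chars-per-token estimate are assumptions):

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    # Stand-in for llama_index's ChatMessage (role + content).
    role: str
    content: str


def trim_history(history, max_tokens=16000, chars_per_token=4):
    """Keep the most recent messages that fit under a rough token budget.

    Token count is estimated as len(content) / chars_per_token; swap in a
    real tokenizer (e.g. tiktoken) if you need accurate counts.
    """
    kept, budget = [], max_tokens
    for msg in reversed(history):  # newest first
        cost = len(msg.content) // chars_per_token + 1
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))  # back to oldest-first order
```

You could then pass the trimmed list in via the `chat_history` kwarg instead of poking at `chat_engine._chat_history` directly.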
editing variables that start with _ just... feels wrong... but okay lol
i know i know, it's hacky lol
How is the chat history meant to be maintained between requests? Is that something we just deal with however we see fit? Or...?
Both agent.chat and chat_engine.chat allow you to pass in chat_history as a kwarg

Plain Text
agent.chat("Hello!", chat_history=[ChatMessage(role="assistant", content="text")])
Is there a page in the documentation that talks about that? Or not yet?
if you pass in a list variable, it will get appended to with the new history
Not yet, I'm reading source code right now
which tbh I recommend doing as well
okay so, it just expects an array of that object -- and I'm assuming that is the same thing that is in _chat_history?
we are having a larger agent publicization/push I think next week? So hopefully better docs by then
I assume you skipped this question because you don't know? haha
it will just crash/traceback haha is my best guess
like I said, baby steps here πŸ˜…

If you are inclined to make any PRs for this as well, I definitely welcome it. Community help is extremely appreciated πŸ™
Yeah I get it. Just making sure I understand the current behavior πŸ™‚
Alright...
Plain Text
store = MongoDBAtlasVectorSearch(
    get_db(),
    db_name=config["db_name"],
    collection_name=config["collection_name"],
    index_name=config["index_name"],
)
index = VectorStoreIndex.from_vector_store(vector_store=store)
service_context = ServiceContext.from_defaults(
    llm=OpenAI(temperature=config["temperature"], model=config["model_name"]),
    num_output=config["num_output"],
)
chat_engine = index.as_chat_engine(
    node_postprocessors=[
        SentenceEmbeddingOptimizer(
            threshold_cutoff=config["threshold_cutoff"],
            percentile_cutoff=config["percentile_cutoff"],
        )
    ],
    retriever_mode="embedding",
    service_context=service_context,
    similarity_top_k=config["similarity_top_k"],
    text_qa_template=qa_template,
    streaming=True,
    condense_question_prompt=custom_prompt,
)
streaming_response = chat_engine.stream_chat(prompt, chat_history=modified_chat_history)
Plain Text
ValueError: Streaming is not enabled. Please use chat() instead.
How am I supposed to set it up for streaming properly if streaming=True is insufficient? πŸ€”
(btw I think I'm still on 7.4-ish if that matters)
whoops... wrong thread..