I'm not sure what you mean by looping? It works fine for me
see this is my chat_engine in streamlit
if st.session_state.messages[-1]["role"] != "assistant":
    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = st.session_state.chat_engine.chat(prompt)
            st.write(response.response)
            message = {"role": "assistant", "content": response.response}
            # Add response to message history
            st.session_state.messages.append(message)
and this is where i defined my engine,
are u using it in an app or in jupyter?
In a fastapi app (where I was testing) it works fine
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.chat_engine import CondensePlusContextChatEngine

documents = SimpleDirectoryReader("./docs/docs/examples/data/paul_graham").load_data()
index = VectorStoreIndex.from_documents(documents)

chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(),
)
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/chat")
async def root(request: Request):
    data = await request.json()
    message = data.get("message")
    response = await chat_engine.astream_chat(message)

    async def gen():
        async for chunk in response.async_response_gen():
            yield str(chunk)

    return StreamingResponse(gen())

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8055)
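a quick way to test the streaming endpoint from the client side is something like this (just a sketch, the prompt is made up):
import requests

# stream the chunks back from the /chat endpoint defined above
with requests.post(
    "http://127.0.0.1:8055/chat",
    json={"message": "What did the author do growing up?"},
    stream=True,
) as resp:
    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="", flush=True)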
Your streamlit code seems fine to me. Maybe add some prints to see what line it seems to be stuck on
i did already, i checked, it stays stuck on this line
do you think for production i should use fastapi? this streamlit doesn't look best for production! what is your suggestion?
streamlit is definitely not a production app
It's mostly to throw together a quick UI to show your boss hehe
ok, i am trying with what u said using a flask app, thanks @Logan M u are great
hey @Logan M when i try the flask app using postman for my second question, this is what i get:
That's pretty weird. Issue with openai embeddings I guess?
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever,
    node_postprocessors=[colbert_reranker, llm_rerank_postprocessor],
)
see this one i am using
i mean it's different than what u used, u are calling this condense chat engine and passing the index as a retriever, but my retriever i am building myself!
using QueryFusionRetriever
Right, but the console log shows openai embeddings isn't connecting
well the first question is always correct
Did you set your openai key? Are you manually setting up an embedding model?
yes, openai key is set up properly, for embedding i am using it like this:
embed_model = OpenAIEmbedding(embed_batch_size=10)
Settings.embed_model = embed_model
Could just be openai having issues too
It's an APIConnectionError, I guess that means it had trouble hitting openai servers
but i am not always getting that, for example i tried one more time and i don't get any error in my postman console, it just shows it's waiting for a response
hey @Logan M i don't need to convert the query to an embedding and stuff right?
Nope it does that for you
i assume when i am using this condense chat engine, by default it will take the embedding of the user query and then search the database, right?
do u have any tutorial for how i can add the source documents as part of the response to the user? assume i am using the condense query engine
you can get source nodes from the response, response.source_nodes
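for example, roughly like this (just a sketch, assuming the chat engine from earlier; the metadata keys depend on which loader you used):
response = chat_engine.chat("What did the author do growing up?")
print(response.response)

# each source node wraps the retrieved chunk plus its score and metadata
for source in response.source_nodes:
    print(source.score, source.node.metadata.get("file_name"))
    print(source.node.get_content()[:200])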
hey @Logan M i want to use the user's history for future chats that the user starts, similar to chatgpt, which maintains the user's previous chats within the user profile, do we have such a thing in llamaindex? if yes, can you navigate me to the right material pls?
nice, well done. guys, u made our life so much easier
hey @Logan M if i have for example 100 pdf documents and i can put them in meaningful categories, let's say 5 different categories. i am thinking to put each category's pdf files under one index and give the user buttons to select a category first, and in the backend its relevant index is triggered. my guess is that this way, since each knowledge base is specific, pulling information is easier for a RAG model, rather than if i save all pdfs in one flat index! what are your thoughts
the only flaw i see with this is that it's more convenient for the user to just have one chatbot and ask anything they want, but i am worried that if i save all pdfs flat in one index then it would be harder for the RAG to pull the info. something like the sketch below is what i mean
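(rough sketch of the per-category idea; the folder and category names here are made up, and it assumes the pdfs are already sorted into one folder per category)
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# hypothetical category names, one folder of pdfs per category
categories = ["hr", "finance", "legal", "it", "sales"]

# build one index per category
indexes = {
    cat: VectorStoreIndex.from_documents(
        SimpleDirectoryReader(f"./pdfs/{cat}").load_data()
    )
    for cat in categories
}

# when the user clicks a category button, the backend picks that category's index
selected = "finance"
chat_engine = indexes[selected].as_chat_engine(chat_mode="condense_plus_context")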
will this work with ReAct agent also?
did you try with ReAct agent?
@Logan M and also how is the token limit handled? for example, my token limit is 300, the first question-answer takes 100 tokens, the 2nd 100 and the 3rd 100 tokens. when i ask the 4th question and it takes 100 tokens, will the 1st question-answer be removed from the memory to store the 4th question-answer?
that's kind of how the memory buffer works, yes. There's also a summary memory buffer. Or you can make your own memory
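e.g. something like this (rough sketch; the token limit and engine setup are just example values):
from llama_index.core.memory import ChatMemoryBuffer, ChatSummaryMemoryBuffer

# plain buffer: the oldest messages get dropped once the token limit is exceeded
memory = ChatMemoryBuffer.from_defaults(token_limit=300)

# summary buffer: older messages get summarized instead of dropped
# memory = ChatSummaryMemoryBuffer.from_defaults(token_limit=300)

chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(),
    memory=memory,
)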
thank you @Logan M, will this work with ReAct agent also?
@Logan M after updating the chat_store redis packages i am getting this error:
    chat_store=RedisChatStore(redis_url=settings.REDIS_URL),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "packages/llama_index/storage/chat_store/redis/base.py", line 54, in __init__
    self._aredis_client = aredis_client or self._aget_client(redis_url, **kwargs)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "packages/llama_index/storage/chat_store/redis/base.py", line 374, in _aget_client
    if self._check_for_cluster(redis_client):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "packages/llama_index/storage/chat_store/redis/base.py", line 199, in _check_for_cluster
    return cluster_info["cluster_enabled"] == 1
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
TypeError: 'coroutine' object is not subscriptable
import datetime
import uuid

from llama_index.core.memory import ChatMemoryBuffer
from llama_index.storage.chat_store.redis import RedisChatStore


class Chat:
    def __init__(self, model):
        self.model = model
        if model.id is None:
            self.id = str(uuid.uuid4())
        else:
            self.id = model.id
        if settings.REDIS_URL is not None:
            self.memory = ChatMemoryBuffer.from_defaults(
                token_limit=3900,
                chat_store=RedisChatStore(redis_url=settings.REDIS_URL),
                chat_store_key="memory" + self.id,
            )
        else:
            self.memory = ChatMemoryBuffer.from_defaults(
                token_limit=3900, chat_store_key="memory" + self.id
            )
        self.created = datetime.datetime.now()
seems like a bug. Will have to fix it
redis = "5.1.0"
llama-index-storage-chat-store-redis = "0.3.1"
llama-index-storage-docstore-redis = "0.2.0"
llama-index-storage-index-store-redis = "0.3.0"
llama-index-vector-stores-redis = "0.3.2"
just pushed/released a fix