Ah, you only used llama-index for the loader.
Do you know how the chat history works with this example? I tried to look at the langchain source code, but I don't really understand the path it takes to include chat history.
It would take some work to migrate. Ignoring the custom template for now, you could do something like this:
```python
import faiss
from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.faiss import FaissVectorStore

# Load every file under the sources directory
loader = SimpleDirectoryReader(directory_manager.sources_dir, recursive=True, exclude_hidden=True)
documents = loader.load_data()

# Dimensionality of text-embedding-ada-002, the default embedding model
d = 1536
faiss_index = faiss.IndexFlatL2(d)

# Embed and index the documents into FAISS
vector_store = FaissVectorStore(faiss_index=faiss_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Chat engine that condenses the conversation into a query before retrieving
chat_engine = index.as_chat_engine(chat_mode="condense_plus_context")

while True:
    msg = input(">>: ").strip()
    response = chat_engine.chat(msg)
    print(str(response))
```
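On the chat history question: with this engine the history lives in a memory buffer on the chat engine itself, so there's nothing extra to wire up. A quick sketch for inspecting or clearing it (I believe these accessors are on the base chat engine class, but double-check against your installed version):

```python
# The engine accumulates the conversation in an internal memory buffer.
print(chat_engine.chat_history)  # list of ChatMessage objects exchanged so far
chat_engine.reset()              # wipe the history to start a fresh conversation
```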
By default, this is using OpenAI embeddings and gpt-3.5-turbo. Documents get chunked using a `SentenceSplitter` at a chunk size of 1024 tokens.
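If you want different defaults, you can pass a `ServiceContext` when building the index. Rough sketch, assuming a pre-0.10 llama_index to match the imports above (exact kwargs may differ by version):

```python
from llama_index import ServiceContext
from llama_index.llms import OpenAI

# Swap the LLM and chunking defaults; chunk_size feeds the SentenceSplitter
# that breaks documents into nodes before embedding.
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0),
    chunk_size=512,      # default is 1024
    chunk_overlap=20,
)

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
)
```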
The chat engine works by re-phrasing the user message into a standalone query using the chat history, retrieving the top-2 most relevant chunks and inserting them into the system prompt, then sending that plus the chat history to the LLM to create a response.
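If it helps to see those pieces explicitly, the same engine can be built by hand; a sketch assuming the pre-0.10 module layout (class locations may have moved in newer releases):

```python
from llama_index.chat_engine import CondensePlusContextChatEngine
from llama_index.memory import ChatMemoryBuffer

# Same behaviour as chat_mode="condense_plus_context", but with the knobs exposed:
# the retriever controls how many chunks get pulled into the system prompt, and
# the memory buffer is the chat history used to condense follow-up questions.
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=index.as_retriever(similarity_top_k=2),
    memory=ChatMemoryBuffer.from_defaults(token_limit=3000),
)
```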
There are a few chat modes, detailed here:
https://docs.llamaindex.ai/en/stable/module_guides/deploying/chat_engines/usage_pattern.html#available-chat-modes