or this:
from llama_index.llms import Replicate
from llama_index import ServiceContext, set_global_service_context
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt
# The replicate endpoint
LLAMA_13B_V2_CHAT = "a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5"
# inject custom system prompt into llama-2
def custom_completion_to_prompt(completion: str) -> str:
    return completion_to_prompt(
        completion,
        system_prompt=(
            "You are a Q&A assistant. Your goal is to answer questions as "
            "accurately as possible, based on the instructions and context provided."
        ),
    )
llm = Replicate(
    model=LLAMA_13B_V2_CHAT,
    temperature=0.01,
    # override max tokens since it's interpreted
    # as context window instead of max tokens
    context_window=4096,
    # override completion representation for llama 2
    completion_to_prompt=custom_completion_to_prompt,
    # if using llama 2 for data agents, also override the message representation
    messages_to_prompt=messages_to_prompt,
)
# set a global service context
ctx = ServiceContext.from_defaults(llm=llm)
set_global_service_context(ctx)
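
As a quick sanity check, you can call the configured LLM directly before wiring it into an index. This is a minimal sketch, not part of the original snippet: it assumes your REPLICATE_API_TOKEN environment variable is set, and the prompt text is purely illustrative.

# Issue a single completion through the Replicate-backed Llama 2 model.
# The custom system prompt defined above is injected automatically.
response = llm.complete("What is retrieval-augmented generation?")
print(response)

Because the service context is set globally, any index or query engine built afterwards (for example, VectorStoreIndex.from_documents(documents)) will pick up this LLM without further configuration.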