Bedrock

Hi, has anybody tried AWS Bedrock with LlamaIndex? I have tried it and it does not give any error, but it doesn't take the prompt template, nor does it interact with the results:

This is a piece of code showing how I set it up:

Plain Text
from llama_index.core import ServiceContext
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.bedrock import Bedrock

llm = Bedrock(model="meta.llama2-13b-chat-v1", profile_name="machineuser1")

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    chunk_size=256,
)
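
For context, the service_context then gets wired into an index and query engine roughly like this (a sketch; the data path and query text are placeholders, not part of the original setup):

Plain Text
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# placeholder data directory, assumed for illustration
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()
print(query_engine.query("What does the context say about X?"))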
I'm not sure what you mean, what's the issue?
The result has a "Human:" and "Assistant:" dialog format, but my template doesn't have that format.
Attachments: image.png, image.png
If I use my local llama2 model, the result matches what's expected according to the template:
Plain Text
llm = Ollama(model="llama2", base_url="http://192.168.1.245:11435")
You are using Ollama now? I thought this was Bedrock 😅

Ollama will convert completion prompts to chat templates

Ollama handles the proper template formatting for chat models, but when it's printed to console, it just uses str(message) which makes both the role and content into a string
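
For example (a quick sketch, assuming llama-index-core's default ChatMessage string formatting):

Plain Text
from llama_index.core.llms import ChatMessage, MessageRole

msg = ChatMessage(role=MessageRole.USER, content="What is RAG?")
# str(msg) just concatenates role and content, something like "user: What is RAG?",
# which is what shows up in the console even though the model itself
# receives the properly formatted chat template.
print(str(msg))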
Hi @Logan M, ah no, I have just posted the output of the RAG both when I use Bedrock and when I use Ollama. With Ollama the template is used correctly, but with Bedrock you can see it is not using the template; instead it uses a "Human" and "Assistant" dialog that I haven't specified. This is the issue I don't know how to tackle.
You can use the messages_to_prompt and completion_to_prompt hooks to have more control over the input format

Plain Text
def completion_to_prompt(completion):
  # pass the completion text through unchanged
  return completion

def messages_to_prompt(messages):
  # join the chat messages into one plain-text prompt
  return "\n".join([str(x) for x in messages])

llm = Bedrock(..., completion_to_prompt=completion_to_prompt, messages_to_prompt=messages_to_prompt)
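
With this pair of pass-through hooks, the idea is that nothing extra gets wrapped around the prompt, so whatever your query engine templates produce reaches the Bedrock model verbatim instead of being recast into a default "Human:"/"Assistant:" dialog.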
Hi @Logan M, thanks, but I don't get how to do it. With Ollama llama2 I have to use the synthesizer, for example:

Plain Text
response_synthesizer = get_response_synthesizer(  # try compact?
    text_qa_template=qa_template,
    refine_template=new_summary_tmpl,
    # streaming=True,
)

Why is it different with Bedrock? And how can I translate what I have for local llama2 to Bedrock?
What do these two functions, completion_to_prompt and messages_to_prompt, do? Does completion_to_prompt append something to the output of the completion, and does messages_to_prompt apply some text treatment to the retrieved text fragments before sending them? Is there an example somewhere? I only found this in the source code:
https://github.com/run-llama/llama_index/blob/0ae69d46e3735a740214c22a5f72e05d46d92635/llama-index-core/llama_index/core/llms/chatml_utils.py#L26
Ollama automatically handles the prompt formatting in their library. Bedrock does not; I guess that's a decision Bedrock made 🤷‍♂️
The prompt formatting depends on the LLM being used

For llama2, there is one here
https://github.com/run-llama/llama_index/blob/0ae69d46e3735a740214c22a5f72e05d46d92635/llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/llama_utils.py
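
As a rough illustration, a llama2-style pair of hooks would look something like this (a simplified sketch of what llama_utils does; the real helpers also inject a default system prompt and handle more edge cases):

Plain Text
BOS, EOS = "<s>", "</s>"
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def messages_to_prompt(messages):
  prompt = ""
  system = ""
  for message in messages:
    if message.role == "system":
      # the system prompt gets folded into the first user turn
      system = f"{B_SYS}{message.content}{E_SYS}"
    elif message.role == "user":
      prompt += f"{BOS}{B_INST} {system}{message.content} {E_INST}"
      system = ""
    elif message.role == "assistant":
      prompt += f" {message.content} {EOS}"
  return prompt

def completion_to_prompt(completion):
  return f"{BOS}{B_INST} {completion} {E_INST}"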

For something like zephyr, it would look like this

Plain Text
def messages_to_prompt(messages):
  prompt = ""
  for message in messages:
    if message.role == 'system':
      prompt += f"<|system|>\n{message.content}</s>\n"
    elif message.role == 'user':
      prompt += f"<|user|>\n{message.content}</s>\n"
    elif message.role == 'assistant':
      prompt += f"<|assistant|>\n{message.content}</s>\n"

  # ensure we start with a system prompt, insert blank if needed
  if not prompt.startswith("<|system|>\n"):
    prompt = "<|system|>\n</s>\n" + prompt

  # add final assistant prompt
  prompt = prompt + "<|assistant|>\n"

  return prompt


def completion_to_prompt(completion):
  return "<|system|>\n</s>\n<|user|>\n{completion}</s>\n<|assistant|>\n"
Hi @Logan M, ok, more or less I get the point, but what if I just want to do question answering without the assistant, system, and user roles? Let's say I just want to provide the text fragments and also specify that multiple completions have to be taken into account, because not all fragments will fit in the LLM context?

example:

Plain Text
template = (
    "We have provided trusted context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this trusted and scientific information, please answer the question: {query_str}. "
    "Remember that the statements of the context are verified and come from trusted sources.\n"
)
qa_template = Prompt(template)

new_summary_tmpl_str = (
    "The original query is as follows: {query_str}"
    "We have provided an existing answer: {existing_answer}"
    "We have the opportunity to refine the existing answer (only if needed) with some more trusted context below. "
    "Remember that the statements of the context are verified and come from trusted sources."
    "------------"
    "{context_msg}"
    "------------"
    "Given the new trusted context, refine the original answer to better answer the query. "
    "If the context isn't useful, return the original answer. "
    "Remember that the statements of the new context are verified and come from trusted sources."
    "Refined Answer: sure thing! "
)
new_summary_tmpl = PromptTemplate(new_summary_tmpl_str)
Yea, that looks fine to me. But if your model is in this list, any prompts will be converted to chat, because it's a chat model
Attachment: image.png
Ok @Logan M, I'll first use one of the completion models to simplify this:
Plain Text
COMPLETION_MODELS = {
    "amazon.titan-tg1-large": 8000,

But how do I convert from the local llama templates to the templates (or whatever is required) in Bedrock?
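
For what it's worth, here is a sketch of how the pieces could fit together for Bedrock, reusing the llama2-style hooks sketched above and the templates defined earlier in the thread (treat this as an approximation, not tested code):

Plain Text
llm = Bedrock(
    model="meta.llama2-13b-chat-v1",
    profile_name="machineuser1",
    # Bedrock does not apply the llama2 chat template for you,
    # so pass the formatting hooks in explicitly
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    chunk_size=256,
)

response_synthesizer = get_response_synthesizer(
    text_qa_template=qa_template,
    refine_template=new_summary_tmpl,
)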