Hey,

I'm experimenting with Replicate to compare it with OpenAI's GPT-3.5 Turbo, and it seems quite reasonable. My use case is to provide a knowledge base and have the model respond to queries using only that knowledge, with no prior knowledge.

Which model category should I look into to have the best experience for my use case?

Thank you!
LlamaIndex provides an LLM compatibility tracking page comparing performance on different tasks. You can check it to find the LLM that suits your needs.

https://docs.llamaindex.ai/en/stable/module_guides/models/llms.html#llm-compatibility-tracking
@WhiteFang_Jr thanks for checking! It seems OpenAI is the best option.
Yeah, but an open-source model like Zephyr is also looking good: https://colab.research.google.com/drive/1UoPcoiA5EOBghxWKWduQhChliMHxla7U?usp=sharing
It won't be as good as OpenAI, though.
@WhiteFang_Jr I did a quick test with Zephyr and found that the response/answer is trimmed.
"...This command will download the required files and perform the"

It ends at "the" for some odd reason
@WhiteFang_Jr do you have any idea why that'd be?

Plain Text
from pathlib import Path
from typing import Union

from fastapi import FastAPI
from llama_index import ServiceContext, SummaryIndex, download_loader
from llama_index.llms import Replicate
from llama_index.prompts import PromptTemplate

app = FastAPI()

# Zephyr 7B beta served via Replicate
llm = Replicate(
    model="tomasmcm/zephyr-7b-beta:961cd6665b811d0c43c0b9488b6dfa85ff5c7bfb875e93b4533e4c7f96c7c526"
)
service_context = ServiceContext.from_defaults(llm=llm)

# Load the knowledge base from a local Markdown file
MarkdownReader = download_loader("MarkdownReader")
loader = MarkdownReader()

# Custom QA prompt: answer strictly from the provided knowledge
template = (
    "We have provided knowledge below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given the provided knowledge and no prior knowledge, "
    "answer the query, including the commands and the documentation URL.\n"
    "The answer should only contain accurate information from the provided knowledge.\n"
    "If there is no answer, ask the user to visit the main blog and documentation website.\n"
    "The query is: {query_str}\n"
)
qa_template = PromptTemplate(template)


@app.get("/ping")
def read_root():
    return "pong"


@app.get("/query")
def query(question: Union[str, None] = None):
    documents = loader.load_data(file=Path("./knowledge.md"))
    index = SummaryIndex.from_documents(documents, service_context=service_context)
    query_engine = index.as_query_engine()
    # Swap in the custom QA prompt on the response synthesizer
    query_engine.update_prompts(
        {"response_synthesizer:text_qa_template": qa_template}
    )
    answer = query_engine.query(question)

    print(answer)

    return {"answer": str(answer)}
Maybe the total tokens are getting consumed before it is able to generate the full answer.
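Truncation like that usually means the model hit its output-token limit. A minimal sketch of raising it; I'm assuming this Zephyr build exposes a max_new_tokens input on Replicate, so check the model's input schema if the name differs:

Plain Text
from llama_index.llms import Replicate

# Assumption: this Zephyr build accepts a max_new_tokens input on Replicate;
# verify the parameter name in the model's schema on replicate.com.
llm = Replicate(
    model="tomasmcm/zephyr-7b-beta:961cd6665b811d0c43c0b9488b6dfa85ff5c7bfb875e93b4533e4c7f96c7c526",
    additional_kwargs={"max_new_tokens": 512},  # raise the output-token budget
)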
I see! I'm actually using the free version just for testing, maybe that's why?
You can try interacting with the LLM directly to see if it is working fine or not.

Plain Text
print(llm.complete("Hey how are you?"))
@WhiteFang_Jr thanks, I'll check it out