Version is 0.9.14.post3. I was previously using GPT-3.5 Turbo 16k via the OpenAI class; now I'm switching to LiteLLM so I can try out Mixtral (32K context) via Together AI. Here's my initialization code:
from llama_index import ServiceContext
from llama_index.llms import LiteLLM
from llama_index.memory import ChatMemoryBuffer
from llama_index.postprocessor.cohere_rerank import CohereRerank

# Mixtral 8x7B Instruct served by Together AI, routed through LiteLLM
llm = LiteLLM(model="together_ai/mistralai/Mixtral-8x7B-Instruct-v0.1", temperature=0)

service_context = ServiceContext.from_defaults(
    llm=llm,
    chunk_size=768,
    chunk_overlap=75,
    callback_manager=callback_manager,
    embed_model=embed_model,
)
# Load the previously built index from storage (load_embeddings is a custom helper)
index = load_embeddings(service_context, storage_folder, game_system_name)

reranker = CohereRerank(api_key=api_key, top_n=50)
memory = ChatMemoryBuffer.from_defaults()
chat_engine = index.as_chat_engine(
    chat_mode="context",
    retriever_mode="embedding",
    similarity_top_k=10,
    node_postprocessors=[reranker],
    memory=memory,  # wire in the buffer created above instead of the engine's default
    verbose=True,
    system_prompt=" ".join((DM_Prompt, History_Prompt, Character_Prompt, Story_Prompt)),
)
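For comparison, the old setup only differed in how the LLM was constructed; it went through LlamaIndex's OpenAI class, roughly like this (a sketch from memory, the rest of the pipeline was identical):

from llama_index.llms import OpenAI

# Previous LLM: GPT-3.5 Turbo 16k via the OpenAI class
llm = OpenAI(model="gpt-3.5-turbo-16k", temperature=0)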
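And this is a minimal sketch of how I exercise the engine. As far as I understand, LiteLLM's Together AI provider reads the key from the TOGETHERAI_API_KEY environment variable (adjust if your setup differs), and the test question is just a placeholder:

import os

# Assumption: LiteLLM picks up the Together AI key from this environment variable
os.environ["TOGETHERAI_API_KEY"] = "<my-together-ai-key>"

# Placeholder question just to confirm the chat engine responds
response = chat_engine.chat("Who are you?")
print(response)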