The community members are discussing how to switch LLMs (Large Language Models) when using the create-llama stack on LlamaIndex, and how to fix the max-token problem. They suggest modifying the LLM in the service context, which may live in the llamaindex-streaming.ts file. One community member shares some code related to creating a parser and stream transformer. Another suggests changing the model name in the constants.ts file or modifying a specific line of code. The community members also discuss whether the chat history resets after each run or refresh, and how to automatically reduce the size of the input context to avoid the max-token error.
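For context, swapping the model in the service context might look roughly like the sketch below. This is only illustrative: the exact exports (`OpenAI`, `serviceContextFromDefaults`) and option names vary between llamaindex TS versions, and the model name shown is an assumption, not something from the thread.

```ts
// Hypothetical sketch of changing the LLM used by the create-llama chat route.
// Export names and options depend on the pinned llamaindex version.
import { OpenAI, serviceContextFromDefaults } from "llamaindex";

// Picking a model with a larger context window makes it less likely that a
// long chat history overflows an 8192-token limit.
const llm = new OpenAI({
  model: "gpt-3.5-turbo-16k", // assumed model name; use whatever your account supports
  temperature: 0.1,
});

export const serviceContext = serviceContextFromDefaults({ llm });
```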
Or do you know a way to automatically reduce the size of the input context, so that you don't keep hitting this error: "This model's maximum context length is 8192 tokens. However, your messages resulted in 8209 tokens. Please reduce the length of the messages."
I'm waaay less familiar with the TS library, so I'm not immediately sure how to fix that. Something about limiting the chat memory somewhere, I'm guessing.
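On that last point, one library-agnostic way to keep the input under the model's limit is to drop the oldest turns before each request. A minimal sketch follows; the `ChatMessage` shape, the chars/4 token estimate, and the helper name are all assumptions for illustration (a real tokenizer such as tiktoken would count tokens more accurately), not part of create-llama.

```ts
// Hypothetical helper: trim the oldest chat messages until a rough token
// estimate fits under the model's context window.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Crude estimate: ~4 characters per token for English text.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

export function trimChatHistory(
  messages: ChatMessage[],
  maxTokens = 8192,
  reservedForResponse = 512,
): ChatMessage[] {
  const budget = maxTokens - reservedForResponse;

  // Keep any system prompt; trim only the conversational turns.
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  const kept: ChatMessage[] = [];
  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);

  // Walk backwards from the newest message, keeping as many as fit;
  // the most recent message is always kept even if it alone exceeds the budget.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > budget && kept.length > 0) break;
    kept.unshift(rest[i]);
    used += cost;
  }

  return [...system, ...kept];
}
```

Calling `trimChatHistory(messages)` on the chat history before handing it to the LLM would silently discard the oldest turns instead of surfacing the max-token error to the user.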