
Hey @Logan M!

Is it possible to setup streaming answer for gpt-4 / gpt-3.5?
Definitely, ezpz

Plain Text
from langchain.chat_models import ChatOpenAI
from llama_index import ServiceContext, LLMPredictor, GPTVectorStoreIndex

# streaming=True on the LLM enables token-by-token output
# (swap in model_name='gpt-4' the same way)
llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0, streaming=True)
service_context = ServiceContext.from_defaults(llm_predictor=LLMPredictor(llm=llm))

# `documents` is assumed to be loaded already
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(streaming=True)

response = query_engine.query('...')

# print the stream to the terminal
response.print_response_stream()

# OR get the actual token generator object
for token in response.response_gen:
    # do a thing with each token, e.g. print it as it arrives
    print(token, end='')
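Note that `response_gen` is just a Python generator of text chunks, so you can consume it however you like. A minimal self-contained sketch of assembling the streamed chunks into a full answer (the `fake_stream` generator below is a stand-in for `response.response_gen`, purely for illustration):

```python
def fake_stream():
    # stand-in for response.response_gen: yields text chunks as they arrive
    for chunk in ["Stream", "ing ", "works", "!"]:
        yield chunk

pieces = []
for token in fake_stream():
    print(token, end="", flush=True)  # show each chunk as soon as it arrives
    pieces.append(token)

full_answer = "".join(pieces)  # the complete response once streaming ends
```

Same pattern works with the real generator: stream to the UI as tokens arrive, then keep the joined string for logging or caching.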
You are a wizard πŸͺ„ @Logan M