martinkozle
Joined September 25, 2024
Hello, I came across the term "Act Order (desc_act)" at https://huggingface.co/TheBloke/vicuna-7B-v1.3-GPTQ in the context of quantisation. I couldn't find any information on what it is. Does anybody know what it means?
1 comment
Hello, I am having trouble porting my code to async. I have a chat engine initialized with streaming=True, on which I now call aquery. This still returns a StreamingResponse, whose response_gen attribute is a TokenGen, i.e. a synchronous generator. I noticed that types.py also defines a TokenAsyncGen, but I don't see a way to get one through the chat engine. Am I missing something in the library API, or is async token streaming not implemented yet, so that I have to use a thread to consume the tokens asynchronously?
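For illustration, the thread-based fallback mentioned in the question could look roughly like the sketch below. It is a generic asyncio pattern, not part of the library's API: `aiter_tokens` and `fake_token_gen` are hypothetical names, and `fake_token_gen` only stands in for a synchronous generator such as `response_gen`. It assumes Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
import time
from typing import AsyncGenerator, Generator, Iterable

# Unique marker returned by next() when the synchronous generator is exhausted.
_DONE = object()


async def aiter_tokens(token_gen: Iterable[str]) -> AsyncGenerator[str, None]:
    """Expose a blocking token generator as an async generator.

    Each blocking next() call runs in a worker thread, so the event loop
    stays free while the next token is being produced.
    """
    it = iter(token_gen)
    while True:
        # next(it, _DONE) returns _DONE instead of raising StopIteration.
        token = await asyncio.to_thread(next, it, _DONE)
        if token is _DONE:
            break
        yield token


def fake_token_gen() -> Generator[str, None, None]:
    """Hypothetical stand-in for a synchronous generator like response_gen."""
    for token in ["Hello", ", ", "world"]:
        time.sleep(0.01)  # simulate blocking work between tokens
        yield token


async def main() -> None:
    async for token in aiter_tokens(fake_token_gen()):
        print(token, end="", flush=True)
    print()


if __name__ == "__main__":
    asyncio.run(main())
```

In real code, the generator passed to `aiter_tokens` would be the one taken from the streaming response. Handing off one `next()` call per token keeps the sketch short, at the cost of a thread hop per token; a queue fed by a single background thread is an alternative if that overhead matters.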
6 comments