Hi! I'm getting 'Rate limit reached for 10KTPM-200RPM' errors when using gpt-4. Should LlamaIndex take the limits into account and sleep between calls, or is that something I'll need to do on the app side?
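In case it helps: one common app-side workaround is to wrap the calls in a retry loop with exponential backoff and sleep when a rate-limit error comes back. A minimal sketch (the `RateLimitError` class here is a stand-in for whatever exception your client raises; names and delays are assumptions, not LlamaIndex API):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the rate-limit exception your LLM client raises."""

def retry_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(); on a rate-limit error, sleep with exponential backoff and retry."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # exponential backoff plus a little jitter to avoid thundering herds
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

You would then wrap each query call, e.g. `retry_with_backoff(lambda: index.query("..."))`.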
I was trying to follow the text2sql example and noticed that when you ask something, it builds a SQL query and executes it (nice!). However, I went to read the prompt used, which is this one:
Hi all! I've got a Qdrant db where I stored some docs using LangChain. I'm trying to load and use them with LlamaIndex, without much luck. Each seems to save content in the db in a different way: LlamaIndex uses the 'text' payload property to store the text, and LangChain uses 'page_content'. How could one query docs stored LangChain's way with LlamaIndex?
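One pragmatic option is to migrate the payloads: copy LangChain's 'page_content' value into the 'text' key that LlamaIndex reads. A minimal sketch of that key mapping, assuming the two key names from the question above (the surrounding loop over `qdrant_client`'s `scroll`/`set_payload` is shown only as comments, since exact signatures vary by client version):

```python
def to_llama_payload(payload: dict) -> dict:
    """Copy LangChain's 'page_content' into the 'text' key LlamaIndex reads."""
    migrated = dict(payload)
    if "page_content" in migrated and "text" not in migrated:
        migrated["text"] = migrated["page_content"]
    return migrated

# Rough migration loop (pseudocode-ish, check your qdrant_client version):
# points, _ = client.scroll("my_collection", with_payload=True, limit=100)
# for point in points:
#     client.set_payload("my_collection",
#                        payload=to_llama_payload(point.payload),
#                        points=[point.id])
```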
Hi all. It looks like LlamaIndex's create-and-refine over a vector index is pretty similar to LangChain's load_summarize_chain with chain_type='refine'. Are they really similar, or am I crazy?
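For what it's worth, both seem to follow the same general refine pattern: answer from the first chunk, then repeatedly ask the LLM to improve that answer with each next chunk. A library-free sketch of that loop (prompt wording and function names here are assumptions, not either library's actual prompts):

```python
def refine_summarize(chunks, llm):
    """Create-and-refine: draft from the first chunk, refine with each later one."""
    answer = llm(f"Summarize:\n{chunks[0]}")
    for chunk in chunks[1:]:
        # each pass sees the running answer plus one new chunk of context
        answer = llm(
            f"Existing answer:\n{answer}\n\n"
            f"Refine it with this new context:\n{chunk}"
        )
    return answer
```

The libraries differ mainly in where the chunks come from (retrieved nodes vs. documents) and the exact prompt templates, not in the loop itself.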
Hi all, new to LlamaIndex here. I'm trying to figure out how to add 'short memory', like adding the query and response text of the conversation into the next prompt. Is that possible? I know I would hit the max token limit quite fast, but it would be useful anyway.
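Until there's a built-in way, this can be done by hand: keep the last few (query, response) pairs and prepend them to each new prompt, capping the window so you don't blow the token limit. A minimal sketch (class and method names are made up for illustration):

```python
class ShortMemory:
    """Keep the last k exchanges and prepend them to the next prompt."""

    def __init__(self, k=3):
        self.k = k
        self.turns = []  # list of (query, response) pairs

    def build_prompt(self, query):
        # only the k most recent turns go in, to bound prompt size
        history = "\n".join(
            f"User: {q}\nAssistant: {r}" for q, r in self.turns[-self.k:]
        )
        prefix = f"{history}\n" if history else ""
        return f"{prefix}User: {query}\nAssistant:"

    def record(self, query, response):
        self.turns.append((query, response))
```

Usage: build the prompt, send it to your index/LLM, then `record()` the exchange so the next prompt includes it.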