Find answers from the community

paull
Hi. I'm implementing my own VectorStore. I'm currently implementing BasePydanticVectorStore, and I'm wondering why this class has to be a Pydantic model rather than a simple Python class. Context: I'd like to pass a non-serializable object as an attribute to the class when instantiating it.
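For illustration, here's a minimal sketch of the pattern I have in mind, using pydantic's PrivateAttr to hold the non-serializable object (MyVectorStore is a made-up name, and plain BaseModel stands in for BasePydanticVectorStore):
Python
from typing import Any

from pydantic import BaseModel, PrivateAttr


class MyVectorStore(BaseModel):  # stand-in for BasePydanticVectorStore
    # Regular pydantic fields are validated and serialized; a PrivateAttr
    # is excluded from both, so it can hold an arbitrary object.
    _client: Any = PrivateAttr()

    def __init__(self, client: Any, **data: Any) -> None:
        super().__init__(**data)
        self._client = client


store = MyVectorStore(client=object())  # any non-serializable object works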
2 comments
Hi. I'm wondering whether it's possible to run a QueryPipeline that connects a retriever to a synthesizer, with one argument query_str that is used by both the retriever and the synthesizer, plus additional arguments foo and bar, for example, that would partially format the synthesizer's prompt. I've been unsuccessful so far with run_multi because I don't see how to pass additional variables to format the synthesizer's prompt template.
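One workaround I can think of, sketched under the assumption that foo and bar are known before the pipeline runs (the template text and values below are made up): pre-fill them with PromptTemplate.partial_format so that only query_str remains free, then hand the partially formatted template to the synthesizer.
Python
from llama_index.core import PromptTemplate, get_response_synthesizer

qa_tmpl = PromptTemplate(
    "Context:\n{context_str}\n"
    "Write a {foo} answer in {bar}.\n"
    "Question: {query_str}\nAnswer:"
)
# Only {context_str} and {query_str} remain free after partial formatting.
partial_tmpl = qa_tmpl.partial_format(foo="concise", bar="French")
synthesizer = get_response_synthesizer(text_qa_template=partial_tmpl)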
2 comments
Hello. How do you delete all indexed documents by metadata filter (e.g., delete all rows from the vector store with custom metadata filename = xxx)?
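The closest sketch I can come up with (an assumption on my part: that the index exposes delete_nodes and the retriever accepts metadata filters; index is an existing VectorStoreIndex and "filename" is the key from my example) is to retrieve the matching node ids first, then delete them:
Python
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(filters=[ExactMatchFilter(key="filename", value="xxx")])
# Dummy query with a high top_k to sweep up every node matching the filter.
retriever = index.as_retriever(similarity_top_k=10_000, filters=filters)
node_ids = [n.node_id for n in retriever.retrieve(" ")]
index.delete_nodes(node_ids)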
2 comments
paull · Text replace
Hi. What's the recommended way to index a transformation of a text (e.g. a summary, a title) and replace it with the original text when retrieved? The constraint is that the original text must not be indexed. One use case would be to index table summaries but actually return the original tables after retrieval (i.e., not index the original table, or at least make the tables not retrievable directly, only via their summaries). Thanks!
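The closest pattern I've found is sketched below: embed only an IndexNode carrying the summary, and let a RecursiveRetriever swap in the original table node at retrieval time (original_table and table_summary are placeholder strings, and I'm assuming RecursiveRetriever's node_dict resolves references this way):
Python
from llama_index.core import VectorStoreIndex
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.schema import IndexNode, TextNode

# The original table is never embedded; only the summary node is indexed.
table_node = TextNode(text=original_table, id_="table-1")
summary_node = IndexNode(text=table_summary, index_id="table-1")

index = VectorStoreIndex([summary_node])
retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": index.as_retriever()},
    node_dict={"table-1": table_node},  # retrieved summary resolves to this node
)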
2 comments
Hello. I can't get parallel calls while summarizing documents:
Python
import nest_asyncio

nest_asyncio.apply()

from llama_index.core import (
    DocumentSummaryIndex,
    PromptTemplate,
    ServiceContext,
    get_response_synthesizer,
)

...

service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1024)
response_synthesizer = get_response_synthesizer(
    use_async=True,
    response_mode="tree_summarize",
    summary_template=PromptTemplate(custom_tmpl),
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    tables,
    service_context=service_context,
    response_synthesizer=response_synthesizer,
    show_progress=True,
    use_async=True,
)

Logs show that each element in tables is summarized one by one, each waiting for the previous one to complete; there is no parallelism at all. Am I missing something? I didn't find any doc specifying how to parameterize the level of parallelism (e.g., how many API calls run in parallel).
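To isolate whether the bottleneck is the index or the LLM client itself, here's a minimal concurrency check, assuming llm and tables are defined as in the snippet above:
Python
import asyncio

# If these calls also run one by one, the limit is in the LLM client,
# not in DocumentSummaryIndex.
async def summarize_all(texts):
    tasks = [llm.acomplete(f"Summarize:\n{t}") for t in texts]
    return await asyncio.gather(*tasks)

summaries = asyncio.run(summarize_all([t.text for t in tables]))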
7 comments
I'm using pgvector as the vector store. Given a node id, I used to get the node as follows: index.docstore.get_node('xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'), but I noticed it no longer works (I checked the id carefully and it does exist in the DB). Did something change?
Error:
Plain Text
server  |   File "/usr/local/lib/python3.11/site-packages/llama_index/core/storage/docstore/keyval_docstore.py", line 279, in get_document
server  |     raise ValueError(f"doc_id {doc_id} not found.")
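For what it's worth, my current hypothesis (an assumption, not confirmed): with a remote vector store like pgvector, nodes are not written to the local docstore by default, so get_node has nothing to look up. Passing store_nodes_override at build time is supposed to force them in (documents and storage_context below are placeholders):
Python
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,  # wraps the pgvector store
    store_nodes_override=True,  # also persist nodes to the local docstore
)
node = index.docstore.get_node("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")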
3 comments
I'm thinking about customizing the metadata_template and text_template parameters of Document / TextNode in order to customize the generated prompt sent to the LLM. I noticed they are written to the metadata_ field in the DB; does that mean they can't be dynamic? Ideally, I'd like to be able to change the prompt's context_str easily without re-indexing everything.
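A workaround I'm considering, sketched under the assumption that the templates can simply be mutated on retrieved nodes before synthesis (retriever and query_str are placeholders; the template strings are, as far as I know, the library defaults):
Python
# Rewrite the templates at query time instead of relying on what was indexed.
nodes = retriever.retrieve(query_str)
for n in nodes:
    n.node.text_template = "{metadata_str}\n\n{content}"
    n.node.metadata_template = "{key}: {value}"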
1 comment
Hi there. When we talk about multilingual models for embeddings, what do we mean exactly?
  1. that we can efficiently embed a document in any of the supported languages and query in any of the supported languages (i.e., embeddings encode the same representation regardless of the original language)
  2. that we can efficiently embed a document in any of the supported languages, but the query must be in the same language
I'm asking because I noticed that a query in French did not retrieve the relevant English document, whereas the English-translated query retrieved it perfectly. A test with the roles of FR and EN swapped showed the same behaviour.
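A quick way to measure this directly, assuming a LlamaIndex embedding model instance embed_model (the sentences are made-up examples), is to compare cosine similarities of a same-language vs. cross-lingual query against the same document embedding:
Python
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

doc_emb = embed_model.get_text_embedding("The cat is sleeping on the sofa.")
q_en = embed_model.get_query_embedding("Where is the cat sleeping?")
q_fr = embed_model.get_query_embedding("Où dort le chat ?")

# If interpretation 1 holds, the two scores should be close.
print(cosine(doc_emb, q_en), cosine(doc_emb, q_fr))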
4 comments