paull
Joined September 25, 2024
Hello. I noticed that the OpenAI LLM seems to apply retries twice: the max_retries parameter is passed to openai's OpenAI client (the "native" SDK) and is also used by llm_retry_decorator, which wraps methods such as chat. This leads to retrying 3x3 times when we think we only retry 3 times. Is that on purpose, or am I missing something?
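To make the multiplication concrete, here is a minimal, library-free sketch of two nested retry layers. Everything in it (FlakyBackend, with_attempts, count_calls) is hypothetical and only stands in for the SDK client and the decorator; it also assumes both layers count plain attempts, which may not match how each real layer interprets max_retries.

```python
from typing import Callable


class FlakyBackend:
    """Hypothetical backend that always fails, so every attempt is used."""

    def __init__(self) -> None:
        self.calls = 0

    def request(self) -> str:
        self.calls += 1
        raise ConnectionError("simulated failure")


def with_attempts(fn: Callable[[], str], max_attempts: int) -> Callable[[], str]:
    """Wrap fn so it is tried up to max_attempts times before giving up."""
    assert max_attempts >= 1

    def wrapper() -> str:
        last_error = None
        for _ in range(max_attempts):
            try:
                return fn()
            except ConnectionError as err:
                last_error = err
        raise last_error

    return wrapper


def count_calls(inner_attempts: int, outer_attempts: int) -> int:
    """Return how many times the backend is hit when both layers retry."""
    backend = FlakyBackend()
    sdk_call = with_attempts(backend.request, inner_attempts)  # stands in for the SDK client
    wrapped = with_attempts(sdk_call, outer_attempts)  # stands in for the decorator
    try:
        wrapped()
    except ConnectionError:
        pass
    return backend.calls


total_calls = count_calls(inner_attempts=3, outer_attempts=3)
```

Under these assumptions the backend is hit inner_attempts * outer_attempts times, so setting either layer to a single attempt brings the total back to what the other layer is configured for.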
4 comments
Hi. I'm implementing my own VectorStore. I'm currently implementing BasePydanticVectorStore, and I'm wondering why this class has to be a Pydantic model rather than a plain Python class. Context: I'd like to pass a non-serializable object as an attribute when instantiating the class.
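One general Pydantic pattern for this situation is a private attribute, which is skipped by validation and serialization. The sketch below assumes only Pydantic itself; SketchVectorStore and dump_fields are hypothetical names, and the class is not the real BasePydanticVectorStore, so whether the same approach fits a given subclass would need checking.

```python
import threading
from typing import Any

from pydantic import BaseModel, PrivateAttr


class SketchVectorStore(BaseModel):
    """Hypothetical store used for illustration only."""

    collection_name: str  # regular field: validated and serialized
    _client: Any = PrivateAttr(default=None)  # private: not validated, not dumped

    def __init__(self, client: Any, **data: Any) -> None:
        super().__init__(**data)
        # Private attributes are assigned after the model is initialized.
        self._client = client


def dump_fields(model: BaseModel) -> dict:
    """Dump public fields; model_dump() is Pydantic v2, dict() is v1."""
    if hasattr(model, "model_dump"):
        return model.model_dump()
    return model.dict()


lock = threading.Lock()  # stand-in for a non-serializable client object
store = SketchVectorStore(client=lock, collection_name="docs")
```

The lock is reachable as store._client, while dump_fields(store) contains only collection_name.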
2 comments
Hi. I'm wondering whether it's possible to run a QueryPipeline that connects a retriever to a synthesizer with one argument, query_str, that is used by both the retriever and the synthesizer, plus additional arguments (foo and bar, for example) that would partially format the synthesizer's prompt. I've been unsuccessful so far with run_multi because I don't see how to pass the additional variables that format the prompt template to the synthesizer.
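The idea of partially formatting a prompt can be sketched without any pipeline at all: bind foo and bar first, and leave query_str and context_str as placeholders to be filled later. This is plain Python, not the QueryPipeline API; partial_format and the template text are hypothetical, and the helper assumes simple {name} placeholders whose bound values contain no braces.

```python
class _KeepMissing(dict):
    """Mapping that leaves unknown placeholders untouched."""

    def __missing__(self, key: str) -> str:
        return "{" + key + "}"


def partial_format(template: str, **bound: str) -> str:
    """Fill only the placeholders given in bound; keep the rest as-is."""
    return template.format_map(_KeepMissing(bound))


# Hypothetical synthesizer prompt with two extra variables, foo and bar.
template = "Tone: {foo}\nAudience: {bar}\nContext: {context_str}\nQuestion: {query_str}"

# Step 1: bind the extra variables up front.
partial_prompt = partial_format(template, foo="formal", bar="engineers")

# Step 2: later, fill in what the retriever and the query provide.
final_prompt = partial_prompt.format(context_str="some context", query_str="What is X?")
```

With the extra variables pre-bound this way, the remaining template only needs query_str and context_str at run time.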
2 comments
Hello. How do you delete all indexed documents matching a metadata filter (e.g., delete all rows from the vector store where the custom metadata filename = xxx)?
2 comments

Text replace

Hi. What's the recommended way to index a transformation of a text (e.g. a summary or a title) and return the original text in its place when it is retrieved? The constraint is that the original text must not be indexed. One use case would be to index table summaries but return the original tables after retrieval (without indexing the original tables, or at least without making them retrievable directly, only via their summaries). Thanks!
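The pattern described here can be sketched in plain Python: only the summary is scored against the query, and the original is carried along as a payload that is swapped in on return. Entry, score and retrieve are hypothetical names, the table data is made up for illustration, and word overlap is only a stand-in for embedding similarity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    summary: str   # the only text that is searched
    original: str  # returned to the caller, never scored


def score(query: str, text: str) -> int:
    """Toy relevance: number of shared lowercase words (stand-in for embeddings)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str, entries: List[Entry], top_k: int = 1) -> List[str]:
    """Rank entries by their summaries, then return the original texts."""
    ranked = sorted(entries, key=lambda e: score(query, e.summary), reverse=True)
    return [e.original for e in ranked[:top_k]]


# Illustrative data only.
entries = [
    Entry(summary="Employee headcount by department", original="| dept | headcount |"),
    Entry(summary="Quarterly revenue by region", original="| region | q1 | q2 |"),
]

hits = retrieve("quarterly revenue by region", entries)
```

Because the original tables never enter the scoring step, they can only be reached through their summaries.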
2 comments
Hello. I can't get parallel calls when summarizing documents:
import nest_asyncio
nest_asyncio.apply()

...

service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1024)
response_synthesizer = get_response_synthesizer(
    use_async=True,
    response_mode="tree_summarize",
    summary_template=PromptTemplate(custom_tmpl)
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    tables,
    service_context=service_context,
    response_synthesizer=response_synthesizer,
    show_progress=True,
    use_async=True,
)

Logs show that each element in tables is summarized one by one, each waiting for the previous one to complete, with no parallelism at all. Am I missing something? I didn't find any documentation on how to set the level of parallelism (for example, how many calls to an API run in parallel).
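For reference, this is roughly what bounded parallelism looks like in plain asyncio, independent of DocumentSummaryIndex: tasks are launched together and a semaphore caps how many run at once. The summarize coroutine is a hypothetical stand-in for an LLM call, and max_concurrency is an assumed parameter name, not a library option.

```python
import asyncio
from typing import List


async def summarize(doc: str) -> str:
    """Hypothetical stand-in for one asynchronous LLM call."""
    await asyncio.sleep(0.01)
    return f"summary of {doc}"


async def summarize_all(docs: List[str], max_concurrency: int = 4) -> List[str]:
    """Summarize all docs concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(doc: str) -> str:
        async with semaphore:
            return await summarize(doc)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(doc) for doc in docs))
```

Calling asyncio.run(summarize_all(["t1", "t2", "t3"], max_concurrency=2)) summarizes the three items with at most two calls in flight at any time.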
7 comments
I'm using pgvector as the vector store. Given a node id, I used to fetch the node as follows: index.docstore.get_node('xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'). However, I noticed this no longer works (I carefully checked the id, and it does exist in the DB). Did something change?
Error:
server  |   File "/usr/local/lib/python3.11/site-packages/llama_index/core/storage/docstore/keyval_docstore.py", line 279, in get_document
server  |     raise ValueError(f"doc_id {doc_id} not found.")
3 comments
I'm thinking about customizing the metadata_template and text_template parameters of Document / TextNode in order to customize the generated prompt sent to the LLM. I noticed they are written to the metadata_ field in the DB. Does that mean they can't be dynamic? Ideally, I would like to be able to change the prompt's context_str easily without re-indexing everything.
1 comment
Hi there. When we talk about multilingual models for embeddings, what do we mean exactly?
  1. that we can efficiently embed a document in any of the supported languages and query in any of the supported languages (=> embeddings would encode the same representation whatever the original language)
  2. that we can efficiently embed a document in any of the supported languages, but the query must be in the same language as the document
I'm asking because I noticed that a query in French did not retrieve the relevant English document, whereas the same query translated into English retrieved it perfectly. A test with the roles of FR and EN swapped showed the same behaviour.
4 comments