Find answers from the community

Abhishek
Offline, last seen 3 months ago
Joined September 25, 2024
Hey Team,
We are adding a custom retriever for search, which requires a metadata field (an array of strings) under the document metadata. While doing so we received the following error [Attached screenshot: Llama-index insertion.png]. However, when we tried the same upsert with the Pinecone core package, the document was inserted successfully [Attached screenshot: Pinecone insertion.png].

I also remember this was already discussed in an earlier thread [sorry, couldn't find the reference to the exact thread].
Any estimate on how long it might take for this functionality to be added to llama-index for Pinecone?
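One interim workaround, while the integration catches up, is to coerce metadata into the shapes the Pinecone store accepts before insertion. This is a minimal sketch, assuming the rejection comes from metadata validation; `sanitize_metadata` is a hypothetical helper, not a llama-index or Pinecone API, and the allowed types reflect Pinecone's documented metadata rules (strings, numbers, booleans, lists of strings):

```python
from typing import Any, Dict

# Pinecone metadata values may be strings, numbers, booleans,
# or lists of strings; nested objects are not supported.
ALLOWED_SCALARS = (str, int, float, bool)

def sanitize_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce a metadata dict into Pinecone-compatible values."""
    clean: Dict[str, Any] = {}
    for key, value in metadata.items():
        if isinstance(value, ALLOWED_SCALARS):
            clean[key] = value
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            clean[key] = value  # list of strings is allowed by Pinecone
        else:
            clean[key] = str(value)  # stringify anything unsupported
    return clean
```

Running each node's metadata through a filter like this before inserting keeps the array-of-strings field intact while avoiding values the store layer may reject.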

7 comments
Hi team, we are migrating to the latest version of llama-index, i.e. 0.8.36.
We have the following code to generate embeddings, but it seems it no longer works with the latest version:

CODE:
Plain Text
embed_model = self.service_context.embed_model
for node in keyword_nodes:
    embed_model.queue_text_for_embedding(
        node.node.node_id,
        node.node.get_text(),
    )
_, text_embeddings = embed_model.get_queued_text_embeddings()
for idx in range(len(keyword_nodes)):
    keyword_nodes[idx].node.embedding = text_embeddings[idx]

Error: AttributeError: 'OpenAIEmbedding' object has no attribute 'queue_text_for_embedding'
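The queue-based embedding API (`queue_text_for_embedding` / `get_queued_text_embeddings`) was removed in later releases in favor of batch calls. A minimal migration sketch, assuming the embed model exposes `get_text_embedding_batch` (present on `OpenAIEmbedding` in 0.8.x, but verify against your installed version):

```python
def embed_keyword_nodes(embed_model, keyword_nodes):
    """Attach embeddings to scored nodes using a single batch call."""
    texts = [scored.node.get_text() for scored in keyword_nodes]
    # one batched embedding request replaces the old queue/flush pair
    embeddings = embed_model.get_text_embedding_batch(texts)
    for scored, embedding in zip(keyword_nodes, embeddings):
        scored.node.embedding = embedding
    return keyword_nodes
```

The function keeps the same node-in-place update as the original snippet, just without the removed queue methods.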

Any help here?
24 comments

Hi all, I was going through the documentation of the query evaluator & query response evaluator, and I found examples where response_mode="no_text". So I'm curious: when we use the no_text response mode, does it affect the accuracy of the evaluator?

When going through the code of the evaluators, I also found that answer=str(response) is required, and the prompts are written in a way that assumes an answer is present.

Any help here. Thanks!
1 comment
Hi, any help on the error below?
Plain Text
text_chunks = self.text_splitter.split_text(document.text)

File "/usr/local/lib/python3.8/site-packages/llama_index/langchain_helpers/text_splitter.py", line 118, in split_text
    text_splits = self.split_text_with_overlaps(text, extra_info_str=extra_info_str)
File "/usr/local/lib/python3.8/site-packages/llama_index/langchain_helpers/text_splitter.py", line 157, in split_text_with_overlaps
    raise ValueError(
ValueError: A single term is larger than the allowed chunk size.
Term size: 1094
Chunk size: 512
Effective chunk size: 512


If I have no control over the document, how can I enforce the chunk size?
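One way to handle uncontrolled input is to pre-process the text so the splitter never sees a single unsplittable term. A sketch of that idea, assuming a character-length cap is an acceptable proxy (the splitter actually measures term size in tokens, so in practice the cap should be set well below the chunk size); `break_long_terms` is a hypothetical helper, not a llama-index API:

```python
def break_long_terms(text: str, max_term_len: int) -> str:
    """Hard-split any whitespace-free run longer than max_term_len,
    so a token-based splitter never encounters an oversized term."""
    out_words = []
    for word in text.split(" "):
        # chop an oversized run into max_term_len-sized pieces
        while len(word) > max_term_len:
            out_words.append(word[:max_term_len])
            word = word[max_term_len:]
        out_words.append(word)
    return " ".join(out_words)
```

Applied before `split_text`, e.g. `self.text_splitter.split_text(break_long_terms(document.text, 256))`, this should avoid the ValueError at the cost of inserting artificial breaks inside very long tokens (URLs, base64 blobs, etc.).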
14 comments
What is the use case of SentenceEmbeddingOptimizer?

I have used it with GPTPineconeIndex in a query, but I see that token consumption has increased instead of decreasing when used with the following parameters:
Original -> (LLM: 2342, Embedding: 7)
  1. threshold_cutoff=0.7 - Total LLM token usage increased (LLM: 2720, Embedding: 7)
  2. percentile_cutoff=0.5, threshold_cutoff=0.7 - Total LLM token usage is lower than in test 1, but still more than the original consumption
  3. percentile_cutoff=0.8, threshold_cutoff=0.7 - Token consumption is lower than the original, but the model hallucinated and generated a wrong answer (LLM: 2248, Embedding: 7)
  4. threshold_cutoff=0.8 - Error: optimizer returned zero sentences
Any help here to reduce token consumption?
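The pattern in these tests is consistent with how the two cutoffs behave: a threshold alone removes nothing when most sentences score above it (and can even grow the prompt after sentence re-joining), while a percentile cutoff imposes a hard cap on how much context survives. A rough sketch of the percentile idea, as a simplification of what the optimizer does rather than its actual implementation:

```python
def percentile_filter(sentences, scores, percentile_cutoff):
    """Keep only the top `percentile_cutoff` fraction of sentences,
    ranked by similarity score, preserving original order."""
    ranked = sorted(zip(sentences, scores), key=lambda p: p[1], reverse=True)
    keep = max(1, int(len(ranked) * percentile_cutoff))
    kept = {s for s, _ in ranked[:keep]}
    # emit survivors in their original reading order
    return [s for s in sentences if s in kept]
```

This is why test 3 (percentile_cutoff=0.8) is the only run that beat the original count: the percentile guarantees a reduction, while threshold_cutoff=0.7 alone does not. The hallucination in test 3 and the zero-sentence error in test 4 are the flip side, since aggressive trimming can drop the sentence that actually contains the answer.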
19 comments
Hey Team, hope you're doing all good 🙂

We recorded an issue while querying the index. We haven't faced it earlier, and it isn't reproducible from our side. We need assistance debugging it. Sharing the necessary details below:

Version details:
llama-index==0.8.38
langchain==0.0.304

Traceback message in the attached file
@Logan M / @ravitheja any help here?
6 comments
Hey team, We are migrating to version 0.8.36 from an older version. The migration was successful except for the following issue: the DatasetGenerator used to work on the older version, but it breaks on 0.8.36. Sharing the error details and code snippet below.

Code Snippet:
Plain Text
## document_chunks is of the following typehints: List[Document]
## For example: document_chunks = [
## Document(id_='651ee77f9e9ad9292457dce8', embedding=None, metadata={...}, excluded_embed_metadata_keys=[],
## excluded_llm_metadata_keys=[], relationships={}, hash='6a1ed207bcccfea219f5d4b9fe764aa70bac565518c17865804b3505d6a4c2bb', ## text="...", start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', 
## metadata_template='{key}: {value}', metadata_seperator='\n'),
## ...,]

data_generator = DatasetGenerator.from_documents(
            documents=document_chunks,
            service_context=service_context,
        )
questions = data_generator.generate_questions_from_nodes(num=NUM_QUESTIONS)


The error we started to face from the above code (the same code used to work with earlier versions):
Plain Text
File "/lib/python3.10/site-packages/llama_index/evaluation/dataset_generation.py", line 252, in <dictcomp>
    query_id: responses_dict[query_id] for query_id in query_ids

Any help on the above issue^ @Logan M / @ravitheja?
Thanks
3 comments
Hi all, while using download_loader(), when I run the application I face a FileExistsError or a ConnectionTimeoutError, especially with the Google Calendar and Google Drive connectors.
Does anyone else face this issue?
Any help here @jerryjliu0 @Logan M @ravitheja
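Both symptoms look transient (a download race for FileExistsError, a network hiccup for the timeout), so a simple retry wrapper often papers over them. This is a hypothetical helper, not part of llama-index; note FileExistsError is a subclass of OSError, so the default exception tuple catches it, while the timeout class depends on which library raises it:

```python
import time

def with_retries(fn, attempts=3, delay=0.5, exceptions=(OSError,)):
    """Call fn(), retrying on the given exceptions with a fixed delay."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries, surface the real error
            time.sleep(delay)
```

Usage would look like `reader_cls = with_retries(lambda: download_loader("GoogleDriveReader"))`. If your version of download_loader supports a refresh_cache argument, passing it can also help with a stale loader cache directory that triggers FileExistsError, though that is worth verifying against your installed version.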
16 comments