Find answers from the community

Abhishek
Offline, last seen 3 months ago
Joined September 25, 2024
Hey Team,
We are adding a custom retriever for search, which requires a metadata field (an array of strings) under the document metadata. While doing so we received the following error [Attached screenshot: Llama-index insertion.png]. However, when we tried the same upsert with the Pinecone core package, the document was inserted successfully [Attached screenshot: Pinecone insertion.png].

I also remember this was already discussed in an earlier thread [sorry, couldn't find the reference to the exact thread].
Any estimate on how long it might take for this functionality to be added to llama-index for Pinecone?
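One interim workaround, while the integration catches up, is to coerce metadata into the shapes the Pinecone store accepts before insertion. This is a minimal sketch, assuming the rejection comes from metadata validation; `sanitize_metadata` is a hypothetical helper, not a llama-index or Pinecone API, and the allowed types reflect Pinecone's documented metadata rules (strings, numbers, booleans, lists of strings):

```python
from typing import Any, Dict

# Pinecone metadata values may be strings, numbers, booleans,
# or lists of strings; nested objects are not supported.
ALLOWED_SCALARS = (str, int, float, bool)

def sanitize_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce a metadata dict into Pinecone-compatible values."""
    clean: Dict[str, Any] = {}
    for key, value in metadata.items():
        if isinstance(value, ALLOWED_SCALARS):
            clean[key] = value
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            clean[key] = value  # list of strings is allowed by Pinecone
        else:
            clean[key] = str(value)  # stringify anything unsupported
    return clean
```

Running each node's metadata through a filter like this before inserting keeps the array-of-strings field intact while avoiding values the store layer may reject.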

7 comments
Hi team, we are migrating to the latest version of llama-index, i.e. 0.8.36.
We have the following code to generate embeddings, but it seems it no longer works with the latest version:

CODE:
Plain Text
embed_model = self.service_context.embed_model
for node in keyword_nodes:
    embed_model.queue_text_for_embedding(
        node.node.node_id,
        node.node.get_text(),
    )
_, text_embeddings = embed_model.get_queued_text_embeddings()
for idx in range(len(keyword_nodes)):
    keyword_nodes[idx].node.embedding = text_embeddings[idx]

Error: AttributeError: 'OpenAIEmbedding' object has no attribute 'queue_text_for_embedding'
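The queue-based embedding API (`queue_text_for_embedding` / `get_queued_text_embeddings`) was removed in later releases in favor of batch calls. A minimal migration sketch, assuming the embed model exposes `get_text_embedding_batch` (present on `OpenAIEmbedding` in 0.8.x, but verify against your installed version):

```python
def embed_keyword_nodes(embed_model, keyword_nodes):
    """Attach embeddings to scored nodes using a single batch call."""
    texts = [scored.node.get_text() for scored in keyword_nodes]
    # one batched embedding request replaces the old queue/flush pair
    embeddings = embed_model.get_text_embedding_batch(texts)
    for scored, embedding in zip(keyword_nodes, embeddings):
        scored.node.embedding = embedding
    return keyword_nodes
```

The function keeps the same node-in-place update as the original snippet, just without the removed queue methods.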

Any help here?
24 comments

Hi all, I was going through the documentation of the query evaluator & query response evaluator, and I found examples where response_mode="no_text". So I'm curious: when we use the no_text response mode, does it affect the accuracy of the evaluator?

When going through the code of the evaluators, I also found that answer=str(response) is required, and the prompts are written in a way that assumes an answer is present.

Any help here. Thanks!
1 comment
Hi, any help on the error below?
Plain Text
text_chunks = self.text_splitter.split_text(document.text)

File "/usr/local/lib/python3.8/site-packages/llama_index/langchain_helpers/text_splitter.py", line 118, in split_text
    text_splits = self.split_text_with_overlaps(text, extra_info_str=extra_info_str)
File "/usr/local/lib/python3.8/site-packages/llama_index/langchain_helpers/text_splitter.py", line 157, in split_text_with_overlaps
    raise ValueError(
ValueError: A single term is larger than the allowed chunk size.
Term size: 1094
Chunk size: 512
Effective chunk size: 512


If I have no control over the document, how can I enforce the chunk size?
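One way to handle uncontrolled input is to pre-process the text so the splitter never sees a single unsplittable term. A sketch of that idea, assuming a character-length cap is an acceptable proxy (the splitter actually measures term size in tokens, so in practice the cap should be set well below the chunk size); `break_long_terms` is a hypothetical helper, not a llama-index API:

```python
def break_long_terms(text: str, max_term_len: int) -> str:
    """Hard-split any whitespace-free run longer than max_term_len,
    so a token-based splitter never encounters an oversized term."""
    out_words = []
    for word in text.split(" "):
        # chop an oversized run into max_term_len-sized pieces
        while len(word) > max_term_len:
            out_words.append(word[:max_term_len])
            word = word[max_term_len:]
        out_words.append(word)
    return " ".join(out_words)
```

Applied before `split_text`, e.g. `self.text_splitter.split_text(break_long_terms(document.text, 256))`, this should avoid the ValueError at the cost of inserting artificial breaks inside very long tokens (URLs, base64 blobs, etc.).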
14 comments
What is the use case of SentenceEmbeddingOptimizer?

I have used it with GPTPineconeIndex in a query, but I see that token consumption has increased instead of decreasing when used with the following parameters:
Original -> (LLM: 2342, Embedding: 7)
  1. threshold_cutoff=0.7 - Total LLM token usage increased (LLM: 2720, Embedding: 7)
  2. percentile_cutoff=0.5, threshold_cutoff=0.7 - Total LLM token usage is lower than in test 1, but still more than the original consumption
  3. percentile_cutoff=0.8, threshold_cutoff=0.7 - Token consumption is lower than the original, but the model hallucinated and generated a wrong answer (LLM: 2248, Embedding: 7)
  4. threshold_cutoff=0.8 - Error: optimizer returned zero sentences
Any help here to reduce token consumption?
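The pattern in these tests is consistent with how the two cutoffs behave: a threshold alone removes nothing when most sentences score above it (and can even grow the prompt after sentence re-joining), while a percentile cutoff imposes a hard cap on how much context survives. A rough sketch of the percentile idea, as a simplification of what the optimizer does rather than its actual implementation:

```python
def percentile_filter(sentences, scores, percentile_cutoff):
    """Keep only the top `percentile_cutoff` fraction of sentences,
    ranked by similarity score, preserving original order."""
    ranked = sorted(zip(sentences, scores), key=lambda p: p[1], reverse=True)
    keep = max(1, int(len(ranked) * percentile_cutoff))
    kept = {s for s, _ in ranked[:keep]}
    # emit survivors in their original reading order
    return [s for s in sentences if s in kept]
```

This is why test 3 (percentile_cutoff=0.8) is the only run that beat the original count: the percentile guarantees a reduction, while threshold_cutoff=0.7 alone does not. The hallucination in test 3 and the zero-sentence error in test 4 are the flip side, since aggressive trimming can drop the sentence that actually contains the answer.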
19 comments
Hey Team, hope you're doing all good 🙂

We recorded an issue while querying the index. We haven't faced it earlier, and it isn't reproducible from our side. We need assistance debugging it. Sharing the necessary details below:

Version details:
llama-index==0.8.38
langchain==0.0.304

Traceback message in the attached file
@Logan M / @ravitheja any help here?
6 comments
Hey team, We are migrating to version 0.8.36 from an older version. The migration was successful except for the following issue: the DatasetGenerator used to work on the older version, but it breaks on 0.8.36. Sharing the error details and code snippet below.

Code Snippet:
Plain Text
## document_chunks is of the following typehints: List[Document]
## For example: document_chunks = [
## Document(id_='651ee77f9e9ad9292457dce8', embedding=None, metadata={...}, excluded_embed_metadata_keys=[],
## excluded_llm_metadata_keys=[], relationships={}, hash='6a1ed207bcccfea219f5d4b9fe764aa70bac565518c17865804b3505d6a4c2bb', ## text="...", start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', 
## metadata_template='{key}: {value}', metadata_seperator='\n'),
## ...,]

data_generator = DatasetGenerator.from_documents(
            documents=document_chunks,
            service_context=service_context,
        )
questions = data_generator.generate_questions_from_nodes(num=NUM_QUESTIONS)


The error we started to face from the above code (the same code used to work with earlier versions):
Plain Text
File "/lib/python3.10/site-packages/llama_index/evaluation/dataset_generation.py", line 252, in <dictcomp>
    query_id: responses_dict[query_id] for query_id in query_ids

Any help on the above issue^ @Logan M / @ravitheja?
Thanks
3 comments
Hi all, while using download_loader(), when I run the application I face a FileExistsError or a ConnectionTimeoutError, especially with the Google Calendar and Google Drive connectors.
Does anyone else face this issue?
Any help here @jerryjliu0 @Logan M @ravitheja
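Both symptoms look transient (a download race for FileExistsError, a network hiccup for the timeout), so a simple retry wrapper often papers over them. This is a hypothetical helper, not part of llama-index; note FileExistsError is a subclass of OSError, so the default exception tuple catches it, while the timeout class depends on which library raises it:

```python
import time

def with_retries(fn, attempts=3, delay=0.5, exceptions=(OSError,)):
    """Call fn(), retrying on the given exceptions with a fixed delay."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries, surface the real error
            time.sleep(delay)
```

Usage would look like `reader_cls = with_retries(lambda: download_loader("GoogleDriveReader"))`. If your version of download_loader supports a refresh_cache argument, passing it can also help with a stale loader cache directory that triggers FileExistsError, though that is worth verifying against your installed version.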
16 comments