Find answers from the community

AndreaSel93
Offline, last seen 2 months ago
Joined September 25, 2024
Hey guys, since similarity top k does not always surface the most relevant data, I was thinking of doing the following:

  • Loop over the top-n nodes from similarity top k, but use k=1 for each LLM call, asking whether the context is really pertinent to the question (the answer must be 0 for no or 1 for yes, to keep it fast, like a classification problem);
  • keep the nodes whose context is relevant and merge them (clearly separated);
  • use the QA template, and (only if the merged chunk is long enough) the refine template, only on the relevant nodes.
I'm following the OpenAI best practices for complex reasoning problems, where they suggest splitting the problem into smaller sub-problems.

What do you think? This way I should get more flexibility and better responses; however, I'm worried about the overall execution time.
Do you have suggestions?
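For what it's worth, here is a rough sketch of the loop I have in mind. It isn't tied to any specific LlamaIndex version: retriever, llm, synthesize, and node.text are just placeholders for whatever retrieval call, LLM call, and QA/refine step you already use.

# Hypothetical sketch: filter retrieved nodes with a cheap 0/1 relevance check,
# then run the usual QA/refine step only on the nodes judged relevant.

RELEVANCE_PROMPT = (
    "Question: {question}\n"
    "Context: {context}\n"
    "Is the context pertinent to the question? "
    "Answer with a single character: 1 for yes, 0 for no."
)

def filter_relevant_nodes(question, nodes, llm):
    """Keep only the nodes the LLM classifies as relevant (binary 0/1 answer)."""
    relevant = []
    for node in nodes:
        prompt = RELEVANCE_PROMPT.format(question=question, context=node.text)
        answer = llm(prompt).strip()  # assumes llm is a callable returning a string
        if answer.startswith("1"):
            relevant.append(node)
    return relevant

def answer_with_filtering(question, retriever, llm, synthesize, top_k=15):
    """Retrieve top_k candidates, filter them, then synthesize from the survivors."""
    candidates = retriever(question, top_k=top_k)  # assumed retrieval call
    relevant = filter_relevant_nodes(question, candidates, llm)
    merged_context = "\n\n---\n\n".join(node.text for node in relevant)  # merged, clearly separated
    return synthesize(question, merged_context)  # assumed QA/refine call

The extra cost is one short LLM call per candidate node, so keeping the classification answer to a single token is what keeps the loop fast.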
1 comment
In my case it finds:

1) doc “xyw” paragraph “123”
2) doc “xyw” paragraph “123”
3) doc “xyz” paragraph “456”

Wtf is going on😅
5 comments
Huge work! Do you think this new lower-level API (node) also affects a big text DB, or only JSON, images, etc.? Can you explain the implications in more detail?
10 comments
Hey! Any way to keep answers short and concise without truncating them? max_tokens just truncates. I'm using GPTSimpleVectorIndex.
2 comments
Is it possible with a Pinecone index, or only with GPTSimpleVectorIndex?
7 comments
Hey! Is it possible to use a LangChain vectorstore agent that takes a GPT vector store index as input?
7 comments
Does this apply only to the GPT simple vector index, or also to Pinecone etc.?
2 comments
Hey! I saw the new release about the Pinecone index expansion. It says a single Pinecone index is shareable among multiple vector indices. Can anyone explain more?
4 comments
Ah ok… I think OpenAI embeddings are the ones with the best performance, no?
11 comments
Hey. How does LlamaIndex split long texts? E.g. LangChain provides some splitters (TextSplitter, RecursiveTextSplitter, SpacySplitter, and so on). What about LlamaIndex?
1 comment
Another Q! I have 1k docs ingested. I know the answer to my query appears in more than one doc, but specifically in one document. However, when I look for it in get_formatted_sources I see it's in 14th position (ordered by similarity, which IMO works poorly). Apart from writing a better query, what other methods can I apply to get that response without setting similarity top k to a large number?
1 comment
Hey! Does anyone know how I can load my data from a Pinecone index? To be more specific, I'm trying to keep everything in the cloud without saving all my docs locally, so I'd like to get the vectors directly.
2 comments
Same! Over a minute for a response with 1-2 nodes passed. I have GBs of data and have used GPTSimpleVectorIndex so far. Today I initialized a Pinecone index to check performance. I was also wondering if RAM could be an issue, since the index is quite heavy to keep in memory. Glad to hear someone else has the same issues. Let's keep each other updated on improvements!
36 comments
Hey! Very important to me: Is it possible to get a portion of the embeddings of an already built simple vector index?
4 comments
Is anyone else facing speed issues? Is response time directly proportional to the index size (larger index = slower response)?
6 comments
Do you suggest editing some of the LlamaIndex prompts to better adapt them to different use cases?
1 comment
Hey guys, this is hugely important for not wasting money. Maybe I missed it, but when I fine-tune an embedding model following the LlamaIndex guide, how can I save it? It's my first fine-tuning, so I'm pretty ignorant here. Thanks in advance!
4 comments
Or can I use the node parser through the service context?
12 comments
Hey again! Just a simple question: when I use davinci I can set max_tokens = -1 and it returns complete responses. However, when I use gpt-3.5-turbo I cannot use -1 as max_tokens, so I have to set 4096 or whatever. The problem is that this way, responses that are supposed to be long come back truncated. How can I solve this? Thank you!
7 comments
You have to pass your own function instead of the lambda; in it you can print or store all the info you need. I don't have the link right now, but @Logan M has an example!
1 comment
I don't know, but here's my code below, hope it helps:
import pinecone

pinecone_api_key = ""
pinecone.init(api_key=pinecone_api_key, environment="")
pinecone_idx = pinecone.Index("<name_of_your_project>")

And then define the GPTPineconeIndex.
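That last step looks roughly like this; the exact constructor signature varies between gpt_index/llama_index versions, so the pinecone_index= keyword and the empty document list are assumptions here:

from gpt_index import GPTPineconeIndex

# Assumed constructor: connect to the existing Pinecone index without re-ingesting docs.
index = GPTPineconeIndex([], pinecone_index=pinecone_idx)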
1 comment
Is it possible to print the response.source_node when using llama_index as a tool in a LangChain agent?
5 comments
A question not related to the new version: do the QA and REFINE prompts (and also similarity top_k) work with LangChain agents using LlamaIndex as a tool? Or with GPTIndexChatMemory?
8 comments
I'm already obtaining high similarities (0.75+ for the first 10-20 nodes).
The difference is that I'm adding an LLM call. So instead of using the top k to create the response directly, I'm using it to select around 10-20 nodes. Then I let gpt-turbo decide whether each one is truly relevant (this is the key: IMO I should expect some irrelevant nodes even when the similarity is high). If they're relevant, I merge them and use the usual LlamaIndex solution.

Do you think I can incorporate all of that in the QA and refine prompts?
My use case is not as simple as "who won the 2022 World Cup".
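Something like this is what I mean by folding the check into the QA step itself; a hypothetical template, not the stock LlamaIndex one, using the usual {context_str} and {query_str} variables:

QA_WITH_RELEVANCE_CHECK = (
    "Context chunks (separated by '---') are below.\n"
    "{context_str}\n"
    "----------------\n"
    "First decide, chunk by chunk, which ones are truly pertinent to the question; "
    "ignore the others entirely. Then answer using only the pertinent chunks.\n"
    "Question: {query_str}\n"
    "Answer: "
)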
2 comments
Guys, a question: why do I immediately hit the model's maximum context length when I use LangChain for QA on a Pinecone vectorstore, while this doesn't happen when I use gpt_index to query the same vector store? What kind of magic does gpt_index use? Ahahah
4 comments