yoelk

Log inLog into community

Find answers from the community

Home

Members

yoelk

Offline, last seen 5 months ago

Joined September 25, 2024

yyoelk

Hello,

Hello,
Suppose I have thousands of one-page documents (such as CVs) and I want to find the top 5 documents that meet a specific criteria. Do you have any recommendations for the architecture I should use? In light of the fact that these are unrelated one-page documents, I'm wondering if I should include embeddings at all (assuming I'm not trying to save money on LLM).
TIA!!

7 comments

yyoelk

Saving a graph

How to save a composable graph to a json file?

2 comments

yyoelk

Response mode graph

I have a ComposableGraph over a GPTListIndex. How do I set the response mode of the List indices to "compact"? (When creating the ComposableGraph I didn't provide the List indices "as query engines" so I wonder where I should set this)

28 comments

yyoelk

llama_index/SimpleIndexDemo-multistep.ip...

I have an issue with the logging mechanism for agents. It seems like debug info is not being printed, even though I wrote the below lines:

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

I was basically following this tutorial and I can't see the additional "thinking steps" prints. Is this a bug or am I missing something?
https://github.com/jerryjliu/llama_index/blob/main/examples/vector_indices/SimpleIndexDemo-multistep.ipynb

2 comments

yyoelk

Gpt4

Can anyone share a working example of using GPT-4? I tried using it and I keep getting an error even though I was granted access to it.
I also regenerated the API key.

"openai.error.InvalidRequestError: The model: gpt-4 does not exist"

11 comments

yyoelk

Could anyone please share a working

Could anyone please share a working example of querying a ListIndex built on top of 2 SimpleVector indices? I looked at the docs but the examples are incomplete (require query_configs) and I couldn't make it work even after providing a query_configs parameter

30 comments

yyoelk

Hello

Hello,
First I would like to thank and all other contributors who created and maintain this great library!
I have two questions:

In the "Embeddings" documentation it says that the default model is text-embedding-ada-002 which can be used for both text search and similarity. Are there any examples / tutorials on how to use it for similarity, or more specifically anomaly detection?
When using embedding for Q&A, what is the actual query that's being sent to GPT-3 along with the matching documents retrieved from the index?

TIA!

3 comments

yyoelk

save_to_string speed

Hey . I'm also using AWS lambda but apperently save_to_string takes very long time to run even on small text files. Any idea how things can be speed up?

9 comments

yyoelk

@Leonardo Oliva But I need some

@Leonardo Oliva But I need some indication from the LLM that all details have been collected

2 comments

yyoelk

How to build a simple chat (no index)

How to build a simple chat (no index) with a message history limit rather than token limit (i.e only the last K messages will be taken into account)?

16 comments

yyoelk

Is there a way to use a raptor tree in

Is there a way to use a raptor tree in conjuction with meta data filtering? Given a query, basically I would like to filter relevant nodes that have certain meta data associated (VectorStore3B?), and then use a raptor tree search only on those to get the top_k.

3 comments

yyoelk

llama-index-readers-smart-pdf-loader

Hey everyone,
I'm running this tutorial:
https://pypi.org/project/llama-index-readers-smart-pdf-loader/
However, I'm getting this error:

from llama_index.readers.base import BaseReader

ModuleNotFoundError: No module named 'llama_index.readers.base'

Is that a bug in the packages? I would assume that
pip install llama-index-readers-smart-pdf-loader
will also install the base packages.

16 comments

yyoelk

Add data to nodes

I have a keyword graph built on top of vector indexes. Is there a way to add some data (meta data) to the response of each index before they are being forwarded to the graph which produces the final answer?

10 comments

yyoelk

How can I get the debug info shown when

How can I get the debug info shown when setting verbose=True through the code?

11 comments

yyoelk

Saving loading

Is there an example showing the proper way for saving / loading several GPTVectorStoreIndex indices (each txt file should have an associated index)?
In the older version there was a single json per index but now it's more complicated and I'm not sure how to have multiple indices

7 comments

yyoelk

Keyword

I built a composable graph over Vector Indices. Some queries work fine, but others fail on this error:
ZeroDivisionError: integer division or modulo by zero

I saw that a common output between the failed queries is that no keywords found -
INFO:gpt_index.indices.keyword_table.query:> Extracted keywords: []

Any idea what causes it?

17 comments

yyoelk

Iterative

Hey Everyone,
Hope someone can help me out here -
I have a long document (longer than 8K tokens) already splitted into chunks. I would like to ask complex questions on this document which require an iterative process by using an agent.
Would love to hear any recommendations on how to approach it and highly appreciate any code snippets as I couldn't find anything that might be relevant.
TIA!!

23 comments

yyoelk

Weaviate

I started getting into Weaviate vector store and it seems like it has many more options for inserting documents and performing queries than what the GPT-Index wrapper provides.
Any insights on how I should work with it in combination with GPT-Index?

10 comments

yyoelk

Very nice is there a link

Very nice:) is there a link?

1 comment

yyoelk

Question for the NLP experts

Question for the NLP experts:
Say I have thousands of academic articles (~50 page per article on average) on a certain broad subject which I would like to index. The idea is to find paragraphs/articles related to a very specific use case given as free text input.
Originally I thought about indexing each article using the GPTSimpleVectorIndex with rather small chunk size (256) and then run the query (I use GPT3 as the LLM), but happy to hear your thoughts on more sophisticated indexing schemas (hierarchies?) as I'm afraid this doesn't work as good as I expected.
TIA for your valuable insights!

18 comments

yyoelk

Hello I m getting random crashes when

Hello, I'm getting random crashes when using GPT3 to calculate embeddings using GPTSimpleVectorIndex. This is the error msg:

01:17:38.145 error_code=None error_message="[''] is not valid under any of the given schemas - 'input'" error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False

Note that I upgraded GPT-Index to the latest version and that sometimes when the document store contains less documents it works fine (so no issues with my OpenAI's API KEY)

25 comments

yyoelk

I have a SimpleIndexVector store that I

I have a SimpleIndexVector store that I created from very long source documents. What's the easiest way to take the first chunk of text from each source document (which corresponds to the first vector of each source document) and send it to GPT3 along with a fixed question promt?

41 comments