
yoelk
Joined September 25, 2024
Hello,

Hello,
Suppose I have thousands of one-page documents (such as CVs) and I want to find the top 5 documents that meet specific criteria. Do you have any recommendations for the architecture I should use? Given that these are unrelated one-page documents, I'm wondering whether I should use embeddings at all (assuming I'm not trying to save money on LLM calls).
TIA!!
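One possible shape for the embedding-based variant of this, as a minimal sketch only: rank every document by cosine similarity to the criteria and keep the top 5, which could then be passed to an LLM for a closer check. The vectors are assumed to have been computed elsewhere by whatever embedding model is in use; `cosine`, `top_k`, and the toy data are hypothetical names for illustration, not part of any library.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, doc_vecs, k=5):
    """Return the k (doc_id, score) pairs with the highest similarity to the query."""
    scored = ((doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy 2-dimensional vectors standing in for real embeddings (assumption).
doc_vecs = {"cv_a": [1.0, 0.0], "cv_b": [0.0, 1.0], "cv_c": [1.0, 1.0]}
best = top_k([1.0, 0.0], doc_vecs, k=2)
```

With only a few thousand one-page documents, a brute-force scan like this is usually fast enough that no dedicated vector store is strictly required.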
7 comments
How to save a composable graph to a json file?
2 comments
I have a ComposableGraph over a GPTListIndex. How do I set the response mode of the List indices to "compact"? (When creating the ComposableGraph I didn't provide the List indices "as query engines" so I wonder where I should set this)
28 comments
I have an issue with the logging mechanism for agents. It seems like debug info is not being printed, even though I wrote the lines below:

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

I was basically following this tutorial and I can't see the additional "thinking steps" prints. Is this a bug or am I missing something?
https://github.com/jerryjliu/llama_index/blob/main/examples/vector_indices/SimpleIndexDemo-multistep.ipynb
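One thing that may be worth ruling out, without claiming it is the cause here: `logging.basicConfig` does nothing if the root logger already has handlers, which is common in notebooks. A minimal sketch that forces the configuration to apply (the `force` argument requires Python 3.8 or later):

```python
import logging
import sys

# force=True removes any handlers already attached to the root logger
# before applying this configuration, so the DEBUG level takes effect.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG, force=True)

# Quick check that DEBUG records now reach stdout.
logging.getLogger(__name__).debug("debug logging is active")
```

Adding a second `StreamHandler` on top of this is not needed; it would print each record twice.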
2 comments
Gpt4

Can anyone share a working example of using GPT-4? I tried using it and I keep getting an error even though I was granted access to it.
I also regenerated the API key.

"openai.error.InvalidRequestError: The model: gpt-4 does not exist"
11 comments
Could anyone please share a working example of querying a ListIndex built on top of 2 SimpleVector indices? I looked at the docs, but the examples are incomplete (they require query_configs), and I couldn't make it work even after providing a query_configs parameter.
30 comments
Hello

Hello,
First, I would like to thank all the contributors who created and maintain this great library!
I have two questions:
  1. In the "Embeddings" documentation it says that the default model is text-embedding-ada-002 which can be used for both text search and similarity. Are there any examples / tutorials on how to use it for similarity, or more specifically anomaly detection?
  2. When using embeddings for Q&A, what is the actual query that's being sent to GPT-3 along with the matching documents retrieved from the index?
TIA!
3 comments
Hey, I'm also using AWS Lambda, but apparently save_to_string takes a very long time to run, even on small text files. Any idea how things can be sped up?
9 comments
@Leonardo Oliva But I need some indication from the LLM that all details have been collected
2 comments
How to build a simple chat (no index) with a message history limit rather than a token limit (i.e., only the last K messages will be taken into account)?
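Independent of any particular library, the "last K messages" part can be sketched with a bounded deque. `LastKHistory` is a hypothetical helper written for illustration, not an existing class; sending the resulting list to an LLM is left out.

```python
from collections import deque


class LastKHistory:
    """Keeps only the most recent k chat messages (a message-count limit, not a token limit)."""

    def __init__(self, k):
        # A deque with maxlen=k silently drops the oldest item once it is full.
        self._messages = deque(maxlen=k)

    def add(self, role, content):
        self._messages.append({"role": role, "content": content})

    def as_list(self):
        """Messages in chronological order, ready to pass to a chat-style LLM call."""
        return list(self._messages)


history = LastKHistory(k=3)
for i in range(5):
    history.add("user", f"message {i}")
# history.as_list() now holds only messages 2, 3 and 4.
```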
16 comments
Is there a way to use a raptor tree in conjunction with metadata filtering? Given a query, basically I would like to filter relevant nodes that have certain metadata associated (VectorStore3B?), and then use a raptor tree search only on those to get the top_k.
3 comments
Hey everyone,
I'm running this tutorial:
https://pypi.org/project/llama-index-readers-smart-pdf-loader/
However, I'm getting this error:
from llama_index.readers.base import BaseReader
ModuleNotFoundError: No module named 'llama_index.readers.base'

Is that a bug in the packages? I would assume that
pip install llama-index-readers-smart-pdf-loader
will also install the base packages.
16 comments
I have a keyword graph built on top of vector indexes. Is there a way to add some data (metadata) to the response of each index before it is forwarded to the graph, which produces the final answer?
10 comments
How can I get the debug info shown when setting verbose=True through the code?
11 comments
Is there an example showing the proper way to save / load several GPTVectorStoreIndex indices (each txt file should have an associated index)?
In the older version there was a single json per index, but now it's more complicated and I'm not sure how to have multiple indices.
7 comments
Keyword

I built a composable graph over Vector Indices. Some queries work fine, but others fail with this error:
ZeroDivisionError: integer division or modulo by zero

I saw that the failed queries have one thing in common in their output: no keywords were found:
INFO:gpt_index.indices.keyword_table.query:> Extracted keywords: []

Any idea what causes it?
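Whether the empty keyword list is the direct cause of the ZeroDivisionError is not confirmed here. As a stopgap only, under that assumption, the failing call could be wrapped so that such queries return a fallback instead of crashing. `safe_query` and `query_fn` are hypothetical names; `query_fn` stands in for whatever function runs the query against the graph.

```python
def safe_query(query_fn, question, fallback="No keywords could be extracted from this query."):
    """Run query_fn(question); return fallback if it raises ZeroDivisionError."""
    try:
        return query_fn(question)
    except ZeroDivisionError:
        # Workaround, not a fix: the underlying cause still needs investigating.
        return fallback


# Stand-in for the real graph query (assumption): raises like the failing case.
def failing_query(question):
    return 1 // 0


result = safe_query(failing_query, "a query with no extractable keywords")
```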
17 comments
Iterative

Hey Everyone,
Hope someone can help me out here -
I have a long document (longer than 8K tokens) already split into chunks. I would like to ask complex questions about this document that require an iterative process using an agent.
Would love to hear any recommendations on how to approach it and highly appreciate any code snippets as I couldn't find anything that might be relevant.
TIA!!
23 comments
Weaviate

I started getting into the Weaviate vector store, and it seems to have many more options for inserting documents and performing queries than what the GPT-Index wrapper provides.
Any insights on how I should work with it in combination with GPT-Index?
10 comments
Very nice :) Is there a link?
1 comment
Question for the NLP experts:
Say I have thousands of academic articles (~50 pages per article on average) on a certain broad subject which I would like to index. The idea is to find paragraphs/articles related to a very specific use case given as free text input.
Originally I thought about indexing each article using the GPTSimpleVectorIndex with a rather small chunk size (256) and then running the query (I use GPT3 as the LLM), but I'm happy to hear your thoughts on more sophisticated indexing schemas (hierarchies?), as I'm afraid this doesn't work as well as I expected.
TIA for your valuable insights!
18 comments
Hello, I'm getting random crashes when using GPT3 to calculate embeddings using GPTSimpleVectorIndex. This is the error msg:
01:17:38.145 error_code=None error_message="[''] is not valid under any of the given schemas - 'input'" error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False

Note that I upgraded GPT-Index to the latest version, and that sometimes, when the document store contains fewer documents, it works fine (so there are no issues with my OpenAI API key).
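The `['']` in the error message suggests that an empty string was sent as the embedding input, though that reading is an assumption. If so, dropping empty or whitespace-only chunks before indexing may avoid the crash. `drop_empty_chunks` is a hypothetical helper, not a library function.

```python
def drop_empty_chunks(chunks):
    """Return only the chunks that contain at least one non-whitespace character."""
    return [chunk for chunk in chunks if chunk and chunk.strip()]


# Example: the empty and whitespace-only entries are removed, the rest are kept unchanged.
cleaned = drop_empty_chunks(["first chunk", "", "   ", "second chunk\n"])
```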
25 comments
I have a SimpleIndexVector store that I created from very long source documents. What's the easiest way to take the first chunk of text from each source document (which corresponds to the first vector of each source document) and send it to GPT3 along with a fixed question prompt?
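Leaving the index itself aside, the selection step can be sketched in plain Python, assuming the chunks are available as `(doc_id, position, text)` tuples. That layout, `first_chunks`, and `build_prompts` are assumptions made for illustration, not the library's API, and the call to the LLM is left out.

```python
def first_chunks(chunks):
    """Map each doc_id to the text of its lowest-position chunk."""
    firsts = {}
    for doc_id, position, text in chunks:
        if doc_id not in firsts or position < firsts[doc_id][0]:
            firsts[doc_id] = (position, text)
    return {doc_id: text for doc_id, (_, text) in firsts.items()}


def build_prompts(chunks, question):
    """Pair the first chunk of every document with the same fixed question."""
    return {
        doc_id: f"{text}\n\nQuestion: {question}"
        for doc_id, text in first_chunks(chunks).items()
    }


# Toy data standing in for chunks pulled from the index (assumption).
chunks = [
    ("doc1", 1, "second part of doc1"),
    ("doc1", 0, "opening of doc1"),
    ("doc2", 0, "opening of doc2"),
]
prompts = build_prompts(chunks, "What is this document about?")
```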
41 comments