
vrda
Offline, last seen 3 months ago
Joined September 25, 2024
Hey guys, I have tried to find an answer to this but to no avail.
For some reason the S3 loader (e.g. https://docs.llamaindex.ai/en/stable/examples/data_connectors/simple_directory_reader_remote_fs/) fails to load or parse PDF files from my S3 bucket, but succeeds on .txt files.
Has anyone encountered a similar issue? Any help would be much appreciated!
2 comments
I want to load the sentence + metadata index from a persisted directory (where I have saved the relevant docstore, index store and vector store JSON files). Does anyone have a code snippet showing how to do this? When I call:

storage_context_s = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_dir(persist_dir=r"persist_dir"),
    vector_store=SimpleVectorStore.from_persist_dir(persist_dir=r"persist_dir"),
    index_store=SimpleIndexStore.from_persist_dir(persist_dir=r"persist_dir"),
)

load_indices_from_storage

load_index_from_storage

sentence_index = load_indices_from_storage(storage_context_s)

query_engine = sentence_index.as_query_engine(
    similarity_top_k=2,
    # the target key defaults to window to match the node_parser's default
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
window_response = query_engine.query("Who is xxx?")
print(window_response)

I later get:

AttributeError: 'list' object has no attribute 'as_query_engine'

Any help would be much appreciated.
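
The AttributeError above is consistent with load_indices_from_storage returning a list of indices rather than a single index. Below is a minimal sketch of one way to unwrap that list, assuming the persisted directory holds exactly one index. The helper names single_index and load_sentence_index are hypothetical, and the import path assumes llama_index 0.10 or later (older releases expose the same names from the top-level llama_index package).

```python
from typing import Any, List


def single_index(indices: List[Any]) -> Any:
    """Return the only index in the list, or raise if there is not exactly one."""
    if len(indices) != 1:
        raise ValueError(f"expected exactly one index, found {len(indices)}")
    return indices[0]


def load_sentence_index(persist_dir: str) -> Any:
    """Hypothetical helper: load the one index persisted under persist_dir."""
    # Imported lazily so single_index can be used without llama_index installed.
    # Assumption: llama_index >= 0.10 import path.
    from llama_index.core import StorageContext, load_indices_from_storage

    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    # load_indices_from_storage returns a list, which is why calling
    # .as_query_engine() directly on its result raises AttributeError.
    return single_index(load_indices_from_storage(storage_context))
```

The object returned by load_sentence_index("persist_dir") can then be passed through as_query_engine as in the snippet above. load_index_from_storage, also mentioned above, loads a single index directly and may be the simpler call when only one index was persisted.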
3 comments
Hi, I am building a RAG app on top of LlamaIndex, and I want to store the data from a knowledge graph index and a summary index (the JSON files that are created when you persist the storage_context) remotely in a cloud DB (something like MongoDB). I can't find any good example of how to do so. Since I created the knowledge graph by running code in parallel on AWS, I would now just like to save the indexes/JSON files to a remote DB. If any of you have faced similar issues, help would be much appreciated!
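
One possible approach to the question above is to treat each persisted JSON file as an opaque blob and upsert it into a collection. This is only a sketch under stated assumptions: collect_persisted_json and upload are hypothetical helper names, and `collection` is assumed to be a pymongo Collection (any object with a compatible replace_one method works).

```python
import json
from pathlib import Path
from typing import Any, Dict


def collect_persisted_json(persist_dir: str) -> Dict[str, str]:
    """Map each *.json file name in persist_dir to its raw JSON text."""
    files: Dict[str, str] = {}
    for path in sorted(Path(persist_dir).glob("*.json")):
        text = path.read_text(encoding="utf-8")
        json.loads(text)  # fail early if a file is not valid JSON
        files[path.name] = text
    return files


def upload(collection: Any, files: Dict[str, str]) -> int:
    """Upsert one document per file and return how many were written.

    Assumption: `collection` behaves like a pymongo Collection.
    """
    for name, text in files.items():
        # The JSON is stored as a string so that MongoDB field-name rules
        # do not apply to the keys inside the persisted files.
        collection.replace_one(
            {"_id": name}, {"_id": name, "payload": text}, upsert=True
        )
    return len(files)
```

To restore, each payload would be written back to a file of the same name before loading the storage context from that directory. Note that MongoDB limits a single document to 16 MB, so large stores would need chunking. LlamaIndex also documents MongoDB-backed document and index stores, which may be a more direct fit than copying files.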
3 comments
Hi good people! When are we getting an implementation of the RAPTOR paper in LlamaIndex (https://twitter.com/IntuitMachine/status/1753057138438492530)? It seems like a perfect fit for this framework.
1 comment

Hi All,

What is the current consensus on the best way to ingest finance data (complex PDFs with tables, and Excel files) in LlamaIndex? Currently I get the best results with the UnstructuredReader. Has anyone had a different experience?
Thanks
2 comments