I have been struggling to use the streamlit file_uploader for a while now. I want to query my own documentation not via a path but by uploading files from wherever they are located; a fixed path is too static and cumbersome.
I would really appreciate support here, or any existing code snippet that uploads documents via streamlit and then feeds them into nodes, a storage_context, service_context and the like.
How shall I go about it? I have been struggling with this for ages and I can't get it to work. Any suggestions on the path forward are very welcome.
Honestly @Emanuel Ferreira, at the stage I am at now it is just the .py file that I can share with you. I haven't moved it to a GitHub repo yet. If that's fine with you, I can send the file across just now or DM you.
Please let me know if it worked fine and you were able to access it.
https://gist.github.com/achilela/5fcd85d690cfe4680a27e6f99f4bc226 worked, I will do some tests and let you know.
Really appreciate it @Emanuel Ferreira.
Can you send me the requirements file? pip freeze > requirements.txt
Then I can have the same versions/packages as you.
Not sure if I did it correctly: pip freeze > requirements.txt ~/LLMWorkshop/ExperimentalLama_QA_Retrieval/llamaIndex_streamlit_chat.py
I see you aren't using a venv, so it grabs all your packages (even the ones that aren't part of the project).
But that's ok, I can find a way here.
Really appreciate it @Emanuel Ferreira.
Hi @Emanuel Ferreira, I don't mean to push, I was just wondering whether there are any updates on the uploaded file.
Hi! I'll be trying to take a look at it today.
@Ataliba Miguel Your issue is that the documents coming from the PDF don't follow the format that the node parser expects:
loader = PyPDFLoader(file_path=tmp_file_path)
docs = loader.load()
Then sending your docs through something like this should solve it:
Plain Text
from llama_index import Document

def parse_document(docs):
    documents = []
    for document in docs:
        # Build a llama_index Document from each langchain document
        # (langchain stores the page text in the `page_content` attribute)
        doc = Document(
            text=document.page_content if hasattr(document, "page_content") else "",
            metadata=document.metadata,
        )
        # Keep an explicit id from the metadata when one is present
        if document.metadata.get("id") is not None:
            doc.id_ = document.metadata["id"]
        documents.append(doc)
    return documents
I didn't run all the code until it fully worked, because there are a lot of things and envs, but you can definitely go to the next steps with that.
Will mention @Logan M in case he wants to correct me or complement with something.
@Ataliba Miguel Your issue is that it's a langchain document. @Logan M suggested that, easier than that, since you are using a langchain Loader you can do:
from llama_index import Document
document = Document.from_langchain_format(langchain_document)
so you can go through your docs array and format each one to a llamaindex document.
Thanks @Emanuel Ferreira. I will try it out in approx 1/2 hour. Just finishing work now. Will keep you posted. Appreciate it.
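A minimal sketch of that conversion, assuming docs is the list returned by PyPDFLoader(...).load() as above; it is an illustration rather than code from the thread:

from llama_index import Document, VectorStoreIndex

# Convert each langchain document into a llama_index Document
# (assumes `docs` is the list returned by a langchain loader's .load())
documents = [Document.from_langchain_format(doc) for doc in docs]

# Build an index over the converted documents and query it
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()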
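Putting the suggestions above together, a rough sketch of how the Streamlit upload could feed an index end to end; this assumes streamlit, langchain's PyPDFLoader and llama_index, and names such as build_index are only illustrative:

import tempfile

import streamlit as st
from langchain.document_loaders import PyPDFLoader
from llama_index import Document, ServiceContext, VectorStoreIndex

def build_index(uploaded_file):
    # Write the uploaded bytes to a temporary file so a path-based loader can read them
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp_file:
        tmp_file.write(uploaded_file.getvalue())
        tmp_file_path = tmp_file.name

    # Load the PDF with langchain and convert to llama_index Documents
    docs = PyPDFLoader(file_path=tmp_file_path).load()
    documents = [Document.from_langchain_format(doc) for doc in docs]

    # A ServiceContext can be passed to customize the LLM/embeddings
    service_context = ServiceContext.from_defaults()
    return VectorStoreIndex.from_documents(documents, service_context=service_context)

uploaded_file = st.file_uploader("Upload a PDF", type="pdf")
if uploaded_file is not None:
    # In a real app the index would be cached (e.g. in st.session_state)
    # instead of being rebuilt on every rerun
    index = build_index(uploaded_file)
    query_engine = index.as_query_engine()
    question = st.text_input("Ask a question about the document")
    if question:
        st.write(query_engine.query(question).response)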