Updated 2 years ago


Hi! I have a multipage report (a Word template) that is filled out and processed hundreds of times per month, every month. There is also an identical copy of the report with inline minimum requirements/expectations for the individuals completing the report. We can call this "Reqs".

I'm planning to use LangChain's agents+tools to review/grade the reports vs. "Reqs" and also create a summary/rollup of all of that month's reports (+quarterly/annually). The primary goal of the review is to teach/instruct the individuals doing the reports on what was missed and why it's important to include going forward.

I believe LangChain's Agents + GPT-Index Document capabilities make sense to use with this project. Am I mistaken, and is LangChain itself enough? If this is a solid use case for GPT-Index, what is the best way for me to take advantage of this project's capabilities? I'm a newbie when it comes to programming, but am willing to put in the effort needed to learn and am more than capable of copy/pasting into ChatGPT 🙂
12 comments
I think GPT Index can help with your use case! We provide a .docx parser, and the ability to easily create an index over your Word files. With this index, you can then submit queries to the index to help fulfill some of your needs (e.g. grade the reports, create a summary of the reports).

You can then use a langchain agent to access this GPT Index as a Tool! So the agent will forward queries to GPT Index. I will have an example notebook for this soon.
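To make the "index as a Tool" idea concrete, here is a minimal standard-library sketch. It is not LangChain's or GPT Index's actual API: the `Tool` class, `make_index_tool`, `run_agent_step`, and `fake_query_index` are all hypothetical stand-ins, and the tool choice is passed in by hand where a real agent would let the LLM decide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """Hypothetical stand-in for an agent tool: a name, a description, a callable."""
    name: str
    description: str
    func: Callable[[str], str]


def make_index_tool(query_index: Callable[[str], str]) -> Tool:
    # Wrap whatever function queries the index so an agent can call it by name.
    return Tool(
        name="Report Index",
        description="Answers questions about the monthly reports.",
        func=query_index,
    )


def run_agent_step(tools: List[Tool], tool_name: str, query: str) -> str:
    # A real agent would pick the tool from the descriptions; here the
    # choice is passed in explicitly to keep the sketch runnable offline.
    by_name: Dict[str, Tool] = {t.name: t for t in tools}
    return by_name[tool_name].func(query)


def fake_query_index(query: str) -> str:
    # Placeholder (assumption) for a real index query.
    return f"[index answer for: {query}]"


tool = make_index_tool(fake_query_index)
answer = run_agent_step([tool], "Report Index", "What did the March reports miss?")
print(answer)
```

In a real setup, `fake_query_index` would be replaced by a function that forwards the query string to the index and returns the response text.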
Awesome! I'll take a look over the .docx parser and will be eagerly awaiting the example notebook. Really appreciate the effort you, Harrison, and all of the contributors have been putting into both projects. This stuff is going to change the world, shame most have no idea.
@SJ you can take a look at an initial beta notebook re: using GPT Index as an agent tool: https://github.com/jerryjliu/gpt_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb
hey @jerryjliu0 wondering if you got to release any notebooks on this. Will greatly help with my use case as well. Looking to create an index over multiple word files, and then query over the files in a chatbot style. Thanks for your efforts.
that notebook above should include using GPT Index as both a Tool (to represent an external source of data), as well as a memory module (to maintain conversation history).
let me know if you have questions on that!
Thanks a lot for the quick response. To give you an idea, I have multiple non-English documents in Word and PDF format, and I would like to use Cohere's multilingual embeddings to build a question-answering system that returns answers along with a source pointing to the particular text/document. I am wondering if there is a notebook or a wiki that could guide me through this process. Also, is it better to use a vector store provider (Pinecone, Weaviate, etc.) or just use the simplevectorindex function of gpt_index?
Or any example using docx files could work @jerryjliu0
@Misbah you could start out with SimpleVectorIndex for now, it's easier to get set up.

Before thinking about langchain agents, I'd probably just try to get a simple version of gpt index working where you 1) load data into an index, and 2) query the index. See this notebook as an example: https://github.com/jerryjliu/gpt_index/blob/main/examples/vector_indices/SimpleIndexDemo.ipynb.

You can just put .docx and .pdf files in a directory and use SimpleDirectoryReader.
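The two steps above (load data into an index, then query it) can be sketched without any external library. This is a toy illustration, not GPT Index: it reads plain `.txt` files, swaps word counts in for real embeddings, and returns the best-matching file name as the "source". All function names here are hypothetical.

```python
import math
import re
import tempfile
from collections import Counter
from pathlib import Path
from typing import Dict, List, Tuple


def load_texts(directory: str) -> Dict[str, str]:
    # Step 1a: read every .txt file in the directory (file name -> contents).
    paths = sorted(Path(directory).glob("*.txt"))
    return {p.name: p.read_text(encoding="utf-8") for p in paths}


def vectorize(text: str) -> Counter:
    # Toy "embedding": lowercase word counts (a real setup would call a model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs: Dict[str, str]) -> Dict[str, Counter]:
    # Step 1b: one vector per document, keyed by its source file name.
    return {name: vectorize(text) for name, text in docs.items()}


def query(index: Dict[str, Counter], question: str, top_k: int = 1) -> List[Tuple[str, float]]:
    # Step 2: rank documents by similarity to the question, best first.
    q = vectorize(question)
    scored = [(name, cosine(q, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "safety.txt").write_text(
        "Inspection of fire exits and safety equipment", encoding="utf-8")
    Path(tmp, "budget.txt").write_text(
        "Quarterly budget and spending summary", encoding="utf-8")
    index = build_index(load_texts(tmp))
    results = query(index, "Which report covers the budget?")
    print(results[0][0])  # prints "budget.txt"
```

Keeping the file name next to each vector is what lets the answer point back to its source document, which is the part a hosted vector store would also need to preserve.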
Hi @jerryjliu0! Thanks for this answer. Do you think this would work even if my docx and pdf files have both text and tables?
I am specifically trying to build a question answering bot.
@akhilesh we have some default parsers, but take a look at the extracted docs and let me know if they're insufficient