
@Logan M



Plain Text
from llama_index.readers.schema.base import Document

from llmsherpa.readers import LayoutPDFReader

llmsherpa_api_url = "https://readers.llmsherpa.com/api/document/developer/parseDocument?renderFormat=all"
pdf_path = "2023190_riteaid_complaint_filed.pdf" # also allowed is a file path e.g. /home/downloads/xyz.pdf
pdf_reader = LayoutPDFReader(llmsherpa_api_url)
doc = pdf_reader.read_pdf(pdf_path)

# Create a Document object for each chunk and collect them in a list.
documents = []
for chunk in doc.chunks():
    documents.append(Document(text=chunk.to_context_text(), extra_info={}))

is this how you would use the library to chunk it? i'm a bit confused lol
uhhhh I've never used llmsherpa, but that looks like you are creating document objects correctly, yes
do i convert to nodes after?
yeah...AttributeError: 'tuple' object has no attribute 'id_'
what are you doing with the document objects next?
Generally you would give them to an index, an ingestion pipeline, or parse them into nodes with a node parser/text splitter
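For reference, a minimal sketch of that next step (assuming a post-0.10 llama-index layout, so the import paths differ from the older ones in the snippet above, and assuming an embedding model is configured, e.g. the default OpenAI one; the llmsherpa doc object is carried over as-is):

Plain Text
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# one Document per llmsherpa chunk, as in the snippet above
documents = [
    Document(text=chunk.to_context_text(), extra_info={})
    for chunk in doc.chunks()
]

# option 1: hand the documents straight to an index
index = VectorStoreIndex.from_documents(documents)

# option 2: parse into nodes explicitly with a text splitter, then index the nodes
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=50)
nodes = splitter.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes)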
i gave up lol but you can help fix this prompt template for me @Logan M
Plain Text
qa_prompt_tmpl_str = (
    "<|im_start|>Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query\n<|im_end|>"
    "<|im_start|>user: {query_str}\n"
    "<|im_start|>assistant Answer: "
)

my life would be easier if i just used closed source models
here's the format
Attachment: image.png
Rather than modifying the prompt template, it might be easier to set messages_to_prompt/completion_to_prompt on the llm?

But anyways,

Plain Text
qa_prompt_tmpl_str = (
    "<|im_start|>user\n"
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query:\n"
    "{query_str}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


Might be more accurate?
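For the record, a rough sketch of the messages_to_prompt/completion_to_prompt idea mentioned above, for a ChatML-style model (the exact markers depend on your model, and the LLM class these get passed to is just whichever llama-index wrapper you're using):

Plain Text
def completion_to_prompt(completion: str) -> str:
    # wrap a plain completion-style prompt in the model's chat markers
    return f"<|im_start|>user\n{completion}<|im_end|>\n<|im_start|>assistant\n"

def messages_to_prompt(messages) -> str:
    # join chat messages with the same markers, then leave the assistant turn open
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message.role.value}\n{message.content}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"

# then pass these to the LLM constructor, e.g.
# llm = SomeLLM(..., messages_to_prompt=messages_to_prompt, completion_to_prompt=completion_to_prompt)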
@Logan M can't do that for vllm right
Oh right -- vllm should be automatically applying the prompt templates
at least from my understanding
you should just need to pass plain old text
(maybe I read that wrong somewhere)
ugh that vllm class is so messy :PSadge: I need to clean that up
yeah or vllm needs to clean it up sad boy
my model lies
Attachment: image.png
also you should know i'm not getting any lies when using the vllm langchain wrapper
Are you doing any other prompt setup for langchain?
But also, if it works better, definitely use it in llama-index lol
their LLM code isn't doing anything differently
Seems pretty equivalent in terms of invoking the LLM
Attachment: image.png
not really, same setup as llama-index except the wrapper
i didn't set temperature using llama-index, but for langchain i set it at 0.3
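If it helps, matching that setting on the llama-index side should just be a constructor argument (a sketch assuming the llama-index Vllm wrapper; the model name is a placeholder):

Plain Text
from llama_index.llms.vllm import Vllm

# placeholder model name; temperature matched to the langchain setup
llm = Vllm(
    model="your-model-name",
    temperature=0.3,
)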
also, is there a way to test an llm's output like the one in the finetune retrieval test?
Like, test the accuracy of the output? We have some eval stuff, but it mostly relies on using another llm (like gpt-4) to act as a judge for various aspects
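For reference, the judge-style eval mentioned there looks roughly like this (a sketch assuming a post-0.10 llama-index layout and an OpenAI key for the judge; the query, response, and reference strings are placeholders):

Plain Text
from llama_index.core.evaluation import CorrectnessEvaluator
from llama_index.llms.openai import OpenAI

# use a stronger model as the judge
judge = OpenAI(model="gpt-4")
evaluator = CorrectnessEvaluator(llm=judge)

result = evaluator.evaluate(
    query="What is the complaint about?",          # placeholder query
    response="...your local model's answer...",    # placeholder response
    reference="...a ground-truth answer...",       # optional reference, if you have one
)
print(result.score, result.passing, result.feedback)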