
Updated 5 months ago

Tried using Llamaparse's Arabic language

I tried using Llamaparse's Arabic language parsing functionality. The parsed data comes back in left-to-right order, even though the original document is written right to left. Does Llamaparse not handle the right-to-left writing direction of Arabic?
Attachments
image.png
image.png
6 comments
can't you fix this post-parsing with some rule based code?
I could, but for some documents, or even some pages within the same document, the parsed data does keep the original right-to-left direction of the text.
For other pages, such as the one above, it flips the flow, so I just wanted to check whether this is something still in progress in Llamaparse.
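Since only some pages come out flipped, a rule-based fix would need to decide line by line whether to reverse. Here is a minimal sketch of that idea, not anything Llamaparse provides. It assumes the flipped lines have their characters in reversed order, and it uses a rough heuristic: ta marbuta and alef maksura normally only end a word, so finding them at the start of words suggests the line is reversed. All function names are made up for illustration, and text with diacritics or a different kind of flip (for example, word order only) would need different rules.

```python
import re

# Assumption: these letters (ta marbuta, alef maksura) only end a word in
# correctly ordered Arabic, so seeing them at the start of words is suspicious.
WORD_FINAL_ONLY = {"\u0629", "\u0649"}

# Runs of Arabic letters (diacritics are not included in this range).
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")
# Digit/Latin runs such as "3,544,844" stay left-to-right even in flipped text.
LTR_RUN = re.compile(r"[0-9A-Za-z]+(?:[.,/:-][0-9A-Za-z]+)*")


def looks_reversed(line: str) -> bool:
    """Guess whether the characters of an Arabic line are in reversed order."""
    words = ARABIC_WORD.findall(line)
    at_start = sum(w[0] in WORD_FINAL_ONLY for w in words)
    at_end = sum(w[-1] in WORD_FINAL_ONLY for w in words)
    return at_start > at_end


def reverse_line(line: str) -> str:
    """Reverse the whole line while keeping digit/Latin runs readable."""
    # Pre-reverse each number so the full reversal restores it.
    protected = LTR_RUN.sub(lambda m: m.group(0)[::-1], line)
    return protected[::-1]


def fix_line(line: str) -> str:
    """Flip only the lines the heuristic flags; leave the rest untouched."""
    return reverse_line(line) if looks_reversed(line) else line


def fix_page(text: str) -> str:
    """Apply the per-line fix to a whole parsed page."""
    return "\n".join(fix_line(line) for line in text.splitlines())


if __name__ == "__main__":
    # "long term bank loans" followed by the figure quoted in this thread.
    correct = " ".join(["القروض", "البنكية", "طويلة", "الأجل", "3,544,844"])
    flipped = reverse_line(correct)  # simulate a flipped page line
    print(fix_line(flipped) == correct)
```

Lines without either marker letter are left alone, so it is worth spot-checking a few pages before trusting the result on a whole document.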
that's odd, because i've seen conversations about llamaparse and arabic on the google developer forums, and no one mentioned this problem
plus, that balance sheet info is probably better off in a structured format...
if you insist on indexing it in a vector db, you might need to do some data transformations to turn those table rows into sentences, like "long term bank loans for 2023 stood at 3,544,844$"
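A rough sketch of that row-to-sentence transformation, assuming each table row has already been extracted as a label plus a mapping of year to amount. The function name and the input shape are assumptions for illustration, and the only figure used is the one quoted above.

```python
def row_to_sentences(label: str, values_by_year: dict[int, int], currency: str = "$") -> list[str]:
    """Turn one balance-sheet row into one plain sentence per year."""
    return [
        f"{label} for {year} stood at {value:,}{currency}"
        for year, value in values_by_year.items()
    ]


if __name__ == "__main__":
    for sentence in row_to_sentences("long term bank loans", {2023: 3544844}):
        print(sentence)
```

Each sentence carries its own label and year, so a chunk retrieved from the vector db still makes sense without the rest of the table.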