[gs-bugs] [Bug 692308] New: improve extracting text in right-to-left alphabets
bugzilla-daemon at ghostscript.com
Tue Jun 28 14:42:17 UTC 2011
http://bugs.ghostscript.com/show_bug.cgi?id=692308
Summary: improve extracting text in right-to-left alphabets
Product: MuPDF
Version: unspecified
Platform: PC
URL: http://code.google.com/p/sumatrapdf/issues/detail?id=1466
OS/Version: Windows 7
Status: NEW
Severity: normal
Priority: P4
Component: mupdf
AssignedTo: tor.andersson at artifex.com
ReportedBy: zeniko at gmail.com
QAContact: gs-bugs at ghostscript.com
Adobe Reader is much more successful at extracting text, e.g. from
http://www.ice.gov/doclib/sevis/pdf/sevis_arabic_fs.pdf (one of the first
results from http://www.google.com/search?q=arabic+ext%3Apdf ). This seems
to be partly because dev_text does not expect RtL text and inserts too many
unintended line breaks, and partly because of differences in Unicode
normalization.
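
To illustrate the kind of post-processing this might involve, here is a
minimal Python sketch. It is not MuPDF code, and it rests on an assumption
the report does not confirm: that the PDF stores Arabic as presentation-form
glyphs (U+FB50..U+FEFF) in visual, left-to-right order. The helper name
to_logical is hypothetical. The sketch reverses one run into reading order
and then applies NFKC, which folds presentation forms to base letters and
expands ligatures such as lam-alef.

```python
import unicodedata


def to_logical(visual_run: str) -> str:
    """Convert one right-to-left run from visual to logical order.

    Assumption (not taken from the bug report): the run contains only
    Arabic presentation-form code points in left-to-right glyph order,
    with no combining marks, digits, or embedded left-to-right text.
    """
    # Reverse first, so that ligatures expanded by NFKC below come out
    # in reading order (lam before alef) rather than reversed.
    reading_order = visual_run[::-1]
    # NFKC maps presentation forms (initial/medial/final/isolated) to
    # their base letters and splits ligatures into separate letters.
    return unicodedata.normalize("NFKC", reading_order)


# Glyphs as they might be placed on the page, left to right:
# MEEM isolated, LAM-ALEF final ligature, SEEN initial.
visual = "\uFEE1\uFEFC\uFEB3"
logical = to_logical(visual)  # SEEN, LAM, ALEF, MEEM
```

A real fix would also have to handle combining marks, digits, and mixed
RtL/LtR lines, which a plain string reversal does not.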
--
Configure bugmail: http://bugs.ghostscript.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.