[gs-bugs] [Bug 691506] New: converting pdf with accented characters to text

bugzilla-daemon at ghostscript.com
Wed Jul 28 11:54:48 UTC 2010


http://bugs.ghostscript.com/show_bug.cgi?id=691506

           Summary: converting pdf with accented characters to text
           Product: Ghostscript
           Version: 8.71
          Platform: PC
        OS/Version: Windows Vista
            Status: NEW
          Severity: minor
          Priority: P4
         Component: Text
        AssignedTo: ken.sharp at artifex.com
        ReportedBy: mckameh1 at armadillo.fr
         QAContact: gs-bugs at ghostscript.com
   Estimated Hours: 0.0


I'm trying to convert a French PDF to plain text, but accented characters
are not converted correctly.

For example, every é comes out as e' (the letter e followed by an apostrophe).
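
As a stopgap, the substitution can be partly undone after conversion. The following is only a sketch in Python, assuming the single pattern reported above (e followed by an apostrophe) is the one to reverse; the function name `restore_acute` is hypothetical, other accents are not handled, and any genuine e' in the text would be rewritten as well:

```python
import re
import unicodedata


def restore_acute(text: str) -> str:
    """Turn e' / E' back into é / É (assumed pattern, taken from this report)."""
    # Swap the trailing apostrophe for U+0301 COMBINING ACUTE ACCENT.
    combined = re.sub(r"([eE])'", "\\1\u0301", text)
    # NFC composes the base letter and the combining accent into one code point.
    return unicodedata.normalize("NFC", combined)


if __name__ == "__main__":
    print(restore_acute("te'le'phone"))
```

This does not replace a proper fix in the converter; it only illustrates what the expected output would look like.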

The command I use is:

gswin32c.exe -q -dNODISPLAY -dSAFER -dDELAYBIND -dWRITESYSTEMDICT -dSIMPLE -c save -f ps2ascii.ps my.pdf -c quit >file.txt
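
To reproduce this from a script, the same invocation can be sketched in Python. The argument list is copied from the command above; `my.pdf` and `file.txt` are the placeholder names used in this report, and the helper name `run_ps2ascii` is hypothetical. It assumes `gswin32c.exe` and `ps2ascii.ps` can be found from the working directory or the search path:

```python
import subprocess

# Same arguments as the command in this report, one list item per token.
GS_ARGS = [
    "gswin32c.exe", "-q", "-dNODISPLAY", "-dSAFER", "-dDELAYBIND",
    "-dWRITESYSTEMDICT", "-dSIMPLE",
    "-c", "save", "-f", "ps2ascii.ps", "my.pdf", "-c", "quit",
]


def run_ps2ascii(out_path: str = "file.txt") -> int:
    """Run Ghostscript, write its stdout to out_path, return the exit code."""
    # Binary mode keeps the bytes exactly as Ghostscript emitted them,
    # which makes it easier to inspect how accented characters were written.
    with open(out_path, "wb") as out:
        return subprocess.run(GS_ARGS, stdout=out).returncode
```

Passing the arguments as a list avoids shell quoting issues, and capturing raw bytes replaces the `>file.txt` redirection.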

Am I doing something wrong?
Is there a way to get UTF-8 output rather than ASCII?

Thanks for answering.


