[gs-bugs] [Bug 692345] New: Wrong coding of Polish diacritic letters "ż" and "Ż"

bugzilla-daemon at ghostscript.com bugzilla-daemon at ghostscript.com
Fri Jul 15 11:39:25 UTC 2011


           Summary: Wrong coding of Polish diacritic letters "ż" and "Ż"
           Product: Bug Tracker
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: P4
         Component: General
        AssignedTo: support at artifex.com
        ReportedBy: dwalitry at yahoo.com
         QAContact: gs-bugs at ghostscript.com

I'm not 100% sure that the problem I describe below is caused by Ghostscript,
but my analysis of it brought me to this conclusion.

I'm a webmaster, so I have to care about how my content is shown in Google's
search results. While checking this I discovered that in many search result
summaries for PDF files, the Polish diacritic letter "ż" (z with a dot above)
is replaced with the letter "Ŝ" (S with circumflex, which is not used in
Polish). Because of this it is impossible to find those documents using
correct search phrases, e.g. "ważne"; they are shown only for the phrase
"waŜne" (there is no such word in Polish). You can see examples with the
search "waŜne filetype:pdf"
(http://www.google.pl/search?hl=pl&q=waŜne+filetype%3Apdf).

The same problem occurs with the letter "Ż" (the capital version of "ż"): it
is shown as "ś" (s with acute accent; this letter is used in Polish).
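
To make the two substitutions easier to compare, here is a small Python
sketch that prints the Unicode code points of the expected and the displayed
letters. The helper name codepoint_offsets is only illustrative. In both cases
the displayed letter sits exactly 0x20 below the expected one, which may hint
at a systematic encoding or mapping error, but the sketch only inspects the
characters and says nothing about where in the toolchain the error arises.

```python
import unicodedata

# (expected letter, letter shown in the search-result summary)
# U+017C z with dot above -> U+015C S with circumflex
# U+017B Z with dot above -> U+015B s with acute
PAIRS = [("\u017c", "\u015c"), ("\u017b", "\u015b")]


def codepoint_offsets(pairs):
    """Return (expected, shown, offset), offset = ord(expected) - ord(shown)."""
    return [(ord(e), ord(s), ord(e) - ord(s)) for e, s in pairs]


for (exp, shown, offset), (e_chr, s_chr) in zip(codepoint_offsets(PAIRS), PAIRS):
    # Print ASCII only, so the output does not depend on the terminal encoding.
    print(f"U+{exp:04X} ({unicodedata.name(e_chr)}) shown as "
          f"U+{shown:04X} ({unicodedata.name(s_chr)}), offset {offset:#x}")
```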

I analysed the problem and it is most likely caused by an error in the free
software used to generate/print PDF files, more precisely in the Ghostscript
library which these programs use. In support of this explanation, the problem
does not occur when the PDF is generated using Adobe software.

The problem also occurs in the results of other search engines, e.g. Bing.

Of course, when you open the file, the text is displayed with the correct
letters.
