[gs-bugs] [Bug 692381] New: utf8-Ghostscript - Incompatible changes to the GSAPI/GSDLL interfaces
bugzilla-daemon at ghostscript.com
Thu Jul 28 11:14:24 UTC 2011
Summary: utf8-Ghostscript - Incompatible changes to the GSAPI/GSDLL interfaces
OS/Version: Windows XP
AssignedTo: support at artifex.com
ReportedBy: sags5495 at hotmail.com
QAContact: gs-bugs at ghostscript.com
The existing utf8 additions for the Windows platform resulted in
incompatible changes to the GSAPI and GSDLL interfaces. Existing GSAPI
clients could previously pass ‘command line’ options (= arguments to
functions like gsapi_init_with_args()) that included extended characters
(input filenames, include paths, FONTPATH, etc.) as long as these characters
were part of the installed ANSI codepage. This no longer works
unless the GSAPI client is specifically modified to convert its ‘command line’
arguments to utf8 itself, an encoding that is not native to Windows.
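To illustrate the conversion a client would now have to perform itself, here is a minimal sketch in Python (not part of Ghostscript). It assumes the installed ANSI codepage is Windows-1252 ("cp1252"); the helper name ansi_args_to_utf8 is hypothetical. A C client on Windows would more likely go through MultiByteToWideChar(CP_ACP, ...) followed by WideCharToMultiByte(CP_UTF8, ...).

```python
def ansi_args_to_utf8(args, codepage="cp1252"):
    """Re-encode ANSI-encoded argument byte strings as utf8.

    `codepage` is an assumption: it must match the system's ANSI codepage.
    """
    return [arg.decode(codepage).encode("utf-8") for arg in args]


# Arguments as an existing ANSI client would pass them ("\u00fc" is "ü").
ansi_argv = [b"gs", b"-dBATCH", "Gr\u00fcn.ps".encode("cp1252")]
utf8_argv = ansi_args_to_utf8(ansi_argv)

# Pure-ASCII arguments come out unchanged; "ü" becomes the two bytes C3 BC.
print(utf8_argv)
```

The point of the report is that existing clients contain no such step, so they break as soon as the new DLL is dropped in.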
Steps to reproduce -------
- First we need an ‘existing GSAPI client’. A gswin32c.exe from a previous
release, or the current gswin32c.exe compiled with WINDOWS_NO_UNICODE, will
show the problem. We will name this executable ‘gswin32c_ansii.exe’.
- Create a PostScript file whose name contains non-7-bit-ASCII characters that
are still in the system’s ANSI codepage. Let's say this is "Grün.ps"
("ü" is Alt+0252 in the ‘Windows: Western’ ANSI codepage). The contents
do not really matter; a single ‘showpage’ is enough.
- Attempt to display this file using the ‘existing GSAPI client’ and the
new *utf8* DLL:
GPL Ghostscript GIT PRERELEASE 9.04 (2011-03-30)
Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Error: /undefinedfilename in (Gr\374n.ps)
GPL Ghostscript GIT PRERELEASE 9.04: Unrecoverable error, exit code 1
The error happens because the ‘existing GSAPI client’ passes ANSI
strings in the argv parameter. The filesystem access functions of
the new DLL take these bytes, treat them as a utf8 string, convert it to
utf16 (destroying it in the process), and pass the result to Windows.
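The mis-decoding can be reproduced outside Ghostscript. The following Python sketch assumes the ANSI codepage is Windows-1252 and uses Python's own utf8 decoder as a stand-in for the DLL's conversion; how the DLL actually treats invalid sequences may differ in detail.

```python
# "Grün.ps" as an ANSI (cp1252) client would pass it: "ü" is the single byte 0xFC,
# which is the \374 (octal) shown in the error message above.
ansi_name = "Gr\u00fcn.ps".encode("cp1252")
assert ansi_name == b"Gr\xfcn.ps"
assert oct(0xFC) == "0o374"

# Treating those bytes as utf8 fails: 0xFC is not a valid utf8 start byte.
try:
    ansi_name.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# A lenient decoder substitutes U+FFFD, so the name no longer matches the file.
lenient = ansi_name.decode("utf-8", errors="replace")
print(decodes_as_utf8, repr(lenient))
```

Either way, the name that reaches the filesystem is no longer "Grün.ps", which is consistent with the /undefinedfilename error.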
More on the scope and limits of this bug report -------
The current GS version introduces two incompatible changes:
(A) PostScript operators that interact with the host expect any PostScript
string parameter they take to be utf8-encoded.
(B) ‘Command line arguments’ (switches, input filenames, etc.) are expected
to be utf8-encoded too. (More precisely, any filenames, paths, etc.
in there must be utf8; the switch letters, PS operator names, various
names recognized by GS, etc. use only 7-bit ASCII characters, whose
encoding is identical in ASCII, ANSI and utf8.)
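The parenthetical remark in (B) can be checked with a short Python sketch. It assumes Windows-1252 as the ANSI codepage, and the helper name is_7bit_ascii is hypothetical. Only arguments containing bytes of 0x80 or above are affected by the change.

```python
def is_7bit_ascii(arg: bytes) -> bool:
    """True if every byte is below 0x80, i.e. the encoding question is moot."""
    return all(b < 0x80 for b in arg)


switch = "-dBATCH"
name = "Gr\u00fcn.ps"  # "Grün.ps"

# A pure-ASCII switch has the same bytes in all three encodings.
switch_forms = {switch.encode(enc) for enc in ("ascii", "cp1252", "utf-8")}

# A filename with an extended character does not.
name_ansi = name.encode("cp1252")
name_utf8 = name.encode("utf-8")

print(len(switch_forms), is_7bit_ascii(name_ansi), name_ansi != name_utf8)
```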
The current report is about eliminating (B), not (A). (A) does not
typically affect PostScript-as-a-page-description-language files, only
PostScript-as-a-programming-language files plus some uses such as concatenation
of PS files with save-runfile-restore-repeat. The changes help somewhat
with (A) by making gsapi_revision()/gsdll_revision() report whether the PS file
operators expect utf8 or ANSI strings, so that the GSAPI client knows what
to do when generating PS code on the fly.
Next steps -------
Comment #1 below describes what type of changes I consider necessary.
Comment #2 is a list of all GSAPI and GSDLL interface functions, messages
and similar, stating which need changing and how, and which remain
untouched and why.
No patch yet. I hope to have something this weekend. (I hoped so last
weekend too, but nothing came of it for lack of time...) It’s not much code,
but the overhead of doing this correctly (no platform-specific
code in i*.c files) may be significant. Plus, I don’t know whether anyone agrees
with the design in general.