[gs-bugs] [Bug 692381] utf8-Ghostscript - Incompatible changes to the GSAPI/GSDLL interfaces

bugzilla-daemon at ghostscript.com bugzilla-daemon at ghostscript.com
Thu Jul 28 11:17:12 UTC 2011


--- Comment #2 from SaGS <sags5495 at hotmail.com> 2011-07-28 11:17:11 UTC ---
List of GSAPI and GSDLL interface elements, and my opinion on what has to 
change and what can/must remain as-is:

FUNCTIONS THAT CHANGE by backwards-compatible additions -------

    Taking a shortcut here: the ‘product’ and ‘copyright’ strings always 
    consist of 1-byte 7-bit ASCII characters; the character encoding used 
    does not vary and there are no "A/W/8" variants.

    The ‘product’ changes to optionally include a ‘built-in features’ list.
    This list follows the product name proper, is enclosed in square 
    brackets, and consists of space-delimited predefined tokens.

    There is currently only one such feature:

    utf8osops   If present, operators that interface with the host os 
                interpret any [PostScript] strings they take as parameters
                as being UTF-8 encoded. Note: no initial BOM is expected or 
                If absent, these PostScript strings are assumed to use the
                same encoding as the host os. For Windows this is the ANSI
                codepage in effect, and currently happens (i) for GS versions 
                9.02 and earlier and (ii) for versions 9.04 and later when 
                compiled with the option WINDOWS_NO_UNICODE. For the other
                platforms, this is the only case currently implemented.
                Note: GS 9.03 does not return this information, although it
                      can be compiled with or without this feature.
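
A client could test for such a feature token with plain string handling. The following is only a minimal sketch: the function name product_has_feature() and the sample product strings are hypothetical, and it assumes the layout described above (one bracketed, space-delimited list following the product name).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `token` appears in the bracketed,
 * space-delimited feature list at the end of `product`, 0 otherwise. */
int product_has_feature(const char *product, const char *token)
{
    assert(product != NULL && token != NULL);

    size_t token_len = strlen(token);
    /* The list follows the product name, so look for the last '['. */
    const char *open = strrchr(product, '[');
    if (open == NULL || token_len == 0)
        return 0;
    const char *close = strchr(open, ']');
    if (close == NULL)
        return 0;               /* unterminated list: treat as absent */

    const char *p = open + 1;
    while (p < close) {
        while (p < close && *p == ' ')
            p++;                /* skip delimiters */
        const char *start = p;
        while (p < close && *p != ' ')
            p++;                /* scan one token */
        if ((size_t)(p - start) == token_len &&
            memcmp(start, token, token_len) == 0)
            return 1;
    }
    return 0;
}
```

With a product string such as "GPL Ghostscript [utf8osops]" (an assumed example), product_has_feature(product, "utf8osops") returns 1; without the bracketed list it returns 0, which is also what a client would see from the older versions mentioned above.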

FUNCTIONS THAT CHANGE because they need encoding-specific variants -------

It turns out there are only 3 of these (2 x GSAPI + 1 x GSDLL).

gsapi_init_with_args()      gsdll_init()
    These functions need "A/W/8" variants, one for each possible encoding of
    the ‘argv’ and ‘file_name’ parameters. The ‘callback’ parameter of
    gsdll_init() does not depend on a character encoding; the reasons are
    given below, where GSDLL_CALLBACK is detailed message by message.
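
To illustrate what a "W" variant would have to do before handing its arguments to an "8" variant, here is a sketch of a UTF-16 to UTF-8 conversion for one argument string. The name utf16_to_utf8() and its return convention are assumptions made for this example, not part of any existing GS interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a NUL-terminated UTF-16 string to UTF-8.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 on an unpaired surrogate or if `dst` is too small. */
long utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);

    size_t out = 0;
    while (*src != 0) {
        uint32_t cp = *src++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t lo = *src;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            src++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;          /* low surrogate without a high one */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need >= dst_size)
            return -1;          /* keep room for the terminating NUL */

        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return (long)out;
}
```

A "W" wrapper would apply this to each ‘argv’ element and then call the "8" variant; an "A" wrapper would need the equivalent conversion from the ANSI codepage, which is platform-specific and not shown here.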


gsapi_new_instance()        GSDLL_CALLBACK message GSDLL_POLL
gsapi_run_string_begin()    gsdll_execute_begin()
gsapi_run_string_end()      gsdll_execute_end()
gsapi_exit()                gsdll_exit()
the poll_fn() callback passed to gsapi_set_poll()
    These functions do not manipulate character data, and do not have any
    char/ char[]/ char* parameters, so they do not need any change.

gsapi_run_string_continue()            gsdll_execute_cont()
the stdin_fn() for gsapi_set_stdio()   GSDLL_CALLBACK message GSDLL_STDIN
the stdout_fn() for gsapi_set_stdio()  GSDLL_CALLBACK message GSDLL_STDOUT
the stderr_fn() for gsapi_set_stdio()
    Character buffers handled by these functions are actually fragments from
    the PostScript interpreter’s input/ output/ error streams, and as such
    are governed by the PostScript syntax and semantics. Nothing here depends
    on the encoding used by the host os for textual information, so these
    functions don’t change (no "A/W/8" variants of them).
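
In practice this means a client should treat these buffers as counted byte arrays and never as host-encoded, NUL-terminated text. A minimal sketch of a stdout callback written that way follows; the byte_sink type and the capture_stdout() name are hypothetical, and the calling-convention macro that the real headers put on the callback type is left out to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size sink for bytes coming from the interpreter. */
typedef struct {
    unsigned char data[256];
    size_t used;
} byte_sink;

/* Shaped like a stdout_fn: (caller_handle, buffer, length) -> count.
 * The buffer is copied byte for byte; no encoding conversion is applied
 * and embedded NUL or high-bit bytes are preserved. */
int capture_stdout(void *caller_handle, const char *buf, int len)
{
    byte_sink *sink = caller_handle;
    assert(sink != NULL && buf != NULL && len >= 0);

    size_t room = sizeof sink->data - sink->used;
    size_t n = (size_t)len < room ? (size_t)len : room;
    memcpy(sink->data + sink->used, buf, n);
    sink->used += n;
    return (int)n;              /* number of bytes actually consumed */
}
```

Such a function would be registered through gsapi_set_stdio(), with the sink passed as the caller handle when the instance is created.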

gsdll_lock_device()         GSDLL_CALLBACK message GSDLL_DEVICE
gsdll_copy_dib()            GSDLL_CALLBACK message GSDLL_SYNC
gsdll_copy_palette()        GSDLL_CALLBACK message GSDLL_PAGE
gsdll_draw()                GSDLL_CALLBACK message GSDLL_SIZE
    The ‘device’ parameter is of an opaque type and does not point to textual
    information, so it is not subject to character encoding. The ‘pbitmap’
    is a pointer to binary data. Also, this is an OS/2-specific entry 
    point, and the utf8 file operators are not [yet?] implemented for this 

the display_callback passed to gsapi_set_display_callback()
    The only char data I found here is the ‘const char *component_name’
    parameter of the display_separation() structure member. This is used
    to pass SeparationNames from the PostScript program, so its contents
    (including any implied encoding, although this is primarily just an array
    of bytes) is dictated by the running PostScript code. No character 
    encoding to be specified at the GSAPI interface level.

the vd_trace_host_s pointed to by a vd_trace_interface_s member
    This structure is completely defined and controlled by the module that
    draws the ‘trace image’, which is the GSAPI client. Hence, no need for 
    a character encoding contract at the GSAPI interface level.

the vd_trace_interface_s passed to gsapi_set_visual_tracer()
    The only member with text data is text(..., char *ASCIIZ), and the only 
    place I found it used is in base\gxhintn.c::t1_hinter__paint_glyph().
    From what I see, it is used to label the Visual Trace image. The text
    consists of debug information (does not come from PostScript nor the
    command line/ file names/ etc). I conclude the character encoding used
    is currently defined by the C compiler. Having a more precise 
    specification of the encoding would be marginally useful. I consider 
    defining "A/W/8" variants of vd_trace_interface_s to be overkill.
    If this were a WINAPI function, it would have "A/W" variants.
    But for the reasons explained above, I consider it does not need them.
