[gs-bugs] [Bug 691222] Gs cannot open Unicode file names on Windows

bugzilla-daemon at ghostscript.com
Tue Jun 7 11:45:43 UTC 2011


http://bugs.ghostscript.com/show_bug.cgi?id=691222

--- Comment #14 from Robin Watts <robin.watts at artifex.com> 2011-06-07 11:45:41 UTC ---
(In reply to comment #7)
> Hi Henry, does this mean that you are using UTF-8 internally? 

Yes, exactly.

> If this is the case then I could do a quick test by providing a utf-8 filename
> to gsapi_init_with_args() and patch my current clone of gs so that gp_fopen()
> converts the utf-8 filename to wchar_t* and uses _wfopen(). 

That's all done now.

(In reply to comment #9)
> What does it mean for the api interface? I guess you have to provide UTF-8
> strings. If this is the case then this should be documented in api.html. Same
> for stdin, stdout, stderr texts.

You're absolutely right. I'll get onto that.

> Maybe the copy/paste commands in the gs window system menu extension could
> also put/get wchar_t from the clipboard and convert them to UTF-8. This would
> allow Unicode filenames to be copied via the clipboard.

I am unfamiliar with that extension.

> The fix should solve Bug 690026 too.

I'll look into that.

> Remark 1: gp_fopen() wmode[] should not use a fixed length. I know it should
> never be longer than 3, but...

This is an internal API, and I didn't feel the count/malloc/free overhead was
justified here.
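
For illustration only, a fixed-size mode buffer can still be guarded with a
bounds check and no allocation. This is a minimal sketch, not the actual
Ghostscript code: widen_mode is a hypothetical helper name, and it assumes the
mode string is plain ASCII of at most 3 characters.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not part of Ghostscript): widen an ASCII fopen mode
 * string such as "r", "wb" or "a+b" into a caller-supplied 4-element buffer.
 * Returns 0 on success, -1 if the mode is longer than 3 characters or
 * contains a non-ASCII byte. */
int widen_mode(const char *mode, wchar_t wmode[4])
{
    size_t i;

    for (i = 0; mode[i] != '\0'; i++) {
        if (i >= 3 || (unsigned char)mode[i] >= 0x80)
            return -1;              /* would overflow, or not plain ASCII */
        wmode[i] = (wchar_t)mode[i];
    }
    wmode[i] = L'\0';
    return 0;
}
```

A caller would treat a -1 return the same way as a failed open, so an
unexpectedly long mode string is rejected rather than overrunning the buffer.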

> Remark 2: Is the call of setlocale() in main() still useful?

Not a clue.

(In reply to comment #11)
> So, how does this work for paths defined as either environment variables or
> PostScript strings (e.g., -I___ or -sOutputFile= -sGenericResourceDir=___ or
> GS_FONTPATH environment variable/registry key) ?
> 
> Where in the process are strings UTF-8 vs. wchar ?

On Unix, the environment always passes us UTF-8 encoded values (environment
keys, command lines etc), and gp_fopen (which just calls fopen) expects UTF-8
encoded values too.

On Windows, we are (as far as possible) in the same situation. 'main' (now
actually called main_utf8) is called with UTF-8 encoded values. Environment
keys are similarly assumed to be UTF-8, and gp_fopen likewise assumes UTF-8
encoded values.
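
To make the entry-point side concrete: Windows hands a program its arguments
as UTF-16, so they have to be re-encoded before a UTF-8 main can see them.
The following is a rough, portable sketch of that re-encoding, not the actual
Ghostscript code. utf16_to_utf8 is a hypothetical name, and real Windows code
could equally call WideCharToMultiByte with CP_UTF8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode 0-terminated UTF-16 `in` as UTF-8 into `out`
 * (capacity `cap` bytes, including the terminating NUL). Returns the number
 * of bytes written excluding the terminator, or -1 on an unpaired surrogate
 * or if the buffer is too small. */
int utf16_to_utf8(const uint16_t *in, char *out, size_t cap)
{
    size_t n = 0;

    while (*in) {
        uint32_t cp = *in++;
        int len;

        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            if (*in < 0xDC00 || *in > 0xDFFF)
                return -1;                           /* no low surrogate */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                               /* stray low surrogate */
        }

        len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + (size_t)len >= cap)
            return -1;                               /* no room incl. NUL */

        switch (len) {
        case 1:
            out[n++] = (char)cp;
            break;
        case 2:
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        default:
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        }
    }
    if (n >= cap)
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

A wide entry point would run each argv element through a conversion like this
and then pass the resulting char* array on to the UTF-8 main.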

The difference between Windows and Unix is that on Windows we have a thin shim
layer to do the conversion for us (wmain converts from wchar to UTF-8 and calls
main_utf8; gp_fopen converts from UTF-8 to wchar and calls _wfopen).
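
As a rough illustration of the file-opening half of that shim, the sketch
below decodes a UTF-8 name into UTF-16 and hands it to _wfopen on Windows,
falling back to plain fopen elsewhere. It is not the real gp_fopen:
utf8_to_utf16 and fopen_utf8 are hypothetical names, the 260-unit path buffer
is an assumption, and the decoder does not reject overlong UTF-8 forms.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#ifdef _WIN32
#include <wchar.h>
#endif

/* Hypothetical helper: decode 0-terminated UTF-8 `in` into UTF-16 units in
 * `out` (capacity `cap` units, including the terminator). Returns the number
 * of units written excluding the terminator, or -1 on malformed input or if
 * the buffer is too small. */
int utf8_to_utf16(const char *in, uint16_t *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;

    while (*s) {
        uint32_t cp;
        int extra;

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return -1;                    /* invalid lead byte */
        s++;
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)
                return -1;                 /* truncated sequence */
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);
        }

        if (cp >= 0x10000) {               /* needs a surrogate pair */
            if (cp > 0x10FFFF || n + 2 >= cap)
                return -1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap)
                return -1;
            out[n++] = (uint16_t)cp;
        }
    }
    if (n >= cap)
        return -1;
    out[n] = 0;
    return (int)n;
}

/* Hypothetical wrapper: open a file whose name is UTF-8 encoded. */
FILE *fopen_utf8(const char *fname, const char *mode)
{
#ifdef _WIN32
    /* wchar_t is a 16-bit type on Windows, so the buffers can be passed on. */
    uint16_t wname[260], wmode[4];

    if (utf8_to_utf16(fname, wname, 260) < 0 ||
        utf8_to_utf16(mode, wmode, 4) < 0)
        return NULL;
    return _wfopen((const wchar_t *)wname, (const wchar_t *)wmode);
#else
    return fopen(fname, mode);             /* Unix: pass the bytes through */
#endif
}
```

Keeping the conversion inside the wrapper means callers only ever deal with
UTF-8 strings, on either platform.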

(In reply to comment #12)
> I did a quick test in the morning feeding the api with utf8 arguments to
> access .ps and .pdf files with unicode file and folder names. This works fine
> now.

Fabulous. Many thanks for your help testing this. I don't have a non-English
version of Windows, so I am particularly grateful for any testing/suggestions/
pointing-out-of-stupid-errors you can offer!


