Bug 691117 - Gs cannot open lib files when directory has some SJIS letters in its name.
Summary: Gs cannot open lib files when directory has some SJIS letters in its name.
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PS Interpreter
Version: master
Hardware: PC Windows XP
Importance: P2 normal
Assignee: Henry Stiles
URL:
Keywords: bountiable
Depends on:
Blocks:
 
Reported: 2010-02-18 03:05 UTC by Masaki Ushizaka
Modified: 2012-04-12 17:13 UTC

See Also:
Customer: 580
Word Size: ---


Description Masaki Ushizaka 2010-02-18 03:05:27 UTC
The customer reported this.

- Install GS on Japanese Windows XP.
- Create a folder named "機能" and copy the lib and Resource directories into it.
- Run gs with the -I option specifying the directories inside "機能".
- GS cannot find the files there.

The customer said that specifying the directories through the registry gives the same result; I have not tried that yet.

This is a classic case of the second byte of a Shift-JIS character matching the backslash.  Japanese Windows uses Shift-JIS encoding for its ANSI APIs.  In Shift-JIS, the first byte of a double-byte character is always greater than 0x80, but the second byte may fall in the ASCII range (0x40-0x7e).  The string "機能" (meaning "function") is encoded as the bytes 0x8b 0x40 0x94 0x5c, and the last byte is identical to the backslash (0x5c).
When GS tries to open lib files, it splits the path into directories one by one (at '/' or '\') and concatenates them again.  During that process the directory name "機能" gets broken.
To fix this, GS needs to handle path strings in a multi-byte-aware way (search_separator()?), although I am not sure that we should do so.

The customer also reported that the English version of Windows XP does not produce this error.  I am not sure why it doesn't.
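
The byte collision described above can be reproduced in a few lines of portable C. This is only a sketch: the function names are hypothetical and are not taken from the Ghostscript source, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are assumed to be those of Shift-JIS as used by Windows code page 932.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: Shift-JIS (code page 932) lead bytes are 0x81-0x9F and 0xE0-0xFC. */
int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Byte-wise scan: index of the first '/' or '\\' byte, or -1 if none.
 * This is the kind of scan that breaks on Shift-JIS paths. */
long find_separator_naive(const char *path)
{
    size_t i;

    assert(path != NULL);
    for (i = 0; path[i] != '\0'; i++) {
        if (path[i] == '/' || path[i] == '\\')
            return (long)i;
    }
    return -1;
}

/* Lead-byte-aware scan: a lead byte and the trail byte after it are
 * skipped as one unit, so a 0x5c trail byte is never taken as '\\'. */
long find_separator_sjis(const char *path)
{
    size_t i = 0;

    assert(path != NULL);
    while (path[i] != '\0') {
        unsigned char c = (unsigned char)path[i];

        if (sjis_is_lead_byte(c) && path[i + 1] != '\0') {
            i += 2;             /* skip the whole double-byte character */
            continue;
        }
        if (c == '/' || c == '\\')
            return (long)i;
        i++;
    }
    return -1;
}
```

For the path bytes 0x8b 0x40 0x94 0x5c followed by "\lib", the naive scan stops at index 3 (inside "機能"), while the lead-byte-aware scan stops at index 4, the real separator.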
Comment 1 Masaki Ushizaka 2010-03-29 12:49:22 UTC
Redirecting stdin might work as a temporary workaround.

- If you are using cmd.exe or a batch file, then:

  C>gswin32c - <unicode_filename.ps

- If you launch gs from your code:
  1) Launch gs using cmd.exe and its redirection:

    WCHAR cmd[] = L"C:\\WINDOWS\\system32\\cmd.exe /c gswin32c.exe - < unicode_filename.ps";  /* writable buffer: CreateProcessW may modify it */
    CreateProcessW(NULL, cmd, ...);

  2) or, redirect stdin by yourself:

    HANDLE hFile = CreateFileW(L"unicode_filename.ps", ...);
    STARTUPINFO si = { 0, };
    ...
    si.hStdInput = hFile;
    ...
    CreateProcess(NULL, _T("gswin32c.exe -"), ... &si, ..);
Comment 2 Masaki Ushizaka 2010-03-29 13:17:58 UTC
The stdin redirection in comment #1 has a disadvantage.  When reading from stdin, gs may spool the input into a temporary file, which can take more disk space and processing time.
Comment 3 Masaki Ushizaka 2010-03-30 12:26:04 UTC
Marcos, can you ask the customer whether comments 1 and 2 can serve as their short-term workaround?
(And please assign this back to me, thank you)
Comment 4 Robin Watts 2011-06-04 23:07:41 UTC
Believed fixed with:

commit 0ea739147fd02ee0e63e58c036bb63fa841ddd3c
Author: Robin Watts <Robin.Watts@artifex.com>
Date:   Sat Jun 4 22:04:12 2011 +0100

    Bug 691222: Make windows build use UTF-8 encoding.

    We change the windows builds to use the 'wmain' rather than 'main'
    entrypoints. This means we get the command line supplied in 'wchar_t's
    rather than chars. We convert back to chars using UTF-8 encoding, and
    call (what was) the main entrypoint.

    This means that we can cope with unicode filenames/paths etc.

    To match the encoding, we therefore need to wrap every use of the
    filenames with the associated utf-8 -> wchar_t conversion and use
    the unicode file access functions (_wfopen instead of fopen etc)
    instead.

    Simple testing seems to indicate that this works. I think I've got
    every occurence of file access, but it's possible I've missed some. If so
    I'll fix them piecemeal as they are reported.

    This should solve bug 691222, and hopefully 691117.
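
The wchar_t-to-UTF-8 step described in the commit message can be illustrated with a small portable encoder. This is a sketch only, not the code from the commit: the function name is hypothetical, and uint16_t stands in for the 16-bit wchar_t of Windows so that the example also compiles elsewhere. On Windows the same conversion is available through WideCharToMultiByte with CP_UTF8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a NUL-terminated UTF-16 string as NUL-terminated UTF-8.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the output buffer is too small or the input has an unpaired surrogate. */
long utf16_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != 0) {
        uint32_t cp = *in++;
        unsigned char buf[4];
        size_t len, k;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (*in < 0xDC00 || *in > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;          /* low surrogate without a high one */
        }

        if (cp < 0x80) {
            buf[0] = (unsigned char)cp;
            len = 1;
        } else if (cp < 0x800) {
            buf[0] = (unsigned char)(0xC0 | (cp >> 6));
            buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 2;
        } else if (cp < 0x10000) {
            buf[0] = (unsigned char)(0xE0 | (cp >> 12));
            buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 3;
        } else {
            buf[0] = (unsigned char)(0xF0 | (cp >> 18));
            buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 4;
        }

        if (n + len + 1 > out_size)     /* keep room for the NUL */
            return -1;
        for (k = 0; k < len; k++)
            out[n++] = (char)buf[k];
    }
    if (n + 1 > out_size)
        return -1;
    out[n] = '\0';
    return (long)n;
}
```

Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so a directory name such as "機能" (U+6A5F U+80FD) can no longer contain a byte equal to '\' or '/' once the path is held in UTF-8.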
Comment 5 Henry Stiles 2011-06-09 18:07:43 UTC
Looks like Robin Watts has done enough to close this one.