Bug 688737

Summary: 'resourceforall' truncates names of file-based resources
Product: Ghostscript Reporter: SaGS <sags5495>
Component: PS Interpreter Assignee: leonardo <leonardo>
Status: NOTIFIED FIXED    
Severity: normal CC: bernd
Priority: P4    
Version: master   
Hardware: PC   
OS: Windows XP   
Customer: Word Size: ---
Attachments: Suggested patch.

Description SaGS 2006-06-05 11:42:02 UTC
SYMPTOMS ---

gswin32c.exe built from SVN rev 6840 (current at the time of 
this writing) does not start at all on Windows. Trying to start 
it, one gets the following error:

   "While reading gs_dbt_e.ps:
    Error: /undefinedresource in --findresource--"

The error is actually raised by "lib\gs_fntem.ps" -r6840 
line #273, inside a "resourceforall" loop. The cause is a bug 
in "lib\gs_res.ps::ResourceForAll". (The file name in the error 
message is incorrect, see bug #688738 "Misleading error message 
from 'runlibfile0'".)

Note: GS_LIB is set to something like 
"X:\Y\Z\gs\lib;X:\Y\Z\gs\Resource;X:\Y\Z\gs\fonts".

BUG DETAILS ---

To find file-based resources (those in "Resource\*\", 
not those already loaded in VM by "defineresource"), 
"lib\gs_res.ps::ResourceForAll" enumerates files using 
"filenameforall", then cuts off the "path" parts to retain only 
the file names proper, expected to be equal to resource names.

Before each "filenameforall" loop, it computes the length 
(denoted in comments by "Lp") of the "path" part, and inside 
the loop it uses "getinterval" to remove that many characters 
from each full name string. The error happens because "Lp" is 
computed using the TEMPLATE STRING, which may contain extra "\"s 
for escaping "*?\", and used against the PLATFORM-SPECIFIC 
FILENAME STRINGS returned by "filenameforall" where the extra 
"\"s are NOT present. On Windows, where the path separator is 
"\" and needs to be escaped, the computed "Lp" is too large and 
a few chars from the beginning of resource names are lost.
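
A minimal sketch in C of the miscalculation (the real code is PostScript 
in gs_res.ps; the function name and the sample paths are hypothetical). 
It assumes every "\" of the directory is doubled in the template, and 
skips as many characters as the ESCAPED prefix is long:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mimicking the buggy logic: Lp is taken from the
   escaped directory prefix of the template, then applied to the
   unescaped file name returned by the enumeration. */
const char *strip_with_template_length(const char *filename,
                                       const char *escaped_dir)
{
    size_t lp = strlen(escaped_dir);      /* too large if "\" was escaped */
    size_t n = strlen(filename);
    return filename + (lp < n ? lp : n);  /* clamp to the end of the string */
}
```

With an escaped prefix whose characters are C:\\gs\\Resource\\Decoding\\ 
(28 chars) and a real prefix C:\gs\Resource\Decoding\ (24 chars), the 
name "StandardEncoding" comes back as "dardEncoding": 4 chars are lost, 
one per escaped separator.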

For example, on my computer the line

    "(*) { == } 100 string /Decoding resourceforall"

gives (after removing "Resource\Encoding\", so that GS runs):

   "(ode)
    (Wingdings)
    (Symbol)
    (dardEncoding)
    (Unicode)
    (Dingbats)
    ()
    (n1)"

instead of:

   "(Unicode)
    (FCO_Wingdings)
    (FCO_Symbol)
    (StandardEncoding)
    (FCO_Unicode)
    (FCO_Dingbats)
    (.svn)                 (<-- different problem)
    (Latin1)"

This does not happen when using "/" instead of "\" in GS_LIB.

SUGGESTED FIX ---

The attached patch changes "lib\gs_res.ps::ResourceForAll" to 
compute "Lp" during the 1st iteration of each "filenameforall" 
call, by locating the LAST ".file_name_separator". For 
performance reasons, it is assumed the "path" part of all 
returned filenames is the same, so its length is computed only 
once. (For example, it does not expect "filenameforall" to return 
"X:\AAA\one.txt" and then "X:\.\BBB\..\AAA\two.txt".)
Comment 1 SaGS 2006-06-05 11:44:47 UTC
Created attachment 2257 [details]
Suggested patch.
Comment 2 leonardo 2006-06-07 09:01:23 UTC
The suggested patch must not be applied because it is Unix specific.
Comment 3 Ray Johnston 2006-06-07 09:11:36 UTC
It would solve MANY problems for users if all Windows users learned to
use '/' in paths instead of '\' (which is in strings as '\\').

The confusion caused by this ancient Microsoft perversity is myriad.

Maybe the correct thing to do is to munge the LIBPATH entries and the
GenericResourceDir to change '\' into '/' and avoid many issues.
Comment 4 leonardo 2006-06-07 09:14:31 UTC
I can see two ways for fixing it.

1. Improve .generate_dir_list_templates so that it preserves the lengths of 
the directory paths that come from its argument. The result will be like 
this : template1 length1 template2 length2 ...; fix all callers to consume 
the extra results.

2. Compute the path length from the template length by searching for all 
escapements. This looks pretty complex if the 2nd argument 
of .generate_dir_list_templates has its own escapements, so that argument 
must also be passed to this computation.

The disadvantage of (1) is that other calls must be modified. The 
disadvantage of (2) is that it introduces another constraint: the 2 functions 
must correspond to each other. 

I strongly prefer (1).
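
Option (1) could look roughly like this, sketched in C rather than 
PostScript. The struct and function names are hypothetical. The sketch 
assumes the only escaping needed is a "\" before "*", "?" and "\", and 
it does not model any normalisation of "./" or "DIR/../" path parts:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: a template plus the length of the plain
   (unescaped) directory it was built from. */
struct dir_template {
    char text[256];   /* escaped directory followed by the pattern */
    size_t dir_len;   /* length of the UNESCAPED directory part */
};

/* Escapes "*", "?" and "\" in dir, appends pattern, and records the
   unescaped directory length. Returns 0 on success, -1 on overflow. */
int make_dir_template(const char *dir, const char *pattern,
                      struct dir_template *out)
{
    size_t o = 0;
    for (const char *p = dir; *p; p++) {
        if (*p == '*' || *p == '?' || *p == '\\') {
            if (o + 1 >= sizeof out->text) return -1;
            out->text[o++] = '\\';            /* escape character */
        }
        if (o + 1 >= sizeof out->text) return -1;
        out->text[o++] = *p;
    }
    if (o + strlen(pattern) >= sizeof out->text) return -1;
    strcpy(out->text + o, pattern);
    out->dir_len = strlen(dir);               /* what the caller strips */
    return 0;
}
```

A caller would then strip dir_len characters from each enumerated file 
name instead of deriving that count from the template text.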
Comment 5 SaGS 2006-06-07 13:27:51 UTC
> The suggested patch ... is Unix specific.

I don't see why. Not even WINDOWS-specific.

The only assumptions made by the new code are as follows:

(a) The syntax of the strings returned by filenameforall is

        [ [ "dir" ] ".file_name_separator" ] "fname"

    where:

    - "[]" indicate optional parts;

    - "dir" is an arbitrary sequence of chars (it represents 
      the platform's notion of "directory"/ "folder"/ etc.);

    - ".file_name_separator" is the platform-specific string 
      returned by the GS operator with the same name (which is 
      the string returned by gp_file_name_separator() found in 
      gp_dosfs.c/ gp_ntfs.c/ gp_unifn.c/ etc);

    - "fname" is a character sequence, arbitrary except as noted 
      below (it represents the platform's notion of "filename 
      proper, without any drive/ path/ etc.").

(b) The "fname" can never include ".file_name_separator".
    (I may be wrong here if, for example, VMS accepts "]" 
    in the "filename proper" part.)

(c) The string returned by the .file_name_separator operator 
    is non-empty;

(d) The resource name is equal to the "fname" part. This is the 
    only thing that makes the "fname" part essentially different 
    from the "dir" part.

    In particular, this means that a category's directory must 
    not have subdirs. (If it would, then it wouldn't work on 
    Windows anyway, because currently filenameforall does not 
    enumerate files in subdirs - but that's another issue.) 

    I find this to be the only real weakness/ limitation of the 
    method I suggested. If you want to support subdirs, I 
    think the only way is to implement a filenameforall-alike 
    that takes an additional "base dir" argument and returns 
    file names with partial paths relative to this "base dir".

Note that "dir" is allowed to contain the ".file_name_separator" 
(on Unix, and even Windows, both gp_file_name_separator() and 
gp_file_name_directory_separator() are "/"), but "fname" isn't. 
In any case, finding the LAST ".file_name_separator", if any, 
in the string returned by filenameforall makes it possible to 
separate the "fname" part and thus find the resource name.

Also note that this method does not depend on:

(i)  any escape chars that may have been present in the template;
(ii) what filenameforall and/or .generate_dir_list_templates 
     do to path parts like "./" or "DIR/../".
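
The "find the LAST separator" step can be sketched in C as follows. The 
function name is hypothetical (the patch itself is PostScript), and the 
sketch relies on assumptions (a) to (d) above:

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the "fname" part of path: everything after the
   LAST occurrence of the separator string sep. If sep does not occur
   (or is empty), the whole path is treated as the file name. */
const char *resource_name_from_path(const char *path, const char *sep)
{
    size_t seplen = strlen(sep);
    const char *last = NULL;
    const char *p = path;

    if (seplen == 0)
        return path;
    while ((p = strstr(p, sep)) != NULL) {
        last = p;   /* remember the latest match */
        p += 1;     /* continue the search just past its first char */
    }
    return last != NULL ? last + seplen : path;
}
```

Because only the returned file name is examined, the result is the same 
whether or not the template contained escape characters.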

---

> 1. Improve .generate_dir_list_templates with preserving 
> lengths of directory paths, which come from its argument. 
> The result will be like this : template1 length1 template2 
> length2 ...

Almost OK. But, since .generate_dir_list_templates "simplifies" 
"./" and "DIR/../", what "lengthK" should be returned for

    [ (A/B/C) ] (../../../D/*) .generate_dir_list_templates ?

(BTW: Try the line above, and variations of it!)

> 2. Compute the path length from the template length with 
> searching all escapements.

"Simplification" of path parts like "./" and "DIR/../", 
especially at the boundary between the 2 strings coming from the 
2 arguments, compounds the problem.
Comment 6 leonardo 2006-06-14 07:59:00 UTC
I wrote :

> The suggested patch ... is Unix specific.

Well, I assumed OpenVMS. Actually my sentence isn't correct. The correct 
sentence is : "the suggested patch assumes no escapements in the directory 
path". This assumption isn't general enough. Here is an example :

(\?)

This path template string is longer than the corresponding path string

(?)

and contains no file name separators.
A directory named '?' may be created on Linux. 
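
The length difference between a template and the path it matches can be 
sketched in C as below. The function name is hypothetical, and the sketch 
assumes that a "\" in the template escapes exactly the one character that 
follows it:

```c
#include <stddef.h>

/* Counts the characters of a template after removing escapes: each
   "\" followed by another character contributes one character, not
   two. A trailing lone "\" is counted as an ordinary character. */
size_t template_unescaped_length(const char *tmpl)
{
    size_t n = 0;
    for (const char *p = tmpl; *p; p++) {
        if (*p == '\\' && p[1] != '\0')
            p++;    /* skip the escape, count the escaped character */
        n++;
    }
    return n;
}
```

For the two-character template \? this gives 1, the length of the 
directory name ?.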
Comment 7 leonardo 2006-06-14 08:01:42 UTC
Ray wrote :

> It would solve MANY problems for users if all Windows users learned to
> use '/' in paths instead of '\' (which is in strings as '\\').

Avoiding this particular obstacle doesn't help in general. Also I don't think 
that forcing users to learn Unix conventions helps to win new users.

Comment 8 SaGS 2006-06-15 01:01:42 UTC
> The correct sentence is : "the suggested patch assumes no escapements 
> in the directory path". This assumption isn't general enough.

In PostScript, escapement chars are present in:

(a) In the SYNTACTIC (source code) representation of PostScript strings.
    Example: for a file/dir named "\", the corresponding PS string 
    has a length of 1 and its first element is the char code 0x5C 
    (assuming ASCII), but when written in a PS program's source it 
    appears as "(\\)".

and
(b) In filenameforall TEMPLATE strings, as you mention.

My patch is neither in case (a) (because it does not deal with PS source 
code), nor in case (b) (because it does not examine the TEMPLATE, but 
the resulting filename strings).

The assumption the patch makes is that the platform-specific 
".file_name_separator" does not appear in the filename-proper part 
of a (possibly complete) file specification. For example, that 
Windows and Unix do not allow "/" in filenames, OpenVMS does not 
allow "]" and Mac does not allow ":".
Comment 9 leonardo 2006-06-15 01:29:08 UTC
> Windows and Unix do not allow "/" in filenames, OpenVMS does not 
> allow "]" and Mac does not allow ":".

Postscript does allow them in resource names.
My experience is to map such names into subdirectories.
I've got many TT fonts with names like "Times/Cyrillic".

The following succeeds with GS HEAD on Windows :

(t/Wingdings) cvn /Encoding findresource
(s\\Wingdings) cvn /Encoding findresource
(*) {=} 256 string /Encoding resourceforall

if you place the file into Resource/Encoding/t/Wingdings and 
Resource/Encoding/s/Wingdings and change the resource name in the file 
into "(t/Wingdings) cvn" and "(s\\Wingdings) cvn" correspondingly.
The stdout is :

AFPL Ghostscript SVN PRE-RELEASE 8.55 (2006-05-20)
Copyright (C) 2006 artofcode LLC, Benicia, CA.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
StandardEncoding
MacExpertEncoding
SymbolEncoding
PDFDocEncoding
s\Wingdings
MacRomanEncoding
Wingdings
.GS_extended_SymbolEncoding
DingbatsEncoding
WinAnsiEncoding
ISOLatin1Encoding
t/Wingdings


Comment 10 SaGS 2006-06-15 03:04:29 UTC
> Postscript does allow them ["/\" etc] in resource names.
> My experience is to map such names into subdirectories.
> I've got many TT fonts with names like "Times/Cyrillic".

You are right with this, and as I mentioned in comment #5 ("In 
particular, this means that a category's directory ...") there is 
a limitation here.

If you don't want such a limitation, then it's your decision. In the 
meantime, the limitation is not specific to this patch. Two reasons:

(1) On Windows, "filenameforall", used to implement "resourceforall", 
    does not recurse into subdirectories, so resources in subdirectories 
    are not enumerated (although they are found by "findresource"). The 
    Unix implementation is better. I'm not sure what happens on OpenVMS 
    or Mac. But, given the special syntax of file specs on OpenVMS, I see 
    a potential problem with simply cutting (the initial) part of the 
    string returned by "filenameforall" in the middle of the directory 
    specification (not sure, don't know OpenVMS).

(2) The set of chars allowed in filenames is inherently platform-specific.
    As a consequence, names of FILE-BASED resources have some constraints 
    that resources defined with "defineresource", directly in VM, don't. 
    Not allowing ".file_name_separator" in these resources' names falls 
    into this class of restrictions. Note that the patch does not care about 
    ".DIRECTORY_name_separator" in resource names, only about 
    ".FILE_name_separator", but unfortunately these are equal for all 
    platforms except OpenVMS.

> The following succeeds with GS HEAD on Windows :
>
> (t/Wingdings) cvn /Encoding findresource
> (s\\Wingdings) cvn /Encoding findresource

OK. (After first removing "Resources\Encoding\" to be able to start GS, 
and putting it back before issuing those commands.)

> (*) {=} 256 string /Encoding resourceforall

No, not on Windows (GS compiled with VS .NET 2003). As I already mentioned, 
on this platform "filenameforall" and thus "resourceforall" do not 
recurse into subdirectories. When testing, be careful to try this command 
directly after starting GS, NOT after issuing "(t/Wingdings) cvn /Encoding 
findresource" etc. The latter does succeed, then "resourceforall" 
enumerates that resource because it finds it already in VM, NOT on disk. 
What I get is (with "/" and not "\" in GS_LIB, otherwise it's worse):

    AFPL Ghostscript SVN PRE-RELEASE 8.55 (2006-05-20)
    Copyright (C) 2006 artofcode LLC, Benicia, CA.  All rights reserved.
    This software comes with NO WARRANTY: see the file PUBLIC for details.
    GS>(*) {=} 256 string /Encoding resourceforall
    WinAnsiEncoding
    DingbatsEncoding
    PDFDocEncoding
    .GS_extended_SymbolEncoding
    ISOLatin1Encoding
    SymbolEncoding
    MacExpertEncoding
    MacRomanEncoding
    StandardEncoding
    .svn
    Wingdings
    t                      <-- the subdir itself, but none of the files inside
    GS>quit
Comment 11 leonardo 2006-06-15 03:54:41 UTC
> (After first removing "Resources\Encoding\" to be able to start GS, 
> and putting it back before issuing those commands.)

Not sure what you mean.

I run this test with the following file layout :

./Resource/Encoding   directory
./Resource/Encoding/Wingdings   file 1 with /Wingdings
./Resource/Encoding/t   directory
./Resource/Encoding/t/Wingdings file 2 with (t/Wingdings) cvn
./Resource/Encoding/s   directory
./Resource/Encoding/s/Wingdings file 3 with (s\\Wingdings) cvn

The test case works fine.

I agree that the current implementation has a problem : if resources are 
enumerated before (t/Wingdings) cvn /Encoding findresource and 
(s\\Wingdings) cvn /Encoding findresource, these 2 files are missed in the 
enumeration. This is a shortcoming of the current implementation, which is 
to be fixed some day. Dropping those resources from findresource would be a 
regression.





Comment 12 leonardo 2006-06-15 03:57:08 UTC
Dropping those resources from findresource would be a regression, 
because "(Times/Cyrillic) cvn findfont" won't work.
Comment 13 SaGS 2006-06-15 05:36:36 UTC
> > (After first removing "Resources\Encoding\" to be able to start GS, 
> > and putting it back before issuing those commands.)
> 
> Not sure what you mean.

The moment I create a subdirectory in "Resources\Encoding\", trying to start 
gswin32c.exe fails with

   "While reading gs_dbt_e.ps:
    Error: /undefinedresource in --findresource--"

The filename printed is wrong, see bug 688738 "Misleading error message from 
'runlibfile0'". The error happens in gs_fntem.ps -r6840 line 273 "cvn dup 
/Encoding findresource" because the subdirectory itself is enumerated as a 
resource. Your workaround from gs_fntem.ps::resourceforall eliminates ".svn" 
from the enumeration, but not the other directories.

See 2nd paragraph of http://bugs.ghostscript.com/show_bug.cgi?id=688565#c7 , 
which refers to gp_ntfs.c::gp_enumerate_files_next() -r6651 line 169 
"&& (pfen->find_data.dwFileAttributes != FILE_ATTRIBUTE_DIRECTORY))". 
Debugging, the attribute for that "t" I created is 0x2010, which is 
FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_DIRECTORY.

The correct test there is:

   "if (! (pfen->find_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        break;"

("&" not "==", also no need to special case "." and "..").

Either you fixed this, or your dirs have just a plain FILE_ATTRIBUTE_DIRECTORY.
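
The difference between the two attribute tests can be shown with a small 
C sketch. The DEMO_ constants repeat the documented Win32 values so that 
the sketch compiles without <windows.h>; the function names are 
hypothetical and are not taken from gp_ntfs.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as the Win32 FILE_ATTRIBUTE_* constants of these names. */
#define DEMO_FILE_ATTRIBUTE_DIRECTORY           0x00000010u
#define DEMO_FILE_ATTRIBUTE_NOT_CONTENT_INDEXED 0x00002000u

/* Equality test: recognises a directory only when no other attribute
   bit is set. */
bool is_directory_by_equality(uint32_t attrs)
{
    return attrs == DEMO_FILE_ATTRIBUTE_DIRECTORY;
}

/* Mask test: recognises a directory whatever other bits are set. */
bool is_directory_by_mask(uint32_t attrs)
{
    return (attrs & DEMO_FILE_ATTRIBUTE_DIRECTORY) != 0;
}
```

For the value 0x2010 seen in the debugger, the equality test reports "not 
a directory" while the mask test reports "directory".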

---

> Dropping those resources from findresource would be a regression, 
> because "(Times/Cyrillic) cvn findfont" won't work.

The patch doesn't change "findresource" at all, this still works as before, 
even with resources in subdirectories. The limitation refers exclusively to 
"resourceforall", and to the fact that the behaviours of "resourceforall" and 
"findresource" are supposed to match (one must find exactly what the other 
finds too). Both problems are already present in TRUNK.

I don't have a "Times/Cyrillic", but I hacked a010013l.pfb to name the font 
"URWG/Book" and stored it as "Resources\Font\URWG\Book.". Both 
"(URWG/Book) cvn /Font findresource" and "(URWG/Book) cvn findfont" work OK 
with the patched GS version (just tested on Windows).
Comment 14 leonardo 2006-06-15 08:40:05 UTC
>    "While reading gs_dbt_e.ps:
>    Error: /undefinedresource in --findresource--"

Please update to current HEAD.

> The limitation refers exclusively to 

No need to introduce the limitation. I'll make a patch without it when I 
have time. You may do so as well.


Comment 15 leonardo 2006-08-17 08:29:40 UTC
Patch to HEAD :
http://ghostscript.com/pipermail/gs-cvs/2006-August/006756.html
Comment 16 Bernd Becker 2007-03-24 17:44:56 UTC
I'm running GS 8.56 on Windows 2000 and no matter what - using
SaGS's patch, the original, changing the backslashes in the
environment variables to slashes - it will exit with an error
message every time; the only difference is that with SaGS's
patch the reason is given as "/rangecheck in --resourceforall--".

Log file follows: (unmodified 8.56)

Current allocation mode is global
Current file position is 7361
gsapi_init_with_args returns -100
GPL Ghostscript 8.56 (2007-03-14)
Copyright (C) 2007 artofcode LLC, Benicia, CA.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
While reading gs_fntem.ps:
Error: /undefinedresource in /wingdings
Operand stack:
   (gs_fntem.ps)   1   FontEmulationProcs   encodingnames   --nostringval--   --
nostringval--   StandardEncoding   --nostringval--   ISOLatin1Encoding   --
nostringval--   SymbolEncoding   --nostringval--   DingbatsEncoding   --
nostringval--   SymbolEncoding   --nostringval--   DingbatsEncoding   --
nostringval--   StandardEncoding   --nostringval--   ISOLatin1Encoding   
wingdings   wingdings   Encoding   25   wingdings
Execution stack:
   %interp_exit   --nostringval--   --nostringval--   --nostringval--   %
array_continue   --nostringval--   --nostringval--   --nostringval--   false   1 
  %stopped_push   --nostringval--   1830   17   5   %oparray_pop   --
nostringval--   --nostringval--   --dict:17/21(ro)(G)--   --dict:1/1(G)--   --
nostringval--   1   %dict_continue   --nostringval--   --nostringval--   1828   
24   5   %oparray_pop   findresource   %errorexec_pop   --nostringval--   --
nostringval--   --nostringval--   --nostringval--
Dictionary stack:
   --dict:951/1123(G)--   --dict:0/20(G)--   --dict:64/200(L)--   --dict:951/
1123(G)--   --dict:10/10(G)--   --dict:17/21(ro)(G)--
Current allocation mode is global
Current file position is 7361
gsapi_init_with_args returns -100
Comment 17 leonardo 2007-03-24 23:16:21 UTC
Regarding Comment #16 :

'wingdings' is not the right resource name. The right one is Wingdings. The 
file name must be the same as the resource name, and the match is 
case-sensitive. Please rename the file and try again.
Comment 18 leonardo 2007-03-24 23:17:10 UTC
BTW gs8.56 does not need SaGS's patch applied. I think it is not applicable 
at all.
Comment 19 Bernd Becker 2007-03-25 08:53:38 UTC
<< 'wingdings' is not a right resource name. The right one is Wingdings. >>
Hmph, that was the culprit. Thanks a lot, man !! :-)

<< BTW gs8.56 does not need to apply SaGS patch. >>
I tried it to see if it would change anything, and it did, just not in the
hoped for way ;-)
And I'm back to unmodified GS 8.56, as renaming the file did the trick.