Customer files fail with: Substituting CID font resource/Adobe-GB1 for /����. Error: /undefinedresource in --findresource-- I (and presumably they) don't have the font, but Ghostscript should fail more gracefully.
Created attachment 2304 [details] 19773296.pdf
Created attachment 2305 [details] bqqq005.pdf
The message is fairly clear ("looking for a specific Simplified Chinese font and couldn't find one"), but I am somewhat curious about the "/����" part: this is clearly 0xfffd 0xfffd 0xfffd 0xfffd, which does not correspond to any common Simplified Chinese font I know of, nor to any sensible font name, so something is probably a bit wrong. Incidentally, there is a CID font mapping enhancement bug still open which I am supposed to work on; I haven't forgotten it, I just couldn't find the time...
19773296.pdf contains a font object with /Subtype /CIDFontType2 /BaseFont /#B7#C2#CB#CE_GB2312, but the font file is not embedded. In this case Ghostscript creates a substitute CIDFont name, Adobe-GB1, which is composed from the encoding. Note there are many possible encodings. The user must provide a CID font with that name either in Resource/CIDFont or in lib/cidfmap. The observed behavior appears correct, so I am closing the bug as invalid. Note that the #xx syntax in PDF names encodes "unprintable" characters; see the PDF spec. I can't reproduce the text "/����.". gswin32c.exe prints some non-ASCII characters instead, and they appear correct. If you are sure you get the incorrect output on stderr, please provide more details on how to reproduce it: the platform, application, command line, font maps, etc. Redirect Ghostscript's stdout to a file and attach the file.
/#B7#C2#CB#CE_GB2312 is Fangsong_GB2312 in GB2312 encoding, so it probably should have been taken care of with a substitution similar to /STFangsong-Light [ /SimSun ]. The comment about &#65533; makes sense: just a cut-and-paste error somewhere. I agree the behavior appears to be correct, but as Ralph said, the error message could be a bit more friendly.
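The decoding described here can be checked with a short Python sketch. It assumes, as this comment states, that the bytes behind the #xx escapes are GB2312-encoded; `decode_pdf_name` is a hypothetical helper written for illustration and is not part of Ghostscript.

```python
import re

def decode_pdf_name(name: str) -> bytes:
    """Expand #xx escapes in a PDF name (given without the leading slash) into raw bytes."""
    raw = name.encode("latin-1")
    return re.sub(rb"#([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]),
                  raw)

raw_name = decode_pdf_name("#B7#C2#CB#CE_GB2312")
print(raw_name)                   # the four escaped bytes followed by b"_GB2312"
print(raw_name.decode("gb2312"))  # readable Chinese only if the GB2312 assumption holds
```

A terminal that interprets the same four bytes in a different encoding shows unrelated or replacement characters, which is consistent with the cut-and-paste explanation given above.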
Created attachment 2312 [details] 688770-acrobatfontlist.tiff attaching a screenshot of the Document Info:Fonts panel from Adobe Acrobat 7.08. The font names show in Chinese. Are they correct, Hin-Tak? We're trying to figure out how Acrobat figures out what encoding the font name is in.
Along with an improved cidfmap alias, we should probably print non-ASCII names/strings in hex to avoid some confusion.
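A minimal sketch of what printing a non-ASCII name in hex could look like, using the PDF name escape form already seen in this thread. `escape_pdf_name` is a hypothetical helper, not an existing Ghostscript function. Escaping every byte outside printable ASCII, plus the PDF delimiter characters, is an assumption about what would be least confusing in a log.

```python
# Bytes that are printable ASCII but still need escaping inside a PDF name.
DELIMITERS = b"()<>[]{}/%#"

def escape_pdf_name(raw: bytes) -> str:
    """Render raw name bytes as an ASCII-only PDF name, using #XX for anything unsafe."""
    parts = []
    for b in raw:
        if 0x21 <= b <= 0x7E and b not in DELIMITERS:
            parts.append(chr(b))       # safe printable ASCII: keep as-is
        else:
            parts.append(f"#{b:02X}")  # everything else: two-digit hex escape
    return "/" + "".join(parts)

print(escape_pdf_name(b"\xb7\xc2\xcb\xce_GB2312"))
```

The output is pure ASCII, so it survives any terminal or log viewer regardless of which encoding that viewer assumes.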
One heuristic for non-ASCII font names is to apply the CMap attached to the font to its font name. GB2312 and similar encodings use ASCII codes for ASCII characters and non-ASCII codes for CJK characters, so the recognition should be pretty successful.
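A rough sketch of this heuristic in Python, under several assumptions: the font uses one of the predefined legacy-encoding CMaps, and the table mapping CMap names to Python codecs is a hypothetical, incomplete illustration. The thread does not say which CMap the customer files actually use, so the CMap name in the example call is only a placeholder.

```python
# Hypothetical mapping from a few predefined CMap names to Python codecs.
CMAP_TO_CODEC = {
    "GB-EUC-H": "gb2312",
    "GBK-EUC-H": "gbk",
    "ETen-B5-H": "big5",
    "90ms-RKSJ-H": "cp932",
    "KSC-EUC-H": "euc_kr",
}

def readable_font_name(raw: bytes, cmap_name: str) -> str:
    """Decode a font name with the codec implied by its CMap, else fall back to #XX hex."""
    codec = CMAP_TO_CODEC.get(cmap_name)
    if codec is not None:
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            pass  # the name is not valid in that encoding; use the fallback below
    # Fallback: keep printable ASCII, show every other byte as #XX.
    return "".join(chr(b) if 0x21 <= b <= 0x7E else f"#{b:02X}" for b in raw)

# Assumed CMap name, for illustration only.
print(readable_font_name(b"\xb7\xc2\xcb\xce_GB2312", "GB-EUC-H"))
```

With an unrecognised CMap such as Identity-H, the function falls back to the hex form, so the message stays readable even when the guess cannot be made.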
(comment 6) Yes, the screenshot is correct. The first one is "Kai Ti GB2312" ("Kai" style), the second one is "Song Ti" (Song/Sun style). The person who made the PDF file most probably had simkai and simsun (both Microsoft fonts; the second is more common and you should have it on Win2k/XP if you select Far East support, while the first is only found in localised Simplified Chinese Windows or Office), so if you are providing a configuration for your client, simkai.ttf/simkai.ttc for the first one and simsun.ttf/simsun.ttc for the second one would be appropriate. Incidentally, you should also get reasonably authentic results with the Arphic fonts gbsn00lp.ttf for Song/Sun style and gkai00mp.ttf for Kai style (both are GPL and can be found on ftp.gnu.org). The "s" in gbsn is for Song/Sun, and g* is for GB (Simplified Chinese).
Regarding Comment #5, "/#B7#C2#CB#CE_GB2312 is Fangsong_GB2312 in GB2312 encoding.": Dear Hin-Tak Leung, I'm unclear what you want to get in this case. Ghostscript sends bytes to stdout, and it's the viewer's responsibility to represent them correctly on screen. We can't send the entire stdout in a Chinese encoding or in Unicode when interpreting PostScript, because that does not comply with the PostScript specification. In the case of PDF interpretation, sending Chinese or Unicode is not prohibited, but it's too hard to do with Ghostscript, which implements the PDF interpreter in PostScript. Please note Ghostscript doesn't send the sequence "/����.". It looks as though the viewer application (that you use to view the text) misunderstood the encoding. I guess it happens because it treats non-ASCII characters as something unusual. What we can do here is encode non-ASCII characters with PDF name encoding, as in /#B7#C2#CB#CE_GB2312, and a viewer should convert them properly. We can also provide the name of the CMap in the message, to help the viewer convert it properly. We can also send an enhancement request to the GSView manufacturer to improve the message handling. Is this enough to close this bug? Have you got other suggestions?
Ralph filed the original report - what would be considered 'fail more gracefully'? The error message is just looking for a named font and can't find a substitute.
The error message only says it couldn't find a substitution font if you understand PostScript. For most people it just says "Error, no document for you!" For a more graceful failure I'd suggest: Warning: Could not find a substitute font for 揩体_GB2312. This would be followed by rendering the document without the characters in that font.
Sorry, I meant Warning: Could not find a substitute font for 楷体_GB2312. Note the 体 entities are a bugzilla issue; I'm pasting in the unicode glyphs.
Ralph, I'm unable to guess where you took 体 from, how it constitutes a Bugzilla issue, which Bugzilla issue it is, or why you mention Unicode. So I'm unable to understand what you meant at all.
Created attachment 4674 [details] patch.txt As far as I can figure out, the user wants a better message instead of "undefined in findresource", so here is a patch for that.
Patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2009-January/008906.html Closing now.