688770 – /undefinedresource in --findresource--

Summary: /undefinedresource in --findresource--

Status:	NOTIFIED FIXED

Alias:	None

Product:	Ghostscript
Classification:	Unclassified
Component:	PDF Interpreter
Version:	8.54
Hardware:	PC All

Importance:	P2 normal
Assignee:	leonardo

URL:
Keywords:	bountiable

Depends on:
Blocks:

Reported:	2006-06-26 20:08 UTC by Ralph Giles
Modified:	2009-01-20 17:23 UTC
CC List:	2 users

See Also:
Customer:	581
Word Size:	---

Attachments
688770-acrobatfontlist.tiff (122.02 KB, image/tiff) 2006-06-28 10:00 UTC, Ralph Giles
patch.txt (1.48 KB, patch) 2009-01-03 07:10 UTC, leonardo

Description Ralph Giles 2006-06-26 20:08:34 UTC

Customer files fail with:

Substituting CID font resource/Adobe-GB1 for /&#65533;&#65533;&#65533;&#65533;.
Error: /undefinedresource in --findresource--

I (and presumably they) don't have the font, but Ghostscript should fail more
gracefully.

Comment 1 Ralph Giles 2006-06-26 20:09:31 UTC

Created attachment 2304
19773296.pdf

Comment 2 Ralph Giles 2006-06-26 20:10:55 UTC

Created attachment 2305
bqqq005.pdf

Comment 3 Hin-Tak Leung 2006-06-27 17:31:37 UTC

The message is fairly clear ("looking for a specific simplified chinese 
font and couldn't find one"), but I am somewhat curious about
the "/&#65533;&#65533;&#65533;&#65533;" part - this is clearly 
0xfffd 0xfffd 0xfffd 0xfffd, which does not correspond to any common 
simplified chinese font I know of, nor any sensible font name, so it is 
probably a bit wrong.

Incidentally there is a cid-font mapping enhancement bug still open
which I am supposed to work on, and I haven't forgotten it, just 
couldn't find the time...

Comment 4 leonardo 2006-06-28 03:25:24 UTC

19773296.pdf contains a font object with /Subtype /CIDFontType2 and
/BaseFont /#B7#C2#CB#CE_GB2312, but the font file is not embedded. In this case 
Ghostscript creates a substitute CIDFontName, Adobe-GB1, which is composed from 
the encoding. Note there are many possible encodings. The user must provide a 
CID font with that name either in Resource/CIDFont or in lib/cidfmap. The 
observed behavior appears correct, so closing the bug as invalid.

Note #xx syntax in PDF names encodes "unprintable" characters - see the PDF 
spec. 
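
For readers unfamiliar with the #xx syntax, here is a minimal Python sketch of 
how such a name can be unescaped. The helper name decode_pdf_name is 
hypothetical, and this is not Ghostscript code; it only illustrates the 
escaping rule, followed by a GB2312 decode of the resulting bytes.

```python
import re

def decode_pdf_name(name: str) -> bytes:
    """Expand #xx escapes in a PDF name to raw bytes (hypothetical helper)."""
    raw = name.lstrip("/").encode("latin-1")
    # Each '#' followed by two hex digits stands for one byte.
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )

raw = decode_pdf_name("/#B7#C2#CB#CE_GB2312")
# raw now holds the bytes B7 C2 CB CE followed by "_GB2312".
decoded = raw.decode("gb2312")  # assumes the bytes really are GB2312 text
```

Whether GB2312 is the right codec is a guess made from the name suffix; the 
PDF name itself carries no encoding information.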

I can't reproduce the text "/&#65533;&#65533;&#65533;&#65533;.". gswin32c.exe 
prints some non-ASCII characters instead, and they appear correct. If you 
are sure the incorrect text is printed to stderr, please provide more details 
on how to reproduce it: the platform, application, command line, font maps, etc. 
Redirect Ghostscript stdout to a file and attach the file.

Comment 5 Hin-Tak Leung 2006-06-28 04:32:06 UTC

/#B7#C2#CB#CE_GB2312 is Fangsong_GB2312 in GB2312 encoding,
so it probably should have been taken care of with a 
substitution similar to /STFangsong-Light [ /SimSun ].

The comment about &#65533; makes sense - just a cut and paste error somewhere.

I agree the behavior appears to be correct, but as Ralph said, the error message
could be a bit more friendly.

Comment 6 Ralph Giles 2006-06-28 10:00:21 UTC

Created attachment 2312
688770-acrobatfontlist.tiff

Attaching a screenshot of the Document Info:Fonts panel from Adobe Acrobat
7.08. The font names show in Chinese. Are they correct, Hin-Tak?

We're trying to figure out how Acrobat figures out what encoding the font name
is in.

Comment 7 Ray Johnston 2006-06-28 10:04:13 UTC

Along with an improved cidfmap alias, we should probably print non-ASCII
names/strings in hex to avoid some confusion.

Comment 8 leonardo 2006-06-28 10:06:49 UTC

One heuristic for non-ASCII font names is to apply the CMap attached to the 
font to its font name. GB2312 and similar encodings use ASCII codes for ASCII 
characters, and non-ASCII codes for CJK characters, so the recognition should 
be pretty successful.
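
A rough Python sketch of this heuristic follows. The CMap-to-codec table and 
the function name guess_font_name are assumptions made for illustration; 
the Python codecs only approximate what the named CMaps cover, and nothing 
here reflects how Ghostscript itself would implement it.

```python
# Hypothetical mapping from a font's CMap name to a Python codec.
# The codecs are approximations of the encodings behind these CMaps.
CMAP_TO_CODEC = {
    "GB-EUC-H": "gb2312",
    "GBK-EUC-H": "gbk",
    "B5pc-H": "big5",
    "KSC-EUC-H": "euc_kr",
}

def guess_font_name(raw: bytes, cmap_name: str):
    """Decode font-name bytes using the encoding implied by the CMap.

    Returns None when the CMap is unknown or the bytes do not decode,
    so the caller can fall back to printing the raw name.
    """
    codec = CMAP_TO_CODEC.get(cmap_name)
    if codec is None:
        return None
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None
```

ASCII bytes pass through unchanged under all of these codecs, which is why a 
mixed name such as one ending in _GB2312 still decodes cleanly.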

Comment 9 Hin-Tak Leung 2006-06-28 13:05:59 UTC

(comment 6) yes, the screen shot is correct. The first one is "Kai Ti GB2312"
("Kai" style), the 2nd one is "Song Ti" (Song/Sun style).

The person who made the PDF file most probably had simkai and simsun 
(both Microsoft fonts; the 2nd one is more common and you should have it on 
Win2k/XP if you select Far East support, while the 1st one is only found 
in localised Simplified Chinese Windows or Office), so if you are providing 
a configuration for your client, simkai.ttf/simkai.ttc for the first one 
and simsun.ttf/simsun.ttc for the 2nd one would be appropriate.

Incidentally, you should also get reasonably authentic results with
the arphic fonts gbsn00lp.ttf for Song/Sun style, and 
gkai00mp.ttf for Kai style. (both are GPL and can be found on ftp.gnu.org)
The "s" in gbsn is Song/Sun, and g* for GB (simplified chinese).
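
As an illustration of such a configuration, the Python sketch below prints 
two lib/cidfmap records in the record format described in the Ghostscript 
documentation. The font path, the supplement number 2, and the helper names 
are placeholders chosen for the example, not values taken from this report.

```python
def cidfmap_entry(name, path, ordering, supplement, subfont_id=0):
    """Format one TrueType-backed CID font record for lib/cidfmap."""
    return (
        f"/{name} << /FileType /TrueType /Path ({path}) "
        f"/SubfontID {subfont_id} /CSI [({ordering}) {supplement}] >> ;"
    )

def cidfmap_alias(name, target):
    """Format an alias record that points one CID font name at another."""
    return f"/{name} /{target} ;"

# Placeholder path and supplement; adjust to the fonts actually installed.
lines = [
    cidfmap_entry("SimSun", "C:/WINDOWS/Fonts/simsun.ttc", "GB1", 2),
    cidfmap_alias("Adobe-GB1", "SimSun"),
]
print("\n".join(lines))
```

The alias record makes the substitute name Adobe-GB1 resolve to the SimSun 
record, so the substitution reported in the error message would find a font.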

Comment 10 leonardo 2008-12-09 21:43:16 UTC

Regarding Comment #5 "/#B7#C2#CB#CE_GB2312 is Fangsong_GB2312 in GB2312 
encoding.".

Dear Hin-Tak Leung, I'm unclear what you want to get in this case. 
Ghostscript sends bytes to stdout, and it's the viewer's responsibility to 
represent them correctly on screen. We can't send the entire stdout in a Chinese 
encoding or in Unicode when interpreting PostScript because that does not comply 
with the PostScript specification. In the case of PDF interpretation, sending 
Chinese or Unicode is not prohibited, but it's too hard to do with 
Ghostscript, which implements the PDF interpreter in PostScript.

Please note Ghostscript doesn't send the 
sequence "/&#65533;&#65533;&#65533;&#65533;.". It looks as though the viewer 
application (that you use to view the text) misunderstood the encoding. I 
guess it happens because it treats non-ASCII characters as something unusual.

What we can do here is to encode non-ASCII characters with PDF name encoding, 
as in /#B7#C2#CB#CE_GB2312, so that a viewer can convert them properly. We can 
also provide the name of the CMap in the message, to help the viewer convert 
it properly. We can also send an enhancement request to the GSView manufacturer 
to improve the message handling. Is this enough to close this bug? Do you have 
other suggestions?
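
To make the proposed message format concrete, here is a small Python sketch 
of the escaping step. The function name escape_pdf_name is hypothetical, and 
the rule used (escape '#' and every byte outside printable ASCII) is a 
simplification of the PDF name rules, which also cover delimiter characters.

```python
def escape_pdf_name(raw: bytes) -> str:
    """Render name bytes as ASCII, writing other bytes as #XX (hypothetical helper)."""
    out = []
    for b in raw:
        # Keep printable ASCII except '#', which introduces an escape.
        if 0x21 <= b <= 0x7E and b != 0x23:
            out.append(chr(b))
        else:
            out.append(f"#{b:02X}")
    return "/" + "".join(out)

message_name = escape_pdf_name(b"\xb7\xc2\xcb\xce_GB2312")
```

With this escaping the message stays pure ASCII, so a terminal or viewer 
cannot misread the bytes, and a front end that knows the CMap can still 
convert the name back for display.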

Comment 11 Hin-Tak Leung 2008-12-10 00:49:59 UTC

Ralph filed the original report - what would be considered 'fail more gracefully'?
The error message is just looking for a named font and can't find a substitute.

Comment 12 Ralph Giles 2008-12-10 10:35:52 UTC

The error message only says it couldn't find a substitution font if you
understand postscript. For most people it just says "Error, no document for you!"

For a more graceful failure I'd suggest:

Warning: Could not find a substitute font for &#25577;&#20307;_GB2312.

Followed by rendering the document without characters in that font.

Comment 13 Ralph Giles 2008-12-10 10:43:42 UTC

Sorry, I meant

Warning: Could not find a substitute font for &#26999;&#20307;_GB2312.

Note the &#20307; entities are a bugzilla issue; I'm pasting in the unicode glyphs.

Comment 14 leonardo 2009-01-03 06:32:03 UTC

Ralph, I'm unable to guess where you took &#20307 from, how it 
entitles a bugzilla issue and which bugzilla issue it entitles, or why 
you mention Unicode. So I'm unable to understand what you meant at all.

Comment 15 leonardo 2009-01-03 07:10:04 UTC

Created attachment 4674
patch.txt

As far as I can figure out, the user wants a better message instead of "undefined
in findresource", so here is a patch for that.

Comment 16 leonardo 2009-01-06 06:21:38 UTC

Patch to HEAD :
http://ghostscript.com/pipermail/gs-cvs/2009-January/008906.html
Closing now.