The first attached PDF document (which I uncompressed for easier inspection and manipulation) consists solely of the Latin letter A followed by the Greek letter mu, each in an old version of the TimesNewRoman-Italic font (but with two different encoding vectors for the two characters). When Ghostscript is run on it, it displays only the A and complains "Substituting .notdef for mu".

I suspect this is being triggered by something weird in the TimesNewRoman-Italic font: if I dump the tables, there is no psName name="mu" entry although there are entries for all the other lowercase Greek letters, and the problem disappears if another Greek letter besides mu is used. Still, there does seem to be something wrong with Ghostscript itself: the mu glyph IS defined in the font, both Adobe Reader and xpdf render the document correctly, and, most tellingly, Ghostscript itself renders it correctly if the A is removed, or if the order of the A and the mu is simply reversed.

Specifically, Ghostscript correctly renders the second attached PDF document, which a simple diff demonstrates is identical to the first except that the order of the A and the mu has been reversed:

diff -au bad.pdf works.pdf
--- bad.pdf 2009-12-15 13:55:01.000000000 -0500
+++ works.pdf 2009-12-15 14:36:43.000000000 -0500
@@ -3840,18 +3840,18 @@
 1 0 0 1 95.68265 732.14548 cm
 q
 BT
-/Fo0S0 12.00000 Tf
+/Fo0S2 12.00000 Tf
 0.08627 0.07843 0.07451 RG
 0.08627 0.07843 0.07451 rg
 0 Tr
 1.00000 0 0 1.00000 0.00000 -9.12695 Tm
-<44> Tj
-/Fo0S2 12.00000 Tf
+<7D> Tj
+/Fo0S0 12.00000 Tf
 0.08627 0.07843 0.07451 RG
 0.08627 0.07843 0.07451 rg
 0 Tr
 1.00000 0 0 1.00000 7.33008 -9.12695 Tm
-<7D> Tj
+<44> Tj
 ET
 Q
 Q
Created attachment 5778
File where Ghostscript fails to render the mu
Created attachment 5779
File where Ghostscript succeeds in rendering the mu
Digging further into the code, I have found both the problem and a possible patch.

The first time a Font object is encountered for a particular TrueType font, that Font object's encoding vector is used to construct a "prebuilt encoding" vector mapping glyph names to cmap table indices, and this in turn is used to build the CharStrings array. After this, any glyphs mentioned in the post table but not yet included in CharStrings are added directly. Subsequent Font objects for the same TrueType font use their own encoding vector to get a glyph name, but then use the originally constructed CharStrings array to get the actual glyph.

In this case, the font is broken in that the "mu" glyph is not mentioned in the post table. (Maybe as a way of placating broken apps that were getting it when they really wanted mu1? Who knows.) This means that, if mu is not included in the encoding vector of the FIRST Font object, it does not get added to the CharStrings array and hence isn't rendered, even though it does appear in the encoding vector of the SECOND Font object. However, since other PDF readers handle this case fine, it is a problem that Ghostscript doesn't.

The attached patch fixes this problem, at least for cmap-3,1 fonts. In this case, AdobeGlyphList is already being used to build the glyph-name-to-cmap-table-index mapping, except for glyphs not known to AdobeGlyphList, in which case the original encoding vector's value is treated as the cmap table index. The patch simply augments this behaviour by also adding any glyph names that are mentioned in AdobeGlyphList and present in the cmap table but not mentioned in the original encoding vector or the post table; that way, if such a glyph name is used in a later encoding vector, it can be properly found.

A similar patch would presumably be needed for the code branch dealing with cmap types other than 3,1, using the list of glyph names appropriate to that particular cmap type.
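To make the failure mode above concrete, here is a minimal Python model of it. This is only a sketch and not Ghostscript code: the function names, the glyph indices and the three-entry tables are all made up for illustration, and the two encodings (0x44 for A, 0x7D for mu) are inferred from the diff in the original report.

```python
# Hypothetical, simplified model; none of these names come from Ghostscript.

# Toy excerpt of the Adobe Glyph List: glyph name -> Unicode code point.
AGL = {"A": 0x0041, "mu": 0x00B5, "alpha": 0x03B1}

# Toy cmap (3,1) table: Unicode code point -> glyph index (indices invented).
CMAP = {0x0041: 36, 0x00B5: 140, 0x03B1: 210}

# Toy post table: glyph index -> glyph name. The entry for "mu" is missing,
# mirroring the broken font described in this report.
POST = {36: "A", 210: "alpha"}


def build_charstrings(first_encoding, augment_from_agl=False):
    """Build glyph name -> glyph index from the FIRST Font object's encoding."""
    charstrings = {}
    for name in first_encoding.values():
        code = AGL.get(name)
        if code in CMAP:
            charstrings[name] = CMAP[code]
    # Add glyphs mentioned in the post table but not yet included.
    for index, name in POST.items():
        charstrings.setdefault(name, index)
    if augment_from_agl:
        # Idea of the proposed patch: also add every AGL name whose
        # code point is present in the cmap table.
        for name, code in AGL.items():
            if code in CMAP:
                charstrings.setdefault(name, CMAP[code])
    return charstrings


def lookup(charstrings, encoding, char_code):
    """Resolve a character code through a (possibly later) encoding vector."""
    name = encoding.get(char_code, ".notdef")
    return charstrings.get(name)  # None models "Substituting .notdef"


enc_a = {0x44: "A"}    # encoding of the Font object used for the A
enc_mu = {0x7D: "mu"}  # encoding of the Font object used for the mu

bad = build_charstrings(enc_a)       # A first, as in bad.pdf
works = build_charstrings(enc_mu)    # mu first, as in works.pdf
patched = build_charstrings(enc_a, augment_from_agl=True)
```

In this model, lookup(bad, enc_mu, 0x7D) finds nothing because "mu" never entered the table, while the same lookup against works or patched finds the glyph. The A still resolves in works because it is picked up from the post table.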
Created attachment 5792
Proposed patch to allow Ghostscript to render both test files properly
Thank you for using and contributing to Ghostscript.

The problem has been fixed in a different way. Old versions of Ghostscript cached the font instance on the font descriptor, but a font descriptor may be shared between fonts with different encodings. Since rev. 11148, the font is cached on the PDF font resource instead. See bug 690714 for the patch.

*** This bug has been marked as a duplicate of bug 690714 ***
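For readers comparing the two approaches, the difference in cache key can be modelled roughly as follows. This is a hypothetical Python sketch, not the code from rev. 11148: the dictionary layout, the "FD1" descriptor name and the helper functions are assumptions, and it assumes the two Font objects in the test file share one font descriptor.

```python
# Hypothetical model of the two caching strategies; names are illustrative.

def build_font_instance(font_resource):
    """Stand-in for the expensive step that bakes the encoding into the font."""
    return {"glyph_names": set(font_resource["encoding"].values())}


def get_font(cache, font_resource, key_field):
    """Return a cached font instance, building it on first use of the key."""
    key = font_resource[key_field]
    if key not in cache:
        cache[key] = build_font_instance(font_resource)
    return cache[key]


# Two PDF font resources with different encodings (assumed to share "FD1").
fo0s0 = {"id": "Fo0S0", "descriptor": "FD1", "encoding": {0x44: "A"}}
fo0s2 = {"id": "Fo0S2", "descriptor": "FD1", "encoding": {0x7D: "mu"}}

# Old behaviour: keyed on the descriptor, so Fo0S2 reuses Fo0S0's instance.
old_cache = {}
get_font(old_cache, fo0s0, "descriptor")
old_instance = get_font(old_cache, fo0s2, "descriptor")

# New behaviour: keyed on the font resource, so each encoding gets its own.
new_cache = {}
get_font(new_cache, fo0s0, "id")
new_instance = get_font(new_cache, fo0s2, "id")
```

With the descriptor as key, old_instance knows only the glyph names from the first encoding; with the font resource as key, new_instance is built from the second encoding and includes "mu".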