691809 – Regression: some PDF files produced by GhostPCL r11913 cannot be read by Acrobat 8.2.5

Status:	NOTIFIED FIXED

Alias:	None

Product:	Ghostscript
Classification:	Unclassified
Component:	PDF Writer
Version:	master
Hardware:	PC All

Importance:	P2 normal
Assignee:	Ken Sharp

URL:
Keywords:

Depends on:
Blocks:

Reported:	2010-12-01 20:13 UTC by Marcos H. Woehrmann
Modified:	2011-10-02 02:35 UTC
CC List:	1 user

See Also:
Customer:	460
Word Size:	---

Attachments
screenshot.png (110.21 KB, image/png) 2010-12-01 20:13 UTC, Marcos H. Woehrmann

Description Marcos H. Woehrmann 2010-12-01 20:13:36 UTC

Created attachment 6980
screenshot.png

Starting with r11913, some GhostPCL-produced PDF files are not read correctly by Acrobat 8.2.5 (see the attached screenshot.png for an example). Acrobat 9.4.1 reads the files correctly, as does Ghostscript 9.00.

The command line I'm using:

  main/obj/pcl6 -sDEVICE=pdfwrite -sOutputFile=test.pdf -dNOPAUSE ./20080105.problem.pcl

Comment 2 Ray Johnston 2010-12-01 20:30:13 UTC

Unless the PDF violates the spec, I recommend closing this as WONTFIX.

That being said, Acrobat has been VERY sloppy about opening PDF files that
don't meet their own specs, causing no end of grief for readers that work
from the spec.

Comment 3 Ray Johnston 2010-12-01 20:32:45 UTC

Marcos, please attach the output PDF -- I'd like to survey all of my old
Acrobat readers and other readers to see which have problems with it.

Also, why not simply suggest that they upgrade to 9.0 (Acrobat or Ghostscript)
reader?

Comment 5 Marcos H. Woehrmann 2010-12-01 21:28:46 UTC

(In reply to comment #3)
> Also, why not simply suggest that they upgrade to 9.0 (Acrobat or Ghostscript)
> reader?

Acrobat 8 is supported by Adobe until 11/03/11, and I presume we will attempt to maintain interoperability until at least then. Or are we prepared to tell users who discover a bug in Ghostscript running on Windows XP to upgrade to Windows 7?

Comment 6 Ray Johnston 2010-12-01 21:47:36 UTC

"supported by Adobe" ???

We should test Adobe's support by reporting the bug that Michael found before
assuming that Adobe 'support' means anything.

Note that I _have_ done tests of Acrobat 5, 7, 8 and 9 in the past (similar
to our regression testing) and they break almost as many files as they fix
from one release to the next.

Also, I'll stand on my initial comment here: If we are generating an invalid
PDF, then, fine, we should fix it.

If not, tell the customers/users to upgrade to Acrobat 9 or go to Adobe for
support.

Comment 7 Ray Johnston 2010-12-01 22:02:15 UTC

Marcos, thanks for attaching this file. I don't seem to have Acrobat 8,
but this file works fine with Acrobat 7 and 9, Foxit (version 4.3, I think),
PDFXChange Viewer 2.0 (Free) and SumatraPDF based on some old mupdf.

I have Acrobat 4 and 5 installed on another system and I will test that.

Comment 8 Ken Sharp 2010-12-02 08:05:29 UTC

(In reply to comment #7)
> Marcos, thanks for attaching this file. I don't seem to have Acrobat 8,
> but this file works fine with Acrobat 7 and 9, Foxit (version 4.3, I think),
> PDFXChange Viewer 2.0 (Free) and SumatraPDF based on some old mupdf.
> 
> I have Acrobat 4 and 5 installed on another system and I will test that.

I tried them yesterday: 5 works, 4 doesn't. In fact, every version from 4 to 9 works (on Windows) except 4 and 8.

I believe the PDF file is valid. Ghostscript is generally more demanding than Acrobat about things like BoundingBox entries (I spent some effort getting this correct with GS after I got r11913 working with Acrobat), so I think these are generally correct.

I believe the problem is 'probably' the zero advance width argument supplied to setcachedevice (d1), though obviously I can't say for sure. To be honest, the only thing that changed with r11913 is that glyphs end up in a larger number of fonts, and are more likely to have 'sensible' character codes.


I believe this is the same customer who asked for revision 11913 to be done, categorising the previous behaviour as a bug (which it isn't). I'm closing this as WONTFIX; if the customer insists on it being 'fixed' then I'll unroll revision 11913 and revert to the original behaviour (unsearchable text).

If this is a bug, it's an *Adobe* bug, as is obvious from the fact that various versions of Acrobat behave differently. If Adobe are supporting Acrobat 8 then I suggest the customer reports it as a bug to Adobe and gets them to fix it.

Comment 9 Ken Sharp 2010-12-02 14:05:44 UTC

The actual problem seems to be that we are emitting a font where the Encoding is WinAnsiEncoding with no Differences. For some bizarre reason this causes Acrobat 4 (and presumably 8 as well, though I can't test that) to not display the glyphs.

This occurs with the current code because we now map the bitmap glyphs to regular ASCII character codes in order to make them searchable. Before that change we used non-standard character codes and so had to emit a complete Encoding array, which is why the problem did not arise.

So the 'problem' is a direct result of moving to ASCII codes in order to get searchable text. We could work round this by forcing pdfwrite to emit a complete Encoding for type 3 fonts, even if the font is compatible with a standard Encoding, but this is wasteful.

I don't really see the point in making all PDF files larger simply in order to work around a bug in a couple of old versions of Acrobat.
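For illustration only, the pattern described above can be expressed as a small check. This is a sketch, not pdfwrite code: it assumes a font dictionary has already been parsed into a plain Python dict (the `font` shape and the function name `triggers_acrobat_bug` are hypothetical), and it treats an Encoding dictionary whose BaseEncoding is WinAnsiEncoding but which lacks Differences the same as the bare name, which is an assumption rather than something tested in this report.

```python
def triggers_acrobat_bug(font: dict) -> bool:
    """Return True if `font` matches the pattern suspected of breaking
    Acrobat 4 and 8: a Type 3 font using WinAnsiEncoding with no Differences.

    `font` is a hypothetical parsed font dictionary, for example
    {"Subtype": "Type3", "Encoding": "WinAnsiEncoding"}.
    """
    if font.get("Subtype") != "Type3":
        return False

    encoding = font.get("Encoding")

    # Case 1: the Encoding is just the predefined encoding name.
    if encoding == "WinAnsiEncoding":
        return True

    # Case 2 (assumption): an Encoding dictionary based on WinAnsiEncoding
    # that carries no Differences array is treated as equally suspect.
    if isinstance(encoding, dict):
        based_on_winansi = encoding.get("BaseEncoding") == "WinAnsiEncoding"
        return based_on_winansi and not encoding.get("Differences")

    return False
```

A font whose Encoding dictionary does carry a Differences array, such as `{"BaseEncoding": "WinAnsiEncoding", "Differences": [65, "A"]}`, would not be flagged.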

Comment 10 Ray Johnston 2010-12-02 17:15:57 UTC

An ASCII Encoding only needs to exist in the PDF file once -- the 'object' can
be referenced for every Type3 font, right ? It's not like it is very big --
just 96 entries ? (showing my ignorance if printable ASCII is more than 95 or
96 glyphs).

Comment 11 Ken Sharp 2010-12-02 17:35:58 UTC

(In reply to comment #10)
> An ASCII Encoding only needs to exist in the PDF file once -- the 'object' can
> be referenced for every Type3 font, right ? It's not like it is very big --
> just 96 entries ? (showing my ignorance if printable ASCII is more than 95 or
> 96 glyphs).

We could embed an ASCII-compliant Encoding (more accurately, a complete copy of WinAnsiEncoding would be required). It would be 256 entries because it's WinAnsi, not actually ASCII. Also, each entry consists of the name of the glyph, so it's reasonably big, especially as a fixed overhead on small files.

Given that the only reason for the existence of the pre-defined Encodings is so that you don't have to emit them, Adobe obviously thought this was worthwhile.

We'd need to do it for all fonts, or at least all type 3 fonts; I don't know if this Acrobat bug affects other font types. The pain from my point of view is that it means lots of bookkeeping in pdfwrite. Whenever writing out a type 3 font, I'd need to know whether I'd already written a copy of the WinAnsi encoding. If not, I'd have to write it first; either way, all the fonts would then have to reference the emitted Encoding.

Currently we either use a pre-defined Encoding or a custom Encoding written directly to the FontDescriptor. I'd need to change the code to use a reference to a prior object. So changes to font structures, more logic in the resource emission and always the probability of introducing a bug.

Is it possible? Yes, of course. Easy? No. All to work around a bug in Acrobat...

It would be a lot easier to unroll the revision that added 'searchability'.
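The bookkeeping described in this comment (write the shared Encoding once, remember its object number, have every type 3 font reference it) might look roughly like the following. This is a toy sketch under stated assumptions, not how pdfwrite is structured: `PdfObjectWriter` and its methods are hypothetical names, the font dictionary is heavily abbreviated, and the caller is assumed to supply the full list of 256 WinAnsi glyph names.

```python
class PdfObjectWriter:
    """Toy stand-in for an object emitter (hypothetical, not pdfwrite)."""

    def __init__(self):
        self.objects = []              # object bodies; index + 1 is the object number
        self._shared_encoding = None   # object number of the shared Encoding, once written

    def add_object(self, body: str) -> int:
        """Store an object body and return its 1-based object number."""
        self.objects.append(body)
        return len(self.objects)

    def shared_encoding_ref(self, glyph_names) -> int:
        """Emit the full Encoding on first use, then reuse the same object."""
        if self._shared_encoding is None:
            names = " ".join("/" + name for name in glyph_names)
            body = f"<< /Type /Encoding /Differences [ 0 {names} ] >>"
            self._shared_encoding = self.add_object(body)
        return self._shared_encoding

    def add_type3_font(self, glyph_names) -> int:
        """Emit an abbreviated Type 3 font that references the shared Encoding."""
        encoding = self.shared_encoding_ref(glyph_names)
        # A real Type 3 font also needs FontBBox, FontMatrix, CharProcs, etc.
        body = f"<< /Type /Font /Subtype /Type3 /Encoding {encoding} 0 R >>"
        return self.add_object(body)
```

Writing two fonts through the same writer produces one Encoding object and two fonts that both point at it, which is the "exists once, referenced by every font" arrangement suggested in comment #10. The extra state (`_shared_encoding`) is the bookkeeping cost being objected to here.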

Comment 12 Ken Sharp 2010-12-02 18:06:21 UTC

Actually....

I may have thought of an easy way to get an Encoding written, and only for type 3 bitmapped fonts written via the cache, so it would limit the likelihood of a problem occurring.

I'll try it out later, or tomorrow.

Comment 13 Ken Sharp 2010-12-03 09:23:24 UTC

Marcos tested my workaround, and it seems that Acrobat 8's problem is the same one as Acrobat 4's. If a type 3 font (possibly only a type 3 bitmapped font) has a WinAnsi Encoding and no Differences, then these versions of Acrobat simply don't render glyphs in that font.

Revision 11931 works around the Acrobat bug by forcing the emission of a /Differences entry for each glyph that gets used. This is (of course) inefficient, but it does resolve the problem for these buggy versions of Acrobat.

Patch here:
http://ghostscript.com/pipermail/gs-cvs/2010-December/011984.html
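As a rough illustration of what "a /Differences entry for each glyph that gets used" means in the output, the sketch below builds a Differences array from the character codes a font actually uses. It is not the code from revision 11931: the function names are hypothetical, and the use of /BaseEncoding /WinAnsiEncoding in the surrounding dictionary is an assumption. The array layout itself follows the PDF specification, where a code is followed by the glyph names for that code and the consecutive codes after it.

```python
def build_differences(used: dict) -> str:
    """Build the text of a PDF /Differences array.

    `used` maps character codes (0-255) to glyph names for the glyphs
    actually used. Consecutive codes share one leading code number.
    """
    parts = []
    previous = None
    for code in sorted(used):
        if previous is None or code != previous + 1:
            parts.append(str(code))      # start a new run at this code
        parts.append("/" + used[code])   # glyph name for the current code
        previous = code
    return "[" + "".join(" " + part for part in parts) + " ]"


def type3_encoding(used: dict) -> str:
    """Encoding dictionary text with an explicit Differences array.

    Assumption: the base encoding is kept as WinAnsiEncoding.
    """
    return ("<< /Type /Encoding /BaseEncoding /WinAnsiEncoding "
            f"/Differences {build_differences(used)} >>")
```

For a font that uses only "A", "B" and "H", `build_differences({65: "A", 66: "B", 72: "H"})` gives `[ 65 /A /B 72 /H ]`. The entries are redundant with WinAnsiEncoding, which is the inefficiency noted above, but their presence is what the affected Acrobat versions appear to need.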

Comment 14 Marcos H. Woehrmann 2010-12-04 22:28:27 UTC

(In reply to comment #6)
> "supported by Adobe" ???
> 
> We should test Adobe's support by reporting the bug that Michael found before
> assuming that Adobe 'support' means anything.
> 

I'm pretty sure we don't have a support contract with Adobe, but if you think we should sign up I'm game to try it (it's $175 for 5 incidents, but the good news is that "If an incident is verified as a bug, your available incident balance will not be affected.").

Comment 15 Ray Johnston 2010-12-05 00:58:52 UTC

I think we _should_ sign up for the 5 incident service at $175. If nothing else,
we can find out whether they consider the problem Ken just worked around for
Acrobat 4 and 8 to be a bug in 8, and either have them fix it or tell
us why it isn't a bug per the PDF spec.

Also, Michael Vrhel has an issue that we have told customers _is_ a bug in
Acrobat transparency. It would be nice to get their buy-in on that.