Bug 689972

Summary:	Regression: Ghostscript can't read Ghostscript produced pdf file (file Bug689516.pdf)
Product:	Ghostscript	Reporter:	Marcos H. Woehrmann <marcos.woehrmann>
Component:	PDF Interpreter	Assignee:	Ken Sharp <ken.sharp>
Status:	RESOLVED FIXED
Severity:	normal	CC:	alex, mlungu777
Priority:	P3
Version:	master
Hardware:	PC
OS:	Linux
Customer:		Word Size:	---

Description Marcos H. Woehrmann 2008-07-15 15:48:52 UTC

The attached PDF file, when written to a PDF by Ghostscript using the pdfwrite
device, can't be read by Ghostscript.  Acrobat Reader 8.0 reads the resulting
PDF file without error, so I assume the problem is in the PDF Interpreter and
not pdfwrite.  I've tested this with gshead (r8842); gs8.62 and earlier produce
an "/invalidfont in --run--" error during the pdfwrite step.

BTW, I may have entered a duplicate bug for this issue earlier, but I can't find
it.  

I'm using the following command lines for testing:

  bin/gs -sDEVICE=pdfwrite -o test.pdf ./Bug689516.pdf
  bin/gs -sDEVICE=ppmraw   -o test.ppm ./test.pdf
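
For repeated testing, the two steps above can be wrapped in a small script. This is only a sketch: the function names are made up for illustration, and it assumes that gs signals failure through a non-zero exit code (as the "exit code 1" in the log below suggests). The gs options are exactly the ones from the command lines above.

```python
import subprocess


def round_trip_commands(src, gs="bin/gs", pdf_out="test.pdf", ppm_out="test.ppm"):
    """Return the two gs command lines: the pdfwrite pass, then the render pass."""
    write_cmd = [gs, "-sDEVICE=pdfwrite", "-o", pdf_out, str(src)]
    read_cmd = [gs, "-sDEVICE=ppmraw", "-o", ppm_out, pdf_out]
    return write_cmd, read_cmd


def run_round_trip(src, **kwargs):
    """Run both passes in order.

    Returns (step name, exit code) for the first step that fails,
    or None if both steps exit with code 0.
    """
    steps = zip(("pdfwrite", "render"), round_trip_commands(src, **kwargs))
    for name, cmd in steps:
        returncode = subprocess.run(cmd).returncode
        if returncode != 0:
            return name, returncode
    return None
```

Calling run_round_trip("./Bug689516.pdf") from the Ghostscript build directory would then report which of the two steps failed, if any.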

With -sPDFDEBUG Ghostscript produces the following while reading test.pdf:
.
.
.
%Resolving: [38 0]
%Resolving: [27 0]
<<
/BaseFont /RYGZUL+80000006 /FontDescriptor 28 0 R
/Type /Font /FirstChar 32 /LastChar 54 /Widths [
278 556 667 556 556 500 556 500 1015 556 191 333 556 355 556 556 737 556 667 222
333 500 500 ]
/Encoding 58 0 R
/Subtype /TrueType >>
endobj
%Resolving: [28 0]
<<
/Type /FontDescriptor /FontName /RYGZUL+80000006 /FontBBox [
-9 -210 992 729 ]
/Flags 131076 /Ascent 729 /CapHeight 728 /Descent -210 /ItalicAngle 0 /StemV 148
/MissingWidth 750 /XHeight 728 /FontFile2 48 0 R
>>
endobj
%Resolving: [48 0]
<<
/Filter /FlateDecode /Length1 15388 /Length 9570 >>
stream
%FilePosition: 461818
endobj
%Resolving: [28 0]
%Resolving: [58 0]
<<
/Type /Encoding /BaseEncoding /WinAnsiEncoding /Differences [
33 /one /S /p /a /c /e /s /at /nine /quotesingle /hyphen /zero /quotedbl /three
/two /copyright /seven /K /i /r /k /y ]
>>
endobj
Error: /rangecheck in --run--
Operand stack:
   --nostringval--   --dict:8/17(L)--   R27   9.6   FontObject
   --dict:9/18(L)--   --dict:9/18(L)--   268358   --dict:9/18(L)--
   --nostringval--   false   (\000\003\000\001)
(\000\272\001-\001\214\002\022\002\236\002\350\003\200\004t\005\017\005\316\006\200
\006\363\007\213\b9\t\b\t\274\nT\013\020\013\203\fM\000\000\000\002\000\001\000\000\000\000\000\024\000\003\000\000\000\000\001\032\000\000
\001\006\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\003\b\017\025\020\021\022\027\r\f\005\006\007\004\n\t\031\013\016\023\026\024\030\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000...)
(\000\272\001-\001\214\002\022\002\236\002\350\003\200\004t\005\017\005\316\006\200\006\363\007\213\b9\t\b\t\274\nT\013\020
\013\203\fM\000\000\000\002\000\001\000\000\000\000\000\024\000\003\000\000\000\000\001\032\000\000\001\006\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\003\b\017\025\020\021
\022\027\r\f\005\006\007\004\n\t\031\013\016\023\026\024\030\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000...)
  356   8
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--
   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--
   --nostringval--   false   1   %stopped_push   1905   1   3   %oparray_pop
   1904   1   3   %oparray_pop   1888   1   3   %oparray_pop
   --nostringval--   --nostringval--   2   1   1   --nostringval--
   %for_pos_int_continue   --nostringval--   --nostringval--
   --nostringval--   --nostringval--   %array_continue   --nostringval--   false
   1   %stopped_push   --nostringval--   %loop_continue
   --nostringval--   --nostringval--   --nostringval--   --nostringval--
   --nostringval--   --nostringval--   --nostringval--   --nostringval--
   --nostringval--   45   1   300   --nostringval--
   %for_pos_int_continue   --nostringval--
Dictionary stack:
   --dict:1148/1684(ro)(G)--   --dict:1/20(G)--   --dict:75/200(L)--
   --dict:75/200(L)--   --dict:106/127(ro)(G)--   --dict:275/300(ro)(G)--
   --dict:22/25(L)--   --dict:4/6(L)--   --dict:21/40(L)--
   --dict:5/8(L)--   --dict:8/8(L)--   --dict:40/50(ro)(G)--
   --dict:26/40(L)--
Current allocation mode is local
Last OS error: 2
GPL Ghostscript SVN PRE-RELEASE 8.63: Unrecoverable error, exit code 1
marcos@amd64:[309]%

Comment 1 Marcos H. Woehrmann 2008-07-15 15:49:45 UTC

Created attachment 4223 [details]
Bug689516.pdf

Comment 2 Alex Cherepanov 2008-07-16 04:13:10 UTC

I don't think that this is a regression.
The source PDF is broken - correct results cannot be guaranteed.

V. 8.62 fails during the pdfwrite run and generates an incomplete PDF file, but
the generated file is a correct PDF, accepted by AR5 and Ghostscript.

HEAD completes the pdfwrite run but generates an incorrect file, rejected by
AR5 and Ghostscript.

So Ghostscript needs to improve its error tolerance to the AR8 level,
and generate a correct PDF in the first place.

Comment 3 Marcos H. Woehrmann 2008-07-16 07:42:29 UTC

I agree that this isn't a regression.  It's one of the nightly regression files
that's failing, so that's why the summary says "Regression".

Comment 4 Ray Johnston 2008-07-17 09:52:06 UTC

Assigning to Ken (as P3) to improve pdfwrite so we don't write a file that we cannot then read.

If Alex wants to improve the robustness of our PDF interp to be able to open this, then fine.

Comment 5 Ken Sharp 2009-01-09 04:07:38 UTC

I'm not certain what the problem is, exactly. It seems to be the embedded
TrueType font, which is broken in the original PDF file. When we write it to the
output PDF file it's still broken.

It seems that the fact that we have additionally stripped out some tables and
re-organised the table order is also causing GS some problems (Acrobat,
version 5 or higher, also complains about the font).

I don't think it's possible to detect the damaged TrueType font in pdfwrite, not
without adding a considerable amount of validation code to check the consistency
of TrueType fonts when we write them out, thereby duplicating a lot of stuff in
the existing TrueType interpreter. We don't seem to have any 'bad font' flags.
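
To give an idea of the cheapest level of such validation, here is a sketch in Python (not Ghostscript code, and not what gdevpsft.c does). It only reads the sfnt table directory and checks that expected tables are present and lie inside the font data. The REQUIRED list and all function names are assumptions chosen for illustration; checking the contents of each table would be the expensive part described above.

```python
import struct

# Tables a TrueType-outline font generally needs. Which subset a given PDF
# consumer insists on is an assumption here, not taken from Ghostscript.
REQUIRED = ("head", "hhea", "maxp", "loca", "glyf", "hmtx")


def read_table_directory(data):
    """Return {tag: (offset, length)} from an sfnt header, in directory order."""
    if len(data) < 12:
        raise ValueError("too short for an sfnt header")
    # Header: uint32 version, uint16 numTables, then three uint16 search fields.
    _version, num_tables = struct.unpack(">IH", data[:6])
    if len(data) < 12 + 16 * num_tables:
        raise ValueError("table directory is truncated")
    tables = {}
    for i in range(num_tables):
        # Each record: 4-byte tag, uint32 checksum, uint32 offset, uint32 length.
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i)
        tables[tag.decode("latin-1")] = (offset, length)
    return tables


def check_font(data):
    """Return a list of problems found; an empty list means none were found."""
    try:
        tables = read_table_directory(data)
    except ValueError as exc:
        return [str(exc)]
    problems = [f"missing '{tag}' table" for tag in REQUIRED if tag not in tables]
    for tag, (offset, length) in tables.items():
        if offset + length > len(data):
            problems.append(f"'{tag}' table runs past end of font data")
    return problems
```

A font that passes this check can still be broken inside individual tables, which is the case the paragraph above is worried about.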

If Alex can tell me why the pdfwrite output causes an error, and the original
PDF file doesn't, I'll try and recode the TT font embedding in gdevpsft.c so
that the output doesn't cause GS a problem either. As I said, I'm a bit in the
dark as to what the actual source of the problem is...

Given that the original file *is* broken, and we emit plenty of warnings about
it, I'm not convinced that we should go to great lengths to address this.

FWIW I'm assuming that the difference in behaviour is due to a fix in the PDF/TT
interpreters which allows the original broken font to be read, where previously
we threw an error.

Comment 6 Ralph Giles 2009-02-02 11:47:17 UTC

In addition to a lot of warnings about incomplete 'post' and 'loca' tables, with
r9431 I also see:

> Failed to interpret TT instructions in font 80000006. Continue ignoring
> instructions of the font.
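
For readers unfamiliar with the 'loca' table: it is the index of glyph offsets into 'glyf', with one more entry than the font has glyphs. The sketch below shows the kind of consistency check that can flag an incomplete 'loca'. It is an illustration in Python under stated assumptions, not the check Ghostscript performs; the function name and the wording of the messages are invented, and the caller is assumed to have already read numGlyphs from 'maxp' and indexToLocFormat from 'head'.

```python
import struct


def check_loca(loca, glyf_length, num_glyphs, long_format):
    """Return a list of problems found in a raw 'loca' table.

    loca        -- raw bytes of the 'loca' table
    glyf_length -- length in bytes of the 'glyf' table
    num_glyphs  -- numGlyphs from 'maxp'
    long_format -- True if head.indexToLocFormat is 1 (uint32 entries),
                   False if it is 0 (uint16 entries holding offset / 2)
    """
    entry_size = 4 if long_format else 2
    expected = num_glyphs + 1  # the extra entry marks the end of the last glyph
    available = len(loca) // entry_size
    problems = []
    if available < expected:
        problems.append(f"loca has {available} entries, expected {expected}")
    count = min(available, expected)
    fmt = ">%d%s" % (count, "I" if long_format else "H")
    offsets = struct.unpack_from(fmt, loca, 0)
    if not long_format:
        offsets = [o * 2 for o in offsets]  # short format stores offset / 2
    previous = 0
    for gid, offset in enumerate(offsets):
        if offset < previous:
            problems.append(f"entry {gid}: offset {offset} goes backwards")
        if offset > glyf_length:
            problems.append(f"entry {gid}: offset {offset} is past end of glyf")
        previous = offset
    return problems
```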

Comment 7 Ken Sharp 2009-02-03 00:33:38 UTC

Yes, these are just more warnings that the font is broken.

Unfortunately the way pdfwrite works with TT fonts at the moment, when we write
the font into the PDF file we don't reconstruct all of it from first principles
(which would probably 'fix' this). 

Instead we reconstruct some bits and copy others; in particular, we copy all the
glyph descriptions. The result is that the PDF file we make has a TrueType font
which is still broken, but broken in a different way from the original PDF file.
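
As an illustration of "reconstruct some bits, copy others", the sketch below (Python, not the gdevpsft.c code; all names are invented) writes a fresh sfnt header and table directory with recomputed checksums, while the table bodies themselves are copied byte for byte. Any damage inside a copied table, such as a bad glyph description, therefore survives into the new font, just inside a different container. The sketch deliberately leaves head.checkSumAdjustment alone.

```python
import struct


def table_checksum(data):
    """Sum the table as big-endian uint32 words, zero-padded, modulo 2**32."""
    padded = data + b"\0" * (-len(data) % 4)
    words = struct.unpack(">%dI" % (len(padded) // 4), padded)
    return sum(words) & 0xFFFFFFFF


def build_sfnt(tables, version=0x00010000):
    """Assemble a font from {tag: bytes}, writing a fresh table directory.

    Table contents are copied unchanged; head.checkSumAdjustment is not
    updated, and nothing inside any table is repaired.
    """
    if not tables:
        raise ValueError("need at least one table")
    tags = sorted(tables)  # directory records are sorted by tag
    n = len(tags)
    entry_selector = n.bit_length() - 1        # floor(log2(n))
    search_range = 16 * (1 << entry_selector)  # largest power of two <= n, times 16
    range_shift = 16 * n - search_range
    header = struct.pack(">IHHHH", version, n, search_range,
                         entry_selector, range_shift)
    offset = 12 + 16 * n  # first table starts right after the directory
    directory = b""
    body = b""
    for tag in tags:
        data = tables[tag]
        directory += struct.pack(">4sIII", tag.encode("latin-1"),
                                 table_checksum(data), offset, len(data))
        padded = data + b"\0" * (-len(data) % 4)  # keep tables 4-byte aligned
        body += padded
        offset += len(padded)
    return header + directory + body
```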

I still need to try and work out what it is that the PDF interpreter objects to
so strenuously, and try to work around it, but since the original is broken I
don't think it's a priority.

Comment 8 Ken Sharp 2009-02-28 01:18:58 UTC

*** Bug 690310 has been marked as a duplicate of this bug. ***

Comment 9 Ken Sharp 2010-08-03 09:23:11 UTC

At some point this has been 'fixed'. The file produced by pdfwrite can be opened by both Ghostscript and Acrobat. With FreeType, Ghostscript opens the file without comment; with the old code, GS complains but still opens and renders the file.

The underlying problems (broken font in input file, and pdfwrite doing a poor job of creating TT fonts) persist, but the result is much improved.