Summary: | Regression: Ghostscript can't read Ghostscript produced pdf file (file Bug689516.pdf) | | |
---|---|---|---|
Product: | Ghostscript | Reporter: | Marcos H. Woehrmann <marcos.woehrmann> |
Component: | PDF Interpreter | Assignee: | Ken Sharp <ken.sharp> |
Status: | RESOLVED FIXED | | |
Severity: | normal | CC: | alex, mlungu777 |
Priority: | P3 | | |
Version: | master | | |
Hardware: | PC | | |
OS: | Linux | | |
Customer: | | Word Size: | --- |
Description
Marcos H. Woehrmann
2008-07-15 15:48:52 UTC
Created attachment 4223: Bug689516.pdf

I don't think that this is a regression. The source PDF is broken, so correct results cannot be guaranteed. Version 8.62 fails during the pdfwrite run and generates an incomplete PDF file, but the generated file is valid PDF, accepted by AR5 and Ghostscript. HEAD completes the pdfwrite run but generates an incorrect file, rejected by AR5 and Ghostscript. So Ghostscript needs to improve its error tolerance to AR8 level, and generate correct PDF in the first place.

I agree that this isn't a regression. It's one of the nightly regression files that's failing, so that's why the summary says "Regression". Assigning to Ken (as P3) to improve pdfwrite so we don't write a file that we cannot then read. If Alex wants to improve the robustness of our PDF interpreter so that it can open this, then fine.

I'm not certain what the problem is, exactly. It seems to be the embedded TrueType font, which is broken in the original PDF file. When we write it to the output PDF file it's still broken. It seems that the fact that we have additionally stripped out some tables, and reorganised the table order, is also causing GS some problems (Acrobat, version 5 or higher, also complains about the font).

I don't think it's possible to detect the damaged TrueType font in pdfwrite, not without adding a considerable amount of validation code to check the consistency of TrueType fonts when we write them out, thereby duplicating a lot of what is already in the existing TrueType interpreter. We don't seem to have any 'bad font' flags.

If Alex can tell me why the pdfwrite output causes an error, and the original PDF file doesn't, I'll try to recode the TT font embedding in gdevpsft.c so that the output doesn't cause GS a problem either. As I said, I'm a bit in the dark as to what the actual source of the problem is.

Given that the original file *is* broken, and we emit plenty of warnings about it, I'm not convinced that we should go to great lengths to address this.
FWIW I'm assuming that the difference in behaviour is a fix in the PDF/TT interpreters which allow the original broken font to be read, where previously we threw an error.

In addition to a lot of warnings about incomplete 'post' and 'loca' tables, with r9431 I also see:
> Failed to interpret TT instructions in font 80000006. Continue ignoring
> instructions of the font.
Yes, these are just more warnings that the font is broken. Unfortunately, the way pdfwrite works with TT fonts at the moment, when we write the font into the PDF file we don't reconstruct all of it from first principles (which would probably 'fix' this). Instead we reconstruct some parts and copy others; in particular, we copy all the glyph descriptions. The result is that the PDF file we make has a TrueType font which is still broken, but broken in a different way from the original PDF file.

I still need to work out what it is that the PDF interpreter objects to so strenuously, and try to work around it, but since the original is broken I don't think it's a priority.

*** Bug 690310 has been marked as a duplicate of this bug. ***

At some point this has been 'fixed'. The file produced by pdfwrite can be opened by both Ghostscript and Acrobat. When using FreeType, Ghostscript opens the file without comment; when using the old code, GS complains but still opens and renders the file. The underlying problems (broken font in the input file, and pdfwrite doing a poor job of creating TT fonts) persist, but the result is much improved.
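
For anyone wanting to see what even a basic TrueType consistency check involves, here is a minimal sketch of the kind of validation discussed above. It is not Ghostscript code (the font writer lives in C, in gdevpsft.c) and it covers only two things: that every table listed in the sfnt table directory fits inside the font data, and that the 'loca' table is long enough for the glyph count given in 'maxp'. The function names `read_table_directory` and `check_font` are hypothetical, chosen for illustration. The field offsets follow the published TrueType/OpenType layout; a real check would need to cover far more (for example 'post', 'glyf' and 'hmtx').

```python
import struct


def read_table_directory(font):
    """Return {tag: (offset, length)} parsed from the sfnt table directory."""
    if len(font) < 12:
        raise ValueError("too short for an sfnt header")
    # numTables is a big-endian uint16 at byte 4 of the 12-byte header.
    num_tables = struct.unpack_from(">H", font, 4)[0]
    if len(font) < 12 + 16 * num_tables:
        raise ValueError("table directory is truncated")
    tables = {}
    for i in range(num_tables):
        # Each 16-byte record is: tag, checksum, offset, length.
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font, 12 + 16 * i
        )
        tables[tag.decode("latin-1")] = (offset, length)
    return tables


def check_font(font):
    """Return a list of problems found; an empty list means none were found."""
    try:
        tables = read_table_directory(font)
    except ValueError as exc:
        return [str(exc)]

    problems = []
    for tag, (offset, length) in sorted(tables.items()):
        if offset + length > len(font):
            problems.append(f"'{tag}' table runs past the end of the font data")
    for tag in ("head", "maxp", "loca"):
        if tag not in tables:
            problems.append(f"'{tag}' table is missing")
    if problems:
        # The tables cannot be read safely, so stop here.
        return problems

    head_offset, head_length = tables["head"]
    maxp_offset, maxp_length = tables["maxp"]
    if head_length < 54 or maxp_length < 6:
        return ["'head' or 'maxp' table is too short to read"]

    # head.indexToLocFormat (int16 at byte 50): 0 = 2-byte entries, 1 = 4-byte.
    loc_format = struct.unpack_from(">h", font, head_offset + 50)[0]
    # maxp.numGlyphs is a uint16 at byte 4.
    num_glyphs = struct.unpack_from(">H", font, maxp_offset + 4)[0]
    entry_size = 4 if loc_format == 1 else 2
    # 'loca' holds one entry per glyph plus a final end-of-data entry.
    expected = (num_glyphs + 1) * entry_size
    actual = tables["loca"][1]
    if actual < expected:
        problems.append(
            f"'loca' table is incomplete: {actual} bytes, "
            f"expected at least {expected}"
        )
    return problems
```

To try it, read an extracted font stream into a bytes object and call `check_font` on it. Even this small check has to know the layout of three tables, which gives some idea of how much duplication full validation in pdfwrite would involve.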