ps2pdf fails on a document that can be opened e.g. by evince.

The file is "561834.pdf", from the Govdocs1 data set (http://digitalcorpora.org/corpora/govdocs). It can be found in the following archive: http://digitalcorpora.org/corp/files/govdocs1/zipfiles/561.zip

The program output is:

   **** Error: invalid token after startxref.
   **** Warning: An error occurred while reading an XREF table.
   **** The file has been damaged. This may have been caused
   **** by a problem while converting or transfering the file.
   **** Ghostscript will attempt to recover the data.
Error: /invalidaccess in --run--
Operand stack:
   post_eof_count 4096 --nostringval-- 65034
Execution stack:
   %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1967 1 3 %oparray_pop 1966 1 3 %oparray_pop 1950 1 3 %oparray_pop --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval--
Dictionary stack:
   --dict:1191/1684(ro)(G)-- --dict:1/20(G)-- --dict:83/200(L)-- --dict:83/200(L)-- --dict:117/127(ro)(G)-- --dict:280/300(ro)(G)-- --dict:21/32(L)--
Current allocation mode is local
GPL Ghostscript 9.16: Unrecoverable error, exit code 1

That was tested on Ubuntu 15.10.

I would be grateful if you could confirm the problem.

Kind regards,
Tomasz
Created attachment 12262 [details] 561834.pdf
I've confirmed that Ghostscript produces an error reading the PDF file and that the other PDF viewers I tried (Acrobat Pro DC, Apple Preview 8.1, and muPDF) open the file without error (muPDF produces a warning).
Thanks a lot, Marcos, for checking that.
OK, firstly, please try to test with the current code where possible; the current release is 9.18.

Please *attach* files, don't post a URL. It happens quite frequently that URLs go stale before anyone has a chance to examine the problem.

Finally, you have not supplied the command line you are using; we'll need that in order to reproduce any problem.
(In reply to Ken Sharp from comment #4)
> OK firstly please try and test with the current code where possible, the
> current release is 9.18
>
> Please *attach* files, don't post a URL, it happens quite frequently that
> URLs go stale before anyone has a chance to examine the problem.
>
> Finally, you have not supplied the command line you are using, we'll need
> that in order to reproduce any problem.

I'm sorry: the command line was without any parameters:

ps2pdf 561834.pdf

Unfortunately I was only able to check the version available on Ubuntu 15.10. I will update the report if I also try the newest one. Thank you.
Hello, I've checked the git version of GS and it seems that the file also fails. For some reason the build of the git version shows 9.10 (though gs --version gives 9.19).
Fixed in commit 119e73617fb0f1b20e6d3257d26df0159c4ca81a

The file is, as usual, damaged. It ends with 'startxref 1102random binary' when it should have a sensible number and a %%EOF.

The fact that the startxref is present confused our error-correcting code, which ended up closing the underlying file, and that caused ioerrors. Fixing that revealed a different problem to do with calculating the offset of the 'real' trailer dictionary when it lies early in the PDF file. This is common with Linearised files (Adobe calls this 'optimised for fast web view') and very uncommon otherwise.

This commit fixes both problems, and Ghostscript is able to repair the file.

NB: when you open the file with Adobe Acrobat and then exit, it offers to 'save the changes'. Assuming you haven't made any changes, this is a good indication that Acrobat has silently fixed a problem in the file.
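
For anyone who wants to screen other files for the same kind of damage before reporting, here is a minimal Python sketch that checks whether a PDF ends with 'startxref', a decimal offset, and %%EOF. It is only an illustration of the symptom described above, not Ghostscript's repair logic; the function name check_pdf_tail, the result strings, and the 4096-byte search window are all assumptions made for this example.

```python
import re


def check_pdf_tail(data: bytes, window: int = 4096) -> str:
    """Classify the end of a PDF byte string.

    Returns one of (hypothetical labels chosen for this sketch):
      "ok"                  - 'startxref <number> %%EOF' found, offset inside file
      "no-startxref"        - no 'startxref' keyword in the last `window` bytes
      "damaged-tail"        - 'startxref' present but not followed by
                              whitespace, digits, whitespace and '%%EOF'
      "offset-out-of-range" - the offset points past the end of the data
    """
    tail = data[-window:]
    pos = tail.rfind(b"startxref")
    if pos < 0:
        return "no-startxref"
    rest = tail[pos + len(b"startxref"):]
    # A well-formed tail is: whitespace, decimal offset, whitespace, %%EOF.
    m = re.match(rb"\s+(\d+)\s+%%EOF", rest)
    if m is None:
        return "damaged-tail"
    if int(m.group(1)) >= len(data):
        return "offset-out-of-range"
    return "ok"


if __name__ == "__main__":
    import sys

    # Usage: python check_tail.py file.pdf [more.pdf ...]
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            print(name, check_pdf_tail(fh.read()))
```

A file ending the way this one does (digits followed directly by binary junk, no %%EOF) would be reported as "damaged-tail". Note that an "ok" result only says the tail looks plausible; it does not verify that the offset really points at an xref table.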