Summary: | 'trailer' without 'startxref' | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | SaGS <sags5495> |
Component: | PDF Interpreter | Assignee: | Alex Cherepanov <alex> |
Status: | NOTIFIED FIXED | ||
Severity: | enhancement | ||
Priority: | P3 | ||
Version: | master | ||
Hardware: | PC | ||
OS: | All | ||
Customer: | Word Size: | --- | |
Attachments: |
Suggested patch.
Additional test case. Updated patch (against -r8312). Modified patch |
Description
SaGS
2005-08-24 13:50:18 UTC
Created attachment 1651 [details]
Suggested patch.
The attached patch fixes only the problem mentione din this report. For
the sample file to be processed 100% correctly, the following patches need
to be applied too:
#688150 "ReusableStreamDecode available but failing at languagelevel 2".
#688152 "'Undefined in get' and extra trailer keys with pdfwrite.ps
and PDF1.5+"
#688282 "Xref-streams ignored in hybrid-ref PDFs"
Because of a patch commited to tolerate "startxref" misspelled as "startref" (without the "x"), GNU patch partly rejects the diff in comment #1. Fix: the "/startref ..." line introduced by the mentioned commit may go away, because now scanning ends at the closing ">>", not at "startxref". Patched GS gives a warning for any misspelling or absence of the token "startxref". An updated version is file Bug687796-r7114IJ-to-r7114IJH.diff.txt in attachment #2548 [details] (from bug #687796 comment #25, "pdfinflt.ps creates an incomplete PDF"). That particular diff needs some other patches from there to be applied before it. Not that the fixes are interdependent, but the modified lines are physically too close for GNU patch to find the context it needs and apply the diffs. Created attachment 3500 [details] Additional test case. The PDF 1.7 Reference, section F.2.3 "First-Page Cross-Reference Table and Trailer (Part 3)", 2nd paragraph on page 1031 states: "The first-page trailer may optionally end with startxref, an integer, and %%EOF, just as in an ordinary trailer. This information is ignored." (note the word "optionally"), so in such a case a PDF interpreter cannot rely on anything particular (like "startxref"/ "startref"/ "xref") to follow the trailer dictionary. Now consider a linearized PDF being incrementally updated, and you get that in the general case a "previous trailer" may lack the "startxref". It's possible Acrobat does a similar thing when creating hybrid-ref PDFs. Attached PDF is such a file that "legally" lacks a "startxref". (Although originally generated by Distiller, it was binary-edited to overwrite the "startxref + number + %%EOF" with spaces.) Current GS (SVN rev 8312) fails to display this file because it scans beyond the end of the trailer dictionary. Reader displays it fine. - The 1st time the trailer dict is red, the error triggers an unnecessary rebuild. This is described by the current report and addressed by the suggested patch. - During the rebuild, it fails again with an "Error: /undefined in /BXlevel" or an "Error: /undefined in --run--", depending on how you run GS (*). Please see bug #687796 comment #25 point (I). (*) Don't know why, from some time most (all?) errors that appear while executing GS from the command line (platform: WIN32) are signaled as "... in --run--" or "... in --execute--". When run by GSView, the errors are more informative. Is this OK, or a bug? Created attachment 3501 [details]
Updated patch (against -r8312).
- The warning about missing "startxref" removed, since this keyword
is not always required. This also avoids getting the "This file had
errors that were repaired or ignored. ..." message for this case.
- Diff regenerated to be applicable to current TRUNK.
It undoes the change in rev 8312 and the addition of "/startref ..."
from rev 6979 too, as these are not usefull anymore.
Created attachment 3505 [details]
Modified patch
Thanks for the patch, SaGS.
I think the patch can be committed but I'd like to do 2 minor changes.
1. Use 'def' instead of 'store'. There's no other dictionaries on the stack
but 'store' suggests the opposite.
2. Move the initialization of dictlevelcount to pdfopenfile where many other
variables get initialized.
Do you agree ?
Regression testing shows no differences.
If you prefer these modifications, and no problem appeared with the modified code, it's OK with me. I'll just explain here why I coded like this: For (1): In general I prefer "store" for MODIFYING values, being more robust - it does not depend [too much] on the dict stack contents, and will immediately catch the error if the correct dict is missing from the dict stack. For (2): I think it's better to INITIALISE (not necessarily CREATE) "dictlevelcount" just before scanning for the trailer. What if there's an error in the trailer and "dictlevecount" remains > 0? The rebuild is triggered, but for the rebuild it will start with the wrong value. However, this won't matter too much because it's expected to fail anyway (nobody fixes the trailer in the meantime!). The patch from the comment #4 with "store" changed to "def" is committed as a rev. 8327. Although "store" may be more robust, it greatly extends the scope that the programmer has to consider reading the code. This is similar to use of globals in C instead of static variables. The rest of Ghostscript code uses "store" only when it's needed. |