Summary: | /UserUnit is not supported,yet in PDF1.6. | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Toru Ukita <ukita> |
Component: | PDF Interpreter | Assignee: | Alex Cherepanov <alex> |
Status: | NOTIFIED FIXED | ||
Severity: | normal | CC: | artifex, sags5495 |
Priority: | P2 | Keywords: | bountiable |
Version: | 8.51 | ||
Hardware: | All | ||
OS: | All | ||
URL: | n/a | ||
Customer: | 870 | Word Size: | --- |
Attachments: |
Open the PDF file by Acrobat7. Huge Page size
I captured this from my uploaded PDF
Suggested patch.
Test file: UserUnit-tests.pdf
Patch relative to today's TRUNK (svn rev 7001). |
Description
Toru Ukita
2005-06-03 12:47:31 UTC
Please provide an example file.

Created attachment 1424
Open the PDF file by Acrobat7. Huge Page size
Acrobat sees a 20x20 inch blank page.

Created attachment 1432
I captured this from my uploaded PDF
I have Acrobat7 and I opened my uploaded PDF file in Firefox.
See the page size at the bottom left or the scale value at the top center.
I believe you did not open it with Acrobat7. Acrobat6 or earlier shows it as
2 x 2 in.
Created attachment 2054
Sample file that uses UserUnit 2
The attached file test_userunit2.pdf uses a UserUnit value of 2. Adobe Reader 7
shows the correct size of 170x140 mm. Acrobat Reader 5 shows the wrong size,
85x70 mm. Ghostscript also creates output files of size 85x70 mm.
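
The size discrepancy reported above follows directly from the UserUnit arithmetic: one default user space unit is UserUnit/72 inch. The short Python sketch below illustrates it. The MediaBox of test_userunit2.pdf is not quoted in this report, so `sample_box` is a hypothetical box chosen to measure 85x70 mm at UserUnit 1; the function names are illustrative as well.

```python
# Sketch: physical page size as a function of /UserUnit.
# Assumption: sample_box is a made-up MediaBox (the real one is not given
# in this report); it is 85 x 70 mm when a unit is 1/72 inch.

UNITS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_units(mm):
    """Convert millimetres to default user space units at UserUnit 1."""
    return mm * UNITS_PER_INCH / MM_PER_INCH


def page_size_mm(media_box, user_unit=1.0):
    """Return (width, height) in mm for a box [llx lly urx ury]."""
    llx, lly, urx, ury = media_box
    width_units = (urx - llx) * user_unit
    height_units = (ury - lly) * user_unit
    return (width_units * MM_PER_INCH / UNITS_PER_INCH,
            height_units * MM_PER_INCH / UNITS_PER_INCH)


sample_box = (0, 0, mm_to_units(85), mm_to_units(70))

# A viewer that ignores /UserUnit reports roughly 85 x 70 mm.
print(page_size_mm(sample_box))
# A viewer that honours /UserUnit 2 reports roughly 170 x 140 mm.
print(page_size_mm(sample_box, user_unit=2))
```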
We received a question from a customer about this issue. I am bumping the priority.

It is a PDF interpreter problem. Passing the bug to the PDF interpreter expert.

Created attachment 2148
Suggested patch.

The attached patch adds support for /UserUnit. It also fixes some related bugs that got in the way while testing it. In particular it fixes bug #688359. Details about the changes follow.

---
(A) The basic implementation: scaling the PS user space

The /UserUnit implementation is very similar to -dPDFFitPage.
- If -dPDFFitPage=true, then /UserUnit is ignored; no matter how small or how large 1 PDF user space unit is, it has to be scaled so that the PDF page fits on the paper.
- Else, /UserUnit scales the PS user space, exactly like -dPDFFitPage does.

This solves [almost] everything related to drawing marks on the page. It also means that all problems with -dPDFFitPage affect the implementation of /UserUnit. The biggest problem was to get rid of those, especially for PDF->PDF "conversion", where some elements of the source PDF (outlines, links...) need to be preserved.

---
(B) Don't scale the border width in a border style dict (the change to pdf_draw.ps)

See the comment in the code. Note that a border width in a border style ARRAY is specified in user space units, so it grows with /UserUnit and the scaling of the PS user space suffices.

---
(C) No more /PAGES for /CropBox (pdf_main.ps hunk @@ -129,10 +129,6 @@ and the "pget" instead of "knownoget" for /PAGE pdfmark)

/UserUnit, /Rotate and the translation due to a non-(0,0) PDF page origin are "flattened" into the page; also, -dPDFFitPage scales the page. This means that even if 2 source PDF pages had the same /CropBox, in the destination PDF these may need different /CropBox-es.
Example:

source:
  page #1  /UserUnit 1 and /CropBox [0 0 100 100]
  page #2  /UserUnit 2 and /CropBox [0 0 100 100] (the same)
becomes:
  page #1  no /UserUnit and /CropBox [0 0 100 100]
  page #2  no /UserUnit and /CropBox [0 0 200 200] (differs)

so the /CropBox cannot be inherited anymore. The new code puts a /CropBox into each page that has or inherits one.

NOTE: There are 2 more places in pdf_main.ps that do a "knownoget" for /CropBox, one a few lines after "%****** DOESN'T HANDLE COLOR TRANSFER YET ******" and one after "(Adobe Tech Note 5407, sec 9.2)". I think both should be using "pget".

---
(D) EXTRA: /CropBox-es in intermediate /Pages were ignored

The old code preserved only /CropBox-es that appeared in the root PDF /Pages object and in /Page objects; the ones in intermediate nodes of the /Pages tree were lost. In the new code, this gets fixed for PDF->PDF as a side effect of the implementation for (C). See also the note above for clipping of the marks drawn on the page.

---
(E) PDF->PS "default" user space transform

The new code computes a matrix that transforms the PDF default user space of a page to the PS default user space. This matrix accounts for the rotation (/Rotate), scaling (-dPDFFitPage or /UserUnit) and any translation needed to move the PDF lower-left corner to the lower-left corner of the paper.
- This is done in the new pdf_main.ps::pdf_PDF2PS_matrix, which inherits, with the needed changes, almost all of the code in the old .pdfshowpage_Install.
- The matrix is page-specific because different pages may have different dimensions (so -dPDFFitPage scales them differently), different /Rotate or /UserUnit.
- pdf_main.ps::pdf_cached_PDF2PS_matrix is a utility proc that ensures the matrix for a given page is computed, caches it in the PDF page dictionary under the key given by pdf_main.ps::PDF2PS_matrix_key, then returns the matrix.
- (The definition for PDF2PS_matrix_key exists only to allow binding of a complicated name into pdf_cached_PDF2PS_matrix; it avoids doing "(complicated.name) cvn" at run-time.)
- This matrix is currently used:
  (E.1) by .pdfshowpage_Install (which is now reduced to 2 lines) for setting up the PS user space;
  (E.2) for transforming the /Crop- or /MediaBox in pdfshowpage_setpage;
  (E.3) to transform coordinates in view destinations (used by outline entries, PDF links ...).

---
(F) /Orientation now always 0

pdfmark does not work correctly with /Orientation != 0 (long story). The old code used the /Orientation page device parameter to handle the /Rotate from PDF pages. The new code always sets /Orientation to 0 and handles /Rotate by explicitly doing a "rotate" (in pdf_PDF2PS_matrix).
- Avoids GS-specific hackery otherwise needed to work around pdfmark problems when /Orientation != 0.
- Simplifies the code, because a single transformation matrix needs to be computed both for setting up the PS user space and for transforming various coordinates used in pdfmarks (/CropBox, view destinations).

---
(G) PDF->PDF-migrated view destinations were wrong (pdf_main.ps hunks @@ -939,6 +935,45 @@ and @@ -947,18 +982,30 @@)

Coordinates appearing in view destinations need to be recomputed, because the PS default user space, AS USED BY the pdfwrite driver, is not identical to the original PDF default user space. The list of causes included rotation due to /Rotate (implemented either with /Orientation or a simple "rotate"), translation due to a non-(0,0) PDF page origin, and scaling due to -dPDFFitPage; now we add /UserUnit to this list.

---
(H) EXTRA: -dPDFFitPage now chooses portrait or landscape (pdf_main.ps, near the end of hunk @@ -1031,62 +1078,133 @@)

If -dPDFFitPage is set, the code after "% Preserve page size," chooses portrait or landscape orientation depending on the PDF page's width:height ratio. I consider that this results in a better "fit to page".
Example: PDF with mixed portrait + landscape letter pages, to be printed on A4 paper. The old code sometimes fitted landscape pages on portrait paper.

---
(I) EXTRA: better placement of imaged area

If -dPDFFitPage is set and the PDF page's width:height ratio differs from the paper's width:height ratio, some unused space remains. With the old code, this extra space was placed at left/right/top/bottom depending on the page's /Rotate. The new code always puts the extra space either at the right or at the top (depending only on the PDF page being relatively "taller" or "wider" than the paper). This is mainly a side effect of not using /Orientation anymore.

---
(J) -dNOUSERUNIT

I added a new option that can be used to disable processing of /UserUnit. It is named NOUSERUNIT and defaults to "false", meaning that /UserUnit is taken into account. I implemented this default following the default of Adobe Reader 7.0.7/Windows, but see the note.

Note: I suggest setting the default to ignore UserUnit, and having a -dDOUSERUNIT option to activate it. I can make this change if desired. I'll explain the reason for such a choice through an example:
- I THINK that UserUnit was introduced by Adobe as part of its [Adobe's] entry into the CAD world.
- Consider a floorplan plotted on a sheet of paper at a certain scale, let's say 1:50.
- In this scenario, the PDF page corresponds to the plotted paper, so it has a MediaBox of that size.
- If the scale is 1:50, set UserUnit = 50. This allows someone, given a suitable UI, to easily and accurately MEASURE various elements of the floorplan.
- Ghostscript does not have such a UI, and I think such a UI is beyond GS's purpose.
- GS's role, however, is to PRINT that PDF in order to obtain the "plotted" floorplan.
- To obtain the equivalent of the plotted paper, printing must ignore UserUnit. Observing UserUnit for printing would require a building-sized sheet of paper!
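
As an illustration of points (C) and (E), here is a minimal Python sketch of a page-specific matrix that combines /Rotate, a scale factor (such as /UserUnit) and the translation that brings the page's lower-left corner to the origin, and of how a /CropBox is transformed by it. This is not the PostScript code from the patch: the function names are hypothetical, -dPDFFitPage is not modelled, and /Rotate is assumed to be a clockwise multiple of 90 degrees. Matrices use the PostScript layout [a b c d tx ty].

```python
# Sketch only: a Python analogue of the PDF->PS default user space matrix
# described in (E). All names here are illustrative, not from pdf_main.ps.

# (cos, sin) for the allowed /Rotate values, kept exact to avoid rounding.
_COS_SIN = {0: (1, 0), 90: (0, 1), 180: (-1, 0), 270: (0, -1)}


def _corners(box):
    """Return the four corner points of a box [llx lly urx ury]."""
    llx, lly, urx, ury = box
    return [(x, y) for x in (llx, urx) for y in (lly, ury)]


def pdf_to_ps_matrix(box, rotate=0, user_unit=1):
    """Matrix mapping `box` (rotated clockwise, scaled) onto (0,0)-based paper."""
    cos_r, sin_r = _COS_SIN[rotate % 360]
    s = user_unit
    # Clockwise rotation by R combined with uniform scaling by s.
    a, b, c, d = s * cos_r, -s * sin_r, s * sin_r, s * cos_r
    xs = [a * x + c * y for x, y in _corners(box)]
    ys = [b * x + d * y for x, y in _corners(box)]
    # Translate so that the lower-left corner of the result is at (0, 0).
    return (a, b, c, d, -min(xs), -min(ys))


def transform(matrix, x, y):
    """Apply a PostScript-style matrix to a point."""
    a, b, c, d, tx, ty = matrix
    return (a * x + c * y + tx, b * x + d * y + ty)


def transform_box(matrix, box):
    """Transform a box and re-normalise it to [llx lly urx ury] order."""
    pts = [transform(matrix, x, y) for x, y in _corners(box)]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# The example from (C): the same /CropBox on a page with /UserUnit 2.
crop = (0, 0, 100, 100)
print(transform_box(pdf_to_ps_matrix(crop, user_unit=2), crop))
# -> (0, 0, 200, 200)
```

Re-normalising the corners after the transform matters once /Rotate is non-zero, since the image of the original lower-left corner is then no longer the lower-left corner of the result.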
---
(K) TESTING DETAIL: "transform" returns reals

"<x> <y> <matrix> transform" returns 2 realtype objects, even when <x> <y> are integertype and the matrix is [1 0 0 1 0 0] (identity, containing only integers). This causes, for example, a Media- or CropBox of [0 0 612 792] in a source PDF to be written as [0 0 612.0 792] in the destination, which is annoying when comparing the output of unpatched and patched Ghostscript.

Created attachment 2149
Test file: UserUnit-tests.pdf
First 16 pages include all combinations of Rotate 0/90/180/270,
with/without UserUnit, with/without a CropBox. There are
bookmarks for each page, and links between each other. All the
view destinations involved are of type /FitR. The CropBox-es, if
present, are placed where the large dotted rectangles are.
Last 8 pages contain links with all types of view destinations
(/Fit, /XYZ, etc) to the first 16 pages, to verify the
coordinates involved are transformed correctly.
Links are somehow connected to the dark-green dotted rectangles
around the destination page number. For example /FitR magnifies
the page to show exactly that rectangle, /XYZ points to its
upper-left corner, etc.
Notes:
- Attachment #1424 is not suitable for testing because it
requires a huge page size (200 x 200 inches), so taking
/UserUnit into account ends with a configurationerror in
setpagedevice.
- Adobe Reader for Windows sometimes has trouble computing the
zoom after clicking a link to a page with UserUnit > 1. It
seems to forget to take UserUnit into account, and the zoom
factor comes out exactly UserUnit times larger. After PDF->PDF
conversion with patched Ghostscript, these links work OK
because the UserUnits are "flattened into the pages".
- It [Reader] also has problems with /FitBV. I stopped trying
to understand why; those links are identical to the /FitV
ones, and /FitV works OK.
We would prefer that UserUnit is interpreted by default. There are no recommendations on how to choose the UserUnit. Therefore there is no relation between the extent of a PDF file that uses a UserUnit and the extent that it would have if the UserUnit were ignored. UserUnit is a valid PDF element and it should be handled. The only reason for not interpreting it could be compatibility with previous versions of Ghostscript.

I believe the reason for introducing the UserUnit is more trivial. The implementation limit for the Acrobat page size is 14400x14400 units. This is the well-known 200x200 inch limit at 1/72 inch per unit. Adobe could not remove the limit of 14400x14400 for two reasons:
1. The value is represented as a fixed-point value (16-bit integer part, 16-bit fractional part) whose range (roughly +/-32000) cannot be increased significantly.
2. PDF files using a page size larger than 14400x14400 units cannot be displayed with Acrobat.

To be able to handle files larger than 200x200 inches, an additional "scale factor" element, the UserUnit, was added. This is just ignored by old Adobe Readers.

Any feedback on the patch? Note that the patch, as posted, does take UserUnit into account by default. Also, fixing bug 688829 "Merging PDF files using gs: outlines and links not updated" seems to require changing "linkdest". If that patch is made relative to the current TRUNK, applying the 2 patches will produce conflicts (not incompatibilities, just conflicts).

Created attachment 2419
Patch relative to today's TRUNK (svn rev 7001).

I attach an updated patch, because commits made during the last 4+ months created conflicts that are not so easy to resolve. The functionality is the same as before; for details please see comment #8.

Differences between the 2 patches:
(1) Account for changes in rev 6850->6893 "Use /PageSize from the currrent page device dictionary when the /MediaBox pget fails. Bug 688771, customer 581."
(2) Account for changes in rev 6893->6897 "Replace empty MediaBox or CropBox box with a box that is equal to the current page size. Bug 688744, customer 384."
(3) Includes the change "knownoget->pget" mentioned in the note at the end of point (C) from comment #8 (pdf_main.ps, the 2 new hunks @@ -1213,7 +1313,7 @@ and @@ -1230,7 +1330,7 @@).

The patch is committed as rev. 7999. I've tested the patch against our PDF file collection, changed -dNoUserUnit to mixed case for better readability, and documented the new option in Use.htm. Thank you, SaGS.