Summary: | File converted to PDF displays badly in Acrobat | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Marcos H. Woehrmann <marcos.woehrmann> |
Component: | PDF Writer | Assignee: | leonardo <leonardo> |
Status: | NOTIFIED FIXED | ||
Severity: | normal | CC: | ken.sharp, mpsuzuki, sags5495 |
Priority: | P2 | Keywords: | bountiable |
Version: | master | ||
Hardware: | PC | ||
OS: | Linux | ||
Customer: | 210 | Word Size: | --- |
Attachments: |
My test file.
Suggested patch, part #1: pdfwrite fixes.
Suggested patch, part #2: PDF interpreter fixes.
Suggested patch, part #3: supplemental ps2write fixes (in opdfread.ps). |
Description
Marcos H. Woehrmann
2007-06-07 22:05:24 UTC
Created attachment 3008 [details]
p1_3.ps
The file has a Type 3 font with /FontMatrix [0 1 -1 0 0 0]. The normal orientation of such a font is not obvious, and Ghostscript makes different assumptions than Distiller. Auto rotation can easily be suppressed with -dAutoRotatePages=/None. Ghostscript copies the font matrix to the PDF font as is, but Distiller generates a font with a [1 0 0 1 0 0] matrix. All Acrobat Readers up to v. 8.1 cannot display fonts with an xy coefficient not equal to 0, for instance [1 1 0 1 0 0].

Note that there are NO Adobe Acrobat Readers (including 8.10) that display the output from Ghostscript correctly, with or without -dAutoRotatePages=/None. Also, without -dAutoRotatePages=/None Ghostscript produces a different bar code than it does with -dAutoRotatePages=/None. Other PDF viewers (xpdf, Quartz, ...) also do not display the correct image with either PDF out of Ghostscript.

Even though (as far as we can tell) we are meeting the "letter of the law" and conform to the PDF spec, the fact that no viewers can handle it suggests that we need to change Ghostscript, particularly since this is a customer bug (we have more than a few other hacks in our pdfwrite device for limitations in the Adobe Reader, so the precedent is set).

Bountiable in the hope that SaGS can work on this.

For the different AutoRotatePages behaviour, I have opened a separate bug #689418 "AutoRotatePages differences GS vs Distiller", because it is not related to the other problems described here. The patch I suggest is attached there. I have also opened 2 other reports (with patches) for 2 problems that interfered with working on this one:
- Bug #689419 "Text missing if nested BT with opdfread.ps";
- Bug #689420 "Errors with ps2write and special chars in FontName".

And now, about the main topic in here: Reader's display of GS's output PDF is correct, and the PDF created by GS is wrong because of a bug in pdfwrite. GS displays the file "correctly" because there is a matching bug in GS's PDF interpreter. The other PDF interpreter in GS (opdfread.ps) has a similar problem. Distiller does not "normalize" the FontMatrix, at least not in all cases, and decently recent versions of Reader (6.x and later) correctly display PDFs that have nonstandard font matrices.

The factors that create the problems are the following:

(a) Minor (really minor) detail: the "wy" operand of "d0"/"d1" in Type 3 charproc streams must be 0.

(b) When computing PDF glyph widths, and when evaluating the current point movement produced by rendering glyphs, pdfwrite assumes the FontMatrix is a simple [s 0 0 s tx ty]. If this assumption is false, then glyphs are placed incorrectly. GS's PDF interpreters make the same assumption.

(c) PDF1.7 Ref 5.1.3 "Glyph Positioning and Metrics" states the following:
"This distance is a vector (called the displacement vector) in the glyph coordinate system; it has horizontal and vertical components. [...] In all cases, the text-showing operators transform the displacement vector into text space and then translate text space by that amount." (page 395)
"w0 is the displacement vector that specifies how the text position is changed after the glyph is painted in writing mode 0; its vertical component is always 0."
"w1 is the displacement vector for writing mode 1; its horizontal component is always 0." (page 396)

While in PS the advance (=displacement) vector, in glyph space, can have an arbitrary direction, the PDF metrics always specify such vectors as perfectly horizontal (WMode 0) or vertical (WMode 1). This implies that, in PDF, the wy/x component of the advance vector is zeroed out a first time in glyph space.

Also, PDF1.7 Ref, 5.3.3 "Text Space Details" states: "First, a combined displacement is computed, denoted by tx in horizontal writing mode or ty in vertical writing mode (the variable corresponding to the other writing mode is set to 0)" (page 410). This implies that wy/x is zeroed out a second time, this time in PDF text space.
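To make the two zeroing steps in (c) concrete, here is a minimal C sketch. It is an illustration only, not code from pdfwrite: the types font_matrix and vec2 and the functions distance_transform, ps_advance, pdf_advance and reposition_delta are hypothetical names invented for this example. It compares the PostScript advance (the full vector transformed by the FontMatrix) with the PDF WMode 0 advance (wy dropped in glyph space, ty dropped in text space) and computes the difference between the two.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not Ghostscript's own types. */
typedef struct { double xx, xy, yx, yy, tx, ty; } font_matrix; /* [xx xy yx yy tx ty] */
typedef struct { double x, y; } vec2;

/* Transform a distance by the matrix, ignoring tx/ty (like PostScript dtransform). */
vec2 distance_transform(vec2 d, const font_matrix *m)
{
    vec2 r;
    r.x = d.x * m->xx + d.y * m->yx;
    r.y = d.x * m->xy + d.y * m->yy;
    return r;
}

/* PostScript: the full advance vector moves the current point. */
vec2 ps_advance(vec2 w, const font_matrix *m)
{
    return distance_transform(w, m);
}

/* PDF, WMode 0: wy is dropped in glyph space, then ty is dropped in text space. */
vec2 pdf_advance(vec2 w, const font_matrix *m)
{
    vec2 t;
    w.y = 0.0;                    /* first zeroing: glyph space (5.1.3) */
    t = distance_transform(w, m);
    t.y = 0.0;                    /* second zeroing: text space (5.3.3) */
    return t;
}

/* Difference between the two advances: the extra move that would have to be
   emitted explicitly to land where PostScript lands. */
vec2 reposition_delta(vec2 w, const font_matrix *m)
{
    vec2 ps = ps_advance(w, m), pdf = pdf_advance(w, m), d;
    assert(pdf.y == 0.0);         /* the PDF advance is purely horizontal in WMode 0 */
    d.x = ps.x - pdf.x;
    d.y = ps.y - pdf.y;
    return d;
}
```

With the /FontMatrix [0 1 -1 0 0 0] of the reported file and an assumed glyph-space advance of (0.5, 0), ps_advance gives (0, 0.5) while pdf_advance gives (0, 0), so in this case the whole advance would have to come from explicit positioning.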
Of course, with all this zeroing of wy/x, care must be taken to explicitly position the next glyph to get the same printed page as with the original PS.
Created attachment 3313 [details]
My test file.
Uses Type 1 and Type 3 fonts with varying direction of glyphs'
advance vectors, and (original) font matrices that include anisotropic
scaling, rotation in 15deg increments, x- and y-skewness, x- and
y-mirroring. For comparison, it prints pages that use fonts with a
"normal" FontMatrix but change the PS user space to obtain the same
display.
Created attachment 3314 [details]
Suggested patch, part #1: pdfwrite fixes.
- Generalize pdfwrite computations for glyph widths and placement,
which currently assume an [s 0 0 s tx ty]-type original (= ignoring
changes made by makefont & similar) FontMatrix, to work with an
arbitrary matrix.
- Fix: the "wy" operand of "d0"/"d1" in Type 3 charproc streams must
be zero. Note: other parts of the code already consider wy == 0,
and don't need to be changed.
- Remove font_orig_scale() and pdf_font3_scale(), which are not used
anymore. Their simple presence is a sign of the assumption that
font matrices are simple scalings.
- Minor: remove a dead variable ("int code = 0; ... return code;").
This variable is never changed, because there's another "int code"
inside the "for", but it is confusing and takes a lot of research
to figure out that the behaviour is OK and no useful error code is
lost (it's the 2nd time I have stumbled over this...)
Notes:
- All non-Adobe PDF viewers that I tested (Evince/ Fedora 7,
Foxit Reader 2.0/ Windows, Jaws PDF Editor 3.5/ Windows, and
Ghostscript -r8204) have problems with "weird" font matrices.
The bug in ghostscript is addressed below.
- I do have an alternate patch, which "normalizes" the
FontMatrix, for better viewer compatibility. But this method
cannot work when fonts are not embedded: one cannot "normalize"
something stored in an external file.
Created attachment 3315 [details]
Suggested patch, part #2: PDF interpreter fixes.
- Force the "wy" parameter of "d0", not only of "d1", in Type 3
charproc streams to 0. Comment changed to suggest that this is an
action expected to be done as part of enforcing the metrics stored in
the PDF (which always imply wy == 0), and not some strange
behaviour in Reader.
- A detail in PDF1.7 Ref 5.3.3 "Text Space Details" states that "wy"
(or "wx" in WMode 1) must be forced to 0 in PDF text space too, not
only in PDF glyph space. To implement this with non-[s 0 0 s tx ty]
font matrices, a different method is used: decode "show" strings
and extract glyph widths with "cshow", put these into an array,
then render the text with "x/yshow".
- since "wy/x" = 0 in glyph space, the code does not need to use
"idtransform", a "div" by "FontMatrix.xx" is sufficient;
- use a trick (see comment in code) when "FontMatrix.xx" = 0.
CIDFonts are not changed. Those with a "straight" matrix continue to
work, but there's no improvement if they have a weird "/FontMatrix".
I think this would require a radical change, and the result will be
much slower than it is now.
Created attachment 3316 [details]
Suggested patch, part #3: supplemental ps2write fixes (in opdfread.ps).
- opdfread.ps assumed the FontMatrix is a simple scaling by 0.001;
generalize the computations to work with an arbitrary matrix.
- Implement a detail of PDF1.7 Ref 5.3.3 "Text Space Details" (nuking
the "wy" in text space) for a non-[s 0 0 s tx ty] "/FontMatrix".
Similar to the change in the main PDF interpreter, but shorter
because opdfread.ps does not handle vertical writing or CID fonts.
Also "wy" in "d0"/"d1" is already 0 (see patch for pdfwrite), so
it needs no adjustment.
Adding mpsuzuki to the CC list since he works on a similar problem.

Adding Ken to the CC list since this bug affects the modules he works on.

The patch 3314 introduces the following code:

+ gs_matrix *pmat = &pdfont->u.simple.s.type3.FontMatrix;
+
+ pwidths->Width.xy.x *= pmat->xx; /* formula simplified based on wy in glyph space == 0 */
+ pwidths->Width.xy.y = 0.0; /* WMode == 0 for PDF Type 3 fonts */
+ gs_distance_transform(pwidths->real_width.xy.x, pwidths->real_width.xy.y, pmat, &pwidths->real_width.xy);

It is equivalent to pwidths->Width.xy.y = 0; pwidths->Width.xy.x = FontMatrix.xx ^ 2. This code is wrong for 2 reasons:
1. It completely misses the character width from setcachedevice.
2. The result units are (text_scale/design_scale)^2 rather than inches.

The patch causes a minor raster difference with prfmm.pdf page 2. I'm not sure whether it is important. Please provide an analysis.

Next time please supply patches with a log message in the form "digest, details, expected differences" (see http://ghostscript.com/pipermail/gs-cvs/2007-August/007739.html).

Oops, sorry, I misread the patch, so my comment 12 is partially wrong. The statement "It completely misses the character width from setcachedevice." has to be removed. However, the patch is still incorrect, because it multiplies twice by FontMatrix.xx.

The first 2 assignments are equivalent to a gs_distance_transform() of pwidths->*Width* followed by nuking pwidths->Width.xy.y per 5.3.3 "Text Space Details" I mentioned. The last line concerns pwidths->*real*_width. I don't see how the patch multiplies by FontMatrix.xx^2.

(For the "formula simplified": If FontMatrix = [a b c d tx ty], the advance vector is (wx, wy) in glyph space and (wx', wy') in text space, then wx' = a*wx + c*wy = a*wx, because in the PDF metrics wy is always 0.)

Regarding Comment #14:
> I don't see how the patch multiplies by FontMatrix.xx^2.

Well I do:

+ gs_matrix *pmat = &pdfont->u.simple.s.type3.FontMatrix;
+
+ pwidths->Width.xy.x *= pmat->xx; /* formula simplified based on wy in glyph space == 0 */

The first multiplication by pmat->xx is in the line above.

+ pwidths->Width.xy.y = 0.0; /* WMode == 0 for PDF Type 3 fonts */
+ gs_distance_transform(pwidths->real_width.xy.x, pwidths->real_width.xy.y, pmat, &pwidths->real_width.xy);

The second multiplication by pmat->xx is in the line above. Thus the result dimension (measure units) is wrong. I guess the test didn't detect it due to FontMatrix.xx == 1.

Oops, sorry, comment #15 is wrong - I mixed up Width and real_width. Now I accept the patch.

Patch to HEAD: http://ghostscript.com/pipermail/gs-cvs/2007-August/007796.html |
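The point settled in the discussion of patch 3314 above can be checked with a small standalone C sketch. It mirrors the three statements of the patch using simplified stand-in types (matrix_t, point_t, width_t, glyph_widths_t) and a stand-in sketch_distance_transform() that is assumed to behave like gs_distance_transform(), i.e. to transform a distance and ignore tx/ty. None of these names are Ghostscript's, and the real structures carry more fields. Width and real_width are each taken through the FontMatrix exactly once.

```c
#include <assert.h>

/* Simplified, hypothetical stand-ins for the Ghostscript structures. */
typedef struct { double xx, xy, yx, yy, tx, ty; } matrix_t;
typedef struct { double x, y; } point_t;
typedef struct { point_t xy; } width_t;
typedef struct { width_t Width, real_width; } glyph_widths_t;

/* Assumed behaviour of gs_distance_transform(): transform a distance, no translation. */
void sketch_distance_transform(double dx, double dy, const matrix_t *m, point_t *out)
{
    out->x = dx * m->xx + dy * m->yx;
    out->y = dx * m->xy + dy * m->yy;
}

/* Mirrors the three statements of the patch for a Type 3 font. */
void type3_widths_to_text_space(glyph_widths_t *w, const matrix_t *fm)
{
    assert(w != 0 && fm != 0);
    w->Width.xy.x *= fm->xx;  /* wx' = xx*wx + yx*wy, with wy == 0 in glyph space */
    w->Width.xy.y = 0.0;      /* WMode 0: no vertical component in text space */
    /* real_width is a separate member; it keeps both components. */
    sketch_distance_transform(w->real_width.xy.x, w->real_width.xy.y,
                              fm, &w->real_width.xy);
}
```

For a FontMatrix of [0.25 0 0 0.25 0 0], a Width of (8, 0) becomes (2, 0) and a real_width of (8, 4) becomes (2, 1); a squared factor would have given 0.5 instead of 2.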