Bug 689267 - File converted to PDF displays badly in Acrobat
Summary: File converted to PDF displays badly in Acrobat
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Writer
Version: master
Hardware: PC Linux
Importance: P2 normal
Assignee: leonardo
URL:
Keywords: bountiable
Depends on:
Blocks:
 
Reported: 2007-06-07 22:05 UTC by Marcos H. Woehrmann
Modified: 2008-12-19 08:31 UTC
CC List: 3 users

See Also:
Customer: 210
Word Size: ---


Attachments
My test file. (13.30 KB, application/postscript)
2007-08-26 04:31 UTC, SaGS
Suggested patch, part #1: pdfwrite fixes. (12.92 KB, patch)
2007-08-26 04:32 UTC, SaGS
Suggested patch, part #2: PDF interpreter fixes. (9.40 KB, patch)
2007-08-26 04:33 UTC, SaGS
Suggested patch, part #3: supplemental ps2write fixes (in opdfread.ps). (3.77 KB, patch)
2007-08-26 04:35 UTC, SaGS

Description Marcos H. Woehrmann 2007-06-07 22:05:24 UTC
The customer reports, and I've verified, that converting the attached file with
Ghostscript produces a PDF file that Acrobat does not display correctly.  The
file has two issues: the barcode is damaged and the file is landscape instead of
portrait.

Opening the Ghostscript-created PDF file with Ghostscript results in a correctly
displayed barcode, but the page is still in landscape mode.

Converting the original PostScript file to a raster format with Ghostscript
results in a portrait file with a readable barcode. 

Using Distiller to generate a PDF file results in a file that both Acrobat and
Ghostscript open correctly.

The command line used to convert the PostScript file to PDF:

bin/gs -dCompatibilityLevel=1.4 -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite
-sOutputFile=test.pdf -c .setpdfwrite -f p1_3.ps

(The customer reports that they are using ps2pdf with 8.54, so I believe those
are the options that are used).

None of this depends on which version of Ghostscript is used; 8.54, 8.56,
8.57, and head all act the same way.

Converting the
Comment 1 Marcos H. Woehrmann 2007-06-07 22:05:55 UTC
Created attachment 3008
p1_3.ps
Comment 2 Alex Cherepanov 2007-06-08 06:30:44 UTC
The file has a Type 3 font with /FontMatrix [0 1 -1 0 0 0].
The normal orientation of such a font is not obvious, and Ghostscript
makes different assumptions than Distiller. Auto rotation can be easily
suppressed with the -dAutoRotatePages=/None option.

Ghostscript copies the font matrix to the PDF font as is, but Distiller
generates a font with a [1 0 0 1 0 0] matrix. All Acrobat Readers up to v. 8.1
cannot display fonts with an xy coefficient not equal to 0, for instance
[1 1 0 1 0 0].
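
To make the effect of this FontMatrix concrete, here is a minimal C sketch of
how a PostScript-style matrix maps a glyph's advance vector. The names
matrix_t, point_t and distance_transform are hypothetical stand-ins written
for this illustration; they are not Ghostscript's own types or functions, and
the advance value used below is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in types for illustration; not Ghostscript's own. */
typedef struct { double xx, xy, yx, yy, tx, ty; } matrix_t; /* [xx xy yx yy tx ty] */
typedef struct { double x, y; } point_t;

/*
 * Transform a distance (dx, dy) by a PostScript-style matrix.
 * The translation part (tx, ty) is ignored because a distance has no origin.
 */
void distance_transform(double dx, double dy, const matrix_t *m, point_t *out)
{
    assert(m != NULL && out != NULL);
    out->x = dx * m->xx + dy * m->yx;
    out->y = dx * m->xy + dy * m->yy;
}
```

With the matrix [0 1 -1 0 0 0] from this file, a horizontal glyph-space advance
of (600, 0) comes out as (0, 600): the advance runs along the y axis once the
font matrix is applied, which is why the "normal" orientation is ambiguous.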
Comment 3 Ray Johnston 2007-06-13 00:00:04 UTC
Note that there are NO Adobe Acrobat Readers (including 8.10) that display
the output from Ghostscript correctly, with or without -dAutoRotatePages=/None.

Also, without -dAutoRotatePages=/None Ghostscript produces a different bar code
than it does with -dAutoRotatePages=/None.

Other PDF viewers (xpdf, Quartz, ...) also do not display the correct image
with either PDF out of Ghostscript.

Even though (as far as we can tell) we are meeting the "letter of the law" and
conform to the PDF spec, the fact that no viewers can handle it suggests that
we need to change Ghostscript, particularly since this is a customer bug
(we have more than a few other hacks in our pdfwrite device for limitations
in the Adobe Reader, so the precedent is set).
Comment 4 Ray Johnston 2007-07-19 11:44:38 UTC
Bountiable in the hope that SaGS can work on this.
Comment 5 SaGS 2007-08-26 04:30:29 UTC
For the different AutoRotatePages behaviour, I have opened a separate 
bug #689418 "AutoRotatePages differences GS vs Distiller", because it 
is not related to the other problems described here. The patch I 
suggest is attached there.

I have also opened 2 other reports (with patches) for 2 problems 
that interfered with working on this one:
- Bug #689419 "Text missing if nested BT with opdfread.ps";
- Bug #689420 "Errors with ps2write and special chars in FontName".

And now, about the main topic here:

Reader's display of GS's output PDF is correct, and the PDF created 
by GS is wrong because of a bug in pdfwrite. GS displays the file 
"correctly" because there is a matching bug in GS's PDF interpreter. 
The other PDF interpreter in GS (opdfread.ps) has a similar problem.
Distiller does not "normalize" the FontMatrix, at least not in all 
cases, and reasonably recent versions of Reader (6.x and later) 
correctly display PDFs that have nonstandard font matrices.

The factors that create the problems are the following:

(a) Minor (really minor) detail: the "wy" operand of "d0"/"d1" 
    in Type 3 charproc streams must be 0.

(b) When computing PDF glyph widths, and when evaluating current 
    point movement produced by rendering glyphs, pdfwrite assumes 
    the FontMatrix is a simple [s 0 0 s tx ty]. If this assumption 
    is false, then glyphs are placed incorrectly.

    GS's PDF interpreters make the same assumption.

(c) PDF1.7 Ref 5.1.3 "Glyph Positioning and Metrics" states the 
    following:

   "This distance is a vector (called the displacement vector) 
    in the glyph coordinate system; it has horizontal and 
    vertical components. [...] In all cases, the text-showing 
    operators transform the displacement vector into text space 
    and then translate text space by that amount." (page 395)

   "w0 is the displacement vector that specifies how the text 
    position is changed after the glyph is painted in writing 
    mode 0; its vertical component is always 0."
   "w1 is the displacement vector for writing mode 1; its 
    horizontal component is always 0." (page 396)

    While in PS the advance (=displacement) vector, in glyph space, 
    can have an arbitrary direction, the PDF metrics always specify 
    perfectly horizontal (WMode 0) or vertical (WMode 1) such 
    vectors. This implies that, in PDF, the wy/x component of the 
    advance vector is zeroed out a 1st time in glyph space.

    Also, PDF1.7 Ref, 5.3.3 "Text Space Details" states:

   "First, a combined displacement is computed, denoted by tx in 
    horizontal writing mode or ty in vertical writing mode (the 
    variable corresponding to the other writing mode is set to 0)"
    (page 410).

    This implies that wy/x is zeroed out a second time, this time 
    in PDF text space.

    Of course, with all this zeroing of wy/x, care must be taken to 
    explicitly position the next glyph to get the same printed 
    page as with the original PS.
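
Items (b) and (c) can be sketched in C as follows. This is only an
illustration of the arithmetic for WMode 0 as read from the spec passages
quoted above, not code from pdfwrite: matrix_t, point_t, is_simple_scaling
and split_advance are hypothetical names, and font size, character spacing
and horizontal scaling are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in types for illustration; not Ghostscript's own. */
typedef struct { double xx, xy, yx, yy, tx, ty; } matrix_t; /* [xx xy yx yy tx ty] */
typedef struct { double x, y; } point_t;

/* True when the matrix has the [s 0 0 s tx ty] shape assumed in item (b). */
int is_simple_scaling(const matrix_t *m)
{
    assert(m != NULL);
    return m->xy == 0.0 && m->yx == 0.0 && m->xx == m->yy;
}

/*
 * For horizontal writing (WMode 0), given the PostScript advance (wx, wy)
 * in glyph space and the FontMatrix fm, compute:
 *   ps_adv:  the advance the original PS font produces after fm;
 *   pdf_adv: what remains once wy is dropped in glyph space and the
 *            vertical component is dropped again after fm;
 *   fixup:   ps_adv - pdf_adv, the move that explicit positioning of the
 *            next glyph has to supply.
 */
void split_advance(double wx, double wy, const matrix_t *fm,
                   point_t *ps_adv, point_t *pdf_adv, point_t *fixup)
{
    assert(fm != NULL && ps_adv != NULL && pdf_adv != NULL && fixup != NULL);

    ps_adv->x = wx * fm->xx + wy * fm->yx;
    ps_adv->y = wx * fm->xy + wy * fm->yy;

    pdf_adv->x = wx * fm->xx; /* (wx, 0) through fm, x component only */
    pdf_adv->y = 0.0;         /* second zeroing, in text space */

    fixup->x = ps_adv->x - pdf_adv->x;
    fixup->y = ps_adv->y - pdf_adv->y;
}
```

For a plain scaling such as [0.5 0 0 0.5 0 0] and an advance of (10, 0), the
fixup is (0, 0), so the simple-scaling assumption is harmless. For
[0 1 -1 0 0 0] and an advance of (600, 0), the PDF advance is (0, 0) and the
whole movement, (0, 600), has to come from explicit positioning.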
Comment 6 SaGS 2007-08-26 04:31:23 UTC
Created attachment 3313
My test file.

Uses Type 1 and Type 3 fonts with varying direction of glyphs' 
advance vectors, and (original) font matrices that include anisotropic 
scaling, rotation in 15deg increments, x- and y-skewness, x- and y-
mirroring. For comparison, it prints pages that use fonts with a 
"normal" FontMatrix but change the PS user space to obtain the same 
display.
Comment 7 SaGS 2007-08-26 04:32:26 UTC
Created attachment 3314
Suggested patch, part #1: pdfwrite fixes.

- Generalize pdfwrite computations for glyph widths and placement, 
  which currently assume an [s 0 0 s tx ty]-type original (= ignoring 
  changes made by makefont & similar) FontMatrix, to work with an 
  arbitrary matrix.

- Fix: the "wy" operand of "d0"/"d1" in Type 3 charproc streams must 
  be zero. Note: other parts of the code already consider wy == 0, 
  and don't need to be changed.

- Remove font_orig_scale() and pdf_font3_scale(), which are not used 
  anymore. Their simple presence is a sign of the assumption that 
  font matrices are simple scalings.

- Minor: remove a dead variable ("int code = 0; ... return code;"). 
  This variable is never changed, because there's another "int code" 
  inside the "for", but it is confusing and takes a lot of research 
  to figure out the behaviour is OK and no useful error code is 
  lost (it's the 2nd time I have stumbled over this...)

Notes:
  - All non-Adobe PDF viewers that I tested (Evince/ Fedora 7, 
    Foxit Reader 2.0/ Windows, Jaws PDF Editor 3.5/ Windows, and 
    Ghostscript -r8204) have problems with "weird" font matrices.
    The bug in ghostscript is addressed below.
  - I do have an alternate patch, which "normalizes" the 
    FontMatrix, for better viewer compatibility. But this method 
    cannot work when fonts are not embedded: one cannot "normalize" 
    something stored in an external file.
Comment 8 SaGS 2007-08-26 04:33:40 UTC
Created attachment 3315
Suggested patch, part #2: PDF interpreter fixes.

- Force the "wy" parameter of "d0", not only of "d1", in Type 3 
  charproc streams to 0. Comment changed to suggest that this action 
  is expected as part of enforcing the metrics stored in 
  the PDF (which always imply wy == 0), and is not some strange 
  behaviour in Reader.

- A detail in PDF1.7 Ref 5.3.3 "Text Space Details" states that "wy" 
  (or "wx" in WMode 1) must be forced to 0 in PDF text space too, not 
  only in PDF glyph space. To implement this with non-[s 0 0 s tx ty] 
  font matrices, a different method is used: decode "show" strings 
  and extract glyph widths with "cshow", put these into an array, 
  then render the text with "x/yshow".
  - since "wy/x" = 0 in glyph space, the code does not need to use 
    "idtransform", a "div" by "FontMatrix.xx" is sufficient;
  - use a trick (see comment in code) when "FontMatrix.xx" = 0.

CIDFonts are not changed. Those with a "straight" matrix continue to 
work, but there's no improvement if they have a weird "/FontMatrix". 
I think this would require a radical change, and the result will be 
much slower than it is now.
Comment 9 SaGS 2007-08-26 04:35:02 UTC
Created attachment 3316
Suggested patch, part #3: supplemental ps2write fixes (in opdfread.ps).

- opdfread.ps assumed the FontMatrix is a simple scaling by 0.001;
  generalize the computations to work with an arbitrary matrix.

- Implement a detail of PDF1.7 Ref 5.3.3 "Text Space Details" (nuking 
  the "wy" in text space) when the "/FontMatrix" is not [s 0 0 s tx ty]. 
  Similar to the change in the main PDF interpreter, but shorter 
  because opdfread.ps does not handle vertical writing or CID fonts. 

Also "wy" in "d0"/"d1" is already 0 (see patch for pdfwrite), so 
it needs no adjustment.
Comment 10 leonardo 2007-08-26 23:53:35 UTC
Adding mpsuzuki to the CC list since he works on a similar problem.
Comment 11 leonardo 2007-08-27 00:05:27 UTC
Adding Ken to the CC list since this bug affects the modules he works on.
Comment 12 leonardo 2007-08-27 02:15:27 UTC
The patch 3314 introduces the following code :

+	gs_matrix *pmat = &pdfont->u.simple.s.type3.FontMatrix;
+
+	pwidths->Width.xy.x *= pmat->xx; /* formula simplified based on wy in glyph space == 0 */
+	pwidths->Width.xy.y  = 0.0; /* WMode == 0 for PDF Type 3 fonts */
+	gs_distance_transform(pwidths->real_width.xy.x, pwidths->real_width.xy.y, pmat, &pwidths->real_width.xy);


It is equivalent to pwidths->Width.xy.y = 0; pwidths->Width.xy.x = 
FontMatrix.xx ^ 2 .

This code is wrong for 2 reasons :

1. It completely misses the character width from setcachedevice.
2. The resulting unit is (text_scale/design_scale)^2 rather than inches.

The patch causes a minor raster difference with prfmm.pdf page 2. I'm not sure 
whether it is important. Please provide an analysis.

Next time please supply patches with a log message in the form "digest, details, 
expected differences" (see 
http://ghostscript.com/pipermail/gs-cvs/2007-August/007739.html).
Comment 13 leonardo 2007-08-27 02:22:43 UTC
Oops, sorry, I misread the patch, so my comment 12 is partially wrong. The 
statement "It completely misses the character width from setcachedevice." 
has to be removed.

However the patch is still incorrect, because it multiplies by 
FontMatrix.xx twice.
Comment 14 SaGS 2007-08-27 02:47:30 UTC
The first 2 assignments are equivalent to a gs_distance_transform() of 
pwidths->*Width* followed by nuking pwidths->Width.xy.y per 5.3.3 "Text Space 
Details" I mentioned. The last line concerns pwidths->*real*_width. I don't 
see how the patch multiplies by FontMatrix.xx^2.

(For the "formula simplified": If FontMatrix = [a b c d tx ty], the advance 
vector is (wx, wy) in glyph space and (wx', wy') in text space, then 
wx' = a*wx + c*wy = a*wx, because in the PDF metrics wy is always 0.)
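
The point can be checked with a small standalone C sketch that mirrors the
three statements of the patch on simplified structures. Everything here is a
hypothetical stand-in for illustration (matrix_t, point_t, widths_t,
distance_transform, apply_type3_font_matrix); the real code operates on
Ghostscript's own types and calls gs_distance_transform().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in types for illustration; not Ghostscript's own. */
typedef struct { double xx, xy, yx, yy, tx, ty; } matrix_t; /* [a b c d tx ty] */
typedef struct { double x, y; } point_t;
typedef struct { point_t Width; point_t real_width; } widths_t;

/* Transform a distance by the matrix, ignoring its translation part. */
void distance_transform(double dx, double dy, const matrix_t *m, point_t *out)
{
    assert(m != NULL && out != NULL);
    out->x = dx * m->xx + dy * m->yx;
    out->y = dx * m->xy + dy * m->yy;
}

/*
 * Same shape as the patch: Width and real_width are separate fields, and
 * each one is taken through the font matrix exactly once.
 */
void apply_type3_font_matrix(widths_t *w, const matrix_t *fm)
{
    assert(w != NULL && fm != NULL);

    /* wx' = a*wx + c*wy reduces to a*wx because wy == 0 in the PDF metrics. */
    w->Width.x *= fm->xx;
    /* WMode 0: the vertical component is dropped in text space. */
    w->Width.y = 0.0;
    /* real_width keeps the full transformed vector. */
    distance_transform(w->real_width.x, w->real_width.y, fm, &w->real_width);
}
```

With a made-up FontMatrix [0.5 0.25 0 0.5 0 0] and both vectors starting at
(8, 0), Width becomes (4, 0) and real_width becomes (4, 2). Neither value is
scaled by xx a second time.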
Comment 15 leonardo 2007-08-29 18:45:26 UTC
Regarding Comment #14 :

> I don't see how does the patch multiply by FontMatrix.xx^2.

Well I do :

+	gs_matrix *pmat = &pdfont->u.simple.s.type3.FontMatrix;
+
+	pwidths->Width.xy.x *= pmat->xx; /* formula simplified based on wy in glyph space == 0 */

The first multiplication by pmat->xx is in the line above.

+	pwidths->Width.xy.y  = 0.0; /* WMode == 0 for PDF Type 3 fonts */
+	gs_distance_transform(pwidths->real_width.xy.x, pwidths->real_width.xy.y, pmat, &pwidths->real_width.xy);

The second multiplication by pmat->xx is in the line above.

Thus the result dimension (measure units) is wrong.
I guess the test didn't detect it due to FontMatrix.xx == 1.


Comment 16 leonardo 2007-08-29 18:47:17 UTC
Oops, sorry, comment #15 is wrong - I mixed up Width and real_width. Now I accept 
the patch.
Comment 17 leonardo 2007-08-29 19:38:18 UTC
Patch to HEAD :
http://ghostscript.com/pipermail/gs-cvs/2007-August/007796.html