Bug 690733

Summary: Rotated PDFs create bad pxlcolor output
Product: Ghostscript Reporter: artifex
Component: Printer DriverAssignee: Henry Stiles <henry.stiles>
Status: NOTIFIED FIXED    
Severity: normal CC: htl10, marcos.woehrmann
Priority: P2 Keywords: bountiable
Version: 8.64   
Hardware: PC   
OS: Windows XP   
Customer: 870 Word Size: ---
Attachments: rotated_public.pdf
Rotated_Mono_Tiger.pdf
first cut of a possible patch
additional patch to previous to limit effect to isotropic +90/-90 rotation
2nd cut of a patch
made-up new test file, with 90deg rotation + 1-D stretch
-180 rotationed image test file
4 pdf , zip'ed
3rd version of a patch

Description artifex 2009-08-27 06:14:44 UTC
Attached are two PDF-files which have the same content, but are using a
different rotation. Converting the unrotated PDF with GhostScript and
-sDEVICE=pxlcolor gives a small (380 KB) PCL6-file that produces a printable result.
Converting the rotated PDF-file gives a large (17 MB) PCL6-file which produces a
white page on the printer.
Comment 1 artifex 2009-08-27 06:15:32 UTC
Created attachment 5337 [details]
Unrotated PDF

met.inflt.pdf: unrotated PDF that can be converted with GhostScript
Comment 2 artifex 2009-08-27 06:16:56 UTC
Created attachment 5338 [details]
Rotated PDF gives bad result

Attached rotated PDF that gives bad result with pxlcolor device.
Comment 3 artifex 2009-09-28 02:43:46 UTC
Any information up to now ? Our customer can't print these rotated documents.
Comment 4 Alex Cherepanov 2009-09-28 21:25:00 UTC
The file size depends whether the image is represented as a high level object
or reduced to millions of rectangles. The rectangles, probably, overflow
the printer memory.

The decision about high vs. low level image is done in gdevph.c:1462

/* Currently we only handle portrait transformations. */
if (mat.xx <= 0 || mat.xy != 0 || mat.yx != 0 || mat.yy <= 0 || ... )
    goto use_default;

Comment 5 Henry Stiles 2009-09-29 21:12:16 UTC
We have determined the pxl's copy_mono procedure is the source of the problem, a
temporary workaround would be to replace pclxl_copy_mono with NULL near
gdevpx.c:136 until we've had a chance to look at this more carefully.
Comment 6 Henry Stiles 2009-10-01 11:58:30 UTC
The workaround in comment #5 results in 7.6M file that prints correctly.  The
comment in #2 is inaccurate (my mistake) the output generated is incorrect and
would not print correctly with any amount of memory.  We do recommend the
workaround, at least for now.
Comment 7 Hin-Tak Leung 2009-10-01 13:58:23 UTC
As a wider situation - what paper (NDA?) do I need to sign to have a look at
attachments marked as private?
Comment 8 artifex 2009-10-02 05:16:01 UTC
We added the attachments as type private, because we received the file from our
customer only for the purpose to solve the problem through Artifex. It has to be
treated confidential. 
Comment 9 artifex 2009-10-02 05:34:55 UTC
Created attachment 5438 [details]
rotated_public.pdf

We have passed the patch to our customer and are waiting for results. The
sample file gets a correct PCL6-output of 8 MB. But we have also seen that
other files with similar rotation and similar content now get an output size of
70 MB. Which is still a correct PCL6-file and which can be printed. But it
takes very long. See the enclosed attachment rotated_public.pdf.
Comment 10 Henry Stiles 2009-10-02 08:32:42 UTC
Yes, I am aware that other files sizes will be affected adversely, but the
temporary workaround gives correct results in many cases where it would be
wrong.  I thought that would be of interest until we have a chance to fix the
problem.

Also, we do have outside consultants who sign Non Disclosure Agreements with
whom we share private attachments.  If that is not acceptable to you please let
us know so we can setup a security arrangement that suits you.
Comment 11 Henry Stiles 2009-10-02 08:39:31 UTC
As I look at these simple files we should probably create public equivalents,
anyway.
Comment 12 Hin-Tak Leung 2009-10-02 18:50:50 UTC
FWIW, I have worked on various parts of ghostscript in the capacity of external
consultant from time to time in the past, just that the occasion hasn't arisen
where access to private attachments is needed. I have no problem with NDA and it
probably should have been done ages ago.
Comment 13 Henry Stiles 2009-10-04 09:40:05 UTC
Created attachment 5440 [details]
Rotated_Mono_Tiger.pdf

Public analog of the original reported problem.  What is needed is to support
orthogonal non portrait images (see the comments in the code
gdevpx.c:pclxl_begin_image) and to fix fallback to copy mono which produces
incorrect output.
Comment 14 Henry Stiles 2009-10-04 10:01:57 UTC
> FWIW, I have worked on various parts of ghostscript in the capacity of external
> consultant from time to time in the past, just that the occasion hasn't arisen
> where access to private attachments is needed. I have no problem with NDA and it
> probably should have been done ages ago.

The NDA should go out Monday, if you are interested in fixing this problem
please let me know.  You can start now with the public examples provided.
Comment 15 Hin-Tak Leung 2009-10-15 00:49:13 UTC
Created attachment 5488 [details]
first cut of a possible patch

Here is what I have come up with for the time being, as a first cut.

It now *try* to deal with orthogonal non-portrait images - the patch is still
missing some code for 180 degree rotation, and non-isotropic +90/-90 rotation;
but isotropic +90/-90 degree rotation now work. (and 180 degree rotation and
non-isotropic +90/-90 rotation will be broken by the patch) - I suppose I can
either modify the patch to *only* deal with isotropic +90/-90 and let the other
cases fall back to the old default, or I'll need to make up some 180-degree and
some non-isotropic +90/-90 test cases (i.e. +90/-90 plus stretching in only one
direction). 

I'll continue on this, but if somebody wants to comment on the patch and/or
supply those test cases, please do.

The patch does the correct thing for attachment 5438 [details], attachment 5440 [details], and
attachment 4422 [details] (they are all isotropic +90/-90 rotations). Apology I have sent
the NDA back yet, so I can't test against the private attachments yet.
Comment 16 Hin-Tak Leung 2009-10-15 01:22:28 UTC
Created attachment 5490 [details]
additional patch to previous to limit effect to isotropic +90/-90 rotation

This is an additional patch on top of the previous (attachment 5488 [details]) to limit
the previous effect to only deal with isotropic +90/-90 rotations. i.e. with
this additional patch, -180 and [+90/-90 plus 1-D stretch] will go back to the
default of inefficient-but-correct bahavior.

I'd prefer to implement case -180 and case [+90/-90 plus 1-D stretch] properly
also (pending creation or availability of suitable test files); but the
combination of these two patches should be sufficiently useful and should cause
no regressions.
Comment 17 Hin-Tak Leung 2009-10-15 01:26:42 UTC
Forgot to mention is a little details in the last patch - it treats scaling in
x, y which differ by less than 1 part in 1000 as equal-enough.
Comment 18 Hin-Tak Leung 2009-10-15 05:50:05 UTC
Created attachment 5491 [details]
2nd cut of a patch

Realised an oversight - the test condition in the previous patches let sheared
images through. Hence this update.

This update optimises portrait & isotropic +90/-90 , and leave the other cases
to default. will try to make some -180 & non-isotropic +90/-90 (which should be
optimizable) & some sheared-image test caess(which shouldn't be optimized).
Comment 19 Hin-Tak Leung 2009-10-15 07:12:52 UTC
Created attachment 5492 [details]
made-up new test file, with 90deg rotation + 1-D stretch 

modified from attachement 4422 (of bug 690069) so that the background image is
both 90-deg rotated and squashed in one dimension - for testing 3rd cut later.
(the 2nd cut should deal wit this with the default/current-generate-large-file
way, I think)
Comment 20 Hin-Tak Leung 2009-10-15 07:30:32 UTC
Created attachment 5493 [details]
-180 rotationed image test file

another made-up test file, with -180 rotation (made from the mono-tiger).
Comment 21 Hin-Tak Leung 2009-10-15 07:48:35 UTC
Created attachment 5494 [details]
4 pdf , zip'ed

4 pdf zip'ed, made from the golden date ps and rotated through 0,90,180,270.
Comment 22 Hin-Tak Leung 2009-10-16 05:21:21 UTC
I have another thought on this - originally my design was based on the fact that
+90/-90 rotation and horizontal/vertical scaling PCL XL instructions exists (and
only those, not arbitrary rotations), so a high-level graphic object which comes
with its own rotated/scaled transfer matrix can have its transfer matrix partly
sent to the printer and let some of the rendering be done in the printer rather
than within ghostscript. (i.e. send part matrix, part rendered data, reverse
matrix). I have started the PCL XL scaling code (bit of which was in the 1st
cut). I have changed my mind - it is simplier to only send the rotation as PCL
XL instructions and let the scaling done in ghostscript same as it did. The code
is simplier, and also graphics objects aren't rendered twice (partly in
ghostscript and partly by the printer).

Will be doing 3rd cut soon.
Comment 23 Henry Stiles 2009-10-16 15:57:18 UTC
We really need a test plan for this device (pxlcolor/mono), I copied in Marcos,
he might want to weigh in.  We do have complete testing of the ppm device so
(requires imagemagick):

# gs -> pxl
gs -r600 -sDEVICE=pxlcolor -sOutputFile=tiger.pxl ../examples/tiger.eps 

# gs -> ppm
gs -r600 -sDEVICE=ppmraw -sOutputFile=tiger.ppm ../examples/tiger.eps 

# pxl -> ppm
pcl6 -sDEVICE=ppmraw -sOutputFile=tigerfrompxl.ppm tiger.pxl

# trim output for pcl 1/6" border.

convert -shave 100x100 tiger.ppm tiger_trim.ppm
convert -shave 100x100 tigerfrompxl.ppm tigerfrompxl_trim.ppm

compare tiger_trim.ppm tigererfrompxl_trim.ppm diff.ppm

Then we can quickly review diff.ppm and check for anything more than pixel delta
differences.  Other thoughts on testing this device?
Comment 24 Ray Johnston 2009-10-16 18:08:53 UTC
I think that the consensus is that the 'pbm/pgm/ppm/pkm' class of devices are
adequate to cover the mono/gray/rgb_contone/rgb_cmyk devices.

As discussed elsewhere, adding the 'tiffsep' or some other device that supports
encoded colors and spot color overprinting would be useful.

The other class of devices are "high level" devices. These are unique in that
they use the 'gdevvec' mode. Included are pdfwrite (already part of regression
testing), ps2write (being discussed), pswrite (mostly deprecated), pxl***
(the topic of this thread) and a few other fledgling devices such as svg and
cairo.

Any devices which support spot colors, or have their own implementation of
high level functions or colors including 'fill_path', 'stroke_path' and
'fill_rectangle_hl_color' as well as custom 'devn' functions need to be
considered.
Comment 25 Henry Stiles 2009-10-16 19:48:12 UTC
I guess comment #23 wasn't clear.  We have fair confidence ppm driver output is
reasonably correct and little confidence pxlcolor output is correct.  So I
suggested a semi automatic means of comparing pxl output with ppm raster, to
avoid visual comparisons.  I was asking if others had ideas how we can test
pxlcolor without visually inspecting each page.
Comment 26 Hin-Tak Leung 2009-10-16 21:08:08 UTC
About the the testing part, using ghostpdl/pcl6 is what I do - but then we have
a slightly dubious thought of "what if ghostpdl is wrong"... in any case, so far
I have found that ghostpdl/pcl6 doesn't honour papersize/media instructions, but
that's probably partly intentional (output should use user-specified dimensions,
not job-specified dimensions) and partly a bug or lack-of-function.
Comment 27 Henry Stiles 2009-10-16 21:31:03 UTC
Okay, I think that is the only way to do the testing in reasonable time.  We are
interested in any pxl bugs you find, please open a new bug if you find something
questionable.
Comment 28 Hin-Tak Leung 2009-10-17 04:58:11 UTC
Created attachment 5502 [details]
3rd version of a patch

3rd, and hopefully final version of a patch.

Tested as in comment 23 against the available public pdf's (attachment 5440 [details],
5438, 5492, 5493, 5494, and 4422); in all the cases there are some 1-pixel
movements, and for texts, gray-level differences (i.e. graphic objects are off
by 1-pixel and black becomes a little more/less black), comparing two-stage
pxl->ghostpcl->ppm with gs->ppm.

I have sent the NDA back by post a few days ago.
The media size issue mentioned in comment 26 is filed as bug 690827.
Comment 29 Hin-Tak Leung 2009-10-17 05:07:47 UTC
I have thought it might be nice to also represent mirrored/reflected image as
high-level objects as well, but I don't have a suitable input file (yet); also
tried to see if can get away with a simplier 90 deg rotation for both +90/-90
(one of them will be inverted) and rotate back but this approach seems to have
bad interactions with clipping elsewhere, and images are not drawed if rotated
in the wrong direction.
Comment 30 Hin-Tak Leung 2009-10-17 05:43:27 UTC
Hmm, I think one can't do reflected/mirrored image as high-level object, nor
draw images in -ve direction ATM  (they are discarded by pclxl_image_write_rows():
if (dw <= 0 || dh <= 0) return 0;) . So patch#3 should be as best it could be, I
believe. 
Comment 31 Ray Johnston 2009-10-17 10:14:06 UTC
Following up on regression / automated examination of PXL, for pdfwrite we
simply use Ghostscript to render to raster and compare rasters (original ppm
vs. ppm from pdf that came from pdfwrite).

As Hin-tak mentions, I see no reason not to use GhostPDL (pcl6) to render the
ppm similarly to what we do for pdfwrite. If there are page size issues, I
guess we would have to "crop to intersection" or the like to get ppm's we could
compare (ImageMagick 'convert' can crop).
Comment 32 Hin-Tak Leung 2009-10-18 02:11:26 UTC
Tested attachment 5338 [details] (the file for which this bug was filed). With the patch 
the pclxl output is 385k byte which is as the patch intended. 

Testing as Henry outlined in comment 23 is a little tricky - ppm output
dimension is 4958x7017 and pxl->ppm output is 4960x7014. At -r600, if scale both
to 4956x7014, things are about 2-pixel off (and a little gray-level difference,
probably just numerical difference between two-stage and one-stage, and the
1/6-inch PCL border).

The difference dimension seems to come from ghostpdl (round to closest multiple
of 8). Automatic testing will need it properly clipped (probably bottom/right
edge) and trimmed the 100x100 pcl border.
Comment 33 Henry Stiles 2009-10-18 21:05:24 UTC
The XL paper sizes are given in gdevpxen.h in 300 dpi units as per the HP
manual.  Interesting that in Acrobat the width is 4959 (ripping this file to a
600 dpi raster file).  We should be able to do much of the testing with letter
size paper where PS and PCL get the same device geometry, but it seems like all
the languages should get the same results for page dimensions.
Comment 34 Hin-Tak Leung 2009-10-20 12:38:52 UTC
Attachment 5502 [details] committed as svn r10193:
http://ghostscript.com/pipermail/gs-cvs/2009-October/009919.html
Comment 35 Marcos H. Woehrmann 2011-09-18 21:47:26 UTC
Changing customer bugs that have been resolved more than a year ago to closed.