The attached PDF file, when converted with Ghostscript at 200 dpi, has white lines appearing in areas that should be solid black. This occurs with all the versions of Ghostscript that I tried: head (r9595), 8.64, and 8.54. Neither Adobe Acrobat nor Apple Preview shows these white lines, at any resolution. The command line I'm using for testing: bin/gs -sDEVICE=tiff24nc -r200 -o test.tif ./Coppenrath4.pdf
Created attachment 4866 [details] Coppenrath4.pdf Portion of original PDF file.
Created attachment 4867 [details] Coppenrath.pdf Original, customer supplied, PDF file.
This is yet another occurrence of a PDF that paints by filling an area with one color, then uses an imagemask to paint on top of the rectangle that is filled. In some places, the 'f*' performed by Ghostscript paints white on top of an area previously marked with some black data. Step 1843 paints the top part of the word 'gebucht' (down to the baseline of the text, not including the descender on the 'g') as a result of "/R32 Do", but then the f* performed in step 1847 erases part of the very bottom of the text (to white). When the next image is painted with "/R33 Do" in step 1851, it doesn't connect, leaving a white gap. Ghostscript's and Adobe's 'fill' and 'image' operators are not painting the same areas, which is the same root cause as in some other bugs (bug 690184 and bug 689364). This is not a trivial issue since a change to the painting boundaries can be very sensitive -- avoiding painting too much or too little with 'fill' and with 'image'. Reassigning to myself since this is not strictly an image issue (as was originally reported) and since I've already begun looking into the issue for the other bugs mentioned above. Just another test case.
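To make the paint sequence above concrete, here is a minimal Python sketch of a single pixel column. It is a simplified model, not Ghostscript's code: it assumes a fill paints every pixel row it touches, while an image paints only rows whose centers fall inside its half-open region. The helper names and the coordinates (a 20-row column with a strip boundary at 10.6 device pixels) are made up for illustration and are not taken from the PDF.

```python
import math

WHITE, BLACK = 0, 1

def fill_rows(y0, y1):
    """Rows touched by a fill over the half-open region [y0, y1)."""
    return range(math.floor(y0), math.ceil(y1))

def image_rows(y0, y1):
    """Rows whose centers (n + 0.5) lie inside the half-open region [y0, y1)."""
    return [n for n in range(math.floor(y0), math.ceil(y1)) if y0 <= n + 0.5 < y1]

def render_column(boundary, height=20):
    """Paint image 1, then a white fill, then image 2, split at `boundary`."""
    col = [WHITE] * height
    for n in image_rows(0.0, boundary):     # first image strip (black)
        col[n] = BLACK
    for n in fill_rows(boundary, height):   # f*: white background for the next strip
        col[n] = WHITE
    for n in image_rows(boundary, height):  # second image strip (black)
        col[n] = BLACK
    return col

def gap_rows(col):
    """Rows left white after all three paint steps."""
    return [n for n, v in enumerate(col) if v == WHITE]

print(gap_rows(render_column(10.6)))  # boundary fraction above 0.5
print(gap_rows(render_column(10.4)))  # boundary fraction below 0.5
```

In this model the white fill wipes the last row of the first strip whenever the boundary's fractional part exceeds one half, and the second strip starts one row lower, so a one-pixel gap survives. With a fraction at or below one half the two strips meet.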
Sometimes Ghostscript paints different rectangles with image and rectfill operators. Adobe never does this.
> Sometimes Ghostscript paints different rectangles with image and rectfill operators. Adobe never does this.

When you say Adobe, do you mean Acrobat? Certainly the Red Manual says they should be different: "... The region of device space to be painted by the image operator is determined similarly to that of a filled shape, though not identically. The interpreter transforms the image’s source rectangle into device space and defines a half-open region, just as for fill operations. However, only those pixels whose centers lie within the region are painted. The position of the center of such a pixel—in other words, the point whose coordinate values have fractional parts of one-half—is mapped back into source space to determine how to color the pixel. There is no averaging over the pixel area; if the resolution of the source image is higher than that of device space, some source samples will not be used..."
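The center-sampling rule in the quoted passage can be sketched in one dimension as follows. This is only an illustration of the quoted text, not any interpreter's implementation: `sampled_source_rows` is a hypothetical helper, the image is assumed to map linearly onto the device span [y0, y1), and image orientation is ignored.

```python
import math

def sampled_source_rows(y0, y1, n_src):
    """Map each painted device row to the source row that colors it.

    A device row n is painted only if its center n + 0.5 lies in [y0, y1).
    The center is mapped back to source space and the sample found there is
    used as-is, with no averaging over the pixel area.
    """
    mapping = {}
    for n in range(math.floor(y0), math.ceil(y1)):
        center = n + 0.5
        if y0 <= center < y1:
            t = (center - y0) / (y1 - y0)   # 0 <= t < 1 across the image
            mapping[n] = int(t * n_src)     # source row containing the center
    return mapping

# An 8-row source image squeezed into 4 device rows:
mapping = sampled_source_rows(0.0, 4.0, 8)
unused = sorted(set(range(8)) - set(mapping.values()))
print(mapping)
print(unused)
```

With 8 source rows mapped onto 4 device rows, only every second source row is consulted. That is the "some source samples will not be used" case the manual describes.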
The foibles of copy-paste. fi is the ligature 'fi'
Created attachment 5029 [details] Coppenrath4.tiff Apple Preview does display the white lines. A 200 dpi TIFF generated by Apple Preview is attached, and the lines are clearly visible on screen at all scalings. Acrobat 8.0 also displays the horizontal white lines. Marcos?
I agree that Apple Preview also displays horizontal white lines; however, they are much more subtle than Ghostscript's. It took me a while to notice them in the attached TIFF file.
It's anti-aliased; use -dGraphicsAlphaBits=4 to get a similar result in gs. There are one-pixel-high missing lines in AR, Preview and GS. As far as I can tell this is not a bug.
Created attachment 5041 [details] Coppenrath4.jpg Adobe Acrobat 8.0 rendering at 200 dpi.
Note that GraphicsAlphaBits=4 only works with grayscale devices. With tiffg4, or other 1-bit monochrome devices, the gaps still appear. While the gaps aren't as evident with GraphicsAlphaBits=4 and the tiffgray device, they are still present as 'gray' lines where the fill replaced part of the image. With the version of Adobe Acrobat Pro I have (7), Save to... TIFF only offers 150 or 300 dpi. At 150 dpi Ghostscript's missing lines are even worse than at 200 dpi, and there are no missing lines from Acrobat. Comparison of the images shows that the image with Acrobat is "higher" on the page than with GS. This may point to some image coordinate adjustment that moves it higher than the fill, so that the subsequent fill with white doesn't erase the bottom line of the image. While I agree that we should tell the customer that at 200 dpi Acrobat also produces gaps, I think this needs some more investigation, particularly since this is so prevalent an issue (other open bugs). If we want to close this bug, please mark it as a duplicate of 690184.
Created attachment 5043 [details] Acrobat_7_200_dpi.tif
Created attachment 5045 [details] Coppenrath4_adobe_8.tiff white lines at y coordinates equal to 32, 96, 128, 160, and 192
Created attachment 5046 [details] Coppenrath4_Acrobat_7_200_dpi.tif The smaller image from Acrobat 7
Created attachment 5048 [details] coppenrath4-Acrobat9-200dpi.tif Tiff produced by Acrobat Professional 9 on Windows Vista at 200dpi, CCITT G4 compression, from coppenrath4.pdf
> With the version of Adobe Acrobat Pro I have (7) Save to... TIFF only offers 150 or 300 dpi. At 150 dpi Ghostscript missing lines are even worse than 200 dpi and there are no missing lines from Acrobat. Comparison of the images shows that the image with Acrobat is "higher" on the page than with GS. This may point to some image coordinate adjustment that moves it higher than the fill, so that the subsequent fill with white doesn't erase the bottom line of the image.

We can type in the resolution manually. For Adobe Acrobat Pro 8 and 9 the pixel positions (200 and 300 dpi TIFF) are exactly the same as gs. gs has white lines at 200 dpi and no lines at 300 dpi, just like Acrobat. If we choose to emulate 7 it will necessarily break 8 and 9 compatibility, since the images are translated 1 pixel in 7 relative to 8 and 9. We should derive the theoretical positions of the filled area in this file and see if they match AR 7 or 8 and 9, but I doubt we'd change the code and break 8 and 9 compatibility even if 7 were correct. This problem should be downgraded from a bug to an exercise in the meantime. Bug 690184 has also been commented on; it is likewise a problem on the "fringe" of interest and probably shouldn't be a bug.
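As a starting point for deriving the theoretical positions, here is a small Python sketch of why a gap can show up at 200 dpi and not at 300 dpi. It assumes default user space (1 point = 1/72 inch, no extra scaling) and a simplified model in which a fill paints every pixel it touches and an image paints only pixels whose centers it covers, so a one-pixel gap is left when the fractional part of the device-space boundary exceeds one half. The boundary of 3.888 pt and both function names are hypothetical; the real coordinates would have to be read from the PDF.

```python
import math
from fractions import Fraction

def device_coord(y_pt, dpi):
    """Convert a coordinate in points (1/72 inch) to device pixels, exactly."""
    return Fraction(y_pt) * dpi / 72

def leaves_gap(y_pt, dpi):
    """True if the boundary's fractional part in device space exceeds 1/2."""
    y = device_coord(y_pt, dpi)
    return y - math.floor(y) > Fraction(1, 2)

for dpi in (200, 300):
    y = device_coord("3.888", dpi)
    print(dpi, float(y), leaves_gap("3.888", dpi))
```

For this made-up boundary the device coordinate is 10.8 at 200 dpi (gap) and 16.2 at 300 dpi (no gap). Exact fractions are used so the comparison against one half is not disturbed by floating-point rounding.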
The customer has confirmed that the -dGraphicsAlphaBits=4 is sufficient for them since they can use the 'gray' device and the gray lines do not confuse their OCR. Since AR 8 (and 9) also have white lines when rendering to TIFF at the same resolution as GS, closing this bug as 'INVALID' (a badly written PDF).
In a recent email the customer reports: Now this issue is relevant to us again, because we recognize, that the content of this PDF is only binary and so we will render to binary tiff image. Unfortunately the going around for this issue (-dGraphicsAlphaBits=4) doesn’t work for binary output devices. To recognize, when a PDF contains only information, that can be rendered to a binary image without loss of information, is an important feature for many of our customers. So it isn’t possible to abandon this feature.
The tiffscaled device solves the problem: gs904 -sDEVICE=tiffscaled -r800 -o test.tif -dDownScaleFactor=4 ./Coppenrath4.pdf I've notified the customer about this option and also opened a bug with Adobe for the white lines that appear in their output.
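A rough Python sketch of why rendering at 800 dpi and downscaling by 4 can hide a one-pixel gap: each output row is derived from four high-resolution rows, so a single white row among black ones still comes out mostly black. This is a simplification under stated assumptions. A plain 50% threshold stands in for the error diffusion that, as I understand it, the real device performs, and `downscale_rows` is a hypothetical helper, not part of Ghostscript.

```python
def downscale_rows(rows, factor=4, threshold=0.5):
    """Average each group of `factor` rows (1 = black, 0 = white), then threshold.

    Trailing rows that do not fill a complete group are dropped.
    """
    out = []
    for i in range(0, len(rows) - factor + 1, factor):
        coverage = sum(rows[i:i + factor]) / factor   # fraction of black rows
        out.append(1 if coverage >= threshold else 0)
    return out

hi_res = [1] * 8
hi_res[5] = 0                       # a one-pixel white gap at high resolution
print(downscale_rows(hi_res))       # the gap is absorbed

wide_gap = [1] * 4 + [0] * 4        # a gap as tall as one output pixel
print(downscale_rows(wide_gap))     # a genuine white row survives
```

A gap one high-resolution pixel tall contributes only a quarter of an output pixel, so it disappears after thresholding, while white areas at least one output pixel tall are preserved.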