Bug 688395 - colors lost when converting PDF to TIFF
Summary: colors lost when converting PDF to TIFF
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Color
Version: 8.53
Hardware: PC Linux
Importance: P2 enhancement
Assignee: Robin Watts
URL:
Keywords: bountiable
Depends on:
Blocks:
 
Reported: 2005-11-23 08:25 UTC by artifex
Modified: 2012-04-12 17:13 UTC

See Also:
Customer: 870
Word Size: ---


Description artifex 2005-11-23 08:25:24 UTC
When viewing the attached PDF-file with Ghostscript, all colors are displayed. 
When converting it with
gs -sDEVICE=tiff24nc -sOutputFile=gs.tif -f colorloss.pdf
all colors are displayed as well. When a resolution is specified, some colors
are lost:
gs -sDEVICE=tiff24nc -r100 -sOutputFile=gs.100.tif -f colorloss.pdf
Some boxes which have been filled yellow and green are now white.
Comment 1 artifex 2005-11-23 08:29:30 UTC
Created attachment 1797
sample for losing colors
Comment 2 Alex Cherepanov 2005-11-23 15:12:45 UTC
This is a PDF 1.3 file that simulates a highlighting effect by printing
checkerboard image masks on top of the main image. The result depends on the
image sampling algorithm and the rasterization resolution. For instance,
at 600 dpi the results are quite good.

Probably the best fix for this problem is alpha transparency, available
in PDF 1.5 and higher and supported by Ghostscript. The sample PDF file was
generated from PostScript source by Ghostscript. It may be possible to
use the Idiom Recognition feature to generate a better PDF file.

Please attach a PostScript file that was used to generate the sample
PDF file. This will help us to check the Idiom Recognition approach to the
problem.
Comment 3 artifex 2005-11-24 06:07:51 UTC
I will try to get the file. But it might have appeared just temporarily from an 
automated process converting Office-documents to PDF.
Comment 4 Alex Cherepanov 2005-11-24 10:21:08 UTC
You don't need to search for the original PS file. Any other PS file that
has the same problem is OK.
Comment 5 artifex 2005-12-01 02:00:11 UTC
I am still waiting to get a PostScript file that shows the problem. Our
customer has sent me the Excel sheet from which the PostScript file was
generated; that PostScript file was then used to create the PDF with
Ghostscript. But when I use my own MS Excel 2002 SP-2 to create the
PostScript file, it shows a different pattern to simulate the transparency.
This pattern is more irregular and does not show the problem of losing color
when converting to PDF and then converting the PDF to TIFF.
So this problem may also be solved by choosing an appropriate version of Excel.
But I will continue to try to get a sample PostScript file that shows the
problem.
Comment 6 artifex 2005-12-01 02:53:16 UTC
Created attachment 1816
sample for NOT losing color

Sample PostScript-file generated with a different type of EXCEL that does NOT
show the loss of colors.
Comment 7 artifex 2005-12-05 04:38:03 UTC
Created attachment 1824
PostScript sample for losing colors

Sample PostScript file that shows the loss of colors. To reproduce, convert the
PostScript file to PDF using the pdfwrite device. Then convert the PDF to TIFF
using the tiff24nc device at 100 dpi. On page 8 you see that the colors are lost.
Comment 8 Alex Cherepanov 2005-12-05 07:49:49 UTC
The following fragment, included in the PostScript stream as a PassThrough
escape, renders differently depending on the resolution and on the alignment of
device pixels relative to the image pixels.

%%BeginDocument: Pscript_Win_PassThrough
userdict /GpPBeg {gsave initclip 13 dict begin /c {curveto} bind def /l {lineto}
bind def /m {moveto} bind def /NP {newpath} bind def /CP {closepath} bind def
/SC {setrgbcolor} bind def /S {stroke} bind def /F {fill} bind def
/AF {eofill} bind def 1 eq {setdash setlinewidth setmiterlimit setlinejoin
setlinecap} if} put
/GSE {end grestore} def
0 GpPBeg
NP
657 1511 m 828 1511 l 828 1614 l 657 1614 l CP
eoclip
656 1510 translate 0 0.7968 0.9960 setrgbcolor
/ds
<aaaa5555aaaa5555aaaa5555aaaa5555aaaa5555aaaa5555aaaa5555aaaa5555>
 def
0 16 105 {0 16 173 {1 index gsave translate 16 16 scale 16 16 true [16 0 0 -16 0
16] ds imagemask grestore} for pop} for
GSE
%%EndDocument

This fragment attempts to simulate semi-transparent objects by rendering a
checkerboard pattern. The PDF format supports true transparency and doesn't need
this hack. Unfortunately, like many other poorly written PostScript programs,
this one is not bound. So the idiom recognition approach is not possible.
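
The dependence on resolution and alignment can be illustrated with a small model. The sketch below is only a simplified assumption of how a renderer might sample the mask (one sample at the centre of each device pixel); it is not Ghostscript's actual image code, and the names mask_bit and coverage are made up for this illustration. The mask is the 16x16 checkerboard from the ds string above, whose rows alternate between aaaa and 5555.

```python
def mask_bit(x, y):
    """Bit of the 16x16 checkerboard mask: rows alternate 0xAAAA and 0x5555."""
    row = 0xAAAA if y % 2 == 0 else 0x5555
    # Bits are read most significant first, as imagemask does.
    return (row >> (15 - x % 16)) & 1


def coverage(scale, offset=0.0, size=64):
    """Fraction of device pixels painted under centre-point sampling.

    scale is device pixels per mask pixel; offset shifts the device grid
    horizontally, measured in mask pixels. Both are model assumptions.
    """
    painted = 0
    for j in range(size):
        for i in range(size):
            sx = int((i + 0.5) / scale + offset)  # mask column hit by pixel centre
            sy = int((j + 0.5) / scale)           # mask row hit by pixel centre
            painted += mask_bit(sx, sy)
    return painted / (size * size)


print(coverage(1.0))              # 0.5: every other pixel painted, as intended
print(coverage(0.5))              # 1.0: every sample lands on a set bit
print(coverage(0.5, offset=1.0))  # 0.0: every sample lands on a clear bit
```

In this model, halving the resolution makes every sample land on mask pixels of the same parity, so the area becomes either solid or empty depending on a one-pixel shift. That is the kind of all-or-nothing result seen in the white boxes.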

Ghostscript can associate procedures with DSC comments. It is probably possible
to attach a procedure to the "%%BeginDocument:" comment
and look for "Pscript_Win_PassThrough" and "userdict /GpPBeg {gsave initclip 13
dict begin /c {curveto} bind def ..."
 
Comment 9 artifex 2005-12-06 01:56:22 UTC
We can do two things. We can attach our own routines to the PostScript driver;
this driver is one we have built from the sources that are available from Microsoft.
And we can try to associate procedures with DSC comments. I assume you mean the
user parameter ProcessDSCComment. Do you have a sample for that?
Comment 10 Ray Johnston 2006-02-02 09:52:43 UTC
For an example of ProcessDSCComment, please see the file lib/gs_icc.ps

This defines a procedure .ProcessICCcomment then installs it by merging it
with the existing DSC comment procedures. The merge is important to keep
other uses of DSC comment processing working correctly, so please use code
like this.
Comment 11 Timothy Osborn 2007-02-12 12:49:29 UTC
This looks to be a design flaw in our downsampling algorithm that is being
applied to this imagemask. It is not tied to PDF output specifically; it also
occurs when going directly to TIFF. Reassigning to Marcos per Henry Stiles.
Comment 12 Ray Johnston 2007-04-12 12:28:08 UTC
Adjusting priority to reflect that this is a customer bug. 
Comment 13 artifex 2007-07-13 06:52:34 UTC
Created attachment 3188
Color lost, transparency lost

This is a sample PDF file that shows the problem of losing color and losing
transparency. On page 3 there are two yellow, transparent circles. If the file
is converted with -sDEVICE=tiff24nc -r300, the left circle is no longer
transparent. The right circle disappears, showing just the background. When
zooming into the PDF file, you see that the yellow transparency is simulated by
a checkerboard pattern. This is a file we recently got from our customer.
Comment 14 Ray Johnston 2007-07-13 09:49:36 UTC
Created attachment 3189
688395-4s.png

Not just page 3, but almost every page thereafter exhibits the problem.

Not expecting much, I tried GraphicsAlphaBits=4 and as expected it didn't help.


I noticed that running this to 200 dpi, then using ImageMagick 'convert' to
scale down by 50% gives something decent (i.e., no missing colors). I used:

   gs -r200 -sDEVICE=png16m -o 688395-%d.png 688395.pdf
then
   convert -scale 50% 688395-4.png 688395-4s.png

--------

I guess to be able to do this in a single step, we'd have to render at
a higher resolution (some integer scale factor greater than the end goal),
then apply an averaging downsample to the end goal resolution.

Definitely an enhancement, but this could be done as a graphics library
feature -- a generalized device parameter (e.g., /RenderSampleFactor <int>)
that would increase the internal rendering resolution and automatically
do the averaging prior to returning 'bits'. This would only work for devices
that have 'linear and separable' gx_color_index values since the averaging
would need to operate on components independently.
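
As a rough illustration of the averaging step, here is a minimal sketch of a box-filter downsample that treats each colour component independently. It is not Ghostscript code: the function name box_downsample, the list-of-tuples image layout, and the 8-bit yellow value are all assumptions chosen for the example.

```python
def box_downsample(pixels, factor):
    """Average each factor x factor block of an RGB image, per component.

    pixels is a list of rows, each row a list of (r, g, b) tuples. Both
    dimensions are assumed to be exact multiples of factor.
    """
    height, width = len(pixels), len(pixels[0])
    out = []
    for by in range(0, height, factor):
        row = []
        for bx in range(0, width, factor):
            sums = [0, 0, 0]
            for y in range(by, by + factor):
                for x in range(bx, bx + factor):
                    for c in range(3):
                        sums[c] += pixels[y][x][c]
            # Integer average of each component over the block.
            row.append(tuple(s // (factor * factor) for s in sums))
        out.append(row)
    return out


YELLOW = (255, 255, 0)    # illustrative 8-bit value, not taken from the file
WHITE = (255, 255, 255)

# A 4x4 yellow/white checkerboard standing in for the high-resolution render.
checker = [[YELLOW if (x + y) % 2 == 0 else WHITE for x in range(4)]
           for y in range(4)]

small = box_downsample(checker, 2)
print(small)  # every output pixel is (255, 255, 127), a uniform pale yellow
```

Because each component is summed and divided on its own, the checkerboard collapses to an even tint instead of dropping out, which is why the color encoding has to be separable for this to work.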

Could developing this be made bountiable ?
Comment 15 artifex 2007-07-15 23:36:55 UTC
Comment on attachment 3189
688395-4s.png

Not for public use
Comment 16 Marcos H. Woehrmann 2008-02-26 09:49:25 UTC
Reassign to Ralph who owns color.
Comment 17 Ralph Giles 2008-05-20 16:07:14 UTC
This is an unfortunate issue, but handling input like this is definitely an
enhancement; we don't expect to resolve this soon. Marking as bountiable.

Another idea is to resample patterns with alpha and composite them over.
Comment 18 Henry Stiles 2011-06-01 17:25:21 UTC
Assigning to Robin Watts to see if "tiffscaled" is a potential solution to this problem.
Comment 19 Robin Watts 2011-06-06 14:38:08 UTC
From quickly rereading the bug history, it sounds like these 'transparent' areas are being generated by drawing a checkerboard pattern over certain areas. This is a risky business as it relies on the file generator's assumptions as to exactly what pixels will be set by a given rendering operation. This is tricky to predict at best, and made intractably hard by having to support a range of resolutions. Presumably these files have been 'designed' to work with just one resolution and one known renderer.

This is a fault in the PostScript file, and (as can be seen from the preceding comments) is not really something we can hope to cope with in all generality within Ghostscript.

It is possible that the new tiffscaled device may offer some help here, however.

If you have a desired output resolution where the default rendering gives the "wrong" result, then by selecting a multiple of that resolution, and using the tiffscaled device's built-in downscaler, we can probably find a resolution where it does work.

For instance:

 gs -r100 -sDEVICE=tiff24nc -o out.tif 191723000000.pdf

gives the wrong result.

 gs -r200 -sDEVICE=tiffscaled24 -dDownScaleFactor=2 -o out.tif 191723000000.pdf

however, will render at 200dpi, then scale down to 100dpi, and will give the correct result.

Sadly, it's not enough to assume that all multiples will work:

 gs -r300 -sDEVICE=tiffscaled24 -dDownScaleFactor=3 -o out.tif 191723000000.pdf

doesn't work, but 400/4, 500/5 and 600/6 all do.

Does this count as enough of a workaround to be able to close this bug?
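
For an automated conversion, the command lines above could be assembled by a small helper like the one sketched here. The function name tiffscaled_command is hypothetical; the only Ghostscript options it uses are the ones already shown in this comment (-r, -sDEVICE=tiffscaled24, -dDownScaleFactor and -o). Which factor gives a correct result still has to be found by trial, as the 300/3 case shows.

```python
def tiffscaled_command(target_dpi, factor, pdf_path, tiff_path):
    """Build a gs argument list that renders at target_dpi * factor and
    lets tiffscaled24 scale the page back down by the same factor."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [
        "gs",
        f"-r{target_dpi * factor}",       # internal rendering resolution
        "-sDEVICE=tiffscaled24",
        f"-dDownScaleFactor={factor}",    # output ends up at target_dpi
        "-o", tiff_path,
        pdf_path,
    ]


cmd = tiffscaled_command(100, 2, "191723000000.pdf", "out.tif")
print(" ".join(cmd))
# gs -r200 -sDEVICE=tiffscaled24 -dDownScaleFactor=2 -o out.tif 191723000000.pdf
```

The list form can be passed directly to subprocess.run without going through a shell.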
Comment 20 Robin Watts 2011-06-08 14:08:25 UTC
Closing, as we have a workaround. If you have additional questions, or reasons why the workaround is unacceptable, please reopen the bug. Thanks.
Comment 21 artifex 2011-06-08 15:58:55 UTC
We are using the conversion in an automated process, so it is difficult to decide which files need the special handling and which do not. We will try the workaround.