Bug 690521 - Text clarity
Summary: Text clarity
Status: NOTIFIED WORKSFORME
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Text
Version: 8.63
Hardware: Other Linux
Importance: P2 normal
Assignee: Ray Johnston
URL:
Keywords:
Depends on:
Blocks: 690640
 
Reported: 2009-06-08 05:39 UTC by Robin Capone
Modified: 2011-09-18 21:47 UTC
0 users

See Also:
Customer: 200
Word Size: ---


Attachments
test.pdf (184.08 KB, application/pdf)
2009-06-08 06:59 UTC, Robin Capone
test.tif (209.48 KB, image/tiff)
2009-06-08 06:59 UTC, Robin Capone
acrobat.png (68.41 KB, image/png)
2009-06-18 10:12 UTC, Marcos H. Woehrmann

Description Robin Capone 2009-06-08 05:39:55 UTC
I'm having an issue with text clarity for a client
after converting a pdf to tif.  All text in the 
document is good except for text that is a part
of a graph or a medical image.
Thanks,
Robin Capone
Comment 1 Alex Cherepanov 2009-06-08 06:23:44 UTC
Please attach your PDF and TIFF files. You can mark the files private to
restrict the access to a small group of Artifex employees and contractors.
Comment 2 Ray Johnston 2009-06-08 06:53:35 UTC
Also, please specify the command line you are using for Ghostscript so we
don't have to guess what resolution (-r___) and type of TIFF (-sDEVICE=___) and
whether or not you are using -dGraphicsAlphaBits=___ and/or -dTextAlphaBits=___
and/or -dDOINTERPOLATE (all of which affect the output appearance).

I suspect that the text that doesn't look 'clear' is in the original PDF as an
image.
Comment 3 Robin Capone 2009-06-08 06:59:08 UTC
Created attachment 5078
test.pdf
Comment 4 Robin Capone 2009-06-08 06:59:56 UTC
Created attachment 5079
test.tif

In the 1st image, refer to the words Lymphocytes and 
Monocytes.

In the 3rd image, refer to the numbers on the graph.
Comment 5 Robin Capone 2009-06-08 07:15:44 UTC
Here is the command used to convert the file: 

gs -r200 -sDEVICE=tiffg4 -o text.tif -Ilib stocht.ps -c "{2.8 exp } settransfer
<< /HalftoneMode 1 >> setuserparams " -sPAPERSIZE=letter -dQUIET -dNOPAUSE  -f
text.pdf -c quit

I am not aware of using -dGraphicsAlphaBits=___ and/or -dTextAlphaBits=___
and/or -dDOINTERPOLATE.
Comment 6 Ken Sharp 2009-06-08 07:59:18 UTC
The areas in question are indeed images. There is no 'text' in them, only
an arrangement of pixels which reproduces letters. In addition, the original
images appear to be JPEG compressed, which leads to the usual JPEG artefacts.
You can see this in the original PDF if viewed at high resolution in Acrobat.

When rendered at low resolution, and 200 dpi is quite low, the image has to be
'scaled down', which means losing some pixels. Also, the page has been converted
from colour to monochrome, so the images are converted from (probably) RGB to
grey scale, and then converted to monochrome by dithering (using a stochastic
method as specified in the command line).

The result of all of this is that some of the almost invisible artefacts from
the JPEG are rendered as black pixels and some of the pixels making up the
letters of the text are lost.

For me adding -dDOINTERPOLATE actually makes the problem worse, and
-dGraphicsAlphaBits=4 makes a very slight improvement. Ray or Alex may have
other suggestions, but it looks to me like the problem is simply conversion of a
colour image to a black and white image at a substantially lower resolution.
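
The colour-to-monochrome chain described above can be sketched numerically. This is only an illustration, not Ghostscript's rendering code. It assumes the RGB-to-grey weights given in the PostScript Language Reference (0.3, 0.59, 0.11), applies the {2.8 exp} transfer function from the command line in comment 5, and treats the dither as ideal, so the share of black pixels is one minus the transferred grey level. The sample pixel values are hypothetical.

```python
def rgb_to_gray(r, g, b):
    """Convert RGB in [0, 1] to grey (assumed weights: 0.3, 0.59, 0.11)."""
    return 0.3 * r + 0.59 * g + 0.11 * b


def gamma_transfer(gray, exponent=2.8):
    """Python equivalent of the PostScript transfer function {2.8 exp}."""
    return gray ** exponent


def black_fraction(gray):
    """Expected share of black pixels from an ideal dither (0 = black, 1 = white)."""
    return 1.0 - gray


# Hypothetical sample pixels, not measured from the attached PDF.
samples = {
    "clean white background": (1.0, 1.0, 1.0),
    "faint JPEG artefact": (0.95, 0.95, 0.95),
    "soft edge of a letter": (0.60, 0.60, 0.60),
}

for name, rgb in samples.items():
    gray = rgb_to_gray(*rgb)
    out = gamma_transfer(gray)
    print(f"{name}: grey {gray:.2f} -> {out:.3f}, "
          f"about {black_fraction(out):.0%} black pixels")
```

Under these assumptions, a nearly invisible artefact at 95% white is darkened by the gamma step enough that a noticeable share of its pixels dither to black, which matches the speckle seen around the letters.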
Comment 7 Marcos H. Woehrmann 2009-06-08 09:17:38 UTC
Using a different settransfer function significantly improves the text but screws up the images:

  << /Install { { 0.85 gt { 1 } { 0 } ifelse } settransfer } >> setpagedevice

As Ken pointed out the text is actually rendered as pixels in the original PDF file, so there isn't much that 
can be done to the text without affecting the images.
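
As a rough illustration of why this helps the text but hurts the images, the threshold transfer function can be mirrored in Python. This is a sketch of the arithmetic only; the sample grey levels are hypothetical.

```python
def threshold_transfer(gray, cutoff=0.85):
    """Python equivalent of { 0.85 gt { 1 } { 0 } ifelse }.

    Grey levels run from 0 (black) to 1 (white).
    """
    return 1.0 if gray > cutoff else 0.0


# Hypothetical grey levels: a faint artefact, an image mid-tone, a dark stroke.
for gray in (0.95, 0.50, 0.10):
    print(f"grey {gray:.2f} -> {threshold_transfer(gray):.0f}")
```

Everything lighter than the cutoff becomes pure white, which removes faint artefacts, but every mid-tone at or below the cutoff becomes pure black, so continuous-tone images lose all their shading.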
Comment 8 Alex Cherepanov 2009-06-08 14:55:26 UTC
A combined transfer function clears the light gray colors but still does
the same gamma correction for the remaining colors.

  {dup 0.85 gt {pop 1}{2.8 exp} ifelse } settransfer
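
For comparison, here is a Python sketch of the same combined function. It only mirrors the arithmetic of the PostScript above; the sample grey levels are hypothetical.

```python
def combined_transfer(gray, cutoff=0.85, exponent=2.8):
    """Python equivalent of {dup 0.85 gt {pop 1}{2.8 exp} ifelse}.

    Grey levels run from 0 (black) to 1 (white). Values above the cutoff
    are forced to white; everything else keeps the gamma correction.
    """
    return 1.0 if gray > cutoff else gray ** exponent


# Hypothetical grey levels on both sides of the cutoff.
for gray in (0.95, 0.86, 0.85, 0.50, 0.10):
    print(f"grey {gray:.2f} -> {combined_transfer(gray):.3f}")
```

Note that the curve is discontinuous at the cutoff: a grey of 0.85 maps to roughly 0.63, while anything just above it maps to 1, so smooth gradients that cross 0.85 may show a visible step.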

A more sophisticated approach would use a CIEBasedABC color space
that lightens light achromatic colors and leaves the rest intact.
Comment 9 Robin Capone 2009-06-10 08:44:06 UTC
In what file is CIEBasedABC used?
Comment 10 Alex Cherepanov 2009-06-11 14:24:37 UTC
PostScript 3 can process the /DeviceGray, /DeviceRGB, and /DeviceCMYK color
spaces as device-independent color spaces, using DefaultGray, DefaultRGB, and
DefaultCMYK color spaces of type CIEBasedA, CIEBasedABC, and CIEBasedDEFG,
respectively.

In Ghostscript, this mode can be activated with the -dUseCIEColor flag. With
careful design of the DefaultRGB color space, light achromatic colors can be rendered
white without major disturbance to other colors.
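
The kind of mapping such a color space would aim for can be sketched in Python. This is not a DefaultRGB or CIEBasedABC definition, only an illustration of the target behaviour; the function name and both thresholds (min_level, max_spread) are hypothetical.

```python
def lighten_light_achromatic(r, g, b, min_level=0.85, max_spread=0.05):
    """Map light, nearly neutral colours to white; pass everything else through.

    min_level and max_spread are illustrative assumptions, not values
    taken from Ghostscript or from this bug report.
    """
    is_light = min(r, g, b) > min_level
    is_neutral = max(r, g, b) - min(r, g, b) < max_spread
    if is_light and is_neutral:
        return (1.0, 1.0, 1.0)
    return (r, g, b)


# Hypothetical samples: a faint grey artefact, a light pink, a mid grey.
for rgb in ((0.92, 0.93, 0.91), (0.95, 0.60, 0.60), (0.50, 0.50, 0.50)):
    print(rgb, "->", lighten_light_achromatic(*rgb))
```

Unlike a transfer function applied to the grey value alone, this test looks at all three components, so light but saturated colours in the images are left untouched.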
Comment 11 Robin Capone 2009-06-12 08:02:31 UTC
Using -dUseCIEColor did not help.   
Comment 12 Ray Johnston 2009-06-13 14:08:58 UTC
-dUseCIEColor will _only_ help with the use of a CRD that maps lighter colors
to black/darker colors, or with 'Default***' colorspace definitions which do
the same thing. This is the 'careful design' that Alex mentions.
Comment 13 Ray Johnston 2009-06-18 10:08:04 UTC
Assign to me to generate output from Adobe
Comment 14 Marcos H. Woehrmann 2009-06-18 10:12:07 UTC
Created attachment 5125
acrobat.png

200 DPI monochrome PNG file generated by Adobe Acrobat 9.
Comment 15 Ray Johnston 2009-06-18 10:30:14 UTC
After comparing the quality of the Acrobat monochrome output with the
Ghostscript output, the clarity of the text in the images is VERY close,
so I am closing this as 'WORKSFORME'.
Comment 16 Robin Capone 2009-06-23 11:07:41 UTC
The documents that are being converted to tif 
for my client are lab results of medical tests.
So I just want to make sure that we're getting
the best quality possible and that you are 
certain that there is nothing else I can do
to clarify the images.  Thanks for your help.
Robin
Comment 17 Marcos H. Woehrmann 2011-09-18 21:47:42 UTC
Changing customer bugs that have been resolved more than a year ago to closed.