Bug 692279 - Scaling 1 bit per pixel / component images in DeviceGray Colorspace
Status: RESOLVED WONTFIX
Alias: None
Product: MuPDF
Classification: Unclassified
Component: fitz
Version: unspecified
Hardware: PC Windows 7
Importance: P4 normal
Assignee: Tor Andersson
Reported: 2011-06-13 19:37 UTC by Dan
Modified: 2011-08-29 20:24 UTC

Description Dan 2011-06-13 19:37:34 UTC
I have an application that converts PDF documents to 1 bit per pixel TIFF images so they can be faxed. The documents are rendered in the DeviceGray colorspace, and then an error diffusion algorithm is run on the images to convert them to 1 bpp/bpc black/white.

If the source PDF contains a whole-page image in a black/white 1 bpp/bpc colorspace, it is rendered in the DeviceGray colorspace. If the image is scaled, the result created by MuPDF no longer contains only two levels. This is a problem because the error diffusion algorithm then ruins the document: the pixels contain values between 0 and 255 when they should only contain 0 or 255 (black or white).

Is there a way to tell MuPDF not to change the color of the pixels for 1 bit per pixel images, i.e. to leave all values at 0 or 255 while scaling?

The output ends up looking as if the scaled image was anti-aliased, even though anti-aliasing is disabled with fz_set_aa_level(0).
Comment 1 Dan 2011-06-13 22:11:10 UTC
The problem is in the function fz_transform_pixmap, which calls fz_scale_pixmap.

I simply commented out the calls to scale the pixmaps and everything outputs as I want.  

So I'm not sure why the pixmaps are being scaled to begin with.
Comment 2 Pedro Rivera 2011-06-14 22:54:44 UTC
Dan,

Don't know if this could help. . .

I had a similar issue and resolved it by first creating a grayscale image with all the transformations I needed, then converting that image to black and white (binary) with a threshold, and lastly applying error diffusion or dithering.

CCITT TIFF images are required to be in "bit pack" format. To see example C code for what I am doing, you can download the JMuPDF source code and look at the JMuPDF-JNI source folder. There you will find the source files jr_write_tiff.c and jr_pixmap.c. Look at jr_pix_to_binary(...) or jr_pix_to_black_white(...). Both functions convert a grayscale image to black/white or binary and then apply dithering. I am using the Floyd-Steinberg algorithm.

The source is available @ SourceForge.
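
For readers without the JMuPDF source at hand, the general idea of Floyd-Steinberg dithering into a bit-packed buffer can be sketched as follows. This is not the JMuPDF code: the function name gray_to_packed_fs, the threshold of 128, and the convention that a set bit means black (MSB first) are all assumptions made for illustration, and the right bit convention depends on how your TIFF writer is configured.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of MuPDF or JMuPDF.
 * gray: w*h samples, 0 = black, 255 = white.
 * bits: h rows of (w + 7) / 8 bytes, MSB first; a set bit means black
 *       (an assumed convention).
 * Returns 0 on success, -1 if the error buffers cannot be allocated. */
static int gray_to_packed_fs(const unsigned char *gray, int w, int h,
                             unsigned char *bits)
{
	int stride = (w + 7) / 8;
	/* Error rows are offset by one entry so that x - 1 and x + 1 stay in range. */
	int *cur = calloc((size_t)w + 2, sizeof *cur);
	int *next = calloc((size_t)w + 2, sizeof *next);
	int x, y;

	if (!cur || !next)
	{
		free(cur);
		free(next);
		return -1;
	}
	memset(bits, 0, (size_t)stride * h);

	for (y = 0; y < h; y++)
	{
		int *tmp;
		for (x = 0; x < w; x++)
		{
			int v = gray[(size_t)y * w + x] + cur[x + 1];
			int out = (v >= 128) ? 255 : 0;
			int err = v - out;

			if (out == 0)
				bits[(size_t)y * stride + x / 8] |= (unsigned char)(0x80 >> (x % 8));

			/* Floyd-Steinberg weights: 7/16 right, 3/16 below-left,
			 * 5/16 below, 1/16 below-right. */
			cur[x + 2] += err * 7 / 16;
			next[x] += err * 3 / 16;
			next[x + 1] += err * 5 / 16;
			next[x + 2] += err * 1 / 16;
		}
		/* The row below becomes the current row; clear the other buffer. */
		tmp = cur;
		cur = next;
		next = tmp;
		memset(next, 0, ((size_t)w + 2) * sizeof *next);
	}
	free(cur);
	free(next);
	return 0;
}
```

Samples that are already exactly 0 or 255 produce zero error, so a pure black/white page passes through this step unchanged; only intermediate greys get diffused.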

Hope this helps.

Pedro
Comment 3 Dan 2011-06-15 00:06:53 UTC
Thanks Pedro. I'll check it out. I'm also running into issues with CMYK->RGB where black is almost black, but not 255. So I changed the code to clip any pure-K color (C, M and Y all zero) with K above 96% to black for CMYK->RGB conversions.


cmyk_to_rgb:

	/* Clip near-black pure-K colors to full black. */
	if (c == 0 && m == 0 && y == 0 && k > 0.96)
	{
		rgb[0] = 0;
		rgb[1] = 0;
		rgb[2] = 0;
		return;
	}



Seems like there is a "gray" area for these conversions. Clipping retains black text which is what I'm after.  

Before this change, my error diffusion algorithm (also Floyd-Steinberg) was diffusing white error into the letters of text, which also increased file sizes due to poor compression.
Comment 4 Robin Watts 2011-07-04 17:07:01 UTC
pdfdraw renders PDF files to 8-bit greyscale (or 24-bit RGB, but in this case we're talking about 8-bit greyscale). As such, its job is to give the best representation it can of that page in continuous tone grey.

If we have to scale a 1bpp image, this will inevitably mean generating some intermediate pixels that aren't fully black or white. Not to do so would mean either dropping pixels (in the case where we are scaling down), or producing nasty artifacts (when scaling up). As such for the greyscale case, we are doing exactly the right thing.
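
To make the point about intermediate pixels concrete, here is a minimal sketch of a 2x downscale with a box filter. It is not MuPDF's scaler (which is more general); the function name downscale2x_box and the restriction to even dimensions are assumptions for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper, not MuPDF code.
 * Halve a w x h 8-bit grey image in each direction by averaging every
 * 2x2 block of samples. w and h are assumed to be even.
 * dst must hold (w / 2) * (h / 2) samples. */
static void downscale2x_box(const unsigned char *src, int w, int h,
                            unsigned char *dst)
{
	int x, y;

	for (y = 0; y < h / 2; y++)
	{
		for (x = 0; x < w / 2; x++)
		{
			const unsigned char *p = src + (size_t)(2 * y) * w + 2 * x;
			int sum = p[0] + p[1] + p[w] + p[w + 1];

			/* Rounded average of the four source samples. */
			dst[(size_t)y * (w / 2) + x] = (unsigned char)((sum + 2) / 4);
		}
	}
}
```

A 2x2 block holding one black sample (0) and three white samples (255) averages to 191, so a single black pixel on a white page survives the downscale as light grey rather than disappearing, which is exactly the kind of value that is neither 0 nor 255.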

In the case where it is a black and white image you want, you have a choice; you can choose to constrain the output to pure black and white (in which case you will lose grey areas of the output entirely), or you can render to greyscale and then perform some cleverer black/white conversion; either error diffusion or halftoning for example.

The whole purpose of the latter is to try to carry some representation of the information contained in the grey pixels through to your final image.

If you are finding that the error diffusion you are doing is adversely affecting the readability of 'near black' (or near white) text, then you may want to consider a simple expansion/clamping regime before applying the error diffusion. (i.e. treat anything < 5% as white, anything over 95% as black, and anything in the middle error diffuse as normal).
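
A clamping pass of this kind could look like the sketch below, run on the grey buffer before error diffusion. The function name clamp_extremes and the percent parameter are hypothetical, and the sketch assumes samples where 0 is black and 255 is white; it only snaps the extremes and does not stretch the values in between.

```c
#include <stddef.h>

/* Hypothetical helper, not MuPDF code.
 * gray: n samples, 0 = black, 255 = white (assumed convention).
 * Samples within `percent` of black become 0, samples within `percent`
 * of white become 255, everything else is left for error diffusion. */
static void clamp_extremes(unsigned char *gray, size_t n, int percent)
{
	int lo = 255 * percent / 100; /* 12 for percent = 5 */
	int hi = 255 - lo;            /* 243 for percent = 5 */
	size_t i;

	for (i = 0; i < n; i++)
	{
		if (gray[i] <= lo)
			gray[i] = 0;
		else if (gray[i] >= hi)
			gray[i] = 255;
	}
}
```

With percent set to 5, samples of 12 or less become 0 and samples of 243 or more become 255, so near-black text no longer feeds small errors into the diffusion step.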

None of that immediately has a place in MuPDF however; the existing code only outputs to greyscale, and we are hence doing exactly the right thing.

If we were to add a black/white output mode to MuPDF we might be tempted to offer a mode whereby we use nearest neighbour scaling of images rather than interpolated scaling (at least for upscales), but until then you're probably best off just commenting out the scales in your own code.
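
For anyone replacing the scaling step in their own code, nearest neighbour scaling can be sketched as below. This is not an existing MuPDF function; the name scale_nearest and its signature are assumptions for illustration. Every output sample is copied from exactly one source sample, so an input containing only 0 and 255 stays that way.

```c
#include <stddef.h>

/* Hypothetical helper, not MuPDF code.
 * Scale an sw x sh 8-bit grey image to dw x dh by picking, for each
 * destination sample, the source sample at the proportional position.
 * No new values are created. */
static void scale_nearest(const unsigned char *src, int sw, int sh,
                          unsigned char *dst, int dw, int dh)
{
	int x, y;

	for (y = 0; y < dh; y++)
	{
		int sy = (int)((long long)y * sh / dh);

		for (x = 0; x < dw; x++)
		{
			int sx = (int)((long long)x * sw / dw);

			dst[(size_t)y * dw + x] = src[(size_t)sy * sw + sx];
		}
	}
}
```

When used for downscaling, the same loop simply skips source samples, which is the pixel dropping described above; that is why it is mainly attractive for upscales.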