The customer reports, and I've verified, that the attached PostScript files generate large PCL-XL files when converted by Ghostscript. There are two issues here:

The file rotate.ps is rotated 90 degrees; this sends blocks to gdevpx.c that are tall and skinny, and the compression fails.

The file image.ps contains a continuous-tone image that doesn't compress well with RLE compression.

Note that there was a third issue, bug 689732, a bug in gdevpx.c that caused it to always revert to uncompressed images; that has been fixed. I mention it because it affects the rotate.ps issue: previously gdevpx.c called s_RLE_process() with last set to false, which should allow compression to continue from one raster line to the next and, if it worked, would solve the rotate.ps problem. But calling s_RLE_process() with last set to false is apparently broken, because it fails to compress any image, so my hack was to set it to true. This causes each raster line to be compressed independently, so it now works for non-rotated images (see the bug 689732 description for more details).

The command line I'm using:

bin/gs -sDEVICE=pxlmono -o test.pxl ./rotate.ps
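To illustrate why tall, skinny blocks defeat per-line RLE, here is a minimal sketch. It assumes a PackBits-style scheme (a control byte followed by either literals or one repeated byte); the function names are made up for this example and this is not the gdevpx.c code.

```python
def rle_encode(data: bytes) -> bytes:
    """PackBits-style RLE (assumed scheme, for illustration only).

    Control byte 0..127   : copy the next (control + 1) bytes literally.
    Control byte 129..255 : repeat the next byte (257 - control) times.
    """
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (capped at 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(257 - run)
            out.append(data[i])
            i += run
        else:
            # Collect literals until a run starts or 128 bytes are gathered.
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and data[i + 1] == data[i])):
                i += 1
            out.append(i - start - 1)
            out += data[start:i]
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode, used here only to check the sketch."""
    out = bytearray()
    i = 0
    while i < len(data):
        control = data[i]
        i += 1
        if control < 128:
            out += data[i:i + control + 1]
            i += control + 1
        elif control > 128:
            out += bytes([data[i]]) * (257 - control)
            i += 1
    return bytes(out)


# A block 1 byte wide and 1000 rows tall, every byte 0xFF.
rows = [b"\xff"] * 1000

# Each row compressed on its own: one control byte plus one literal per row.
per_row_total = sum(len(rle_encode(r)) for r in rows)

# The same bytes compressed as one continuous stream.
continuous_total = len(rle_encode(b"".join(rows)))

print(per_row_total, continuous_total)
```

In this toy case the per-row output is twice the size of the raw data, while the continuous stream shrinks to a handful of run records, which is the effect that last set to false was meant to provide.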
Created attachment 4391 [details] rotate.ps
Created attachment 4392 [details] image.ps
Created attachment 4422 [details] golden_gate.ps Another file that we do a poor job converting to pxlmono. In this case we generate many small rectangles.
Make bountiable and copy in Hin Tak. It was suggested we try delta row compression here instead of RLE. JPEG is also possible but we note modern hp drivers aren't producing JPEG.
Yes, delta row should work much better for attachment 4422 [details] than RLE - I think there is already some delta row compression code in ghostscript somewhere (maybe in the tiff* driver?). It should be relatively straightforward.
Attachment 4422 [details] now works well with the patch attachment 5488 [details] (bug 690733), and outputs 222k instead of 3MB; so implementing DeltaRow compression probably isn't necessary. I have started looking at DeltaRow compression - there are two slight issues with it: it requires PCL XL 2.1 (it is not in 2.0/1.1), and it is all-or-nothing, i.e. one cannot mix-and-match with RLE/no compression, according to the PXL spec. I don't know if any printer is sensitive to the PCL XL version, but it probably needs to be done as a user option. Still want to implement DeltaRow compression?
Argh, I knew I saw it somewhere - the PCL XL spec refers to DeltaRow compression - but the algorithm itself is called mode 3/mode 9 in the PCL (5) spec, and is implemented in base/gdevpcl.c. (In fact RLE in PCL XL is mode 2 in PCL, I think, but it is 're-done'.)
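For readers unfamiliar with delta row compression, a rough sketch follows. It is loosely modelled on my understanding of PCL mode 3 (a command byte holding a replacement count and an offset, followed by the replacement bytes); the exact byte layout is an assumption and should be checked against the PCL spec and base/gdevpcl.c. The function names are hypothetical.

```python
def delta_row_encode(row: bytes, seed: bytes) -> bytes:
    """Encode `row` as a list of changes relative to `seed` (the previous row).

    Assumed layout: command byte = (count - 1) << 5 | offset, where offset is
    relative to the end of the previous change; an offset field of 31 is
    extended by extra bytes, each 255 meaning "another byte follows".
    """
    out = bytearray()
    pos = 0  # position just after the last replaced byte
    i, n = 0, len(row)
    while i < n:
        if row[i] == seed[i]:
            i += 1
            continue
        # Span of differing bytes, at most 8 per command.
        j = i
        while j < n and j - i < 8 and row[j] != seed[j]:
            j += 1
        offset, count = i - pos, j - i
        if offset < 31:
            out.append(((count - 1) << 5) | offset)
        else:
            out.append(((count - 1) << 5) | 31)
            offset -= 31
            while offset >= 255:
                out.append(255)
                offset -= 255
            out.append(offset)
        out += row[i:j]
        pos = i = j
    return bytes(out)


def delta_row_decode(delta: bytes, seed: bytes) -> bytes:
    """Apply the changes in `delta` to a copy of `seed`."""
    row = bytearray(seed)
    pos = 0
    k = 0
    while k < len(delta):
        cmd = delta[k]
        k += 1
        count = (cmd >> 5) + 1
        offset = cmd & 31
        if offset == 31:
            while True:
                extra = delta[k]
                k += 1
                offset += extra
                if extra != 255:
                    break
        pos += offset
        row[pos:pos + count] = delta[k:k + count]
        k += count
        pos += count
    return bytes(row)


seed = bytes(200)                 # previous row: 200 zero bytes
changed = bytearray(seed)
changed[100] = 0x7F               # one byte differs from the row above
delta = delta_row_encode(bytes(changed), seed)
same = delta_row_encode(seed, seed)
print(len(delta), len(same))
```

A row identical to the one above it costs nothing in this model, and a single changed byte costs a few bytes, which is why images whose rows repeat vertically are the natural fit.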
With the rotation patch (attachment 5502 [details]) there are some dramatic improvements -

output from attachment 4391 [details]: down 388x: 261131850 -> 672883 (261MB -> 0.7MB)
output from attachment 4392 [details]: down 5x: 63348733 -> 12411437 (63MB -> 12MB)

There is little to gain from the first case, but the 2nd case can gain from DeltaRow compression, I think.

With either patched or unpatched ghostscript, the resulting pxl code from 4392 seems to be broken:

-------------
Warning interpreter exited with error code -986
Flushing to end of job
End of page 2, press <enter> to continue.
PCL XL error
Subsystem: KERNEL
Error: ExtraData
Operator: ReadImage
Position: 4078747
file position of error = 63348724
Position: 959
file position of error = 12411428
End of page 3, press <enter> to continue.
---------------

In both cases, the problem is 9 bytes from the end of the pxl data, so it looks like ghostscript is silently generating wrong pxl code, which is worrying.
A little amendment to the last comment - those figures were for pxlcolor. For pxlmono, here are the file sizes:

261131542 / 671351 , (388x)
47028769 / 4292701 , (11x)

With pxlmono, pcl6 reads the outputs of both patched and unpatched ghostscript alright, and they look to be the same; and both have visual artefacts. It looks like some of the images on page two may have transparency masks, which generate broken pxl streams in pxlcolor and result in visual artefacts with pxlmono.
The ExtraData ReadImage issue in Comment 8 is bug 688320 .
With the rotation patch, DeltaRow compression seems to do worse than RLE:

attachment 4391 [details]: 672883 671351 739965 738433
attachment 4392 [details]: 12411437 4292701 12410256 5876405
attachment 4422 [details]: (222k?) 609409 226669

Without the rotation patch, it does give a 2x on 4391:

color      mono
261131850  261131542
136898647  136875859
63348733   47028769
63747550   48359062

So DeltaRow compression is rarely worth it. The size of 5.8MB vs 4.3MB (RLE) is somewhat interesting - RLE has visual artifacts, and DeltaRow causes a segmentation fault (new bug, 690844). I guess I'll throw in jpeg compression as well, to see how it pans out.
Created attachment 5541 [details] deltarow compression patch

Replaces RLE with DeltaRow. It does not improve beyond the rotation patch at all; it is probably just for completeness. It should not be applied unless/until it is implemented as an option.
Created attachment 5552 [details] updated patch for DeltaRow compression

This adds a new option, -dCompressMode={1,3}, to optionally use DeltaRow compression, and also updates the documentation. Compared to the previous patch, it corrects two issues - it was compressing some additional garbage added to the end of the bitmask, and 1-pixel-high images should not be deltarow-compressed. Now the result is more sensible - it gives a 12% advantage over RLE for the largest case and 5% for the 2nd largest:

attach  b/c       b/m      a/c       a/m
4391    672883    671351   739965    738433
4392    12411467  4292701  10915582  4100753
4422    792273    222455   609409    226669

Some additional logic may benefit: e.g. switching to RLE for h=2 also expands slightly for color (11431902) but gains further for mono (3939899) for 4392.

Without the rotation patch, it does give a 2x on 4391 and is marginal for 4392:

261131850  261131542  136760082  136759774
63348733   47028769   62241388   46599270

So it looks like deltarow compression benefits mostly intermediate-size rasters.
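The percentages quoted above can be re-derived from the table. The sketch below assumes that the b/ columns are the RLE sizes and the a/ columns the DeltaRow sizes, with c and m standing for pxlcolor and pxlmono; that reading is inferred from the surrounding text, not stated in the table itself.

```python
# Sizes in bytes copied from the table above, as (RLE, DeltaRow).
# The column interpretation is an assumption (see the note above).
SIZES = {
    ("4391", "color"): (672883, 739965),
    ("4391", "mono"): (671351, 738433),
    ("4392", "color"): (12411467, 10915582),
    ("4392", "mono"): (4292701, 4100753),
    ("4422", "color"): (792273, 609409),
    ("4422", "mono"): (222455, 226669),
}


def saving_percent(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before` (negative if larger)."""
    return 100.0 * (before - after) / before


for (attachment, device), (rle, deltarow) in SIZES.items():
    change = saving_percent(rle, deltarow)
    print(f"{attachment} {device:5s} {rle:>9d} -> {deltarow:>9d}  {change:+.1f}%")
```

Under that reading, DeltaRow helps for 4392 and for 4422 in color, and is slightly worse for 4391 and for 4422 in mono.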
Created attachment 5554 [details] made-up pdf which compresses poorly with RLE but well with DeltaRow.

A made-up pdf (consisting mainly of a horizontal gradient), which should compress well with delta row but not RLE.

gs -sDEVICE=pxlcolor -o a.pxl output.pdf

with or without -dCompressMode=3 gives 13591962 -> 859422 (16x), as expected. (The ratio isn't higher than 16x because the image is rendered in strips of 67 pixels, so another poorly compressed initial row is sent for every 67 pixels.)
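The effect of the strip height can be sketched with an idealized model. The row width, the gradient pattern and the cost formulas below are all assumptions chosen for illustration (PackBits-style RLE, and delta row costed as one command byte per span of up to 8 changed bytes, with a zero seed row at the start of each strip); the model does not attempt to reproduce the 16x measured above.

```python
STRIP_ROWS = 67     # strip height mentioned in the comment above
WIDTH = 1024        # bytes per row (made-up value)
HEIGHT = 670        # total rows, i.e. 10 strips (made-up value)


def rle_size(row: bytes) -> int:
    """Size of a PackBits-style encoding of one row (assumed scheme)."""
    size, i, n = 0, 0, len(row)
    while i < n:
        run = 1
        while i + run < n and run < 128 and row[i + run] == row[i]:
            run += 1
        if run >= 2:
            size += 2                      # control byte + repeated byte
            i += run
        else:
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and row[i + 1] == row[i])):
                i += 1
            size += 1 + (i - start)        # control byte + literals
    return size


def delta_size(row: bytes, seed: bytes) -> int:
    """Simplified delta row cost: 1 command byte per span of <= 8 changed bytes."""
    size, i, n = 0, 0, len(row)
    while i < n:
        if row[i] == seed[i]:
            i += 1
            continue
        j = i
        while j < n and j - i < 8 and row[j] != seed[j]:
            j += 1
        size += 1 + (j - i)
        i = j
    return size


# Horizontal gradient: every row is identical, no two neighbours in a row match.
gradient_row = bytes(x % 256 for x in range(WIDTH))
zero_row = bytes(WIDTH)

rle_total = HEIGHT * rle_size(gradient_row)

# Delta row with the seed reset to zeros at the top of every strip.
delta_strips = 0
for y in range(HEIGHT):
    seed = zero_row if y % STRIP_ROWS == 0 else gradient_row
    delta_strips += delta_size(gradient_row, seed)

# Delta row if the whole image were sent as a single block.
delta_whole = delta_size(gradient_row, zero_row)

print(rle_total, delta_strips, delta_whole)
print(round(rle_total / delta_strips, 1))
```

In this model only the first row of each strip costs anything under delta row, so the gain over RLE stays below the strip height however tall the image is; sending the image as one block would remove that cap.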
Patch committed as r10232 . RLE is still the default. -dCompressMode=3 to activate new code.
Patch committed; the functionality is optional.
Changing customer bugs that have been resolved more than a year ago to closed.