Created attachment 7121 [details]
This file, when rendered by GS, generates a huge PCL file

The following command renders a 460k PDF file into an 89M PCL file:

    cat original_page3.pdf | gs -q -dBATCH -dPARANOIDSAFER -dNOPAUSE -sDEVICE=pxlmono -sDEVICE=pxlcolor -r600x600 -sPAPERSIZE=letter -sOutputFile=- - > rip.pcl
This is somewhat related to bug 690867 - the file has an image with an unusual color space. Before Michael's recent ICC work, some such images were embedded as-is and stripped of their colorspace info (small but wrong, like bug 690867), or rendered correctly but with a large file size. With Michael's recent work, they are mostly rendered correctly but with a large size, I think.

--------------------
15 0 obj
[ /Indexed 16 0 R 255 17 0 R ]
endobj
16 0 obj
[ /DeviceN [ /Black /PANTONE#20354#20U ] /DeviceCMYK 18 0 R 19 0 R ]
endobj
19 0 obj
<</Subtype /NChannel /Process 20 0 R /Colorants 21 0 R >>
endobj
21 0 obj
<</PANTONE#20354#20U 22 0 R >>
endobj
22 0 obj
[ /Separation /PANTONE#20354#20U /DeviceCMYK <</Range [ 0
--------------

This is on my TODO list, but I will not get to it any time soon, so others please feel free to have a go. A hint would be http://bugs.ghostscript.com/show_bug.cgi?id=690867#c5 and to utilize Michael's linked ICC transforms.

BTW, 8.71 seems to have some JPX-related problem and won't read the input file. Is the large output file size with 9.x?
This file is likely not a good bountiable candidate. We hope to see some improvement with planned changes for the image code to emit more compact device calls. Note that two devices are specified on the command line in the original problem report.
May be a duplicate of bug 692329. Please check.
Filed additional Bug 692531 for JPX-related issue. The file has an image of unusual colorspace within a transparency group scaled up 6x in both dimensions. More compact device calls aren't going to be of much help unless those use PCL/PXL ROPs to deal with transparency. Hooking up the icc transform plus tuning the compression gives 17MB though (about the 6x6 estimate), which is still a lot better than 89MB.
Implemented compression mode 2 (jpeg) - and the output is 1.75MB. That's probably agreeable as the original was JPX. I'll tidy up and fix the remaining problems and post soon.
Created attachment 10746 [details]
pxl JPEG compression mode enhancement patch.

This patch adds Compression Mode 2 (jpeg) and updates the documentation, etc. 1=RLE and 3=DeltaRow are the existing modes. Those two can compress arbitrary depth/colorspace, but jpeg is only suitable for full color images (gray 8-bit is possible, but the pxl driver cannot tell 8-bit gray from 8-bit indexed, so it isn't done). So whereas the other two apply wholesale, this one still reverts to RLE for unsuitable images/bitmaps/masks.

Cluster tested. It is an optional new feature, so no difference is expected. I'll post some detailed numbers.
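The fallback described above can be sketched roughly as follows. This is only an illustration in Python, not the driver's C code: the function name `choose_compression`, its parameters, and the "24-bit means full color" test are assumptions made for the example. Only the mode numbers come from this comment.

```python
# Hypothetical sketch of the fallback rule described in the comment above;
# not the actual pxl driver code.
RLE, JPEG, DELTA_ROW = 1, 2, 3  # mode numbers as given in the comment


def choose_compression(requested_mode: int, bits_per_pixel: int,
                       is_mask: bool = False) -> int:
    """Return the compression mode actually used for one image."""
    if requested_mode != JPEG:
        # RLE and DeltaRow handle any depth/colorspace, so they apply wholesale.
        return requested_mode
    # Assumption: only 24-bit full-color images count as suitable for JPEG.
    # 8-bit data is skipped because gray and indexed cannot be told apart.
    if bits_per_pixel == 24 and not is_mask:
        return JPEG
    # Anything else silently falls back to RLE.
    return RLE
```

With this rule, a request for mode 2 on an 8-bit image yields mode 1, which would match output that does not change in size when mode 2 is requested.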
pxlcolor sizes:
  100624709  default (mode 1)
    1751773  mode 2
   17298082  mode 3

pxlmono sizes:
    7330837  default (mode 1)
    7330837  mode 2
    4872400  mode 3

The original is largely a JPX image of 441651 bytes at 875 x 1125 (roughly 100 dpi). Rendering to r600 and then recompressing with jpeg to about 4x the size seems reasonable.

It looks like for pxlmono, requests to compress to jpeg silently switch back to RLE. I'll take a look; it may or may not be a flaw - gray 8-bit and indexed 8-bit aren't distinguished in some parts of the pxl imaging code. The latter isn't appropriate for jpeg compression, so requests to compress to jpeg are silently ignored for both.
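For anyone wanting to check the arithmetic, here is a small Python sketch using only the byte counts quoted in this report. Reading the "6x6 estimate" as the original image size multiplied by 36 is an assumption about what was meant, not something stated explicitly.

```python
# Figures copied from the comments in this report.
original_jpx = 441_651  # bytes, JPX image at roughly 100 dpi
pxlcolor = {1: 100_624_709, 2: 1_751_773, 3: 17_298_082}  # bytes per mode

# 100 dpi -> 600 dpi is 6x in each dimension, i.e. 36x the pixels.
scale = 6 * 6
# Assumption: the "6x6 estimate" is simply the original size times 36.
naive_estimate = original_jpx * scale


def ratio(a: int, b: int) -> float:
    """Plain size ratio a / b."""
    return a / b


print(f"6x6 estimate:      {naive_estimate} bytes")
print(f"mode 3 / estimate: {ratio(pxlcolor[3], naive_estimate):.2f}")
print(f"mode 2 / original: {ratio(pxlcolor[2], original_jpx):.2f}")
print(f"mode 1 / mode 2:   {ratio(pxlcolor[1], pxlcolor[2]):.1f}")
```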
Patches were committed:

commit 4b44b41c9b6c4a7e5ebf03b6970f9be39548443b
Author: Hin-Tak Leung <hintak@ghostscript.com>
Date:   Wed Mar 12 15:03:58 2014 +0000

    Implements PCL XL Compression Mode 2 (JPEG), and updated documentation and other support files. Bug 694282.

commit 41ab485d48890ecadc3d5f74657b644f9d1a8d7f
Author: Hin-Tak Leung <hintak@ghostscript.com>
Date:   Wed Mar 12 15:16:05 2014 +0000

    pxlmono/pxlcolor: Transform deep (24-bit) images with an ICC transform to emit high-level images. Bug 690867.

There was a regression with the latter, which was fixed by 8ae4ee220766aa180150eafeffe4f094f1354f92 (Bug 695103). People thinking of back-porting need all three.

The new functionalities of compression mode 2 and ICC-transformed deep images are optional. A patch to remove the -diccTransform option and some newly discovered issues are tracked in bug 695124.
Applying all patches mentioned in comment #8, including the one from bug #695124, does not give any size advantage when running

    cat original_page3.pdf | gs -q -dBATCH -dPARANOIDSAFER -dNOPAUSE -sDEVICE=pxlcolor -r600x600 -sPAPERSIZE=letter -sOutputFile=- - > rip.pcl

Only adding -dCompressMode=2 gives the mentioned size advantage for the JPEG compression mode, regardless of whether the ICC transform patches are applied or not. So for the sample file the ICC transform patches seem irrelevant concerning size.

Now my questions:

1. Can one make "-dCompressMode=2" the default? Or does it have any disadvantages? Could you supply a patch to make this the default?

2. Can you attach a sample file where the ICC transform patches reduce the output size significantly?

3. If the ICC transform patches give a size advantage for a certain file, does this happen only with pxlcolor or also with pxlmono?
(In reply to comment #9)
> 1. Can one make "-dCompressMode=2" the default? Or does it have any
> disadvantages? Could you supply a patch to make this the default?

Compression mode 2 and compression mode 3 were introduced in PXL 2.0 and 2.1 respectively. Though the 2.1 spec is nearly 14 years old now, I think we can only assume that genuine HP printers implement the full spec. Ricoh, etc. are free not to, I guess, and may implement only PCL XL 1.1 features and still claim to "support PCL XL". So you probably should enable them for non-HP printers only on a case-by-case basis.

Also, jpeg compression is lossy and should only be used for large images and where losing some detail is acceptable, or where the input is jpeg/jpx, which is already lossy in the first place.

> 2. Can you attach a sample file where the ICC transform patches reduce the
> output size significantly?

I know of at least two files (mentioned in Bug 690867, for which this change was actually written), and T2CharString.pdf (in comparefiles/, mentioned in 695124). Both of them are part of the private_ regression suite. You should be able to get them if you can run the regression suite, I think, but they are private, as you know... However, you can probably make one up from the information in http://bugs.ghostscript.com/show_bug.cgi?id=690867#c3 - i.e. by putting "/Intent /RelativeColorimetric" in the dictionary of an image in a pdf file.

> 3. If the ICC transform patches give a size advantage for a certain file,
> does this happen only with pxlcolor or also with pxlmono?

The icc patch is for both. The jpeg patch is currently limited to color images, and silently goes back to RLE for "unsuitable" images. I haven't figured out how to distinguish 8-bit indexed color images from 8-bit gray ones, and do gray jpeg properly.
(In reply to comment #9)
> 2. Can you attach a sample file where the ICC transform patches reduce the
> output size significantly?

One way to tell which pdfs would benefit is by looking at "mutool info ...". The two files I mentioned show:

Images (6):
    1 ( 8 0 R): [ DCT ] 285x164 8bpc ICC (14 0 R)
    1 ( 8 0 R): [ DCT ] 285x184 8bpc ICC (15 0 R)
    1 ( 8 0 R): [ DCT ] 63x94 8bpc ICC (16 0 R)

and

Images (4):
    1 ( 80 0 R): [ DCT ] 323x375 8bpc Lab (123 0 R)

(note the "ICC" and "Lab" part), whereas others which are not affected tend to show DevCMYK/DevRGB/DevGray.
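The check above could be automated along these lines. This is a minimal Python sketch that scans saved text in the layout quoted in this comment; the regular expression and the function name `affected_images` are assumptions for illustration, and the real mutool output may differ in whitespace or contain additional fields.

```python
import re

# Colorspace names that, per the comment above, mark a file likely to benefit.
AFFECTED = {"ICC", "Lab"}

# Matches lines such as: "1 ( 8 0 R): [ DCT ] 285x164 8bpc ICC (14 0 R)"
# (layout assumed from the excerpt quoted in this comment).
IMAGE_LINE = re.compile(
    r"\[\s*(?P<filter>[^\]]*?)\s*\]\s+"
    r"(?P<w>\d+)x(?P<h>\d+)\s+(?P<bpc>\d+)bpc\s+(?P<cs>\S+)"
)


def affected_images(info_text: str):
    """Return (width, height, colorspace) for images using ICC or Lab."""
    hits = []
    for line in info_text.splitlines():
        m = IMAGE_LINE.search(line)
        if m and m.group("cs") in AFFECTED:
            hits.append((int(m.group("w")), int(m.group("h")), m.group("cs")))
    return hits
```

A file for which this returns an empty list (only DevCMYK/DevRGB/DevGray images) would, by the reasoning above, not be expected to shrink with the ICC transform patches.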
(In reply to comment #9)
> 2. Can you attach a sample file where the ICC transform patches reduce the
> output size significantly?

Stupid me... there is a public file, http://svn.ghostscript.com/ghostscript/tests/pdf/icc_rendering_intent.pdf , the file which Michael uses for his talks on ICC link transforms!

size
   2371499  head pxlcolor -diccTransform
  15670504  gs 9.14 pxlcolor
    783259  head pxlmono -diccTransform
  11641096  gs 9.14 pxlmono

That's 6x for pxlcolor, and 14x for pxlmono.
(In reply to comment #12)
> stupid me... there is a public file
> http://svn.ghostscript.com/ghostscript/tests/pdf/icc_rendering_intent.pdf ,
> the file which Michael uses for his talks on ICC link transform!
>
> size
> 2371499 head pxlcolor -diccTransform
> 15670504 gs 9.14 pxlcolor
> 783259 head pxlmono -diccTransform
> 11641096 gs 9.14 pxlmono
>
> that's 6x for pxlcolor, and 14x for pxlmono.

Just for completeness:

    223269  head pxlcolor -diccTransform -dCompressMode=2
    153819  head pxlmono -diccTransform -dCompressMode=2

The original is 1278534 bytes, with 4 jpeg images of 150k each (so about 600k of images; the rest is font data, etc). The jpeg images in the pxl output are about 50k each, smaller than the original because applying extreme rendering intents means losing details.
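To put the figures side by side, a short Python sketch can compute each reduction factor relative to the gs 9.14 output. The byte counts are the ones quoted in these comments; the dictionary layout and the function name `reduction_factors` are only illustrative.

```python
# Output sizes in bytes, copied from the comments above.
sizes = {
    "pxlcolor": {
        "gs 9.14": 15_670_504,
        "head -diccTransform": 2_371_499,
        "head -diccTransform -dCompressMode=2": 223_269,
    },
    "pxlmono": {
        "gs 9.14": 11_641_096,
        "head -diccTransform": 783_259,
        "head -diccTransform -dCompressMode=2": 153_819,
    },
}


def reduction_factors(device: str) -> dict:
    """Size of the gs 9.14 output divided by each newer output."""
    base = sizes[device]["gs 9.14"]
    return {k: base / v for k, v in sizes[device].items() if k != "gs 9.14"}


for device in sizes:
    for config, factor in reduction_factors(device).items():
        print(f"{device:9s} {config:40s} {factor:6.1f}x")
```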