Summary: | PDF created by GS pdfwrite larger than Acrobat 6 and 7 | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Ray Johnston <ray.johnston> |
Component: | PDF Writer | Assignee: | Ken Sharp <ken.sharp> |
Status: | NOTIFIED LATER | ||
Severity: | enhancement | CC: | christinedelight.top85, marcos.woehrmann, mdevries, shailesh.mistry |
Priority: | P3 | ||
Version: | master | ||
Hardware: | All | ||
OS: | All | ||
Customer: | 170 | Word Size: | --- |
Attachments: | 3page_ad7.pdf |
Description
Ray Johnston
2005-12-13 10:00:30 UTC
Created attachment 1861
3pages.ps.zip
input PS file
Created attachment 1862
3page_ad7.pdf
PDF created by Adobe Acrobat 7 from the 3page.ps file
A fixed overflow happens in gx_curve_log2_samples while computing 'd'. Source data x0 = -2147483648 == 0x80000000 isn't good.

Please ignore the last comment - it was put into the wrong bug.

We are the same size as Acrobat 5.

Bumping the priority for customer bug.

Passing to Ken since he handles pdfwrite from now on.

Hmm. I could do with some input from more experienced GS developers on this one.

1) I see no reason why MaxInlineImageSize=0 would result in images being 'shared'. This is only possible when the images are identical. There is no easy way (in PostScript) to know that, unless the images are from (for example) a Form or some other identifiably identical source (we could compare all images sample by sample, but that would be sloooow...).

However, in this case, I *do* see the images being 'shared'. For example, in the content stream of the first page:

    W* n
    q 0 6000 -5 -0 235 510.921 cm /R8 Do Q
    q 0 6000 -5 -0 240 510.921 cm /R8 Do Q
    q 0 6000 -5 -0 245 510.921 cm /R8 Do Q
    q 0 6000 -5 -0 250 510.921 cm /R8 Do Q
    q 0 6000 -5 -0 255 510.921 cm /R9 Do Q

Note the reuse of the first image, R8.

2) Merging images. This is something I was involved with in my previous incarnation. I don't recommend it. It significantly harms performance, for what are usually tiny gains on any sensibly constructed file. This file doesn't look bad enough to gain. In fact even the customer that forced the adoption in my previous life didn't save space, and their job had each scan line as a separate image.

3) So, why is the job larger? Simple: all the images output by GS are RGB, and are therefore 3 bytes per image sample. All (or at least, a lot) of the images in the Acrobat output are /Indexed /RGB, using one byte per image sample. The job is mostly composed of images; shrink each image to roughly 1/3 of its uncompressed size and the output is indeed much smaller.

It looks to me like there is no 'auto-indexification' of images in pdfwrite.

This *is* a useful feature. It does make processing a little slower, but many jobs (especially from MS Office applications) benefit from much smaller output. This file was a PowerPoint presentation ;-)

*** Bug 689923 has been marked as a duplicate of this bug. ***

*** Bug 690319 has been marked as a duplicate of this bug. ***

Enhancement still missing in Ghostscript 9.03
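
For readers unfamiliar with the 'auto-indexification' described in point 3, here is a minimal sketch of the idea in Python. It is not Ghostscript code and does not reflect how pdfwrite or Acrobat implement anything. The function name try_index_rgb and its parameters are hypothetical, and the input is assumed to be raw 8-bit RGB samples packed as 3 bytes each. If the image uses at most 256 distinct colours, it returns a palette plus one index byte per sample; otherwise it gives up so the caller can keep the RGB data.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


def try_index_rgb(
    samples: bytes, max_colors: int = 256
) -> Optional[Tuple[List[RGB], bytes]]:
    """Try to convert packed 8-bit RGB samples to palette + indices.

    Returns (palette, indices) when the image has at most max_colors
    distinct colours, or None when it has more (hypothetical helper,
    for illustration only).
    """
    if len(samples) % 3 != 0:
        raise ValueError("expected 3 bytes per RGB sample")

    palette = {}          # colour -> index, in first-seen order
    indices = bytearray()  # one byte per image sample

    for i in range(0, len(samples), 3):
        color = (samples[i], samples[i + 1], samples[i + 2])
        idx = palette.get(color)
        if idx is None:
            if len(palette) == max_colors:
                return None  # too many colours: keep the image as RGB
            idx = len(palette)
            palette[color] = idx
        indices.append(idx)

    # dicts preserve insertion order, so this lists colours by index
    return list(palette), bytes(indices)
```

Before any compression filter is applied, the indexed form costs one byte per sample plus at most 768 bytes of palette, against three bytes per sample for RGB. The extra pass over every sample is the "little slower" cost mentioned above.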