| Summary: | pdfwrite with multi-stripe images takes f.o.r.e.v.e.r | | |
|---|---|---|---|
| Product: | Ghostscript | Reporter: | Mark DeVries <mdevries> |
| Component: | PDF Writer | Assignee: | Ken Sharp <ken.sharp> |
| Status: | NOTIFIED DUPLICATE | | |
| Severity: | enhancement | | |
| Priority: | P2 | | |
| Version: | 8.63 | | |
| Hardware: | PC | | |
| OS: | Windows XP | | |
| Customer: | 1130 | Word Size: | --- |
| Attachments: | onerowperstrip.ps, pdfwrite-profile.txt | | |
Description
Mark DeVries
2009-03-05 13:14:02 UTC
Created attachment 4827
onerowperstrip.ps
This file has one image from tiff2ps with 8642 iterations through the imagemask
procedure, each one row deep. Converting it to PDF on Windows XP takes 1:45
with gs and 0:05 with Acrobat Distiller 3.0.
Created attachment 4828
pdfwrite-profile.txt
This is a profile of gs distilling a very large file to PDF on Red Hat Linux.
gs was processing our user's file, which is about 27 MB and includes, among
other things, 230 Level-2 images from tiff2ps, which typically draw 2400 rows in
32-row strips. This process takes about 4 minutes on Linux and Windows XP.
You'll see that "bytes_compare" gets 1.9 billion calls.
This is *not* a profile for the onerowperstrip.ps file I attached. We do not
yet have user permission to post the real-world file that was used for this
profile.
This looks very much like an extreme case of the 'image merging' feature of Acrobat. Each row of the image is actually represented in PostScript as an individual image, resulting in 8642 calls to the image operator. Ghostscript reflects this by producing a PDF file with 8642 image XObjects, so we need 8642 image dictionaries, 8642 entries in the Page XObject declaration, and so on. This means we spend an awful lot of time outputting these dictionaries, and also checking them. Acrobat detects that the images are contiguous and merges them all into one single image before output. This saves considerable space and checking of existing objects. See issues #688448 and #689923 for further details.

This is not actually a bug; it's a feature request to do the same consolidation as Acrobat. Frankly, this is a pretty poor way to write images; it's slow on any implementation (watch GS draw this, for example). It would be much better to write all the lines of the image as a single image.

Mark, I expect you've tried this, but if you load the image into something like GIMP or Photoshop and resave without the multi-stripping, I imagine this works well enough?

As I believe this is an enhancement, not a bug, and a duplicate, I'm marking it as such; we already have a P2 enhancement request for this.

*** This bug has been marked as a duplicate of 688448 ***

IMHO the problem is best fixed at the source. tiff2ps is a small open-source utility that can be easily modified to generate better PostScript.

Hi Alex, I've exchanged a couple of mails privately with the customer, and it looks like they will probably have a workaround. I suggested using tiffcp to convert the multi-strip TIFF to a single strip, which works for me.

I've done some work on the tiff2ps source code and it might be worth your time to download a current version from the LIBTIFF CVS tree, either 3.9.0 or 4.0 (if you need BIGTIFF support). I have not modified the PostScript generation code except to correct some incompatible options and to allow for rotations of 90, 180, and 270 degrees, so I cannot comment on its efficiency. However, the default is to produce PostScript Level 1, which generally gives a much larger file than if you specify Level 2 or Level 3, and the additional I/O may be part of the problem. Try the -map3 option for Level 3 PostScript using the imagemask operator when the image only contains 1 bit per sample. If you cannot get the current source from CVS, let me know.

Changing customer bugs that have been resolved more than a year ago to closed.
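
For anyone picking up the duplicate, here is a minimal sketch of the consolidation requested in this thread, assuming each image call can be described by a position, a width, and its rows. `Strip` and `merge_contiguous` are hypothetical names used only for illustration; this is not Ghostscript's or Acrobat's actual code, and it ignores colour space, bit depth, and transformation matrices that a real check would also have to compare.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Strip:
    """Hypothetical record for one image call: a band of rows at a position."""
    x: int             # left edge, in pixels
    y: int             # top row of this band, in image rows
    width: int         # pixels per row
    rows: List[bytes]  # one bytes object per row of sample data

    @property
    def height(self) -> int:
        return len(self.rows)


def merge_contiguous(strips: List[Strip]) -> List[Strip]:
    """Merge runs of strips that stack directly below one another.

    Two strips are merged only when they share the same left edge and width
    and the second starts on the row immediately after the first ends.
    The input list and its strips are left unmodified.
    """
    merged: List[Strip] = []
    for s in strips:
        last = merged[-1] if merged else None
        if (last is not None
                and last.x == s.x
                and last.width == s.width
                and last.y + last.height == s.y):
            last.rows.extend(s.rows)  # grow the current image downwards
        else:
            merged.append(Strip(s.x, s.y, s.width, list(s.rows)))
    return merged


# The attached test case: 8642 strips, each one row deep.
strips = [Strip(0, y, 4, [bytes(4)]) for y in range(8642)]
images = merge_contiguous(strips)
print(len(strips), "->", len(images))
```

With the one-row-per-strip layout from the attachment, the 8642 strips collapse into a single image, so only one image dictionary and one XObject entry would need to be written and checked.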