I have a PS file containing EPS images converted from TIFF by tiff2ps. The images are Level 2. Our real-life customer's .ps file is 27 MB. Converting to PDF with gs takes about 4 minutes on my Windows PC; Acrobat Distiller 3.0 does the same task in about 15 seconds.

The troublesome part seems to be 230 graphics that are typically 2400 rows deep, coded as 75 strips 32 rows deep. In other words, there are 75 repetitions of the image procedure per graphic. There are also 947 other graphics that are drawn as 7 strips 16 rows deep, but they are small and don't eat much time.

I also converted a single TIFF that's 8462 rows deep, drawn at one row per strip, and created a PDF. This pathological case took 1:45 with gs on my PC, and about :05 with Acrobat Distiller.

My command line is:

    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=name.pdf -c .setpdfwrite -f name.ps

We've seen similar timings for gs on Red Hat Linux. (The big file took 42 minutes on Solaris, but that's a *really* old system!)
Created attachment 4827
onerowperstrip.ps

This file has one image from tiff2ps with 8642 iterations through the imagemask procedure, each one row deep. Converting it to PDF on Windows XP takes 1:45 with gs and :05 with Acrobat Distiller 3.0.
Created attachment 4828
pdfwrite-profile.txt

This is a profile of gs distilling a very large file to PDF on Red Hat Linux. gs was processing our user's file, which is about 27 MB and includes, among other things, 230 Level-2 images from tiff2ps, which typically draw 2400 rows in 32-row strips. This process takes about 4 minutes on Linux and Windows XP. You'll see that "bytes_compare" gets 1.9 billion calls.

This is *not* a profile for the onerowperstrip.ps file I attached. We do not yet have user permission to post the real-world file that was used for this profile.
This looks very much like an extreme case of the 'image merging' feature of Acrobat. Each row of the image is actually represented in PostScript as an individual image, resulting in 8462 calls to the image operator. Ghostscript reflects this by producing a PDF file with 8642 image XObjects. So we need 8642 image dictionaries, 8642 entries in the Page XObject declaration and so on. This means we spend an awful lot of time outputting these dictionaries, and also checking them.

Acrobat detects the fact that the images are contiguous, and merges them all into one single image before output. This saves considerable space and checking of existing objects. See issues #688448 and #689923 for further details.

This is not actually a bug; it's a feature request to do the same consolidation as Acrobat. Frankly, this is a pretty poor way to write images; it's slow on any implementation (watch GS draw this, for example). It would be much better to write all the lines of the image as a single image.

Mark, I expect you've tried this, but if you load the image into something like GIMP or Photoshop and resave it without the multi-stripping, I imagine this works well enough?

As I believe this is an enhancement, not a bug, and a duplicate, I'm marking it as such. We already have a P2 enhancement request for this.

*** This bug has been marked as a duplicate of 688448 ***
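To make the consolidation idea concrete, here is a minimal Python sketch of merging vertically contiguous strips into a single image before output. It is only an illustration of the concept: the Strip type and merge_contiguous function are hypothetical names, and this is not how Acrobat Distiller or Ghostscript actually implement it. It assumes each strip carries its position, size, and raw row data, and that two strips can be merged when they share the same left edge and width and the second begins on the row right after the first ends.

```python
from dataclasses import dataclass


@dataclass
class Strip:
    """Hypothetical record for one invocation of the image operator."""
    x: int        # left edge, in pixels
    y: int        # top row of this strip within the full graphic
    width: int    # pixels per row
    rows: int     # number of rows in this strip
    data: bytes   # raw row data for this strip


def merge_contiguous(strips):
    """Merge runs of vertically adjacent strips with the same x and width."""
    merged = []
    for s in strips:
        last = merged[-1] if merged else None
        if (last is not None
                and last.x == s.x
                and last.width == s.width
                and last.y + last.rows == s.y):
            # The new strip starts exactly where the previous one ends,
            # so extend the previous image instead of emitting a new one.
            merged[-1] = Strip(last.x, last.y, last.width,
                               last.rows + s.rows, last.data + s.data)
        else:
            merged.append(s)
    return merged


# 75 strips of 32 rows each, like the 2400-row graphics in the report.
# A 32-pixel-wide, 1-bit image is assumed here, i.e. 4 bytes per row.
bytes_per_row = 4
strips = [Strip(0, i * 32, 32, 32, bytes(32 * bytes_per_row))
          for i in range(75)]
result = merge_contiguous(strips)
print(len(result), result[0].rows)  # 1 2400
```

With merging, the 75 strips collapse into one image of 2400 rows, so only one image dictionary and one XObject entry would be needed instead of 75.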
IMHO the problem is best fixed at the source. tiff2ps is a small open-source utility that can easily be modified to generate better PostScript.
Hi Alex, I've exchanged a couple of mails privately with the customer, and it looks like they will probably have a workaround. I suggested using tiffcp to convert the multi-strip TIFF to a single strip, which works for me.
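For anyone scripting that workaround, a rough Python sketch of building the tiffcp call follows. The exact options used in the private exchange are not recorded here, so this is an assumption: it relies on tiffcp's -r option (rows per strip) and passes the full image height so that the output ends up with one strip. The file names and the helper function name are hypothetical.

```python
import subprocess


def tiffcp_single_strip_args(src, dst, image_rows):
    """Build a tiffcp command that rewrites src with one strip.

    Assumption: setting rows-per-strip (-r) to the image height
    makes tiffcp write the whole image as a single strip.
    """
    return ["tiffcp", "-r", str(image_rows), src, dst]


# Hypothetical file names; 2400 rows matches the graphics in the report.
args = tiffcp_single_strip_args("multistrip.tif", "onestrip.tif", 2400)
print(" ".join(args))  # tiffcp -r 2400 multistrip.tif onestrip.tif

# To actually run it (requires libtiff's tiffcp on the PATH):
# subprocess.run(args, check=True)
```

The resulting single-strip TIFF can then be passed through tiff2ps as before.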
I've done some work on the tiff2ps source code, and it might be worth your time to download a current version from the LIBTIFF CVS tree, either 3.9.0 or 4.0 (if you need BIGTIFF support). I have not modified the PostScript generation code except to correct some incompatible options and to allow for rotations of 90, 180, and 270 degrees, so I cannot comment on its efficiency. However, the default is to produce PostScript Level 1, which generally gives a much larger file than Level 2 or Level 3, and the additional I/O may be part of the problem. Try the -map3 option for Level 3 PostScript using the ImageMask operator when the image only contains 1 bit per sample. If you cannot get the current source from CVS, let me know.
Changing customer bugs that were resolved more than a year ago to closed.