The Quality Logic PCL6 CET file c329.bin hangs or progresses very slowly with the bitcmyk and bitrgb devices at 600 dpi. Problem command line: main/obj/pcl6 -dBATCH -dNOPAUSE -Z: -r600 -sDEVICE=bitcmyk -sOutputFile=out.cmyk ../tests_private/xl/pcl6cet/c329.bin The interpreter does appear from the -Z: output to be making very slow progress, but doesn't complete after over 2 hours, as compared with 5 seconds at 100 dpi. Interrupting the debugger suggests the majority of the time is being spent in the clist reader.
Raising priority since this file was blocking the regressions. I've removed it from the cluster tests for the time being.
Since clist is Igor's responsibility, assigning to him. Since it is a QL test file and it impacts the regression test, changing to P1.
It completed for me after several hours. Sorry I don't have an exact time. There is a *very* large pattern in the file which should be the source of the bottleneck.
During performance benchmarking with customer 531, I noticed that when the pattern was large enough to trigger the pattern_clist_accumulator, at least one file was three times slower. I think we need a MUCH larger default value in src/gxpcmap.c line 177 for the threshold: # define MAX_BITMAP_PATTERN_SIZE (1024*1024) Customer 531 is using 20M (20*1024*1024).
I can confirm Ray's suspicion from #4. Running on my own machine: main/obj/pcl6 -dBATCH -dNOPAUSE -Z: -dMaxBitmap=30000000 -K1000000 -r600 -sDEVICE=bitcmyk -sOutputFile=out.cmyk ../tests_private/xl/pcl6cet/c329.bin a release build completes in about 20 minutes. After bumping MAX_BITMAP_PATTERN_SIZE from 1M to 20M it completes in 23 seconds, only a factor of 5 slower than at 100 dpi.
Can't check out ghostpcl due to external item gs. Passing to Ralph beause he is assigned to fix that. Please give it back to me when it works.
Created attachment 4225 [details] c329.bin
I've uploaded a current snapshot of the ghostpdl trunk for those without svn access. It can be downloaded from http://ghostscript.com/snapshots/ghostpdl-r8845.tar.gz The external problem won't be resolved until after the 8.63 release.
In the meantime, please increase the limit to 20 or 30 Mbytes or so. Then you can continue with this at a lower priority as a performance enhancement issue.
Can't reproduce. I checked out the revision 8862 of ghostpdl and built pcl6 with MSVC8 debug. The command line is : ..\..\ghostpdl\main\obj\pcl6.exe -dBATCH -dNOPAUSE -Z: -r600 -sDEVICE=bitcmyk - sOutputFile=out.ppm c329.bin It finishes in 170 seconds with consuming up to 526Mb (a peak value) and about 300MB in average. Release build should be faster. My computer is Core 2 duo 2.4 GHz 4Gb RAM. On machines with lesser than 768MB RAM it may take hours for virtual memory swapping. I noticed the RAM peak and a delay for tens seconds happens after printing [;]Beginning compression. Returning to the bug author. Please check whether the problem still persists and check the dependence on physical RAM size. If you still observe hours delays, lets figure out whether it is Linux specific.
Created attachment 4232 [details] c329.log.gz Some of these pages are MUCH slower than others (see below). The total time on my machine with MAX_PATTERN_BITMAP_SIZE = 32Mb (rev >= 8851). In particular Page 19 trips over the memory based clist threshold for doing compression (my build had BAND_LIST_STORAGE=memory). I think we need to understand these pages, but the original REALLY excessive (hours) time issue is now fixed. Page 8: % done : time = 14.234, pages = 7 [:] Width = 5100, Height = 6600 [:]clist_end_page at cfile=338645651, bfile=7416 [:]tot_raw=338653043, tot_compressed=0 [:]width=5100, band_width=5100, band_height=1251, nbands=6 % done : time = 85.516, pages = 8 Page 19: % done : time = 108.266, pages = 18 [:]Beginning compression [:] Width = 5100, Height = 6600 [:]clist_end_page at cfile=608688708, bfile=9936 [:]Number of raw buffers allocated=65 [:]tot_raw=608698620, tot_compressed=13734762 [:]width=5100, band_width=5100, band_height=1251, nbands=6 % done : time = 329.234, pages = 19 Page 23: [:]width=5100, band_width=5100, band_height=1251, nbands=6 % done : time = 354.594, pages = 22 [:] Width = 5100, Height = 6600 [:]clist_end_page at cfile=340487816, bfile=3864 [:]tot_raw=340491656, tot_compressed=0 [:]width=5100, band_width=5100, band_height=1251, nbands=6 % done : time = 423.234, pages = 23 Page 24: % done : time = 423.234, pages = 23 [:] Width = 5100, Height = 6600 [:]clist_end_page at cfile=470047107, bfile=5304 [:]tot_raw=470052387, tot_compressed=0 [:]width=5100, band_width=5100, band_height=1251, nbands=6 % done : time = 572.797, pages = 24 -------------------------------------------------- It's a side issue, but the code also leaks like a sieve. There were a 1477 objects 'leaked' during this 62 page test, total 700 Kbytes. This is partly captured in another bug assigned to Igor (bug 689822).
My build options : NMAKE pcl6_msvc.mak ALLOW_VD_TRACE=1 XCFLAGS="/DPCL_INCLUDED /DALLOW_VD_TRACE" DEBUG=1 TDEBUG=1 DEBUGSYM=1 MSVC_VERSION=8 DEVSTUDIO=M:
Yes, with Ray's change of MAX_PATTERN_BITMAP_SIZE to 32M in r8851, I get 23s with a release linux build and the command line from the description. With the command line from #10 it still takes 8 minutes. Peak memory use was around 570 MB, with no swap activity. The machine has 3 GB of ram. I agree the regression issue is resolved by Ray's commit, so I'm ok to close. Or feel free to leave it open or open a new bug for the issues mentioned in #11
With BAND_LIST_STORAGE=file, rev 3117+8863 I see total time = 212 seconds (on my Core 2 Duo 2GHz) and the maximum RAM is about 35 Mbytes (watching the Task manager on Windows). I consider this issue FIXED by rev 8851 if Ralph concurs, I suggest he close it.
The revision 8851 change allows a release with an acceptable behavior. To reproduce the original problem the 8851 change to be unwinded. A complete resolution needs a buffering of the pattern raster after clist.
I'm not sure I agree that the 8851 change should be 'unwound', since a 32Mb allocation on a host machine is trivial. This, and some other memory tuning parameters _do_ need attention on an embedded (or other limited memory) systems, so the work to improve the performance of pattern clist performance.
I didn't say that it has to be unwond *now*. I said it needs to be unwond to reproduce the original problem. There is nothing to argue here before the incoming release. I'm fine to do a releaase with that change. What I'm taking care here (of) is a proper documentation of the original problem and providing a right information for a further development.
Patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2008-August/008508.html
Please post the timings for the most recent code compared to the version with MAX_PATTERN_BITMAP_SIZE set to the large value.
I'm not sure what timing is expected in Comment #19. Nevertheless the test case demonstrates 160 seconds with unpatched build (the 32M threshold) and 160 seconds with patched build (1M threshold). The measurement is done on Core 2 duo 2.4MHz with MSVC8 debug build with this command line : ..\..\ghostpdl\main\obj\pcl6.exe -dBATCH -dNOPAUSE -Z: -r600 -sDEVICE=bitcmyk - sOutputFile=out.ppm c329.bin Note the patch 8928 causes another problem in pcl6 which is described in 689990, so it shouldn't be applied.