Bug 689966 - c329.bin hang with bitcmyk in clist code
Summary: c329.bin hang with bitcmyk in clist code
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Graphics Library (show other bugs)
Version: master
Hardware: PC Linux
: P2 normal
Assignee: leonardo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-07-14 15:23 UTC by Ralph Giles
Modified: 2008-12-19 08:31 UTC (History)
0 users

See Also:
Customer:
Word Size: ---


Attachments
c329.log.gz (1.77 KB, application/octet-stream)
2008-07-22 10:37 UTC, Ray Johnston
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ralph Giles 2008-07-14 15:23:32 UTC
The Quality Logic PCL6 CET file c329.bin hangs or progresses very slowly with
the bitcmyk and bitrgb devices at 600 dpi. Problem command line:

main/obj/pcl6 -dBATCH -dNOPAUSE -Z: -r600 -sDEVICE=bitcmyk -sOutputFile=out.cmyk
../tests_private/xl/pcl6cet/c329.bin

The interpreter does appear from the -Z: output to be making very slow progress,
but doesn't complete after over 2 hours, as compared with 5 seconds at 100 dpi.
Interrupting the debugger suggests the majority of the time is being spent in
the clist reader.
Comment 1 Ralph Giles 2008-07-14 15:24:09 UTC
Raising priority since this file was blocking the regressions. I've removed it
from the cluster tests for the time being.
Comment 2 Ray Johnston 2008-07-14 15:33:51 UTC
Since clist is Igor's responsibility, assigning to him. Since it is a QL test
file and it impacts the regression test, changing to P1.
Comment 3 Henry Stiles 2008-07-14 15:46:29 UTC
It completed for me after several hours.  Sorry I don't have an exact time. 
There is a *very* large pattern in the file which should be the source of the
bottleneck.
Comment 4 Ray Johnston 2008-07-15 08:07:17 UTC
During performance benchmarking with customer 531, I noticed that when the
pattern was large enough to trigger the pattern_clist_accumulator, at least
one file was three times slower.

I think we need a MUCH larger default value in src/gxpcmap.c line 177  for the
threshold:

#  define MAX_BITMAP_PATTERN_SIZE (1024*1024)

Customer 531 is using 20M (20*1024*1024).
Comment 5 Ralph Giles 2008-07-15 11:36:58 UTC
I can confirm Ray's suspicion from #4.

Running on my own machine:

 main/obj/pcl6 -dBATCH -dNOPAUSE -Z: -dMaxBitmap=30000000 -K1000000 -r600
-sDEVICE=bitcmyk -sOutputFile=out.cmyk ../tests_private/xl/pcl6cet/c329.bin

a release build completes in about 20 minutes. After bumping
MAX_BITMAP_PATTERN_SIZE from 1M to 20M it completes in 23 seconds, only a factor
of 5 slower than at 100 dpi.
Comment 6 leonardo 2008-07-16 13:38:10 UTC
Can't check out ghostpcl due to external item gs.
Passing to Ralph beause he is assigned to fix that.
Please give it back to me when it works.
Comment 7 Marcos H. Woehrmann 2008-07-16 15:55:32 UTC
Created attachment 4225 [details]
c329.bin
Comment 8 Ralph Giles 2008-07-16 19:28:02 UTC
I've uploaded a current snapshot of the ghostpdl trunk for those without svn
access. It can be downloaded from
http://ghostscript.com/snapshots/ghostpdl-r8845.tar.gz

The external problem won't be resolved until after the 8.63 release.
Comment 9 Ray Johnston 2008-07-17 09:44:54 UTC
In the meantime, please increase the limit to 20 or 30 Mbytes or so. 
 
Then you can continue with this at a lower priority as a performance enhancement issue. 
Comment 10 leonardo 2008-07-22 06:31:59 UTC
Can't reproduce.

I checked out the revision 8862 of ghostpdl and built pcl6 with MSVC8 debug. 
The command line is :
..\..\ghostpdl\main\obj\pcl6.exe -dBATCH -dNOPAUSE -Z: -r600 -sDEVICE=bitcmyk -
sOutputFile=out.ppm c329.bin

It finishes in 170 seconds with consuming up to 526Mb (a peak value) and about 
300MB in average. Release build should be faster.

My computer is Core 2 duo 2.4 GHz 4Gb RAM.
On machines with lesser than 768MB RAM it may take hours for virtual memory 
swapping.

I noticed the RAM peak and a delay for tens seconds happens 
after printing [;]Beginning compression.

Returning to the bug author. Please check whether the problem still persists 
and check the dependence on physical RAM size. If you still observe hours 
delays, lets figure out whether it is Linux specific.
Comment 11 Ray Johnston 2008-07-22 10:37:33 UTC
Created attachment 4232 [details]
c329.log.gz

Some of these pages are MUCH slower than others (see below). The total time
on my machine with MAX_PATTERN_BITMAP_SIZE = 32Mb (rev >= 8851).

In particular Page 19 trips over the memory based clist threshold for doing
compression (my build had BAND_LIST_STORAGE=memory).

I think we need to understand these pages, but the original REALLY excessive
(hours) time issue is now fixed.

Page 8:
%  done : time = 14.234, pages = 7
[:] Width = 5100,	Height = 6600
[:]clist_end_page at cfile=338645651, bfile=7416
[:]tot_raw=338653043, tot_compressed=0
[:]width=5100, band_width=5100, band_height=1251, nbands=6
%  done : time = 85.516, pages = 8

Page 19:
%  done : time = 108.266, pages = 18
[:]Beginning compression
[:] Width = 5100,	Height = 6600
[:]clist_end_page at cfile=608688708, bfile=9936
[:]Number of raw buffers allocated=65
[:]tot_raw=608698620, tot_compressed=13734762
[:]width=5100, band_width=5100, band_height=1251, nbands=6
%  done : time = 329.234, pages = 19

Page 23:
[:]width=5100, band_width=5100, band_height=1251, nbands=6
%  done : time = 354.594, pages = 22
[:] Width = 5100,	Height = 6600
[:]clist_end_page at cfile=340487816, bfile=3864
[:]tot_raw=340491656, tot_compressed=0
[:]width=5100, band_width=5100, band_height=1251, nbands=6
%  done : time = 423.234, pages = 23

Page 24:
%  done : time = 423.234, pages = 23
[:] Width = 5100,	Height = 6600
[:]clist_end_page at cfile=470047107, bfile=5304
[:]tot_raw=470052387, tot_compressed=0
[:]width=5100, band_width=5100, band_height=1251, nbands=6
%  done : time = 572.797, pages = 24

--------------------------------------------------

It's a side issue, but the code also leaks like a sieve. There were a 1477
objects 'leaked' during this 62 page test, total 700 Kbytes. This is partly
captured in another bug assigned to Igor (bug 689822).
Comment 12 leonardo 2008-07-22 10:44:02 UTC
My build options :
NMAKE pcl6_msvc.mak ALLOW_VD_TRACE=1 
XCFLAGS="/DPCL_INCLUDED /DALLOW_VD_TRACE"  DEBUG=1 TDEBUG=1 DEBUGSYM=1 
MSVC_VERSION=8 DEVSTUDIO=M: 
Comment 13 Ralph Giles 2008-07-22 10:57:00 UTC
Yes, with Ray's change of MAX_PATTERN_BITMAP_SIZE to 32M in r8851, I get 23s
with a release linux build and the command line from the description.

With the command line from #10 it still takes 8 minutes. Peak memory use was
around 570 MB, with no swap activity. The machine has 3 GB of ram.

I agree the regression issue is resolved by Ray's commit, so I'm ok to close. Or
feel free to leave it open or open a new bug for the issues mentioned in #11
Comment 14 Ray Johnston 2008-07-22 11:26:51 UTC
With BAND_LIST_STORAGE=file, rev 3117+8863 I see total time = 212 seconds
(on my Core 2 Duo 2GHz) and the maximum RAM is about 35 Mbytes (watching the
Task manager on Windows).

I consider this issue FIXED by rev 8851 if Ralph concurs, I suggest he close it.
Comment 15 leonardo 2008-07-22 14:48:37 UTC
The revision 8851 change allows a release with an acceptable behavior. To 
reproduce the original problem the 8851 change to be unwinded. A complete 
resolution needs a buffering of the pattern raster after clist.
Comment 16 Ray Johnston 2008-07-22 14:55:00 UTC
I'm not sure I agree that the 8851 change should be 'unwound', since a 32Mb
allocation on a host machine is trivial.

This, and some other memory tuning parameters _do_ need attention on an
embedded (or other limited memory) systems, so the work to improve the
performance of pattern clist performance.
Comment 17 leonardo 2008-07-22 15:30:59 UTC
I didn't say that it has to be unwond *now*.
I said it needs to be unwond to reproduce the original problem.

There is nothing to argue here before the incoming release. I'm fine to do a 
releaase with that change. What I'm taking care here (of) is a proper 
documentation of the original problem and providing a right information for a 
further development.
Comment 18 leonardo 2008-08-03 09:37:00 UTC
Patch  to HEAD :
http://ghostscript.com/pipermail/gs-cvs/2008-August/008508.html
Comment 19 Ray Johnston 2008-08-03 16:06:29 UTC
Please post the timings for the most recent code compared to the version
with MAX_PATTERN_BITMAP_SIZE set to the large value.
Comment 20 leonardo 2008-08-04 04:40:08 UTC
I'm not sure what timing is expected in Comment #19. Nevertheless the test 
case demonstrates 160 seconds with unpatched build (the 32M threshold) and 160 
seconds with patched build (1M threshold). The measurement is done on Core 2 
duo 2.4MHz with MSVC8 debug build with this command line : 
..\..\ghostpdl\main\obj\pcl6.exe -dBATCH -dNOPAUSE -Z: -r600 -sDEVICE=bitcmyk -
sOutputFile=out.ppm c329.bin

Note the patch 8928 causes another problem in pcl6 which is described in 
689990, so it shouldn't be applied.