Bug 691978 - Performance problem reading PostScript file
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: General
Version: master
Hardware: PC All
Importance: P1 enhancement
Assignee: Ray Johnston
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-16 20:52 UTC by Marcos H. Woehrmann
Modified: 2012-04-12 17:13 UTC

See Also:
Customer: 531
Word Size: ---


Attachments
gs_lev2.ps.diff (570 bytes, patch)
2011-03-02 18:55 UTC, Marcos H. Woehrmann

Description Marcos H. Woehrmann 2011-02-16 20:52:33 UTC
The customer reports, and I've verified, that the attached file takes a long time to process when converted with Ghostscript. This is not a regression; all versions of Ghostscript that I have tried are slow.

Here is the command line I'm using:

  bin/gs -sDEVICE=tiffpack -r200 -o test.tif ./EL03pdf.ps

This command takes ~30 minutes on my 2.8 GHz iMac.
Comment 2 Marcos H. Woehrmann 2011-02-16 20:54:01 UTC
The customer reports:

I did a bit more research on this. The PS file that I sent you that
takes a long time to process is encoded using RLE. We made this change
in our PS driver to reduce spool file sizes. When we use a version of
our PS driver that does not use RLE, the spool file is enormous, but GS
handles it quite quickly (under 20 seconds).
Comment 3 Marcos H. Woehrmann 2011-02-17 13:57:44 UTC
Chris reports:

I ran the job through gprof (output attached). Almost all the time is taken up in the garbage collector. -dNOGC doesn't seem to affect this, although it does cause memory use to increase (from ~140Mb to ~260Mb).
Comment 4 Marcos H. Woehrmann 2011-02-17 13:58:37 UTC
Ken:

It turns out that the reason we spend so much time in the garbage collector is because the job tells us to!

Running grep over the file for 'vmreclaim' gives 11369 hits. As far as I can easily tell, each of these is '2 vmreclaim', which means 'perform immediate collection in local and global VM'.
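For anyone wanting to repeat the count on another job, here is a minimal sketch. It assumes the operator name appears literally in the file, and it uses a small made-up sample in place of the customer's file. Note that grep -c counts matching lines, so grep -o piped to wc -l is used to count every occurrence.

```shell
#!/bin/sh
# Hypothetical stand-in for the customer's job (the real file is EL03pdf.ps).
sample=$(mktemp)
cat > "$sample" <<'EOF'
%!PS
2 vmreclaim
/Pat1 { 2 vmreclaim } def 2 vmreclaim
1 vmreclaim
EOF

# Count every occurrence of the operator name, even several on one line.
total=$(grep -o 'vmreclaim' "$sample" | wc -l | tr -d ' ')

# Count only the immediate local-and-global form.
immediate=$(grep -o '2 vmreclaim' "$sample" | wc -l | tr -d ' ')

echo "total=$total immediate=$immediate"
rm -f "$sample"
```

If the two counts match on a real job, every call is the expensive '2 vmreclaim' form.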

This is of course an *extremely* expensive operation, as the interpreter must examine the entire memory allocation looking for objects which can be discarded. The PLRM is pretty clear that there really is no good reason to use vmreclaim, the only time it can be used like this is in an interactive application which can use idle time to clean up memory. There is *no* sense in using it as this job does.

FWIW this appears to be because the job uses *many* patterns and each pattern does an immediate vmreclaim.

All that being said, the output looks (to me) like it is generated from Adobe Acrobat, so I think we need to be prepared for more jobs like this, though probably without quite so much insane use of patterns.

Defining /vmreclaim as a no-op considerably improves the performance of this job; Chris tells me it goes from ~30 minutes to 57 seconds by doing so.
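One way to try this without editing the job is sketched below. This is an assumption about how such a test could be run, not a record of what Chris actually did: it writes a small prefix file (the name novmreclaim.ps is made up) that shadows vmreclaim with a procedure that pops its operand, then passes that file to gs ahead of the job. It also assumes gs is on the PATH rather than at bin/gs.

```shell
#!/bin/sh
# Hypothetical prefix file name; not part of Ghostscript or the original job.
prefix=novmreclaim.ps

# Shadow the operator with a procedure that just discards its integer operand.
cat > "$prefix" <<'EOF'
%!PS
/vmreclaim { pop } def
EOF

# Run the prefix before the job, reusing the options from the original report.
# Skipped when gs or the customer's file is not available locally.
if command -v gs >/dev/null 2>&1 && [ -f ./EL03pdf.ps ]; then
    gs -sDEVICE=tiffpack -r200 -o test.tif "$prefix" ./EL03pdf.ps
fi
```

The definition should only shadow lookups of the name by the job itself; it does not change the operator in the C source.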

I think we should consider defining vmreclaim as a no-op (by changing the C source code to zvmreclaim) so that we ignore values of 1 and 2 (immediate reclamation).
Comment 5 Chris Liddell (chrisl) 2011-02-17 14:57:35 UTC
Actually, I've now timed it properly: it takes 63 minutes 16 seconds with the unmodified file, vs. 57 seconds with vmreclaim redefined as a null op.
Comment 6 Customer 531 2011-02-17 19:37:31 UTC
I modified the PS file and removed all occurrences of '2 vmreclaim' and got approx 20 seconds processing time.
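A rough sketch of that kind of edit, assuming the token appears as plain text and never inside binary or encoded image data (where a blind substitution could corrupt the job). The sample content is made up for illustration.

```shell
#!/bin/sh
# Hypothetical miniature job standing in for the real PS file.
src=$(mktemp)
dst=$(mktemp)
printf '%s\n' '%!PS' '2 vmreclaim' '/Pat1 { 2 vmreclaim 0 0 moveto } def' 'showpage' > "$src"

# Delete every literal "2 vmreclaim"; LC_ALL=C makes sed treat the input as bytes.
LC_ALL=C sed 's/2 vmreclaim//g' "$src" > "$dst"

# Confirm nothing is left and that no lines were dropped.
remaining=$(grep -c 'vmreclaim' "$dst" || true)
lines=$(wc -l < "$dst" | tr -d ' ')
echo "remaining=$remaining lines=$lines"
rm -f "$src" "$dst"
```

Writing to a separate output file keeps the original job intact for timing comparisons.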
Comment 7 Marcos H. Woehrmann 2011-02-18 15:40:24 UTC
(In reply to comment #4)
> I think we should consider defining vmreclaim as a no-op (by changing the C
> source code to zvmreclaim) so that we ignore values of 1 and 2 (immediate
> reclamation).

The customer would like to evaluate this modification; can you temporarily add a conditional compile to zvmem2.c?
Comment 8 Ken Sharp 2011-02-18 15:44:19 UTC
(In reply to comment #7)
> (In reply to comment #4)
> > I think we should consider defining vmreclaim as a no-op (by changing the C
> > source code to zvmreclaim) so that we ignore values of 1 and 2 (immediate
> > reclamation).
> 
> The customer would like to evaluate this modification; can you temporarily add
> a conditional compile to zvmem2.c?

Do you just want a patch, or add a conditional compile to the code in the repository ?
Comment 9 Marcos H. Woehrmann 2011-02-21 18:11:08 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #4)
> > > I think we should consider defining vmreclaim as a no-op (by changing the C
> > > source code to zvmreclaim) so that we ignore values of 1 and 2 (immediate
> > > reclamation).
> > 
> > The customer would like to evaluate this modification; can you temporarily add
> > a conditional compile to zvmem2.c?
> 
> Do you just want a patch, or add a conditional compile to the code in the
> repository ?

Presuming we decide to plan to eventually disable vmreclaim let's generate a patch.
Comment 10 Chris Liddell (chrisl) 2011-02-21 18:24:34 UTC
(In reply to comment #9)
> 
> Presuming we decide to plan to eventually disable vmreclaim let's generate a
> patch.

It may not be trivial. There are a number of places where our "internal" Postscript calls vmreclaim, which ought to be reviewed - for myself, I doubt anywhere but possibly gs_init.ps should use it. But if any are legit uses, we'll need to define an internal operator for our use, whilst hobbling the "real" vmreclaim operator.
Comment 11 Marcos H. Woehrmann 2011-03-02 18:55:41 UTC
Created attachment 7306
gs_lev2.ps.diff

Patch from Ken with the following caveat:

I did a cluster push to test this, and it produced a *load* of differences. Practically all of them are in the QL test suite where shades of grey appear slightly darker on 'some' files and tests.

Please emphasise that this is *NOT* a fix or even a suggested work-around. If we decide to disable this kind of vm reclamation then we'll need to study our own code to see where we use it and decide whether it's important (in which case we can use /.vmreclaim instead of /vmreclaim). We will also need to study the output differences and at least account for them.
Comment 12 Ray Johnston 2011-10-18 20:12:43 UTC
I made a slightly modified form of Ken's patch that properly passes the 30-10.PS
CET test, and then also fixed some bugs (seg faults) that showed up when the
GC timing changed. There were places that didn't allow for the case where
i_ctx_p and/or i_ctx_p->pgs change because the GC runs and relocates
the structures. I also made the suggested changes to our internal usage to
use .vmreclaim directly.

On my laptop, this runs the test file in 18 seconds vs. 1425 seconds with GC
running every time the file asks for it. Since the GC isn't disabled, it still
runs when the amount allocated since the last GC exceeds the vmthreshold, so
the memory usage doesn't change much (I didn't measure it).

Since it passes the cluster testing and speeds things up so much, I don't
see any downside to this change and have committed it as 4d0f6ec.