Bug 690665 - MuPDF uses a lot of memory for large images
Summary: MuPDF uses a lot of memory for large images
Status: RESOLVED FIXED
Alias: None
Product: MuPDF
Classification: Unclassified
Component: mupdf
Version: unspecified
Hardware: PC Windows XP
Importance: P4 enhancement
Assignee: MuPDF bugs
URL: http://www.gdesigns.org/temp/test.zip
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-29 04:43 UTC by steve smith
Modified: 2021-01-11 15:46 UTC

See Also:
Customer:
Word Size: ---


Description steve smith 2009-07-29 04:43:39 UTC
We currently use GS8.64 to convert PDF files in our application EasyCopy.  
Some of the PDF files we have take a LONG LONG time to convert so we have 
started looking at MuPDF as it seems faster.  

We are having problems with MuPDF when converting some PDF files to PNM even 
at 200 dpi; in some cases we need to convert at up to 600 dpi.

The URL contains a zip file with one of the problem PDFs and a screen dump 
showing the error.

Ideally we would like GS to work a lot faster so that we don't have to do 
extra development to incorporate MuPDF.

Steve Smith
Comment 1 Tor Andersson 2009-07-29 08:17:26 UTC
As you can see from the error message, it refuses to draw because it cannot allocate the 600 megabytes 
that it needs to decode the 8400 x 17979 image. It works fine for me without crashing. It's slow, but that's 
to be expected of any software that has to deal with images of that size. Try setting your malloc limits 
higher, or splitting the image into smaller pieces when generating the PDF.
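
The 600 megabyte figure can be checked with a little arithmetic. The sketch below assumes the whole image is decoded into memory at once at 4 bytes per pixel (for example RGB plus padding); that pixel layout is an assumption made for illustration and is not stated in this bug.

```python
# Image dimensions quoted in the comment above.
WIDTH, HEIGHT = 8400, 17979


def decoded_size_bytes(width, height, bytes_per_pixel):
    """Bytes needed to hold a fully decoded raster in memory at once."""
    return width * height * bytes_per_pixel


pixels = WIDTH * HEIGHT
rgb_bytes = decoded_size_bytes(WIDTH, HEIGHT, 3)   # packed 8-bit RGB
rgbx_bytes = decoded_size_bytes(WIDTH, HEIGHT, 4)  # assumed: RGB + 1 padding byte

print(f"pixels:            {pixels:,}")
print(f"3 bytes per pixel: {rgb_bytes / 1e6:.0f} MB")
print(f"4 bytes per pixel: {rgbx_bytes / 1e6:.0f} MB")
```

At 4 bytes per pixel this comes to roughly 604 MB, which matches the allocation in the error message; at 3 bytes per pixel it is roughly 453 MB.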

Ideally MuPDF should decode and render large images piecemeal to deal with these oddball files more 
gracefully.
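
To illustrate what "piecemeal" could mean here, the following is a minimal sketch of walking the image in horizontal strips so that only one strip is held in memory at a time. It is not MuPDF's implementation; the strip height and the 4 bytes per pixel are hypothetical values chosen for the example.

```python
def iter_strips(height, strip_rows):
    """Yield (first_row, row_count) pairs that cover rows 0..height-1."""
    for first in range(0, height, strip_rows):
        yield first, min(strip_rows, height - first)


def strip_bytes(width, rows, bytes_per_pixel):
    """Memory needed to hold one decoded strip."""
    return width * rows * bytes_per_pixel


WIDTH, HEIGHT = 8400, 17979  # dimensions quoted in this comment
BYTES_PER_PIXEL = 4          # assumption, not stated in the bug
STRIP_ROWS = 256             # hypothetical strip height

strips = list(iter_strips(HEIGHT, STRIP_ROWS))
peak = max(strip_bytes(WIDTH, rows, BYTES_PER_PIXEL) for _, rows in strips)

print(f"{len(strips)} strips, peak strip buffer {peak / 1e6:.1f} MB")
```

With these example values the peak buffer is under 9 MB instead of about 600 MB, at the cost of decoding and rendering in many passes.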
Comment 2 steve smith 2009-07-29 08:24:43 UTC
How do I set the malloc limit higher?  I can't see it in the pdfdraw 
command-line options.
Comment 3 Ray Johnston 2009-07-29 10:32:34 UTC
As Tor points out, this PDF is nothing but one big JPEG compressed image.

On my laptop (2GHz Core 2) it takes 36 seconds to write the 455 MB clist (this
works out to about the size of the uncompressed image) and completes to png16m
at 600 dpi in 459 sec (434 MB PNG file).

Using -dNumRenderingThreads=2, it completes in 337 sec. I also tried this on my
quad-core 3.2GHz machine with RAID-5 (peeves), and with -dNumRenderingThreads=4
it completes in 171 sec (5 sec to write the clist). On peeves, it took 189 sec
without -dNumRenderingThreads=4.

Thus GS can be sped up by using -dNumRenderingThreads=# where # is the number
of cores on the processor. Getting a faster machine helps a lot too :-)
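
As a sketch of the invocation described above, the snippet below builds a Ghostscript command line that sets -dNumRenderingThreads to the number of CPUs reported by the operating system. The file names are placeholders, and the executable name "gs" is an assumption (on Windows it is typically gswin32c or gswin64c).

```python
import os


def gs_command(pdf_path, out_path, dpi=600, threads=None, gs_exe="gs"):
    """Build a Ghostscript command for rendering a PDF to 24-bit PNG."""
    if threads is None:
        # os.cpu_count() reports logical CPUs, which may exceed physical cores.
        threads = os.cpu_count() or 1
    return [
        gs_exe,
        "-sDEVICE=png16m",                   # output device used in this comment
        f"-r{dpi}",                          # resolution in dpi
        f"-dNumRenderingThreads={threads}",  # rendering threads
        "-o", out_path,                      # output file
        pdf_path,
    ]


cmd = gs_command("input.pdf", "out.png", threads=4)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to run the conversion.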

The speedup is not very significant when using 4 cores, probably because most
of the time is spent doing the PNG compression, which is _not_ multi-threaded
by -dNumRenderingThreads=4. Running to the ppmraw device with -o /dev/null on
peeves completes in 10 sec (24 sec cpu time), which shows the processing time
in GS without the PNG compression.
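
The timings quoted in this comment can be turned into speedup figures with a few lines of arithmetic. This is only a reading of the numbers above; it assumes the 10 sec ppmraw run is comparable to the rendering portion of the 171 sec png16m run.

```python
def speedup(baseline_s, threaded_s):
    """How many times faster the threaded run is than the baseline."""
    return baseline_s / threaded_s


laptop = speedup(459, 337)  # laptop: 1 thread vs 2 threads
peeves = speedup(189, 171)  # peeves: 1 thread vs 4 threads

# Share of the 171 sec run that the 10 sec ppmraw run would account for
# (assumption: the two runs do the same rendering work).
render_share = 10 / 171

print(f"laptop speedup with 2 threads: {laptop:.2f}x")
print(f"peeves speedup with 4 threads: {peeves:.2f}x")
print(f"rendering share of the png16m run: {render_share:.0%}")
```

On these numbers the 4-thread run is only about 1.1 times faster, and rendering accounts for a small share of the elapsed time, which is consistent with PNG compression dominating.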

Also, building gs with the 'BAND_LIST_STORAGE=memory' make option will speed
things up since it won't need to write the 450 MB clist out to disk, but this
is not really the bottleneck.

All in all, it seems that things in both GS and MuPDF are working reasonably
as expected. I would like to know the comparative timings of MuPDF vs. GS.

I guess this bug can remain open for Tor to track the needed improvement to
incrementally load images.
Comment 4 Ray Johnston 2009-07-30 10:15:26 UTC
It is worth doing, but I am marking this as 'enhancement'.
Comment 5 steve smith 2009-08-02 09:40:40 UTC
Thanks for the comments, guys.  I guess the big question I have is why Adobe 
Acrobat is so much faster at opening the file and saving it as an image format.

Steve
Comment 6 Ray Johnston 2009-08-02 10:41:29 UTC
On my laptop (see GS timings in comment #3) Acrobat 7 Professional takes 22
seconds to open the file and takes 135 seconds to save as PNG at 150 dpi (total
157 sec).

At 300 and 600 dpi, Acrobat complains "the image is too wide" and won't do it.

At 150 dpi, Ghostscript (once again on my laptop) completes in 100 sec
(including the 36 seconds to load the clist). Thus Ghostscript is about
1.6 times faster than Acrobat 7 Professional.

What version of Acrobat are you using, at what resolution, with what raster
output format, and what timing do you see?
Comment 7 Robin Watts 2021-01-11 15:46:13 UTC
The test file was not attached to the bug, so we can no longer test it. In the past 11 years, MuPDF's image handling has been substantially rewritten to improve the handling of large images. Therefore this is probably solved.

Please open a new bug and attach the test file if this is still a problem.