Bug 696257 - File encountered 'VMerror' error while processing an image
Summary: File encountered 'VMerror' error while processing an image
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Images
Version: 9.17
Hardware: PC Linux
Importance: P2 enhancement
Assignee: Ray Johnston
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-09 04:58 UTC by artifex
Modified: 2021-10-12 17:51 UTC
CC List: 3 users

See Also:
Customer:
Word Size: ---


Attachments
input.pdf (9.68 MB, application/pdf)
2015-10-09 05:00 UTC, artifex
Work around patch (1.94 KB, text/plain)
2016-03-09 19:02 UTC, Ray Johnston
Work-around patch (2.33 KB, patch)
2016-03-15 09:04 UTC, Ray Johnston

Description artifex 2015-10-09 04:58:10 UTC
Converting the file input.pdf to tiffg4 with the following command:

gs -dNOPAUSE -dBATCH -r600 -sDEVICE=tiffg4 -sOutputFile=gs917.tif input.pdf

I get the following messages:

   **** Warning: File encountered 'VMerror' error while processing an image.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> PDFTron PDFNet, V6.20238 <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

The TIFF output is broken, and increasing BufferSpace does not help.
The PDF has a large format (ISO A0 portrait), but I have no problems with my own PDF test files that are larger than this one, so I don't believe memory limits are being exceeded.
Comment 1 artifex 2015-10-09 05:00:52 UTC
Created attachment 11970 [details]
input.pdf
Comment 2 Ken Sharp 2015-10-09 06:24:37 UTC
The VM error is genuine.

The file consists almost entirely of a very large image (11342x9938) with an equally large mask applied. This is turned into a type 3 image (the PostScript equivalent of the PDF construct) by the PDF interpreter.

When we interpret the file, internally we create a memory device at the resolution of the output, and render the image and mask together into that memory. We then use this as a regular type 1 image to draw the final result.

Because the memory is allocated at the resolution of the output, very large images can require very, very large amounts of memory as the resolution rises; in this case we end up trying to allocate a chunk of memory slightly larger than 2.5 GB. On my 32-bit build that fails immediately. Chris tells me that on a 64-bit build it gets further, but nevertheless we are trying to allocate monstrous amounts of memory, and sooner or later we *are* going to fail as the resolution rises.
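
For a rough sense of the numbers, here is a back-of-the-envelope sketch in C. It assumes the masked image covers roughly the full A0 page and that the intermediate memory device stores a 32-bit sample plus a 1-bit mask bit per device pixel; the real layout in Ghostscript differs in the details, but the estimate lands in the same ballpark:

    #include <stdio.h>

    /* Editor's sketch: estimated buffer for a full-page A0 masked image
     * at 600 dpi, assuming 4 bytes per image pixel plus a 1-bit mask
     * plane (a hypothetical layout, not gs's exact one). */
    int main(void)
    {
        const double mm_per_in = 25.4;
        const int dpi = 600;
        long w = (long)(841.0 / mm_per_in * dpi);   /* A0 width:  ~19866 px */
        long h = (long)(1189.0 / mm_per_in * dpi);  /* A0 height: ~28086 px */
        double bytes = (double)w * h * 4.0          /* 32-bit image samples */
                     + (double)w * h / 8.0;         /* 1-bit mask plane     */
        printf("%ld x %ld px -> %.2f GB\n", w, h, bytes / 1e9);
        return 0;
    }

This prints about 2.3 GB, in line with the ~2.5 GB failing request reported in comment 6 below.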

There are two solutions that I can see:

1) Allocate the memory at the source size, not the output resolution. This still has potential for problems as the image size rises: even at source size we would still be allocating > 330 MB. In addition, this might cause rendering errors.

2) Have the image, and the masking, rendered post-clist. This should prevent any memory problems (by clipping to the band), and since we would still be rendering at device resolution there should be no rendering problems. I suspect it might well have negative performance implications, though.

In any event this is not a problem with the PDF interpreter. I'm assigning it tentatively to Ray, because my preference is to have this done post-clist, and that would mean significant changes to the clist to deal with it. The assignment and solution may well be changed!

Also making this P2 as a customer bug; I'm sure Marcos will tell me if that's not correct.
Comment 3 Chris Liddell (chrisl) 2015-10-20 08:14:20 UTC
My thought was along the lines of a device that accumulates image and mask lines until there are enough in its buffer to flush them as a masked-image operation.

The simplest case is when the image and mask samples are the same size; handling cases where the mask and image samples differ in size will require more management of lines between the flushes.

The size of the buffer (and thus the number of lines that can be handled at once) could be configurable (larger for better performance, smaller for a smaller memory footprint). As we could do the coordinate calculations in device space, it should be feasible to avoid any risk of gaps between the flushed groups of lines.


This approach has the advantage of working with non-clist device instances.
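
As a concrete, purely illustrative sketch of the accumulator idea, assuming equal-sized image and mask samples and made-up names rather than the real Ghostscript device interface:

    #include <stdio.h>
    #include <string.h>

    #define ACCUM_ROWS 256   /* tunable: larger = faster, smaller = leaner */
    #define MAX_WIDTH  4096  /* fixed row width for this toy example       */

    typedef struct {
        int width;           /* image samples per row (1 byte each here)   */
        int y0;              /* device-space y of the first buffered row   */
        int rows_buffered;
        unsigned char image_rows[ACCUM_ROWS][MAX_WIDTH];
        unsigned char mask_rows[ACCUM_ROWS][MAX_WIDTH / 8];
    } line_accum;

    /* Stand-in for the flush: would paint image_rows through mask_rows as
     * one masked-image operation covering rows [y0, y0 + rows_buffered). */
    static void flush_masked_image(line_accum *a)
    {
        printf("flush rows %d..%d\n", a->y0, a->y0 + a->rows_buffered - 1);
        a->y0 += a->rows_buffered;
        a->rows_buffered = 0;
    }

    static void accum_add_row(line_accum *a, const unsigned char *img,
                              const unsigned char *msk)
    {
        memcpy(a->image_rows[a->rows_buffered], img, (size_t)a->width);
        memcpy(a->mask_rows[a->rows_buffered], msk, (size_t)(a->width + 7) / 8);
        if (++a->rows_buffered == ACCUM_ROWS)
            flush_masked_image(a);   /* buffer full: emit one masked image */
    }

    int main(void)
    {
        static line_accum a = { .width = MAX_WIDTH };
        unsigned char img[MAX_WIDTH] = {0}, msk[MAX_WIDTH / 8] = {0};
        for (int y = 0; y < 1000; y++)
            accum_add_row(&a, img, msk);
        if (a.rows_buffered > 0)     /* final partial group of rows */
            flush_masked_image(&a);
        return 0;
    }

ACCUM_ROWS is the speed/footprint knob described above; the final partial flush handles whatever rows remain when the image ends mid-buffer.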
Comment 4 Ray Johnston 2015-10-20 08:20:17 UTC
Thanks to Ken for the initial study.

When the mask and the image have the same source resolution, things would be
much simpler than the general PS Type 3 image case that the code is designed to
handle. That's why the processing works at full device resolution currently.

Putting both mask (at its source res) and image (at its source res) into the
clist and processing post-clist has the advantage that the work to render the
image can be done in multiple threads in bands.

The difficulty in the clist comes with providing the mask data for those bands
that need it (with appropriate "support", putting some lines of source data
into more than one band), and then, during clist playback, saving the mask
image for use when the subsequent colored image is read from the band, so that
the color image can be painted using the mask. This is probably easiest to do
by having the image mask create a clip path as it is read; the color image
would then paint using the clipper device with that path.
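
A minimal sketch of that playback order, with hypothetical types standing in for the clist band, clip region, and clipper device (none of these are the real Ghostscript APIs):

    #include <stdio.h>

    typedef struct { int y0, height; } band_t;
    typedef struct { band_t band; } clip_region_t;
    typedef struct { clip_region_t *clip; } clipper_dev_t;

    static clip_region_t *clip_from_mask(band_t *b)
    {
        /* Would read the band's mask rows from the clist and
         * accumulate them into a clip path/region. */
        static clip_region_t r;
        r.band = *b;
        return &r;
    }

    static void paint_image_band(clipper_dev_t *cdev, band_t *b)
    {
        /* Would read the colored image rows for the band and paint
         * them through the clipper, dropping pixels the mask rejects. */
        printf("band y=%d..%d painted through mask clip\n",
               b->y0, b->y0 + b->height - 1);
        (void)cdev;
    }

    static void play_band(band_t *b)
    {
        clipper_dev_t cdev;
        cdev.clip = clip_from_mask(b);  /* mask read first: builds the clip */
        paint_image_band(&cdev, b);     /* then the image paints through it */
    }

    int main(void)
    {
        for (int i = 0; i < 3; i++) {
            band_t b = { i * 256, 256 };
            play_band(&b);              /* bands could run on worker threads */
        }
        return 0;
    }

Because each band carries its own mask-derived clip, bands can in principle be rendered by independent threads, which is the multi-threading advantage mentioned above.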
Comment 5 Henry Stiles 2016-03-08 16:36:55 UTC
I had mistakenly told Ray that bumping up gxmclip.h:tile_clip_buffer_request would fix this.  Actually the request is only a few K larger than the max, 18K vs 16K.  Anyway, I bumped it up to 32K and that fixes the first VMerror I encountered here in gxmclip.c:

        if (buffer_height <= 0) {
            /*
             * The tile is too wide to buffer even one scan line.
             * We could do copy_mono in chunks, but for now, we punt.
             */
            cdev->mdev.base = 0;
            return_error(gs_error_VMerror);
        }
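
The failure mode boils down to an integer division: the clip device divides a fixed buffer among whole scan lines, so a raster wider than the buffer yields zero lines. A tiny standalone illustration (the constants are assumptions standing in for the real values in gxmclip.h, not the exact fields):

    #include <stdio.h>

    int main(void)
    {
        int tile_clip_buffer_size = 16 * 1024; /* assumed pre-patch size */
        int raster = 18 * 1024;                /* the "18K vs 16K" above */
        int buffer_height = tile_clip_buffer_size / raster;
        printf("buffer_height = %d\n", buffer_height); /* prints 0 */
        return 0;
    }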

The build I was using hacked the memory manager to always use heap allocations by setting imem->large_size to 1.  With that in place the file runs and prints correctly, albeit slowly.  Without that change we get another VMerror in ireclaim.c:98:

        if (allocated >= mem->gc_status.max_vm) {
            /* We can't satisfy this request within max_vm. */
            return_error(gs_error_VMerror);
        }
  
I'm running 64-bit Linux.
Comment 6 Ray Johnston 2016-03-08 21:22:09 UTC
I am not surprised that we still get a VMerror when processing with the normal
(real) memory manager. I do not understand why the hobbled memory setting
(large_size = 1) runs, since the VMerror stems from opening a memory device
to hold the mask, and the size of that won't change with the use of
single-object allocations.

The size of the failing request is shown with a debug build:

[a+]gs_malloc(large object chunk)(2608004624) = 0x0: exceeded limit, used=25913541, max=25948341

The limit and max_vm values are set to 2GB, so the VMerror still happens.

Only if I change the code to use "max_ulong" for initialization can I get to
the point where the tile_clip_buffer_request causes a VMerror to be thrown.

Increasing that to 32K gets me further, but it still fails, now with a
rangecheck from setparams when attempting to do the context_state_load
after a GC reclaim.

It might be possible to change the max_vm param to unsigned long and get this
to run.

But it still dodges the issue that we really should change the way type 3
images are processed to avoid the large allocation.
Comment 7 Henry Stiles 2016-03-08 22:33:56 UTC
(In reply to Ray Johnston from comment #6)
> I do not understand why the hobbled memory setting
> (large_size = 1) runs since the VMerror stems from opening a memory device
> to hold the mask, and the size of that won't change with the use of single
> object allocations.
> 
> The size of the failing request is shown with a debug build:
> 
> [a+]gs_malloc(large object chunk)(2608004624) = 0x0: exceeded limit,
> used=25913541, max=25948341
> 
> The limit and max_vm values are set to 2Gb, so the VMerror still happens.
> 

The memory limit is set to the maximum long: 2^31 - 1 on your Windows box and 2^63 - 1 on my 64-bit Linux machine.


That's why I said I was using 64-bit Linux; I thought that would come up.  I think everything is working as we'd expect.
Comment 8 Henry Stiles 2016-03-09 10:49:07 UTC
And finally, the ireclaim.c VMerror can be fixed on 64-bit processors, and the file printed successfully, by reverting Alex's change to ialloc that capped it at 2G as part of the 32-bit PostScript integer compatibility work (d84be56069dc2579c88d360da2ccc2ad5825a490).

This bug aside, I do question crippling our ability to handle large allocations on 64-bit platforms just so we can match the PostScript Quality Logic tests for max VM.

Also, I'm a little fuzzy about why the GC choked.  I thought the large allocation would only affect the heap allocator, yet during reclaim we seem to be checking max_vm for ialloc.  I didn't look into that, but if we are mixing up the allocators, that could be a problem worth looking at.
Comment 9 Chris Liddell (chrisl) 2016-03-09 15:47:27 UTC
(In reply to Henry Stiles from comment #8)
> 
> This bug aside, I do question crippling our ability to handle large
> allocations on 64 bit platforms so we can match PostScript Quality Logic
> tests for max vm.  

If we care enough, we already have a "CPSI mode" to which we could add a redefinition of the relevant VM related operator, so the tests still run as expected.
Comment 10 Ray Johnston 2016-03-09 19:02:29 UTC
Created attachment 12374 [details]
Work around patch

git patch suitable for: git am ...
Comment 11 Ray Johnston 2016-03-09 19:03:24 UTC
The max_vm problem was due to ialloc_alloc_state initializing max_vm
to 0x7fffffff rather than max_long.

Fixing that, and changing the tile_clip_buffer_size to 32768, allows this file
to complete on peeved in 282 seconds (debug build) without a VMerror
during GC reclaim.

This is only a workaround for 64-bit Linux builds, since max_long will
still be 2G on Windows (even in 64-bit builds) and on other 32-bit systems.
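
The platform difference comes down to the width of "long". A minimal illustration (standard C, nothing Ghostscript-specific):

    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        /* LP64 (64-bit Linux): LONG_MAX is 2^63 - 1, so removing the
         * 0x7fffffff cap raises max_vm well past 2GB.  LLP64 (Windows,
         * including 64-bit builds): long stays 32 bits, so both lines
         * print the same 2147483647. */
        printf("0x7fffffff = %ld\n", 0x7fffffffL);
        printf("LONG_MAX   = %ld\n", (long)LONG_MAX);
        return 0;
    }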

The tile_clip_buffer_size really should not be limited to anything this small,
and the mask image really should not have to be entirely buffered. This can
seriously impact performance and limits use to 64-bit systems. The real fix
for the mask image has already been discussed.

Patch for the work around is attachment #12374 [details]
Comment 12 Henry Stiles 2016-03-09 19:44:35 UTC
(In reply to Ray Johnston from comment #11)

The problem was identified, and a fix prescribed (revert that part of Alex's patch), in Comment #8, and then Chris suggested we might use CPSI mode in Comment #9.

If you're okay with that, and others don't mind possibly breaking the "max vm" tests, and we don't want to use CPSI mode, then I'm fine with the patches.  The log message should say something about why we changed the code from max long to 2G and why we are changing it back.  I think you might have missed reading Comments #8 and #9; I catch myself missing comments too in these long bugs.
Comment 13 Ray Johnston 2016-03-10 08:05:44 UTC
As long as the CPSI mode check is only performed in ialloc_alloc_state where
gc_status->max_vm is initialized, then the CET tests that check for this
should be fine. I'd be concerned about putting CPSI mode (runtime) checks
in something like ireclaim, but we shouldn't need it.

Note that this is still only a work-around for 64-bit Linux.  I'll add the
changes, run a regression test, and commit the patch if it passes, but
leave the bug open since the customer may not be using 64-bit.
Comment 14 Ray Johnston 2016-03-15 09:04:19 UTC
Created attachment 12388 [details]
Work-around patch

Patch suitable for git am

Increase limits for tile_clip_buffer and Max VM on 64-bit machines

These changes allow the file from bug 696257 to complete on 64-bit gcc
builds where a "long" is 64 bits. Also the MaxLocalVM and MaxGlobalVM
values returned in CPSI mode are truncated to return only the low 32
bits (CET 99-01.PS)
Comment 15 Peter Cherepanov 2020-12-30 02:59:20 UTC
As expected, the 64-bit build runs to completion with a maximum allocation of 2GB. The 32-bit build fails with a VMerror after allocating just 800MB.