Customer reports: The attached file causes gs (v 8.53) to crash with all raster devices (e.g. tiffg3), at 2880 dpi (this corresponds, for the user who reported the crash, to a 720 dpi print with a 400% scale). The file has transparency. I've check the core, and found out that gs tries to create a bitmap of 59020 x 72481 pixels in 8 bits (this is he bitmap needed to process transparency, I suppose). This should normally make 4, 3 Gb, but because all functions that get and returns number of bytes needed for memory allocation use unsigned int, it becomes finally 200 Mb or something like that, so the allocation is ok, but of course the first fill_rectangle crashes. Do you have any plan to process memory allocation quantities in 8 bytes? I'm not talking specifically of transparency, but there are other problems caused by the fact that gdev_mem_bits_size() (and other functions) return an unsigned int instead of a long long, as I've alreay reported (e.g. bug 688808). I've tried to insert a line of code, in gdev_mem_open_scan_lines(), to check that the memory allocation will not overflow, and to return an error if it will (gs_error_limitcheck), thus simulating a memory allocation error. Because I would prefer GS to exit properly with an error message. But what happened is that only a warning was reported: "File encountered 'limitcheck' error while processing an image". And the file was ripped normally, but of course half the data were missing. Is that the normal behavior? I personnaly think that it would be much better to make a fatal error (not only in my case, but also when it's really a memory allocation failure). Is there a switch, an option to do that? Ray adds: The failure is in the image3x handling where it tries to allocate the bitmap for the mask. This is _not_ resolved by clist banding as the customer thought it might.
Created attachment 2734 [details] KOHN.pdf
This is another instance of the image3x mask buffer handling that needs to avoid allocation of the complete image at parsing time and instead use the clist (or an intermediate clist the way the pattern clist accumulator does).
This is _probably_ a duplicate of bug 689618. Leaving up to Igor to determine if it is and mark is as duplicate if he determines it is.
A correction : (int64_t)width*height,x = 0x00000001108858f4 It is few bigger than 4G, so it can't fit into a 32 bits address space. Comment #2 is correct, and transparency isn't relevant.
Here is my analyzis and conclusion about the bug nature and further development strategy. As Ray correctly wrote in Comment #2, currently images with soft mask cannot write to clist. Instead that a decomposition happens before writing to clist. Doing so, the graphics library allocates a buffer for entire mask at the device resolution, which in the test case runs over the 4G limit of the address space, To work around it we need to write the image with soft mask into the clist and allocate buffers during the clist interpretation, restricting the buffer size with the band size. Readng Ghostscript code, I can see that images with soft mask are implemented with some assumptions, which are never satisfied with modern PDF. Those assumptions are : 1. The image may have both opacity and shape masks. 2. The main image and its masks may be encoded in a single stream. 3. The main image and its mask must coinside in the user space with an upside down mapping or with left to right flip, or both. I guess this design was done by Peter and Raph from early definitions of PDF transparency (which were not clear), as an analogue of Type 3 images. Possibly they also took in account some othat graphic formats in order to generalize the graphics library for future developmant. Rather I'm not sure whether we still need these assumptions, I don't want to break them, becauase a big effort is spent in the past to satisfy them in other parts of the old code. Particularly, the upside down mapping of the image to the mask, together with a single stream encoding imply that the top down rendering would need a reverse stream reading, which isn't possible. Due to that, as a strong consequence, we need to allocate a temporary buffer either for mask or for the image so that the stream is being read in the normal direction only, and the masking is done *after* the mask and image are *completed*. This approach is really implemented in gximag3x.c . Remark : If we do not need to maintain a single data source (in which the main image and a mask are interleaved), then we could code the implementation in gs/lib like this : allocate a soft mask buffer, render the soft mask to it as a Type 1 image, close the soft mask buffer, install the soft mask buffer as a transparency mask, then render the main image as a Type 1 image, drop the mask buffer. In this implementation gximag3x.c could be dropped forever, and the big mask buffer problem dissolutes themselves without a special effort. Nevertheless there exist a case, when we do have a single stream for the main image and the mask : the JPXDecode filter may supply both in a single stream. So possibly Peter accounted this case in his design with a strong requirement of a single scan through JPXDecode. BTW few mothes ago Ralph and me discossed the JPXDecode/transparency issue in the tech mailing list, and I gave some wrong information about capabilities of the old code. I'm sorry for that, and this remark must replace the old info. End of remark. Thus we need to maintain both single and multiple streams, and allocate buffers after clist. Of course we'd like to reuse the old code so much as possible. So here is my suggestion how to implement it. When begin_typed_image is called at first time with an "image with soft mask", the current device is pdf14clistRGB in the test case, or a transparency clist writer for another blending space, and the worker is pdf14_clist_begin_typed_image. Few later and few C stack frames upper the function gx_begin_image3x_generic is called and I intend to reuse this construction. I noticed that the failure happens in make_midx_default, which is supplied by gx_begin_image3x. I want to change pdf14_clist_begin_typed_image so that it *modifies* the recieved image enumerator with inserting new functions for 'make_midx', 'make_mcde', 'render' and release. Note the replacement of the methods is kinda hackish, so it needs a perfect commenting. An alternative is to release the supplied enumerator and allocatre an own one, but I'm not sure that image_cleanup never relies on the original enumerator. An additional investigation may be useful. A third alternative is to store a pointer from the original enumerator to the substituted enumerator, but in this case the original enumerator will need to expand with an opaque pointer to the substitued enumerator, and the release method still needs a replacement. In any case the replacement should be represented as a dynamic subclassing. The new fuctions allocate temporary devices, which perform a writing of image data to clist instead the rasterization. Actually both devices may be instances of same class, and only differ each from another with a private 3- state field, which encodes the role : main image, shape mask, opacity mask. When the "render" method is called from the (sub)image enumerator (we know that we supplied an instandaerd enumerator, which has instandard methods), its (a new instandard) implementation finds the clist writer device (well we'll need a pointer to it from our temporary midx and mcde), and calls clist_image_plane_data from it. The latter calls clist_image_plane_data, which writes cmd_opv_image_data. Here we need to extend cmd_opv_image_data format with the 3-state switch for the subimage role. I think writing and extra byte won't be too harmful. I think it's enough for the clist writing, now go to clist reading. read_begin_image needs to extend with image type 3x, which allocates the temporary devices for rendering the mask and the main image. Those devices are transparency buffers. Note that a mask buffer and a group buffer needs to be allocated at same time, and this fact breaks an old invariant that only one buffer is active at a time. All this may be done with proper calls to gdevp14.c functions. When the clist reader recieves cmd_opv_image_data, it dispatches the data to a corresponding temporray device. Note that cmd_opv_ext_create_compositor is not exactly applicable for allocating the soft mask bufer, because the image group is active at that time. So the mask buffer must be allocate separately by the clist interpreter, and passed to gdevp14.c after it is *completed*. IMO the best interaction should go like this : 1. send PDF14_BEGIN_TRANS_GROUP for the main image. gdevp14.c allocates the blendin g space buffer. 2. Allocate the mask buffer as a property of clist interpreter. 3. Render both the main image and the mask(s). 4. send PDF14_BEGIN_TRANS_MASK with 'replacing' flag unset. 5. Copy the mask to the transparency buffer. Note it's a pixel to pixel copying, butb the coverage area is not necessary parallel to coordinate axes. 6. send PDF14_END_TRANS_MASK 7. send PDF14_END_TRANS_GROUP Well here we need to account the optimization about degenerate transparency groups. For doung so we need to send the mentioned command while clist writing rather than interpretation. So here we need to go back to the clist writing and redesign it with these commands like this : 1. When begin_typed_image is written to clist, first write PDF14_BEGIN_TRANS_GROUP, then cmd_opv_begin_image. 2. Immediately after the image ends, send PDF14_BEGIN_TRANS_MASK PDF14_END_TRANS_MASK PDF14_END_TRANS_GROUP. The clist interpreter will behave as this : 1. recieve PDF14_BEGIN_TRANS_GROUP and process it as usual. 2. recieve cmd_opv_begin_image with image type 3x, allocate mask buffer and its temporary device. 3. recieve cmd_opv_image_data, extract the subimage role from it and dispatch either to the target device or to the mask buffer device. 4. recieve PDF14_BEGIN_TRANS_MASK and process it as usual. 5. Replace the transparency buffer with the mask buffer. This is a new operation which never happened before. It will need a special signal from the clist reader to gdevp14.c stuff. IMO the best way to represent the signal is a new special argument pointer of PDF14_BEGIN_TRANS_MASK. 6. Recieve PDF14_END_TRANS_MASK and process it as usual. 7. recieve PDF14_END_TRANS_GROUP Note gdevp14.c can't handle 2 masks (both opacity and shape). This design delivers them both to gdevp14.c, and doesn't define what to do next if them both come. This implementation schedule affects modules which are owned by Alex, Ralph, Michael and me, and my part is the minimal one (just call gx_begin_image3x from the clist interpreter). Therefore I'll not start making the patch before recieving a confirmation from bosses to exclude possible collisions.
temporary reassignment until it can be discussed at the Tuesday color meeting.
IMO the best way to fix this and other similar issues is to change ownership of the modules gstrans.c and gdevp14.c to me. I believe that the ownership of the was choosen from scratch, and the current owner would be happy to get rid of them. These modules are not related to colors. Alternatively give them to Alex because me and he collaborate with no problem.
In addition to the JPXDecode images (and I've never seen a PDF file that used the tranparency features; we should investigate what Adobe Reader actually does, since the spec is a bit vague) both XPS and SVG include PNG as an image source type which encodes an interleaved transparency plane. Currently the xps code extracts that plane separately and uses it to create a separate mask, so this is more of a convenience and efficiency issues than an absolute requirement. But generally the way transparency is handled as a separate mask in pdf rather than an intrinsic part of object drawing is old fashioned by the standards of newer formats. So I do not think breaking assumption 2 is a good direction to go for the future.
As agreed in a conference call, I take assignment for a quick implementation of the "Remark' approach before release. The rest to be done after release together with Alex's problem about the size of image enum.
Setting P1 because we want it before release.
Created attachment 4238 [details] patch.txt A suggested patch to HEAD as a temporary workaround. Iy is being tested now.
The patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2008-July/008459.html provides a temporary workaround against the crash. We still need to implement the Comment #5 method doe to SMaskInData. So please don't close the bug. Will need to discuss it after release. Decreesing to P2 sinse the crash is fixed.
I'm closing this bug because the crash has been fixed with Comment#12 patch. We still have a big work here, because currently we can't handle SMaskInData. I opened the new bug 690053 "Unimplemented SMaskInData" for tracking that issue.
*** Bug 689618 has been marked as a duplicate of this bug. ***