Bug 692550 - PostScript to tiffg4 bus error core dump
Summary: PostScript to tiffg4 bus error core dump
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: General
Version: 9.04
Hardware: Sun SunOS
Importance: P4 normal
Assignee: Chris Liddell (chrisl)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-27 19:41 UTC by George Norris
Modified: 2011-10-12 16:18 UTC
CC List: 2 users

See Also:
Customer:
Word Size: ---


Attachments
Zip file containing issue.ps, 2.ps and 3.ps (275.07 KB, application/x-zip-compressed)
2011-09-27 19:41 UTC, George Norris
workaround patch (966 bytes, patch)
2011-10-06 09:14 UTC, Chris Liddell (chrisl)
Possible patch for gxht_thresh.c for review (2.33 KB, patch)
2011-10-10 13:26 UTC, Chris Liddell (chrisl)

Description George Norris 2011-09-27 19:41:25 UTC
Created attachment 7940
Zip file containing issue.ps, 2.ps and 3.ps

Platform SUNOS
Ghostscript 9.04 (2011-08-05)

We get a Bus Error (core dump) when we do a PostScript to TIFF conversion on the attached file issue.ps.

issue.zip contains: issue.ps, 2.ps, and 6.ps
issue.ps contains run commands for 02.ps and 06.ps.  
I have X'd and 0'd out personal information - this does not cause
or prevent the bug.

Command:
gs -P- -dBATCH -dNOPAUSE -sDEVICE=tiffg4 -r200 -sOutputFile=904_issue.tif issue.ps

Ghostscript 8.70 does not core dump on this file.

pstack is:
core 'core' of 13786:   /BSG/local/bin/gs -P- -dBATCH -dNOPAUSE -sDEVICE=tiffg4 -r200 -sOutput
 0042d864 mem_mono_copy_mono (81b7b0, d8, 1fff0000, 2d, fffffff0, 423) + 950
 003ffe84 gxht_thresh_plane (34, a5a988, 423, 2, 0, 0) + 7a8
 00406300 image_render_mono_ht (160, 0, ffbfd490, 150, 0, 81b7b0) + 4c4
 004014e8 gx_image1_plane_data (c54240, ca2ac8, 18, ffbfd8bc, 0, 1) + 630
 00403438 gx_image_plane_data_rows (c54240, ca2ac8, 18, ffbfd8bc, ffbfd970, 0) + 18
 003ce50c gs_image_next_planes (ca2958, ffbfd970, ffbfd930, 0, 0, ca1360) + 388
 00113d0c image_file_continue (7e57a0, 0, f80, 4562dc, ffbfd9e8, f78) + 114
 000e70c8 gs_interpret (ffbfe080, 7e57a0, 1, ffbfe6e4, 7dba40, ffbfe5b8) + dd4
 000de298 gs_main_run_string_end (7bcf28, 1, ffbfe6e4, ffbfe6e8, ffbfe6e4, ffbfe6e8) + 28
 000df070 run_string (7bcf28, 823c30, 1, 453778, 823c30, 7e57a0) + 1c
 000df1e0 runarg   (7bcf28, 4537c0, 7eae48, 4537e8, 3, 0) + 114
 000df35c argproc  (7bcf28, 7eae40, 7eae10, fffffffd, 80808080, 1010101) + 124
 000e0b08 gs_main_init_with_args (ffffff92, 8, ffbff62c, feeb7c18, 453800, 453c00) + 1b8
 00057a0c main     (8, ffbff62c, ffbff650, 7bbddc, ff290100, 0) + 38
 0005786c _start   (0, 0, 0, 0, 0, 0) + 5c

Have you seen this issue?  I did not see one in my saved Bugzilla emails.
Can you provide me with a fix or patch?
Does a fix or patch already exist?
Comment 1 Alex Cherepanov 2011-09-28 16:08:40 UTC
I cannot reproduce this on Linux, AMD64 in v.9.04 or the current development
version. Valgrind doesn't show anything relevant either. We are in the wonderful
world of platform-dependent bugs.

Please provide more information about your copy of Ghostscript.
Where did you get it?
What compiler did you use?
How was gs configured?
Do you have any unusual environment variables?
Comment 2 George Norris 2011-09-28 18:04:13 UTC
No unusual environment variables.  Here is the other information:

The Ghostscript that we are using was downloaded from:
http://sourceforge.net/projects/ghostscript/files/GPL%20Ghostscript/9.04

I used the following to build it:
./configure --prefix=$PREFIX --without-jbig2dec --disable-fontconfig --disable-compile-inits --without-libiconv --with-fontpath=$FONTPATH

It was built on Solaris 5.10 Generic_137137-09 sun4u sparc SUNW using gmake (GNU make 3.80). The compiler is gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath), thread model: posix.

Let me know if you need any other information.  Thanks...
Comment 3 George Norris 2011-09-28 19:12:55 UTC
We have gs 9.04 installed on Linux and SUN.  We found that the crash did not happen on the Linux box.  So, I can confirm that it appears to be platform specific to SUN boxes.
Comment 4 Alex Cherepanov 2011-10-04 19:17:20 UTC
Reassigning to Chris, who has access to a Sparc box.
Comment 5 Chris Liddell (chrisl) 2011-10-05 17:59:59 UTC
I can reproduce the problem on my Solaris/Sparc box.

More when I have it......
Comment 6 Chris Liddell (chrisl) 2011-10-06 09:14:32 UTC
Created attachment 7968
workaround patch

I have found the problem, but I don't yet have a solution.

However, I've attached a patch which is a workaround (it removes a performance optimisation which is primarily aimed at PCL input). It will get you up and running until a full fix is available.
Comment 7 Chris Liddell (chrisl) 2011-10-10 13:26:19 UTC
Created attachment 7977
Possible patch for gxht_thresh.c for review

I've taken this about as far as I can; it really needs Michael's eye cast over it now.

The crash happens in gdevm1.c at line 672. Inside that macro is a loop (yuck!), and the bus error occurs because we're dereferencing a uint * at an unaligned address on the second iteration of the loop. This stems from the stride of the source bitmap being 45, so that, although the base address of the bitmap is aligned to 4 bytes, subsequent scanlines (probably) are not. SPARC is one of the few platforms that strictly enforce pointer alignment.
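
To illustrate the arithmetic behind that paragraph, here is a minimal sketch. It is not Ghostscript code: the function names are hypothetical, and it assumes the stride is measured in bytes and that a uint access needs 4-byte alignment. With an aligned base and a stride of 45, the second scanline starts at an address that is not a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not part of Ghostscript).
   Returns nonzero if scanline `row` of a bitmap starts on an `align`-byte
   boundary, given the bitmap base address and the row stride in bytes. */
int row_is_aligned(uintptr_t base, size_t stride, size_t row, size_t align)
{
    /* The mask trick below only works for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((base + (uintptr_t)(stride * row)) & (uintptr_t)(align - 1)) == 0;
}

/* Hypothetical helper: count how many of the first `nrows` scanlines
   start on an `align`-byte boundary. */
size_t count_aligned_rows(uintptr_t base, size_t stride, size_t nrows,
                          size_t align)
{
    size_t row, count = 0;

    for (row = 0; row < nrows; row++)
        if (row_is_aligned(base, stride, row, align))
            count++;
    return count;
}
```

Under these assumptions, row 0 of a bitmap based at 0x1000 is aligned, but row 1 begins at 0x1000 + 45, which is 1 byte past a 4-byte boundary. Only every fourth row happens to start aligned, which matches a failure on the second loop iteration rather than the first.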

The source of the problem is actually in gxht_thresh.c, the function gxht_thresh_image_init(), where the penum->line_size and penum->ht_stride are not calculated to account for pointer alignment.

Also, it looks to me as if there is an invalid assumption made, that mem_mono_copy_mono() writes samples in 16 bit (ushort) "chunks" - that is only true on little-endian platforms. On big-endian platforms, we write 32 bit (uint) "chunks" (see line 499 in gdevm1.c).

The attached patch uses the "bitmap_raster()" macro to calculate the stride (to be consistent with other raster memory allocations), and also changes what I *think* may be required to deal with the extra unaligned bits at the beginning of the area being marked (the value of penum->ht_offset_bits is set using "bitmap_raster()" to get appropriate alignment for the value). This will overestimate the value, but I *think* it's a small price for consistent code.
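
As a rough sketch of the idea of rounding a stride up to an alignment boundary, consider the helper below. This is an assumption-laden illustration and not the definition of "bitmap_raster()": the name aligned_raster is hypothetical, and the alignment modulus Ghostscript actually uses may differ from the 4 bytes used in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not Ghostscript's macro).
   Converts a scanline width in bits to a stride in bytes, rounded up to
   the next multiple of `align` bytes so every row starts aligned. */
size_t aligned_raster(size_t width_bits, size_t align)
{
    size_t bytes = (width_bits + 7) / 8;   /* whole bytes needed per row */

    /* The mask trick below only works for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (bytes + align - 1) & ~(align - 1);
}
```

With a 4-byte alignment, a row that needs 45 bytes gets a stride of 48, so each scanline begins on a 4-byte boundary whenever the base address does. The cost is up to align - 1 padding bytes per row, which is the kind of small overestimate the comment above accepts in exchange for consistency.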

The patch fixes the problem on SPARC, and a cluster run shows, *I think*, no new indeterminisms. One caveat is that I don't know if there are other places in the code which make assumptions about the values being set up as they are without this patch.

Just ping me if you need me to run tests on a SPARC machine.
Comment 8 Michael Vrhel 2011-10-11 19:11:50 UTC
Chris,  The patch looks reasonable to me.  I agree that I should have been using bitmap_raster to calculate the stride.

Do you want me to move forward with this or do you want to test and verify and commit since you did all the work?
Comment 9 Chris Liddell (chrisl) 2011-10-12 07:11:00 UTC
Michael, wow, I didn't expect you to get to this until you'd been back a few days.

I'm quite happy to take responsibility/blame for it - I've done quite a lot of testing of the fix, so I think between that, and your confirmation that it doesn't look insane, we're good.

I *would* like to run more general testing on the SPARC - clearly, it's the only platform we currently have around which is strict with alignment, but that is, I think, separate from this bug.

I'll commit later today.
Comment 10 Chris Liddell (chrisl) 2011-10-12 16:18:28 UTC
Fixed in:

http://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=1a9f31