Bug 688574 - Segv in split_curve_s() running self-intersect2.ps
Summary: Segv in split_curve_s() running self-intersect2.ps
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PS Interpreter (show other bugs)
Version: master
Hardware: PC Linux
: P3 normal
Assignee: Alex Cherepanov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-02-28 17:24 UTC by Alex Cherepanov
Modified: 2008-12-19 08:31 UTC (History)
1 user (show)

See Also:
Customer:
Word Size: ---


Attachments
The patch to PS files that causes SEGV (4.51 KB, patch)
2006-02-28 17:26 UTC, Alex Cherepanov
Details | Diff
The shell script (328 bytes, application/octet-stream)
2006-02-28 17:30 UTC, Alex Cherepanov
Details
disassembly and registers (11.53 KB, text/plain)
2006-03-02 05:02 UTC, Alex Cherepanov
Details
patch (1.60 KB, patch)
2006-03-12 20:37 UTC, Alex Cherepanov
Details | Diff
improved patch (1.64 KB, patch)
2006-04-01 14:04 UTC, Alex Cherepanov
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Cherepanov 2006-02-28 17:24:13 UTC
I've discovered a crash during local regression testing that turned out to
be sensitive to the environment and rather difficult to debug.
Steps to reproduce:
[alexcher@bufo gs_test]$ uname -a; gcc --version
Linux bufo 2.4.7-10 #1 Thu Sep 6 16:46:36 EDT 2001 i686 unknown
2.96
Release build. Debug works.
Apply the attached patch ps PS files.
Run the attached shell script.

The crash doesn't happen under GDB or when small changes are done to the
environment, for inatance removing -dFirstPage=1
valgrind finds an unrelated problem that I've posted earlier.
Comment 1 Alex Cherepanov 2006-02-28 17:26:56 UTC
Created attachment 2063 [details]
The patch to PS files that causes SEGV
Comment 2 Alex Cherepanov 2006-02-28 17:30:08 UTC
Created attachment 2064 [details]
The shell script
Comment 3 Alex Cherepanov 2006-03-01 17:24:56 UTC
Here's the stack trace from the core dump.
(gdb) bt
#0  0x08105d65 in split_curve_s (pole=0xbfff8cb0, q0=0xbfff7c00, q1=0xbfff7750,
pole_step=4) at ./src/gxshade6.c:1245
#1  0x0811ec68 in fill_patch (pfs=0xbfffd130, p=0xbfff8cb0, kv=0, kv0=0, kv1=0)
at ./src/gxshade6.c:3036
#2  0x0811f8ac in fill_patch (pfs=0xbfffd130, p=0xbfff9d60, kv=0, kv0=0, kv1=0)
at ./src/gxshade6.c:3342
#3  0x0811f8ac in fill_patch (pfs=0xbfffd130, p=0xbfffae10, kv=1, kv0=1, kv1=1)
at ./src/gxshade6.c:3342
#4  0x0811f8ac in fill_patch (pfs=0xbfffd130, p=0xbfffbec0, kv=2, kv0=2, kv1=2)
at ./src/gxshade6.c:3342
#5  0x0811f8ac in fill_patch (pfs=0xbfffd130, p=0xbfffc6c0, kv=4, kv0=4, kv1=4)
at ./src/gxshade6.c:3342
#6  0x0812203d in patch_fill (pfs=0xbfffd130, curve=0xbfffcbb0, interior=0x0,
transform=0x81041c4 <Cp_transform>)
    at ./src/gxshade6.c:3576
#7  0x0810458b in gs_shading_Cp_fill_rectangle (psh0=0x8526f40, rect=0xbfffddb0,
rect_clip=0xbfffddd0, dev=0xbfffd8f0,
    pis=0x853c0b4) at ./src/gxshade6.c:330
#8  0x080ffffd in gs_shading_fill_path (psh=0x8526f40, ppath=0xbfffe340,
prect=0x0, orig_dev=0x83ddd64, pis=0x853c0b4,
    fill_background=0) at ./src/gsshade.c:557
#9  0x080ff112 in gx_dc_pattern2_fill_path (pdevc=0xbfffe510, ppath=0xbfffe340,
rect=0x0, dev=0x83ddd64)
    at ./src/gsptype2.c:220
#10 0x08256899 in gx_default_fill_path (pdev=0x83ddd64, pis=0x83a67c4,
ppath=0xbfffea60, params=0xbfffe4a0,
    pdevc=0xbfffe510, pcpath=0x8526724) at ./src/gxfill.c:607
#11 0x0826dc0f in gx_fill_path (ppath=0xbfffea60, pdevc=0xbfffe510,
pgs=0x83a67c4, rule=-1, adjust_x=0, adjust_y=0)
    at ./src/gxpaint.c:53
#12 0x080fdc32 in gs_shfill (pgs=0x83a67c4, psh=0x8526f40) at ./src/gscolor3.c:88
#13 0x08124919 in zshfill (i_ctx_p=0x83b6f98) at ./src/zshade.c:80
#14 0x080b3e80 in call_operator (op_proc=0x81248ec <zshfill>, i_ctx_p=0x83b6f98)
at ./src/interp.c:107
#15 0x080b60f4 in interp (pi_ctx_p=0x83953a4, pref=0xbffff118,
perror_object=0xbffff190) at ./src/interp.c:1492
#16 0x080b42e0 in gs_call_interp (pi_ctx_p=0x83953a4, pref=0xbffff118,
user_errors=1, pexit_code=0xbffff18c,
    perror_object=0xbffff190) at ./src/interp.c:488
#17 0x080b41d4 in gs_interpret (pi_ctx_p=0x83953a4, pref=0xbffff118,
user_errors=1, pexit_code=0xbffff18c,
    perror_object=0xbffff190) at ./src/interp.c:446
#18 0x080ac406 in gs_main_interpret (minst=0x83951d0, pref=0xbffff118,
user_errors=1, pexit_code=0xbffff18c,
    perror_object=0xbffff190) at ./src/imain.c:304
#19 0x080acb94 in gs_main_run_string_end (minst=0x83951d0, user_errors=1,
pexit_code=0xbffff18c,
    perror_object=0xbffff190) at ./src/imain.c:628
#20 0x080aca3b in gs_main_run_string (minst=0x83951d0,
    str=0x83ced08
"<2f686f6d652f67686f73747363726970742f636f6d7061726566696c65732f73656c662d696e74657273656374322e7073>.
runfile", user_errors=1, pexit_code=0xbffff18c, perror_object=0xbffff190) at
./src/imain.c:569
#21 0x080aed44 in run_string (minst=0x83951d0,
    str=0x83ced08
"<2f686f6d652f67686f73747363726970742f636f6d7061726566696c65732f73656c662d696e74657273656374322e7073>.
runfile", options=3) at ./src/imainarg.c:780
#22 0x080aed08 in runarg (minst=0x83951d0, pre=0x833b652 "",
    arg=0x83bc300 "/home/ghostscript/comparefiles/self-intersect2.ps",
post=0x8343074 ".runfile", options=3)
    at ./src/imainarg.c:771
#23 0x080aeaa6 in argproc (minst=0x83951d0, arg=0xbffffd65
"/home/ghostscript/comparefiles/self-intersect2.ps")
    at ./src/imainarg.c:706
#24 0x080ada4c in gs_main_init_with_args (minst=0x83951d0, argc=12,
argv=0xbffffbb4) at ./src/imainarg.c:214
#25 0x0804ad62 in main (argc=12, argv=0xbffffbb4) at ./src/gs.c:47
#26 0x401c1507 in __libc_start_main (main=0x804ad30 <main>, argc=12,
ubp_av=0xbffffbb4, init=0x804a194 <_init>,
    fini=0x8292d60 <_fini>, rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffffbac)
    at ../sysdeps/generic/libc-start.c:129
(gdb)
Comment 4 leonardo 2006-03-02 00:06:04 UTC
Are you sure that it is exactly a segv ? I'm unclear how a segment violation 
can happen at this point. Maybe it's a floating point exception ? Or you 
observe a stack frame damage ? Check addresses with disassembly and registers 
values.
Comment 5 Alex Cherepanov 2006-03-02 05:02:19 UTC
Created attachment 2071 [details]
disassembly and registers

Yes, this is a SEGV. Attached is the disassembly and the registes.
I don't see any clear signs of the stack damage.
Comment 6 leonardo 2006-03-02 20:45:33 UTC
Alex,

From what you saying I conclude that it writes outiside a segment at -52(ebp). 
However ebp is the C stack frame pointer when executing :

0x8105d65 <split_curve_s+289>:  mov    %edi,0xffffffcc(%ebp)

Thus either it's a stack overflow (doubtly due to the small stack), or the ebp 
register has bean loaded with a bad value. 

1. Please know ebp when entring 'main' and compute the stack size from it. 
2. A bad value could be loaded into ebp if some function wrote outside a stack-
allocated array in a function, which worked before the crash point, when ebp is 
loaded at a 'return'. In this case all stack frame members should look wrongly,
rather it may be not obvious due to code optimization.
3. Aslo ebp could be damaged if a function call has been compiled with a wrong 
prototype, or with default types if no prototype. Please check the compiler 
listing. It is not necessary in this module, color conversion routines are 
suspicious as well due to the change to gs_cspace.ps .

From this disassembly I conclude that split_curve_s and split_patch have been 
expanded inline, and the pole_step variable was substitued with a constant. The 
topmost stack frame has no data, but the frame #1 may give some information.

I'm unclear about the dependence on the PS patch. As far as I could figure out, 
it doesn't change PS array sizes, so that memory allocation should be same. A 
minor effect could be caused by an earlied definition (appearence) of the 
name .systemvar, but it's unlikely. Try to replace .systemvar with .blablabla 
to check whether an error branch has been executed. Maybe it erroneusely 
continues after returning with an error from a color conversion routine.

I'll not be able to do a real debugging until I return from CA on March 8.
Comment 7 Ray Johnston 2006-03-08 10:07:27 UTC
Please spend some time testing builds with gcc 3.3 or 3.4 to make sure that this is not 
just some problem in gcc 2.9x. If you can't get builds to fail with 3.3 or 3.4, then we 
will just close this and recommend that people move to gcc 3.3 or later. 
Comment 8 Alex Cherepanov 2006-03-12 10:01:27 UTC
Here's an interesting observation. The program doesn't fail if run as:
/home////////ghostscript/gs_segv/gs/bin/gs ...
The program fails if the path is shorter.

Comment 9 Alex Cherepanov 2006-03-12 20:37:03 UTC
Created attachment 2093 [details]
patch

Work around a bug in GCC 2.96 for x86. Pre-load stack pages to avoid a SEGV.

DETAILS:
It is well known that GCC 2.96 for x86 sometimes forgets to adjust $esp
and leaves automatic variables at small distance below the stack pointer.
Apparently, when the access to the automatic variable causes a page fault
Linux sends a SEGV signal instead of committing the page if the access
happens below the stack pointer. Pre-loading the stack pages from a known
good context resolves the problem.

DIFFERENCES:
Testing now.

Discussion.
I didn't check the code, generated by other versions of GCC, but Valgrind
has a special work-around for the access below the stack pointer only
for GCC 2.96.

It is possible to write a custom test to detect the bug in the configure
script.
I'm not sure whether it's worth the trouble.
Comment 10 Alex Cherepanov 2006-03-12 20:58:28 UTC
I've just checked the generated code -- GCC 3.3.5 adjusts the stack.
Comment 11 leonardo 2006-03-21 04:07:16 UTC
Raph, please review the patch.
Comment 12 Ray Johnston 2006-04-01 09:28:49 UTC
Rather than leaving a 'no-op' call to commit_stack_pages in most instances,
I think it would be more clear to remove the initial '#else' clause and
instead add an additional #if at the point of invocation. The revised patch
in that area would be:

+#ifdef NEED_COMMIT_STACK
+/* 
+ * It is well known that GCC 2.96 for x86 sometimes forgets to adjust $esp
+ * and leaves automatic variables at small distance below the stack pointer.
+ * Apparently, when the access to the automatic variable causes a page fault
+ * Linux sends a SEGV signal if the access happens below the stack pointer.
+ * Pre-loading the stack pages resolves the problem.
+ */
+private void 
+commit_stack_pages( void )
+{
+    char buf[65536]; /* In most cases GS survives in 64K stack */
+    int i;
+    for(i=0; i<sizeof(buf)-1; i+=1024)
+    buf[i]=0;
+}
+#endif
+
+
 int
 main(int argc, char *argv[])
 {
-    int exit_status = 0;
-    gs_main_instance *minst = gs_main_alloc_instance(gs_malloc_init(NULL));
+    int exit_status, code;
+    gs_main_instance *minst;
 
-    int code = gs_main_init_with_args(minst, argc, argv);
-
+#ifdef NEED_COMMIT_STACK    /* hack for bug in gcc 2.96 */
+    commit_stack_pages();
+#endif
+    exit_status = 0;
+    minst = gs_main_alloc_instance(gs_malloc_init(NULL));
+    code = gs_main_init_with_args(minst, argc, argv);
+    
 #ifdef RUN_STRINGS
     {				/* Run a list of strings (for testing). */
 	const char **pstr = run_strings;

It's OK with me to commit this patch, but we could also just issue an
#error to anyone running this broken compiler version.
Comment 13 Alex Cherepanov 2006-04-01 14:04:21 UTC
Created attachment 2140 [details]
improved patch

This is the patch that follows Ray's suggestions.
GCC 2.96 was included in many GNU/Linux distributions including Red Hat 7.
I'm going to commit the patch because:
 - old GNU/Linux boxes work well enough and may be not upgraded. 
 - at least one of the customers is running even older gcc 2.95.
 - the patch doesn't affect other gcc versions
 - the patch documents a general problem that was rather difficult to find.
Comment 14 Alex Cherepanov 2006-04-01 14:18:35 UTC
The patch #2 is committed, revision 6697.