Bug 688876 - garbager :Valgrind reports uninitialized data
Summary: garbager :Valgrind reports uninitialized data
Status: RESOLVED DUPLICATE of bug 690176
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: General (show other bugs)
Version: master
Hardware: PC Windows XP
: P4 normal
Assignee: leonardo
URL:
Keywords: bountiable
Depends on:
Blocks:
 
Reported: 2006-09-07 06:26 UTC by leonardo
Modified: 2008-11-27 10:35 UTC (History)
1 user (show)

See Also:
Customer:
Word Size: ---


Attachments
range.txt.bz2 (498.31 KB, application/x-bzip2)
2006-09-14 08:03 UTC, Alex Cherepanov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description leonardo 2006-09-07 06:26:03 UTC
See http://ghostscript.com/pipermail/gs-cvs/2006-September/006782.html
about igc.c .
Comment 1 leonardo 2006-09-07 09:56:19 UTC
+{
+   <gc_trace_chunk (igc.c:745)>
+   Memcheck:Cond
+   fun:gc_trace_chunk
+   fun:gs_gc_reclaim
+}
Comment 2 Alex Cherepanov 2006-09-09 11:08:39 UTC
See bug 688873 for a better log-and-suppression file.
Comment 3 Alex Cherepanov 2006-09-13 08:03:32 UTC
This is one case of suspected bug in GC in details. Ghostscript is compiled with
-O0 option to avoid the side effects of optimization, if any.


valgrind --error-limit=no --db-attach=yes --tool=memcheck gs/debugobj/gs -Z@
-dNOPAUSE -dBATCH\
-sDEVICE=pgmraw -sOutputFile=/dev/null /home/alexcher/comparefiles/148-05.ps

After reporting a few trivial bugs, we have this message without -Z@ .

==14880== Conditional jump or move depends on uninitialised value(s)
==14880==    at 0x811E871: ptr_struct_mark (igc.c:1070)
==14880==    by 0x811E1AA: gc_trace (igc.c:860)
==14880==    by 0x811CABE: gs_gc_reclaim (igc.c:326)
==14880==    by 0x80EA24E: gs_vmreclaim (ireclaim.c:153)
==14880==    by 0x80EA035: ireclaim (ireclaim.c:75)
==14880==    by 0x80E3A8D: interp_reclaim (interp.c:419)
==14880==    by 0x80E719C: interp (interp.c:1670)
==14880==    by 0x80E3CA5: gs_call_interp (interp.c:488)
==14880==    by 0x80E3B1D: gs_interpret (interp.c:446)
==14880==    by 0x80D894A: gs_main_interpret (imain.c:214)
==14880==    by 0x80D9424: gs_main_run_string_end (imain.c:530)
==14880==    by 0x80D92F9: gs_main_run_string_with_length (imain.c:488)


With -Z@ this bug is not observed. If gs_alloc_memset() is nodified
to clear the memory for 12983 invocation only, the bug is not
observed either. 

static ulong set_count = 12983;
void gs_alloc_memset(void *ptr, int /*byte */ fill, ulong lsize)
{
    ulong msize = lsize;
    char *p = ptr;
    int isize;

    if(set_count)
      { set_count--;

        if (set_count == 0)
          {
            for (; msize; msize -= isize, p += isize)
              {
                isize = min(msize, max_int);
                memset(p, fill, isize);
              }
          }
      }
}

This is a back trace from gs_alloc_memset() on 12983 invocation.

#0  0x08302d3c in gs_alloc_memset (ptr=0x44573b0, fill=161, lsize=20000) at
./src/gsmemory.c:96
#1  0x082ff3c0 in gs_heap_alloc_bytes (mem=0x4023028, size=20000,
cname=0x843bb2e "chunk") at ./src/gsmalloc.c:195
#2  0x083023a3 in gs_locked_alloc_bytes_immovable (mem=0x4023180, size=20000,
cname=0x843bb2e "chunk")
    at ./src/gsmemlok.c:189
#3  0x083036c2 in gs_retrying_alloc_bytes_immovable (mem=0x4023290, size=20000,
cname=0x843bb2e "chunk")
    at ./src/gsmemret.c:184
#4  0x082dc98b in alloc_acquire_chunk (mem=0x40247ec, csize=20000,
has_strings=1, cname=0x843bb2e "chunk")
    at ./src/gsalloc.c:1678
#5  0x082dc7b2 in alloc_add_chunk (mem=0x40247ec, csize=20000, cname=0x843bb2e
"chunk") at ./src/gsalloc.c:1633
#6  0x082dbbbf in alloc_obj (mem=0x40247ec, lsize=272, pstype=0x8391560,
flags=ALLOC_DIRECT,
    cname=0x838b1f7 "dict_create_contents(values)") at ./src/gsalloc.c:1244
#7  0x082da4ee in i_alloc_struct_array (mem=0x40247ec, num_elements=34,
pstype=0x8391560,
    cname=0x838b1f7 "dict_create_contents(values)") at ./src/gsalloc.c:651
#8  0x0811bc3e in gs_alloc_ref_array (mem=0x40247ec, parr=0x43d019c, attrs=112,
num_refs=33,
    cname=0x838b1f7 "dict_create_contents(values)") at ./src/ialloc.c:198
#9  0x080df1cc in dict_create_contents (size=17, pdref=0xbeb3e7fc, pack=1) at
./src/idict.c:190
#10 0x080deff9 in dict_alloc (mem=0x40247ec, size=17, pdref=0xbeb3e880) at
./src/idict.c:145
#11 0x081161d3 in zdefault_make_font (pdir=0x402d104, oldfont=0x44d48e8,
pmat=0xbeb3e9e0, ppfont=0xbeb3e990)
    at ./src/zfont.c:386
#12 0x08116111 in zbase_make_font (pdir=0x402d104, oldfont=0x44d48e8,
pmat=0xbeb3e9e0, ppfont=0xbeb3e990)
    at ./src/zfont.c:359
#13 0x082ef65f in gs_makefont (pdir=0x402d104, pfont=0x44d48e8, pmat=0xbeb3e9e0,
ppfont=0xbeb3e990)
    at ./src/gsfont.c:509
#14 0x08115fd1 in make_font (i_ctx_p=0x4034bdc, pmat=0xbeb3e9e0) at
./src/zfont.c:319
#15 0x081157a1 in zscalefont (i_ctx_p=0x4034bdc) at ./src/zfont.c:86
#16 0x080e3631 in call_operator (op_proc=0x811572e <zscalefont>,
i_ctx_p=0x4034bdc) at ./src/interp.c:104
#17 0x080e5f33 in interp (pi_ctx_p=0x40233a4, pref=0xbeb3f090,
perror_object=0xbeb3f118) at ./src/interp.c:1261
#18 0x080e3ca6 in gs_call_interp (pi_ctx_p=0x40233a4, pref=0xbeb3f090,
user_errors=1, pexit_code=0xbeb3f120,
    perror_object=0xbeb3f118) at ./src/interp.c:488
#19 0x080e3b1e in gs_interpret (pi_ctx_p=0x40233a4, pref=0xbeb3f090,
user_errors=1, pexit_code=0xbeb3f120,
    perror_object=0xbeb3f118) at ./src/interp.c:446
#20 0x080d894b in gs_main_interpret (minst=0x4023350, pref=0xbeb3f090,
user_errors=1, pexit_code=0xbeb3f120,
    perror_object=0xbeb3f118) at ./src/imain.c:214
#21 0x080d9425 in gs_main_run_string_end (minst=0x4023350, user_errors=1,
pexit_code=0xbeb3f120,
    perror_object=0xbeb3f118) at ./src/imain.c:530
#22 0x080d92fa in gs_main_run_string_with_length (minst=0x4023350,
    str=0x41d7f68
"<2f686f6d652f616c6578636865722f636f6d7061726566696c65732f3134382d30352e7073>.runfile",
length=84,
    user_errors=1, pexit_code=0xbeb3f120, perror_object=0xbeb3f118) at
./src/imain.c:488
#23 0x080d9262 in gs_main_run_string (minst=0x4023350,
    str=0x41d7f68
"<2f686f6d652f616c6578636865722f636f6d7061726566696c65732f3134382d30352e7073>.runfile",
    user_errors=1, pexit_code=0xbeb3f120, perror_object=0xbeb3f118) at
./src/imain.c:470
#24 0x080dbe6d in run_string (minst=0x4023350,
    str=0x41d7f68
"<2f686f6d652f616c6578636865722f636f6d7061726566696c65732f3134382d30352e7073>.runfile",
options=3)
    at ./src/imainarg.c:803
#25 0x080dbe1e in runarg (minst=0x4023350, pre=0x838999b "", arg=0x404a8d8
"/home/alexcher/comparefiles/148-05.ps",
    post=0x8389a35 ".runfile", options=3) at ./src/imainarg.c:793
#26 0x080dbb28 in argproc (minst=0x4023350, arg=0xbeb3fcef
"/home/alexcher/comparefiles/148-05.ps")
    at ./src/imainarg.c:728
#27 0x080da684 in gs_main_init_with_args (minst=0x4023350, argc=7,
argv=0xbeb3fb74) at ./src/imainarg.c:211
#28 0x0804af64 in main (argc=7, argv=0xbeb3fb74) at ./src/gs.c:77
(gdb)

So far I found nothing wrong in gs code. Suggestions are welcome.

Comment 4 leonardo 2006-09-13 09:48:31 UTC
Well, alloc_add_chunk creates a chunk of memory, which will store a lot of 
objects. You clean entire chunk, so you initialize all those objects. We need 
to know which object is important. 

Here is my suggestion :
1. Patch alloc_obj to know how many objects are allocated in same chunk.
2. Use a binary search for finding the object, which needs an initialization. I 
mean yuou patch the code with specific object number interval and apply an 
initialization for objects in that interval, run Valgrind, then divide the 
interval and so forth. Well it will need a dozen of runs for licating the bad 
object.
Comment 5 Alex Cherepanov 2006-09-13 18:32:46 UTC
This is a minimal interval that fixes the bug. 

void gs_alloc_memset(void *ptr, int /*byte */ fill, ulong lsize)
  { ulong msize = lsize;
    char *p = ptr;
    if(set_count)
      { set_count--;
        if (set_count == 0)
          { if(msize == 20000) /* safety */
              { int n = 1196;
                memset(p+n, fill, 1991-n);
              }
          }
      }
  }
Comment 6 leonardo 2006-09-13 23:36:40 UTC
Alex,

Hmm, your interval is not what I expected, but lmaybe we can go with it after 
an additional effort. 

You set up an interval in the address space.
I suggested to do in the object number space, with counting calls to alloc_obj.

Well, lets continue with your interval.

First I'm not clear why did you drop set_count. You need both the count and the 
interval.

Then set a breakpoint to alloc_obj and find which calls return an object, which 
has a non-empty intersection with this interval; If the interval is really 
minimal, it should find a single call. Obtain C stack track from that call.
Comment 7 Alex Cherepanov 2006-09-14 08:03:41 UTC
Created attachment 2480 [details]
range.txt.bz2

Attached is the allocation trace with -ZA. Search for the word "range" for the
memset range in the trace file.

It looks like the 1st offending allocation is
alloc_save_change alloc_change(20) = 0x4457868

The interval is not exactly minimal. I found the start and end points but there

may be holes in the middle. It doesn't have to contain a single object, but
start and end at the affected objects.

Unfortunately, I cannot debug. When I call debugger from Valgring, the debugger

fails afrer a few commands. When I call the same program without Valgrind, I
get 
a different memory allocation pattern and the numbers don't apply.
Comment 8 Alex Cherepanov 2006-09-23 17:13:33 UTC
Affected files:
  messenger16.pdf
  Testform.v1.0.2.pdf
  415-01.ps
  ai2.pdf
  86554321.pdf
  HeiseiMinStd.pdf
  laballade.pdf
  mspro.pdf
  test.pdf
  test_multipage_prob.pdf
  Popafinalrep.pdf
  Altona-Testsuite_p2_S_x3.pdf
  gs.anotherfailure.pdf

The bug is added to toolbin/vg_bugs.supp
Comment 9 leonardo 2006-09-26 01:56:41 UTC
Changing the bug title for an uniform style. The old title was "Valgrind: 
Uninitialized data in garbage collection".
Comment 10 Alex Cherepanov 2006-09-28 16:39:27 UTC
Here's an important discovery: there's no GC-related warnings with
I_FORCE_GLOBAL_GC = true;
Comment 11 leonardo 2006-09-28 17:21:23 UTC
I think "I_FORCE_GLOBAL_GC = true" simply miss the effect due to a different 
memory map. So it is important for reproducing the effect, but it is not 
important for debugging.
Comment 12 Alex Cherepanov 2006-09-29 17:23:57 UTC
I've tested the whole regression set - not just a single file.
It looks like the the object is not marked during the local mark
phase but it is enumerated in the enumeration phase. One of the
offending objects is a data buffer referenced from a stream
structure. How can it be?
Comment 13 leonardo 2006-10-01 14:29:11 UTC
Alex,

If you surely associate the effect with a stream buffer, it's a great progress.

I'm not sure what do you mean saying 'enumeration'. Each object provide 2 
garbager methods : enum_ptrs, and reloc_ptrs. They must deal with same set of 
pointers. A particular bug may appear if they deal with different sets.

Another harmful thing is a pointer from global memory to local memory. Such 
things happen too frequently. See bug 688333. If your casae is such one, please 
don't resolve it as a dup of 688333, because that bug is too bug and hard. If a 
global structure points to a local one, the local one may be released during 
the local heap reclaim, its chunk may be freed, and some later same address 
space may be allocated for a new chunk, which will be uninitialized or 
partially initialized. Tha partial initialization may explain the effect you 
observe.

I suggest you to obtain an additional information : what steram subclass is 
that ? Who allocates the stream and the data buffer ? Are they allocated in 
same memory heap ? What functions enumerate and relocate pointers for that 
stream subclass ? Are that functions deal with same pointer set ?

Hope this helps.
Comment 14 Alex Cherepanov 2008-11-27 10:35:39 UTC
Bug 690176 tackles the same problem from the other end.

*** This bug has been marked as a duplicate of 690176 ***