Bug 694871 - Crash in jpeg_free when using run operator on certain EPS files
Summary: Crash in jpeg_free when using run operator on certain EPS files
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PS Interpreter (show other bugs)
Version: 9.10
Hardware: PC All
: P1 normal
Assignee: Chris Liddell (chrisl)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-07 06:42 UTC by Michael Toftdal
Modified: 2014-02-09 21:30 UTC (History)
3 users (show)

See Also:
Customer: 200
Word Size: 32


Attachments
Tar file with rogtest5.test and Test.eps. Test.eps wraps rogtest5.eps. (7.49 MB, application/octet-stream)
2014-01-07 06:42 UTC, Michael Toftdal
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Toftdal 2014-01-07 06:42:54 UTC
Created attachment 10509 [details]
Tar file with rogtest5.test and Test.eps. Test.eps wraps rogtest5.eps.

I sometimes get a crash related to jpeg_free on certain EPS files when I wrap them with a run operator. I don't know if this is related to http://bugs.ghostscript.com/show_bug.cgi?id=694658 .

On Linux I have a clean (Artifex) Ghostscript 9.10 and get the following (on the attached files):

%> ghostscript-9.10/bin/gs -sDEVICE=ppmraw -o /dev/null Test.eps
Artifex Ghostscript 9.10 (2013-08-30)
Copyright (C) 2013 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb045834!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb0461f4!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb04548c!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb04548c!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb0460b4!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb922ab4!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb205f34!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb205f34!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb205f34!
Artifex Ghostscript 9.10: ./base/sjpegc.c(234): Freeing unrecorded JPEG data 0xb206334!

%> valgrind ghostscript-9.10/bin/gs -sDEVICE=ppmraw -o /dev/null Test.eps
==23990== Memcheck, a memory error detector
==23990== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==23990== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==23990== Command: ghostscript-9.10/bin/gs -sDEVICE=ppmraw -o /dev/null Test.eps
==23990== 
Artifex Ghostscript 9.10 (2013-08-30)
Copyright (C) 2013 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
==23990== Invalid read of size 4
==23990==    at 0x80F98D1: jpeg_free (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x80FB0EE: free_pool (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x80FB1B9: self_destruct (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x80FADB6: jpeg_destroy (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x80F9B30: gs_jpeg_destroy (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x8114849: s_DCTD_release (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x8156B66: sclose (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x8161318: zclosefile (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x814A6DD: gs_call_interp (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x814B54B: gs_interpret (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x813E32C: gs_main_run_string_with_length (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x813E399: gs_main_run_string (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==  Address 0x78b3cf0 is 64 bytes inside a block of size 68 free'd
==23990==    at 0x517F79D: free (vg_replace_malloc.c:325)
==23990==    by 0x8474EC5: i_free_all (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x818D4A3: alloc_restore_step_in (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x816B409: zrestore (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x814A6DD: gs_call_interp (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x814B54B: gs_interpret (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x813E32C: gs_main_run_string_with_length (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x813E399: gs_main_run_string (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x813EFF6: run_string (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x813F867: runarg (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x814174C: gs_main_init_with_args (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990==    by 0x809D17F: main (in /tmp/mito/ghostscript-9.10/bin/gs)
==23990== 
...
and then a lot of more related errors.

Regards,
Michael
Comment 1 Ken Sharp 2014-01-07 06:54:04 UTC
For me this seg faults running the original EPS file, no need to execute 'run', the call stack is as per the Valgrind output. Its not a PostScript interpreter problem though, its the DCTDecoder.

I'm assigning this to Henry for now, possibly it should go to someone else.

P1 for crash and for customer bug report.
Comment 2 Henry Stiles 2014-01-08 11:35:22 UTC
with:

gs -sDEVICE=ppmraw -o out.ppm rogtest5.eps

I get no crash in the 9.10 release or the current code (linux 64).  I also get a clean valgrind run.  Am I missing something, Ken?
Comment 3 Michael Toftdal 2014-01-08 11:40:48 UTC
I also don't see a problem without using the Test.eps wrapper. Can you reproduce the symptoms I describe if you use the wrapper?

Regards,
Michael
Comment 4 Henry Stiles 2014-01-08 11:58:43 UTC
(In reply to comment #3)
> I also don't see a problem without using the Test.eps wrapper. Can you
> reproduce the symptoms I describe if you use the wrapper?
> 
> Regards,
> Michael

I confirm that yes, will continue looking, very strange.
Comment 5 Ken Sharp 2014-01-08 12:01:02 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > I also don't see a problem without using the Test.eps wrapper. Can you
> > reproduce the symptoms I describe if you use the wrapper?
> > 
> > Regards,
> > Michael
> 
> I confirm that yes, will continue looking, very strange.

For me it crashes on a 32-bit Windows 7 debug build (64-bit Windows) as described without using the wrapper. Using the wrapper has exactly the same result for me.

However...... its possible I was using the Luratech decoder, I'm not sure if we use that for regular JPEG data.
Comment 6 Ken Sharp 2014-01-09 00:30:50 UTC
(In reply to comment #5)

> However...... its possible I was using the Luratech decoder, I'm not sure if
> we use that for regular JPEG data.

On Windows, 32-bit debug build, both EPS files crash in what looks like the same fashion. The call stack is as per the Valgrind report in comment #1. This is without the Luratech decoder being built in.

NB the rogtest5.eps file seems to enter an infinite loop instead of crashing unless its run in the debugger. The test.eps file crashes no matter what.

In no cases do I get any of the warning messages.
Comment 7 Hin-Tak Leung 2014-01-09 12:35:30 UTC
9.10 x86_64 linux release (shipped by redhat) finishes successfully, but the debug build of head - also x86_64 linux - I have lying around segfault at:

Program received signal SIGSEGV, Segmentation fault.
0x00000000004d2e1e in jpeg_free (cinfo=0x20ab170, data=0x20abb48, info=0xaa44e8 "Freeing JPEG small internal data")
    at ./base/sjpegc.c:229
229	    while(p && p->data != data)

The rest of the call stack is essentially the same.
Comment 8 Henry Stiles 2014-01-09 16:04:53 UTC
I had a chance to look at this a bit today.  The patch below works, butorks but needs more review.  The memory associated with the DCT stream is allocated in a save context and the jpeg library issues frees after the context is restored.  The "restore" frees these memory references before the JPEG library. I *think* the DCT stream is torn down in the saved context but the JPEG library is lazy about when it calls the free, I'm guessing that part from seeing "free_pool" in the backtrace but I haven't verified.

In the meantime, this patch passes regression, I'll do more follow up tomorrow.

diff --git a/gs/base/sjpegc.c b/gs/base/sjpegc.c
index 2ea60d2..6711853 100644
--- a/gs/base/sjpegc.c
+++ b/gs/base/sjpegc.c
@@ -202,9 +202,9 @@ jpeg_alloc(j_common_ptr cinfo, size_t size, const char *info)
     jpeg_compress_data *jcd = cinfo2jcd(cinfo);
     gs_memory_t *mem = jcd->memory;
 
-    jpeg_block_t *p = gs_alloc_struct_immovable(mem, jpeg_block_t,
+    jpeg_block_t *p = gs_alloc_struct_immovable(mem->stable_memory, jpeg_block_t,
                         &st_jpeg_block, "jpeg_alloc(block)");
-    void *data = gs_alloc_bytes_immovable(mem, size, info);
+    void *data = gs_alloc_bytes_immovable(mem->stable_memory, size, info);
 
     if (p == 0 || data == 0) {
         gs_free_object(mem, data, info);
@@ -225,7 +225,7 @@ jpeg_free(j_common_ptr cinfo, void *data, const char *info)
     jpeg_block_t  *p  =  jcd->blocks;
     jpeg_block_t **pp = &jcd->blocks;
 
-    gs_free_object(mem, data, info);
+    gs_free_object(mem->stable_memory, data, info);
     while(p && p->data != data)
       { pp = &p->next;
         p = p->next;
@@ -234,7 +234,7 @@ jpeg_free(j_common_ptr cinfo, void *data, const char *info)
       lprintf1("Freeing unrecorded JPEG data 0x%lx!\n", (ulong)data);
     else
       *pp = p->next;
-    gs_free_object(mem, p, "jpeg_free(block)");
+    gs_free_object(mem->stable_memory, p, "jpeg_free(block)");
 }
 
 void *
Comment 9 Ken Sharp 2014-01-10 00:14:08 UTC
(In reply to comment #8)

> but needs more review.  The memory associated with the DCT stream is
> allocated in a save context and the jpeg library issues frees after the
> context is restored.  The "restore" frees these memory references before the
> JPEG library.

This looks to be the same class of problem Chris helped me with at the beginning of the week. I really think that using the stable memory is the way to go, I presume that when you say it needs more work you mean you need to check the allocater usage more closely ?

EG this looks wrong:

> +    void *data = gs_alloc_bytes_immovable(mem->stable_memory, size, info);
>  
>      if (p == 0 || data == 0) {
>          gs_free_object(mem, data, info);

Looks like you are using the wrong allocater to free the memory
Comment 10 Henry Stiles 2014-01-10 07:04:41 UTC
(In reply to comment #9)
> (In reply to comment #8)
> 
> > but needs more review.  The memory associated with the DCT stream is
> > allocated in a save context and the jpeg library issues frees after the
> > context is restored.  The "restore" frees these memory references before the
> > JPEG library.
> 
> This looks to be the same class of problem Chris helped me with at the
> beginning of the week. I really think that using the stable memory is the
> way to go, I presume that when you say it needs more work you mean you need
> to check the allocater usage more closely ?
> 
> EG this looks wrong:
> 
> > +    void *data = gs_alloc_bytes_immovable(mem->stable_memory, size, info);
> >  
> >      if (p == 0 || data == 0) {
> >          gs_free_object(mem, data, info);
> 
> Looks like you are using the wrong allocater to free the memory

Oh of course, thank you.  What I wanted to look at or ask you or Chris is if we are terminating the DCT stream properly before the restore or the jpeg library waits to do the frees later even though we have shutdown the stream.  Perhaps the stream is not terminated properly and we don't shutdown jpeg properly then we do a restore that would suggest a different fix: terminating unfinished streams before any restore.
Comment 11 Ken Sharp 2014-01-10 07:12:53 UTC
(In reply to comment #10)

> Oh of course, thank you.  What I wanted to look at or ask you or Chris is if
> we are terminating the DCT stream properly before the restore or the jpeg
> library waits to do the frees later even though we have shutdown the stream.

AFACT the library is trying to free private memory after the restore.

> Perhaps the stream is not terminated properly and we don't shutdown jpeg
> properly then we do a restore that would suggest a different fix:
> terminating unfinished streams before any restore.

I think that could be problematic, I suspect its possible in PostScript to save, open a stream and then restore back. Detecting the open stream during the save would be difficult I think (Possibly even impossible). Hmm actually I guess we must be able to spot this, presumably when we restore to a point where the stream goes out of scope, we must close the stream. So the stream *should* terminate before the restore completes. However I guess its possible that memory it relies on might previously have been freed.

I believe its better to have the library use stable memory for any allocations which are not *required* to be affected by save and restore. So the stream itself is allocated in non-stable memory (because it has to obey save and restore) but any objects its holding on to should be in stable memory, and the stream should free them when it is restored out of existence.
Comment 12 Chris Liddell (chrisl) 2014-01-10 07:17:34 UTC
(In reply to comment #10)
> 
> Oh of course, thank you.  What I wanted to look at or ask you or Chris is if
> we are terminating the DCT stream properly before the restore or the jpeg
> library waits to do the frees later even though we have shutdown the stream.
> Perhaps the stream is not terminated properly and we don't shutdown jpeg
> properly then we do a restore that would suggest a different fix:
> terminating unfinished streams before any restore.

That effectively would mean to two gc scans: one to detect no longer referenced streams, and terminate them, and a second to to actually do the garbage collection of the restore. I really don't think that's a good idea.

If we move the jpeg memory to non-gc'ed memory, then the stream will get garbage collected, and as part of that, the stream will be terminated. That ought to solve the problem.
Comment 13 Henry Stiles 2014-01-10 07:23:36 UTC
(In reply to comment #12)
> (In reply to comment #10)
> > 
> > Oh of course, thank you.  What I wanted to look at or ask you or Chris is if
> > we are terminating the DCT stream properly before the restore or the jpeg
> > library waits to do the frees later even though we have shutdown the stream.
> > Perhaps the stream is not terminated properly and we don't shutdown jpeg
> > properly then we do a restore that would suggest a different fix:
> > terminating unfinished streams before any restore.
> 
> That effectively would mean to two gc scans: one to detect no longer
> referenced streams, and terminate them, and a second to to actually do the
> garbage collection of the restore. I really don't think that's a good idea.
> 
> If we move the jpeg memory to non-gc'ed memory, then the stream will get
> garbage collected, and as part of that, the stream will be terminated. That
> ought to solve the problem.

Oh I do like that, you win the bug assignment!
Comment 14 Chris Liddell (chrisl) 2014-01-10 07:24:15 UTC
(In reply to comment #11)
> However I guess its
> possible that memory it relies on might previously have been freed.

That's exactly the problem: the memory allocated by libjpeg is not known to any ENUM_**** or RELOC_**** gc functions, so they will be gc'ed as soon as the garbage collector touches them. There's no guarantee that will happen *after* the stream object has been gc'ed (and therefore, terminated).

> 
> I believe its better to have the library use stable memory for any
> allocations which are not *required* to be affected by save and restore. So
> the stream itself is allocated in non-stable memory (because it has to obey
> save and restore) but any objects its holding on to should be in stable
> memory, and the stream should free them when it is restored out of existence.

It should be non-gc memory, not just stable memory - stable memory is not subject to save/restore, but is still subject to garbage collection.
Comment 15 Henry Stiles 2014-01-13 07:13:53 UTC
(In reply to comment #14)
> (In reply to comment #11)
> > However I guess its
> > possible that memory it relies on might previously have been freed.
> 
> That's exactly the problem: the memory allocated by libjpeg is not known to
> any ENUM_**** or RELOC_**** gc functions, so they will be gc'ed as soon as
> the garbage collector touches them. There's no guarantee that will happen
> *after* the stream object has been gc'ed (and therefore, terminated).
> 
> > 
> > I believe its better to have the library use stable memory for any
> > allocations which are not *required* to be affected by save and restore. So
> > the stream itself is allocated in non-stable memory (because it has to obey
> > save and restore) but any objects its holding on to should be in stable
> > memory, and the stream should free them when it is restored out of existence.
> 
> It should be non-gc memory, not just stable memory - stable memory is not
> subject to save/restore, but is still subject to garbage collection.

That was not my read of the code:

/*
 * We don't want to allocate JPEG's private data directly from
 * the C heap, but we must allocate it as immovable; and to avoid
 * garbage collection issues, we must keep GC-traceable pointers
 * to every block allocated.
 */
typedef struct jpeg_block_s jpeg_block_t;
struct jpeg_block_s {
    jpeg_block_t *next;
    void *data;
};
#define private_st_jpeg_block()	/* in sjpegc.c */\
  gs_private_st_ptrs2(st_jpeg_block, jpeg_block_t, "jpeg_block_t",\
    jpeg_block_enum_ptrs, jpeg_block_reloc_ptrs, next, data)


But naturally the code might have evolved away from the original design.  For example I don't immediately see the code to enumerate the ptrs ... assign it back to me if you want me to look some more.  We should at least change this misleading comment and declaration if you are correct and the block list is not GC'd.
Comment 16 Chris Liddell (chrisl) 2014-01-13 07:32:34 UTC
(In reply to comment #15)
> That was not my read of the code:
> 
> /*
>  * We don't want to allocate JPEG's private data directly from
>  * the C heap, but we must allocate it as immovable; and to avoid
>  * garbage collection issues, we must keep GC-traceable pointers
>  * to every block allocated.
>  */
> typedef struct jpeg_block_s jpeg_block_t;
> struct jpeg_block_s {
>     jpeg_block_t *next;
>     void *data;
> };
> #define private_st_jpeg_block()	/* in sjpegc.c */\
>   gs_private_st_ptrs2(st_jpeg_block, jpeg_block_t, "jpeg_block_t",\
>     jpeg_block_enum_ptrs, jpeg_block_reloc_ptrs, next, data)

Hmm, I didn't/don't understand that comment. To me "directly from the C heap" would mean no garbage collection is involved, so we shouldn't need to keep GC-traceable pointers.

Everything we currently allocate for libjpeg comes from garbage collected "Postcript" (but immovable) memory, so it doesn't move, but is subject to GC *and* is also subject to save/restore. The file in question does something like:

/In (in-file) file /DCTDecode filter def
save
<<>> image
restore
In closefile

As a result, libjpeg allocates memory when decoding the DCT stream during the image operator, but that memory is freed by the restore, without libjpeg knowing about it.

Then we come to close the filter file, and try to clean up the libjpeg context, and memory libjpeg thinks it still holds has been freed - and we crash.

Now, I consider it wrong that a third party library should be using garbage collected memory, and *extremely* wrong that a third party library should be using memory subject to save/restore!

So, what I have done here is do away with the block list crap, use a chunk allocator for the libjpeg memory, and add a "_finalize" method for the DCT stream states (removing the "_release" method in the process).

Using the chunk allocator gives us a "firewall" in case libjpeg leaks memory (as we would have had previously).

I haven't quite finished testing and tidying it up, but getting there......
Comment 17 Chris Liddell (chrisl) 2014-01-16 01:28:24 UTC
Fixed in:
http://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=ec1c4adc