The following command lines act differently: ./build/release/mutool draw -o test.pgm ./missing_page_tree.pdf With the debug build the output is as expected: error: Malformed pages tree error: cannot load page 1 in file './missing_page_tree.pdf' error: cannot draw './missing_page_tree.pdf' and the mutool exit code is 1. With the release build the final error message is missing: error: Malformed pages tree error: cannot load page 1 in file './missing_page_tree.pdf' and more importantly the mutool exit code is 0. Note that in neither is case an output file produced.
So far I've only been able to reproduce this with Ubuntu 15.04. It does not appear to be related to which libraries/packages are installed. mutool built and run on clean install of Ubuntu 15.04 Desktop amd64 acts the same way.
I've looked into this and the problem is fixed by declaring fz_context *ctx to be volatile in mudraw_main(): diff --git a/source/tools/mudraw.c b/source/tools/mudraw.c index 693f6f9..56910c5 100644 --- a/source/tools/mudraw.c +++ b/source/tools/mudraw.c @@ -858,7 +858,7 @@ int mudraw_main(int argc, char **argv) char *password = ""; fz_document *doc = NULL; int c; - fz_context *ctx; + volatile fz_context *ctx; fz_alloc_context alloc_ctx = { NULL, trace_malloc, trace_realloc, trace_free }; fz_var(doc); This causes a bunch of warnings since ctx is passed to functions which are not expecting the fz_context * parameter to be volatile. I don't fully understand the fz_try/fz_catch macros (as the comment says "pay no attention to the man behind the curtain"), so it's possible that gcc 4.9.2 has a bug in the optimizer that is masked by the volatile declaration. I believe the issue happens in the call to drawrange() in mudraw_main() (or perhaps in the drawpage() call in drawrange()), in one of those places the fz_context gets corrupted.
(In reply to Marcos H. Woehrmann from comment #3) > I've looked into this and the problem is fixed by declaring > fz_context *ctx to be volatile in mudraw_main(): ctx is only ever assigned once, and it is never changed for the duration of the execution. It's not even assigned within an fz_try/fz_catch, so we don't have the excuse of a setjmp maybe resetting it. > This causes a bunch of warnings since ctx is passed to functions which are > not expecting the fz_context * parameter to be volatile. I'd be tempted to try calling fz_var(ctx); rather than setting it to be volatile, but that's a shot in the dark. I can't see why that should have an effect. A better test might be to put some prints in the fz_catch clauses within that function to see which ones are hit.
(In reply to Robin Watts from comment #4) > > I'd be tempted to try calling fz_var(ctx); rather than setting it to be > volatile, but that's a shot in the dark. I can't see why that should have an > effect. That does not help; still no error return with the release build. > > A better test might be to put some prints in the fz_catch clauses within > that function to see which ones are hit. As far as I can tell this is what is happening (in more detail than is necessary so that I can recall what I was doing when I revisit this later), line numbers based on mupdf 5fb01f6: In source/tools/mudraw.c the routine mudraw_main():855 calls drawrange():783 in a fz_try():1073/fz_catch():1188 block. drawrange() calls drawpage():243 which calls fz_load_page() in a fz_try()/fz_catch() block. This call fails because the input file is broken, so the fz_catch() clause at line 260 is executed. This calls source/fitz/error.c:fz_rethrow_message() which ends with a call to source/fitz/error.c:throw(). throw() calls fz_longjmp(ex->stack[ex->top].buffer, ex->stack[ex->top].code + 2) but the "+ 2" is not being passed back, causing ctx->error->stack[ctx->error->top].code to be left set to 0 and therefore the fz_catch() block at line 1115 is not executed. I ran mudraw.c through the preprocessor and line 1073 expands to: if (fz_push_try(ctx->error) && ((ctx->error->stack[ctx->error->top].code = __sigsetjmp (ctx->error->stack[ctx->error->top].buffer, 0)) == 0)) { do { I've run this prepressed code in a debugger and by settings breakpoints appropriately I can get to ctx->error->stack[ctx->error->top].code = __sigsetjmp (ctx->error->stack[ctx->error->top].buffer, 0) when the __sigsetjmp has been called by the fz_longjmp() in throw(). At this point: (gdb) print ctx->error->stack[ctx->error->top].code value has been optimized out (gdb) This obviously doesn't happen for the non-optimized build, it correctly reports .code set to a 2 after a single step. It also explains why declaring ctx as volatile works, since volatile variables can't be optimized out. It seems like optimizing out a variable that is being set by a call to __sigsetjmp() is not a good idea. Otoh, my understanding of compiler optimization is minimal; as far as I'm concerned many of the optimizations that gcc does are magic. The current released gcc (5.2.0) behaves in the same way, so if it is a compiler bug it hasn't been fixed. The last version of gcc that works appears to be 4.8.4 (I haven't tested 4.8.5, but 4.9.0 fails).
I was curious if it would be possible to bisect gcc to determine the commit that started this issue. Turns out it's easy: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=commit;h=a5ef9e4d
This is GCC (or in any case, compiler-) specific, you can tell GCC not to optimize out something by declaring: __attribute__((used)) i.e. fz_context __attribute__((used)) *ctx; I think. This is a little tips I came across in a few other projects where the improved link-time-optimization in gcc 4.9+ (4.8+ ? ) are overly agressive and optimize out things which should stay... if this work for you, you can make it a bit more portable with #ifdef __GNUC__ .
(In reply to Hin-Tak Leung from comment #7) > This is GCC (or in any case, compiler-) specific, you can tell GCC not to > optimize out something by declaring: > > __attribute__((used)) > > i.e. > > fz_context __attribute__((used)) *ctx; > > I think. > This does not work for me. The internets suggests: __asm__ __volatile__("" :: "m" (ctx)); but that doesn't work either.
Fixed with: commit 1f15bade3112bfdc8ddbae9c4b06047427c2367f Author: Robin Watts <robin.watts@artifex.com> Date: Mon Sep 28 15:24:00 2015 +0100 Bug 696115: Further fix for setjmp/longjmp. Tor turned up an interesting section in the C spec about this. See page 275 of http://open-std.org/jtc1/sc22/wg14/www/docs/n1494.pdf regarding acceptable places for setjmp to occur. It seems that: if (setjmp(buf)) if (!setjmp(buf)) if (setjmp(buf) {==,!=,<,>} <integer constant>) etc are all valid things to do, but assignments (and subsequent testing of values) like: if ((code = setjmp(buf)) == 0) are not allowed. Further, it's not even clear that: if (a() && setjmp(buf)) is permissible. We therefore recast the macros into the form: a(); if (setjmp((buf)) == 0) which should be acceptable under the C spec. To keep try atomic, we introduce a block '{{{' around this, along with a matching close block '}}}' in the catch clause. This has the nifty extra effect of giving us a compile time error if we mismatch our try/catches. Thanks!