Bug 689756 - process CPU spin then SIGSEGV when rendering user defined font
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: X Display Driver
Version: 8.61
Hardware: PC Linux
Importance: P4 normal
Assignee: Ken Sharp
Reported: 2008-03-18 07:30 UTC by Eric Lee
Modified: 2010-05-12 15:03 UTC
CC List: 2 users

Attachments
A better test data file than try.ps (44.10 KB, application/postscript)
2008-03-21 12:01 UTC, Eric Lee

Description Eric Lee 2008-03-18 07:30:55 UTC
The command:
ghostscript try.ps
fails to draw in the output X window. Instead it enters a CPU loop and its VM
size increases over several seconds until a SEGV kills it.
ghostscript: ghostscript-8.61-8.fc8
X11: xorg-x11-server-Xorg-1.3.0.0-42.fc8
OS: Fedora release 8 (Werewolf)

Test data file: http://www.sentry.plus.com/try.ps

This test file 'try.ps' renders perfectly using ghostscript 7.07
on top of
X11: xorg-x11-6.8.2-1.FC3.45.2

It seems to be related to resolution, as adding "-r100" to the ghostscript 8.61
command line makes it work; this was, however, never necessary with 7.07.
Using a different output device on 8.61 also has no problems e.g.
ghostscript -sDEVICE=ppm -sOutputFile=try.ppm try.ps < /dev/null
works fine
Comment 1 Marcos H. Woehrmann 2008-03-20 11:03:30 UTC
On my linux box "bin/gs try.ps" works but "bin/gs -r500 try.ps" produces an error:

  Error: /unknownerror in --.type1execchar--
Comment 2 Ray Johnston 2008-03-20 11:23:49 UTC
On Cygwin, I don't get a report of a 'seg fault', but I do see the X11 window
close after printing ONLY 'Get here' (the 'Never get here' message is not printed)
when using the default resolution or -r75. At other resolutions (72, 80,
100, 200, 500, 600) the Cygwin x11 device works fine (unlike what I see on linux).
Comment 3 Marcos H. Woehrmann 2008-03-20 11:26:35 UTC
This problem goes away when using -sDEVICE=x11 instead of x11alpha (x11alpha is
the default device with gs8.61 and later).

Using -sDEVICE=x11 -dTextAlphaBits=4 also fails with -r500 (with all versions of
Ghostscript).
Comment 4 Marcos H. Woehrmann 2008-03-20 11:32:23 UTC
My comment #3 explains the behaviour change originally reported.

Please try -sDEVICE=x11 and see if that restores the old behaviour.
Comment 5 Marcos H. Woehrmann 2008-03-20 11:41:37 UTC
And, as Ray points out, adding -dMaxBitmap=400000000 fixes the issue with
-sDEVICE=x11alpha
Comment 6 Ray Johnston 2008-03-20 11:51:46 UTC
Testing with Cygwin at low resolutions (72 -> 85 dpi) I see failures at some of
the resolutions: 73, 74, 75, 78, 81, 83, 84 BUT some of the other resolutions
work: 72, 76, 77, 79, 80, 82 and 85 dpi

STRANGE.
Comment 7 Eric Lee 2008-03-21 12:01:48 UTC
Created attachment 3883
A better test data file than try.ps

gs crashes with the command:
gs schdview.12656.ps 
and also:
gs -sDEVICE=x11 schdview.12656.ps
but gs 7.07 (and my Adobe-equipped printer!) is OK with either
Comment 8 James Cloos 2009-04-21 12:27:23 UTC
Wow.  The backtrace (from running attachment #3883 through gs -dBATCH -r96
test.ps, which defaults to the x11alpha device) ends with:

#260845 0x08096492 in bbox_stroke ()
#260846 0x08096032 in bbox_finish_stroke ()
#260847 0x08096492 in bbox_stroke ()
#260848 0x08096032 in bbox_finish_stroke ()
#260849 0x08096f02 in charstring_execchar_aux ()
#260850 0x0809703f in charstring_execchar ()
#260851 0x0809709a in ztype1execchar ()
#260852 0x080f2311 in interp ()
#260853 0x080f2f9e in gs_interpret ()
#260854 0x080e7e08 in gs_main_run_string_with_length ()
#260855 0x080e7e4a in gs_main_run_string ()
#260856 0x080e8b07 in run_string ()
#260857 0x080e9288 in runarg ()
#260858 0x080e948e in argproc ()
#260859 0x080ead97 in gs_main_init_with_args ()
#260860 0x08085d5a in main ()

That means there are more than 130 thousand recursions through
bbox_finish_stroke() and bbox_stroke().

I’d have to spend more time reading the PS, but does the font itself recurse?

(I ran it with 8.64 on (32 bit) x86 using gentoo’s -r2 ebuild.  The patches
gentoo adds to the 8.64 tar are available at:

http://gentoo.mirrors.pair.com/distfiles/ghostscript-gpl-8.64-patchset-3.tar.bz2

I believe that all of the patches in that tar came from upstream svn.)
Comment 9 Eric Lee 2009-05-20 03:18:15 UTC
I don't think it's worth examining the PS as it renders without error on gs-7.07
or genuine Adobe interpreters. It's only gs-8.xx that has a problem.
Comment 10 James Cloos 2010-05-07 17:04:14 UTC
Although gs 8.71 still dumps core when running test.ps, the icc_work branch does not.

I tried:

:; for ij in pbm pgm; do
:; /opt/iccgs/bin/gs -r96 -dBATCH -dNOPAUSE -sDEVICE=${ij} -o test.${ij} test.ps
:; done

There is still a bug, however.  When using the pbm device, the result is a portrait aspect image with vertical text, whereas with the pgm device the image is still portrait aspect, but the text and graphics are rendered horizontally, leaving the top of the image blank and cropping off the right of the text and graphics.

However, that said, I see that test.ps has code to treat gs’s x11, ppm, pgm and png DEVICEs specially, setting /GSdevice to 1 for x11 and to 2 for those three image DEVICEs.  That may be why the landscaping fails for pgm & ppm.

For some reason, pbm vs pgm changes how landscaping occurs.

All of the x11* devices work like the pbm in terms of the landscaping, and all but x11cmyk like pgm in terms of colour, when rendering test.ps.

I suspect the icc_work branch is successful with this file because of its use of FAPI and freetype, suggesting that the bug which causes gs to recurse until it dumps core is in gs’s internal font handling.

FAPI outputs this:

Loading NimbusSanL-Regu font from /opt/iccgs/share/ghostscript/8.72/Resource/Font/NimbusSanL-Regu...
Font NimbusSanL-Regu is being rendered with FAPI=FreeType
2568432 1212504 4317244 3020348 1 done.
Font Helvetica is mapped to FAPI=FreeType
Font Helvetica-FatOutline is mapped to FAPI=FreeType
Font font8 is mapped to FAPI=FreeType
Loading Dingbats font from /opt/iccgs/share/ghostscript/8.72/Resource/Font/Dingbats...
Font Dingbats is being rendered with FAPI=FreeType
2588528 1270222 4407312 3112721 2 done.
Font ZapfDingbats is mapped to FAPI=FreeType
Font font99 is mapped to FAPI=FreeType

I’d guess that either Helvetica-FatOutline, font8 (a re-encoding of Helvetica) or font99 (a re-encoding of Helvetica-FatOutline) causes gs w/o FAPI to recurse.
Comment 11 Ken Sharp 2010-05-08 08:49:55 UTC
The only file attached here is 'schdview.12656.ps', which sadly doesn't cause me a problem with 8.71, even without FAPI. It also doesn't use the fonts James named in comment 10. Since the problem is with strokes, it seems likely that we need a font which is stroked, which is not the case with this test file.

I assume that there was some other test file somewhere. James, could you attach test.ps to this bug report, please?

Also changed assignment to me and CC'ed Chris, for font bug reports.
Comment 12 James Cloos 2010-05-08 13:16:09 UTC
The test.ps I used for comment #10 is the same as the one from comment #8, which is attachment #3883.

I just downloaded it again and the sha1sum(1)s match.

And the segv with 8.71 is exactly the same as with 8.64: bbox_stroke() and bbox_finish_stroke() recurse until the process runs out of resources.  (I presume it runs out of stack?)

Which suggests that the file’s definition of Helvetica-FatOutline may be to blame.
Comment 13 James Cloos 2010-05-08 13:26:03 UTC
I forgot to add:

I also don’t see why you see different font usage.

And running the icc_work branch with -dDisableFAPI=true also generates a SEGV, after recursing through those same two functions, but with the extra bit at the top (this particular build was configured with -O2 -ggdb):

#0  gdev_mem_open_scan_lines (mdev=0x89ca46c, setup_height=39) at ./base/gdevmem.c:435
#1  0x08484512 in mem_open (dev=0x0) at ./base/gdevmem.c:400
#2  0x08436183 in gx_open_cache_device (dev=0x89ca46c, cc=0x8979c20) at ./base/gxccman.c:698
#3  0x08436820 in gx_alloc_char_bits (dir=0x8807bd4, dev=0x89ca46c, dev2=0x0, iwidth=38, 
    iheight=39, pscale=0xbf608c0c, depth=1, pcc=0xbf608c14) at ./base/gxccman.c:680
#4  0x08438704 in set_cache_device (penum=0x8998ccc, pgs=0x880f5d4, llx=-940.16034039911392, 
    lly=-1035, urx=1891.0880296074722, ury=1703) at ./base/gxchar.c:605
#5  0x08438cdf in gx_show_text_set_cache (pte=0x8998ccc, pw=0xbf608d30, 
    control=TEXT_SET_CACHE_DEVICE) at ./base/gxchar.c:354
#6  0x084323f2 in gs_text_setcachedevice (pte=0x8998ccc, wbox=0xbf608d30)
    at ./base/gstext.c:641
#7  0x080b1b35 in zchar_set_cache (i_ctx_p=0x881fcf8, pbfont=0x89769dc, pcnref=0xbf608e04, 
    psb=0x0, pwidth=0xbf60b81c, pbbox=0xbf60b830, cont=0x80affe0 <bbox_finish_stroke>, 
    exec_cont=0xbf60b8ec, Metrics2_sbw_default=0x0) at ./psi/zcharout.c:276
#8  0x080af916 in type1exec_bbox (i_ctx_p=0x881fcf8, penum=<value optimized out>, 
    pcxs=<value optimized out>, pfont=0x89769dc, exec_cont=0xbf60b8ec) at ./psi/zchar1.c:421
#9  0x080b033b in bbox_draw (i_ctx_p=0x881fcf8, draw=<value optimized out>, 
    exec_cont=0xbf60b8ec) at ./psi/zchar1.c:661
#10 0x080b0450 in bbox_stroke (i_ctx_p=0x881fcf8) at ./psi/zchar1.c:682
#11 0x080b0012 in bbox_finish_stroke (i_ctx_p=0x881fcf8) at ./psi/zchar1.c:500
#12 0x080b0462 in bbox_stroke (i_ctx_p=0x881fcf8) at ./psi/zchar1.c:684
#13 0x080b0012 in bbox_finish_stroke (i_ctx_p=0x881fcf8) at ./psi/zchar1.c:500
#14 0x080b0462 in bbox_stroke (i_ctx_p=0x881fcf8) at ./psi/zchar1.c:684
(etc)


One interesting point I noticed looking at bbox_stroke() and bbox_finish_stroke() is that the former calls the function pointed to by exec_cont like:

        code = (*exec_cont)(i_ctx_p);

whereas the latter uses:

        code = exec_cont(i_ctx_p);

The fill functions do the same thing.

FWIW.
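
For reference, the two call spellings shown above are equivalent in C: a function pointer can be called with or without an explicit dereference, so the difference is stylistic and cannot by itself change behaviour. A minimal sketch, in which the `op_proc` typedef and the `answer` function are hypothetical stand-ins rather than Ghostscript's own types and procedures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an operator continuation: takes a context
   pointer and returns a status code. Not Ghostscript's real typedef. */
typedef int (*op_proc)(void *ctx);

/* Hypothetical continuation that just returns a fixed code. */
static int answer(void *ctx)
{
    (void)ctx;
    return 42;
}

/* Call through the pointer with an explicit dereference. */
static int call_deref(op_proc exec_cont, void *ctx)
{
    return (*exec_cont)(ctx);
}

/* Call through the pointer directly; C treats this identically. */
static int call_plain(op_proc exec_cont, void *ctx)
{
    return exec_cont(ctx);
}

/* Both spellings must agree for any continuation passed in. */
static void check_equivalent(op_proc p)
{
    assert(call_deref(p, NULL) == call_plain(p, NULL));
}
```

Calling `check_equivalent(answer)` passes silently, since both helpers invoke the same function with the same argument.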
Comment 14 Ken Sharp 2010-05-08 15:19:03 UTC
James, I'm not casting doubt, but I can't reproduce any problems. I've tried on Windows and Fedora, with a variety of devices. I've also picked up the original file (try.ps).

Both files work for me with and without FreeType in any configuration I have. I'd love to close the issue as 'fixed', but I'm not totally comfortable with doing so if I can't see the original fault.

Chris, could you try this as well, please? On a Linux system, no need to run Windows.

Final note, if I run the file with -sDEVICE=x11 -r500 I *can* get a seg fault, but it occurs for both FreeType and regular rendering, so I'm really not convinced that the FreeType code 'fixes' anything.
Comment 15 James Cloos 2010-05-08 18:09:10 UTC
Even with the icc branch (up to date, and as such almost up to date with trunk),
when I run that file at any resolution with any device it works when FAPI is used and crashes when -dDisableFAPI=true.

And 8.71 never works.

Gentoo’s patches for 8.71, based on reading the ebuild as I have it installed, are borrowed from fedora and so should be similar to what you have.

Maybe you just have more ram installed?

I used this when I configured my compile of the icc branch:

env CFLAGS='-march=pentium3 -O2 -ggdb' ./configure --prefix=/opt/iccgs --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --enable-cairo --disable-cups --enable-gtk --with-jasper --with-x --disable-compile-inits --enable-dynamic --enable-fontconfig --with-drivers=ALL --with-ijs --with-jbig2dec --with-libpaper --with-jbig2dec

that was mostly grabbed from the logs of my last system build of 8.71, with a different prefix, cups disabled (because it tries to install to cups’ prefix rather than to gs’s prefix) and -ggdb added to the CFLAGS.
Comment 16 James Cloos 2010-05-08 19:18:09 UTC
And I can confirm that testing on a 64bit f11 box using
ghostscript-8.71-6.fc11.x86_64 also succeeds for me.

Further testing on the laptop confirms that stack size was the issue.  When I
raised the stack size from the default of 8192 to 10240 (matching my default on
that 64 bit server box) the file worked with non-x11 devices.

It still crashes, though, with any of the x11 devices, even with a stacksize of
65536, eight times the default.

But not with freetype, even with large values of -r, such as -r500.

So it is probably not worth further investigation, given that fapi is the way
forward.

But if you would like to see one of the core files, I can post one.
Comment 17 Hin-Tak Leung 2010-05-09 00:14:08 UTC
(In reply to comment #16)
But from what you wrote, it is both specific to 32-bit x86 (no crash on x86_64) and also sensitive to stack size.

To be honest, any postscript file which checks whether the output device is 'x11' and does different things compared to other devices is rather evil.
Comment 18 Chris Liddell (chrisl) 2010-05-10 07:47:29 UTC
I cannot reproduce this issue at any of the above stated resolutions and devices. That's using the bleeding edge source in svn.

Perhaps we could add these jobs to our regression suite, so if the problem is only temporarily hidden, or recurs, we should catch it reasonably quickly (the files look simple enough to just drop straight in)?
Comment 19 James Cloos 2010-05-12 14:36:02 UTC
(In reply to comment #17)
I wasn’t able to avoid the segv by altering the code in that file which sets or
uses GSdevice.

But changing the /Helvetica-FatOutline name to, eg, /Helvetica-Bold just before
the lines which print «Get here» and «Never get here», does stop the (seemingly
infinite) recursion.

The /Helvetica-FatOutline font creates something akin to PDF’s Tmode 2 (stroke
and fill), using a black fill and a heavy white stroke in this file.

GS, in some cases, is unable to determine the bounding box of that font; it
ends up with bbox_stroke() calling bbox_finish_stroke() by reference, which
then calls bbox_stroke() by reference, which then ....

With »ulimit -s 10240« it will manage to complete, but with other devices even
»ulimit -s 65536« is insufficient.

FAPI and freetype seem to avoid it.

The alternative fix would be to recode the bbox routines to avoid recursion for
that kind of font, but that seems to be too much work given that freetype
already avoids it.
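
For illustration only, the general shape of such a recode would be to replace the mutual recursion with a bounded loop. The sketch below is a hypothetical model, not Ghostscript code: the `box` type and every function name are invented for the example, and the real routines work through the interpreter's continuation mechanism rather than plain calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical axis-aligned box; not a Ghostscript type. */
typedef struct { double llx, lly, urx, ury; } box;

/* True when `inner` lies entirely within `outer`. */
static int box_contains(const box *outer, const box *inner)
{
    return inner->llx >= outer->llx && inner->lly >= outer->lly &&
           inner->urx <= outer->urx && inner->ury <= outer->ury;
}

/* Grow `b` in place so that it also covers `other`. */
static void box_union(box *b, const box *other)
{
    if (other->llx < b->llx) b->llx = other->llx;
    if (other->lly < b->lly) b->lly = other->lly;
    if (other->urx > b->urx) b->urx = other->urx;
    if (other->ury > b->ury) b->ury = other->ury;
}

/* Retry in a loop instead of by mutual recursion: each pass checks the
   stroke extent against the current box and widens the box on a miss.
   Returns the number of passes used, or -1 if max_passes ran out. */
static int fit_stroke_iteratively(box *bbox, const box *stroke_extent,
                                  int max_passes)
{
    int pass;

    assert(bbox != NULL && stroke_extent != NULL);
    for (pass = 1; pass <= max_passes; pass++) {
        if (box_contains(bbox, stroke_extent))
            return pass;
        box_union(bbox, stroke_extent);
    }
    return -1;
}
```

With a loop, stack use stays constant however many retries are needed, and the `max_passes` cap turns a runaway retry into an error return instead of a stack overflow.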
Comment 20 Ken Sharp 2010-05-12 15:03:22 UTC
(In reply to comment #19)

> The /Helvetica-FatOutline font creates something akin to PDF’s Tmode 2 (stroke
> and fill), using a black fill and a heavy white stroke in this file.
> 
> GS, in some cases, is unable to determine the bounding box of that font; it
> ends up with bbox_stroke() calling bbox_finish_stroke() by reference, which
> then calls bbox_stroke() by reference, which then ....

This isn't really determining the bounding box; as far as I can tell it's rendering the operation (stroke a path), limited by a defined bounding box. So in this case it's doing a stroke operation, clipped to a specific region.
 
> With »ulimit -s 10240« it will manage to complete, but with other devices even
> »ulimit -s 65536« is insufficient.

I have to confess I'm not at all sure *why* the type 1 font code goes recursive here, it seems bizarre to me. After a quick inspection it seems that the stroke lies at least partially outside the BoundingBox. That causes the code to expand the BoundingBox and try again. Trying again in this case means going back to the stroke code and seeing if it fits inside the new bounding box.

This seems rather pointless to me, might as well just ignore the bounding box and carry on, but that's the way it's coded. For me this never goes to more than 2 levels of recursion, as the expanded BBox always succeeds, so I'm not sure why you are seeing such deep recursion. (As far as I can see it shouldn't ever need to go any deeper than 2, but I may be missing something subtle.)
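
The expand-and-retry behaviour described above can be modelled in a few lines. This is a simplified, hypothetical model and not the code in zchar1.c: the `stroke_state` structure, the one-dimensional "half-width" box and both function names are assumptions chosen only to show why two levels should suffice when the expanded box covers the stroke.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for the model; not a Ghostscript structure. */
typedef struct {
    double bbox_half;   /* half-width of the current box around the origin */
    double stroke_half; /* half-width the stroked outline actually needs */
    int depth;          /* number of stroke attempts made so far */
} stroke_state;

static int model_finish_stroke(stroke_state *s);

/* Attempt the stroke; on a miss, hand off to the finish step. */
static int model_stroke(stroke_state *s)
{
    assert(s != NULL);
    s->depth++;
    if (s->stroke_half <= s->bbox_half)
        return 0; /* the stroke fits inside the box */
    return model_finish_stroke(s);
}

/* Expand the box to cover the stroke, then retry by calling back. */
static int model_finish_stroke(stroke_state *s)
{
    s->bbox_half = s->stroke_half;
    return model_stroke(s);
}
```

Starting from a box of half-width 10 and a stroke needing 12.5, `model_stroke` returns after two attempts. In this model deeper recursion could only arise if the expansion step failed to cover the stroke; whether that is what happens in the reported case is not established here.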

 
> FAPI and freetype seem to avoid it.

Using FreeType to render the glyph results in a quite different set of code, so not too surprisingly it doesn't exercise this (type 1 font specific) code path.


> The alternative fix would be to recode the bbox routines to avoid recursion for
> that kind of font, but that seems to be too much work given that freetype
> already avoids it.

It would be 'interesting' to know why the old font code goes massively recursive, but as you say FreeType doesn't go down this code path at all, so it's not really worth it in my opinion either. I am at least happy that I know why the FreeType code 'fixes' the problem, so I'm going to close it as 'fixed'. Unless anyone objects, in which case it can be reopened.