Bug 691792 - core dumps when rendering (some glyphs of) CFF fonts
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Font API
Version: 9.00
Hardware: All All
Importance: P4 normal
Assignee: Ray Johnston
Reported: 2010-11-23 16:06 UTC by Thomas Kaiser
Modified: 2012-07-22 03:39 UTC
Description Thomas Kaiser 2010-11-23 16:06:57 UTC
Trying to render one of the PDF files supplied here

http://kaiser-edv.de/tmp/q0tRYM/

with GS 9 results in a core dump (Solaris/SPARC) or a "bus error" (Mac OS X Intel). If I use the '-dDisableFAPI' switch, everything is all right, so I would assume it's a FreeType issue? I collected the pstack/pargs output for 2 core files here:

http://kaiser-edv.de/tmp/q0tRYM/pstack-pargs.txt

We analyzed the PDF files in question. They all contain these two CFF fonts (originally Type 1-flavoured OTF). Output from pdffonts:

FranklinGotTOT-Hea  Type 1C  yes no  yes  154  0
FranklinGotTOT-Med  Type 1C  yes no  yes  443  0
Comment 1 Alex Cherepanov 2010-12-24 05:17:18 UTC
Local copy of the sample files is available in
spectre.ghostscript.com:/home/support/691792/

On Linux amd64, the sample files run to completion with
-dNumRenderingThreads=1 and deadlock otherwise.
Comment 2 Chris Liddell (chrisl) 2010-12-24 08:33:26 UTC
Hmm, I wrote this up a couple of weeks ago, but I must have forgotten to hit the commit button :-(

The deadlock, I thought, was due to a race condition in gsicc_findcachelink() here:

while (curr->valid == false) {
    curr->num_waiting++;
    gx_monitor_leave(icc_link_cache->lock);
    gx_semaphore_wait(curr->wait);
    gx_monitor_enter(icc_link_cache->lock);	/* re-enter briefly */
}

I figured the sequence is as follows. Say thread A is blocked on icc_link_cache->lock in gsicc_set_link_data(), and thread B is doing a cache lookup. Thread B finds the cache entry, sees that the entry is not yet valid, and enters the above while loop to wait for gsicc_set_link_data() to complete its work.

In the time between thread B unlocking icc_link_cache->lock and reaching the low level "wait" function, thread A can have completed gsicc_set_link_data() and moved on. The trouble, then, is that gx_semaphore_wait(curr->wait) above is waiting for a signal that has already gone past. This may not be an issue for Windows (I'm not totally sure of the behavior of ReleaseSemaphore/WaitForSingleObject), but is a problem with pthreads.
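
For illustration, the general way to avoid a lost wake-up of this kind with pthreads is to tie the wait to a predicate that is tested under the same mutex the signalling thread holds. The following is a minimal sketch of that pattern only; it is not the Ghostscript code, nor the fix that was eventually applied. The names `link_entry`, `entry_init`, `entry_wait_valid` and `entry_set_valid` are hypothetical and chosen just for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cache entry; NOT the real Ghostscript struct. */
typedef struct {
    pthread_mutex_t lock;   /* protects 'valid' */
    pthread_cond_t  cond;   /* broadcast when 'valid' becomes true */
    bool            valid;
} link_entry;

void entry_init(link_entry *e)
{
    int rc = pthread_mutex_init(&e->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&e->cond, NULL);
    assert(rc == 0);
    (void)rc;               /* silence unused warning when NDEBUG is set */
    e->valid = false;
}

/* Block until the entry is valid. The predicate is tested while holding
 * the mutex, and pthread_cond_wait() releases the mutex and starts waiting
 * as one atomic step, so a broadcast cannot slip in between the test and
 * the wait. If 'valid' is already true, no wait happens at all. */
void entry_wait_valid(link_entry *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->valid)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

/* Mark the entry valid and wake every waiter. The flag is changed under
 * the same mutex the waiters use to test it. */
void entry_set_valid(link_entry *e)
{
    pthread_mutex_lock(&e->lock);
    e->valid = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}
```

In this sketch a thread that calls `entry_wait_valid()` after `entry_set_valid()` has already run returns immediately, because it waits on the state of the flag, not on the signal itself.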

But an experiment to allow gx_semaphore_wait() to unlock the data monitor (icc_link_cache->lock) after it had locked the wait condition mutex didn't change the lockup, so maybe I'm barking up the wrong tree.

However, changing the loop so that it does not use a semaphore, but instead busy-loops polling for valid to become true, prevents the lockup (not a suggested fix!).
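
A polling loop of this kind cannot miss the change because it observes the state of the flag, not a one-off signal. A minimal sketch of the idea follows, assuming C11 atomics and `sched_yield()`; the experiment described here polled inside the existing monitor, so the atomics and the names `polled_entry`, `polled_init`, `polled_set_valid` and `polled_wait_valid` are assumptions made for illustration and not Ghostscript code.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag standing in for curr->valid. */
typedef struct {
    atomic_bool valid;
} polled_entry;

void polled_init(polled_entry *e)
{
    assert(e != NULL);
    atomic_init(&e->valid, false);
}

/* Publish the entry; the release store pairs with the acquire load below. */
void polled_set_valid(polled_entry *e)
{
    assert(e != NULL);
    atomic_store_explicit(&e->valid, true, memory_order_release);
}

/* Spin until the flag is set, yielding the CPU between polls.
 * Returns the number of polls that saw the flag still false. */
unsigned long polled_wait_valid(polled_entry *e)
{
    unsigned long polls = 0;

    assert(e != NULL);
    while (!atomic_load_explicit(&e->valid, memory_order_acquire)) {
        polls++;
        sched_yield();
    }
    return polls;
}
```

The cost is that a waiting thread keeps consuming CPU time, which is why polling is useful as a diagnostic here but not as a fix.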

I don't *think* the lockup is directly to do with FAPI; I suspect FAPI just changes the timings slightly and changes how the rendering code is exercised, which pushes it through the problem code. So reassigning to Michael (sorry!), and explicitly cc-ing Ray, as it's threading related.
Comment 3 Michael Vrhel 2011-02-01 17:14:09 UTC
Reassigning to Ray since this is a multithreaded issue.
Comment 4 Michael Vrhel 2011-02-01 17:15:21 UTC
Reassigning to Ray since this is a multithreaded issue.
Comment 5 Alex Cherepanov 2012-07-22 03:39:26 UTC
It looks like everything works in the current development version.
A new release is scheduled this August.

I've tried the sample files on an AMD x6 box and found no SEGVs.
At 360 dpi, 2 rendering threads give the shortest run time.
At 1440 dpi, rendering time is roughly inversely proportional
to the number of threads up to 6, which is the number of cores on my box.
(I used bmp16m instead of tiff32nc and /dev/null for output.)

Threaded rendering works well now.