688833 – Using gs on a page with Postscript OpenType fonts, it can take 12-15 minutes for the page to display


Status:	NOTIFIED FIXED

Alias:	None

Product:	Ghostscript
Classification:	Unclassified
Component:	General
Version:	8.54
Hardware:	Sun Solaris

Importance:	P2 normal
Assignee:	Alex Cherepanov

URL:	http://www.xyenterprise.com
Keywords:

Depends on:
Blocks:

Reported:	2006-08-09 13:59 UTC by Jonathan Dagresta
Modified:	2008-12-19 08:31 UTC
CC List:	1 user

See Also:
Customer:	1130
Word Size:	---

Attachments
Test input file to use with gs - one page (25.83 KB, application/postscript) 2006-08-09 14:03 UTC, Jonathan Dagresta
The first PostScript OpenType font used in the test input file. (94.21 KB, application/octet-stream) 2006-08-09 14:05 UTC, Jonathan Dagresta
patch for the crash (1007 bytes, patch) 2007-02-11 20:36 UTC, Alex Cherepanov
patch (1.47 KB, patch) 2007-02-14 05:41 UTC, Alex Cherepanov

Description Jonathan Dagresta 2006-08-09 13:59:12 UTC

When using a Postscript OpenType font that is found via PSRESOURCEPATH, it 
causes the normal .findfontvalue (in gs_fonts.ps) to be called by the font 
stuff (to get the FontName) and that causes the OpenType font file to 
incorrectly be binary tokenized via the scan_binary_token() routine.
That results in an attempt to malloc() a gigabyte of memory.
Normally that doesn't succeed and things process fine, but when a system has 
lots of (virtual) memory, the 1G malloc succeeds and is then being managed. 
Once a garbage collection occurs, it can take 12-15 minutes to finish the 
display of the page due to the "bad" 1G malloc's that are occurring.
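
To illustrate how stray font bytes can turn into a gigabyte-sized request, here is a minimal C sketch of decoding a binary object sequence header as laid out in the PostScript Language Reference (a 4-byte normal header, or an 8-byte extended header when the second byte is zero). The names bos_header and bos_decode are hypothetical. This is not Ghostscript's scan_binary_token() code, and it handles only the high-byte-first token type 128.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded header; not a Ghostscript type. */
typedef struct {
    uint32_t top_count;  /* number of objects in the top-level array */
    uint32_t total_len;  /* overall sequence length in bytes, as claimed */
    int header_len;      /* 4 for a normal header, 8 for an extended one */
} bos_header;

/*
 * Decode a high-byte-first binary object sequence header (token type 128).
 * Returns 0 on success, -1 if the bytes do not start such a header or the
 * buffer is too short. No plausibility checks are made on the lengths.
 */
static int bos_decode(const unsigned char *p, size_t n, bos_header *h)
{
    if (n < 4 || p[0] != 128)
        return -1;
    if (p[1] != 0) {
        /* Normal header: 8-bit array length, 16-bit overall length. */
        h->top_count = p[1];
        h->total_len = ((uint32_t)p[2] << 8) | p[3];
        h->header_len = 4;
    } else {
        /* Extended header: 16-bit array length, 32-bit overall length. */
        if (n < 8)
            return -1;
        h->top_count = ((uint32_t)p[2] << 8) | p[3];
        h->total_len = ((uint32_t)p[4] << 24) | ((uint32_t)p[5] << 16) |
                       ((uint32_t)p[6] << 8) | p[7];
        h->header_len = 8;
    }
    return 0;
}
```

Feeding it the eight bytes 128, 0, 0x12, 0x34, 0x40, 0, 0, 0 gives a claimed overall length of 0x40000000 bytes, which is 1 GiB. A scanner that sizes an allocation from such a field before validating it would request memory on that scale.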

Comment 1 Jonathan Dagresta 2006-08-09 14:03:20 UTC

Created attachment 2404
Test input file to use with gs - one page

This is the input file to use: gs test.ps
The next attachment will have the first PostScript OpenType font used in the
input file in case it is needed.
Debug gs and set a breakpoint in scan_binary_token() and follow along after it
breaks there and you'll see it attempt to do a 1 gigabyte malloc.

Comment 2 Jonathan Dagresta 2006-08-09 14:05:02 UTC

Created attachment 2405
The first PostScript OpenType font used in the test input file.

In case it's needed, this is the first PostScript OpenType file (from Adobe)
used in the test input file.
The problem in gs happens when the font is found via the PSRESOURCEPATH
mechanism.

Comment 3 Jonathan Dagresta 2006-08-09 14:12:15 UTC

The source of the "problem" is that gs_cff.ps does not "overload"
the .findfontvalue function for OpenType fonts (as is done in the gs_ttf.ps
file for TrueType fonts), and the regular .findfontvalue function in gs_fonts.ps
is terribly wrong for PostScript OpenType fonts.
When something in the font stuff tries to get the FontName from the OpenType
font via .findfontvalue in gs_fonts.ps, it causes the scan_binary_token()
function to be called in a way that causes a bad 1 gigabyte malloc to occur, and
that causes severe performance problems on the next garbage collection when
there is enough virtual memory for that malloc() to succeed.
Our workaround was to add the following to the gs_cff.ps file (patterned after
some code in gs_ttf.ps):

% THIS IS BEING STUBBED BECAUSE OPENTYPE FONTS ARE GOING THROUGH THE
% REGULAR .findfontvalue STUFF AND THAT CAUSES THE FONT TO BE BINARY
% TOKENIZED IN A WAY THAT ENDS UP CAUSING A GIGABYTE SIZE MALLOC TO OCCUR!
% THE OPENTYPE FONT STUFF SEEMS TO WORK EVEN WITH THESE STUBS WITH OPENTYPE
% FONTS REFERENCED VIA PSRESOURCEPATH, BUT IT MAY NOT WORK IF OPENTYPE
% FONTS ARE USED BY PLACING THEM IN THE GS SEARCH PATH.
% <file> <key> .findfontvalue <value> true
% <file> <key> .findfontvalue false
% Closes the file in either case.
/.findnonottofontvalue /.findfontvalue load def
/.findfontvalue {
  exch dup 4 string .peekstring pop (OTTO) eq {
    % If this is a font at all, it's an OpenType font.
    exch dup /FontType eq {
      % THIS IS A STUB UNTIL REAL CODE IS WRITTEN TO DO THIS.
      pop closefile false
    } {
      dup /FontName eq {
        pop .findottofontname
      } {
        pop closefile false
      } ifelse
    } ifelse
  } {
    % Not an OpenType font.
    exch .findnonottofontvalue
  } ifelse
} bind def

% <file> .findottofontname <fname> true
% <file> .findottofontname false
% Closes the file in either case.
/.findottofontname {
  % THIS IS A STUB UNTIL REAL CODE IS WRITTEN TO DO THIS.
  closefile false
} bind def
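
For readers more at home in C than PostScript, the dispatch the workaround performs can be sketched as a signature test on the first four bytes of the font data. This is only an illustration of the idea: is_otto_signature is a hypothetical helper, not a Ghostscript function, and it assumes the caller has already peeked the leading bytes into a buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the buffer begins with the four-byte tag "OTTO" that the
 * workaround above tests for, 0 otherwise (including buffers shorter
 * than four bytes).
 */
static int is_otto_signature(const unsigned char *buf, size_t n)
{
    return n >= 4 && memcmp(buf, "OTTO", 4) == 0;
}
```

A caller would route data for which is_otto_signature() returns 1 to an OpenType-specific reader and fall back to the ordinary font value lookup otherwise, mirroring the two branches of the PostScript procedure.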

Comment 4 leonardo 2007-02-07 10:55:15 UTC

Bumping priority for customer bug.

Note: long ago I dreamed about implementing .findfontvalue in C, as well as
the extraction of CIDSystemInfo from a CIDFont or CMap.

Comment 5 Alex Cherepanov 2007-02-11 05:05:55 UTC

I cannot reproduce the problem exactly as described.
First, there's no PSRESOURCEPATH name or variable anywhere in the Ghostscript
sources. The directory to search for fonts can be specified using the -sFONTPATH
command line parameter.

The supplied OpenType font is parsed by .findttfontname, not the standard
.findfontvalue. There's no memory overflow or crash during normal operation.

I've found a SEGV that happens in the following program:
(file.ttf)(r)file /FontName .findnonttfontvalue
but .findnonttfontvalue should never apply to TrueType or OpenType fonts.

Comment 6 Alex Cherepanov 2007-02-11 20:36:10 UTC

Created attachment 2746
patch for the crash

Change the order of allocations because alloc_save_change_alloc() leaves the
alloc_change_t structure it allocates in a state that causes a SEGV in GC
if the "where" member is not initialized.

DETAILS:
Allocation of an alloc_change_t structure without further initialisation happened
when allocation of a new run of references failed. The latter may be easily
triggered by interpretation of random bytes as a binary object sequence.

DIFFERENCES:
None
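
The failure mode described in this log message can be sketched in C as follows. This is a generic illustration of one safe ordering, not the attached patch: change_rec, save_state and alloc_run_tracked are hypothetical names, and plain malloc() stands in for Ghostscript's allocator. The point is that the bookkeeping record is only created after the large allocation has succeeded, and is fully initialized before it is linked where a collector could find it.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping record, loosely modelled on the description. */
typedef struct change_rec {
    struct change_rec *next;
    void *where;  /* block this record refers to; must never be left unset */
} change_rec;

/* Hypothetical owner of the list of records. */
typedef struct {
    change_rec *changes;
} save_state;

/*
 * Allocate a block and track it. The block is allocated first; if that
 * fails, no record exists and the list is untouched. The record is
 * initialized completely before being linked into the list.
 */
static void *alloc_run_tracked(save_state *st, size_t nbytes)
{
    void *run = malloc(nbytes);
    change_rec *cr;

    if (run == NULL)
        return NULL;          /* nothing was linked */
    cr = malloc(sizeof *cr);
    if (cr == NULL) {
        free(run);            /* undo the first allocation */
        return NULL;
    }
    cr->where = run;          /* initialize before publishing */
    cr->next = st->changes;
    st->changes = cr;
    return run;
}
```

With this ordering, a failed request for an absurdly large block leaves the list exactly as it was, so a later traversal never encounters a record whose where field is indeterminate.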

The next patch will try to add more robust syntax checking to the
binary object sequences before memory allocation.

Finally, it would be great to figure out how the customer managed to use
the wrong font parser.

Comment 7 leonardo 2007-02-12 00:56:59 UTC

The patch looks good except for a lot of typos in the log message. Please edit and
commit.

Comment 8 Alex Cherepanov 2007-02-12 06:12:37 UTC

The patch from comment #6 is committed as rev. 7694.

Comment 9 Alex Cherepanov 2007-02-14 05:41:28 UTC

Created attachment 2751
patch

Avoid large memory allocation that can happen when random data are recognized
as a binary object sequence. Add a preliminary syntax check before allocation
of a reference array for a binary object sequence.

Also remove a check that is always false.

DIFFERENCES:
None. No CET differences.
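
As an illustration of what a preliminary check before allocation might look like, here is a small C sketch. It is an assumption about the general approach, not the content of the attached patch: bos_header_plausible and the max_len cap are hypothetical, and the only fact relied on is that each object in a binary object sequence occupies 8 bytes, so the claimed overall length must at least cover the header plus the top-level array.

```c
#include <stdint.h>

#define BOS_OBJECT_SIZE 8u  /* bytes per object in a binary object sequence */

/*
 * Return 1 if the header fields are mutually consistent and within the
 * caller's limit, 0 otherwise. Intended to run before any array is
 * allocated from these fields.
 *
 *   top_count  - number of top-level objects claimed by the header
 *   total_len  - overall sequence length in bytes claimed by the header
 *   header_len - 4 or 8, depending on the header form
 *   max_len    - largest sequence the caller is willing to accept
 *                (a policy choice assumed for this sketch)
 */
static int bos_header_plausible(uint32_t top_count, uint32_t total_len,
                                uint32_t header_len, uint64_t max_len)
{
    uint64_t min_len = (uint64_t)header_len +
                       (uint64_t)top_count * BOS_OBJECT_SIZE;

    if (total_len < min_len)
        return 0;  /* top-level array would not fit in the sequence */
    if (total_len > max_len)
        return 0;  /* claims more data than the caller will accept */
    return 1;
}
```

For example, a header claiming 4660 top-level objects and a 1 GiB overall length is rejected when max_len is 65536, while a one-object sequence of 12 bytes with a 4-byte header passes.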

Comment 10 leonardo 2007-02-14 09:30:42 UTC

The patch looks good except that a blank line is missing after the declarations.

Comment 11 Alex Cherepanov 2007-02-14 10:59:58 UTC

The patch from comment #9 is committed as rev. 7699.

I'm closing the bug as fixed because undesired effects of the mis-recognition
of a binary object sequence are fixed.

The bug report doesn't provide enough information to establish how
Ghostscript was [mis-]configured to parse an OpenType font as a Type 1 font.
Feel free to reopen this bug report when this information becomes available.