When a PostScript OpenType font is found via PSRESOURCEPATH, the font code calls the normal .findfontvalue (in gs_fonts.ps) to get the FontName, and that causes the OpenType font file to be incorrectly binary tokenized by the scan_binary_token() routine. That results in an attempt to malloc() a gigabyte of memory. Normally the allocation fails and processing continues fine, but when a system has lots of (virtual) memory, the 1G malloc succeeds and the block is then managed by the allocator. Once a garbage collection occurs, it can take 12-15 minutes to finish displaying the page because of the "bad" 1G mallocs.
Created attachment 2404 [details]
Test input file to use with gs - one page

This is the input file to use:

    gs test.ps

The next attachment will have the first PostScript OpenType font used in the input file in case it is needed. Debug gs, set a breakpoint in scan_binary_token(), and follow along after it breaks there; you'll see it attempt a 1 gigabyte malloc.
Created attachment 2405 [details]
The first PostScript OpenType font used in the test input file.

In case it's needed, this is the first PostScript OpenType file (from Adobe) used in the test input file. The problem in gs happens when the font is found via the PSRESOURCEPATH mechanism.
The source of the "problem" is that gs_cff.ps does not "overload" the .findfontvalue function for OpenType fonts (as gs_ttf.ps does for TrueType fonts), and the regular .findfontvalue function in gs_fonts.ps is terribly wrong for PostScript OpenType fonts. When something in the font code tries to get the FontName from the OpenType font via .findfontvalue in gs_fonts.ps, scan_binary_token() is called in a way that causes a bad 1 gigabyte malloc. When there is enough virtual memory for that malloc() to succeed, it causes severe performance problems on the next garbage collection.

Our workaround was to add the following to the gs_cff.ps file (patterned after some code in gs_ttf.ps):

    % THIS IS BEING STUBBED BECAUSE OPENTYPE FONTS ARE GOING THROUGH THE
    % REGULAR .findfontvalue STUFF AND THAT CAUSES THE FONT TO BE BINARY
    % TOKENIZED IN A WAY THAT ENDS UP CAUSING A GIGABYTE SIZE MALLOC TO OCCUR!
    % THE OPENTYPE FONT STUFF SEEMS TO WORK EVEN WITH THESE STUBS WITH OPENTYPE
    % FONTS REFERENCED VIA PSRESOURCEPATH, BUT IT MAY NOT WORK IF OPENTYPE
    % FONTS ARE USED BY PLACING THEM IN THE GS SEARCH PATH.

    % <file> <key> .findfontvalue <value> true
    % <file> <key> .findfontvalue false
    % Closes the file in either case.
    /.findnonottofontvalue /.findfontvalue load def
    /.findfontvalue {
      exch dup 4 string .peekstring pop (OTTO) eq {
            % If this is a font at all, it's an OpenType font.
        exch dup /FontType eq {
            % THIS IS A STUB UNTIL REAL CODE IS WRITTEN TO DO THIS.
          pop closefile false
        } {
          dup /FontName eq {
            pop .findottofontname
          } {
            pop closefile false
          } ifelse
        } ifelse
      } {
            % Not an OpenType font.
        exch .findnonottofontvalue
      } ifelse
    } bind def

    % <file> .findottofontname <fname> true
    % <file> .findottofontname false
    % Closes the file in either case.
    /.findottofontname {
            % THIS IS A STUB UNTIL REAL CODE IS WRITTEN TO DO THIS.
      closefile false
    } bind def
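For anyone who would rather do the signature test outside PostScript, here is a minimal C sketch of the same idea as the .peekstring test in the workaround: look at the first four bytes and restore the read position. The function names has_otto_signature and file_is_otto are hypothetical and are not part of the Ghostscript sources; this only illustrates the check, not how Ghostscript dispatches font parsers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a Ghostscript function): returns 1 if the
 * buffer starts with the four-byte tag "OTTO" that the workaround
 * tests for, and 0 otherwise. */
int has_otto_signature(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 4 && memcmp(buf, "OTTO", 4) == 0;
}

/* Hypothetical helper: peeks at the first four bytes available at the
 * current position of an open, seekable file and then restores the
 * position, so a later parser still sees the whole file. */
int file_is_otto(FILE *f)
{
    unsigned char magic[4];
    long pos;
    size_t n;

    assert(f != NULL);
    pos = ftell(f);
    if (pos < 0)
        return 0;               /* not seekable: treat as "not OTTO" */
    n = fread(magic, 1, sizeof magic, f);
    if (fseek(f, pos, SEEK_SET) != 0)
        return 0;               /* could not restore the position */
    return has_otto_signature(magic, n);
}
```

A caller would use file_is_otto() to decide which font-name lookup to run before handing the file to any tokenizer.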
Bumping priority for customer bug. Note: long ago I dreamed of implementing .findfontvalue in C, as well as extracting CIDSystemInfo from a CIDFont or CMap.
I cannot reproduce the problem exactly as described. First, there's no PSRESOURCEPATH name or variable anywhere in the Ghostscript sources. The directory to search for fonts can be specified using the -sFONTPATH command line parameter. The supplied OpenType font is parsed by .findttfontname, not the standard .findfontvalue. There's no memory overflow or crash during normal operation. I've found a SEGV that happens in the following program:

    (file.ttf) (r) file /FontName .findnonttfontvalue

but .findnonttfontvalue should never be applied to TrueType or OpenType fonts.
Created attachment 2746 [details]
patch for the crash

Change the order of allocations, because alloc_save_change_alloc() leaves the alloc_change_t structure it allocates in a state that causes a SEGV in the GC if the "where" member is not initialized.

DETAILS:
The alloc_change_t structure was allocated without further initialisation when allocation of a new run of references failed. The latter may be easily triggered by interpreting random bytes as a binary object sequence.

DIFFERENCES:
None

The next patch will try to add more robust syntax checking of binary object sequences before memory allocation. Finally, it would be great to figure out how the customer managed to use the wrong font parser.
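To illustrate why the order of allocations matters, here is a small C sketch. It is not the committed patch and does not use the real Ghostscript types: change_t, alloc_fn, record_change and failing_alloc are hypothetical stand-ins, and the payload allocator is assumed to be malloc-compatible. The point is only that the allocation that can fail happens first, so a record with an unset "where" pointer is never left on a list that a later pass will walk.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a change record; "where" must be valid
 * before any code (for example a garbage collector) walks the list. */
typedef struct change_s {
    struct change_s *next;
    void *where;
} change_t;

/* Hypothetical allocator type, assumed to be malloc-compatible. */
typedef void *(*alloc_fn)(size_t);

/* Allocate the payload first. Only when that succeeds is the record
 * created, fully initialised, and linked into the list. On failure
 * the list is left exactly as it was. Returns 0 on success, -1 on
 * failure. */
int record_change(change_t **head, size_t payload_size, alloc_fn alloc)
{
    void *payload;
    change_t *rec;

    assert(head != NULL && alloc != NULL);
    payload = alloc(payload_size);
    if (payload == NULL)
        return -1;              /* nothing linked, nothing to undo */
    rec = malloc(sizeof *rec);
    if (rec == NULL) {
        free(payload);          /* assumes a malloc-compatible alloc */
        return -1;
    }
    rec->where = payload;       /* initialise before linking */
    rec->next = *head;
    *head = rec;
    return 0;
}

/* Allocator that always fails, to simulate an oversized request. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

With the opposite order (link the record, then allocate the payload), a failed payload allocation would leave a linked record whose "where" member was never set, which is the kind of state described above.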
The patch looks good except for a lot of typos in the log message. Please edit and commit.
The patch from comment #6 is committed as rev. 7694.
Created attachment 2751 [details]
patch

Avoid a large memory allocation that can happen when random data are recognized as a binary object sequence. Add a preliminary syntax check before allocating the reference array for a binary object sequence. Also remove a check that is always false.

DIFFERENCES:
None. No CET differences.
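For readers who want to see what a preliminary check of this kind can look like, here is a C sketch based on the binary object sequence header layout in the PostScript Language Reference Manual: token types 128-131, a 4-byte normal header or an 8-byte extended header, and 8 bytes per object. It is not the code committed in the patch; bos_check_header and BOS_OBJECT_SIZE are hypothetical names, and the actual checks in Ghostscript may differ.

```c
#include <assert.h>
#include <stddef.h>

#define BOS_OBJECT_SIZE 8u      /* each object in a sequence is 8 bytes */

/* Hypothetical pre-check, run before any array is allocated.
 * Returns the top-level object count if the header is self-consistent,
 * or -1 if the bytes cannot be a valid binary object sequence. */
long bos_check_header(const unsigned char *p, size_t avail)
{
    unsigned long count, total, hdr;
    int msb_first;

    assert(p != NULL);
    if (avail < 4 || p[0] < 128 || p[0] > 131)
        return -1;              /* not a binary object sequence token */
    /* Types 128 and 130 are high-order byte first; 129 and 131 are
     * low-order byte first. */
    msb_first = (p[0] == 128 || p[0] == 130);
    if (p[1] != 0) {
        /* Normal header: 1-byte count, 16-bit overall length. */
        hdr = 4;
        count = p[1];
        total = msb_first ? ((unsigned long)p[2] << 8) | p[3]
                          : ((unsigned long)p[3] << 8) | p[2];
    } else {
        /* Extended header: 16-bit count, 32-bit overall length. */
        if (avail < 8)
            return -1;
        hdr = 8;
        if (msb_first) {
            count = ((unsigned long)p[2] << 8) | p[3];
            total = ((unsigned long)p[4] << 24) | ((unsigned long)p[5] << 16)
                  | ((unsigned long)p[6] << 8) | p[7];
        } else {
            count = ((unsigned long)p[3] << 8) | p[2];
            total = ((unsigned long)p[7] << 24) | ((unsigned long)p[6] << 16)
                  | ((unsigned long)p[5] << 8) | p[4];
        }
    }
    /* The top-level objects must fit inside the declared overall length. */
    if (total < hdr || count > (total - hdr) / BOS_OBJECT_SIZE)
        return -1;
    return (long)count;
}
```

Random font data that happens to contain a byte in the 128-131 range will usually fail the consistency test between the object count and the overall length, so it is rejected before any allocation is attempted.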
The patch looks good except that a blank line is missing after the declarations.
The patch from comment #9 is committed as rev. 7699. I'm closing the bug as fixed because the undesired effects of mis-recognizing a binary object sequence are fixed. The bug report doesn't provide enough information to establish how Ghostscript was [mis-]configured to parse an OpenType font as a Type 1 font. Feel free to reopen this bug report when this information becomes available.