Created attachment 13489 [details] pdfa_def.ps I attempted to create a PDF with
Apologies for incomplete comment above. I attempted to create a PDF/A with pdfa_def.ps. /Author <feff5b545b50>
Apologies for the second incomplete comment. I don't get along with Bugzilla.

It seems that gs 9.21 does not tolerate multibyte characters or certain Unicode characters in the document info pdfmark, such as when creating a PDF/A. When invoked under such conditions, pdfwrite will exit with an error such as:

    GPL Ghostscript 9.21: ERROR: VMerror (-25) on closing pdfwrite device.

Or in other cases:

    Error: /undefinedfilename in --file--
    Operand stack:
       --nostringval--   --nostringval--   (srgb.icc)   (r)
    Execution stack:
       %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1983   1   3   %oparray_pop   1982   1   3   %oparray_pop   1966   1   3   %oparray_pop   1852   1   3   %oparray_pop   --nostringval--   %errorexec_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--
    Dictionary stack:
       --dict:1208/1684(ro)(G)--   --dict:1/20(G)--   --dict:79/200(L)--
    Current allocation mode is local
    Last OS error: No such file or directory
    Current file position is 793
    GPL Ghostscript 9.21: Unrecoverable error, exit code 1

This seems to be a regression. This test comes from a test suite that passed for gs 9.20 and several earlier versions.

Full command line that demonstrates failure:

    gs -dQUIET -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sProcessColorModel=DeviceRGB -sColorConversionStrategy=/RGB -dPDFA=2 -dPDFACompatibilityPolicy=1 -o _gs.pdf ccitt.pdf pdfa_def.ps

Full command line that works, the only difference being removal of multibyte characters from pdfmark:

    gs -dQUIET -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sProcessColorModel=DeviceRGB -sColorConversionStrategy=/RGB -dPDFA=2 -dPDFACompatibilityPolicy=1 -o _gs.pdf ccitt.pdf pdfa_def_good.ps
Created attachment 13490 [details] Working pdfa.ps
Created attachment 13491 [details] Test PDF file The original image used in this file was released under a free license (Creative Commons BY-SA 3.0).
The undefinedfilename is caused by the fact that you are using the same name as the default we supply. It seems we are opening the file with a .libfile (or something similar), which means that it searches the library paths, apparently (at least sometimes) before searching the current working directory. If it hits the pdfa_def.ps in ghostpdl/lib then it runs that one. That is why you get an undefined filename on (srgb.icc), which is not the path and spec of the ICC profile from your pdfa_def.ps.

If you either change the name of the file, or use a fully qualified path for pdfa_def.ps, then I believe that problem will not occur. I'm going to pass that particular part of the puzzle to a colleague.

I'll look into the problem surrounding the VMerror separately, since this fails to produce a PDF file.

For what it's worth, both of these problems only seem to happen with the release branch; I can't reproduce either with the current HEAD on master.
Note that there is a switch defined in: https://ghostscript.com/doc/current/Use.htm#Finding_files

    1. The current directory if enabled by the -P switch

Adding this option to your command line, e.g., gs -P -dQUIET ... should allow the file named pdfa_def.ps from the current working directory to be used rather than one from one of the LIBPATH paths.
Thanks Ray for the clarification about -P. I introduced this unrelated issue while trying to create a test case for this report. The test suite that picked up the regression uses absolute paths for all inputs to gs so it avoids the "undefinedfilename" issue. That would also be why I observed different error output. When the absolute path is specified, the only error message I get is: GPL Ghostscript 9.21: ERROR: VMerror (-25) on closing pdfwrite device.
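For anyone scripting this, the absolute-path approach mentioned above could look roughly like the following. This is only a sketch: the flags are copied from the failing command line in this report, and the helper name build_gs_command is made up for illustration. It builds the argument list without running gs.

```python
import os

# Flags copied verbatim from the failing command line in this report.
GS_FLAGS = [
    "-dQUIET", "-dBATCH", "-dNOPAUSE",
    "-sDEVICE=pdfwrite",
    "-sProcessColorModel=DeviceRGB",
    "-sColorConversionStrategy=/RGB",
    "-dPDFA=2",
    "-dPDFACompatibilityPolicy=1",
]


def build_gs_command(output_pdf, *inputs, gs="gs"):
    """Return an argv list in which every file argument is an absolute path.

    Hypothetical helper: passing absolute paths means a local pdfa_def.ps
    is named explicitly instead of being looked up by bare filename.
    """
    out = os.path.abspath(output_pdf)
    files = [os.path.abspath(p) for p in inputs]
    return [gs, *GS_FLAGS, "-o", out, *files]


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(build_gs_command("_gs.pdf", "ccitt.pdf", "pdfa_def.ps")))
```

The resulting list can be handed to subprocess.run() as-is, which also avoids shell quoting problems.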
The problem is actually nothing to do with Unicode, and it also isn't a regression. The length of the Unicode strings is important, in order to set up a particular memory configuration, but the fact that they are Unicode is not relevant. You could also trigger exactly the same behaviour in previous versions, but getting the memory configuration just right is 'difficult'. This is also why I couldn't reproduce the behaviour with HEAD: the memory layout is slightly different because the binary has changed. (I'm a little surprised, though relieved, that it was possible to reproduce it with a debug build.)

The problem is actually caused by specifying empty strings for some metadata. When writing the XMP metadata we use gs_alloc_bytes() to allocate a string buffer. If the length of the buffer to be allocated is 0 bytes then the function can, sometimes, return a NULL pointer. Since we use a NULL pointer to indicate that the memory could not be allocated, this causes a VM error.

In commit c06135acc959dbc0458352579bafe238794f2733 we now take an early exit when presented with a 0 length string, rather than trying to allocate and fill a buffer for the metadata. This prevents the possibility of a VM error.

This does not address any other places in the code which might suffer from the same problem (not testing the allocation length, and relying on a non-NULL return).

Re-assigning this to Chris to look at gs_alloc_bytes() to try and figure out why it might return a NULL pointer, and ideally prevent it.

NB, we won't be changing the behaviour of the path searching, so it's best not to use a filename which matches one of the resource files, or else to use a fully qualified pathspec. As Ray has mentioned, if you can't do either of these, then use -P (see the documentation in ghostpdl/doc/Use.htm, section 8, How Ghostscript finds files).
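To make the failure mode described above concrete, here is a toy model in Python. It is not Ghostscript code: alloc_bytes, VMError and the two copy functions are hypothetical stand-ins, and the allocator is simplified to return None for every 0-byte request, whereas the real behaviour was reported as only occasional. It shows why a caller that treats "no buffer" as "out of memory" fails on empty metadata, and how an early exit on zero length sidesteps that.

```python
class VMError(Exception):
    """Stand-in for the VMerror reported by pdfwrite (hypothetical)."""


def alloc_bytes(size):
    """Toy allocator: models the case where a 0-byte request yields no buffer."""
    if size == 0:
        return None
    return bytearray(size)


def copy_unguarded(value):
    """Allocate and fill a buffer, treating None as allocation failure."""
    buf = alloc_bytes(len(value))
    if buf is None:
        # None is also the out-of-memory signal, so an empty value
        # is indistinguishable from a genuine failure here.
        raise VMError("could not allocate metadata buffer")
    buf[:] = value
    return bytes(buf)


def copy_guarded(value):
    """Same as above, but with an early exit for zero-length input."""
    if len(value) == 0:
        return b""  # nothing to allocate or copy
    return copy_unguarded(value)
```

With this model, copy_unguarded(b"") raises VMError while copy_guarded(b"") simply returns an empty result, which mirrors the shape of the early-exit fix described in the comment.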
Neither Ken nor I can now reproduce the issue with the memory manager, and from visual inspection of the code, it doesn't look like there's a way for it to return NULL with a zero length allocation. So closing.
Do you have any suggestions to mitigate this problem in 9.21?

I have tried both removing empty keys and setting them to a single space, but I can still find examples of strings that will trigger this error, e.g.

    [ /Author (Just Author)
      /DOCINFO pdfmark

and:

    [ /Author <feff0020>
      /Title <feff0020>
      /Subject <feff0020>
      /Keywords <feff0020>
      /Creator (Just Creator)
      /DOCINFO pdfmark

Perhaps the XMP metadata contains some empty strings by default, or the set of five values I have there does not include all values handed over to XMP?

I threw together a crude test and it seems about 1% of short random non-empty strings can produce the error.

In addition, a different error always seems to occur for valid Unicode characters above U+FFFF, possibly a distinct problem ("ERROR: rangecheck (-15) on closing pdfwrite device").
(In reply to James R Barlow from comment #10)
> Do you have any suggestions to mitigate this problem in 9.21?

As I explained, this is not limited to 9.21; it's present in every version of Ghostscript's pdfwrite device which handles Unicode strings in a pdfmark. The solution is to rebuild Ghostscript from the current source.

> In addition, a different error always seems to occur for valid Unicode
> characters above U+FFFF, possibly a distinct problem ("ERROR: rangecheck
> (-15) on closing pdfwrite device").

Unicode strings in PDF must be UTF-16BE, no other form is permitted.
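As an illustration of the UTF-16BE form, the following sketch produces hex strings of the kind used earlier in this report (for example <feff5b545b50> and <feff0020>). The function names are invented for this example. In UTF-16, characters above U+FFFF are encoded as surrogate pairs, and Python's utf-16-be codec does that automatically. Whether a given Ghostscript build accepts the result is not something this snippet can show.

```python
def pdfmark_hex_string(text):
    """Encode text as a PostScript hex string: BOM FEFF, then UTF-16BE."""
    data = b"\xfe\xff" + text.encode("utf-16-be")
    return "<" + data.hex() + ">"


def docinfo_pdfmark(fields):
    """Build a one-line DOCINFO pdfmark from a dict of key -> text.

    Hypothetical helper: keys are written as PostScript names exactly as
    given, so they should be plain ASCII such as Author or Title.
    """
    parts = ["/%s %s" % (key, pdfmark_hex_string(value))
             for key, value in fields.items()]
    return "[ " + " ".join(parts) + " /DOCINFO pdfmark"


if __name__ == "__main__":
    print(docinfo_pdfmark({"Author": "\u5b54\u5b50"}))
    # A character above U+FFFF becomes a surrogate pair (two 16-bit units).
    print(pdfmark_hex_string("\U0001F600"))
```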