I've applied the "simple" command line

    gs -o output.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input.pdf

to a set of PDFs. (My job was to test Ghostscript's ability to do simple "preflight" fixes such as embedding missing fonts.) On the resulting PDF output I ran "pdffonts output.pdf" to check what happened to the fonts.

Here is one sample PDF which shows a sort of "regression" from v8.71 to v9.02svn (sorry, I can't test the v9.01 release right now).

On the original PDF, pdffonts (the Poppler-based version) gives this output:

    kp@kpuntu:~$ pdffonts bad-orig-#NNNNN.pdf
    name                                 type              emb sub uni object ID
    ------------------------------------ ----------------- --- --- --- ---------
    PZOXVN+FrutigerLTCom-Light           TrueType          yes yes yes      7  0
    PZOXVN+FrutigerLTCom-Roman           TrueType          yes yes yes     13  0
    QHYDJL+MinionPro-Regular             Type 1C           yes yes yes      8  0
    QHYDJL+FrutigerLTCom-Light           CID TrueType      yes yes yes     19  0
    PZOXVN+MinionPro-It                  Type 1C           yes yes yes     14  0
    ORERHP+MinionPro-Bold                Type 1C           yes yes yes     15  0

This output looks OK to me. (The "bad-orig" filename does not mean that the original PDF is "bad". It means that this file gives a "bad" result when processed by gs9.0x.) Note that, according to pdffonts, all used fonts are embedded as subsets, and all fonts have a "ToUnicode CMap".
After v8.71's treatment of the original, pdffonts returns this on the output, which looks OK to me:

    kp@kpuntu:~$ pdffonts gs871-bad-orig-#NNNNN.pdf
    name                                 type              emb sub uni object ID
    ------------------------------------ ----------------- --- --- --- ---------
    TCBQTQ+FrutigerLTCom-Light-Identity-H CID TrueType      yes yes yes     22  0
    GNPESZ+MinionPro-Bold                Type 1C           yes yes no      19  0
    UEFSOY+FrutigerLTCom-Roman           TrueType          yes yes no      15  0
    NQPBKA+MinionPro-Regular             Type 1C           yes yes yes      8  0
    YRHNWT+MinionPro-It                  Type 1C           yes yes no      17  0
    XLWRJU+FrutigerLTCom-Light           TrueType          yes yes no      11  0

Note, however, that 4 of the original "ToUnicode CMaps" have now disappeared, and that one font has acquired the additional suffix "-Identity-H".

After v9.02svn's treatment of the original, pdffonts returns this on the output PDF, which definitely does not look OK to me:

    kp@kpuntu:~$ pdffonts gs902svn-bad-orig-#NNNNN.pdf
    name                                 type              emb sub uni object ID
    ------------------------------------ ----------------- --- --- --- ---------
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    Error: Illegal entry in bfrange block in ToUnicode CMap
    TCBQTQ+FrutigerLTCom-Light-Identity-H CID TrueType      yes yes yes     22  0
    YRHNWT+MinionPro-It                  Type 1C           yes yes no      17  0
    XLWRJU+FrutigerLTCom-Light           TrueType          yes yes yes     11  0
    GNPESZ+MinionPro-Bold                Type 1C           yes yes no      19  0
    UEFSOY+FrutigerLTCom-Roman           TrueType          yes yes yes     15  0
    NQPBKA+MinionPro-Regular             Type 1C           yes yes yes      8  0

Note that gs9.0x kept 4 "ToUnicode CMaps" and removed 2 (gs8.71 kept 2 and removed 4). But what concerns me much more are all these "Error: Illegal entry in bfrange block in ToUnicode CMap" lines.

Now, this may well be a bug in the pdffonts utility, which claims to see an error where there is none. However, some change in pdfwrite's handling of (already embedded!) fonts between 8.71 and 9.0x may be causing a problem which was not there before.

This does not happen with *ALL* PDF files. I'll attach another example original PDF ("good-orig-#NNNNN.pdf") which has similar types of fonts embedded and was created by the same application, where neither gs8.71 nor gs9.02svn causes such an effect.

I'm attaching all the files named above (replacing all NNNNN with the actual bug number).
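The per-font comparison described above (which fonts kept or lost their ToUnicode CMap between the two Ghostscript versions, and how many errors pdffonts printed) can be scripted rather than read off by eye. The following is only a minimal sketch: it assumes pdffonts prints rows in the column layout shown in this report, the names `parse_pdffonts` and `uni_changes` are made up for illustration, and the sample strings are copied from the listings above with the repeated error lines trimmed to two.

```python
import re

# One font row: name, type (may contain spaces), emb/sub/uni flags, object ID.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<type>.+?)\s+"
    r"(?P<emb>yes|no)\s+(?P<sub>yes|no)\s+(?P<uni>yes|no)\s+"
    r"(?P<obj>\d+)\s+(?P<gen>\d+)\s*$"
)

def parse_pdffonts(text):
    """Return ({font name: row fields}, number of 'Error:' lines)."""
    fonts, errors = {}, 0
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Error:"):
            errors += 1
            continue
        match = ROW.match(line)
        if match:  # header and dashed separator lines do not match
            fonts[match.group("name")] = match.groupdict()
    return fonts, errors

def uni_changes(before, after):
    """Fonts present in both listings whose 'uni' flag differs."""
    return {
        name: (before[name]["uni"], after[name]["uni"])
        for name in sorted(before.keys() & after.keys())
        if before[name]["uni"] != after[name]["uni"]
    }

GS871 = """\
name type emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
TCBQTQ+FrutigerLTCom-Light-Identity-H CID TrueType yes yes yes 22 0
GNPESZ+MinionPro-Bold Type 1C yes yes no 19 0
UEFSOY+FrutigerLTCom-Roman TrueType yes yes no 15 0
NQPBKA+MinionPro-Regular Type 1C yes yes yes 8 0
YRHNWT+MinionPro-It Type 1C yes yes no 17 0
XLWRJU+FrutigerLTCom-Light TrueType yes yes no 11 0
"""

GS902 = """\
name type emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Error: Illegal entry in bfrange block in ToUnicode CMap
Error: Illegal entry in bfrange block in ToUnicode CMap
TCBQTQ+FrutigerLTCom-Light-Identity-H CID TrueType yes yes yes 22 0
YRHNWT+MinionPro-It Type 1C yes yes no 17 0
XLWRJU+FrutigerLTCom-Light TrueType yes yes yes 11 0
GNPESZ+MinionPro-Bold Type 1C yes yes no 19 0
UEFSOY+FrutigerLTCom-Roman TrueType yes yes yes 15 0
NQPBKA+MinionPro-Regular Type 1C yes yes yes 8 0
"""

if __name__ == "__main__":
    fonts871, errors871 = parse_pdffonts(GS871)
    fonts902, errors902 = parse_pdffonts(GS902)
    print("error lines:", errors871, "vs", errors902)
    for name, (old, new) in uni_changes(fonts871, fonts902).items():
        print(f"{name}: uni {old} -> {new}")
```

On the two listings above, this reports the two simple TrueType Frutiger fonts as the ones whose "uni" flag changed from "no" to "yes" between gs8.71 and gs9.02svn. Matching by full font name only works here because both Ghostscript outputs happen to use the same subset prefixes.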
Created attachment 7301
Original PDF file used for testing (not "bad" per se, despite its name)

Created attachment 7302
gs8.71 output of "gs -o out.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress in.pdf"

Created attachment 7303
gs9.02svn output of "gs -o out.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress in.pdf"

Created attachment 7304
Another (similar) original PDF where the described problem does NOT occur with gs9.0Xsvn
Well, Acrobat appears to like all the files, but I suspect this is because in the 8.71 case it is ignoring the ToUnicode CMap and simply using the character codes.

See revision 11170 (Bug #691274), where the first change was made because we were writing an invalid ToUnicode CMap. This altered the emission to follow the specification for CMaps in general by emitting a single byte where possible. This was then reverted in revision 11975 (Bug #691849) because it caused a regression with Acrobat. Further investigation is documented in revision 11993 (which references bugs #691849 and #691862).

I suspect that pdffonts is complaining because a ToUnicode CMap is not 2-bytes and 0 padded in the bfrange (the warnings about illegal entries would seem to support this).

If you read through the log in revision 11993 you'll see that, as far as I can tell, the ToUnicode specification does not match what Acrobat actually expects. So technically (from reading the spec) pdffonts is correct, and the ToUnicode CMap is invalid. However, in practice the CMap now matches what Acrobat expects.

It's rather more important to us that Acrobat search/copy works than that we conform with a non-Adobe validator, so I don't plan to change this. Of course, if you can find a PDF file which demonstrates that I'm wrong in my understanding of the behaviour of Acrobat, I will work on this problem some more.
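To make the byte-width question concrete, here is a minimal sketch of a check that flags bfrange entries whose source codes do not have a given width. It is not Poppler's or Ghostscript's actual code: the function name `check_bfrange` is hypothetical, the CMap fragment is invented for illustration and not taken from the attached files, and which width a given consumer expects is exactly the point in dispute above, so it is passed in as a parameter.

```python
import re

# Text between "beginbfrange" and "endbfrange" in a decoded ToUnicode CMap.
BFRANGE_BLOCK = re.compile(r"beginbfrange(.*?)endbfrange", re.S)

# One entry: <srcLo> <srcHi> followed by either <dst> or [<dst1> <dst2> ...].
ENTRY = re.compile(
    r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>\s*(<[0-9A-Fa-f]+>|\[[^\]]*\])"
)

def check_bfrange(cmap_text, code_bytes):
    """Return (lo, hi) source-code pairs that are not code_bytes wide."""
    bad = []
    for block in BFRANGE_BLOCK.findall(cmap_text):
        for lo, hi, _dst in ENTRY.findall(block):
            # Each byte is written as two hex digits.
            if len(lo) != 2 * code_bytes or len(hi) != 2 * code_bytes:
                bad.append((lo, hi))
    return bad

# Hypothetical fragment mixing 1-byte and 2-byte source codes.
SAMPLE = """
2 beginbfrange
<20> <7E> <0020>
<00A0> <00FF> <00A0>
endbfrange
"""

if __name__ == "__main__":
    print("not 2 bytes wide:", check_bfrange(SAMPLE, 2))
    print("not 1 byte wide: ", check_bfrange(SAMPLE, 1))
```

A checker that insists on one width rejects whichever entries use the other, which is consistent with a validator and Acrobat disagreeing about the same CMap. The sketch expects the CMap stream to be already decompressed to text.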