Bug 699937

Summary: Poorly constructed PDF file can inappropriately reuse a CIDFont
Product: Ghostscript
Reporter: Ken Sharp <ken.sharp>
Component: PDF Interpreter
Assignee: Ken Sharp <ken.sharp>
Status: NOTIFIED FIXED    
Severity: normal    
Priority: P4    
Version: master   
Hardware: PC   
OS: Windows 7   
Customer: 531
Word Size: ---
Attachments: highly reduced PDF file

Description Ken Sharp 2018-10-10 15:52:20 UTC
Created attachment 15825
highly reduced PDF file

The attached (much reduced) PDF file has, in the body of the page, a single glyph which uses a CIDFont, SimSun, with an Identity-H CMap. The CIDFont called /SimSun is not a complete copy of the original CIDFont; it is a subset, but lacks the subset prefix.

The page also uses a form, which in turn uses a CIDFont, again SimSun, this time with the UniGB-UTF16-H CMap. *This* CIDFont is not embedded in the PDF file (that is, the FontDescriptor of the descendant font has no FontFile entry).

So, when we come to execute the form, we try to load the font in object 19, and we go right down to the FontDescriptor in object 16. At this point we discover the font is not embedded. This means we need to find a substitute font. To do this we use the FontName.

We can't find a font called SimSun-UniGB-UTF16-H, so we try to see if we can find a CMap which matches part of the name. We do find that, and the remainder of the name is 'SimSun'. So we assume we have a CIDFont+CMap pair.
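The name-splitting step can be sketched roughly as follows. This is an illustration in Python, not the interpreter's actual code: the function name split_cidfont_cmap and the KNOWN_CMAPS list are hypothetical, and the sketch assumes the CMap name appears as a hyphen-separated suffix of the FontName.

```python
# Hypothetical stand-in for the set of CMaps the interpreter can locate.
KNOWN_CMAPS = ["Identity-H", "Identity-V", "UniGB-UTF16-H", "UniGB-UTF16-V"]


def split_cidfont_cmap(font_name, known_cmaps=KNOWN_CMAPS):
    """Return (cidfont_name, cmap_name) if font_name ends in a known CMap
    name, otherwise None. Assumes the '<CIDFont>-<CMap>' naming pattern."""
    # Try longer CMap names first so the most specific suffix wins.
    for cmap in sorted(known_cmaps, key=len, reverse=True):
        suffix = "-" + cmap
        # Require a non-empty remainder to serve as the CIDFont name.
        if font_name.endswith(suffix) and len(font_name) > len(suffix):
            return font_name[: -len(suffix)], cmap
    return None


print(split_cidfont_cmap("SimSun-UniGB-UTF16-H"))
```

For the FontName in this file the remainder is 'SimSun', which is what sends the lookup towards the CIDFont that is already loaded.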

So the next thing to do is try to find a CIDFont called SimSun. Aha! We already have one loaded. Unfortunately, it's a subset CIDFont. So when we compose it with the CMap and try to use it, the glyphs we want are either missing or incorrect.

Of course, the CIDFont *ought* to be marked as a subset, in which case we wouldn't have this problem.
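
To illustrate why the missing prefix matters: a subset font name conventionally carries a tag of six uppercase letters followed by '+' (for example ABCDEF+SimSun). The Python sketch below shows a name-based reuse check that depends on that tag. The function names are hypothetical, and this is not a description of the fix in the commit referenced in the comments; it only shows how an untagged subset slips through such a check.

```python
import re

# Subset tag convention: six uppercase letters, then '+', then the base name.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+.+$")


def is_subset_name(name):
    """True if the font name carries a subset tag prefix."""
    return SUBSET_TAG.match(name) is not None


def may_reuse_loaded_font(loaded_name, wanted_name):
    """Hypothetical check: reuse an already loaded CIDFont only when it is
    not marked as a subset and its name matches the one being sought."""
    if is_subset_name(loaded_name):
        return False
    return loaded_name == wanted_name


# A correctly tagged subset is rejected as a substitute.
print(may_reuse_loaded_font("ABCDEF+SimSun", "SimSun"))
# The untagged subset in this file is indistinguishable from a full font.
print(may_reuse_loaded_font("SimSun", "SimSun"))
```

Because the name alone cannot reveal an untagged subset, a check of this kind accepts the /SimSun font from the page body, which is the situation described above.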
Comment 1 Ken Sharp 2018-10-10 16:20:40 UTC
Fixed in commit 04a517f39cc3e25a7aec9766b45aeb2144718ee6
Comment 2 Ken Sharp 2018-10-10 18:04:32 UTC
Fixed in commit 04a517f39cc3e25a7aec9766b45aeb2144718ee6