Bug 700367

Summary: CID fonts can not be embedded
Product: Ghostscript Reporter: Masamichi Hosoda <trueroad>
Component: PDF Writer Assignee: Ken Sharp <ken.sharp>
Status: RESOLVED WONTFIX    
Severity: normal CC: knut_petersen
Priority: P4    
Version: 9.26   
Hardware: PC   
OS: Windows 10   
Customer: Word Size: ---
Attachments: no-embedded.pdf

Description Masamichi Hosoda 2018-12-15 07:50:57 UTC
Created attachment 16558
no-embedded.pdf

I use Cygwin's Ghostscript package.

Attached `no-embedded.pdf` uses a CID font, but it is not embedded.
gs-9.25 could embed the font with the following command.

```
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=embedded.pdf no-embedded.pdf
```

However, gs-9.26 prints the following messages and cannot embed the font.

```
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=embedded.pdf no-embedded.pdf
GPL Ghostscript 9.26 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Can't find CID font "SourceHanSerif-Regular".
Attempting to substitute CID font /Adobe-Identity for /SourceHanSerif-Regular, see doc/Use.htm#CIDFontSubstitution.
The substitute CID font "Adobe-Identity" is not provided either. attempting to use fallback CIDFont.See doc/Use.htm#CIDFontSubstitution.
Loading a TT font from /usr/share/ghostscript/9.26/Resource/CIDFSubst/DroidSansFallback.ttf to emulate a CID font Adobe-Identity ... Done.

$
```

I tried reverting the change made for bug 699937 in gs-9.26.
With that change reverted, it works as it did in gs-9.25.

Comment 1 Ken Sharp 2018-12-15 09:29:22 UTC
(In reply to Masamichi Hosoda from comment #0)

> I use Cygwin's Ghostscript package.
> 
> Attached `no-embedded.pdf` uses a CID font, but it is not embedded.
> gs-9.25 could embed the font with the following command.

I think there must be some piece of configuration which you have not supplied. Possibly you have also not applied that configuration to your 9.26 installation.

Have you a copy of the font (SourceHanSerif-Regular) locally, which you have added to the cidfmap ?

Because the font is not embedded in the PDF file, Ghostscript will *not* be able to 'embed the font'. The best it can do, which is what I see happening, and what appears from the transcript you have provided to be the case for you, is that we fall back to a substitute font. If there is no substitute provided, then we use the fallback CID font.

That font *does* get embedded. Naturally, because it's using an inappropriate substitute font, the text is probably not what you expected.

I get *exactly* the same messages and output from Ghostscript 9.25 as I do for 9.26.

In short I suspect this is a configuration error in your Ghostscript 9.26 setup. If you have added the font to your local cidfmap, and that is now not working, then I'm going to need a copy of the font and your cidfmap entry in order to investigate further.

Comment 2 Masamichi Hosoda 2018-12-15 12:05:47 UTC
(In reply to Ken Sharp from comment #1)
> I think there must be some piece of configuration which you have not
> supplied. Possibly you have also not applied that configuration to your 9.26
> installation.

Sorry, I made a mistake when trying to make the tiny example.
Here is another example which is reproducible.

`SourceHanSerif-Regular.font.ps` is a PostScript file containing the SourceHanSerif-Regular font.
I have not edited cidfmap or any other configuration file.

`no-embedded.pdf` uses a CID font, but it is not embedded.
gs-9.25 could embed the font with the following command.

```
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=embedded.pdf SourceHanSerif-Regular.font.ps no-embedded.pdf
GPL Ghostscript 9.25 (2018-09-13)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
$ pdffonts embedded.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
OCRYJT+SourceHanSerif-Regular        CID Type 0C       Identity-H       yes yes no       9  0
$
```

gs-9.26 prints the following messages and cannot embed the font.

```
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=embedded.pdf SourceHanSerif-Regular.font.ps no-embedded.pdf
GPL Ghostscript 9.26 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Can't find CID font "SourceHanSerif-Regular".
Attempting to substitute CID font /Adobe-Identity for /SourceHanSerif-Regular, see doc/Use.htm#CIDFontSubstitution.
The substitute CID font "Adobe-Identity" is not provided either. attempting to use fallback CIDFont.See doc/Use.htm#CIDFontSubstitution.
Loading a TT font from /usr/share/ghostscript/9.26/Resource/CIDFSubst/DroidSansFallback.ttf to emulate a CID font Adobe-Identity ... Done.
$ pdffonts embedded.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
BNOOTV+DroidSansFallback             CID TrueType      Identity-H       yes yes no       9  0
$
```

Reverting the change made for bug 699937 in gs-9.26 makes it work again.

```
$ diff -u /usr/share/ghostscript/9.26/Resource/Init/pdf_font.ps.org /usr/share/ghostscript/9.26/Resource/Init/pdf_font.ps
--- /usr/share/ghostscript/9.26/Resource/Init/pdf_font.ps.org   2018-11-21 23:27:57.000000000 +0900
+++ /usr/share/ghostscript/9.26/Resource/Init/pdf_font.ps       2018-12-15 21:01:36.539701800 +0900
@@ -1868,16 +1868,8 @@
 /findCIDFont {
   {
     dup /CIDFont resourcestatus {
-      pop %size
-      % Check status. If its 1 then we loaded this CIDFont resource from disk, and
-      % its safe to use it. If its 0 then it was loaded from the PDF file and its
-      % *not* safe to use as a replacement for a missing font. If its 2 then its
-      % not loaded, but is available from external resource and is safe to use. So
-      % if status is 0, *don't* use this CIDFont.
-      0 eq not {
-        /CIDFont findresource
-        exit
-      }if
+      pop pop /CIDFont findresource
+      exit
     } if
     .remove_font_name_prefix
     dup dup length string cvs
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=embedded.pdf SourceHanSerif-Regular.font.ps no-embedded.pdf
GPL Ghostscript 9.26 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
$ pdffonts embedded.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
OCRYJT+SourceHanSerif-Regular        CID Type 0C       Identity-H       yes yes no       9  0
$
```

Thank you for your investigation.

Comment 3 Masamichi Hosoda 2018-12-15 13:36:28 UTC
The whole of `SourceHanSerif-Regular.font.ps` is here.
https://drive.google.com/file/d/1RV8E_d4RKOFIaqTg5p2ksHSeJ4TAyrm9/view?usp=sharing

The header is as follows.

```
%%BeginFont: SourceHanSerif-Regular
%%BeginResource: font SourceHanSerif-Regular
%!PS-Adobe-3.0 Resource-FontSet
%%DocumentNeededResources: ProcSet (FontSetInit)
%%Title: (FontSet/SourceHanSerif-Regular)
%%Version: 0
%%EndComments
%%IncludeResource: ProcSet (FontSetInit)
%%BeginResource: FontSet (SourceHanSerif-Regular)
/FontSetInit /ProcSet findresource begin
%%BeginData: 22634026 Binary Bytes
/SourceHanSerif-Regular 22633983 
```

Comment 4 Ken Sharp 2018-12-15 14:05:19 UTC
(In reply to Masamichi Hosoda from comment #2)
> (In reply to Ken Sharp from comment #1)

> Sorry, I made a mistake when trying to make the tiny example.
> Here is another example which is reproducible.
> 
> `SourceHanSerif-Regular.font.ps` is a Postscript file containing
> SourceHanSerif-Regular font.
> I've not edited cidfmap etc.
> 
> `no-embedded.pdf` uses a CID font, but it is not embedded.
> gs-9.25 could embed the font with the following command.
> 
> ```
> $ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=embedded.pdf
> SourceHanSerif-Regular.font.ps no-embedded.pdf

Yes, that will fall foul of that recent change. We've never said that will work and right now, it won't. You are relying on defining a font in PostScript, and then using it in PDF. So you are expecting the font to persist from the PostScript environment into the PDF environment.

I'll talk to one of my colleagues about this on Monday, but I should warn you now that this is not going to work in the future. There are significant changes to the PDF interpreter planned which will mean that it won't be able to use fonts from the PostScript environment.

You should investigate a different way of defining your fonts to make them available to Ghostscript. Fonts and CIDFonts can be added through the FontMap and cidfmap files.

Comment 5 Masamichi Hosoda 2018-12-15 14:26:12 UTC
(In reply to Ken Sharp from comment #4)
> You should investigate a different way of defining your fonts to make them
> available to Ghostscript. Fonts and CIDFonts can be added through the
> FontMap and cidfmap files.

Thank you for your suggestion.

If I understand correctly, on most systems only a superuser (root or an Administrator) can edit cidfmap.
Likewise, only a superuser can put files into the Fonts and CIDFonts directories.
On the other hand, defining a font in PostScript does not need superuser privileges.
Is there a better way for non-superusers to define fonts?

Comment 6 Ken Sharp 2018-12-15 14:29:10 UTC
(In reply to Masamichi Hosoda from comment #5)

> If I understand correctly, on most systems, only super user (root or
> Administrators) can edit cidfmap.
> Also, only super user can put files to Fonts and CIDFonts directories.
> On the other hand, the method of defining a font in PostScript does not need
> super user.

All of this would seem (to me) to depend entirely on how and where Ghostscript is installed.


> Is there a better way for non-super users to define fonts?

If the directories are locked like this, then you cannot install fonts. I think you should approach your system administrator and ask them to review the policy, or add the fonts you require for you.

Comment 7 Ken Sharp 2018-12-16 10:21:49 UTC
(In reply to Masamichi Hosoda from comment #5)

> If I understand correctly, on most systems, only super user (root or
> Administrators) can edit cidfmap.
> Also, only super user can put files to Fonts and CIDFonts directories.
> On the other hand, the method of defining a font in PostScript does not need
> super user.
> Is there a better way for non-super users to define fonts?

The simple answer to this is to copy the resources locally and add the CIDFont.

So if you copy the ghostpdl/Resource tree to your home directory (or a convenient sub-directory) you can then put SourceHanSerif-Regular.ps in the Resource/CIDFont folder (but use the filename SourceHanSerif-Regular). You can then invoke Ghostscript with the -I switch to tell it to use the local resources:

gs -I/home/<user>/Resource -sDEVICE=pdfwrite -sOutputFile=embedded.pdf no-embedded.pdf

When I do that the resulting file embedded.pdf reads 'Sample' and has an embedded subset CIDFont (Type 1) with an Identity-H Encoding named SourceHanSerif-Regular.

I feel this is a better solution than running the font to define it in the PostScript environment, not least because you only ever need to supply the local resources, no matter how many fonts you want to define. Also you can perform other local customisations.
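
The steps above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name `install_cidfont`, the directory `$HOME/gs-local`, and the system resource path in the usage comment are hypothetical and will differ between installations. The only parts taken from the discussion are that the font file goes into `Resource/CIDFont` under the CIDFont's own name and that `-I` points Ghostscript at the local resources.

```shell
# install_cidfont SRC NAME RES_DIR
# Copies a PostScript CIDFont file SRC into RES_DIR/CIDFont under the name
# NAME, so that the file name matches the CIDFont resource name.
# (Hypothetical helper, not part of Ghostscript.)
install_cidfont() {
    src=$1
    name=$2
    res_dir=$3
    [ -f "$src" ] || { echo "no such font file: $src" >&2; return 1; }
    mkdir -p "$res_dir/CIDFont" || return 1
    cp "$src" "$res_dir/CIDFont/$name"
}

# Example use (all paths are assumptions; adjust them to your installation):
#   mkdir -p "$HOME/gs-local"
#   cp -R /usr/share/ghostscript/9.26/Resource "$HOME/gs-local/"
#   install_cidfont SourceHanSerif-Regular.font.ps SourceHanSerif-Regular \
#       "$HOME/gs-local/Resource"
#   gs -dBATCH -dNOPAUSE -I"$HOME/gs-local/Resource" -sDEVICE=pdfwrite \
#       -sOutputFile=embedded.pdf no-embedded.pdf
```

Keeping the copy step in one function means further fonts can be added to the same local tree by calling it again with a different source file and name.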

Comment 8 Masamichi Hosoda 2018-12-16 14:05:00 UTC
(In reply to Ken Sharp from comment #7)
> The simple answer to this is to copy the resources locally and add the
> CIDFont.

Thank you for your advice.

Is it necessary to copy the Ghostscript resource tree locally?
The following commands, which do not copy the resource tree, work fine with gs-9.26.

```
rm -fr mygsres
mkdir -p mygsres/CIDFont/
cp SourceHanSerif-Regular.font.ps mygsres/CIDFont/SourceHanSerif-Regular
gs -dBATCH -dNOPAUSE -I./mygsres/ -sDEVICE=pdfwrite -sOutputFile=embedded.pdf no-embedded.pdf
```
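
To confirm that the intended CIDFont, rather than the DroidSansFallback substitute, ended up in the output, the `emb` column of `pdffonts` can be checked. The following is a minimal sketch, assuming the column layout shown in the `pdffonts` transcripts earlier in this report; the helper name `font_embedded` is hypothetical.

```shell
# font_embedded NAME
# Reads `pdffonts` output on stdin and succeeds if a font called NAME
# (ignoring a subset tag such as "OCRYJT+") has "yes" in the emb column.
font_embedded() {
    awk -v want="$1" '
        NR <= 2 { next }                      # skip the two header lines
        {
            name = $1
            sub(/^[A-Z]+\+/, "", name)        # drop a subset tag like "OCRYJT+"
            # The type column may contain spaces, so count from the right:
            # ... emb sub uni object-number generation
            if (name == want && $(NF - 4) == "yes") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example use:
#   pdffonts embedded.pdf | font_embedded SourceHanSerif-Regular \
#       && echo "embedded" || echo "substituted or missing"
```

Counting fields from the right avoids having to guess how many words the font type occupies (for example `CID Type 0C` versus `CID TrueType`).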

Comment 9 Ken Sharp 2018-12-16 14:58:11 UTC
(In reply to Masamichi Hosoda from comment #8)
> (In reply to Ken Sharp from comment #7)
> > The simple answer to this is to copy the resources locally and add the
> > CIDFont.
> 
> Thank you for your advice.
> 
> Is it necessary that copying ghostscript resource tree to local?

I was trying to keep life simple, and answer multiple questions at once.

The exact behaviour of the Ghostscript search path is confusing, even to people who have used it extensively. You do not *need* to copy the resources generally. For some cases of making local customisations, you do need to. Rather than try and define all the conditions under which you might need to do so, it's easier just to tell people to take a local copy.

If your solution works for you then obviously you don't, in this case, need to copy the resource tree.
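
When it is unclear which directories are actually being searched, the list printed by `gs -h` can help. The sketch below assumes that the help text contains a `Search path:` heading followed by indented entries separated by ` : `; that layout is an assumption about the installed build, and the helper name `gs_search_path` is hypothetical.

```shell
# gs_search_path
# Reads `gs -h` output on stdin and prints one search-path entry per line.
# Assumes a "Search path:" heading followed by indented lines whose entries
# are separated by " : ".
gs_search_path() {
    awk '
        /^Search path:/ { in_path = 1; next }
        in_path && /^[^ \t]/ { in_path = 0 }   # an unindented line ends the section
        in_path {
            n = split($0, parts, /[ \t]+:[ \t]*/)
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", parts[i])
                if (parts[i] != "") print parts[i]
            }
        }
    '
}

# Example use: check where a directory given with -I lands in the list.
#   gs -I./mygsres -h | gs_search_path
```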

Comment 10 Ken Sharp 2018-12-17 10:00:21 UTC
I've discussed this with my colleague, and he thinks there is a strong possibility that loading the CIDFont in PostScript and then trying to use it in PDF will break aspects of CIDFont handling in our existing PDF interpreter.

Also, as I mentioned earlier, this is likely not to work in the long term as we intend significant changes to the PDF interpreter.

Since installing the font is a reasonable way to proceed, I'm going to close this as 'wontfix'.

Comment 11 Knut Petersen 2018-12-18 09:06:22 UTC
(In reply to Ken Sharp from comment #10)
> I've discussed this with my colleague, and he thinks there is a strong
> possibility that loading the CIDFont in PostScript and then trying to use it
> in PDF will break aspects of CIDFont handling in our existing PDF
> interpreter.

Some people use this feature intensively; I never realized that there was a problem ...

> Also, as I mentioned earlier, this is likely not to work in the long term as
> we intend significant changes to the PDF interpreter.

I have to admit that there are valid reasons for commit 04a517f39, but I don't like the fact that the fix to one problem removes valuable features. In the commit description you incorrectly wrote:

>    This commit continues to use resourcestatus to locate a substitute
>    CIDFont, but it checks the 'status' flag. If its 0 then that means we
>    created it via an explicit defineresource, which means (I believe and
>    testing seems to confirm) that we created it from the PDF file.

A status of 0 does not necessarily mean that the CID font is provided by the PDF currently being processed; it might also be provided by earlier PostScript code, e.g. a file myfonts.ps
   
   (/mypath/emmentaler-16.cid) (r) file .loadfont
   /Emmentaler-16  /Identity-H [ /Emmentaler-16 ]  composefont  pop

in combination with e.g.

   gs [...] -sDEVICE=pdfwrite  -sOutputFile=final.pdf myfonts.ps source.pdf

If this bug stays at 'WontFix' we have to adapt our code. It would be nice if you could be a bit more precise on the intended long term changes. 

Building lilypond's documentation, and some other software that interfaces *TeX and lilypond, currently relies on Ghostscript's capability to produce PDFs without embedded (normal and CID) fonts. Those PDFs are used in (pdf/xe/lua)latex documents. The PDFs produced by *latex are then post-processed by Ghostscript to insert the missing fonts.

Will that still be possible after the intended changes?

Comment 12 Ken Sharp 2018-12-18 09:20:52 UTC
(In reply to Knut Petersen from comment #11)

> >    This commit continues to use resourcestatus to locate a substitute
> >    CIDFont, but it checks the 'status' flag. If its 0 then that means we
> >    created it via an explicit defineresource, which means (I believe and
> >    testing seems to confirm) that we created it from the PDF file.
> 
> A status of 0 does not necessarily mean that the CID font is provided by the
> pdf currently being processed, it also might be provided by earlier
> postscript code, e.g. a file myfonts.ps

Well obviously we realise that, I said so in earlier comments. I also pointed out that we had never said that this was a viable means for making font resources (especially CIDFont resources) available to PDF files. It never occurred to me that someone would even expect this to work.


> If this bug stays at 'WontFix' we have to adapt our code. It would be nice
> if you could be a bit more precise on the intended long term changes. 

If I were ready to spell them out, then I would have done so.


> Building lilypond's  documentation and some other software that  interfaces
> *TeX and lilypond currently relies on ghostscript's capability to produce
> pdfs without embedded (normal and CID) fonts. Those pdfs are used in
> (pdf/xe/lua)latex documents. The pdfs produced by *latex are then
> postprocessed by ghostscript to insert the missing fonts.
> 
> Will that still be possible after the intended changes?

If the fonts are available in the usual resource searching machinery (that is, stored on disk), yes. If you intend to rely on changes in the PostScript execution environment being reflected in the PDF execution environment, then no, or at least not generally.

Comment 13 Knut Petersen 2018-12-18 13:03:32 UTC
(In reply to Ken Sharp from comment #12)

> It never occurred to me that someone would even expect this to work.

The opposite is true for me ;-)))

> If you intend to rely on changes in the PostScript
> execution environment being reflected in the PDF execution environment, then
> no, or at least not generally.

Hmm. We also rely on other PostScript code being reflected in the PDF environment. Masamichi Hosoda's extractpdfmark utility extracts named destinations from intermediate PDFs into a PostScript program, e.g.

   changes.pdfmark
   ===============
   % Extract PDFmark 1.0.2 (with poppler-core)
   % https://github.com/trueroad/extractpdfmark/
   [ /PageMode /UseOutlines /DOCVIEW pdfmark
   [ /Dest (1) /Page 1 /View [/XYZ 72 769.89 0] /DEST pdfmark
   [ /Dest (2) /Page 2 /View [/XYZ 72 769.89 0] /DEST pdfmark
   [ /Dest (Top) /Page 1 /View [/XYZ 72 769.89 0] /DEST pdfmark

Then something like

   gs -sDEVICE=pdfwrite [...] -sOutputFile=changes.final.pdf [...]
      changes.pdfmark cidres.ps changes.intermediate.pdf

is currently used to produce the final pdf.

Do you expect this code to stop working in the future?

Comment 14 Ken Sharp 2018-12-18 13:09:36 UTC
(In reply to Knut Petersen from comment #13)

> Then something like
> 
>    gs -sDEVICE=pdfwrite [...] -sOutputFile=changes.final.pdf [...]
>       changes.pdfmark cidres.ps changes.intermediate.pdf
> 
> is currently used to produce the final pdf.
> 
> Do you expect this code to stop working in the future?

I'm not going to be pressured into discussing a project which is not yet ready for discussion.

This bug report is *closed*; stop posting to it.