Bug 698471 - Chinese special symbols show errors
Summary: Chinese special symbols show errors
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Font API
Version: 9.21
Hardware: PC Linux
Importance: P4 normal
Assignee: Chris Liddell (chrisl)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-01 01:45 UTC by ge ning zhen
Modified: 2017-09-05 19:04 UTC

See Also:
Customer:
Word Size: ---


Attachments
original pdf file (624.41 KB, application/pdf)
2017-09-01 01:45 UTC, ge ning zhen
converted png (2.86 MB, image/png)
2017-09-01 01:52 UTC, ge ning zhen
replace uni300E and uni300F (3.73 MB, application/x-font-ttf)
2017-09-04 00:34 UTC, ge ning zhen
simple pdf (57.74 KB, application/pdf)
2017-09-04 02:12 UTC, ge ning zhen
FZLTZHUNHK_HBBY.TTF (7.34 MB, application/x-font-ttf)
2017-09-04 02:14 UTC, ge ning zhen
cidfmap (2.25 KB, application/postscript)
2017-09-04 02:14 UTC, ge ning zhen

Description ge ning zhen 2017-09-01 01:45:01 UTC
Created attachment 14180
original pdf file

When I convert a Chinese PDF to PNG with Ghostscript, the "﹃" and "﹄" symbols are rendered incorrectly: in the converted image they appear as "『" and "』".

The command line I'm using for testing:

   gs -dNOSAFER -r300 -dBATCH -sDEVICE=png16m -dNOPAUSE -dEPSCrop -sOutputFile=test2.png ./test.pdf

Output of the command:

GPL Ghostscript 9.21 (2017-03-16)
Copyright (C) 2017 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZLTXIHK_HBBY.TTF to emulate a CID font FZLTXIHK_HBBY--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZLTZHK_HBBY.TTF to emulate a CID font FZLTZHK_HBBY--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZCYSK_HBBY.TTF to emulate a CID font FZCYSK_HBBY--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZHTK.TTF to emulate a CID font FZHTK--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZLTDHK_HBBY.TTF to emulate a CID font FZLTDHK_HBBY--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZBYSK_HBBY.TTF to emulate a CID font FZBYSK_HBBY--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZKTK.TTF to emulate a CID font FZKTK--GBK1-0 ... Done.
Loading a TT font from /usr/local/share/fonts/default/TrueType/FZFSK.TTF to emulate a CID font FZFSK--GBK1-0 ... Done.
Comment 1 ge ning zhen 2017-09-01 01:52:51 UTC
Created attachment 14181
converted png

This is the converted image; the "﹃" and "﹄" symbols are displayed incorrectly.
Comment 2 Ken Sharp 2017-09-01 02:10:20 UTC
(In reply to ge ning zhen from comment #0)

> "﹄" symbols are rendered incorrectly: in the converted image they appear as
> "『" and "』".
> 
> The command line I'm using for testing:
> 
>    gs -dNOSAFER -r300 -dBATCH -sDEVICE=png16m -dNOPAUSE -dEPSCrop -sOutputFile=test2.png ./test.pdf
> 
> Output of the command:
> 
> GPL Ghostscript 9.21 (2017-03-16)
> Copyright (C) 2017 Artifex Software, Inc.  All rights reserved.
> This software comes with NO WARRANTY: see the file PUBLIC for details.
> Processing pages 1 through 1.
> Page 1
> Loading a TT font from
> /usr/local/share/fonts/default/TrueType/FZLTXIHK_HBBY.TTF to emulate a CID
> font FZLTXIHK_HBBY--GBK1-0 ... Done.
> Loading a TT font from

The numerous warnings of this type indicate that the file uses CIDFonts which it does not include. As a result a substitute font is used.

Whenever font substitution takes place the result is incorrect. It may be comprehensible, but since the font is not the one intended by the document author the result is not correct.

In your case, it looks like you have substituted a horizontal font for a vertical one, hence the difference in the orientation of the punctuation glyphs.

To avoid this you must either create the PDF file with the fonts embedded (highly recommended by everyone) or define an appropriate substitute font for Ghostscript to use.

Note that since we do not have the same fonts available, or the same substitution defined, we cannot reproduce your problem precisely. If you want someone to investigate this more closely you will have to provide the relevant fonts, and also considerably reduce the complexity of the file. This is a huge page, with 8 different missing CIDFonts. You need to reduce this to a file with a single missing font which demonstrates your problem, and ideally a single glyph or two.
Comment 3 ge ning zhen 2017-09-04 00:34:32 UTC
Created attachment 14185
replace uni300E and uni300F
Comment 4 ge ning zhen 2017-09-04 00:35:46 UTC
Hi Ken Sharp:
    Thanks for your help, but I think this is not a font problem. In version 9.21, if Ghostscript does not find the fonts a PDF uses, it falls back to "DroidSansFallback.ttf" (/Resource/CIDFSubst/DroidSansFallback.ttf) to replace the missing CID font. The style changes, but the characters shown should not be wrong. DroidSansFallback.ttf and all of the font packages we use contain both "﹃" (uniFE43) and "﹄" (uniFE44); you can verify this with FontCreator. Yet the converted image shows "『" (uni300E) and "』" (uni300F).
    To reproduce the problem, do not configure any fonts; just let Ghostscript substitute automatically and check whether the characters are correct. My guess is that Ghostscript resolves uniFE43 and uniFE44 to uni300E and uni300F. To confirm this, I used FontCreator to replace the "『" (uni300E) glyph with "﹃" and the "』" (uni300F) glyph with "﹄" in DroidSansFallback.ttf and saved the result as a new font. Converting to PNG with this font produces an image in which the two symbols are correct, which confirms my guess.
I have attached my modified font; you can generate images with it and with the original font to compare how the two symbols are displayed.

Original Font location:
/usr/local/share/ghostscript/9.21/Resource/CIDFSubst/DroidSansFallback.ttf
New Font location:
/usr/local/share/ghostscript/9.21/Resource/CIDFSubst/DoridSansFallback_NEW.ttf

Thanks.
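
For reference, the four code points named in this comment can be inspected with Python's standard unicodedata module. This is only an illustrative sketch and not part of Ghostscript: it shows that U+FE43 and U+FE44 are the vertical presentation forms of U+300E and U+300F, so any step that applies a compatibility mapping would turn one pair into the other. Whether Ghostscript does anything like this is the guess under discussion, not something the snippet establishes.

```python
import unicodedata

# The vertical brackets used in the PDF and the horizontal ones seen in the PNG.
VERTICAL = "\uFE43\uFE44"
HORIZONTAL = "\u300E\u300F"


def describe(ch: str) -> str:
    """Return the code point, Unicode name and decomposition of one character."""
    decomp = unicodedata.decomposition(ch) or "none"
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} [decomposition: {decomp}]"


for ch in VERTICAL + HORIZONTAL:
    print(describe(ch))

# NFKC normalization applies compatibility decompositions, so the vertical
# presentation forms fold to the ordinary corner brackets.
folded = unicodedata.normalize("NFKC", VERTICAL)
print(folded == HORIZONTAL)
```

The names and decompositions printed here come from the Unicode Character Database bundled with the Python interpreter, so they describe the characters themselves rather than the glyphs in any particular font.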
Comment 5 Ken Sharp 2017-09-04 00:54:59 UTC
(In reply to ge ning zhen from comment #4)

>     Thanks for your help, but I think this is not a font problem.

Clearly it is: it's a glyph that is rendered 'incorrectly'. That is a font matter and has nothing to do with the PDF interpreter.

> if ghostscript did not find pdf fonts, it will use
> "DroidSansFallback.ttf(/Resource/CIDFSubst/DroidSansFallback.ttf)" to
> replace the missing CID font, although the style has changed, but the
> characters shown are not wrong.

I think they are. Using the DroidSansFallback font which we ship (as opposed to any font which your distribution may have shipped instead), I see the same problem that you describe: horizontal punctuation is used instead of vertical punctuation.

This is not surprising: this is a fallback font, and its glyphs (where it matters) are horizontal.


> In the attachment I uploaded my modified font, you can use this font and the
> original font to generate pictures to compare the two symbols of the display.

No. We are not going to replace a fallback font with horizontal glyphs with a hacked up font where *some* (but not all) of the horizontal punctuation has been replaced by vertical punctuation. Apart from anything else, this will simply mean that text using horizontal writing will get vertical punctuation marks, which will then be wrong instead.

If you want us to look into this problem you are going to have to supply the fonts *you* are using, as described in your initial comment. If you aren't willing to do that then we will close this bug report.

We already understand why using a horizontal-writing font to replace a vertical-writing font leads to problems when using punctuation marks, and we do not plan to do anything about this (fundamentally because we can't).

Whenever you replace fonts, especially CIDFonts, and doubly so when the language is complex, you can expect problems. If you don't want that, then embed the CIDFonts in the PDF file (as recommended in the PDF specification) or use the original fonts as substitutes.

If you use the original font as a substitute, and that fails to render correctly, then that may be a bug (or it may be a mis-configuration on your part). We are prepared to look at that, but you will need to supply the original font and mapping that you are using. When you can do that you may reopen this bug report.
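
As a rough illustration of the "use the original fonts as substitutes" option, the sketch below formats a cidfmap record that points a CIDFont name at a TrueType file, following the record syntax described in the Ghostscript documentation for cidfmap. The helper name cidfmap_entry is hypothetical, and the /CSI ordering and supplement (GB1, 2) and the SubfontID are assumptions chosen for illustration; the font name and path are taken from the log in the initial comment. The mapping actually in use on the reporter's system may differ.

```python
def cidfmap_entry(cid_name: str, ttf_path: str,
                  ordering: str = "GB1", supplement: int = 2,
                  subfont_id: int = 0) -> str:
    """Format one cidfmap record mapping a CIDFont name to a TrueType file.

    Assumes ttf_path contains no parentheses or backslashes, which would
    need escaping inside a PostScript string.
    """
    return (
        f"/{cid_name} << /FileType /TrueType "
        f"/Path ({ttf_path}) "
        f"/SubfontID {subfont_id} "
        f"/CSI [({ordering}) {supplement}] >> ;"
    )


# Font name and path as they appear in the reporter's log; the CSI values
# are illustrative defaults, not taken from the report.
entry = cidfmap_entry(
    "FZLTXIHK_HBBY",
    "/usr/local/share/fonts/default/TrueType/FZLTXIHK_HBBY.TTF",
)
print(entry)
```

The printed line would be appended to the cidfmap file that Ghostscript reads at startup; the ordering and supplement should match the character collection the PDF's CIDFont declares.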
Comment 6 ge ning zhen 2017-09-04 02:12:33 UTC
Created attachment 14186
simple pdf
Comment 7 ge ning zhen 2017-09-04 02:14:08 UTC
Created attachment 14187
FZLTZHUNHK_HBBY.TTF
Comment 8 ge ning zhen 2017-09-04 02:14:50 UTC
Created attachment 14188
cidfmap
Comment 9 ge ning zhen 2017-09-04 02:21:10 UTC
Hi:

The command line I'm using for testing:

   gs -dNOSAFER -r300 -dBATCH -sDEVICE=png16m -dNOPAUSE  -sOutputFile=test2.png ./test6.pdf

Output of the command:
   GPL Ghostscript 9.21 (2017-03-16)
   Copyright (C) 2017 Artifex Software, Inc.  All rights reserved.
   This software comes with NO WARRANTY: see the file PUBLIC for details.
   Processing pages 1 through 1.
   Page 1
   Loading a TT font from /usr/local/share/fonts/default/TrueType/FZLTZHUNHK_HBBY.TTF to emulate a CID font FZLTZHUNHK_HBBY--GBK1-0 ... Done.

Simple pdf:
   attachment/simple pdf(test6.pdf)

Font:
   attachment/FZLTZHUNHK_HBBY.TTF

Cidfmap:
   attachment/cidfmap

Thanks
Comment 10 Chris Liddell (chrisl) 2017-09-05 05:39:43 UTC
Fixed in:
http://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=7dd033589c
Comment 11 ge ning zhen 2017-09-05 19:04:30 UTC
Hi all:
   Thanks very much.