Ghostscript shows unrotated punctuation forms with Japanese vertical text. It also does not use a correct centerline for vertical text, a regression from 7.07 and ESP Ghostscript 8.15.1. Originally reported by mpsuzuki in http://ghostscript.com/pipermail/gs-devel/2007-June/003538.html
Created attachment 3072 [details] PS file including vertical text Test file. Requires an appropriate cidfmap for Japanese text.
Created attachment 3073 [details] Scanned image of Japanese PS printer output Correct output for reference.
Created attachment 3074 [details] Ghostscript 7.07 output
Created attachment 3075 [details] ESP Ghostscript 8.15.1 without CJK patch Vanilla ESP Ghostscript 8.15.1 output showing unrotated punctuation, a regression from 7.07.
Created attachment 3076 [details] ESP Ghostscript 8.15.1 with CJK patch Output from ESP Ghostscript 8.15.1 with a proposed patch. This shows correctly rotated punctuation, but also a slight vertical offset.
Created attachment 3077 [details] Current trunk output Output with current development HEAD version, showing unrotated punctuation and centerline shift problems relative to 7.07 and the reference output.
Created attachment 3079 [details] cidfmap for the fonts referenced by article9.ps Attaching the cidfmap file the original submitter used. This provides a substitution map from the fonts referenced by article9.ps to the IPA set, which is available bundled with the GRASS software package, and included in this archive: http://www.grass-japan.org/FOSS4G/ipafonts/grass5.0.3_i686-pc-linux-i18n-ipafull-gnu_bin.tar.gz The absolute paths will of course have to be altered to match where they are in any particular install.
Assigning to the developer of the GS 8 CJK support.
Patches were proposed by Koji Otani, but they raised bug 689404 and bug 689405. The patches were applied in rev 8187 and, because of the regressions, removed again in rev 8190. I am attaching the patches to this bug report for further study.
Created attachment 3280 [details] Add support for the %% Replace keywords used with COMPILE_INITS=1 Here is an additional patch on top of r8187 that replaces the EXTRA_ hack with the proper %% Replace directives. This allows geninit.c to automatically include the files when compiling in the ps library, stripping comments and making other space-saving reductions. Please include this patch when reinstating the cjkv patch.
Created attachment 3281 [details] CJK patches from Koji Otani, part 1
Created attachment 3282 [details] CJK patches from Koji Otani, part 2
Created attachment 3283 [details] CJK patches from Koji Otani, part 3
Created attachment 3284 [details] CJK patches from Koji Otani, part 4
Additional note: all of the CJK patches were reverted in rev 8191, not in rev 8190.
The regressions caused by Koji Otani's patch, bug 689404 and bug 689405, can be fixed (see each entry; I proposed patches there and the regression tests passed). So I am returning this bug to support for evaluation of the proposed patch.
What is required is a patch for the gs 8.60 code that fixes this vertical writing issue, NOT the entire CJKV set of patches. We have many customers relying on the functionality of the existing Ghostscript method and the massive CJKV patch changes many things that do not need to be changed, possibly introducing new bugs. The regression test currently only has limited testing of Asian fonts, so it is not an authoritative test of actual customer usage of Ghostscript. Note that if there are other problems in gs 8.60 that are addressed by the CJKV patch, these should be entered as new bugs.
Changing assignment due to no progress.
Bumping priority for re-testing after recent changes. Likely it is fixed.
Reassign to Marcos for re-test as requested by Igor
Created attachment 4400 [details] screenshot.png
As of r9806 the punctuation is still rotated compared to the text (see the attached screenshot.png).
The fonts can be found as part of the Common Open Printing System: http://lx1.avasys.jp/OpenPrintingProject/openprinting-jp-0.1.3.tar.gz
Created attachment 4401 [details] cidfmap This is a minimal cidfmap that can be used to read the file. It assumes the ipam.ttf and ipaq.ttf files are in the current directory.
Thanks to mpsuzuki for reviewing the images of attachment of comment #21. The 'brackets' are not rotated in the left hand (Ghostscript) image -- this is at the top and bottom of the seventh column from the left. Also I notice that column 2 has a '2' that has a circle around it as well as the '1' in the sixth column. The circles are missing in the printer output.
> Also I notice that column 2 has a '2' that has a circle around it as well as the '1' in the sixth column. The circles are missing in the printer output.
The line of the circle in the printer output is very, very thin, but the circles are present in the printer output.
Created attachment 4477 [details] patch.txt A patch to HEAD for glyph orientation. It is a partial fix that does not yet address glyph positions.
Comment #16 doesn't look useful for mainstream GS development, because it fixes only the problems of Koji Otani's patches, which are not included in the mainstream. The patch from Comment #27 may need to be applied to make this statement true. Toshiya, please explain if I'm missing something.
Patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2008-October/008705.html commits the Comment #27 patch. It is a partial fix that does not yet address glyph positions.
Created attachment 4478 [details] PDF including punctuations to be rotated Now I don't have sufficient time to build SVN HEAD and review your code. Please show the rasterization result of attached PDF.
Created attachment 4502 [details] test.png The PDF file from Comment #30 converted to PNG file using gshead (r9149).
Regarding Comment #30, 31: the document doesn't embed a CID font, so its interpretation depends on the installed CID fonts. Marcos, please attach the cidfmap you used to run it. The raster is not useful without that information.
I used the cidfmap from Comment #24.
Regarding Comment #31-33: After decompressing the streams in PDF_including_punctuations_to_be_rotated.pdf I see that both texts are printed with the same font, which uses the *horizontal* writing mode. The texts start with CID=634 and CID=7887 respectively. With the cidfmap attached above I ran this test:
    /Ryumin-Light-Identity-H findfont /FDepVector get 0 get
    % dup { exch = =
    % } forall
    /CIDMap get
    /CID 634 def
    dup 0 get dup CID 2 mul get 256 mul exch CID 2 mul 1 add get add =
    /CID 7887 def
    dup 0 get dup CID 2 mul get 256 mul exch CID 2 mul 1 add get add =
It prints:
    500
    500
It means that in the supplied font both CIDs map to the same glyph (unless we have a bug in the TT cmap decoder (written by mpsuzuki), but I don't think so). So I conclude that the test case is incorrect. The reason for the incorrectness is that the supplied Open Type font is not sufficient to emulate the CID font Ryumin-Light. BTW when I change Identity-H to Identity-V (2 occurrences) in PDF_including_punctuations_to_be_rotated.pdf, Ghostscript renders rotated glyphs, which are correct (and the text is printed vertically, as Adobe does). So I believe that Comment #31-33 should be closed as RESOLVED INVALID. Please use a better font for the CID font emulation.
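The PostScript test above takes the first string of the font's CIDMap, reads two bytes per CID and combines them big-endian into a glyph index. A minimal Python sketch of the same arithmetic follows. The byte string and the helper name gid_from_cidmap are made up for illustration; they are not taken from the IPA fonts or from Ghostscript.

```python
def gid_from_cidmap(cidmap: bytes, cid: int) -> int:
    """Return the glyph index stored for `cid` (2 bytes per CID, big-endian)."""
    hi = cidmap[2 * cid]        # same as: CID 2 mul get
    lo = cidmap[2 * cid + 1]    # same as: CID 2 mul 1 add get
    return hi * 256 + lo        # same as: 256 mul ... add

# Hypothetical 4-entry CIDMap: CIDs 1 and 3 both point at glyph 500,
# mirroring the situation where two CIDs collapse onto one glyph.
cidmap = bytes([0, 0,  1, 244,  0, 7,  1, 244])

print(gid_from_cidmap(cidmap, 1))
print(gid_from_cidmap(cidmap, 3))
```

If two different CIDs return the same value, as CIDs 1 and 3 do in this synthetic map, the emulated font cannot distinguish them at rendering time.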
I don't understand why you conclude that the testcase in Comment #30 is incorrect. It was generated by Adobe Acrobat 7. The use of a CID for a vertical glyph in horizontal writing mode is described in Adobe Technical Note #5078 (the official specification of Adobe-Japan1-6). So such usage (using a CID for a vertical glyph in horizontal writing mode) must be accepted to provide compatibility with Adobe products. In fact, Adobe Reader displays both the horizontal and the vertical glyphs from the testcase in Comment #30. However, if you want to restrict your scope to vertical glyphs in vertical writing mode only and you feel the testcase in Comment #30 is beyond that scope, I will file Comment #30 as another bug. You can close it as "WON'T FIX", but it is not "INVALID".
I thought more about comment #34 and I think it needs a correction (even without Comment #35). This quote is wrong: "It means that in the supplied font both CIDs map to same glyph (unless we have a bug in the TT cmap decoder (written by mpsuzuki),...)" Actually the CID font emulation first maps CIDs to Unicode and then Unicode to GIDs. We fail at the first step because Unicode uses the same codes for vertical and horizontal glyphs. Thus the failure is not related to the TT cmap decoder, which runs in the other step.
Then I thought about how we can work around the failure. Likely we need a list of CID pairs that correspond to single glyphs in Unicode (or in another encoding used for the CIDDecoding resource). I think the right way is to create a new resource category HVGlyphs whose instances are dictionaries that define a V mapping and an H mapping. Both the V and H mappings are dictionaries that map CIDs to CIDs; they are reverses of each other. The resource names are the various orderings (Japan1, CNS1, and so forth). Such a resource will be loaded when a CID font emulation loads a True Type font or "an Open Type font with True Type data", and it will be associated with the font. When the text decomposition happens, we get the CID in the text enumerator structure. Here we can look up the HVGlyphs resource and see if the CID belongs to another writing mode. If so, trigger the call to gs_type42_substitute_glyph_index_vertical with the "exclusive or" logic.
Although all of the above looks workable, I don't like 2 things: (1) the dependence on Postscript code and (2) it works for Open Type only (it doesn't work for True Type). (1) looks more or less acceptable because this problem should not happen with other languages, or we will need to fix it later (and we can). As to (2), the True Type case needs more investigation and more effort. Likely it will need to choose the right subfont from a True Type Collection or merge several fonts of the collection.
Thanks to mpsuzuki for the reference to the tech note in Comment #35.
It proves that we need to do this job (I was in doubt before getting it).
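The failure and the proposed workaround can be sketched roughly as follows. This is only an illustration in Python, not Ghostscript code. The pairing of CID 634 (horizontal) with CID 7887 (vertical), the shared Unicode value, and all function names are assumptions made for the sketch, and wants_vertical_glyph is just one possible reading of the "exclusive or" rule, which the comment above does not spell out.

```python
# Illustrative values only: the 634/7887 pairing and the Unicode value
# 0x3001 are assumptions; GID 500 is the value printed by the earlier test.
CID_TO_UNICODE = {634: 0x3001, 7887: 0x3001}   # step 1: both forms share one code
UNICODE_TO_GID = {0x3001: 500}                 # step 2: TrueType cmap lookup

def emulate_cid(cid):
    """Two-step CID font emulation: CID -> Unicode -> GID."""
    return UNICODE_TO_GID[CID_TO_UNICODE[cid]]

# Sketch of one resource instance (e.g. for the Japan1 ordering): two
# dictionaries that map CIDs to CIDs, each the reverse of the other.
V_MAPPING = {634: 7887}                            # horizontal CID -> vertical CID
H_MAPPING = {v: h for h, v in V_MAPPING.items()}   # vertical CID -> horizontal CID

def belongs_to_other_wmode(cid, wmode):
    """True if `cid` is a glyph form of the writing mode that is not `wmode`."""
    if wmode == 0:
        return cid in H_MAPPING   # a vertical CID used in horizontal text
    return cid in V_MAPPING       # a horizontal CID used in vertical text

def wants_vertical_glyph(cid, wmode):
    """Exclusive or of 'text is vertical' and 'CID belongs to the other mode'."""
    return (wmode == 1) != belongs_to_other_wmode(cid, wmode)

print(emulate_cid(634) == emulate_cid(7887))   # step 1 loses the distinction
for wmode in (0, 1):
    for cid in (634, 7887):
        print(wmode, cid, wants_vertical_glyph(cid, wmode))
```

Under this reading the glyph form follows the CID rather than the writing mode: a vertical CID used in horizontal text still gets the vertical glyph, which is the behaviour the test PDF from Comment #30 relies on.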
I'm still in doubt about the name for the new resource category. It must be good because it is global. HVGlyphs doesn't look good. The most meaningful name is WModeDependentCIDs or SubstituteCIDsDependingOnWMode, but I don't like the length. Suggestions are welcome. Maybe CIDSubstitution, assuming that the resource instance itself will explain what the substitution depends on?
2nd patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2008-November/008788.html closes Comments #30-36.
3rd patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2008-November/008789.html
One more patch for glyph positions : http://ghostscript.com/pipermail/gs-cvs/2008-November/008800.html Now we've got 3 patches that together close this bug. But we still want to make some additional improvements, so don't close the bug now.
One more patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2008-November/008801.html
Here is the complete list of patches for closing this bug : http://ghostscript.com/pipermail/gs-cvs/2008-October/008705.html http://ghostscript.com/pipermail/gs-cvs/2008-November/008788.html http://ghostscript.com/pipermail/gs-cvs/2008-November/008789.html http://ghostscript.com/pipermail/gs-cvs/2008-November/008801.html Closing the bug now.
One more patch to head : http://ghostscript.com/pipermail/gs-cvs/2008-November/008816.html
Reopening the bug because it is reproducible with a newer version of Japanese fonts http://ossipedia.ipa.go.jp/ipafont/IPAfont00203.php
Created attachment 4693 [details] IPAfont00203.zip A local copy of the fonts.
A quick check shows that it falls into the unimplemented case coverage_format == 2. If that is the only problem, a fix shouldn't be difficult.
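Assuming coverage_format refers to the Coverage table of an OpenType layout lookup (the comment does not say so explicitly), the two formats differ only in how the covered glyphs are listed: format 1 is a plain array of glyph IDs, format 2 is a list of glyph ranges. A minimal Python sketch of a lookup that handles both follows; the function name and the sample tables are made up for illustration and this is not the Ghostscript implementation.

```python
import struct

def coverage_index(table: bytes, glyph_id: int):
    """Return the coverage index of glyph_id, or None if it is not covered."""
    fmt, count = struct.unpack_from(">HH", table, 0)
    if fmt == 1:
        # Format 1: an array of `count` glyph IDs; the index is the position.
        glyphs = struct.unpack_from(">%dH" % count, table, 4)
        return glyphs.index(glyph_id) if glyph_id in glyphs else None
    if fmt == 2:
        # Format 2: `count` range records of (start, end, startCoverageIndex).
        for i in range(count):
            start, end, start_index = struct.unpack_from(">HHH", table, 4 + 6 * i)
            if start <= glyph_id <= end:
                return start_index + (glyph_id - start)
        return None
    raise ValueError("unknown coverage format %d" % fmt)

# Synthetic tables that cover the same glyphs: 10, 11, 12 and 20.
format1 = struct.pack(">HH4H", 1, 4, 10, 11, 12, 20)
format2 = struct.pack(">HH3H3H", 2, 2, 10, 12, 0, 20, 20, 3)

for gid in (10, 12, 20, 15):
    print(gid, coverage_index(format1, gid), coverage_index(format2, gid))
```

A font that stores its vertical substitution coverage in format 2 would be skipped entirely by code that only understands format 1, which would be consistent with the older IPA fonts working and the newer ones not.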
Created attachment 4694 [details] patch4.txt A patch is being tested.
One more patch to HEAD : http://ghostscript.com/pipermail/gs-cvs/2009-January/008913.html