Created attachment 7051 [details] testcase

This document contains the following text information:

136 <0061> def
137 <0062> def
138 <0063> def
139 <0064> def
140 <0065> def
141 <0066> def
142 <0067> def
143 <0068> def
144 <0069> def
145 <006A> def
146 <006B> def

Copy and paste works correctly with gs 8.71, but something has definitely changed in recent trunk: gs 9.01 can't do the proper mapping between glyph and text.

This is easy to reproduce: download SourceForge's official release builds of 8.71 and 9.01, run ps2pdf separately with each gs on this attachment, and compare the results of copying and pasting the text into Notepad. The PDF file from 8.71 works fine; the one from 9.01 does not paste correctly.
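To make the expected copy and paste result concrete, here is a minimal Python sketch of what those entries imply. It assumes each `N <hhhh> def` entry maps a numeric code to a UTF-16BE text string; that reading of the testcase, and the helper names `parse_entries` and `extract_text`, are illustrative assumptions and not part of Ghostscript.

```python
import re

# Entries copied from the report. Assumption: "code <hex> def" maps a
# numeric code to the UTF-16BE text it should extract to.
ENTRIES = """
136 <0061> def
137 <0062> def
138 <0063> def
139 <0064> def
140 <0065> def
141 <0066> def
142 <0067> def
143 <0068> def
144 <0069> def
145 <006A> def
146 <006B> def
"""

ENTRY_RE = re.compile(r"(\d+)\s*<([0-9A-Fa-f]+)>\s*def")


def parse_entries(text):
    """Build a {code: unicode string} table from 'code <hex> def' entries."""
    mapping = {}
    for code, hex_str in ENTRY_RE.findall(text):
        mapping[int(code)] = bytes.fromhex(hex_str).decode("utf-16-be")
    return mapping


def extract_text(codes, mapping):
    """Map a sequence of codes to text; unmapped codes become U+FFFD."""
    return "".join(mapping.get(code, "\ufffd") for code in codes)


if __name__ == "__main__":
    table = parse_entries(ENTRIES)
    print(extract_text(range(136, 147), table))  # prints "abcdefghijk"
```

Under that assumption, pasting the text drawn with codes 136 to 146 should give "abcdefghijk", which is the behaviour reported for 8.71.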
Created attachment 7052 [details] generated pdf from 9.01 for preceding attachment.
Created attachment 7053 [details] expected result(from 8.71)
Fixed in revision 11975: http://ghostscript.com/pipermail/gs-cvs/2010-December/012056.html
Created attachment 7062 [details] another testcase

This bug is not resolved. I added another testcase that causes the same problem.
Created attachment 7063 [details] expected result(from 8.71)
(In reply to comment #4)
> Created an attachment (id=7062) [details]
> another testcase
>
> this bug not resolved. i added another testcase that causes same problem

This is quite a different issue, nothing to do with CMaps at all. The font in question (Arial) is embedded as a Symbolic TrueType font. Symbolic TrueType fonts should not have an Encoding (and in PDF/A *must* not).

Previously we embedded an Encoding anyway, because (as in this case) Acrobat would use the Encoding to search and extract text. However this *is* technically invalid, caused us to create invalid PDF/A files and recently started causing other problems.

As a result the code was revised to create proper Symbolic TrueType fonts with no Encoding. If you would like to raise an enhancement request I'm happy to consider whether we can do a better job of identifying Symbolic fonts, but as it stands this is not a bug nor a regression, even though it does appear to be.
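The rule described above can be sketched as a small check. This is only an illustration in Python, not Ghostscript code: the flattened `font` dictionary and the function name `encoding_problems` are hypothetical, and in a real PDF the Flags value lives in the font descriptor instead of the font dictionary itself. The Symbolic flag value (bit 3, value 4) is taken from the PDF font descriptor flags.

```python
# Font descriptor flag from the PDF specification: bit 3 (value 4) is Symbolic.
SYMBOLIC = 1 << 2


def encoding_problems(font):
    """Return a list of complaints for a simplified font description.

    `font` is a hypothetical flattened dict with the keys "Subtype",
    "Flags" and, optionally, "Encoding".
    """
    problems = []
    if font.get("Subtype") != "TrueType":
        return problems  # the rule sketched here only concerns TrueType fonts
    if font.get("Flags", 0) & SYMBOLIC and "Encoding" in font:
        problems.append("Symbolic TrueType font carries an Encoding")
    return problems


if __name__ == "__main__":
    old_style = {"Subtype": "TrueType", "Flags": 4, "Encoding": "WinAnsiEncoding"}
    new_style = {"Subtype": "TrueType", "Flags": 4}
    print(encoding_problems(old_style))
    print(encoding_problems(new_style))
```

Here `old_style` stands for a font written with an Encoding as before the revision, and `new_style` for a font written without one.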
--- base/gdevpdtt.c (revision 11735)
+++ base/gdevpdtt.c (revision 11734)

In this changeset, font->FontType == ft_TrueType is commented out. But this causes some non-cid truetype cid fonts (fonts which have valid encodings such as winansi, isolatin...) to lose their original encoding, regardless of whether they are tagged with the Symbolic flag. Am I understanding it correctly?

I don't know whether this is a TrueType font with the Symbolic flag, but there could be several improvements to work around this non-bug problem...
(In reply to comment #7)
> --- base/gdevpdtt.c (revision 11735)
> +++ base/gdevpdtt.c (revision 11734)
> in this changeset, font->FontType == ft_TrueType is commented out.
> but it causes some non-cid truetype cid font ( fonts which has valid encodings
> such as winansi...isolatin..) loses its original encoding. not depending
> whether it is tagged with symbolic flag. am i understanding it correctly?

I'm afraid I don't quite understand what you mean. The change prevents the addition of an Encoding to a font whose type is TrueType. Since we always write TrueType fonts as Symbolic fonts (as noted in the comment) it doesn't matter what the original font's Encoding was.

This change had no effect on any font other than a TrueType font. I'm not sure what you mean by a 'non-cid truetype cid font'.
> This change had no effect on any font other than a TrueType font.

Yes, indeed.

> I'm not sure what you mean by a 'non-cid truetype cid font'.

I'm sorry for the confusing phrase; I meant a non-CID TrueType font.