With Debian's ghostscript 10.02.0~dfsg-2 package (under Debian unstable), if I run ps2pdf on the PDF file from

  https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3163.pdf

the text

  arithmetic. Also, C99’s informative annex G offered [...]

is changed to

  arithmetic. !lso, C99’s informative annex G offered [...]

i.e. the letter "A" is changed to the exclamation point "!".

Issues with special Unicode characters are common due to the lack of proper support for ToUnicode CMap, but here, this just concerns an ASCII character.

Note that this is a regression: there is no such issue with the ghostscript 10.0.0~dfsg-11+deb12u1 package under Debian 12 (bookworm). The PDF file may be particularly ugly, but Ghostscript 10.0.0 could handle it.

If I decompress the streams of the original PDF file with "qpdf --stream-data=uncompress", the TJ lines around the problematic text are:

[(i)5(n)-9(tro)10(d)10(u)7(c)9(e)-11(d)10( )-393(n)-9(e)-11(a)-11(r)5(ly)25( )-393(c)9(o)8(m)-8(p)-11(le)-13(t)20(e)11( )-393(su)5(p)11(p)-11(o)8(r)5(t )-396(f)8(o)8(r)5( )-393(th)4(e)11( )-393(I)6(E)7(C)-4( )-370(6)9(0)9(5)9(5)-13(9)9(:)-8(1)9(9)-13(8)9(9)9( )-393(st)-4(a)-11(n)-9(d)10(a)-11(r)5(d)10( )-393(f)8(o)8(r)5( )-393(bi)7(n)13(a)-11(r)5(y)4( )-393(f)8(lo)7(a)-11(ti)25(n)-9(g)] TJ
[(-)] TJ
[(p)-11(o)8(i)5(n)-9(t)20( )] TJ
[<0083>-11<0094>5<008B>5<0096008A>4<008F>-8<0087>-11<0096008B0085>12<01E40003>-301<0004>9<008E0095>-3<0091>8<01E10003>-301<0006>-4<037B>9<037B>9<01EF>-5<00950003>-304<008B>5<0090>-9<0088>8<0091>-13<0094>5<008F>-8<0083>-11<0096008B0098>6<0087>-11<0003>] TJ
[(a)] TJ
[(n)13(n)-9(e)-11(x)6( )-302(G )-304(o)8(f)8(f)8(e)-11(r)5(e)11(d)10( )-302(a)-11( )-302(sp)-13(e)-11(c)9(i)5(f)8(i)5(c)9(a)-11(tio)11(n)-9( )-302(o)8(f)8( )-324(c)9(o)8(m)-8(p)-11(le)-13(x)6( )-302(a)-11(r)5(i)5(th)4(m)-8(e)-11(tic)12( )-302(th)4(a)-11(t )-305(i)5(s )] TJ
Created attachment 24934
Reduced to something actually debuggable
Fixed in:

https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=c92cd1c24abf4

I don't know what you mean by "lack of proper support for ToUnicode CMap" nor the "this just concerns an ASCII character". The problem is a bug in the parsing of the ToUnicode CMap for a multi-byte (so very clearly not ASCII) CIDFont.
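To illustrate the "multi-byte, not ASCII" point, here is a minimal Python sketch (not part of the Ghostscript fix) that splits the hex-string TJ operand quoted in the report into two-byte codes and pairs them with the visible text. It assumes that each code is four hex digits wide and that the run lines up one code per character with "arithmetic. Also, C99’s informative "; both are inferences from the report, not something read from the font or its ToUnicode CMap. The names `TJ`, `EXPECTED` and `hex_codes` are made up for this example.

```python
import re

# Hex-string TJ operand copied from the report.
TJ = (
    "[<0083>-11<0094>5<008B>5<0096008A>4<008F>-8<0087>-11<0096008B0085>12"
    "<01E40003>-301<0004>9<008E0095>-3<0091>8<01E10003>-301<0006>-4<037B>9"
    "<037B>9<01EF>-5<00950003>-304<008B>5<0090>-9<0088>8<0091>-13<0094>5"
    "<008F>-8<0083>-11<0096008B0098>6<0087>-11<0003>]"
)

# Text this run is assumed to render (taken from the report, U+2019 apostrophe).
EXPECTED = "arithmetic. Also, C99\u2019s informative "


def hex_codes(tj_operand, width=4):
    """Split every <...> hex string into codes of `width` hex digits (4 = 2 bytes)."""
    codes = []
    for run in re.findall(r"<([0-9A-Fa-f]+)>", tj_operand):
        codes.extend(int(run[i:i + width], 16) for i in range(0, len(run), width))
    return codes


codes = hex_codes(TJ)

# Pair codes with characters; only meaningful under the
# one-code-per-character assumption stated above.
mapping = {}
for code, ch in zip(codes, EXPECTED):
    mapping.setdefault(ch, code)

print(len(codes), len(EXPECTED))          # 36 36
print(f"'A' -> 0x{mapping['A']:04X}")     # 'A' -> 0x0004
```

Under those assumptions, the "A" in the text is stored as the two-byte code 0x0004 rather than as ASCII 0x41, so what ends up in the extracted text depends entirely on how the ToUnicode CMap for that font is parsed.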
(In reply to Chris Liddell (chrisl) from comment #2)
> Fixed in:
>
> https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=c92cd1c24abf4

I'll try to have a look at it in the next few days.

> I don't know what you mean by "lack of proper support for ToUnicode CMap"
> nor the "this just concerns an ASCII character". The problem is a bug in the
> parsing of the ToUnicode CMap for a multi-byte (so very clearly not ASCII)
> CIDFont.

OK. About "lack of proper support for ToUnicode CMap", there have been various regressions in the past few years concerning *non-ASCII* characters, possibly related to the ToUnicode CMap handling (but only affecting particular non-ASCII characters); some of them (related to the ToUnicode CMap) have been fixed, but see bug 704674 and bug 704681, which are still open (at least the second one appeared when switching to the new PDF interpreter).

Concerning "this just concerns an ASCII character", I just meant that this was the first time I was seeing an ASCII character not handled correctly.

I still have to analyze new regressions (to myself: cours05.tex). A bug in poppler or a change on the LaTeX side is not excluded either (I'll have to check that too).
(In reply to Chris Liddell (chrisl) from comment #2)
> Fixed in:
>
> https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=c92cd1c24abf4

I confirm that the issue is fixed in Debian's package ghostscript 10.02.1~dfsg-1 (from 2023-11-08 in unstable).

Note also that while ghostscript 10.0.0~dfsg-11+deb12u2 (for Debian 12 (bookworm), which is the current Debian/stable) doesn't have any issue with the "A" changed to "!", it has various similar issues with non-ASCII characters in the n3163.pdf file. These issues do not appear in ghostscript 10.02.1~dfsg-1.

But I can still see the regressions from cours05.tex compared to what I got with the old PDF interpreter.