Summary: | ps2pdf modifies ASCII text of a PDF file, breaking conversion to text and searching for text | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Vincent Lefevre <vincent-gs> |
Component: | PDF Writer | Assignee: | Chris Liddell (chrisl) <chris.liddell> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | chris.liddell |
Priority: | P2 | ||
Version: | 10.02.0 | ||
Hardware: | PC | ||
OS: | Linux | ||
Customer: | Word Size: | --- | |
Attachments: | Reduced to something actually debuggable |
Description
Vincent Lefevre
2023-10-05 14:34:20 UTC
Created attachment 24934
Reduced to something actually debuggable
Fixed in:

https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=c92cd1c24abf4

I don't know what you mean by "lack of proper support for ToUnicode CMap" nor the "this just concerns an ASCII character". The problem is a bug in the parsing of the ToUnicode CMap for a multi-byte (so very clearly not ASCII) CIDFont.

(In reply to Chris Liddell (chrisl) from comment #2)
> Fixed in:
>
> https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=c92cd1c24abf4

I'll try to have a look at it in the next few days.

> I don't know what you mean by "lack of proper support for ToUnicode CMap"
> nor the "this just concerns an ASCII character". The problem is a bug in the
> parsing of the ToUnicode CMap for a multi-byte (so very clearly not ASCII)
> CIDFont.

OK. About "lack of proper support for ToUnicode CMap", there have been various regressions in the past few years concerning *non-ASCII* characters, possibly related to the ToUnicode CMap handling (but only affecting particular non-ASCII characters); some of them (related to the ToUnicode CMap) have been fixed, but see bug 704674 and bug 704681, which are still open (at least the second one appeared when switching to the new PDF interpreter).

Concerning "this just concerns an ASCII character", I just meant that this was the first time I was seeing an ASCII character not handled correctly.

I still have to analyze new regressions (to myself: cours05.tex). A bug in poppler or a change on the LaTeX side is not excluded either (I'll have to check that too).

(In reply to Chris Liddell (chrisl) from comment #2)
> Fixed in:
>
> https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=c92cd1c24abf4

I confirm that the issue is fixed in Debian's package ghostscript 10.02.1~dfsg-1 (from 2023-11-08 in unstable).
Note also that while ghostscript 10.0.0~dfsg-11+deb12u2 (for Debian 12 (bookworm), which is the current Debian/stable) doesn't have any issue with the "A" changed to "!", it has various similar issues with non-ASCII characters in the n3163.pdf file. These issues do not appear in ghostscript 10.02.1~dfsg-1. But I can still see the regressions from cours05.tex compared to what I got with the old PDF interpreter.
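
For anyone who wants to re-check this kind of regression on their own files, here is a minimal sketch of one possible comparison: extract the text before and after ps2pdf and list the spans that differ. It assumes `ps2pdf` and poppler's `pdftotext` are on the PATH. The function names (`extract_text`, `changed_spans`, `check_roundtrip`) are made up for illustration, and this is not necessarily how the results above were obtained.

```python
import difflib
import subprocess


def extract_text(pdf_path):
    """Return the text of a PDF as extracted by poppler's pdftotext."""
    # "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def changed_spans(before, after):
    """List (old, new) substring pairs that differ between two texts."""
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    return [
        (before[i1:i2], after[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def check_roundtrip(src_pdf, out_pdf):
    """Run ps2pdf on src_pdf and report text that changed in out_pdf."""
    # Hypothetical helper: writes out_pdf, then compares extracted text.
    subprocess.run(["ps2pdf", src_pdf, out_pdf], check=True)
    return changed_spans(extract_text(src_pdf), extract_text(out_pdf))
```

Calling `check_roundtrip("n3163.pdf", "out.pdf")` would return a list of `(old, new)` pairs; an entry such as `("A", "!")` would correspond to the symptom reported here. A difference can also come from poppler rather than Ghostscript, so an empty list is only a rough indication that the text survived the conversion.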