Summary: | Encoding of pdf metadata does not comply with pdf standard | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Frédéric Bron <frederic.bron> |
Component: | PDF Writer | Assignee: | Ken Sharp <ken.sharp> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P4 | ||
Version: | 9.06 | ||
Hardware: | PC | ||
OS: | Linux | ||
Customer: | Word Size: | --- | |
Attachments: |
result of ps2pdf
input ps file |
Description
Frédéric Bron
2012-11-29 13:42:02 UTC
(In reply to comment #0)
> Created an attachment (id=9110)
> result of ps2pdf

I'd much prefer that you post the file before conversion; I'm quite capable of running Ghostscript to see what the output looks like.

Created attachment 9111
input ps file
Technically the 'correct' approach is to define a PDFDSCEncoding which maps the non-ASCII values. However, this is non-trivial and counter-intuitive. I've made changes so that, in the absence of a PDFDSCEncoding, we assume that any non-UTF-16BE string is using PDFDocEncoding. We then convert that to UTF-16BE and on to UTF-8. This should resolve the problem.

See commit: a3d00daf5f9abb1209cb750a95e23bc6951c1c63

Thanks for the quick fix. I have built the modified gs, and I no longer get the error in evince. I still have a question: why are 0xA1 and 0xA2 in the .ps encoded as 0xC2 0xA3 and 0xC2 0xA4 in the XML part of the .pdf, and not as 0xC2 0xA1 and 0xC2 0xA2? For a reason I do not understand, pdfinfo interprets them the same, but can you explain?

(In reply to comment #5)
> I have built the modified gs. I do not have the error in evince anymore. I
> still have a question: why 0xA1 and 0xA2 in .ps are encoded 0xC2 0xA3 and 0xC2
> 0xA4 in the xml part of the.pdf and not 0xC2 0xA1 and 0xC2 0xA2? For a reason I
> do not understand pdfinfo interprets it the same but can you explain?

Hmm, I'd have to check; that would suggest that I messed up the lookup table which converts PDFDocEncoding into XML. I'll look at it again.

Yes, you were quite correct, I'd missed an entry in the lookup table quite near the beginning. There's a fix here:

3a4439baee68c440da7164daf55de04a4d48609a

I believe that fixes it, but it's unfortunately easy to miss entries when creating these kinds of tables.

That works now, thanks.
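For readers following the thread, the conversion described in the fix (treat any non-UTF-16BE string as PDFDocEncoding, then emit UTF-8 for the XML metadata) can be sketched roughly as below. This is only an illustration, not Ghostscript's actual code: the function name `info_string_to_utf8` is hypothetical, and the sketch assumes a simplified subset of PDFDocEncoding (tab, line feed, carriage return, printable ASCII, and 0xA1 to 0xFF except 0xAD, where the byte value equals the Unicode code point). A complete implementation needs the full lookup table from the PDF specification for the remaining bytes.

```python
# Illustrative sketch only; NOT the Ghostscript implementation.
# Assumption: only the part of PDFDocEncoding that coincides with Latin-1
# is handled. Other bytes need the full table from the PDF specification.

UTF16BE_BOM = b"\xfe\xff"


def info_string_to_utf8(raw: bytes) -> bytes:
    """Convert a PDF text string to the UTF-8 bytes used in XML metadata."""
    if raw.startswith(UTF16BE_BOM):
        # Already UTF-16BE: drop the byte order mark and transcode.
        return raw[2:].decode("utf-16-be").encode("utf-8")

    chars = []
    for b in raw:
        is_whitespace = b in (0x09, 0x0A, 0x0D)
        is_ascii = 0x20 <= b <= 0x7E
        is_latin1_range = 0xA1 <= b <= 0xFF and b != 0xAD
        if is_whitespace or is_ascii or is_latin1_range:
            # In these ranges the byte value is the Unicode code point.
            chars.append(chr(b))
        else:
            raise ValueError(f"byte 0x{b:02X} needs the full PDFDocEncoding table")
    return "".join(chars).encode("utf-8")


if __name__ == "__main__":
    # The two bytes discussed in the thread.
    print(info_string_to_utf8(b"\xa1\xa2").hex(" "))
```

Under these assumptions, the bytes 0xA1 0xA2 come out as 0xC2 0xA1 0xC2 0xA2, which is the output the reporter expected after the lookup table was corrected.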