Summary: | In gs 8.14 ps2pdf causes severe glyph distortion in certain fonts | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Bo Thidé <bt> |
Component: | PDF Writer | Assignee: | Raph Levien <raph.levien> |
Status: | NOTIFIED FIXED | ||
Severity: | normal | CC: | alex, htl10, igor.melichev |
Priority: | P3 | Keywords: | bountiable |
Version: | 8.14 | ||
Hardware: | HP | ||
OS: | HP-UX | ||
Customer: | Word Size: | --- | |
Attachments: |
Correct ps file
Distorted .pdf file generated from a correct .ps file with ps2pdf (GS 8.14) My distillation of the file using a relatively fresh HEAD gs bad814-.pdf 3.bmp 3.ps managed to reproduce exact effect of the bug. patch |
Description
Bo Thidé
2004-03-11 13:16:46 UTC
Created attachment 550 [details]
Correct ps file
Created attachment 551 [details]
Distorted .pdf file generated from a correct .ps file with ps2pdf (GS 8.14)
This seems to be fixed in HEAD. Created attachment 562 [details]
My distillation of the file using a relatively fresh HEAD gs
This one appears to be fine.
Can't reproduce on Intel/Windows. From the attached PDF I can see that the glyph '3' is distorted. I guess the distortion happened while executing psf_convert_type1_to_type2. Is HP a low endian platform ? Created attachment 563 [details]
bad814-.pdf
A reduced "Distorted .pdf file". It contains a single glyph with a distortion.
Created attachment 564 [details]
3.bmp
A draft of the unhinted outline taken from bad814-.pdf .
Created attachment 565 [details]
3.ps
A reduced test file containing the single glyph.
Dear Bo Thidé, Since we can't reproduce this problem on our machines, could you please debug this problem on HP ? I'll ask boss about a payment for this job. Run gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=out.pdf 3.ps with a C debugger, setting a breakpoint at the beginning of psf_convert_type1_to_type2 in gdevpsfx.c and trace the function. I guess it wrongly converts a number from big endian to low endian, or wrongly applies an operation '<<' or '>>' to a signed/unsigned int. Note that the function is called with a notdef glyph (which is empty) and with the glyph '3', several times with each one. Glyph indices 938 and 886 are visible in the caller. In the attached draft 3.bmp, the outline starts with the point marked with a small circle, and goes conter-clockwise. The problem happens within the first 7 path segments, which are 4 curves, 1 line, and 2 curves. It seems that if it is a bug, it is HP specific possibly. I cannot reproduce exactly this problem on HP-UX computers available for HP test drive program. HP-UX 11i 11.11 rp2470 2@750MHz 8700 PA-RISC -> works fine HP-UX 11i 11.11 rp3410 2@800MHz 8800 PA-RISC -> works fine HP-UX 11i 11.23 rx1600 2@1.0GHz (Itanium II) -> segv, bug 687643 Please provide detailed information about your hardware, OS, and compiler. Even better, provide an login account where the problem can be reproduced an fixed. Has this (bit-manipulation?) bug been solved in 8.30 or 8.31? Machine: HP 9000/C360 PA-RISC2.0 (and all other PA-RISC workstations and servers that we have - about 50 or so) OS: HP-UX 11.11, fully patched. C compiler: HP92453-01 B.11.11.30766.GP HP C Compiler C++ compiler: HP aC++ B3910B A.03.56 We have no problems with segmentation violation or such. Everything runs fine. It's only the rendering of some glyphs in some fonts (typically in sans serif fonts) that introduces visual glyph distortion. It is very likely that the ghostscript bit manipulation does not to do a correct job for the PA-RISC. This bug has existed in all ghostscript versions since 7.01. That's why we still run AFPL Ghostscript 7.0 (2001-04-08). I still cannot reproduce the problem even on gs 8.14 code base. Please attach your Makefile, configure.log, and the values of your environment variables. The distortion looks rather similiar to a problem I am currently having on Tru64 (64-bit - the same file displays correctly on x86 linux). The embedded font in my case is truetype, however, and I just tried the pdf in this bug with gs 8.51 tru64 which seems to work; just thought I should mention it here: http://bugs.ghostscript.com/show_bug.cgi?id=688047#c3 I'll try to get a small test case for bug 688047 soon - the problem is currently in a 20-page pdf. Tried as Alex (comment 11, comment 14). Can't reproduce it myself either. Found two worth-mentioning issues though - (1) ghostscript generates subtly different pdf files depending on whether it is compiled with the HP compiler or gcc (which may have been what Alex was using, as configure picks gcc by default unless told otherwise). Need to look further. (2) gs 8.51 won't build on HP-UX - jasper problem (new to 8.50/8.51). Dear Igor - I think you had the right idea about the problem in the type 1
to type 2 conversion. I had a good look at the broken pdf (attachment 551 [details]),
and comparing to an extremely similiar but correct pdf,
and found that about 1/2 of the embedded glyphs are corrupted. There
is a very specific pattern to the corruption - an rrcurveto immediately
following a vhcurveto (rather than following any other operator), always
have the x co-ordinate of the 2nd control point messed up.
e.g. the first 7 path segments of the '3' glyph:
0 132 -85 35 -42 18 rrcurveto
66 43 27 43 0 54 rrcurveto
41 -27 94 -129 vhcurveto
-67 0 -31 -31 -40 -131 rrcurveto
15 -4 rlineto
15 28 42 78 91 0 rrcurveto
74 36 -53 -51 hvcurveto
it is 4 curves, 1 line, and 2 curves. The correct 4th curve should be
"-67 0 -89 -31 -40 -131 rrcurveto". Note the -31/-31 versus -89/-31
Likewise, whenever a "rrcurveto" follows a "vhcurveto", instead
of having
A B C D E F rrcurveto
it is always
A B D D E F rrcurveto
I think it is one of the memcpy's going wrong... need to look further...
Created attachment 1474 [details]
managed to reproduce exact effect of the bug.
Dear Bo Thidé - do you use the libcres.a on HP-UX routinely? Or some
other library that optimises/changes memory allocation/movements?
Or very agressive optimization flags?
I managed to reproduce the exact effect of the bug on x86 linux
(i.e. breaking the type1 charstring to type2 transform in the
exact same way) by making one line of code change in
src/gdevpsfx.c:
=================================
/* A 0 C D E F => A C D F E */
memcpy(csp - 4, csp - 3, sizeof(*csp) * 2);
csp[-2] = *csp;
csp[-4] = csp[-3]; <------- inserted line
c = prev_op;
POP(1);
goto put;
==================================
I think I may have found the bug: it is possible that memcpy is
broken on some combination of
HP-UX/compiler/memory-optimization library/agreesive-optimization.
I just did a search on google and found that HP provides an
optimizing library called libcres.a which provides a different
implmentation of memcpy (and other frequently used str*/mem*
functions) and only available for linking against the HP compiler.
This would explain why nobody can reproduce the bug with gcc or
the HP test drive system.
In any case, the bit of ghostscript code is broken - it is
copying an overlaping region of memory, and memcpy doesn't gaurantee
to work in that context - it should be memmove instead.
So this is my next plan - I'll try to look for the libcres.a library on the HP test drive program, and play with the optimization flags to reproduce the bug. I just had a further look - *almost all* the instances of memcpy in src/gdevpsfx.c involves overlapping regions of memory, and should have been memmove instead... That particular memcpy on line 841 (also the last instance) of src/gdevpsfx.c also co-incide with what Bo indicated as the time of breakage - according to the CVS log, between ghostscript 7.0 and 7.02, Peter made a change to an 'if x then jump' condition a few lines above to fix another bug; this would have the effect of altering the code path and exposing the code below which may(would?) have been inaccessible before then. (I am not saying that there is any problem with that change - I don't understand Peter's fix completely [yet]). This is the CVS log entry (between version 1.8 and 1.9 of that file): ================= Fix: An incorrect optimization in the conversion of Type 1 to Type 2 CharStrings could cause character shapes to be mangled. This probably only affected embedded fonts in PDF output. Fixes SourceForge #444374. ================= Hin-Tak Leung, thank you for digging out the exact place of the failure. This is the classic case of the overlapping intervals im memcpy(), whose effect is undefined. memcpy(csp - 4, csp - 3, sizeof(*csp) * 2); This code should use memmove() which works correctly even for overlapping intervals. memmove(csp - 4, csp - 3, sizeof(*csp) * 2); There are other calls to memcpy() with overlapping intervals in gdevpsfx.c, but they seem to work OK. This may depend on the alignment of the memory pointers, the length of the interval, etc. Unfortunately, little can be proven about this case because we cannot reproduce it. Created attachment 1482 [details]
patch
Replace memcpy() calls with overlapping source and destination areas to
memmove() calls. This is important on HP aC++ compiler for PA-RISC with
+Olibcalls option, which replaces memcpy() and friends with an optimized inline
code.
This is not the only place in Ghostscript with the overlapping intervals in
memory functions. Fortunately, valgrind detects this class of errors on the
standard x86 box.
I have looked up it a bit and found that apparently the default for recent version for gcc is to substitute with inline macro code (from gcc) which presumably behaves as desired (controlled explicitly by the -mno-memcpy/-mmemcpy option). So presumably the bug cannot be seen wherever gcc is used; and it is only manifested when both of these conditions are true: (1) using a vendor compiler (2) the memcpy in the vendor's libc is buggy (or more precisely, only do the mimimum and doesn't work correctly when the region overlap). i.e. clarify the 'inline code' part of the previous comment (comment 22). gcc by default inlines its own memcpy, so this isn't a case of inline versus 'non-inline'; the vendor's implementation could be inlined or actual library routines, but seems to be buggy whichever way it is implemented. There are 6 overlapping memcpy's - they all work as optimizing the charstring type 1 to type 2 bytecode transformation. For some yet unknown reason, the bug symptom is entirely due to the last one being broken - I tried breaking all of them, and the symptom is different from Bo's; i.e. more than just the last one is in the active code path, and the other 5 appear to work correctly on Bo's system. So I do not think I have got to the bottom of the problem yet, although it may be simply that memcpy is broken on HP-UX when "3rd argument < less than sizeof(*csp) * 3"! src/gdevpsfx.c:782: memcpy(csp - 5, csp - 4, sizeof(*csp) * 4); src/gdevpsfx.c:790: memcpy(csp - 4, csp - 3, sizeof(*csp) * 3); src/gdevpsfx.c:812: memcpy(csp - 4, csp - 3, sizeof(*csp) * 3); src/gdevpsfx.c:821: memcpy(csp - 5, csp - 4, sizeof(*csp) * 3); src/gdevpsfx.c:837: memcpy(csp - 5, csp - 4, sizeof(*csp) * 5); src/gdevpsfx.c:851: memcpy(csp - 4, csp - 3, sizeof(*csp) * 2); I have a few comments: 1. I can now reproduce the problem on aC++ with +Olibcalls option on a HP box. 2. The problem disappears after the proposed fix even with +Olibcalls. 3. The vendor compiler is not buggy. The spec says that the result of overlapping intervals is undefined. valgring flags this as an error. 4. Vendor's libc version is more robust than the inline code. It is not affected by the overlapping intervals. GCC is even more robust. 5. It is very likely that memcpy() moves data in 8-bit blocks on 64-bit computer providing the alignment is right. This can be confirmed by studying the generated code. 6. The decission about the bounty can be left to a 3rd party, for instance, Igor or Ray. You made the critical observations, I made the conclusions. (comment 26) 3. Valgrind is useful but would not had been appropriate - there are too many memcpy's everywhere in ghostscript. As I said, Igor made the important comment with pin-pointing where (down to one routine in one file) to look. In fact I wrote a little perl script hooking some font tools together to reverse ghostscript's type1 to type2 charstring transform to work out what's wrong with it. 6. The "conclusion" was already in comment 19 ("check libraries and optimization flags", "replace all overlapping memcpy's with memmove"). According to HP's programming guide (pdf and html on HP's web site), "+Olibcalls" is switched on at "+O2" or above. So it is possible that Bo has a habit of declaring "+O2" for everything he compiles, without realising or mentioning it. ==================== +O[no]libcalls Optimization level: +O0, +O1, +O2, +O3, +O4 Default: +Onolibcalls at +O0 and +O1; +Olibcalls at +O2, +O3, and +O4 ==================== Folks, 0. Thank you for good analyzis. 1. Please try replacing the overlapping memcpy with memmove. I believe the latter should work correctly. Does this fix the problem ? 2. The bug is bountiable, and two people worked on the bug. Hin-Tak Leung and Alex, please decide how you'll divide the bounty (I mean in case if it is really finally fixed). 3. If you think that the bonus is too small, please contact Ray. (comment 29) 1. I still like to see why only a specific one memcpy of the six breaks; so I'll spend a bit more time on this bug. 2,3. I understand Alex had spent considerable effort on it; if it is alright with Artifex, a 2nd bounty may be appropriate - perhaps for taking a look at the 8.51 HP-UX build problem in addition? Took a look at the assembler dumps with some changes. Turn out that the +Olibcalls optimization flag (or equivalent) only have effect for n=1 and n=2 . From n>2 or unknown at compile-time onwards, memcpy on HP-UX works the same way as memmove. i.e. it is as I said in comment 25, memcpy doesn't work as desired for n <3. Since one can't do overlaping for n=1, only n=2 is affected; so only one of the 6 memcpy's is critical on HP-UX; but all six should be replaced with memmove. I would consider we know everything there is to know about this bug now. Filed new Bug 688184 for the 8.51 jasper+HP-UX header problem. Hin-Tak Leung, I don't understand why this issue is still open when you, Igor, and Valgring all agree that the fix is good. I've committed the patch to the head branch. Go claim the bounty. |