Bug 690094

Summary: missing letters replaced by little vertical lines
Product: Ghostscript Reporter: artifex
Component: Images Assignee: Masaki Ushizaka <masaki.ushizaka>
Status: NOTIFIED FIXED    
Severity: normal CC: henry.stiles, marcos.woehrmann, ralph.giles
Priority: P2    
Version: 8.63   
Hardware: PC   
OS: Windows XP   
Customer: 870 Word Size: ---
Bug Depends on: 691081    
Bug Blocks: 691248    
Attachments: sulzer.pdf
i5.pdf -- simplified sample file
i5.jbig2.zip
i5.jbig2dec.png
i5.luratech.png
A patch to fix missing/vertical line glyph problem

Description artifex 2008-09-25 07:05:39 UTC
When converting (or viewing) the attached PDF file sulzer.pdf to TIFFG4 (or
another format such as PostScript), letters are missing on the second and
following pages. The missing letters are shown as little vertical lines.

The file can be viewed and printed correctly with Acrobat Reader.

Ghostscript shows warnings such as:

   **** Warning: File has imbalanced q/Q operators (too many q's)
Page 2
   **** Warning: File has imbalanced q/Q operators (too many q's)
Page 3
   **** Warning: File has imbalanced q/Q operators (too many q's)
Page 4
   **** Warning: File has imbalanced q/Q operators (too many q's)

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> CVISION Technologies <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.
Comment 1 artifex 2008-09-25 07:06:47 UTC
Created attachment 4424 [details]
sulzer.pdf
Comment 2 Alex Cherepanov 2008-09-25 17:44:09 UTC
Created attachment 4426 [details]
i5.pdf -- simplified sample file
Comment 3 Alex Cherepanov 2008-09-25 17:51:21 UTC
This is a problem in JBIG2 decoding.
The simplified sample file contains a single encoded image.
The bug is confirmed in the current development version.
Comment 4 Ralph Giles 2009-06-16 10:08:24 UTC
Current gs (r9793) shows a blank page after throwing the following errors on the
simplified file:

jbig2dec FATAL ERROR decoding image: runlength too large in export symbol table
(839 > 18 - 0)
 (segment 0x0a)

   **** Warning: File has insufficient data for an image.

evince shows the image but reports:

Error (524): 17259 extraneous bytes after segment
(twice)
Comment 5 Masaki Ushizaka 2009-08-12 01:42:43 UTC
Reproduced with r9974 on Mac OS X 10.5.7 + Xcode 3.1.2.  Same result as #4.
Comment 6 Masaki Ushizaka 2010-04-07 12:55:45 UTC
*** Bug 691206 has been marked as a duplicate of this bug. ***
Comment 7 Masaki Ushizaka 2010-04-07 12:56:15 UTC
*** Bug 691081 has been marked as a duplicate of this bug. ***
Comment 8 Masaki Ushizaka 2010-04-07 12:59:02 UTC
Merged bugs 691081 and 691206 into this one.
I should also note that mupdf has the same problem, as it uses jbig2dec too.
Comment 9 Masaki Ushizaka 2010-04-12 01:37:48 UTC
Created attachment 6155 [details]
i5.jbig2.zip

Extracted the JBIG2 files from i5.pdf.  Using these, I can reproduce the problem with the following command:

./jbig2dec i5.global.jbig2 i5.page.jbig2
Comment 10 Masaki Ushizaka 2010-04-13 09:37:51 UTC
Starting with r9769, the "runlength too large in export symbol table" error has been masking the original problem.  An upcoming patch fixes that error, so the original problem will become visible again.
I am delegating the "runlength..." problem to bug 691081 and recording the dependency.
Comment 11 Masaki Ushizaka 2010-04-13 12:14:31 UTC
Created attachment 6157 [details]
i5.jbig2dec.png

The "runlength..." problem was fixed in r11057, so the original problem is visible again.
This is the result from the jbig2dec command.  Some letters are missing; others are replaced with vertical lines.

Command line used:

./jbig2dec i5.global.jbig2 i5.page.jbig2

(Then converted .pbm to .png)
Comment 12 Masaki Ushizaka 2010-04-13 12:15:58 UTC
Created attachment 6158 [details]
i5.luratech.png

The same file processed with the Luratech decoder.
Comment 13 Masaki Ushizaka 2010-04-14 09:51:37 UTC
Created attachment 6162 [details]
A patch to fix missing/vertical line glyph problem

The problem was in jbig2_decode_symbol_dict().  The image number was not incremented correctly while building the symbol dictionary.  This small patch fixes it.
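
To illustrate the kind of indexing error described here, the following is a minimal C sketch of walking alternating "not exported" / "exported" runs over a symbol list. It is not the jbig2dec source: collect_exported() and its parameters are hypothetical names, and the run lengths are passed in as a plain array instead of being decoded from the bitstream. The point it tries to show is that the source index has to advance for every symbol in every run, whether or not that run is exported.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of jbig2dec.
 * runs[] holds alternating run lengths: the first run is "not exported",
 * the second "exported", and so on. Writes the indices of the exported
 * symbols into out[] and returns how many were written, or -1 if a run
 * would step past the n_symbols available symbols or overflow out[].
 */
static int collect_exported(const int *runs, size_t n_runs,
                            int n_symbols, int *out, int out_cap)
{
    int i = 0;       /* index into the full symbol list */
    int j = 0;       /* number of exported symbols so far */
    int exflag = 0;  /* export flag of the current run, starts cleared */
    size_t r;
    int k;

    assert(runs != NULL && out != NULL);

    for (r = 0; r < n_runs; r++) {
        if (runs[r] < 0 || runs[r] > n_symbols - i)
            return -1;           /* run extends past the symbol list */
        for (k = 0; k < runs[r]; k++) {
            if (exflag) {
                if (j >= out_cap)
                    return -1;   /* more exports than the caller allowed */
                out[j++] = i;
            }
            i++;  /* advance for every symbol, exported or not */
        }
        exflag = !exflag;
    }
    return j;
}
```

With runs {2, 3, 1, 1} over seven symbols, this exports indices 2, 3, 4 and 6. If i advanced only inside the if (exflag) branch, the same input would export indices 0, 1, 2 and 3 instead, i.e. the wrong glyphs.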
Comment 14 Ralph Giles 2010-04-15 04:55:19 UTC
Patch looks good. Please commit.
Comment 15 Masaki Ushizaka 2010-04-15 11:09:24 UTC
The patch has been committed as r11074.  Closing this bug.
Comment 16 Masaki Ushizaka 2010-04-21 06:52:09 UTC
The change in r11074 causes a segmentation fault on images with SDHUFF == 1.  Reopening to fix.
Comment 17 Masaki Ushizaka 2010-04-21 06:57:48 UTC
Because I changed the image number to be incremented even when 'exflag' is false, a problem with 'exrunlength' when SDHUFF == 1 was exposed.  The following patch deals with this.

Index: jbig2dec/jbig2_symbol_dict.c
===================================================================
--- jbig2dec/jbig2_symbol_dict.c	(revision 11092)
+++ jbig2dec/jbig2_symbol_dict.c	(working copy)
@@ -693,7 +693,7 @@
     while (j < params->SDNUMEXSYMS) {
       if (params->SDHUFF)
       	/* FIXME: implement reading from huff table B.1 */
-        exrunlength = params->SDNUMEXSYMS;
+        exrunlength = exflag ? params->SDNUMEXSYMS : 0;
       else
         code = jbig2_arith_int_decode(IAEX, as, &exrunlength);
       if (exflag && exrunlength > params->SDNUMEXSYMS - j) {
Comment 18 Masaki Ushizaka 2010-04-21 08:25:12 UTC
The patch has been committed in r11093.  Closing again.
Comment 19 Marcos H. Woehrmann 2011-09-18 21:47:20 UTC
Changing customer bugs that were resolved more than a year ago to closed.