Bug 691457 - PDF without fonts will not display
Summary: PDF without fonts will not display
Status: RESOLVED WONTFIX
Alias: None
Product: MuPDF
Classification: Unclassified
Component: mupdf
Version: unspecified
Hardware: PC Windows XP
Importance: P4 normal
Assignee: Tor Andersson
Reported: 2010-07-14 13:02 UTC by akay
Modified: 2014-04-16 02:03 UTC

Attachments
include pdf,error.jpg,ok.jpg (1.97 MB, application/x-rar-compressed)
2010-07-14 13:02 UTC, akay
foxitreader (97.77 KB, image/jpeg)
2010-07-20 16:48 UTC, akay
Acrobat Professional view (74.72 KB, image/jpeg)
2010-07-20 16:50 UTC, akay

Description akay 2010-07-14 13:02:11 UTC
Created attachment 6482
include pdf,error.jpg,ok.jpg

Please test lenin.01.pdf.

I opened it with the newest MuPDF and it does not display correctly.
You can see the result in the attached pictures.
I think the reason is that the document does not include its fonts.
Comment 1 Tor Andersson 2010-07-15 12:43:18 UTC
The PDF file specifies that the font should use the WinAnsiEncoding. Not much we can do if the file lies to us about which encoding it uses. You should report a bug against the PDF creation software:

/Producer (S22PDF V1.0 \271\371\301\246\(C\) pdf@home.icm.ac.cn)

None of the PDF software I tested with (Apple Preview, Poppler, GS)
draws this file with Chinese characters.
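
The four non-ASCII bytes in the /Producer string above are written as PDF octal escapes. As an illustration only, here is a minimal C sketch of decoding such escapes. `pdf_unescape` is a hypothetical helper, not a MuPDF function, and it leaves out backslash-newline continuations for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MuPDF): decode the escapes in the body of
 * a PDF literal string, i.e. the text between the outer parentheses.
 * Handles \ddd octal codes (one to three digits) and the single-character
 * escapes. Backslash-newline continuation is not handled in this sketch.
 * Returns the number of bytes written to out (at most cap). */
size_t pdf_unescape(const char *in, unsigned char *out, size_t cap)
{
	size_t n = 0;

	assert(in != NULL && out != NULL);

	while (*in && n < cap)
	{
		if (*in != '\\')
		{
			out[n++] = (unsigned char)*in++;
			continue;
		}

		in++; /* skip the backslash */

		if (*in >= '0' && *in <= '7')
		{
			unsigned v = 0;
			int digits = 0;
			while (digits < 3 && *in >= '0' && *in <= '7')
			{
				v = v * 8 + (unsigned)(*in++ - '0');
				digits++;
			}
			out[n++] = (unsigned char)(v & 0xFF);
		}
		else if (*in)
		{
			switch (*in)
			{
			case 'n': out[n++] = '\n'; break;
			case 'r': out[n++] = '\r'; break;
			case 't': out[n++] = '\t'; break;
			case 'b': out[n++] = '\b'; break;
			case 'f': out[n++] = '\f'; break;
			default: out[n++] = (unsigned char)*in; break; /* \( \) \\ and unknown escapes */
			}
			in++;
		}
	}
	return n;
}
```

Applied to the body `\271\371\301\246\(C\)`, this yields the bytes B9 F9 C1 A6 followed by `(C)`. How the first four bytes are read depends entirely on the encoding the viewer assumes.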
Comment 2 akay 2010-07-20 09:40:13 UTC
Hi, I made a patch to view the buggy PDF (reference: http://code.google.com/p/sumatrapdf/issues/detail?id=844).

--- c:/pdf_fontfile.c	Wed Jul 14 19:02:22 2010
+++ d:/MyProgram/MuPdf/mupdf/pdf_fontfile.c	Tue Jul 20 17:35:17 2010
@@ -216,6 +216,9 @@
 		fz_warn("unknown cid collection: %s", collection);
 	}
 
+	if (strlen(fontname) == 0)
+		return loadsystemcidfont(fontdesc, GB, MINCHO);
+
 	if (isscript)
 		name = "Chancery";
  
--- c:/pdf_font.c	Wed Jul 14 19:02:22 2010
+++ d:/MyProgram/MuPdf/mupdf/pdf_font.c	Tue Jul 20 17:29:56 2010
@@ -456,6 +456,12 @@
 		}
 	}
 
+	error = pdf_loadsystemcmap(&fontdesc->encoding, "GBK-EUC-H");
+	error |= pdf_loadsystemcmap(&fontdesc->tounicode, "Adobe-GB1-UCS2");
+	error |= pdf_loadsystemcmap(&fontdesc->tottfcmap, "Adobe-GB1-UCS2");
+	if (!error)
+		goto skipencoding;
+
 	/* try to reverse the glyph names from the builtin encoding */
 	for (i = 0; i < 256; i++)
 	{
@@ -483,7 +489,7 @@
 	error = pdf_loadtounicode(fontdesc, xref, estrings, nil, fz_dictgets(dict, "ToUnicode"));
 	if (error)
 		goto cleanup;
-
+skipencoding:
 	/*
 	 * Widths
 	 */
Comment 3 Tor Andersson 2010-07-20 10:55:53 UTC
That is a one-off hack for one buggy file which no other PDF viewer can view properly. Not even Adobe Reader 9. I'm sorry, but the answer is still no.
Comment 4 Hin-Tak Leung 2010-07-20 14:03:48 UTC
That's one ugly patch. I had a look around for "S22PDF V1.0" and "pdf@home.icm.ac.cn"; apparently that contact/maintenance info is no longer valid (or not reachable from the West). OTOH, S22PDF seems to be rather popular in China, and many web sites seem to have auto-generated PDFs from that source, so it is rather unfortunate.

This seems to be a situation where a broken PDF generation tool has caught on in a big part of the world. I can't believe it has caught on to that extent without a corresponding viewer that can display its output, so maybe the reporter can tell me which viewer(s) are used to view these files successfully? Just out of curiosity.
Comment 5 akay 2010-07-20 16:48:23 UTC
Created attachment 6516
foxitreader

Viewed with Foxit Reader (http://www.foxitsoftware.com/?Language=en).
Comment 6 akay 2010-07-20 16:50:26 UTC
Created attachment 6517
Acrobat Professional view

Adobe Acrobat Professional 7.0 displays it correctly.
Comment 7 akay 2010-07-20 16:52:07 UTC
Hi, I can view it with Foxit Reader (http://www.foxitsoftware.com/?Language=en),
and Adobe Acrobat Professional 7.0 also displays it correctly.
You can see the screenshots in the attachments,
foxitreader.jpg and Acrobat.jpg.

Please check it.
Comment 8 Tor Andersson 2010-07-20 19:21:05 UTC
Are you sure you're using the same file?
I've tried with Adobe Reader 7, 8 and 9: they all show dots.
I've tried Foxit Reader 3.3 and 4: they show the same as MuPDF.
Comment 9 Tor Andersson 2010-07-20 22:18:24 UTC
After changing the Windows locale to Chinese (PRC) I can see the Chinese text in both Foxit and Reader. What happened to the P in Portable Document Format!?
Comment 10 Hin-Tak Leung 2010-07-20 22:23:48 UTC
I installed Foxit under wine/linux, and it took both of these two things to display the broken PDF:
- install the Foxit CJK pack
- run wine in the simplified Chinese locale (LC_ALL=zh_CN wine "Foxit Reader.exe")

Apparently Foxit uses the Windows system locale to guess what "winansi" should be. It appears that Adobe Acrobat Professional does the same thing: it detects that it is running under localised Chinese Windows and guesses from the system locale when "winansi" isn't really WinAnsi.

That's a very dubious thing to do: guessing the encoding of PDFs from the Windows locale setting.
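
To make the behaviour inferred in this comment concrete, here is a minimal C sketch of a locale-driven guess. It is an assumption about what such a viewer might do, not code from Foxit, Acrobat, or MuPDF; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names, used only for this illustration. */
enum guessed_encoding { ENC_WINANSI, ENC_GBK };

/* Pick the interpretation of a font declared as /WinAnsiEncoding purely
 * from the system locale name, as the comment suspects the viewers do.
 * A Simplified Chinese locale ("zh_CN", optionally followed by a codeset
 * such as ".UTF-8") selects GBK; every other locale keeps WinAnsi. */
enum guessed_encoding guess_encoding_from_locale(const char *locale)
{
	assert(locale != NULL);

	if (strncmp(locale, "zh_CN", 5) == 0)
		return ENC_GBK;
	return ENC_WINANSI;
}
```

With this rule, `guess_encoding_from_locale("zh_CN")` and `guess_encoding_from_locale("en_US.UTF-8")` give different answers for the same file, which is exactly why the rendering stops being portable.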
Comment 11 akay 2010-07-20 23:35:00 UTC
Hi, maybe it uses the operating system's language setting.
On Chinese Windows XP it displays correctly; I did not test on English Windows.

With the patch I posted, this file displays correctly in MuPDF.
But other files then display incorrectly, so I think my change is not right.
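
One way to see why forcing a GBK interpretation onto every such font breaks other files is to count characters under that reading. The C sketch below is illustrative only: the helper names are hypothetical, and the byte ranges are the standard GBK double-byte ranges (lead 0x81-0xFE, trail 0x40-0xFE except 0x7F).

```c
#include <assert.h>
#include <stddef.h>

/* Standard GBK double-byte ranges. */
static int is_gbk_lead(unsigned char c)
{
	return c >= 0x81 && c <= 0xFE;
}

static int is_gbk_trail(unsigned char c)
{
	return c >= 0x40 && c <= 0xFE && c != 0x7F;
}

/* Hypothetical helper: count the characters in s[0..n) when every valid
 * lead/trail pair is read as one GBK character and every other byte as a
 * single character. Under WinAnsiEncoding the count would simply be n. */
size_t count_as_gbk(const unsigned char *s, size_t n)
{
	size_t i = 0, chars = 0;

	assert(s != NULL);

	while (i < n)
	{
		if (is_gbk_lead(s[i]) && i + 1 < n && is_gbk_trail(s[i + 1]))
			i += 2; /* double-byte character */
		else
			i += 1; /* single byte */
		chars++;
	}
	return chars;
}
```

The four high bytes from the S22PDF /Producer string (B9 F9 C1 A6) come out as two characters, as intended. But the WinAnsi bytes for "naïve" (6E 61 EF 76 65) come out as four characters instead of five, because EF 76 is also a valid GBK pair. The bytes alone do not say which reading is right, which matches the observation that other files display incorrectly with the patch applied.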
Comment 12 szfong 2010-12-17 08:24:16 UTC
I've also encountered some files created by s22pdf V1.0 which forced me to change the locale in Win7 to Chinese (PRC); I had to reboot before they displayed properly.

Rather than creating a "patch" for Ghostscript, I'm wondering what set of commands I can issue that will add the proper encoding/cmap mentioned by "akay" and allow us to substitute a different font in its place, if applicable.

A batch file which runs a Ghostscript filter, bad.pdf --> font_substitute.ps --> encoded_cmapped.ps --> GOOD.PDF, would solve all our issues concerning buggy PDFs created by s22pdf V1.0. That way the new PDF would once again be portable and viewable by all.
Comment 13 Tor Andersson 2010-12-17 14:40:46 UTC
The files generated by s22pdf are broken and violate the spec.
Adobe renders this file differently depending on locale.
It only works as expected with a Chinese locale.

Locale dependent behavior is not a route we're willing to go with MuPDF.
I cannot see a reasonable workaround for this that will work in all
instances, without breaking any other files.

My suggestion is if you have an s22pdf file: open it in Chinese
Windows and "print to pdf" to redistill and create a non-broken file.