Created attachment 6482 (includes the PDF, error.jpg and ok.jpg). Please test lenin.01.pdf. I opened it with the newest MuPDF and the text does not display; you can see the effect in the attached pictures. I think the reason is that the document does not include the font.
The PDF file specifies that the font should use WinAnsiEncoding. There is not much we can do if the file lies to us about which encoding it uses. You should report a bug against the PDF creation software, which identifies itself as: /Producer (S22PDF V1.0 \271\371\301\246\(C\) pdf@home.icm.ac.cn). None of the PDF software I tested with (Apple Preview, Poppler, GS) draws this file with Chinese characters.
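As a side note on the /Producer string quoted above: its four octal escapes are high bytes, and a quick check shows they fall inside the GBK double-byte ranges. The sketch below is only an illustration under stated assumptions. `pdf_unescape` and `is_gbk_pair` are hypothetical helper names (not MuPDF functions), the unescaper handles only octal escapes and escaped literal characters, and a range check does not prove what the bytes were meant to spell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode \ddd octal escapes and escaped literal
 * characters such as \( \) \\ from the body of a PDF literal string.
 * Simplification: \n, \r, \t, \b, \f and line continuations are not
 * handled. Returns the number of bytes written to out. */
size_t pdf_unescape(const char *src, unsigned char *out, size_t cap)
{
    size_t n = 0;
    assert(src != NULL && out != NULL);
    while (*src && n < cap) {
        if (*src != '\\') {
            out[n++] = (unsigned char)*src++;
            continue;
        }
        src++; /* skip the backslash */
        if (*src >= '0' && *src <= '7') {
            unsigned v = 0;
            int digits = 0;
            /* up to three octal digits form one byte */
            while (digits < 3 && *src >= '0' && *src <= '7') {
                v = v * 8 + (unsigned)(*src++ - '0');
                digits++;
            }
            out[n++] = (unsigned char)(v & 0xFF);
        } else if (*src) {
            out[n++] = (unsigned char)*src++; /* keep the escaped char */
        }
    }
    return n;
}

/* Hypothetical helper: 1 if (lead, trail) lies in the GBK double-byte
 * ranges: lead 0x81-0xFE, trail 0x40-0xFE excluding 0x7F. */
int is_gbk_pair(unsigned char lead, unsigned char trail)
{
    return lead >= 0x81 && lead <= 0xFE &&
           trail >= 0x40 && trail <= 0xFE && trail != 0x7F;
}
```

Feeding it the string body `S22PDF V1.0 \271\371\301\246\(C\)` yields the bytes B9 F9 C1 A6 after the version number, and both pairs pass `is_gbk_pair`. That is consistent with a tool that writes locale-encoded bytes without declaring them as such.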
Hi, I made a patch to view the buggy PDF. (Reference: http://code.google.com/p/sumatrapdf/issues/detail?id=844)

--- c:/pdf_fontfile.c	Wed Jul 14 19:02:22 2010
+++ d:/MyProgram/MuPdf/mupdf/pdf_fontfile.c	Tue Jul 20 17:35:17 2010
@@ -216,6 +216,9 @@
 			fz_warn("unknown cid collection: %s", collection);
 	}
 
+	if (strlen(fontname) == 0)
+		return loadsystemcidfont(fontdesc, GB, MINCHO);
+
 	if (isscript)
 		name = "Chancery";
 
--- c:/pdf_font.c	Wed Jul 14 19:02:22 2010
+++ d:/MyProgram/MuPdf/mupdf/pdf_font.c	Tue Jul 20 17:29:56 2010
@@ -456,6 +456,12 @@
 		}
 	}
 
+	error = pdf_loadsystemcmap(&fontdesc->encoding, "GBK-EUC-H");
+	error |= pdf_loadsystemcmap(&fontdesc->tounicode, "Adobe-GB1-UCS2");
+	error |= pdf_loadsystemcmap(&fontdesc->tottfcmap, "Adobe-GB1-UCS2");
+	if (!error)
+		goto skipencoding;
+
 	/* try to reverse the glyph names from the builtin encoding */
 	for (i = 0; i < 256; i++)
 	{
@@ -483,7 +489,7 @@
 	error = pdf_loadtounicode(fontdesc, xref, estrings, nil, fz_dictgets(dict, "ToUnicode"));
 	if (error)
 		goto cleanup;
-
+skipencoding:
 	/*
 	 * Widths
 	 */
That is a one-off hack for one buggy file which no other PDF viewer can view properly. Not even Adobe Reader 9. I'm sorry, but the answer is still no.
That's one ugly patch. I had a look around for "S22PDF V1.0" and "pdf@home.icm.ac.cn"; apparently that contact/maintenance info is no longer valid (or not reachable from the West). OTOH, S22PDF seems to be rather popular in China, and many web sites seem to have auto-generated PDFs from that source, so this is a rather unfortunate situation where a broken PDF generation tool has caught on in a big part of the world. I can't believe it has caught on to that extent without a corresponding viewer that can display its output, so maybe the reporter can tell me which viewer(s) are used to view these files successfully? Just out of curiosity.
Created attachment 6516 (foxitreader): the file viewed with Foxit Reader (http://www.foxitsoftware.com/?Language=en).
Created attachment 6517 (Acrobat Professional view): Adobe Acrobat Professional 7.0 displays it correctly.
Hi, I can view it with Foxit Reader (http://www.foxitsoftware.com/?Language=en), and Adobe Acrobat Professional 7.0 also displays it correctly. You can see the screenshots in the attachments foxitreader.jpg and Acrobat.jpg. Please check them.
Are you sure you're using the same file? I've tried with Adobe Reader 7, 8 and 9: they all show dots. I've tried Foxit Reader 3.3 and 4: they show the same as MuPDF.
After changing the Windows locale to Chinese (PRC) I can see the Chinese text in both Foxit and Reader. What happened to the P in Portable Document Format!?
I installed Foxit under wine/linux, and it took both of these things to display the broken PDF:
- install the Foxit CJK pack
- run wine in the Simplified Chinese locale (LC_ALL=zh_CN wine "Foxit Reader.exe")
Apparently Foxit uses the Windows system locale to guess what "winansi" should be. It appears that Adobe Professional does the same thing: it detects that it is running under localised Chinese Windows and makes a guess from the system locale if "winansi" isn't really winansi. That's a very dubious thing to do, guessing the encoding of PDFs from the Windows locale setting.
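To make the objection concrete, here is a minimal sketch of the kind of locale-based guess described above. It is an assumption about how such a heuristic could look, not the actual logic of Foxit or Adobe. `guess_cmap_from_locale` is a hypothetical name, and "GBK-EUC-H" is simply the CMap name used in the patch posted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: choose the CMap a locale-guessing viewer might
 * apply to a non-embedded font that declares WinAnsiEncoding.
 * Returns NULL to mean "trust what the file says". */
const char *guess_cmap_from_locale(const char *declared, const char *locale)
{
    assert(declared != NULL);

    /* Only second-guess fonts that claim WinAnsiEncoding. */
    if (strcmp(declared, "WinAnsiEncoding") != 0)
        return NULL;

    /* A Simplified Chinese locale makes the guess come out as GBK. */
    if (locale != NULL && strncmp(locale, "zh_CN", 5) == 0)
        return "GBK-EUC-H";

    return NULL;
}
```

With this logic the same file yields "GBK-EUC-H" under zh_CN and NULL under en_US, so identical bytes render differently on two machines. A genuine WinAnsi document opened under zh_CN would also be misread, which is the portability problem in a nutshell.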
Hi, maybe it uses the operating system's language setting. On Chinese Windows XP it should display correctly; I did not test on English Windows. After making the change in the patch I posted, the file displays correctly in MuPDF, but other files then display incorrectly. So I think my change is not right.
I've also encountered some files created by s22pdf V1.0 that forced me to change the locale in Win7 to Chinese (PRC); I had to reboot before they displayed properly. Rather than creating a "patch" for Ghostscript, I'm wondering what set of commands I can issue to add the proper encoding/cmap mentioned by "akay" and to substitute a different font in its place, if applicable. A batch file that runs a Ghostscript filter chain, bad.pdf --> font_substitute.ps --> encoded_cmapped.ps --> GOOD.PDF, would solve all our issues with the buggy PDFs created by s22pdf V1.0. That way the new PDF would once again be portable and viewable by all.
The files generated by s22pdf are broken and violate the spec. Adobe renders this file differently depending on locale; it only works as expected with a Chinese locale. Locale-dependent behavior is not a route we're willing to go down with MuPDF. I cannot see a reasonable workaround for this that will work in all instances without breaking other files. My suggestion, if you have an s22pdf file: open it in Chinese Windows and "print to PDF" to redistill it and create a non-broken file.