Created attachment 14646 [details]
Initial file - correctly displaying

Attached are two versions of a file. The first is the original form of the file, which displays correctly. The second is the result of highlighting a word and then saving. This displays with a lot of text missing in MuPDF, although it displays correctly in many other PDF readers. It looks to be a font problem.

Possibly of interest: when the highlighted version was saved, pdf_can_be_saved_incrementally returned false. Also, the original file takes a long time to load, so possibly MuPDF is repairing it but then having problems displaying the repaired form.
Created attachment 14647 [details]
Highlighted version: not displaying correctly
Created attachment 14649 [details] Initial file - correctly displaying
Created attachment 14650 [details]
Highlighted version: not displaying correctly
The highlighted version contains:

569 0 obj
<< /DA<93A4A16D0C1E21B4EEF5ABF8DC634FB0656AE4EB279BED839BB72DA19F913882> /DR<< /Encoding<< /PDFDocEncoding 218 0 R>
endobj

You can see that the /Encoding dictionary is incorrectly terminated with a single '>', and that the /DR dictionary and the object's own dictionary are not terminated at all.
And then the highlight annotation is similarly broken:

654 0 obj
<</Type/Annot/Subtype/Highlight/Rect[197.1445 523.10446 281.65586 541.1651]/F 4/QuadPoints[197.1445 523.10446 281.65586 523.10446 197.1445 541.1651 281.65586 541.1651]/AP<</N 655 0 R>>/T<50FD3A9670390392D9CC6DE463481421EBA
endobj

Notice the /T hex string is not terminated, and the dictionary terminator is missing. This then leads to this object:

601 0 obj
<</Ascent 1113/CIDSet 275 0 R/CapHeight 788/Descent -178/Flags 4/FontBBox[-233 -178 1192 1113]/FontFamily<240376049AB3AF8B6E8118765B985F14B399C550CE7EAF2DAE4BA29F95FB85B4>/FontFile3 276 0 R/FontName/FQINWG+TypelaboNStd-Heavy/FontStretch/Normal/FontWeight 900/Italic
endobj

with the same problems, and so on and so on.....

The file is, bluntly, knackered. The original file does have some problems, /Indexed colour spaces which do not contain enough data in the lookup table, but that is comparatively minor.
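The kind of damage described above (unterminated hex strings, missing '>>') can be spotted mechanically. The following is a minimal sketch, not part of MuPDF: `check_object_body` is a hypothetical helper that scans the text between 'obj' and 'endobj' and reports unbalanced delimiters. It assumes the body contains no literal '(...)' strings, comments or stream data, all of which a real checker would have to skip.

```python
# Whitespace is allowed inside PDF hex strings, so it is accepted here too.
HEX_CHARS = set("0123456789abcdefABCDEF \t\r\n")


def check_object_body(body):
    """Return a list of delimiter problems found in one PDF object body.

    Hypothetical helper for illustration only. It ignores literal strings,
    comments and streams, so it is only safe on simple dictionary objects.
    """
    problems = []
    depth = 0  # number of currently open '<<' dictionaries
    i = 0
    n = len(body)
    while i < n:
        two = body[i:i + 2]
        if two == "<<":
            depth += 1
            i += 2
        elif two == ">>":
            if depth == 0:
                problems.append("unexpected '>>'")
            else:
                depth -= 1
            i += 2
        elif body[i] == "<":
            # A single '<' opens a hex string; it must reach '>' through hex digits only.
            j = i + 1
            while j < n and body[j] in HEX_CHARS:
                j += 1
            if j < n and body[j] == ">":
                i = j + 1
            else:
                problems.append("unterminated hex string")
                i = j
        elif body[i] == ">":
            problems.append("stray '>'")
            i += 1
        else:
            i += 1
    if depth > 0:
        problems.append("%d unclosed dictionary" % depth)
    return problems


if __name__ == "__main__":
    # Shortened version of object 654 quoted above.
    print(check_object_body("<</Type/Annot/AP<</N 655 0 R>>/T<50FD3A "))
```

Run on the shortened annotation object, this reports the same two faults noted in the comment: the /T hex string never closes and one dictionary is left open.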
(In reply to Ken Sharp from comment #5)
> The file is, bluntly, knackered. The original file does have some problems,
> /Indexed colour spaces which do not contain enough data in the lookup table,
> but that is comparatively minor.

That's really strange: the knackered file was produced by editing the original with MuPDF.
I think I know the case that leads to this corruption. The original file is encrypted, but doesn't require a password to be opened. There is an owner password, but decryption and encryption can be performed without it. For some reason saving new content leads to a corrupted file in this case.
The problem can be reproduced by just running the file through 'mutool clean' which also creates a broken file.
commit b59ff4e3956d8e62b6b76a0e23697b15bc8c9467
Author: Tor Andersson <tor.andersson@artifex.com>
Date: Wed Feb 7 17:15:54 2018 +0100

    Fix 698918: Use encryption when doing initial object formatting to count the length.
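One reading of the commit message is that the writer measured each object in its unencrypted form and then emitted the encrypted form into that many bytes. Since encrypted strings are longer, the output would be cut short, which matches the truncated objects quoted earlier. The sketch below is a toy model of that idea and not MuPDF code: `fake_encrypt`, `format_dict` and `write_object` are hypothetical names, and `fake_encrypt` only imitates the size growth of a block cipher (16-byte IV plus padding) without encrypting anything.

```python
def fake_encrypt(data):
    """Imitate AES-CBC output size only: 16-byte IV plus padding to a full block.

    Assumption for illustration; this performs no real encryption.
    """
    pad = 16 - len(data) % 16
    return bytes(16) + data + bytes([pad]) * pad


def format_dict(entries, encrypt):
    """Format a dictionary of byte-string values as PDF hex strings."""
    parts = []
    for key, value in entries.items():
        payload = fake_encrypt(value) if encrypt else value
        parts.append("/%s<%s>" % (key, payload.hex().upper()))
    return "<<" + "".join(parts) + ">>"


def write_object(entries, count_with_encryption):
    """Two-pass writer: pass 1 measures the length, pass 2 emits encrypted text.

    When pass 1 measures the unencrypted form, the encrypted output no longer
    fits and is truncated to the measured size.
    """
    size = len(format_dict(entries, encrypt=count_with_encryption))
    return format_dict(entries, encrypt=True)[:size]


if __name__ == "__main__":
    entries = {"T": b"Ken"}
    print(write_object(entries, count_with_encryption=False))  # truncated
    print(write_object(entries, count_with_encryption=True))   # complete
```

With the length measured on the unencrypted form, the hex string and both closing delimiters are lost; measuring with encryption applied, as the commit title describes, keeps the object intact.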