Created attachment 18694
Test example

Hello,

Please open the attached PDF document with `mupdf` and search for the word "lorem". The search fails on my Debian Testing system with `mupdf` version 1.15.0. The reason is that `mupdf` extracts the text with many extra spaces, which makes the search impossible:

$ mutool draw -F txt test.pdf | head -1
Lor e m

The attached document was created with `pdflatex`. I raised a discussion on the pdftex mailing list, where the conclusion is probably that `mupdf` should not interpret tiny spaces as real spaces:

https://tug.org/pipermail/pdftex/2019-December/009162.html

Also, this problem is a REGRESSION between version 1.14.0 and version 1.15.0. With version 1.14.0, the search worked and the output was correct:

$ mutool draw -F txt test.pdf | head -1
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.

This regression was also reported against version 1.16.0:

https://bugs.ghostscript.com/show_bug.cgi?id=701602

Olivier
The issue here is with the font metrics of the embedded font, possibly related to FreeType. The advance of most characters in the embedded font, as reported by FreeType, is 0. This makes MuPDF believe that each character is followed by a gap as wide as the character itself, so it inserts an artificial space. Looking at the actual charstrings in the font file, I do see plenty of 'hsbw' instructions with non-zero widths, which leads me to suspect that this is a problem in FreeType.
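To illustrate the mechanism described above, here is a minimal sketch of a gap-based space heuristic. It is not MuPDF's actual code: the `Glyph` type, the `layout` and `extract_text` helpers, the 0.2 threshold, and the glyph widths are all assumptions made up for the example, and which glyphs report a zero advance is chosen arbitrarily.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float        # origin where the PDF actually places the glyph
    advance: float  # advance width reported by the font library


def layout(text, true_widths, reported_advances):
    """Place glyphs using their true widths, but record the reported advances."""
    glyphs = []
    pen = 0.0
    for ch, width, adv in zip(text, true_widths, reported_advances):
        glyphs.append(Glyph(ch, pen, adv))
        pen += width
    return glyphs


def extract_text(glyphs, threshold=0.2):
    """Join glyphs, inserting a space wherever the next glyph starts
    noticeably later than the previous glyph's reported advance predicts.
    The threshold is a hypothetical value, not taken from MuPDF."""
    out = []
    prev = None
    for g in glyphs:
        if prev is not None:
            gap = g.x - (prev.x + prev.advance)
            if gap > threshold:
                out.append(" ")
        out.append(g.char)
        prev = g
    return "".join(out)


# Illustrative widths only; they are not read from the attached font.
widths = [0.6, 0.5, 0.4, 0.45, 0.8]
good = layout("Lorem", widths, widths)
# Assume the library wrongly reports an advance of 0 for 'r' and 'e'.
broken = layout("Lorem", widths, [0.6, 0.5, 0.0, 0.0, 0.8])

print(extract_text(good))    # Lorem
print(extract_text(broken))  # Lor e m
```

With correct advances every gap is zero and the word comes out intact. As soon as an advance is reported as 0, the whole width of that glyph is seen as a gap and a space is inserted after it, which breaks plain-text search.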
*** Bug 701602 has been marked as a duplicate of this bug. ***
*** Bug 701979 has been marked as a duplicate of this bug. ***
The FreeType bug has been reported upstream here: https://savannah.nongnu.org/bugs/?57519
commit 82196fd87d98e3c2412049caf890f675ae802676
Author: Tor Andersson <tor.andersson@artifex.com>
Date:   Wed Jan 8 11:22:52 2020 +0100

    Bug 701977: Workaround for bug 57519 in FreeType.

    FT_Get_Advance has a bug with certain Type1 fonts, because the fast
    metrics parsing function does not handle 'div' operators.

    Disable the fast metrics for Type1 by forcing the use of the old Type1
    engine for our builds. This will not work for builds using the system
    library, and this workaround should be removed as soon as we update to
    a FreeType release with a fix for this bug.
*** Bug 702141 has been marked as a duplicate of this bug. ***
*** Bug 702509 has been marked as a duplicate of this bug. ***
On my Debian system, I upgraded the package "libfreetype6" from version "2.10.1-2" to version "2.10.2+dfsg-4", and searching in MuPDF now works as expected. -- Olivier, with MuPDF version "1.17.0+ds1-1"