691903 – fz_textextractspan: calculate ascender/descender per glyph

Summary: fz_textextractspan: calculate ascender/descender per glyph

Status:	CONFIRMED

Alias:	None

Product:	MuPDF
Classification:	Unclassified
Component:	fitz
Version:	unspecified
Hardware:	PC Windows 7

Importance:	P4 enhancement
Assignee:	MuPDF bugs

URL:	http://code.google.com/p/sumatrapdf/i...
Keywords:

Depends on:
Blocks:

Reported:	2011-01-23 15:08 UTC by zeniko
Modified:	2018-08-28 06:55 UTC
CC List:	2 users

See Also:
Customer:
Word Size:	---

Description zeniko 2011-01-23 15:08:28 UTC

If a font has a few very large glyphs, the bbox sized for those large glyphs is used for the smaller glyphs as well, because the ascender and descender are taken per font rather than per glyph. E.g. the FiguralBookPlain font in http://www.maps.org/news-letters/v20n2/v20n2-bulletin_full.pdf results in far too large bboxes for most of the first page's text. To reproduce, open that document in pdfview and search for text that appears on the first page.
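A minimal sketch of the effect, using toy types rather than MuPDF's own structures. The names `box`, `font_wide_box`, `per_glyph_box` and `box_height` are made up for illustration, and all numbers are in em units chosen by hand, not measured from the PDF above.

```c
#include <assert.h>

/* Toy rectangle type for illustration; not a MuPDF structure. */
typedef struct { float x0, y0, x1, y1; } box;

/* Box derived from font-wide metrics: every glyph of the font gets the
 * same vertical extent, whatever its own outline looks like. */
box font_wide_box(float adv, float ascender, float descender)
{
	box b = { 0.0f, descender, adv, ascender };
	assert(ascender >= descender);
	return b;
}

/* Box derived from the glyph's own ink extent (ymin..ymax). */
box per_glyph_box(float adv, float ymin, float ymax)
{
	box b = { 0.0f, ymin, adv, ymax };
	assert(ymax >= ymin);
	return b;
}

/* Height of a box. */
float box_height(box b)
{
	return b.y1 - b.y0;
}
```

With an ascender of 3.0 and a descender of -0.5 forced by one oversized glyph, a small glyph whose ink spans 0 to 0.5 gets a box of height 3.5 from `font_wide_box`, against 0.5 from `per_glyph_box`.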

Comment 1 Tor Andersson 2012-01-12 00:17:42 UTC

You can now get individual bounding boxes for glyphs with fz_bound_glyph.
This isn't exposed in the text device yet, but will be once I update and
merge the text branch.

Comment 2 Tor Andersson 2012-07-20 12:10:02 UTC

Here is an example of using fz_bound_glyph to compute per-glyph bboxes. I'm not convinced that this is better, though. Another approach would be to distrust the FreeType ascender/descender fields and compute them from some standard glyph instead.

--- a/fitz/dev_text.c
+++ b/fitz/dev_text.c
@@ -415,10 +415,14 @@ fz_text_extract(fz_context *ctx, fz_text_device *dev, fz_text *text, fz_matrix c
                        adv = ftadv / 65536.0f;
                        fz_unlock(ctx, FZ_LOCK_FREETYPE);
 
+#ifdef REAL_GLYPH_BBOXES
+                       rect = fz_bound_glyph(ctx, font, text->items[i].gid, fz_identity);
+#else
                        rect.x0 = 0;
                        rect.y0 = descender;
                        rect.x1 = adv;
                        rect.y1 = ascender;
+#endif
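
For illustration only (not part of the patch above): the other approach mentioned in this comment, deriving the ascender and descender from reference glyphs, might look roughly like the sketch below. It deliberately avoids the MuPDF and FreeType APIs. `box`, `vmetrics`, `metrics_from_reference_glyphs` and the toy glyph table are all hypothetical, and which glyphs count as "standard" is an assumption (here one tall glyph and one glyph with a descender).

```c
#include <stddef.h>

/* Toy types for illustration; not MuPDF structures. */
typedef struct { float x0, y0, x1, y1; } box;
typedef struct { float ascender, descender; } vmetrics;

/* Derive vertical metrics from two reference glyphs instead of trusting
 * the font-wide fields. `bboxes` is indexed by glyph id. If a reference
 * glyph is missing or has no ink, the fallback value is kept. */
vmetrics metrics_from_reference_glyphs(const box *bboxes, size_t count,
	size_t tall_gid, size_t deep_gid, vmetrics fallback)
{
	vmetrics m = fallback;
	if (tall_gid < count && bboxes[tall_gid].y1 > bboxes[tall_gid].y0)
		m.ascender = bboxes[tall_gid].y1;
	if (deep_gid < count && bboxes[deep_gid].y1 > bboxes[deep_gid].y0)
		m.descender = bboxes[deep_gid].y0;
	return m;
}

/* Toy font: gid 0 is a 'd'-like glyph, gid 1 a 'p'-like glyph and
 * gid 2 an oversized ornament that inflates the font-wide metrics. */
vmetrics demo_reference_metrics(void)
{
	static const box toy[] = {
		{ 0.0f,  0.0f,  0.5f, 0.75f },
		{ 0.0f, -0.25f, 0.5f, 0.5f },
		{ 0.0f, -1.0f,  2.0f, 3.0f },
	};
	vmetrics font_wide = { 3.0f, -1.0f };
	return metrics_from_reference_glyphs(toy, 3, 0, 1, font_wide);
}
```

In the toy font the font-wide metrics (3.0, -1.0) are replaced by the extents of the two reference glyphs (0.75, -0.25), so the ornament no longer affects the boxes of ordinary text.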

Comment 3 zeniko 2012-07-21 20:26:46 UTC

(In reply to comment #2)
> Here is an example of using fz_bound_glyph to compute per glyph bboxes.

Thanks. A better example (IMO) can be found in our patchset, though: there we use fz_bound_glyph conditionally, with additional fiddling, to get better results than either of your two suggestions.
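
The patchset is not reproduced in this report, so the following is only a guess at what a conditional policy could look like, not the actual code. It assumes the font-wide box is kept whenever its height is plausible, and the glyph's own vertical extent is used only when the font-wide height exceeds a threshold and the glyph has ink. `box`, `make_box`, `box_is_empty`, `choose_glyph_box` and the `max_height` threshold are all hypothetical.

```c
/* Toy rectangle type for illustration; not a MuPDF structure. */
typedef struct { float x0, y0, x1, y1; } box;

box make_box(float x0, float y0, float x1, float y1)
{
	box b = { x0, y0, x1, y1 };
	return b;
}

/* A glyph without ink (e.g. a space) has an empty bbox. */
int box_is_empty(box b)
{
	return b.x1 <= b.x0 || b.y1 <= b.y0;
}

/* Hypothetical policy: the horizontal extent always runs from 0 to the
 * advance. The vertical extent comes from the font-wide metrics unless
 * they are taller than max_height (in em units) and the glyph itself
 * has a non-empty bbox, in which case the glyph's extent is used. */
box choose_glyph_box(box glyph, float adv, float ascender, float descender,
	float max_height)
{
	box r = make_box(0.0f, descender, adv, ascender);
	if (ascender - descender <= max_height)
		return r; /* font-wide metrics look plausible */
	if (box_is_empty(glyph))
		return r; /* nothing better is known for this glyph */
	r.y0 = glyph.y0;
	r.y1 = glyph.y1;
	return r;
}
```

The empty-bbox check matters because a glyph such as a space has no outline to bound, so a purely per-glyph box would collapse to nothing there.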