Bug 706718 - Inter-text distance ignored within text line
Summary: Inter-text distance ignored within text line
Status: RESOLVED FIXED
Alias: None
Product: MuPDF
Classification: Unclassified
Component: mupdf
Version: 1.22.0
Hardware: PC Windows 11
Importance: P4 major
Assignee: MuPDF bugs
Reported: 2023-05-11 10:36 UTC by Jorj
Modified: 2023-05-11 14:39 UTC

Attachments
testcase (17.04 KB, application/pdf)
2023-05-11 10:37 UTC, Jorj

Description Jorj 2023-05-11 10:36:11 UTC
Starting with version 1.22.0, large inter-text distances within a line seem to be ignored and are no longer filled with spaces or line breaks.
The command "mutool draw -o 2.txt 2.pdf" with the attached test PDF produces this output in v1.21.0:

[10] Yu Guo, Qiyu Jin, Gabriele Facciolo, Tieyong Zeng, and
Jean-Michel Morel.
Residual learning for effective joint
demosaicing-denoising.
arXiv preprint arXiv:2009.06205,

But this output in version 1.22.0:

[10] Yu Guo, Qiyu Jin, Gabriele Facciolo, Tieyong Zeng, and
Jean-Michel Morel.
demosaicing-denoising.Residual learning for effective joint
arXiv preprint arXiv:2009.06205,
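
For anyone wanting to check a build against this report, here is a rough sketch of a text-level check. It is not part of MuPDF: the helper name lines_in_order and the EXPECTED list are made up for illustration, and the two sample strings are simply the outputs quoted above.

```python
def lines_in_order(text, expected):
    """Return True if each expected string is a whole line of text, in the given order."""
    lines = [ln.strip() for ln in text.splitlines()]
    pos = 0
    for want in expected:
        try:
            # Search only after the previous match so the order is enforced.
            pos = lines.index(want, pos) + 1
        except ValueError:
            return False
    return True


# Hypothetical expectation taken from the v1.21.0 output in this report.
EXPECTED = [
    "Jean-Michel Morel.",
    "Residual learning for effective joint",
    "demosaicing-denoising.",
]

OUTPUT_1_21 = """[10] Yu Guo, Qiyu Jin, Gabriele Facciolo, Tieyong Zeng, and
Jean-Michel Morel.
Residual learning for effective joint
demosaicing-denoising.
arXiv preprint arXiv:2009.06205,
"""

OUTPUT_1_22 = """[10] Yu Guo, Qiyu Jin, Gabriele Facciolo, Tieyong Zeng, and
Jean-Michel Morel.
demosaicing-denoising.Residual learning for effective joint
arXiv preprint arXiv:2009.06205,
"""

print(lines_in_order(OUTPUT_1_21, EXPECTED))
print(lines_in_order(OUTPUT_1_22, EXPECTED))
```

To run it against a real extraction, read the file written by the mutool command instead of the sample strings, e.g. lines_in_order(open("2.txt", encoding="utf-8").read(), EXPECTED).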
Comment 1 Jorj 2023-05-11 10:37:48 UTC
Created attachment 24303
testcase
Comment 2 Robin Watts 2023-05-11 14:39:53 UTC
Fixed with:

commit bc140682ab56188d0bbec06a4572be06ccb406f8
Author: Robin Watts <Robin.Watts@artifex.com>
Date:   Thu May 11 12:14:56 2023 +0100

    Bug 706718: Don't prepend text extracted lines if vertically shifted.

    The bugfix for 706426 was incorrect, in that it did not check for
    text extracted lines being vertically shifted when considering them
    for prepending.

    Fixed here.

Thanks!
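
To illustrate the idea in the commit message, here is a toy Python sketch of "only prepend if not vertically shifted". This is not the MuPDF code (which is C) and does not reflect its data structures. The Line class, the assemble function, the 0.5 tolerance and all coordinates are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    x: float     # left edge of the line (made-up units)
    y: float     # baseline position
    size: float  # font size


def vertically_shifted(a, b, tolerance=0.5):
    """True if the baselines differ by more than a fraction of the font size."""
    return abs(a.y - b.y) > tolerance * min(a.size, b.size)


def assemble(lines, check_vertical=True):
    """Join lines in arrival order.

    A line that starts to the left of the previous one is prepended to it.
    With check_vertical=True the prepend is skipped when the baselines differ,
    which is the kind of check the commit message describes.
    """
    out = []
    for line in lines:
        prev = out[-1] if out else None
        can_prepend = (
            prev is not None
            and line.x < prev.x
            and not (check_vertical and vertically_shifted(prev, line))
        )
        if can_prepend:
            out[-1] = Line(line.text + prev.text, line.x, prev.y, prev.size)
        else:
            out.append(line)
    return [entry.text for entry in out]


# Hypothetical positions: the second fragment starts further left but sits on a lower baseline.
fragments = [
    Line("Residual learning for effective joint", x=200.0, y=100.0, size=10.0),
    Line("demosaicing-denoising.", x=50.0, y=112.0, size=10.0),
]

print(assemble(fragments))                        # check enabled: two separate lines
print(assemble(fragments, check_vertical=False))  # check disabled: fragments merged
```

With the check disabled, the toy model reproduces the merged "demosaicing-denoising.Residual learning for effective joint" line from the description; with it enabled, the two fragments stay on separate lines.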