Bug 707560

Summary: Redaction of Devanagari text incomplete
Product: MuPDF
Reporter: Jorj <jorj.x.mckie>
Component: mupdf
Assignee: MuPDF bugs <mupdf-bugs>
Status: RESOLVED FIXED
Severity: major
CC: lexxwork, robin.watts
Priority: P2
Version: master
Hardware: All
OS: All
Customer:
Word Size: ---

Description Jorj 2024-02-08 22:02:27 UTC
Redacting Devanagari text will leave behind text artifacts.
Please see ZIP file https://github.com/pymupdf/PyMuPDF/files/14215062/redact.zip
for an example text with redaction, the C-source and the output PDF.
Comment 1 Oleksii 2024-02-20 15:54:49 UTC
Greetings!
Development of our project has stopped because artifacts remain after deleting text in Bengali.
Is there any way to speed up at least the triage of this bug?
How much would it cost to get this issue fixed?
Comment 2 Robin Watts 2024-02-21 16:41:59 UTC
Fixed with:

commit d3f3ecd2a5e2bd541187bcef0e5dd970c78e4ee8
Author: Robin Watts <Robin.Watts@artifex.com>
Date:   Wed Feb 21 12:24:06 2024 +0000

    Bug 707560: Tweak text redaction logic.

    Previously, when we were asked to see whether a character should
    be redacted or not, we found its bbox, intersected that with the
    redaction annotation, and took a non-empty result to mean that
    there was overlap.

    For 'zero size' bboxes, such as diacritics (which sometimes have
    an advance of 0), this would mean they could never overlap.

    So, we modify the logic here. We still intersect the bbox with the
    redaction annotation, but now consider whether the result is a
    valid rectangle or not. Invalid rectangles do not overlap. Valid
    rectangles (which can include zero area ones) do.
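
The difference between the two tests described in the commit message can be illustrated with a small standalone sketch. This is not the MuPDF source: the `rect` type and the helpers `rect_intersect`, `overlaps_old` and `overlaps_new` are hypothetical names chosen for illustration, and the sketch assumes "valid" means the edges are not crossed (x0 <= x1 and y0 <= y1), while "non-empty" means strictly positive width and height.

```c
/* Hypothetical stand-in for a rectangle type; not MuPDF's own struct. */
typedef struct { float x0, y0, x1, y1; } rect;

/* Intersect two rectangles. If they do not overlap, the result has
 * crossed edges (x0 > x1 or y0 > y1), i.e. it is "invalid". */
static rect rect_intersect(rect a, rect b)
{
    rect r;
    r.x0 = a.x0 > b.x0 ? a.x0 : b.x0;
    r.y0 = a.y0 > b.y0 ? a.y0 : b.y0;
    r.x1 = a.x1 < b.x1 ? a.x1 : b.x1;
    r.y1 = a.y1 < b.y1 ? a.y1 : b.y1;
    return r;
}

/* Old test (as described above): overlap only if the intersection
 * is non-empty, so a zero-width glyph bbox can never match. */
static int overlaps_old(rect glyph, rect redaction)
{
    rect r = rect_intersect(glyph, redaction);
    return r.x0 < r.x1 && r.y0 < r.y1;
}

/* New test (as described above): overlap if the intersection is a
 * valid rectangle, which includes zero-area ones. */
static int overlaps_new(rect glyph, rect redaction)
{
    rect r = rect_intersect(glyph, redaction);
    return r.x0 <= r.x1 && r.y0 <= r.y1;
}
```

With a redaction area of {100, 100, 200, 120}, a zero-advance diacritic whose bbox is {150, 105, 150, 115} is missed by `overlaps_old` but caught by `overlaps_new`, while a zero-width bbox at x = 300 is rejected by both. The coordinates are made up for the example.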