Bug 707560 - Redaction of Devanagari text incomplete
Status: RESOLVED FIXED
Alias: None
Product: MuPDF
Classification: Unclassified
Component: mupdf
Version: master
Hardware: All All
Importance: P2 major
Assignee: MuPDF bugs
 
Reported: 2024-02-08 22:02 UTC by Jorj
Modified: 2024-02-21 16:41 UTC

Description Jorj 2024-02-08 22:02:27 UTC
Redacting Devanagari text will leave behind text artifacts.
Please see the ZIP file https://github.com/pymupdf/PyMuPDF/files/14215062/redact.zip
for an example text with a redaction, the C source, and the output PDF.
Comment 1 Oleksii 2024-02-20 15:54:49 UTC
Greetings!
Development of the project has stalled because artifacts remain after deleting text in Bengali.
Is there any way to speed up at least the review of this bug?
How much will it cost to solve this issue?
Comment 2 Robin Watts 2024-02-21 16:41:59 UTC
Fixed with:

commit d3f3ecd2a5e2bd541187bcef0e5dd970c78e4ee8
Author: Robin Watts <Robin.Watts@artifex.com>
Date:   Wed Feb 21 12:24:06 2024 +0000

    Bug 707560: Tweak text redaction logic.

    Previously, when we were asked to see whether a character should
    be redacted or not, we found its bbox, intersected that with the
    redaction annotation, and took a non-empty result to mean that
    there was overlap.

    For 'zero size' bboxes, such as diacritics (which sometimes have
    an advance of 0), this would mean they could never overlap.

    So, we modify the logic here. We still intersect the bbox with the
    redaction annotation, but now consider whether the result is a
    valid rectangle or not. Invalid rectangles do not overlap. Valid
    rectangles (which can include zero area ones) do.
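
The difference between the two tests described in the commit message can be sketched as follows. This is a minimal standalone illustration, not the code from the commit: the `rect` type and all function names here are hypothetical stand-ins, and the assumption is that an intersection of non-overlapping rectangles comes out "inverted" (x0 > x1 or y0 > y1), which is what makes it invalid.

```c
/* Hypothetical rectangle type for illustration only; not MuPDF's definitions. */
typedef struct { float x0, y0, x1, y1; } rect;

float maxf(float a, float b) { return a > b ? a : b; }
float minf(float a, float b) { return a < b ? a : b; }

/* Intersect two rectangles. If they do not overlap at all, the result
 * is inverted (x0 > x1 or y0 > y1). */
rect rect_intersect(rect a, rect b)
{
    rect r;
    r.x0 = maxf(a.x0, b.x0);
    r.y0 = maxf(a.y0, b.y0);
    r.x1 = minf(a.x1, b.x1);
    r.y1 = minf(a.y1, b.y1);
    return r;
}

/* Old-style test: the intersection must have strictly positive area. */
int rect_is_nonempty(rect r) { return r.x0 < r.x1 && r.y0 < r.y1; }

/* New-style test: the intersection only has to be a valid rectangle,
 * so zero-width or zero-height results still count. */
int rect_is_valid(rect r) { return r.x0 <= r.x1 && r.y0 <= r.y1; }

/* Overlap decision before the change, as described in the commit message. */
int overlaps_old(rect glyph, rect redaction)
{
    return rect_is_nonempty(rect_intersect(glyph, redaction));
}

/* Overlap decision after the change. */
int overlaps_new(rect glyph, rect redaction)
{
    return rect_is_valid(rect_intersect(glyph, redaction));
}
```

With a redaction rectangle of (10, 10, 100, 30), a zero-advance diacritic whose bbox is (50, 12, 50, 25) lies inside the redaction but has zero width. In this sketch `overlaps_old` rejects it, because the intersection has no area, while `overlaps_new` accepts it. A zero-width bbox at x = 200, outside the redaction, yields an inverted intersection and is rejected by both. One consequence in this sketch is that a bbox that merely touches the edge of the redaction also counts as overlapping under the new test.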