Summary: | Normalize unicode text into NFC for better search. | ||
---|---|---|---|
Product: | MuPDF | Reporter: | zeniko |
Component: | mupdf | Assignee: | MuPDF bugs <mupdf-bugs> |
Status: | CONFIRMED --- | ||
Severity: | enhancement | CC: | bullian, christinedelight.top85, sebastian.rasmussen, tor.andersson |
Priority: | P4 | ||
Version: | unspecified | ||
Hardware: | PC | ||
OS: | Windows XP | ||
Customer: | | Word Size: | --- |
Bug Depends on: | |||
Bug Blocks: | 690681 | ||
Attachments: | example with combining characters ('å' as U+02DA 'a') |
Description
zeniko
2010-01-13 11:45:33 UTC
Ligature expansion has also been reported on its own as bug 690681.

The poppler guys kindly link to another test file: https://bugs.freedesktop.org/show_bug.cgi?id=19154

*** Bug 691746 has been marked as a duplicate of this bug. ***

ICU4C has an implementation of "inverse BiDi" visual-to-logical reordering, available at http://icu-project.org/apiref/icu4c/ubidi_8h.html, with a UBiDiReorderingMode from {UBIDI_REORDER_INVERSE_NUMBERS_AS_L, UBIDI_REORDER_INVERSE_LIKE_DIRECT, UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL}.

Created attachment 10913
example with combining characters ('å' as U+02DA 'a')
Fixed:
We expand the standard ligatures.
We expand ligatures using one-to-many ToUnicode CMap tables.
We run a (rudimentary) RTL visual-to-logical reordering pass.

Missing:
Normalizing text into NFC.
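
For illustration only, here is a rough Python sketch of what the missing NFC step could look like for the attached example. It is not MuPDF code. It assumes that the extracted text has the spacing diacritic (U+02DA) immediately before its base letter, as the attachment title suggests, and the helper names `spacing_to_combining` and `normalize_extracted` are made up for this sketch. Plain NFC does not compose U+02DA with 'a', because U+02DA is a spacing character, so the sketch first swaps it for the matching combining mark (U+030A) and moves it behind the base letter.

```python
import unicodedata


def spacing_to_combining(ch):
    """Return the combining mark matching a spacing diacritic, or None.

    Spacing diacritics such as U+02DA (RING ABOVE) have a compatibility
    decomposition of SPACE followed by one combining mark.
    """
    decomp = unicodedata.normalize("NFKD", ch)
    if len(decomp) == 2 and decomp[0] == " " and unicodedata.combining(decomp[1]):
        return decomp[1]
    return None


def normalize_extracted(text):
    """Attach leading spacing diacritics to the next letter, then apply NFC.

    Assumption (not confirmed by the report): the diacritic comes directly
    before its base letter in the extracted text.
    """
    out = []
    i = 0
    while i < len(text):
        mark = spacing_to_combining(text[i])
        has_letter_next = (
            i + 1 < len(text)
            and unicodedata.category(text[i + 1]).startswith("L")
        )
        if mark and has_letter_next:
            # Reorder to base letter + combining mark so NFC can compose them.
            out.append(text[i + 1] + mark)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    result = normalize_extracted("\u02daa")
    print([hex(ord(c)) for c in result])
```

Text that already uses combining marks in the usual order (base letter followed by U+030A) goes straight through the final `unicodedata.normalize("NFC", ...)` call, so a search for the precomposed 'å' would match both forms.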