... else every consumer of pdf_loadtextfromtree will have to do it instead. See http://code.google.com/p/sumatrapdf/source/detail?r=1693 for what we currently need to make documents searchable in SumatraPDF.
Ligature expansion has also been reported on its own as bug 690681.
The poppler guys kindly link to another test file: https://bugs.freedesktop.org/show_bug.cgi?id=19154
*** Bug 691746 has been marked as a duplicate of this bug. ***
ICU4C has an implementation of "inverse BiDi" (visual-to-logical reordering), available at http://icu-project.org/apiref/icu4c/ubidi_8h.html. It is selected through a UBiDiReorderingMode of UBIDI_REORDER_INVERSE_NUMBERS_AS_L, UBIDI_REORDER_INVERSE_LIKE_DIRECT or UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL.
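For illustration only, here is a minimal Python sketch of what a rudimentary visual-to-logical pass could look like. It is not ICU's inverse BiDi algorithm and not the code in the tree: it simply reverses each run of strong right-to-left characters (including neutrals enclosed by them) and leaves everything else alone. The function name visual_to_logical is hypothetical, and digits or mixed-direction numbers inside RTL text are not handled correctly by this approach.

```python
import unicodedata

# Bidi classes of strong right-to-left characters (Hebrew is "R", Arabic is "AL").
RTL_CLASSES = {"R", "AL"}
# Classes that end an RTL run in this simplified model.
RUN_BREAKERS = {"L", "EN", "AN"}


def visual_to_logical(line: str) -> str:
    """Reverse each run of RTL characters found in a visually ordered line.

    Neutrals (spaces, punctuation) are reversed with the run only when they
    are enclosed by RTL characters; trailing neutrals stay where they are.
    """
    out = []
    i, n = 0, len(line)
    while i < n:
        if unicodedata.bidirectional(line[i]) not in RTL_CLASSES:
            out.append(line[i])
            i += 1
            continue
        # Extend the run up to the last strong RTL character before a breaker.
        last_rtl = i
        j = i
        while j < n:
            cls = unicodedata.bidirectional(line[j])
            if cls in RTL_CLASSES:
                last_rtl = j
            elif cls in RUN_BREAKERS:
                break
            j += 1
        out.append(line[i:last_rtl + 1][::-1])
        i = last_rtl + 1
    return "".join(out)
```

A real implementation would need the full UAX #9 rules (or ICU's inverse modes) to get numbers and nested direction changes right.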
Created attachment 10913 example with combining characters ('å' as U+02DA 'a')
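To show why plain NFC is not enough for a case like this one, here is a small Python sketch. U+02DA is the spacing RING ABOVE, which NFC leaves untouched, so the sketch first swaps a spacing accent for its combining equivalent and moves it behind the following letter, then normalizes. The helper names are hypothetical, and the assumption that the accent precedes its base letter is taken only from the attachment description above.

```python
import unicodedata


def spacing_to_combining(ch: str) -> str:
    """Return the combining mark behind a spacing accent, or '' if there is none.

    Spacing accents such as U+02DA decompose (compatibility) to SPACE + mark.
    """
    decomp = unicodedata.normalize("NFKD", ch)
    if len(decomp) == 2 and decomp[0] == " " and unicodedata.combining(decomp[1]):
        return decomp[1]
    return ""


def compose_accents(text: str) -> str:
    """Attach a spacing accent to the letter that follows it, then apply NFC."""
    out = []
    i = 0
    while i < len(text):
        mark = spacing_to_combining(text[i])
        next_is_letter = (
            i + 1 < len(text) and unicodedata.category(text[i + 1]).startswith("L")
        )
        if mark and next_is_letter:
            # Assumed order: accent first, base letter second.
            out.append(text[i + 1] + mark)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return unicodedata.normalize("NFC", "".join(out))
```

With this, compose_accents("\u02daa") yields the single code point U+00E5, whereas unicodedata.normalize("NFC", "\u02daa") returns the input unchanged.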
Fixed: We expand the standard ligatures, we expand ligatures using one-to-many ToUnicode CMap tables, and we run a (rudimentary) RTL visual-to-logical reordering pass. Missing: normalizing text into NFC.