Created attachment 26438 [details] POC Hi, I discovered a bypass for the fix of CVE-2024-46954 in decode_utf8() in base/gp_utf8.c. The current fix addresses UTF-8 overlong encodings and blocks payloads such as \xe0\x81\x9c, which were converted to \x5c before the fix. I found that payloads like \xc1\x9c are still processed: they bypass the fix and allow writing a \x5c, and thus path traversal. This allows arbitrary file read, write and directory listing. With these primitives there is a high chance of RCE, depending on how the system is configured. A complete fix should check that the provided UTF-8 sequence is valid. I've attempted to rewrite the decode_utf8 function so that it blocks all invalid UTF-8 sequences. I'm attaching an EPS file that, when processed, will list the contents of C:\Users\<yourUser>\. You need to change <yourUser> to the Windows user you are currently using. If you want to credit the finding I'd like to be credited as: truff (https://x.com/truffzor). Please let me know if you need help to reproduce the issue or with the fix. Cheers, Sebastien
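To illustrate the arithmetic behind the report, here is a minimal sketch in C. It is not the Ghostscript code: `naive_decode` is a hypothetical helper that only concatenates payload bits and performs no overlong check. It shows that both byte sequences mentioned above collapse to 0x5C, the backslash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from base/gp_utf8.c): decode a 2- or 3-byte
 * UTF-8 sequence by concatenating payload bits, with NO overlong check. */
uint32_t naive_decode(const unsigned char *s, size_t len)
{
    if (len == 2 && (s[0] & 0xE0) == 0xC0)          /* 110xxxxx 10xxxxxx */
        return ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    if (len == 3 && (s[0] & 0xF0) == 0xE0)          /* 1110xxxx 10xxxxxx 10xxxxxx */
        return ((uint32_t)(s[0] & 0x0F) << 12) |
               ((uint32_t)(s[1] & 0x3F) << 6) |
               (uint32_t)(s[2] & 0x3F);
    return 0xFFFD;  /* anything else: Unicode replacement character */
}
```

With this helper, `naive_decode` on the bytes 0xC1 0x9C and on the bytes 0xE0 0x81 0x9C both yield 0x5C, which is why a decoder that accepts either form can be tricked into producing a path separator.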
(In reply to truff from comment #0) > If you want to credit the finding I'd like to be credited as: truff > (https://x.com/truffzor). We will definitely credit the finding, most likely by putting your details on the commit, so thanks for supplying those. I'll talk with one of my colleagues right now, as we were in the middle of a release, and I think we will want to get this included urgently. Thank you very much for the report!
You are welcome and Thanks for the quick reply Ken. I forgot to ask if this would be eligible for a reward as per your bug bounty policy ? Seb
(In reply to truff from comment #2) > You are welcome and Thanks for the quick reply Ken. > I forgot to ask if this would be eligible for a reward as per your bug > bounty policy ? Yes, absolutely, I was going to contact you privately and suggest it. It's not megabucks I'm afraid, we don't have the resources of Google :-) This would be our maximum, which is $1000. We'll need you to do a Contributors Licence Agreement, and then our accounts people will need your bank details. I'll get them to contact you, obviously that's not information which should be public. They are in California, so there will be some delay in getting this sorted out. I'll get the ball rolling after I have a quick shower. While I'm doing that, the CLA can be found here: https://artifex.com/contributor/ I've just noticed that the contact address in that PDF is wrong, I'll find out who should get these now and let you know.
OK my bad, I thought you'd supplied a patch, do you want to do so ? Without a patch the bounty would be halved, but obviously we wouldn't need a CLA.
I have a patch somewhere that I need to look for, if I can't find it I'll do it again. Can this wait a bit ? like until tomorrow ?
(In reply to truff from comment #5) > I have a patch somewhere that I need to look for, if I can't find it I'll do > it again. Can this wait a bit ? like until tomorrow ? It's slightly awkward timing as we are literally doing a release. I've had a quick chat with the developer doing the release, and he says he can apply for a CVE without a fix (in fact he already has), and we can't proceed until we have a CVE anyway, so a day shouldn't be a problem. I have to say that neither myself nor one of my colleagues can trigger an actual problem, even after editing 'target_directory' appropriately though my colleague Robin (who I've assigned this to) says he understands the underlying problem.
(In reply to Ken Sharp from comment #6) > I have to say that neither myself nor one of my colleagues can trigger an > actual problem, even after editing 'target_directory' appropriately though > my colleague Robin (who I've assigned this to) says he understands the > underlying problem. Looks like Bugzilla translated the bytes :-( Changing the bytes back to 0xC1 0x9C makes it exhibit.
It's great you have been able to reproduce it. I was going to ask you to check with xxd or another hex editor. I've found the patch I wrote. I'll test it this afternoon and provide it to you soon.
Created attachment 26439 [details] Fix
I just uploaded a fix. This is the full decode_utf8() function; I believe you will be better than I am at making a .patch file from it. I've tested it quite intensively with different "malicious" payloads and all appear to be blocked. The major change is that any UTF-8 sequence that is not perfectly valid will be blocked. While I've tested it with "malicious" payloads, I did not test it intensively with "normal" payloads, but that should not create any side effects. Anyway, if you have unit tests, it would be nice to give them a pass.
I've asked where to return the CLA to (I'll let you know as soon as I know) and I've asked the accountant to contact you for bank details, you should expect an email to the address here (sebastien.cantos@gmail.com) asking you for your account details. I'm expecting that to be from Christine Lee (christine.lee@artifex.com), if it's from someone else please feel free to contact me (by email if preferred) as a check on the validity. If I get told the email will be from someone else I'll let you know (still waiting for the US to wake up).
Thanks Ken, I'll email the CLA once you confirm the email address to send it to. No urgency at all.
If you feed in a byte sequence that begins with 0xF8 or above, then I believe your patch will return an undefined value. Also, I fear that static analysers such as coverity will complain of dead code because some of the checks are unnecessary. Nonetheless I understand what it's fixing, and I am preparing a new version of the code that incorporates some of the fixes here. Thanks!
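The 0xF8 concern can be pictured with a small sketch, assuming a decoder that first classifies the lead byte. `utf8_seq_len` is a hypothetical helper, not a function from the attached patch or from Ghostscript. The point is only that every lead-byte value needs an explicit outcome, including bytes that can never start a sequence.

```c
/* Hypothetical helper: expected length of a UTF-8 sequence given its
 * lead byte, or 0 if the byte cannot start a sequence. */
int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 0;                   /* 10xxxxxx continuation byte, or 0xF8..0xFF */
}
```

The final `return 0` covers the case raised above, so a caller never works with an unset value. Note that this classification alone does not reject overlong forms: 0xC1 still reports a length of 2, and the overlong test has to be made separately.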
Created attachment 26440 [details] 0001-Bug-708311-Fix-the-fix-for-CVE-2024-46954.patch Proposed commit. Any feedback or suggestions gratefully received.
Fixed with:

commit f14ea81e6c3d2f51593f23cdf13c4679a18f1a3f
Author: Robin Watts <Robin.Watts@artifex.com>
Date:   Tue Mar 4 16:24:33 2025 +0000

    Bug 708311: Fix the fix for CVE-2024-46954.

    The previous fix for CVE-2024-46954 was still failing to spot a certain subset of 2 byte sequences as being overlong.

    1 byte sequences (0xxxxxxx) encode 7 payload bits.

    2 byte sequences (110xxxxx 10xxxxxx) only manage to encode 6 payload bits in the second (lowest) byte. Thus the test for an overlong 2 byte encoding is not "is the value of the payload bits in the first byte 0", but rather "is the value of the payload bits in the first byte smaller than 2".

    Credit for spotting the problem and the initial version of the fix is due to truff (https://x.com/truffzor).

    Another issue spotted, and fixed here, is that it's illegal to encode high/low surrogates within UTF-8 (as the values they represent should be encoded directly).

    Finally, we need 21 bits of coverage to get all possible unicode values. 4 byte UTF-8 encodings give us 21 bits of data as required, but there are values within this 21 bit range that are not valid unicode chars. So spot these and reject them too.

Thanks for the report!
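The three rules in the commit message (overlong forms, surrogates, values beyond the Unicode range) can be sketched as follows. This is an illustrative sketch only, not the code from the commit: `strict_decode_utf8`, its signature, and its return convention (bytes consumed, or 0 on rejection) are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict decoder (NOT Ghostscript's decode_utf8).
 * Returns the number of bytes consumed and stores the code point in *out,
 * or returns 0 if the sequence is not valid UTF-8. */
size_t strict_decode_utf8(const unsigned char *s, size_t len, uint32_t *out)
{
    uint32_t cp, min;
    size_t n, i;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                      /* 1 byte: 7 payload bits */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2 bytes: 5 + 6 payload bits */
        n = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3 bytes: 4 + 6 + 6 payload bits */
        n = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4 bytes: 3 + 6 + 6 + 6 = 21 bits */
        n = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                           /* stray continuation byte or 0xF8..0xFF */
    }

    if (len < n)
        return 0;                           /* truncated sequence */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a 10xxxxxx continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (cp < min)
        return 0;                           /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                           /* UTF-16 surrogate */
    if (cp > 0x10FFFF)
        return 0;                           /* beyond the Unicode range */

    *out = cp;
    return n;
}
```

For 2 byte sequences, the check `cp < 0x80` is equivalent to the "payload bits in the first byte smaller than 2" test from the commit message, because the second byte contributes only 6 bits. Under this sketch, 0xC1 0x9C and 0xE0 0x81 0x9C are both rejected, while a plain 0x5C byte still decodes as a 1 byte sequence.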
Putting this back to 'IN_PROGRESS' (which is going to take two steps, apologies for the noise) because we're still trying to get a CVE for it.
CVE-2025-46646
This fix was included in the 10.05.0 release.