Bug 699856 - Attempting to open a carefully crafted PDF file results in long-running computation
Summary: Attempting to open a carefully crafted PDF file results in long-running compu...
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: Security (public) (show other bugs)
Version: unspecified
Hardware: PC Linux
: P4 normal
Assignee: Chris Liddell (chrisl)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-03 07:29 UTC by Shawn Rasheed
Modified: 2019-09-19 06:59 UTC (History)
1 user (show)

See Also:
Customer:
Word Size: 64


Attachments
The crafted PDF file (4.95 KB, application/pdf)
2018-10-03 07:29 UTC, Shawn Rasheed
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Shawn Rasheed 2018-10-03 07:29:08 UTC
Created attachment 15743 [details]
The crafted PDF file

Attempting to open a carefully crafted PDF results in a long-running computation. The page tree nodes are deeply nested where a child page tree node may be a descendent of multiple parents.

It was tested on 9.25 and commit: cbdc54055b7db024951daf3dcb3cafe0af458e47 from the repo, http://git.ghostscript.com/ghostpdl.git
Comment 1 Ken Sharp 2018-10-03 12:58:13 UTC
I'm afraid I don't see a way to address this in the current PDF interpreter. The reason this takes such a long time to run is because before we start processing the page contents we first check the page tree for consistency, to ensure it does not have any recursive elements.

We have to do this, because we've found files in the past that do contain recursion in the page tree, which also leads to a (genuinely endless) loop.

Now in this case what we have is, essentially, a gigantic tree. Its not presented as such, because at each level it reuses existing entries. Nevertheless, when fully resolved, this tree has 2^25 leaf nodes.

The PDF interpreter has to check the path from the root of the tree to every node in order to make sure there is no recursion in the tree. This does indeed take time, lots of time. Given sufficient time I believe this will eventually complete.

But its trivial to produce a PostScript file which has the same effect.
Comment 2 Ken Sharp 2018-10-03 14:23:36 UTC
Actually, I've had a thought....
Comment 3 Ken Sharp 2018-10-03 16:02:40 UTC
OK bright idea seems to actually work. Fixed in commit 0a7e5a1c309fa0911b892fa40996a7d55d90bace
Comment 4 Shawn Rasheed 2018-10-03 17:52:26 UTC
Hi Ken,

I have not had a look at the fix yet. But my understanding is the call tree grows exponentially because the check revists the same nodes over and over again.
Comment 5 Ken Sharp 2018-10-04 07:05:25 UTC
(In reply to Shawn Rasheed from comment #4)
> Hi Ken,
> 
> I have not had a look at the fix yet. But my understanding is the call tree
> grows exponentially because the check revists the same nodes over and over
> again.

Umm yes, and then again, no...

The exponentiallity is due to the fact that the pages tree grows exponentially with each layer you add. Your file is constructed somewhat like this (numbers are the object numbers in the PDF file):

                                  2
                              /      \
                             6        7
                            / \      / \
                           8   9    8   9
                          /     \  /     \
                         10    11 10     11
...
                        3        3 3      3

So the check revisits the same nodes because it is descending the tree to the leaf node for every path in the tree. Adding another level to the tree in this construction simply means adding another intermediate node, but it doubles the number of leaf nodes, hence the exponential increase.

To avoid that we would have to note that the intermediate nodes have been previously encountered (but not in a circular reference), and have been fully resolved. Not an easy task in PostScript, especially to retrofit to an existing PostScript program.
Comment 6 Shawn Rasheed 2018-11-27 01:25:31 UTC
CVE identifier for this bug:
CVE-2018-19478