Bug 699856

Summary: Attempting to open a carefully crafted PDF file results in long-running computation
Product: Ghostscript Reporter: Shawn Rasheed <unshorn>
Component: Security (public)Assignee: Chris Liddell (chrisl) <chris.liddell>
Status: RESOLVED FIXED QA Contact: gs-security
Severity: normal    
Priority: P4 CC: cbuissar
Version: unspecified   
Hardware: PC   
OS: Linux   
Customer: Word Size: 64
Attachments: The crafted PDF file

Description Shawn Rasheed 2018-10-03 07:29:08 UTC
Created attachment 15743 [details]
The crafted PDF file

Attempting to open a carefully crafted PDF results in a long-running computation. The page tree nodes are deeply nested where a child page tree node may be a descendent of multiple parents.

It was tested on 9.25 and commit: cbdc54055b7db024951daf3dcb3cafe0af458e47 from the repo, http://git.ghostscript.com/ghostpdl.git
Comment 1 Ken Sharp 2018-10-03 12:58:13 UTC
I'm afraid I don't see a way to address this in the current PDF interpreter. The reason this takes such a long time to run is because before we start processing the page contents we first check the page tree for consistency, to ensure it does not have any recursive elements.

We have to do this, because we've found files in the past that do contain recursion in the page tree, which also leads to a (genuinely endless) loop.

Now in this case what we have is, essentially, a gigantic tree. Its not presented as such, because at each level it reuses existing entries. Nevertheless, when fully resolved, this tree has 2^25 leaf nodes.

The PDF interpreter has to check the path from the root of the tree to every node in order to make sure there is no recursion in the tree. This does indeed take time, lots of time. Given sufficient time I believe this will eventually complete.

But its trivial to produce a PostScript file which has the same effect.
Comment 2 Ken Sharp 2018-10-03 14:23:36 UTC
Actually, I've had a thought....
Comment 3 Ken Sharp 2018-10-03 16:02:40 UTC
OK bright idea seems to actually work. Fixed in commit 0a7e5a1c309fa0911b892fa40996a7d55d90bace
Comment 4 Shawn Rasheed 2018-10-03 17:52:26 UTC
Hi Ken,

I have not had a look at the fix yet. But my understanding is the call tree grows exponentially because the check revists the same nodes over and over again.
Comment 5 Ken Sharp 2018-10-04 07:05:25 UTC
(In reply to Shawn Rasheed from comment #4)
> Hi Ken,
> I have not had a look at the fix yet. But my understanding is the call tree
> grows exponentially because the check revists the same nodes over and over
> again.

Umm yes, and then again, no...

The exponentiallity is due to the fact that the pages tree grows exponentially with each layer you add. Your file is constructed somewhat like this (numbers are the object numbers in the PDF file):

                              /      \
                             6        7
                            / \      / \
                           8   9    8   9
                          /     \  /     \
                         10    11 10     11
                        3        3 3      3

So the check revisits the same nodes because it is descending the tree to the leaf node for every path in the tree. Adding another level to the tree in this construction simply means adding another intermediate node, but it doubles the number of leaf nodes, hence the exponential increase.

To avoid that we would have to note that the intermediate nodes have been previously encountered (but not in a circular reference), and have been fully resolved. Not an easy task in PostScript, especially to retrofit to an existing PostScript program.
Comment 6 Shawn Rasheed 2018-11-27 01:25:31 UTC
CVE identifier for this bug: