The document at the URL renders *very* slowly since the no-tree rewriting. Profiling shows that most of the time is spent in duff_4i1o1 (called from pdf_grestore -> fz_drawpopclip -> blendmaskover).
I can confirm that Psychology_p354.pdf shows the same hotspot (70% of CPU on the beagleboard in this function).
Because we did not know the size of the contents being clipped, we were allocating scratch buffers the size of the clip mask rather than the size of the intersection between the clip mask and the contents. The sample PDF sets a new, almost page-sized rectangular clip mask for every individual path drawn. I've implemented a special case for rectangular clip paths: it scissors the bounding boxes instead of rendering the rectangular path to a clip mask.
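For illustration, here is a minimal sketch of the scissoring idea described above. It is not the actual MuPDF code: the `bbox` type and the `bbox_intersect`, `bbox_area` and `push_rect_clip` helpers are hypothetical names made up for this example, and it assumes integer device-space boxes with exclusive right/bottom edges.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical integer bounding box; x1 and y1 are exclusive. */
typedef struct { int x0, y0, x1, y1; } bbox;

static bool bbox_is_empty(bbox r)
{
    return r.x1 <= r.x0 || r.y1 <= r.y0;
}

/* Intersection of two boxes; any empty result is normalized to all zeros. */
static bbox bbox_intersect(bbox a, bbox b)
{
    bbox r;
    r.x0 = a.x0 > b.x0 ? a.x0 : b.x0;
    r.y0 = a.y0 > b.y0 ? a.y0 : b.y0;
    r.x1 = a.x1 < b.x1 ? a.x1 : b.x1;
    r.y1 = a.y1 < b.y1 ? a.y1 : b.y1;
    if (bbox_is_empty(r))
        r.x0 = r.y0 = r.x1 = r.y1 = 0;
    return r;
}

/* Number of pixels a scratch buffer covering this box would need. */
static long bbox_area(bbox r)
{
    if (bbox_is_empty(r))
        return 0;
    return (long)(r.x1 - r.x0) * (long)(r.y1 - r.y0);
}

/*
 * Assumed special case: a rectangular clip path only narrows the current
 * scissor box. No clip mask is rendered and no scratch buffer is allocated.
 */
static bbox push_rect_clip(bbox scissor, bbox clip_rect)
{
    bbox r = bbox_intersect(scissor, clip_rect);
    /* Scissoring can only shrink the drawable region, never grow it. */
    assert(bbox_area(r) <= bbox_area(scissor));
    return r;
}
```

With a 600x800 page, a clip rectangle from (10,10) to (590,790) and a path whose bounding box is 20x30 pixels, a mask sized to the clip would cover 452400 pixels, whereas intersecting the path's box with the scissor leaves only 600 pixels to touch.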
*** Bug 690766 has been marked as a duplicate of this bug. ***