Bug 693653

Summary: Slow rendering of shading 6, v8.71 works well.
Product: Ghostscript Reporter: Rodolfo <bopodelvalle>
Component: PDF InterpreterAssignee: Alex Cherepanov <alex>
Status: NOTIFIED WONTFIX    
Severity: blocker CC: robin.watts
Priority: P4    
Version: 9.07   
Hardware: PC   
OS: Linux   
Customer: Word Size: ---
Attachments: File with problems
0001-Bug693653-Attempt-to-speed-shadings.patch

Description Rodolfo 2013-02-20 21:54:59 UTC
Created attachment 9312 [details]
File with problems

I am trying to print this file with a Gutenprint driver. I was using the v8.71 of ghostscript and the printing was ok. When I updated the gs version to 9.07 it doesnt print. I get stuck on "processing" and the printer throws pages with random characters. When going back to the v8.71 it works ok..

I use lp to print. Searching in the debug I found that it uses this>

gs -dFirstPage=1  -sDEVICE=ijs -sIjsServer=hpijs -dDEVICEWIDTHPOINTS=612 -dDEVICEHEIGHTPOINTS=792 -sDeviceManufacturer="HEWLETT-PACKARD" -sDeviceModel="HP LaserJet" -dDuplex=true -dTumble=false -r300 -sIjsParams=Quality:Quality=0,Quality:ColorMode=0,Quality:MediaType=0,Quality:PenSet=0,PS:MediaPosition=7 -dIjsUseOutputFD -sOutputFile=\|lpr enac1.pdf 

To reproduce the problem, just copy this command, 9.07 will get stuck and 8.71 will do it.

Thanks
Comment 1 Alex Cherepanov 2013-02-21 07:20:14 UTC
This command is missing -dNOPAUSE -dBATCH options, which makes gs to wait
for the stdin input.

After adding these options, the command works (tested on 64-bit Linux, gs 9.07)
and finishes in 250 seconds.
Comment 2 Rodolfo 2013-02-21 18:12:33 UTC
The command is just an example using other driver (PCL3) and not gutenprint. If you try to print the document with gutenprint driver it doesnt print with the new version, and with the old version is very fast
Comment 3 Marcos H. Woehrmann 2013-02-22 00:57:59 UTC
My testing shows that 8.71 processes the file in 4 seconds, so it's a bit faster than 9.07 :-)
Comment 4 Rodolfo 2013-02-22 17:51:13 UTC
So what should I do? Its not acceptable a wait time of 250s when 8.71 takes only 4s...
Comment 5 Alex Cherepanov 2013-02-25 06:52:19 UTC
The problem is independent of the ijs driver and can be reproduced as
gs -r300 -sDEVICE=bmp16m -o /dev/null  enac1.pdf

It is caused by inefficient implementation of decompose_linear_color()
that splits shading region recursively, recalculating the same
values many times. one of the most expensive call it does is color
conversion of 1 value at a time using CMS services.

Apparently, decomposition proceeds until it hits decomposition_limit
when color doesn't meet some initial constrains that are independent
of the shading interval.

v.8.71 and lower are significantly faster than v.9.00 and higher.
However, v.9.00 and lower fail with SEGV on page 4. Still it's possible
to find which revision slowed gs down.
Comment 6 Robin Watts 2013-02-25 11:23:47 UTC
Sounds like it might be related to bug 687731.
Comment 7 Rodolfo 2013-02-25 19:12:43 UTC
Is there any way to avoid this problem? I need to print files in black & white, so I dont care much about colors.

(In reply to comment #5)
> The problem is independent of the ijs driver and can be reproduced as
> gs -r300 -sDEVICE=bmp16m -o /dev/null  enac1.pdf
> 
> It is caused by inefficient implementation of decompose_linear_color()
> that splits shading region recursively, recalculating the same
> values many times. one of the most expensive call it does is color
> conversion of 1 value at a time using CMS services.
> 
> Apparently, decomposition proceeds until it hits decomposition_limit
> when color doesn't meet some initial constrains that are independent
> of the shading interval.
> 
> v.8.71 and lower are significantly faster than v.9.00 and higher.
> However, v.9.00 and lower fail with SEGV on page 4. Still it's possible
> to find which revision slowed gs down.
Comment 8 Robin Watts 2013-02-25 19:31:55 UTC
(In reply to comment #7)
> Is there any way to avoid this problem? I need to print files in black &
> white, so I dont care much about colors.

You could try building with MAX_SHADING_RESOLUTION defined to 72.

On unix this would be with something like:

  make gs XCFLAGS="-DMAX_SHADING_RESOLUTION=72"

If you look at the comments at the start of gs/base/gxshade6.c you'll see the reasoning behind this explained.
Comment 9 Rodolfo 2013-02-25 21:56:30 UTC
I tried what you say. This are the results:
9.07 with MAX_SHADING_RESOLUTION defined to 72.
real    1m45.587s
user    1m43.158s
sys    0m2.140s

8.71
real    0m56.173s
user    0m53.727s
sys    0m1.844s

9.07
real    29m11.071s
user    28m17.390s
sys    0m39.858s

BUT, when printing with Gutenprint, it never print with 9.07 (when normal and when MAX_SHADING_RESOLUTION defined to 72), but it works fast with 8.71.
I tried with the modified version of the 9.07 and the state of the printer (seen in localhost:631) has been for more than an hour in "processing" 

(In reply to comment #8)
> (In reply to comment #7)
> > Is there any way to avoid this problem? I need to print files in black &
> > white, so I dont care much about colors.
> 
> You could try building with MAX_SHADING_RESOLUTION defined to 72.
> 
> On unix this would be with something like:
> 
>   make gs XCFLAGS="-DMAX_SHADING_RESOLUTION=72"
> 
> If you look at the comments at the start of gs/base/gxshade6.c you'll see
> the reasoning behind this explained.
Comment 10 Robin Watts 2013-02-26 00:27:04 UTC
(In reply to comment #5)
> It is caused by inefficient implementation of decompose_linear_color()
> that splits shading region recursively, recalculating the same
> values many times. one of the most expensive call it does is color
> conversion of 1 value at a time using CMS services.

Where are we recalculating the same values many times? I can see that we could probably halve the number of calls to patch_color_to_device_color_inline, but this goes through a 1 place cache anyway, so we'd probably not gain any CMS calls by doing so.

> Apparently, decomposition proceeds until it hits decomposition_limit
> when color doesn't meet some initial constrains that are independent
> of the shading interval.

Decomposition continues until we hit a single line height, or until either:

1)   !pfs->unlinear &&                          <- Constant
     pfs->linear_color                          <- Changes during recursion

2)   pfs_unlinear &&                            <- Constant
     pfs->linear_color &&                       <- Changes during recursion
     !color_span(pfs, c0, c1) > pfs->smoothness <- Changes during recursion

The fact that pfs->linear_color can change during recursion is not obvious from the code, so easy to miss.

We could probably speed color_span by avoiding floating point divisions, but that's not an algorithmic improvement.

I can't see any obvious wins here - Alex, can you see something I've missed?
Comment 11 Alex Cherepanov 2013-02-26 01:54:23 UTC
One obvious improvement is to convert more color values at once by
keeping them in a single array. This will give about 10% speed-up on
the sample file.

I need to think more about your comment.
Comment 12 Alex Cherepanov 2013-02-26 08:06:43 UTC
Robin, thank you for the hint.

There is a problem in gx_icc_is_linear_in_line() implementation.
The number of destination colorants ndes is calculated from dev_profile
but gsicc_remap_fast() takes the number of color components from the
profile pointed by icclink->link_handle .

Surprisingly, cleaning the buffers before calculating the destination color
    des0[0] = des0[1] = des0[2] = 0;
    des1[0] = des1[1] = des1[2] = 0;
    des01[0] = des01[1] = des01[2] = 0;
broke rendering completely.
Comment 13 Robin Watts 2013-02-26 13:40:28 UTC
Created attachment 9326 [details]
0001-Bug693653-Attempt-to-speed-shadings.patch

This git commit (NOT APPLIED TO MASTER) attempts to use coarser shadings for linear ones. It produces differences that are (to my eyes unacceptable), so I don't think we can use it.
Comment 14 Alex Cherepanov 2013-02-27 07:36:25 UTC
This commit by Michael Vrhel fixes a bug that caused excessive splitting
of the shading region. It makes gs 6.5 times faster on the sample file.

http://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=d6f83c3aec8ad94b013df125f9feae719c52b941;hp=1f4996e6547bb89cbd92868224a2101c773c886b

Now, the dominant time consumer is image interpolation.
Running the file with -dNOINTERPOLATE flag halves the time again.
Comment 15 Robin Watts 2013-02-27 13:57:42 UTC
(In reply to comment #14)
> This commit by Michael Vrhel fixes a bug that caused excessive splitting
> of the shading region. It makes gs 6.5 times faster on the sample file.
> 
> http://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;
> h=d6f83c3aec8ad94b013df125f9feae719c52b941;
> hp=1f4996e6547bb89cbd92868224a2101c773c886b
> 
> Now, the dominant time consumer is image interpolation.
> Running the file with -dNOINTERPOLATE flag halves the time again.

decompose_linear_color barely shows up in my profiling, so I think we're done here.

Rodolpho - reopen the bug if it doesn't solve the problem for you.
Comment 16 Rodolfo 2013-02-27 17:43:35 UTC
Thanks Alex and Robin. The fix made rendering much more faster using the command time /home/ubuntu/Downloads/ghostscript-9.07/bin/gs -r300 -sDEVICE=bmp16m -o /dev/null  /home/ubuntu/Downloads/enac1.pdf

8.71
real	0m56.173s
user	0m53.727s
sys	0m1.844s

9.07 with MAX_SHADING_RESOLUTION defined to 72.
real	1m45.587s
user	1m43.158s
sys	0m2.140s

9.07
real	29m11.071s
user	28m17.390s
sys	0m39.858s

9.07 with fix
real	1m8.860s
user	1m8.020s
sys	0m0.808s

But, the problem when using Gutenprint still there. Try printing using the gutenprint driver. Using the 8.71 version the job state is processing for 2 seconds. When using 9.07 it get stucked in processing.
Comment 17 Robin Watts 2013-02-27 17:56:33 UTC
(In reply to comment #16)
> But, the problem when using Gutenprint still there. Try printing using the
> gutenprint driver. Using the 8.71 version the job state is processing for 2
> seconds. When using 9.07 it get stucked in processing.

OK, so can you find the exact command that gutenprint is calling, and verify that that takes a short time with 8.71 and a long time with 9.07?

You post a command line in the original description, and alex points out that it requires -dNOBATCH -dNOPAUSE to work properly - is gutenprint missing out those options? Maybe that's the problem?

Can you run top in a window and see whether gs is taking any time?
Comment 18 Rodolfo 2013-02-27 20:26:26 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > But, the problem when using Gutenprint still there. Try printing using the
> > gutenprint driver. Using the 8.71 version the job state is processing for 2
> > seconds. When using 9.07 it get stucked in processing.
> 
> OK, so can you find the exact command that gutenprint is calling, and verify
> that that takes a short time with 8.71 and a long time with 9.07?
> 
> You post a command line in the original description, and alex points out
> that it requires -dNOBATCH -dNOPAUSE to work properly - is gutenprint
> missing out those options? Maybe that's the problem?
> 
> Can you run top in a window and see whether gs is taking any time?

This is the command used by the gutenprint driver:

D [27/Feb/2013:17:22:50 -0300] [Job 19] Ghostscript command line: /usr/bin/gs -dQUIET -dPARANOIDSAFER -dNOPAUSE -dBATCH -sDEVICE=cups -sstdout=%stderr -sOutputFile=%stdout -I/usr/share/cups/fonts -sMediaClass=Standard -dDuplex -r300x300 -dDEVICEWIDTHPOINTS=612 -dDEVICEHEIGHTPOINTS=792 -dcupsBitsPerColor=8 -dcupsColorOrder=0 -dcupsColorSpace=0 -dcupsRowFeed=3 -scupsPageSizeName=Letter -c -f -_

the error it seems that is this:

D [27/Feb/2013:17:22:50 -0300] [Job 19] Unrecoverable error: undefined in .uninstallpagedevice
D [27/Feb/2013:17:22:50 -0300] [Job 19] Operand stack:
D [27/Feb/2013:17:22:50 -0300] [Job 19] defaultdevice
Comment 19 Rodolfo 2013-03-12 15:27:55 UTC
Any ideas?