Bug 707523 - Very slow scaling of transparent data
Summary: Very slow scaling of transparent data
Status: RESOLVED INVALID
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: 10.02.1
Hardware: PC Linux
Importance: P2 normal
Assignee: Default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-31 13:34 UTC by iam
Modified: 2024-01-31 15:07 UTC
0 users

See Also:
Customer:
Word Size: ---


Attachments
4 page test PDF file with transparency (699.81 KB, application/gzip)
2024-01-31 13:36 UTC, iam

Description iam 2024-01-31 13:34:25 UTC
When the PDF file is processed with the CUPS pdftopdf program (which adds different scaling and cropping options inside the PDF file as far as I understand), Ghostscript is very slow to convert it into PS.

The processed file is roughly 4 times slower than the original on an Intel desktop, using an example 4-page PDF file:
1 second vs 4.2 seconds

It is also roughly 4 times slower on a low-end ARM Cortex-A7:
11 seconds vs 39 seconds

This file apparently contains transparent operations, as running it with -dNOTRANSPARENCY speeds up processing by 10+ times.

From 39 seconds down to 4 seconds on Cortex-A7
From 4.2 seconds down to 0.18 on Intel

With -dNOTRANSPARENCY, the timings are the same for both the original and the processed file.

Converting the same file with Poppler (pdftops) shows no speed difference between the original file and the pdftopdf-converted file. Either file is processed in 12 seconds on the Cortex-A7.

That's why I assume that the Ghostscript PDF interpreter's scaling operations with transparency involved are very slow.
Comment 1 iam 2024-01-31 13:36:52 UTC
Created attachment 25292
4 page test PDF file with transparency

Tested on Ghostscript 10.0.0~dfsg-11+deb12u3 on Debian 12 and ghostscript-10.02.1-2.fc39.x86_64 on Fedora 39.

price.pdf is a sample 4-page file.
price_fedora.pdf is the same file processed in the following way:

    /usr/lib/cups/filter/pdftopdf 0 0 0 0 0 price.pdf > price_fedora.pdf
Comment 3 iam 2024-01-31 13:39:17 UTC
$ time /usr/bin/gs -q -dNOPAUSE -dBATCH -dSAFER -dNOMEDIAATTRS -sstdout=%stderr -sDEVICE=ps2write -dShowAcroForm -sOUTPUTFILE=%stdout -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray -dLanguageLevel=3 -r600 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -dNOTRANSPARENT /tmp/price.pdf > /dev/null

real    0m10.981s
user    0m10.360s
sys     0m0.570s


$ time /usr/bin/gs -q -dNOPAUSE -dBATCH -dSAFER -dNOMEDIAATTRS -sstdout=%stderr -sDEVICE=ps2write -dShowAcroForm -sOUTPUTFILE=%stdout -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray -dLanguageLevel=3 -r600 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -dNOTRANSPARENT /tmp/price_fedora.pdf > /dev/null

real    0m39.461s
user    0m37.750s
sys     0m1.510s


# Now with -dNOTRANSPARENCY


$ time /usr/bin/gs -q -dNOPAUSE -dBATCH -dSAFER -dNOMEDIAATTRS -sstdout=%stderr -sDEVICE=ps2write -dShowAcroForm -sOUTPUTFILE=%stdout -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray -dLanguageLevel=3 -r600 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -dNOTRANSPARENCY /tmp/price.pdf > /dev/null

real    0m4.719s
user    0m4.440s
sys     0m0.200s


$ time /usr/bin/gs -q -dNOPAUSE -dBATCH -dSAFER -dNOMEDIAATTRS -sstdout=%stderr -sDEVICE=ps2write -dShowAcroForm -sOUTPUTFILE=%stdout -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray -dLanguageLevel=3 -r600 -dCompressFonts=false -dNoT3CCITT -dNOINTERPOLATE -dNOTRANSPARENCY /tmp/price_fedora.pdf > /dev/null

real    0m4.614s
user    0m4.380s
sys     0m0.200s
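
As a quick cross-check, the ratios implied by the `real` times above can be worked out with a few lines of Python. This is only an arithmetic sketch: the dictionary labels and the helper name `slowdown` are made up for illustration, and the numbers are copied from the four runs in this comment.

```python
# "real" wall-clock times in seconds, copied from the four runs above.
# The labels "first_pair" and "notransparency" are illustrative names only:
# "first_pair" is the two runs before the "# Now with -dNOTRANSPARENCY" marker.
REAL_SECONDS = {
    ("price.pdf", "first_pair"): 10.981,
    ("price_fedora.pdf", "first_pair"): 39.461,
    ("price.pdf", "notransparency"): 4.719,
    ("price_fedora.pdf", "notransparency"): 4.614,
}


def slowdown(slow_key, fast_key):
    """Return how many times longer the slow run took than the fast run."""
    return REAL_SECONDS[slow_key] / REAL_SECONDS[fast_key]


# pdftopdf-processed file vs. original file, first pair of runs
processed_vs_original = slowdown(
    ("price_fedora.pdf", "first_pair"), ("price.pdf", "first_pair")
)

# Effect of -dNOTRANSPARENCY on the processed file
processed_gain = slowdown(
    ("price_fedora.pdf", "first_pair"), ("price_fedora.pdf", "notransparency")
)

# Effect of -dNOTRANSPARENCY on the original file
original_gain = slowdown(
    ("price.pdf", "first_pair"), ("price.pdf", "notransparency")
)

print(f"processed vs original:            {processed_vs_original:.2f}x")
print(f"processed, with -dNOTRANSPARENCY: {processed_gain:.2f}x faster")
print(f"original, with -dNOTRANSPARENCY:  {original_gain:.2f}x faster")
```

On these figures the processed file takes about 3.6 times as long as the original, and -dNOTRANSPARENCY makes the processed file about 8.6 times faster and the original about 2.3 times faster.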
Comment 4 Ken Sharp 2024-01-31 14:47:46 UTC
(In reply to iam from comment #0)
> When the PDF file is processed with the CUPS pdftopdf program (which adds
> different scaling and cropping options inside the PDF file as far as I
> understand), Ghostscript is very slow to convert it into PS.

PostScript does not have an equivalent transparency model to PDF, so the **only** way to turn a transparent PDF into PostScript, and have the result be correct, is to render it to an image and wrap the image up as PostScript.

> This file apparently contains transparent operations, as running it with
> -dNOTRANSPARENCY speeds up processing by 10+ times.
 
And produces incorrect output.


> That's why I assume that the Ghostscript PDF interpreter's scaling operations
> with transparency involved are very slow.

Your assumption is incorrect. The ps2write device (NOT Ghostscript, the PostScript output device) is rendering the output. Rendering takes significantly more time, at higher resolutions, than simply processing high level objects.
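
To get a feel for why resolution matters so much once a page has to be rendered, the raster size can be estimated. This is a rough sketch under assumptions that are not stated in this report: an A4 page (8.27 x 11.69 inches) and one byte per pixel for the gray output. The function name `raster_pixels` is hypothetical, and the buffers actually allocated for transparency rendering are likely larger than this estimate.

```python
# Assumed page size: A4 in inches. The real page size of price.pdf is not
# given in this report, so treat these values as placeholders.
A4_WIDTH_IN = 8.27
A4_HEIGHT_IN = 11.69


def raster_pixels(width_in, height_in, dpi):
    """Number of pixels needed to render a page of the given size at dpi."""
    return round(width_in * dpi) * round(height_in * dpi)


for dpi in (150, 300, 600):
    pixels = raster_pixels(A4_WIDTH_IN, A4_HEIGHT_IN, dpi)
    # Assumption: 1 byte per pixel (8-bit gray, no alpha or extra planes).
    mib = pixels / (1024 * 1024)
    print(f"{dpi:>4} dpi: {pixels:>10} pixels, about {mib:.1f} MiB at 1 byte/pixel")
```

Doubling the resolution quadruples the pixel count, so the -r600 used in the command lines above means four times as many pixels per page as 300 dpi would.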

As far as I can see only the final page actually uses transparency. We cannot turn transparency on and off per page, only per document. That's a limitation of the graphics library.

You could, if you wish, try splitting the file up and processing each page separately.
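
One way to experiment with per-page processing is sketched below. It only builds the command lines, one Ghostscript run per page, using the -dFirstPage and -dLastPage switches. The helper name `per_page_commands` and the output file pattern are hypothetical, the page count of 4 comes from the attached test file, and this thread does not confirm whether running pages individually actually avoids rendering for the pages without transparency.

```python
from typing import List


def per_page_commands(
    pdf_path: str, page_count: int, out_pattern: str = "page-{n}.ps"
) -> List[List[str]]:
    """Build one gs command line per page of the input PDF.

    out_pattern is an illustrative naming scheme, not something taken
    from the report; {n} is replaced by the 1-based page number.
    """
    commands = []
    for n in range(1, page_count + 1):
        commands.append([
            "gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
            "-sDEVICE=ps2write",
            f"-dFirstPage={n}",   # restrict this run to a single page
            f"-dLastPage={n}",
            f"-sOutputFile={out_pattern.format(n=n)}",
            pdf_path,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected
    # (or timed individually with `time`) before use.
    for cmd in per_page_commands("/tmp/price_fedora.pdf", 4):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` and timed separately to see which pages account for the slow conversion.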
Comment 5 iam 2024-01-31 15:07:12 UTC
(In reply to Ken Sharp from comment #4)
> > This file apparently contains transparent operations, as running it with
> > -dNOTRANSPARENCY speeds up processing by 10+ times.
>  
> And produces incorrect output.

Yes, unfortunately.

> Rendering takes
> significantly more time, at higher resolutions, than simply processing high
> level objects.

I see, thanks.

> 
> As far as I can see only the final page actually uses transparency. We
> cannot turn transparency on and off per page, only per document. That's a
> limitation of the graphics library.
> 
> You could, if you wish, try splitting the file up and processing each page
> separately.

Well, this is a printing pipeline where transparent PDFs are quite common (even if they don't contain any real transparency), and any manual intervention is undesirable. I don't really *need* PostScript; it's just the pipeline format for many (older) printer drivers, many of which then convert it to an entirely different format themselves.

For now I'm using Poppler wherever possible, but many drivers call the Ghostscript filters themselves, which, depending on the file, may unfortunately slow down the printing process significantly.