Bug 703481 - Switch from 9.07 to 9.53 generating large PDF's
Summary: Switch from 9.07 to 9.53 generating large PDF's
Status: RESOLVED INVALID
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Writer
Version: 9.53.3
Hardware: PC Windows 10
Importance: P4 normal
Assignee: Default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-05 10:10 UTC by Rohan
Modified: 2021-02-06 06:21 UTC
CC List: 2 users

See Also:
Customer:
Word Size: 64


Attachments
Input and Output Files for Reproduction (341.78 KB, application/x-zip-compressed)
2021-02-05 10:10 UTC, Rohan

Description Rohan 2021-02-05 10:10:04 UTC
Created attachment 20566
Input and Output Files for Reproduction

I have been using Ghostscript to embed fonts into PDF's generated. Till 2020 I used version 9.07 to embed the PDF's for Printing (Full Font embedding, not subsets).

Because a security scan flagged vulnerabilities, I upgraded to GS 9.53.3. With the same input PDFs, the output after font embedding is considerably larger, roughly 2-5 times the size of the PDFs generated with 9.07.

What has changed? Why is the PDF's getting larger? We have not changed the command on our side, only the path to the Ghostscript executable.

I am using the Windows redistributable found on the Artifex website.

It's imperative for us to get the PDF's size reduced. 

I've attached sample files: the ZIP file contains the input file and both output files.

2a_input -  input file (33 KB)
2a_out_GS907 - output from GS 9.07 (79 KB)
2a_out_GS9533 - output from GS 9.53.3 (244 KB)

Executable Path: C:\Program Files\gs\gs9.53.3\bin\gswin64c.exe

The command switches are as follows:
-q 
-dNOPAUSE 
-dBATCH 
-dSAFER 
-dLOCALFONTS 
-dNOCCFONTS 
-sDEVICE=pdfwrite 
-dCompatibilityLevel=1.4 
-dPDFSETTINGS=/prepress 
-sDEFAULTPAPERSIZE=a4 
-sColorConversionStrategy=CMYK 
-dProcessColorModel=/DeviceCMYK 
-dAutoRotatePages=/PageByPage 
-dCompressPages=true 
-dEmbedAllFonts=true 
-dSubsetFonts=false 
-dMaxSubsetPct=0 
-dConvertCMYKImagesToRGB=false 
-dEncodeColorImages=true 
-dAutoFilterColorImages=true 
-dEncodeGrayImages=true 
-dAutoFilterGrayImages=true 
-dEncodeMonoImages=true 
-dMonoImageFilter=/CCITTFaxEncode 
-dDownsampleMonoImages=false 
-dPreserveOverprintSettings=true 
-dUCRandBGInfo=/Preserve 
-dUseFlateCompression=true 
-dParseDSCCommentsForDocInfo=true 
-dParseDSCComments=true 
-dOPM=0 
-dOffOptimizations=0 
-dLockDistillerParams=false 
-dGrayImageDepth=-1 
-dASCII85EncodePages=false 
-dDefaultRenderingIntent=/Default 
-dTransferFunctionInfo=/Preserve 
-dPreserveHalftoneInfo=false 
-dDetectBlends=true 
-sFONTPATH=C:/WINDOWS/Fonts 
-sOutputFile=2a_out.pdf 
-f 2a_input.pdf
Comment 1 Ken Sharp 2021-02-05 11:46:52 UTC
(In reply to Rohan from comment #0)

> I have been using Ghostscript to embed fonts into PDF's generated. Till 2020
> I used version 9.07 to embed the PDF's for Printing (Full Font embedding,
> not subsets).

Since size is important to you, why are you embedding entire fonts and not subsets ?

 
> What has changed?

The fonts have changed. The URW fonts now include Cyrillic glyphs.


> Why is the PDF's getting larger?

Because you have included the entire font for each of 4 fonts instead of a subset.


> It's imperative for us to get the PDF's size reduced. 

Well I might start by pointing out that you are not a commercial customer, free users are welcome to use Ghostscript, in accordance with the license, but we do not guarantee to provide technical support.

While you are (I assume) using Ghostscript in accordance with the license it is supplied under (AGPL v3) you may like to consider whether technical support might have some value for you, if this sort of problem is 'imperative'.


> The command switches are as follows:

Well let's examine that, shall we ?

> -q 
> -dNOPAUSE 
> -dBATCH 
> -dSAFER 
> -dLOCALFONTS 

This puts all fonts in local VM. What is the reason for this ? The documentation is pretty clear that this is only required for compatibility with certain (very) obsolete Adobe printers and fonts, and even then only when running PostScript, not PDF.


> -dNOCCFONTS 

This functionality was removed many years ago. Even when it was present, setting this flag could have had no value, since you were using the pre-compiled binary and therefore not taking advantage of the feature.


> -sDEVICE=pdfwrite 
> -dCompatibilityLevel=1.4 
> -dPDFSETTINGS=/prepress 

Why are you selecting prepress ? If all you want is to get a good output PDF file then you should leave PDFSETTINGS untouched. The default settings will do a good job.

If you specifically want to alter something (such as the effective resolution of images) then you would be better advised to alter **only** those controls.


> -sDEFAULTPAPERSIZE=a4 
> -sColorConversionStrategy=CMYK 

If you want to keep the output file size small, why are you forcing conversion to CMYK ? CMYK requires 4 components, RGB of course requires only 3, so colour specifications in RGB are smaller.
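As a rough illustration of the 4-versus-3 component point, the arithmetic for uncompressed image samples can be sketched as follows. This is a minimal sketch under stated assumptions: `raw_sample_bytes` is a hypothetical helper, the 1000 x 1000 image is made up, and real PDF streams are compressed, so actual sizes will differ.

```python
def raw_sample_bytes(width, height, components, bits_per_component=8):
    """Size in bytes of uncompressed image samples (hypothetical helper)."""
    return width * height * components * bits_per_component // 8


# Assumed example: a 1000 x 1000 pixel image at 8 bits per component.
rgb_bytes = raw_sample_bytes(1000, 1000, 3)    # three components per pixel
cmyk_bytes = raw_sample_bytes(1000, 1000, 4)   # four components per pixel

print(rgb_bytes, cmyk_bytes)  # 3000000 4000000
```

Before compression, the CMYK data is one third larger than the RGB data for the same pixels.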


> -dProcessColorModel=/DeviceCMYK 

You do not need to (and should not) set this with the current versions of Ghostscript.


> -dAutoRotatePages=/PageByPage 
> -dCompressPages=true 
> -dEmbedAllFonts=true 

These are the default values, you really do not need to set these.


> -dSubsetFonts=false 
> -dMaxSubsetPct=0 

And this is the source of your problem, if you don't subset fonts then the entire font, all the glyphs, will be stored inside the PDF file. More glyphs=bigger fonts.


> -dConvertCMYKImagesToRGB=false 
> -dEncodeColorImages=true 
> -dAutoFilterColorImages=true 
> -dEncodeGrayImages=true 
> -dAutoFilterGrayImages=true 
> -dEncodeMonoImages=true 
> -dMonoImageFilter=/CCITTFaxEncode 
> -dDownsampleMonoImages=false 
> -dPreserveOverprintSettings=true 
> -dUCRandBGInfo=/Preserve 
> -dUseFlateCompression=true 
> -dParseDSCCommentsForDocInfo=true 
> -dParseDSCComments=true 

Pretty much all of these are the default values, or you would be better advised to leave them alone. DSC comments, for example, are part of the PostScript language and if your input is always PDF can have no effect whatever.

> -dOPM=0 

Do you have any idea what this does ? If not why are you setting it ?


> -dOffOptimizations=0 

Again this is the default value, and in any event has no effect on Ghostscript, it is an Adobe Distiller parameter.


> -dLockDistillerParams=false 
> -dGrayImageDepth=-1 
> -dASCII85EncodePages=false 
> -dDefaultRenderingIntent=/Default 
> -dTransferFunctionInfo=/Preserve 
> -dPreserveHalftoneInfo=false 
> -dDetectBlends=true 

And again these are, broadly, the default values, or have no effect on Ghostscript.

> -sFONTPATH=C:/WINDOWS/Fonts 
> -sOutputFile=2a_out.pdf 
> -f 2a_input.pdf

If you remove the SubsetFonts=false and MaxSubsetPct=0 controls and run the file with all your other setup, the file comes out at 44 KB, compared with your 9.07 version which is 79 KB. Seems to me this is compelling evidence for subsetting fonts, if file size is 'imperative' for you.
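The experiment described here (the same setup minus the two subsetting overrides, -dSubsetFonts=false and -dMaxSubsetPct=0 from comment #0) could be scripted along these lines. This is only a sketch: `ORIGINAL_SWITCHES` is an abbreviated copy of the reporter's list, and `drop_subset_overrides` is a hypothetical helper name, not part of Ghostscript.

```python
# Abbreviated copy of the switches from comment #0; the real list is longer.
ORIGINAL_SWITCHES = [
    "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
    "-sDEVICE=pdfwrite",
    "-dCompatibilityLevel=1.4",
    "-dPDFSETTINGS=/prepress",
    "-dEmbedAllFonts=true",
    "-dSubsetFonts=false",
    "-dMaxSubsetPct=0",
    "-sFONTPATH=C:/WINDOWS/Fonts",
    "-sOutputFile=2a_out.pdf",
    "-f", "2a_input.pdf",
]

# The two controls identified in comment #1 as the cause of the size increase.
SUBSET_OVERRIDES = ("-dSubsetFonts", "-dMaxSubsetPct")


def drop_subset_overrides(switches):
    """Return a copy of the switch list without the subsetting overrides."""
    return [s for s in switches if s.split("=", 1)[0] not in SUBSET_OVERRIDES]


print(" ".join(drop_subset_overrides(ORIGINAL_SWITCHES)))
```

The resulting list can be passed to `subprocess.run` after the executable path from comment #0; every other switch is left exactly as the reporter had it.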

Basically you have a choice:

1) Subset fonts
2) Pick up a copy of the old URW fonts from the 9.07 release and use those in place of the current ones. Put the fonts somewhere convenient and supply -sFONTPATH=<path to fonts> in place of your existing one.
3) Get the Ghostscript source code. Replace the Resource/Fonts directory with the old 9.07 fonts, rebuild Ghostscript.

But fundamentally it's a result of you choosing not to subset fonts and increased language coverage in the fonts we ship.
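For option 2 above, swapping the font directory amounts to replacing the existing -sFONTPATH switch. A minimal sketch, assuming the old 9.07 URW fonts have been copied to a directory of your choosing; `with_font_path` is a hypothetical helper and `C:/fonts/urw-9.07` is a made-up location:

```python
def with_font_path(switches, font_dir):
    """Return a copy of the switches with -sFONTPATH pointing at font_dir."""
    new_switch = f"-sFONTPATH={font_dir}"
    result = []
    replaced = False
    for s in switches:
        if s.startswith("-sFONTPATH="):
            result.append(new_switch)  # replace the existing font path
            replaced = True
        else:
            result.append(s)
    if not replaced:
        # No existing -sFONTPATH: add it at the front, ahead of any input file.
        result.insert(0, new_switch)
    return result


# Assumed directory holding the fonts taken from the 9.07 release.
OLD_FONTS = "C:/fonts/urw-9.07"

switches = ["-sDEVICE=pdfwrite", "-sFONTPATH=C:/WINDOWS/Fonts",
            "-sOutputFile=2a_out.pdf", "-f", "2a_input.pdf"]
print(with_font_path(switches, OLD_FONTS))
```

Only the font path changes; whether the output size matches the 9.07 result would still need to be checked against the attached sample files.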
Comment 2 jreiser 2021-02-06 06:21:41 UTC
re: Not subsetting fonts
One plausible reason might be that some consumers of the PDFs (such as some printers, display software, etc.) might not process subsetted fonts correctly.  If so, then the original submitter of this bugzilla report should specify the buggy consumers (manufacturers, product names, versions, dates), and explain why catering to them is "imperative".