Bug 707678

Summary: pdfwrite generates a corrupted and incomplete file, but only for Adobe
Product: Ghostscript Reporter: Maxime Mettey <mmettey>
Component: PDF Writer Assignee: Default assignee <ghostpdl-bugs>
Status: RESOLVED INVALID    
Severity: normal    
Priority: P2    
Version: 10.03.0   
Hardware: PC   
OS: Windows 10   
Customer: Word Size: ---
Attachments: Source PDF used in Ghostscript command
page5.pdf (generated by ghostscript command, having the issue)

Description Maxime Mettey 2024-03-20 14:33:22 UTC
Created attachment 25499
Source PDF used in Ghostscript command

I've encountered an issue while using Ghostscript with a specific PDF file (please refer to the attached file named "source_document.pdf").

I utilized the following command:
"path_to_gs/gswin64c.exe" -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=5 -dLastPage=5 -sOutputFile=page5.pdf source_document.pdf

However, upon opening the generated PDF, page5.pdf, with Adobe Acrobat, I encountered an error message indicating that the document has an issue, and a significant portion of the page's content is missing. Interestingly, page 5 functions perfectly when I open source_document.pdf with Adobe Acrobat.

It's important to note that this issue only occurs with Adobe software; page5.pdf opens correctly with Chrome-based browsers and Mozilla Firefox.

Is there a workaround available to prevent this issue? Alternatively, is this something that requires fixing on your end?

Thank you in advance for your assistance.

Best regards,
Maxime Mettey
Comment 1 Maxime Mettey 2024-03-20 14:34:41 UTC
Created attachment 25500
page5.pdf (generated by ghostscript command, having the issue)
Comment 2 Maxime Mettey 2024-03-20 14:43:05 UTC
To be more precise, I tried three versions of Ghostscript, and all of them lead to the same issue: 9.53, 10.00.0, and 10.03.0.
Comment 3 Ken Sharp 2024-03-20 15:11:07 UTC
(In reply to Maxime Mettey from comment #0)

> However, upon opening the generated PDF, page5.pdf, with Adobe Acrobat, I
> encountered an error message indicating that the document has an issue, and
> a significant portion of the page's content is missing. Interestingly, page
> 5 functions perfectly when I open source_document.pdf with Adobe Acrobat.

That's irrelevant; the PDF file output by pdfwrite bears no relationship to the input file in terms of the syntax of the content.

This is described in more detail in the documentation:

https://ghostscript.readthedocs.io/en/latest/VectorDevices.html

 
> It's important to note that this issue only occurs with Adobe software;
> page5.pdf opens correctly with Chrome-based browsers and Mozilla Firefox.

It also opens correctly in every other PDF consumer I've tried, and it passes veraPDF, which is a well-respected PDF verification tool.

 
> Is there a workaround available to prevent this issue? Alternatively, is
> this something that requires fixing on your end?

This appears to be an error in Acrobat. Your original file contains a number of small images, and these are written into the output content stream as inline images instead of referenced objects. For whatever reason, Acrobat is unable to process them as inline images and instead tries to process the image data as PDF operators.

I can see nothing wrong with the output of pdfwrite. The problem appears to be with Adobe Acrobat, which is unable to deal with a feature defined in the specification. You should report this to Adobe.
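
For anyone who wants to check whether a page uses inline images, here is a rough Python sketch. In a PDF content stream an inline image is bracketed by the BI, ID and EI operators, whereas a referenced image is painted with a name followed by the Do operator. The helper names (`_find_token`, `find_inline_images`) and the sample stream are made up for illustration; this is not how Ghostscript or Acrobat parse the data. It also assumes the content stream has already been decompressed, which is rarely the case for the raw bytes of a real file.

```python
# Whitespace bytes recognised by PDF syntax (NUL, TAB, LF, FF, CR, SPACE).
PDF_WS = b"\x00\t\n\x0c\r "


def _find_token(data: bytes, token: bytes, start: int, after: bytes) -> int:
    """Return the offset of the next standalone `token` at or after `start`, or -1.

    Standalone here means: preceded by whitespace (or at offset 0) and
    followed by one of the bytes in `after` (or by the end of the data).
    """
    pos = data.find(token, start)
    while pos != -1:
        end = pos + len(token)
        before_ok = pos == 0 or data[pos - 1] in PDF_WS
        after_ok = end == len(data) or data[end] in after
        if before_ok and after_ok:
            return pos
        pos = data.find(token, pos + 1)
    return -1


def find_inline_images(content: bytes) -> list:
    """Return (start, end) byte offsets of each BI ... ID ... EI span.

    Heuristic only: binary image data can itself contain bytes that look
    like an EI operator, so a robust consumer has to do more than this.
    """
    spans = []
    pos = 0
    while True:
        bi = _find_token(content, b"BI", pos, PDF_WS + b"/")
        if bi == -1:
            break
        id_op = _find_token(content, b"ID", bi + 2, PDF_WS)
        if id_op == -1:
            break
        # Image data begins after "ID" and a single whitespace byte.
        ei = _find_token(content, b"EI", id_op + 3, PDF_WS)
        if ei == -1:
            break
        spans.append((bi, ei + 2))
        pos = ei + 2
    return spans


# Hypothetical decoded content stream: one inline image, one referenced image.
sample = (
    b"q 8 0 0 8 0 0 cm\n"
    b"BI /W 2 /H 2 /BPC 8 /CS /G ID \x00\xff\x7f\x10\nEI Q\n"
    b"q /Im0 Do Q\n"
)
print(len(find_inline_images(sample)))
```

The image bytes sit directly between ID and EI in the same stream as the drawing operators, so a consumer that fails to recognise the inline image will end up reading that binary data as if it were operators.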