Bug 705994 - Strange two-page layout for certain PDF thumbnails
Summary: Strange two-page layout for certain PDF thumbnails
Status: RESOLVED INVALID
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: 10.0.0
Hardware: PC Linux
Importance: P4 minor
Assignee: Default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-19 06:47 UTC by Alan Orth
Modified: 2022-10-19 08:32 UTC (History)
0 users

See Also:
Customer:
Word Size: ---


Attachments
PDF document (559.18 KB, application/pdf)
2022-10-19 06:47 UTC, Alan Orth
Thumbnail (24.42 KB, image/jpeg)
2022-10-19 07:01 UTC, Alan Orth
libvips thumbnail (26.66 KB, image/jpeg)
2022-10-19 07:03 UTC, Alan Orth

Description Alan Orth 2022-10-19 06:47:04 UTC
Created attachment 23325 [details]
PDF document

I'm not sure how to describe this bug. I have unexpected results from Ghostscript on certain thumbnails where the resulting image is somehow in a two-page portrait layout, with a blank page on the left side. I am using Ghostscript via ImageMagick, but this command reproduces it:

$ gs -sDEVICE=jpeg -dPDFFitPage=true -dDEVICEWIDTHPOINTS=640 -dDEVICEHEIGHTPOINTS=640 -sPageList=1 -sOutputFile=10568-116598.pdf.jpg 10568-116598.pdf

I have noticed this happening on a handful of PDFs over the years with different versions of Ghostscript (currently version 10.0.0), but haven't yet figured out a pattern. If need be I can find more PDFs to inspect.

Thank you!
Comment 1 Alan Orth 2022-10-19 07:01:04 UTC
Created attachment 23326 [details]
Thumbnail

Thumbnail generated with gs. Notice there is essentially a blank white page in portrait layout to the left of the correct thumbnail of the first page of the PDF.
Comment 2 Alan Orth 2022-10-19 07:03:29 UTC
Created attachment 23327 [details]
libvips thumbnail

libvips creates a more reasonable thumbnail here.
Comment 3 Ken Sharp 2022-10-19 07:27:42 UTC
PDF files can have multiple different 'Box' values: ArtBox, BleedBox, CropBox, MediaBox and TrimBox. The MediaBox is required and the other boxes are optional; a given PDF page description must contain the MediaBox and may contain any or all of the others.

By default Ghostscript uses the MediaBox to determine the size of the media. Other PDF consumers may exhibit other behaviours.

The pages in your PDF file contain all of the Boxes. In the majority of cases the Boxes all contain the same values (which makes their inclusion pointless of course). But for page 1 they differ:

/CropBox[594.375 0.0 1190.55 839.176]
/MediaBox[0.0 0.0 1190.55 841.89]

You can tell Ghostscript to use a different Box value for the media by using one of -dUseArtBox, -dUseBleedBox, -dUseCropBox, -dUseTrimBox. If I specify -dUseCropBox then the file is rendered as you expect.
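
To see why the MediaBox rendering looks like a two-page spread, it can help to work out the box dimensions. The sketch below only does arithmetic on the two box values quoted above; the helper name box_size is made up for illustration and is not part of Ghostscript.

```python
# Box values copied from page 1 of the attached PDF (quoted above).
# A PDF box is [llx lly urx ury] in points (1/72 inch).
crop_box = (594.375, 0.0, 1190.55, 839.176)
media_box = (0.0, 0.0, 1190.55, 841.89)


def box_size(box):
    """Return (width, height) of a box given as (llx, lly, urx, ury).

    Hypothetical helper for this example only.
    """
    llx, lly, urx, ury = box
    return urx - llx, ury - lly


media_w, media_h = box_size(media_box)
crop_w, crop_h = box_size(crop_box)

# Fraction of the MediaBox width that lies to the left of the CropBox.
left_margin_fraction = (crop_box[0] - media_box[0]) / media_w

print(f"MediaBox: {media_w:.3f} x {media_h:.3f} pt")
print(f"CropBox:  {crop_w:.3f} x {crop_h:.3f} pt")
print(f"Left of CropBox: {left_margin_fraction:.1%} of the MediaBox width")
```

The MediaBox is landscape (about A3 size) and the CropBox is a portrait region (about A4 size) starting roughly halfway across it, so rendering the full MediaBox leaves the left half of the thumbnail empty.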
Comment 4 Alan Orth 2022-10-19 08:32:43 UTC
Thank you, Ken! That is a perfect explanation. Just a note for future travelers that you can select the CropBox in ImageMagick using:

-define pdf:use-cropbox=true
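
For anyone calling gs directly from a script instead, here is a minimal sketch that rebuilds the command from the description with one of the box switches from comment 3 added. The function name thumbnail_command and the BOX_SWITCHES table are assumptions made for this example; only the gs switches themselves come from this report. It prints the command and does not run it.

```python
import shlex

# Box switches named in comment 3. With none given, Ghostscript uses the MediaBox.
BOX_SWITCHES = {
    "art": "-dUseArtBox",
    "bleed": "-dUseBleedBox",
    "crop": "-dUseCropBox",
    "trim": "-dUseTrimBox",
}


def thumbnail_command(pdf_path, out_path, box=None, size=640):
    """Build the gs argument list from the original report.

    Hypothetical helper: `box` optionally selects one of the box switches.
    """
    args = [
        "gs",
        "-sDEVICE=jpeg",
        "-dPDFFitPage=true",
        f"-dDEVICEWIDTHPOINTS={size}",
        f"-dDEVICEHEIGHTPOINTS={size}",
        "-sPageList=1",
    ]
    if box is not None:
        args.append(BOX_SWITCHES[box])
    args += [f"-sOutputFile={out_path}", pdf_path]
    return args


cmd = thumbnail_command("10568-116598.pdf", "10568-116598.pdf.jpg", box="crop")
print(shlex.join(cmd))
```

Returning a list instead of a single string means the arguments can be handed to subprocess.run without shell quoting issues.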