Bug 690236 - Ghostscript is not able to convert PDF to PostScript maintaining the input document's page sizes
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: master
Hardware: All All
Importance: P2 critical
Assignee: Ken Sharp
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-16 01:53 UTC by Till Kamppeter
Modified: 2009-06-19 06:48 UTC

See Also:
Customer:
Word Size: ---


Attachments
A3landscape.pdf (19.81 KB, application/pdf)
2009-01-16 01:57 UTC, Till Kamppeter
A3landscape.ps (38.20 KB, application/postscript)
2009-01-16 01:58 UTC, Till Kamppeter
a3landscape-pswrite.ps (13.89 KB, application/postscript)
2009-01-16 03:00 UTC, Ken Sharp
a3landscape-ps2write.ps (116.39 KB, application/postscript)
2009-01-16 03:01 UTC, Ken Sharp
690236.patch (1.08 KB, patch)
2009-01-19 02:59 UTC, Ken Sharp

Description Till Kamppeter 2009-01-16 01:53:02 UTC
I use the following Ghostscript command line (with the newest SVN rev 9367):

gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pswrite -sOUTPUTFILE=%stdout
-dLanguageLevel=3 A3landscape.pdf > A3landscape_out.ps

The output PostScript file does not have, as expected, the page size of the
input file (A3) but the system default paper size (A4).

The reverse conversion

gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOUTPUTFILE=%stdout
A3landscape.ps > A3landscape_out.pdf

produces a PDF file in A3, as expected.

I attach the two input files.
Comment 1 Till Kamppeter 2009-01-16 01:57:09 UTC
Created attachment 4724 [details]
A3landscape.pdf

PDF file with A3 page size
Comment 2 Till Kamppeter 2009-01-16 01:58:44 UTC
Created attachment 4725 [details]
A3landscape.ps

PostScript input file with A3 page size
Comment 3 Ken Sharp 2009-01-16 02:18:30 UTC
Till, if I use pswrite, as your command line, I get a PostScript file which
correctly converts to an A3 PDF file using Distiller or pdfwrite. It also
renders A3 using Ghostscript.

If I instead use ps2write, then I do see a problem.

Can you confirm the device should be ps2write and not pswrite, so I can be sure
I'm looking at the same issue please ?

NB, I don't think this is a PDF interpreter problem either way, because both
PostScript files contain correct media information for me.
Comment 4 Till Kamppeter 2009-01-16 02:55:25 UTC
I used "gv" and "evince" and these programs displayed the file incorrectly. If I
directly use "gs" to display the file on the screen it comes out correctly.
Comment 5 Till Kamppeter 2009-01-16 02:57:33 UTC
"ps2write" gives indeed wrong output even with "gs".
Comment 6 Ken Sharp 2009-01-16 03:00:52 UTC
Created attachment 4726 [details]
a3landscape-pswrite.ps

Output from HEAD using:

gs -q -dNOPAUSE -dBATCH -sDEVICE=pswrite -dLanguageLevel=3
-sOutputFile=a3landscape-pswrite.ps a3landscape.pdf
Comment 7 Ken Sharp 2009-01-16 03:01:41 UTC
Created attachment 4727 [details]
a3landscape-ps2write.ps

Output from HEAD using:

gs -q -dNOPAUSE -dBATCH -sDEVICE=ps2write -dLanguageLevel=3
-sOutputFile=a3landscape-ps2write.ps a3landscape.pdf
Comment 8 Ken Sharp 2009-01-16 03:47:55 UTC
Attached are the two output files using pswrite and ps2write. The pswrite output
works immediately to produce A3 output on GS and an A3 PDF file with Distiller.

The ps2write output will *only* do this if 'SetPageSize' is defined (somewhere
in the PostScript environment) as 'true'. This can be easily achieved with GS by
setting -dSetPageSize on the command line. Of course this is more problematical
when sending files to a non-GS device such as a printer or, as in Till's case,
another application.

The pswrite output should work with any PostScript interpreter which supports
either setpagedevice or setpage, and allows for the page size to be altered.
This does work for me with Ghostscript and Distiller, so I think this is working
correctly in general. I'm unsure about evince, but it seems gv uses the
BoundingBox comments to set the page size, not the setpagedevice call. I tested
this by removing the %%BoundingBox and %%DocumentMedia lines from A3landscape.ps
and opening the file in gv, it opens as A4.
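
To illustrate the behaviour described above, here is a minimal Python sketch of a viewer that sizes its page from DSC comments alone and never looks at setpagedevice. It is not gv's actual code. The function name, the order in which the two comments are consulted, and the 595 x 842 point A4 fallback are all assumptions made for illustration.

```python
import re

A4_POINTS = (595, 842)  # assumed fallback, standing in for the system default

def guess_page_size(ps_text, default=A4_POINTS):
    """Guess (width, height) in points from DSC comments only.

    Assumption: %%DocumentMedia wins over %%BoundingBox when both exist.
    Any setpagedevice call in the body is deliberately ignored.
    """
    bbox = None
    for line in ps_text.splitlines():
        media = re.match(
            r"%%DocumentMedia:\s*\S+\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)", line)
        if media:
            return (float(media.group(1)), float(media.group(2)))
        box = re.match(
            r"%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)", line)
        if box and bbox is None:
            bbox = tuple(int(v) for v in box.groups())
    if bbox is not None:
        llx, lly, urx, ury = bbox
        return (urx - llx, ury - lly)
    return default

with_comments = "%!PS-Adobe-3.0\n%%BoundingBox: 0 0 1191 842\n%%EndComments\n"
without_comments = "%!PS-Adobe-3.0\n%%EndComments\n"
a3_size = guess_page_size(with_comments)      # A3 landscape from the comment
a4_size = guess_page_size(without_comments)   # falls back to the default
```

With the comments stripped the sketch falls back to A4, which matches what was observed when the two lines were removed from A3landscape.ps.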

I'm unsure how much effort would be required to get pswrite and ps2write to write
correct BoundingBox comments. Probably the easiest way to do this would be to
run the file twice, once to the bbox device to generate a (correct) BoundingBox,
then again to generate the PostScript and finally merge the two sets of
information. This would be an enhancement in any event, the PostScript is not
incorrect at the moment.


However I believe that the current behaviour of ps2write is simply wrong. It
will not attempt to switch page sizes, even though it has a page size request,
unless a magic variable is set. This variable will not (of course) be set on most
systems and so the emitted PostScript will not request a page size change.

The whole point of the PageSize request architecture is to allow printers to
switch media, and optionally scale output, so that output does not get clipped.
Failing to make the request defeats this. I think that opdfread should be
modified so that page size requests are always made.

Alex, Ray, I would value opinions from both of you on this one (and anyone else
;-), the change was made by Leonardo way back in May 2005. The log file is less
than informative, and this does not seem to have been done to fix a bug.
Comment 9 Alex Cherepanov 2009-01-16 07:14:14 UTC
I agree with Ken that by default ps2write should pass page sizes to the
device. OTOH, most printers don't support variable media size and many have
broken PageSize policies. ps2write can take parameters that scale,
rotate, or crop the page at the generation time - by copying the parameters
into the body of generated PS. These parameters may have an override
parameter at the interpretation time.

The %%BoundingBox comment is useless in PS files. By definition it's a union
of the boxes that include marked parts of every page. Media size cannot be
derived from %%BoundingBox.
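
A small worked example of that point, as a Python sketch. The per-page boxes and the A3 landscape media size of 1191 x 842 points are made-up illustration values, and `union_bbox` is a hypothetical helper, not anything from Ghostscript.

```python
def union_bbox(page_boxes):
    """Union of per-page (llx, lly, urx, ury) boxes around the marked areas."""
    llx = min(b[0] for b in page_boxes)
    lly = min(b[1] for b in page_boxes)
    urx = max(b[2] for b in page_boxes)
    ury = max(b[3] for b in page_boxes)
    return (llx, lly, urx, ury)

MEDIA = (1191, 842)  # assumed A3 landscape media, in points

# Two pages whose marks cover only a small part of the sheet.
marked = [(100, 100, 300, 200), (150, 50, 400, 250)]
doc_bbox = union_bbox(marked)
extent = (doc_bbox[2] - doc_bbox[0], doc_bbox[3] - doc_bbox[1])
```

Here the document box is 100 50 400 250, an extent of 300 x 200 points, and nothing in it reveals that the media was 1191 x 842.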

The %%DocumentMedia is a header comment that is difficult to generate but
quite unhelpful in choosing the default page size. It just lists the media
sizes (among other attributes) in no particular order.

The %%PageMedia comment is easy to generate and gives the media size for every
page - exactly what's needed.
Comment 10 Ken Sharp 2009-01-16 07:56:57 UTC
Alex, you said:

"ps2write can take parameters that scale, rotate, or crop the page at the
generation time - by copying the parameters into the body of generated PS. These
parameters may have an override parameter at the interpretation time."

Which parameters are these ? I looked and the only relevant parameters I could
find were the RotatePages and ScalePages switches which, like SetPageSize, are
set from outside the converted job. Is there some way to set these so that
ps2write will emit values for them in the body of the output PostScript ?

Effectively these three switches implement media selection, but you have to set
them in the PostScript interpreter, which is possible, but non-trivial. All
three default to false, maybe SetPageSize should default to true ? That would
still allow control, though it would mean that SetPageSize would have to be set
to false as well as setting the required switch.


"The %%PageMedia comment is easy to generate and gives the media size for every
page - exactly what's needed."

I could be mistaken, but the definition of PageMedia in my copy of the DSC spec
says that it supplies a media name. Generating that would mean looking up a table
of media sizes and converting to names, which won't work for custom sizes. Also
the media name used is supposed to be previously defined in a DocumentMedia
comment I think. The examples quoted show this to be the case.

I think we could generate a %%DocumentMedia comment of the form:

%%DocumentMedia Regular <width> <height> 0 () () 

gv seems to be happy with this comment. Or we could just set the BoundingBox to
the media size. It's not correct, but it's probably better than the current situation.
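
For illustration, a minimal Python sketch of generating such a comment from a media size given in millimetres. The function name and the rounding to whole points are assumptions, and the sketch writes the keyword with the colon that DSC comments carry before their arguments.

```python
POINTS_PER_MM = 72 / 25.4  # 72 points per inch, 25.4 mm per inch

def document_media_comment(width_mm, height_mm, name="Regular"):
    """Format a %%DocumentMedia line; weight, colour and type are left empty."""
    width_pt = round(width_mm * POINTS_PER_MM)
    height_pt = round(height_mm * POINTS_PER_MM)
    return f"%%DocumentMedia: {name} {width_pt} {height_pt} 0 () ()"

a3_landscape = document_media_comment(420, 297)
```

For A3 landscape (420 x 297 mm) this gives a width of 1191 and a height of 842 points.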
Comment 11 Ray Johnston 2009-01-16 09:33:18 UTC
I'm not sure why Igor made 'SetPageSize' off by default, but anyone who wants
it on by default can put it into opdfread.ps as in:

Index: Resource/Init/opdfread.ps
===================================================================
--- Resource/Init/opdfread.ps   (revision 9354)
+++ Resource/Init/opdfread.ps   (working copy)
@@ -43,6 +43,9 @@
 % Assuming the currentfile isn't positionable.
 % As a consequence, the reader fully ignores xref.
 
+% Enable SetPageSize by default.
+/SetPageSize true def
+
 % ====================== Error handler =======================
 % A general error handler prints an error to page.

I have no objection to making the default case be true in the code base --
the two customers that I know of that use ps2write both set SetPageSize true.

As far as adding a DSC comment for the benefit of applications that expect
this, I concur with Alex.
===========================================================================

As an enhancement to simplify setting the defaults in the ps2write output
without changing opdfread.ps or editing the emitted PS, we _could_ take
the defaults from the state when ps2write runs, i.e., if the values are
defined in 'systemdict' (as they are by -d options), then we would set the
value directly in the output (overriding opdfread.ps defaults).

Thus:
  gs -sDEVICE=ps2write -dSetPageSize=true -o x.ps in.pdf
would have SetPageSize true in the emitted PS.
===========================================================================

Assigning back to Ken.
Comment 12 Ken Sharp 2009-01-19 01:53:26 UTC
Fixed by this revision:

http://ghostscript.com/pipermail/gs-cvs/2009-January/008951.html

I added a %%BoundingBox comment to the ps2write output, and altered the pswrite
output so that if the BoundingBox is 0 0 0 0 it gets replaced with the media
size. This should work in gv and presumably evince with some caveats; the
ps2write output is not DSC, so if subsequent pages are on differently sized
media, they will not properly switch. The %%BoundingBox in both cases is not
strictly correct.

Also modified ps2write so that you can set the media switches (SetPageSize etc)
during production of the PostScript, which means you no longer need some magic
means to set these on the target. If any of these are defined, then the result
is 'fixed', you can't alter the behaviour. If none are defined, then the output
is 'flexible' and can be controlled as before by setting these switches on the
target. Customers already using these should therefore experience no differences.
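
As a sketch of the two modes, the Python helper below only builds the argument list for gs and does not run it. The helper name `ps2write_args` is hypothetical; the switches themselves are the ones quoted in this report.

```python
def ps2write_args(input_pdf, output_ps, set_page_size=None):
    """Build a gs command line for ps2write.

    set_page_size=None leaves SetPageSize undefined ('flexible' output);
    True or False defines it during conversion ('fixed' output).
    """
    args = ["gs", "-sDEVICE=ps2write", "-o", output_ps]
    if set_page_size is not None:
        args.append("-dSetPageSize=" + ("true" if set_page_size else "false"))
    args.append(input_pdf)
    return args

fixed = ps2write_args("in.pdf", "out.ps", set_page_size=True)
flexible = ps2write_args("in.pdf", "out.ps")
```

The resulting list can be handed to `subprocess.run` on a system where gs is installed.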

Comment 13 Till Kamppeter 2009-01-19 02:20:38 UTC
Now I do not get a 0 0 0 0 BoundingBox any more, but a BoundingBox for A4
(system media size in /etc/papersize), but the input file is A3. Is there no
way to carry over the media size of the input file when converting PDF to
PostScript?
Comment 14 Ken Sharp 2009-01-19 02:59:30 UTC
Created attachment 4733 [details]
690236.patch

Sorry Till, somehow an old version of gdevpsu.c got submitted with only half
the work done. Try applying this patch. This should generate a
%%PageBoundingBox DSC comment on every page of output with the correct size in
it.

As regards ps2write, no I'm afraid there's no way to do anything about that
until it becomes DSC compliant. At the time the output PostScript file is
created the page size hasn't been set, so we write the current page size. This
should either be omitted, or superseded by a page level box, like pswrite, but
that's not possible right now.
Comment 15 Till Kamppeter 2009-01-19 03:07:34 UTC
The patch works with my test file. thanks.
Comment 16 Ken Sharp 2009-01-19 03:14:12 UTC
Thanks Till, now committed as revision 9374:

http://ghostscript.com/pipermail/gs-cvs/2009-January/008952.html
Comment 17 Till Kamppeter 2009-01-21 14:32:12 UTC
Ray, I have tried out your patch from comment #11 and it looks promising. My
only A3 PostScript printer, the screen viewer "gv", actually changes "trays" (=
the window size), which it did not without the patch. I think you should apply it
to the code base. If it does not break "pswrite" (without the "2") I would like
to have it in the code base.
Comment 18 Ray Johnston 2009-01-21 17:06:19 UTC
Till,

Ken's patch, which is committed, supersedes my change, and I like it better
since it allows the ps2write step to optionally emit SetPageSize and the
other optional defaults. From Ken's log message:

--------------------
Also allow the use of the /SetPageSize, /RotatePages and /FitPages
keys during ps2write processing to emit a PostScript file with these keys
already set, thus allowing media selection to take place without further
user intervention.
--------

Note that I would have preferred that the enhancements to the command line
be in the log message prior to the DETAILS: section.

I think that Ken should supplement the doc/ps2ps2.htm file to describe this
improvement.

Please test with that version using -dSetPageSize=true during the step that
produces the PS file e.g.,

  gs -sDEVICE=ps2write -o out.ps -dSetPageSize=true in.pdf
Comment 19 Ken Sharp 2009-01-22 00:12:34 UTC
Ray, I did document the changes to the command line in ps2ps2.htm:

"These keys can be set when executing ps2ps2 (or using the ps2write device),
this 'fixes' the resulting behaviour according to which key has been set. If
these keys are not defined during conversion, the resulting PostScript will not
attempt any form of media selection. In this case the behaviour can then be
modified by setting the keys, either by modifying the resulting PostScript or
setting the values in some other manner on the target device."

This replaces the previous paragraph which only mentioned setting the keys on
the target device after creating the output PostScript.

Comment 20 Ken Sharp 2009-06-19 06:48:31 UTC
*** Bug 690548 has been marked as a duplicate of this bug. ***