Bug 692172 - ps2write output crashes with limitcheck error after converting large documents
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PS Writer
Version: 9.01
Hardware: PC All
Importance: P4 normal
Assignee: Ken Sharp
Reported: 2011-04-28 13:24 UTC by steve166
Modified: 2011-06-21 15:59 UTC

Attachments
modified ps2write prolog (opdfread.ps) (144.75 KB, application/postscript)
2011-04-28 13:24 UTC, steve166

Description steve166 2011-04-28 13:24:07 UTC
Created attachment 7479
modified ps2write prolog (opdfread.ps)

Hello,
I am supporting customer 661 in converting large documents
to PDF and PostScript output. Because ps2write is now DSC
compliant, I gave this driver another try (after 3 years).
The driver works much better now, but it is still not possible
to create output files as large as with the pswrite device.
As far as I can see, the main problem is the limit on
the object count, because all objects are stored in a PostScript array.
Depending on the length and complexity of the input data, after a few thousand pages I get a limitcheck error when the Objectregistry count grows beyond
65535 objects.
I have modified the opdfread.ps prolog (see attachment) to store
the object numbers in a dictionary of dictionaries. With this prolog
the limitcheck error is gone and I can view my 20000-page test file with Ghostview without further errors.
Is it possible to add this or a similar enhancement to one of the
next Ghostscript releases?
Can I expect more size limitation problems when using the ps2write
device in comparison to the pswrite device driver?
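
For readers unfamiliar with the idea, the two-level lookup described above can be sketched roughly as follows. This is an illustration only, written in Python rather than PostScript; it is not the attached opdfread.ps patch. The names ObjectRegistry, put, get and BUCKET_SIZE are hypothetical, and the bucket size of 65535 is an assumption taken from the limit reported above.

```python
# Illustrative sketch only: Python stands in for the PostScript prolog.
BUCKET_SIZE = 65535  # assumption: per-container capacity, from the reported limit


class ObjectRegistry:
    """Two-level registry: an outer dict of buckets, each bucket a dict of slots."""

    def __init__(self, bucket_size=BUCKET_SIZE):
        self.bucket_size = bucket_size
        self.buckets = {}  # bucket number -> {slot number -> object}

    def put(self, obj_id, obj):
        # Split the object number so that no single container
        # ever holds more than bucket_size entries.
        bucket, slot = divmod(obj_id, self.bucket_size)
        self.buckets.setdefault(bucket, {})[slot] = obj

    def get(self, obj_id, default=None):
        bucket, slot = divmod(obj_id, self.bucket_size)
        return self.buckets.get(bucket, {}).get(slot, default)


registry = ObjectRegistry()
for obj_id in (1, 65534, 65535, 200000):
    registry.put(obj_id, f"object {obj_id}")

print(registry.get(200000))
print(sorted(registry.buckets))
```

Object numbers above 65535 simply land in a later bucket, so the total count is no longer tied to the capacity of one container.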

Best regards,
steve166
Comment 1 Ken Sharp 2011-06-16 15:03:49 UTC
(In reply to comment #0)

Hello Steve,

> I have modified the opdfread.ps prolog (see attachment) to store
> the object numbers in a dictionary of dictionaries. With this prolog
> the limitcheck error is gone and I can view my 20000-page test file with
> Ghostview without further errors.
> Is it possible to add this or a similar enhancement to one of the
> next Ghostscript releases?

I'm fairly happy with the patch, the main concern I have is that you have dropped some DEBUG messages, and no longer seem to be catching the redefinition of objects:

      {  pop             % d id obj
         3 2 roll        % id obj d
         exec            % id obj execute the default daemon

      }ifelse

Previous equivalent:

    } {                                                   % d r id obj e
    dup null ne {                                       % d r id obj e
      mark (The object ) 4 index ( already defined : ) 4 index //error exec
    } {
      pop
    } ifelse
    4 3 roll                                            % r id obj d
    % Execute the default daemon :
    exec
  } ifelse                                              % r id obj

Is there a reason you dropped the check against the null object ? 
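
To make the question concrete, the behaviour of the previous code (report an error if the registry entry is already non-null, otherwise carry on) can be sketched as below. This is a Python illustration, not the prolog code; register and AlreadyDefinedError are hypothetical names, and None stands in for the PostScript null object.

```python
class AlreadyDefinedError(Exception):
    """Raised when an object id is registered a second time."""


def register(registry, obj_id, obj):
    """Store obj under obj_id, refusing to overwrite a non-null entry."""
    existing = registry.get(obj_id)  # None plays the role of PostScript null
    if existing is not None:
        # Mirrors the message built by the previous prolog code.
        raise AlreadyDefinedError(
            f"The object {obj_id} already defined : {existing!r}"
        )
    registry[obj_id] = obj


registry = {}
register(registry, 7, "first definition")
try:
    register(registry, 7, "second definition")
except AlreadyDefinedError as exc:
    print(exc)
```

Without the check, the second definition would silently replace the first.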

> Can I expect more size limitation problems when using the ps2write
> device in comparison to the pswrite device driver?

To be honest, I don't know. There is a maximum object number, but it's 10 decimal digits, which is probably enough for most purposes. Even if you allowed for 10 streams per page that's still nearly 100 million pages. Beyond that, I don't *think* there are any problems, but it's not an area I've looked into.
Comment 2 steve166 2011-06-17 12:16:05 UTC
> I'm fairly happy with the patch, the main concern I have is that you have
> dropped some DEBUG messages, and no longer seem to be catching the redefinition
> of objects:

It is almost 3 years since I wrote this patch, so I have some trouble remembering all the details :-). While analyzing, modifying and testing the original code I removed all "unnecessary" lines, such as debug messages and error conditions. Then I tested the code and found that, with the (now 3 years old) Ghostscript version, there were some other issues with the ps2write driver. Some weeks ago, as I wrote above, I gave this driver another try, and now things look better. I copied the changes from the old opdfread prolog to the current one, but I forgot to restore the dropped debug and error condition lines.

Sadly the ps2write driver has the same problem as the pdfwrite driver when the files get really big (Bug 692127). With the test data from customer 661 you will reach the 4 GByte temporary file limit after about 10000 pages, even if the real ps2write output size is much smaller than 4 GByte.
In our application we have a workaround for this issue: we remove all images from the input and reinsert them as reusable streams in the ps(2)write output data. But this is only necessary because at the moment neither the pswrite nor the ps2write driver is really suitable for handling really big input data streams, especially when the input data contains big full-color images.

Additionally, we still have compatibility problems between the ps2write driver output and Adobe-based RIPs (or Adobe Distiller). So I think we have to stay with the pswrite driver and our workaround mentioned above ...

Best regards,

steve166
Comment 3 Ken Sharp 2011-06-18 09:20:49 UTC
(In reply to comment #2)

> Sadly the ps2write driver has the same problem as the pdfwrite driver when
> the files get really big (Bug 692127). With the test data from customer 661 you
> will reach the 4 GByte temporary file limit after about 10000 pages, even if
> the real ps2write output size is much smaller than 4 GByte.

I took a quick look and it's not going to be easy to lift that limit; it's the fseek calls which cause the problem. pdfwrite & ps2write do a lot of fseek'ing in the temp files.
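
As a rough illustration of why a seek offset hits a wall at this size, the Python sketch below checks which offsets fit in a 32-bit integer. It does not reproduce Ghostscript's C code; the helper offset_fits is hypothetical, and whether the offsets involved were signed or unsigned is an assumption, since the thread only mentions a 4 GByte limit.

```python
import struct

GIB = 1024 ** 3

# struct format codes for fixed-width little-endian integers.
SIGNED_32, UNSIGNED_32, SIGNED_64 = "<i", "<I", "<q"


def offset_fits(offset, fmt):
    """Return True if offset can be stored in the integer type named by fmt."""
    try:
        struct.pack(fmt, offset)
        return True
    except struct.error:
        return False


for label, offset in [
    ("2 GiB - 1", 2 * GIB - 1),
    ("2 GiB", 2 * GIB),
    ("4 GiB - 1", 4 * GIB - 1),
    ("4 GiB", 4 * GIB),
]:
    print(
        label,
        "signed32:", offset_fits(offset, SIGNED_32),
        "unsigned32:", offset_fits(offset, UNSIGNED_32),
        "signed64:", offset_fits(offset, SIGNED_64),
    )
```

Any position at or beyond 4 GiB cannot be expressed in 32 bits at all, so every seek into that part of a temp file needs a wider offset type.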

It would be nice to do it one day.

OK, I'll tidy up the debug prints and error checking then, I hope you don't mind, and then will commit it.

Thanks for your contribution!
Comment 4 Ken Sharp 2011-06-20 10:43:27 UTC
Patch committed in 27b7404218093f3d1cf414b52721c8a24dbc2746

Patch and log here:

http://ghostscript.com/pipermail/gs-cvs/2011-June/013023.html

Again, thanks for the contribution. I will open an enhancement request for the temp file size so that it doesn't get forgotten.
Comment 5 Ken Sharp 2011-06-21 11:26:43 UTC
Bug #692290 is now resolved; it alters the scratch file handling so that it uses 64-bit accesses on systems which support it (those that don't will still work, but are limited to 32 bits).

However, I can't actually check this as I don't have a test file which will exercise it.

Would it be possible for you to try this out with a test file ?

Also you say:

"Additionally, we still have compatibility problems between the ps2write driver output
and Adobe-based RIPs (or Adobe Distiller)."

I am not aware of any problems with compatibility, and I have run the output of ps2write through Adobe Acrobat Distiller. If you have files which exhibit a problem (particularly with Acrobat Distiller) it would be useful to open a bug report for this. Unless we are made aware of bugs, we won't be able to fix them!
Comment 6 steve166 2011-06-21 15:42:18 UTC
> However, I can't actually check this as I don't have a test file which will
> exercise it.

> Would it be possible for you to try this out with a test file ?

Thanks for this fix, but wasn't the original bug number 692127?!
Unfortunately things got worse when I tried to test this fix with the
current gs snapshot. It seems that something in the memory management
changed with Ghostscript 9.02. When I try to convert my 20000-page
test job, gswin32c consumes more and more memory. After about 500 pages
Ghostscript already needs about 400 MByte, growing steadily.
I have to find out what's going on here before I can do any tests regarding
the temporary file limit issue. I don't have this memory problem with
Ghostscript 9.01. With Ghostscript 9.01 the memory consumed by gswin32.exe
remains constant at about 30 MByte. This problem is present with pdfwrite, ps2write and, to a lesser degree, with pswrite.
Because we do not process plain PostScript data I cannot send you a test file
(we are processing VIPP using our own PostScript prolog).
I will try to find out whether this problem is caused by some of our PostScript
procedures.
If you have any idea what change between GS 9.01 and GS 9.02 could cause
this problem, any guess may help.

> I am not aware of any problems with compatibility, and I have run the output of
> ps2write through Adobe Acrobat Distiller. If you have files which exhibit a
> problem (particularly with Acrobat Distiller) it would be useful to open a bug
> report for this. Unless we are made aware of bugs, we won't be able to fix
> them!

As I wrote above, I cannot send you the real input data which causes this problem, because it is not plain PostScript data. But I have found that I can reproduce the problem by feeding the ps2write output into Ghostscript again, again using the ps2write device. This output then causes an ioerror (if the error happens inside a stream) or a limitcheck error in a show procedure when converted with Distiller. The problem is font related. It seems that Distiller has problems finding some characters from ps2write bitmap fonts. Ghostview has no problems with the file.
I will open a new bug with the ps2write output as a test file as soon as I have time.

Best regards,

steve166
Comment 7 Ken Sharp 2011-06-21 15:59:33 UTC
(In reply to comment #6)

> > Would it be possible for you to try this out with a test file ?
> 
> Thanks for this fix, but wasn't the original bug number 692127?!

It seems it was, but that was closed as WONTFIX (and was raised against pdfwrite rather than ps2write). Amusing that this one is an anagram of the number...

> Unfortunately things got worse when I tried to test this fix with the
> current gs snapshot. It seems that something in the memory management
> changed with Ghostscript 9.02. When I try to convert my 20000-page
> test job, gswin32c consumes more and more memory.

This is using the 9.02 code, not the current master ?

> I have to find out what's going on here before I can do any tests regarding
> the temporary file limit issue. I don't have this memory problem with
> Ghostscript 9.01. With Ghostscript 9.01 the memory consumed by gswin32.exe
> remains constant at about 30 MByte. This problem is present with pdfwrite,
> ps2write and, to a lesser degree, with pswrite.

Have you tested rendering ? It may be simply related to the high level devices (actually I would expect that it is).


> If you have any idea what change between GS 9.01 and GS 9.02 could cause
> this problem, any guess may help.

Umm, not a great deal is the simple answer I think. I did do some changes for pdfwrite to hash the content of various objects as a performance optimisation, instead of continuously rescanning to find duplicates. But the per-object MD5 hash is not that big, so it shouldn't use so much memory.
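
The general technique mentioned here (keep a hash per object and use it to find duplicates, instead of rescanning every stored object) can be sketched as follows. This is a Python illustration under assumptions, not the pdfwrite implementation; DedupStore and its methods are hypothetical names.

```python
import hashlib


class DedupStore:
    """Stores byte strings once each, using an MD5 digest as the lookup key."""

    def __init__(self):
        self.objects = []      # stored contents, in insertion order
        self._by_digest = {}   # 16-byte digest -> indices into self.objects

    def add(self, content):
        """Return the index of content, storing it only if it is new."""
        digest = hashlib.md5(content).digest()  # 16 bytes per object
        candidates = self._by_digest.setdefault(digest, [])
        for index in candidates:
            # Compare the real content too, so a hash collision
            # can never merge two different objects.
            if self.objects[index] == content:
                return index
        self.objects.append(content)
        candidates.append(len(self.objects) - 1)
        return len(self.objects) - 1


store = DedupStore()
first = store.add(b"<< /Type /Font >>")
second = store.add(b"<< /Type /ExtGState >>")
again = store.add(b"<< /Type /Font >>")
print(first, second, again, len(store.objects))
```

The extra memory in a scheme like this is one 16-byte digest plus a small index entry per stored object, which fits the expectation that the hashes alone should not account for hundreds of megabytes.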

My guess would be that something is not being freed that should be, but I don't know what it would be. It 'might' be something to do with the ICC colour profiles but again, it's hard to say.

> As I wrote above, I cannot send you the real input data which causes this
> problem, because it is not plain PostScript data. But I have found that I
> can reproduce the problem by feeding the ps2write output into Ghostscript again,
> again using the ps2write device. This output then causes an ioerror (if the
> error happens inside a stream) or a limitcheck error in a show procedure when
> converted with Distiller.

TBH initially at least all I need is a file which fails on Distiller, so any PostScript file which fails would probably be enough.

> The problem is font related. It seems that
> Distiller has problems finding some characters from ps2write bitmap fonts.

That's odd, they are simple type 3 fonts, but maybe there is some limitation in Distiller.

> Ghostview has no problems with the file.
> I will open a new bug with the ps2write output as a test file as soon as I have
> time.

Please do, I'll look at it when it's available.