Bug 692546 - Missing images and text in pdfs version 1.4
Summary: Missing images and text in pdfs version 1.4
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Writer
Version: 9.04
Hardware: PC Windows Vista
Importance: P4 normal
Assignee: Ken Sharp
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-27 10:12 UTC by elie
Modified: 2012-02-08 08:59 UTC

See Also:
Customer:
Word Size: ---


Attachments
Snapshot.pdf (6.48 MB, application/pdf)
2011-09-27 10:48 UTC, elie
from-ps.pdf (662.41 KB, application/pdf)
2011-09-27 11:21 UTC, elie

Description elie 2011-09-27 10:12:23 UTC
I'm using gs to reduce PDF file sizes, but for PDF 1.4 files some text and images do not appear in the output, apparently due to the use of transparency.
With the -dNOTRANSPARENCY option, the text appears again, and the image as well, but I can see black areas.
I attached the sample PDF I'm using to this bug. The command line I'm using is:
-dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dCompressPages=true -c "80000000 setvmthreshold" -c ".setpdfwrite" -c "<</ColorConversionStrategy /LeaveColorUnchanged>> setdistillerparams" -c "<</ColorImageDownsampleType /Bicubic>> setdistillerparams" -c "<</GrayImageDownsampleType /Bicubic>> setdistillerparams" -c "<</MonoImageDownsampleType /Bicubic>> setdistillerparams" -c "<</ColorImageResolution 70>>" setdistillerparams -c "<</GrayImageResolution 70>>" setdistillerparams -c "<</MonoImageResolution 195>>" setdistillerparams

Even with much simpler parameters I'm unable to produce a correct conversion.

Note: When I used Adobe Acrobat 8.2.4 to produce PostScript from the attached file and then used gs to convert it to PDF, the conversion was correct.
Comment 1 Ken Sharp 2011-09-27 10:30:56 UTC
(In reply to comment #0)

> I attached the sample the pdf sample i'm using to this bug. 

I'm afraid it's not attached; we really need the file to investigate the problem.

> The command line im
> using is:
> -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook
> -dCompressPages=true -c "80000000 setvmthreshold" -c ".setpdfwrite" -c
> "<</ColorConversionStrategy /LeaveColorUnchanged>> setdistillerparams" -c
> "<</ColorImageDownsampleType /Bicubic>> setdistillerparams" -c
> "<</GrayImageDownsampleType /Bicubic>> setdistillerparams" -c
> "<</MonoImageDownsampleType /Bicubic>> setdistillerparams" -c
> "<</ColorImageResolution 70>>" setdistillerparams -c "<</GrayImageResolution
> 70>>" setdistillerparams -c "<</MonoImageResolution 195>>" setdistillerparams
> 
> Even with much simpler parameters im unable to produce a correct conversion.

If simpler parameters show a problem, then please provide the simpler parameters; over-complicating the problem will make it harder to investigate.

I would suggest that you do *not* use -dPDFSETTINGS=/ebook if you plan to set controls individually, as the PDFSETTINGS will override the individual controls.

Because setdistillerparams takes a dictionary argument, you can supply all the parameters in one dictionary; you don't need multiple calls to setdistillerparams.
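
As a rough illustration of the two suggestions above (drop -dPDFSETTINGS and pass one dictionary), the original parameters could be combined as in the sketch below. The function name shrink_pdf, the GS variable and the file names are hypothetical, and the Downsample*Images keys are not in the original command; they are added on the assumption that downsampling is not otherwise enabled once -dPDFSETTINGS is removed. This has not been run against the attached file.

```shell
# Sketch only: GS and shrink_pdf are illustrative names, not part of Ghostscript.
# GS lets you point at a differently named binary (e.g. gswin32c on Windows).

# One dictionary holding every distiller parameter from the original command.
# Assumption: the Downsample*Images keys are added here (they were not in the
# original command) so that the resolutions apply without -dPDFSETTINGS.
DISTILLER_PARAMS='<<
  /ColorConversionStrategy /LeaveColorUnchanged
  /DownsampleColorImages true /ColorImageDownsampleType /Bicubic /ColorImageResolution 70
  /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 70
  /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 195
>> setdistillerparams'

# Usage: shrink_pdf input.pdf output.pdf
shrink_pdf() {
  # -c runs the PostScript that follows; -f ends it and names the input file.
  "${GS:-gs}" -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite \
    -dCompressPages=true -sOutputFile="$2" \
    -c "$DISTILLER_PARAMS" -f "$1"
}
```

For example, `shrink_pdf Snapshot.pdf Snapshot-small.pdf` would make a single setdistillerparams call instead of seven.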

 
> Note: When used adobe acrobat 8.2.4 to produce a postscript from the attached
> file and then used gs to convert it to pdf, the conversion was correct.

PostScript does not support transparency, so conversion to PostScript will eliminate the transparent content by converting those areas to images. Ghostscript does the same (broadly speaking) when outputting PostScript using the ps2write device.
Comment 2 elie 2011-09-27 10:48:51 UTC
Created attachment 7933
Snapshot.pdf
Comment 3 elie 2011-09-27 10:59:44 UTC
(In reply to comment #1)
> (In reply to comment #0)
> 
> > I attached the sample the pdf sample i'm using to this bug. 
> 
> I'm afraid its not attached, we really need the file to investigate the
> problem.
> 
> > The command line im
> > using is:
> > -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook
> > -dCompressPages=true -c "80000000 setvmthreshold" -c ".setpdfwrite" -c
> > "<</ColorConversionStrategy /LeaveColorUnchanged>> setdistillerparams" -c
> > "<</ColorImageDownsampleType /Bicubic>> setdistillerparams" -c
> > "<</GrayImageDownsampleType /Bicubic>> setdistillerparams" -c
> > "<</MonoImageDownsampleType /Bicubic>> setdistillerparams" -c
> > "<</ColorImageResolution 70>>" setdistillerparams -c "<</GrayImageResolution
> > 70>>" setdistillerparams -c "<</MonoImageResolution 195>>" setdistillerparams
> > 
> > Even with much simpler parameters im unable to produce a correct conversion.
> 
> If simpler parameters show a problem, then please provide simpler parameters,
> over-complicating the problem will make it harder to investigate.

sure,

-dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook

> 
> I would suggest that you do *not* use -dPDFSETTINGS=/ebook if you plan to set
> controls individually, as the PDFSETTINGS will override the individual
> controls.
> 
> Because setdistillerparams takes a dictionary argument you can supply all the
> paramters in one dictionary, you don't need multiple calls to
> setdistillerparams.
> 
> 
> > Note: When used adobe acrobat 8.2.4 to produce a postscript from the attached
> > file and then used gs to convert it to pdf, the conversion was correct.
> 
> PostScript does not support transparency, so conversion to PostScript will
> eliminate the transparent content by converting those areas to images.

I'm ok with that if only the transparent content is converted.

> Ghostscript does the same (broadly speaking) when outputting PostScript using
> the ps2write device.

Using ps2write takes too long, and the PS file is huge (156M) compared to the attached input. I'm also getting the following error:

Error: /VMerror in --.endtransparencygroup--
VM status: 3 3823452 5456676
Current allocation mode is local
Last OS error: 12
GPL Ghostscript  9.00: Unrecoverable error, exit code 1
Comment 4 Ken Sharp 2011-09-27 11:14:26 UTC
(In reply to comment #3)
 
> Using ps2write, is taking too long, the ps file size is huge compared to the
> attached input (=156M). 

Because it is converting the data to images.
> And im getting the following error:
> 
> Error: /VMerror in --.endtransparencygroup--
> VM status: 3 3823452 5456676
> Current allocation mode is local
> Last OS error: 12
> GPL Ghostscript  9.00: Unrecoverable error, exit code 1

So it ran out of memory. I would suggest reducing the resolution; the default is 720 dpi, which may be higher than you require. Normally this has little effect, but when flattening transparency it will convert to images at the specified resolution. Alternatively, increase the memory available to GS.

Try -r96 for screen resolution. It will convert faster, produce a smaller PDF file, and should not run out of memory. Still not as good as preserving the transparency, of course.
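
A minimal sketch of the lower-resolution ps2write run suggested above. The function name pdf_to_ps, the GS variable and the file names are hypothetical; only -sDEVICE=ps2write and -r96 come from this thread, and the remaining switches are copied from the reporter's command line.

```shell
# Sketch only: GS and pdf_to_ps are illustrative names, not part of Ghostscript.
# Usage: pdf_to_ps input.pdf output.ps [dpi]
pdf_to_ps() {
  # Flattened transparency is rendered to images at this resolution,
  # so 96 dpi instead of the 720 dpi default keeps memory use and size down.
  "${GS:-gs}" -dNOPAUSE -dBATCH -dSAFER -sDEVICE=ps2write \
    -r"${3:-96}" -sOutputFile="$2" "$1"
}
```

For example, `pdf_to_ps Snapshot.pdf Snapshot.ps` uses 96 dpi, and a third argument overrides the resolution.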
Comment 5 elie 2011-09-27 11:20:42 UTC
(In reply to comment #4)
> (In reply to comment #3)
> 
> > Using ps2write, is taking too long, the ps file size is huge compared to the
> > attached input (=156M). 
> 
> Because it is converting the data to image.
> > And im getting the following error:
> > 
> > Error: /VMerror in --.endtransparencygroup--
> > VM status: 3 3823452 5456676
> > Current allocation mode is local
> > Last OS error: 12
> > GPL Ghostscript  9.00: Unrecoverable error, exit code 1
> 
> So it ran out of memory. I would suggest reducing the resolution, the default
> is 720 dpi which may be higher than you require. Normally this has little
> effect but when flattening transparency it will convert to images at the
> specified resolution. Alternatively increase the memory available to GS.
> 
> Try -r96 for screen resolution. It will convert faster be a smaller PDF file
> and should not run out of memory. Still not as good as preserving the
> transparency of course.

I got my PS file, but when converting it back to PDF the content is cropped and it looks like a single image. I will attach the output produced from the PS.
Comment 6 elie 2011-09-27 11:21:56 UTC
Created attachment 7934
from-ps.pdf
Comment 7 Ken Sharp 2011-09-27 11:42:17 UTC
(In reply to comment #5)
 
> Got my ps but when converting it back to pdf, the content is cropped and it
> looks like it is 1 image. Will attach the output from the ps.

A single image is entirely possible; it depends on how the page content is defined.
Comment 8 elie 2011-09-27 12:08:38 UTC
(In reply to comment #7)
> (In reply to comment #5)
> 
> > Got my ps but when converting it back to pdf, the content is cropped and it
> > looks like it is 1 image. Will attach the output from the ps.
> 
> A single image is entirely possible, it depends on how the page content is
> defined.
I see. A single image won't really suit me; I need to be able to view the PDF and zoom in without distorting the text.
Comment 9 Ken Sharp 2011-09-27 12:12:54 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #5)
> > 
> > > Got my ps but when converting it back to pdf, the content is cropped and it
> > > looks like it is 1 image. Will attach the output from the ps.
> > 
> > A single image is entirely possible, it depends on how the page content is
> > defined.
> I see, a single image won't really suit me, I need to be able view the pdf and
> zoom it without distorting the text.

If the text is not acceptable in an image then you can't use any technique involving PostScript conversion, as any text may be in a transparency group.
Comment 10 elie 2011-09-27 13:08:33 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > (In reply to comment #5)
> > > 
> > > > Got my ps but when converting it back to pdf, the content is cropped and it
> > > > looks like it is 1 image. Will attach the output from the ps.
> > > 
> > > A single image is entirely possible, it depends on how the page content is
> > > defined.
> > I see, a single image won't really suit me, I need to be able view the pdf and
> > zoom it without distorting the text.
> 
> If the text is not acceptable in an image then you can't use any technique
> involving PostScript conversion, as any text may be in a transparency group.

I assumed this might work for the following reasons:
- Using the command line in comment 3 produced a working PDF except for the 2 areas on the right. So part of the text was used.
- (Not for the sake of comparison) Adobe Acrobat Reader produced a PS file from the attached input, and when that was converted with gs, the output PDF came out nicely with selectable text.
Comment 11 elie 2011-10-07 09:05:34 UTC
Any updates? :)
Comment 12 Emiel Molenaar 2011-10-27 14:16:45 UTC
(In reply to comment #11)
> any updates ?:)

I'm seeing exactly the same issue here: PDFs with some kind of transparency in them are missing images and text after processing. Any news on this one?
Comment 13 Ken Sharp 2012-02-08 08:59:38 UTC
This works in the current HEAD code; I believe it was fixed by Git commit a6a7d7b62d5bb5cbe00c1051d8a9cb749e43fe86

Patch is here:
http://ghostscript.com/pipermail/gs-cvs/2012-February/014159.html

NB: this change is *not* in the imminent 9.05 release of Ghostscript, having been completed too late for inclusion.