Bug 699647 - Cannot convert PDF to image when PDF was created by microsoft print to PDF
Summary: Cannot convert PDF to image when PDF was created by microsoft print to PDF
Status: RESOLVED WORKSFORME
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: master
Hardware: PC Linux
Importance: P4 normal
Assignee: Ken Sharp
Reported: 2018-08-17 08:22 UTC by Erez
Modified: 2019-06-13 13:35 UTC

Attachments
The output jpg when trying to convert it from pdf (97.29 KB, image/jpeg)
2018-08-17 08:22 UTC, Erez
The input (55.82 KB, application/pdf)
2018-08-17 08:31 UTC, Erez
File (Microsoft Print To PDF) (18.58 KB, application/pdf)
2019-06-13 08:28 UTC, namhoa19772003
Convert result (2.56 KB, image/jpeg)
2019-06-13 08:28 UTC, namhoa19772003

Description Erez 2018-08-17 08:22:57 UTC
Created attachment 15478 [details]
The output jpg when trying to convert it from pdf

When trying to convert a PDF that was created by Microsoft Print To PDF to an image, the output image has very odd letters.

Logs: 
   **** Warning: can't process font stream, loading font by the name.
   **** Warning: can't process font stream, loading font by the name.
   **** Warning: can't process font stream, loading font by the name.
   **** Warning: can't process font stream, loading font by the name.
   **** Warning: can't process font stream, loading font by the name.
   **** Warning: can't process font stream, loading font by the name.
   **** Warning: can't process font stream, loading font by the name.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> Microsoft: Print To PDF <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.


See attachment.
Comment 1 Ken Sharp 2018-08-17 08:28:27 UTC
(In reply to Erez from comment #0)
> Created attachment 15478 [details]
> The output jpg when trying to convert it from pdf

You are going to have to give us a copy of the original PDF file so that we can follow the code path. We can't simply look at the incorrect output and guess what was in the PDF file.

 
> When trying to convert a PDF that was created by Microsoft Print To PDF to an
> image, the output image has very odd letters.
> 
> Logs: 
>    **** Warning: can't process font stream, loading font by the name.

Which tells us that there's something about the file that Ghostscript doesn't like. At least it told you!

We'll need the original PDF file, the command line you used for Ghostscript and the actual version of Ghostscript you are using. If you really are using 'master' then we'll need the SHA1 of the commit that you pulled from our Git repository.
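
The details requested above (the Ghostscript version and an explicit Ghostscript command line) can be gathered with a short script. The following is a minimal Python sketch, not an official procedure. It assumes the executable is named `gs` and is on the PATH; the helper names `build_gs_command` and `ghostscript_version`, the 150 dpi resolution, and the file names are illustrative choices and do not come from this report.

```python
import shutil
import subprocess


def build_gs_command(pdf_path, out_path, resolution=150):
    """Return an explicit Ghostscript command line for PDF -> JPEG.

    The switches are standard Ghostscript options; the default resolution
    is an arbitrary illustrative value.
    """
    return [
        "gs",
        "-dNOPAUSE",                 # do not pause between pages
        "-dBATCH",                   # exit after the last input file
        "-sDEVICE=jpeg",             # JPEG output device
        f"-r{resolution}",           # rendering resolution in dpi
        f"-sOutputFile={out_path}",  # where the image is written
        pdf_path,
    ]


def ghostscript_version():
    """Return the output of `gs --version`, or None if gs is not on PATH."""
    if shutil.which("gs") is None:
        return None
    result = subprocess.run(
        ["gs", "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Print the two pieces of information a bug report needs.
    print("Ghostscript version:", ghostscript_version())
    print("Command line:", " ".join(build_gs_command("a.pdf", "a.jpg")))
```

Running Ghostscript directly with the printed command line, rather than through another tool, makes it possible to tell whether a rendering problem comes from Ghostscript itself.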
Comment 2 Erez 2018-08-17 08:31:07 UTC
Created attachment 15479 [details]
The input

See attachment
Comment 3 Erez 2018-08-17 08:39:46 UTC
The input in shell is just

convert a.pdf a.jpg

I am using 
GPL Ghostscript 9.07 (2013-02-14)
Comment 4 Ken Sharp 2018-08-17 09:06:17 UTC
(In reply to Erez from comment #3)
> The input in shell is just
> 
> convert a.pdf a.jpg

convert is not Ghostscript; it is ImageMagick. When attempting to reproduce a bug we need to know the settings we should use for Ghostscript. Without them, the probability of being able to reproduce (and therefore fix) the bug is sharply reduced.

You can probably find out what command IM is using, but I can't tell you how to do so.
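
One possible way to see which Ghostscript commands ImageMagick is configured to use is its delegate listing (`convert -list delegate`). The following Python sketch filters that listing for entries that select a Ghostscript device. It is an illustration under stated assumptions: the helper names are hypothetical, the executable is assumed to be called `convert`, and the filter assumes that Ghostscript delegate entries contain the `-sDEVICE=` switch, which may not hold for every ImageMagick version or configuration.

```python
import shutil
import subprocess


def ghostscript_delegate_lines(listing):
    """Return the lines of a delegate listing that select a Ghostscript device.

    Assumption: such entries contain the Ghostscript switch "-sDEVICE=".
    The exact layout of the listing may differ between ImageMagick versions.
    """
    return [line.strip() for line in listing.splitlines() if "-sDEVICE=" in line]


def list_delegates(executable="convert"):
    """Return the output of `<executable> -list delegate`, or None if missing."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "-list", "delegate"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    listing = list_delegates()
    if listing is None:
        print("ImageMagick 'convert' not found on PATH")
    else:
        for line in ghostscript_delegate_lines(listing):
            print(line)
```

The printed entries are templates with placeholders, so they still need the actual input and output file names filled in before they can be quoted as a Ghostscript command line.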

> I am using 
> GPL Ghostscript 9.07 (2013-02-14)

OK, well, that's a seriously old (> 5 years) version of Ghostscript. You should probably upgrade anyway, and you should really test the latest release before posting a bug report.
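
The "> 5 years" figure follows from the release date in the version banner quoted above and the date of this report (2018-08-17). A minimal Python sketch of that arithmetic is shown here; the helper names `parse_banner` and `age_in_years` and the regular expression are illustrative assumptions and are not part of Ghostscript.

```python
import re
from datetime import date

# Matches banners such as "GPL Ghostscript 9.07 (2013-02-14)".
BANNER_RE = re.compile(
    r"Ghostscript (\d+\.\d+(?:\.\d+)?) \((\d{4})-(\d{2})-(\d{2})\)"
)


def parse_banner(banner):
    """Extract (version, release_date) from a Ghostscript banner line."""
    match = BANNER_RE.search(banner)
    if match is None:
        raise ValueError(f"unrecognised banner: {banner!r}")
    version, year, month, day = match.groups()
    return version, date(int(year), int(month), int(day))


def age_in_years(release_date, today):
    """Approximate age in years, using 365.25 days per year."""
    return (today - release_date).days / 365.25


if __name__ == "__main__":
    version, released = parse_banner("GPL Ghostscript 9.07 (2013-02-14)")
    # 2018-08-17 is the date on which this bug was reported.
    print(version, round(age_in_years(released, date(2018, 8, 17)), 1))
```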
Comment 5 Ken Sharp 2018-08-17 09:07:42 UTC
So, checking with a version of 9.07 (>5 years old) I see that Ghostscript is unable to read the embedded CIDFont.

Current code, however, processes the file without complaint, and the output matches the Acrobat view.

So this has been addressed sometime in the past 5 years. You need to upgrade.
Comment 6 Erez 2018-08-17 09:09:31 UTC
After upgrading the issue is fixed. Thanks
Comment 7 namhoa19772003 2019-06-13 08:28:10 UTC
Created attachment 17667 [details]
File (Microsoft Print To PDF)
Comment 8 namhoa19772003 2019-06-13 08:28:47 UTC
Created attachment 17668 [details]
Convert result
Comment 9 namhoa19772003 2019-06-13 08:29:16 UTC
I use Ghostscript 9.27 and ImageMagick 7.0.8-49 on CentOS 7. But the convert result is not good. Please see the attached image.
Comment 10 Ken Sharp 2019-06-13 08:31:43 UTC
(In reply to namhoa19772003 from comment #9)
> I use Ghostscript 9.27 and ImageMagick 7.0.8-49 on CentOS 7. But the convert
> result is not good. Please see the attached image.

This bug report has been closed. If you think you have found a new bug then open a new report.
Comment 11 Ken Sharp 2019-06-13 08:32:26 UTC
Note that we cannot help you with ImageMagick convert, you must supply a *Ghostscript* command line.
Comment 12 namhoa19772003 2019-06-13 13:35:24 UTC
Thanks