Bug 700471 - PDF to JPG: result with wrong characters
Status: RESOLVED INVALID
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: 9.26
Hardware: PC Windows 10
Importance: P4 normal
Assignee: Ken Sharp
QA Contact: Bug traffic
 
Reported: 2019-01-09 17:27 UTC by Guilllaume
Modified: 2019-01-20 11:20 UTC


Description Guilllaume 2019-01-09 17:27:02 UTC
When converting some PDF files to JPG with GS 9.26 or ImageMagick 6.9.7-4, many JPG files are created (more than the number of pages in the PDF) and the content of each is not good. There are weird characters and unknown text like this:

%PDF-1.7
%???
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(fr-FR) /MetaData 15 0 R/ViewerPreferences 16 0 R>>
endobj
2 0 obj

This problem only happens on my production server (locally, the JPG files are created normally). And only for some PDF files but I do not know which ones.
One that does not work is a Word file (.docx) saved as PDF on an Apple Mac.

ImageMagick command: convert -density 300 -background white -alpha remove -limit memory 1 -limit map 1 -quality 100 $pdf_path $jpg_path

Is it an encoding problem? Why does it happen only on the production server and not locally?
Comment 1 Ken Sharp 2019-01-09 17:38:51 UTC
(In reply to Guilllaume from comment #0)

Please do not modify the Importance field, that's for our assessment. I can see no reason to regard this as a JPX/JBIG2 problem.


> content of each is not good. There are weird characters and unknown text
> like this:
> 
> %PDF-1.7
> %???
> 1 0 obj
> <</Type/Catalog/Pages 2 0 R/Lang(fr-FR) /MetaData 15 0 R/ViewerPreferences
> 16 0 R>>
> endobj
> 2 0 obj

There's nothing wrong with that; it's the header of a PDF file. There are no 'weird characters' or 'unknown text' there.

 
> This problem only happens on my production server (locally, the JPG files
> are created normally). And only for some PDF files but I do not know which
> ones.
> One that does not work is a Word file (.docx) saved as PDF on an Apple Mac.

You are going to have to supply us with a file, preferably a simple one, that demonstrates the problem. We cannot work blind.

 
> ImageMagick command: convert -density 300 -background white -alpha remove
> -limit memory 1 -limit map 1 -quality 100 $pdf_path $jpg_path

We can't use an ImageMagick command line; you must supply a Ghostscript command line.
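
As an illustration only, a Ghostscript command line roughly comparable to the ImageMagick one quoted above could be assembled as in the sketch below. The switches used (-sDEVICE=jpeg, -r, -dJPEGQ, -sOutputFile, -dNOPAUSE, -dBATCH, -dSAFER) are standard Ghostscript options. The helper name build_gs_command, the file names, and the choice of 300 dpi and quality 100 (mirroring -density 300 and -quality 100) are assumptions, not something taken from the report. On Windows the console executable is usually gswin64c rather than gs.

```python
import shlex


def build_gs_command(pdf_path, jpg_pattern, dpi=300, quality=100, gs_exe="gs"):
    """Return an argument list that renders a PDF to JPEG files with Ghostscript.

    build_gs_command is a hypothetical helper; only the switches are Ghostscript's.
    """
    return [
        gs_exe,                           # "gswin64c" on 64-bit Windows
        "-dNOPAUSE",                      # do not pause after each page
        "-dBATCH",                        # exit when the input is processed
        "-dSAFER",                        # restrict file access
        "-sDEVICE=jpeg",                  # JPEG output device
        f"-r{dpi}",                       # resolution in dpi
        f"-dJPEGQ={quality}",             # JPEG quality, 0-100
        f"-sOutputFile={jpg_pattern}",    # %03d is replaced by the page number
        pdf_path,
    ]


# Placeholder file names, assumed for illustration.
cmd = build_gs_command("input.pdf", "page-%03d.jpg")
print(shlex.join(cmd))
```

Running the printed command directly, without ImageMagick in between, would show whether Ghostscript itself produces the bad output.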

 
> Is it an encoding problem?

There is no possible way to tell from the information supplied. No example file, no output JPEG, nothing really.


> Why does it happen only on the production server
> and not locally?

How should I know?

Presumably there is some difference between the configuration or build of the versions of Ghostscript you are using. You need to check the versions of Ghostscript, the command line being used, and the configuration (including environment variables). If either system is Linux (as opposed to Windows 10 as stated), you also need to determine the source of the Ghostscript you are using. Distributions often modify the Ghostscript build, so you need to make sure you are using the same source and build options on each system.
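
One possible way to gather that information on each system is sketched below; running it on both machines and comparing the output would show differences in version, executable location, or environment. The function name ghostscript_report and the list of candidate executable names are assumptions. GS_LIB, GS_OPTIONS, GS_FONTPATH and GS_DEVICE are environment variables that Ghostscript documents, and `--version` prints the version number.

```python
import os
import platform
import shutil
import subprocess

# Environment variables that Ghostscript consults at startup.
GS_ENV_VARS = ("GS_LIB", "GS_OPTIONS", "GS_FONTPATH", "GS_DEVICE")


def ghostscript_report(candidates=("gs", "gswin64c", "gswin32c")):
    """Collect basic facts about the Ghostscript found on PATH (hypothetical helper)."""
    report = {"platform": platform.platform(), "executable": None, "version": None}
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue
        report["executable"] = path
        try:
            result = subprocess.run(
                [path, "--version"], capture_output=True, text=True, timeout=30
            )
            report["version"] = result.stdout.strip()
        except (OSError, subprocess.SubprocessError):
            pass  # executable found but could not be run; leave version as None
        break  # only the first executable found is reported
    report["env"] = {var: os.environ.get(var) for var in GS_ENV_VARS}
    return report


for key, value in ghostscript_report().items():
    print(f"{key}: {value}")
```

This does not reveal build options or distribution patches; for a packaged Linux build, the package manager's metadata is the place to look for the source.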
Comment 2 Ken Sharp 2019-01-20 11:20:15 UTC
No specimen file supplied, no further contact from reporter, closing.