Bug 707726 - Error in reading characters when converting from pdf to image
Summary: Error in reading characters when converting from pdf to image
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Writer
Version: 10.03.0
Hardware: PC Windows 10
Importance: P2 normal
Assignee: Chris Liddell (chrisl)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-08 18:10 UTC by Kieu Tra
Modified: 2024-04-17 08:14 UTC
CC List: 2 users

See Also:
Customer:
Word Size: ---


Attachments
File pdf to convert to images have problem with some sentences (268.25 KB, application/pdf)
2024-04-08 18:10 UTC, Kieu Tra
Image converted (12.63 KB, image/tiff)
2024-04-08 19:20 UTC, Kieu Tra

Description Kieu Tra 2024-04-08 18:10:23 UTC
Created attachment 25583
File pdf to convert to images have problem with some sentences

Hello,
I am using Ghostscript version 10.03 to convert my PDF to an image. Some sentences are rendered incorrectly in this version, although they are rendered correctly with version 9.54.
My parameters are as follows:
-dNOPAUSE
-dBATCH
-dSAFER
-sDEVICE=jpeggray
-r200
-dJPEGQ=100
-dAntiAliasGrayImage=false
-dAntiAliasMonoImage=false
-dAutoFilterColorImages=false
-dAutoFilterGrayImages=false
-dDownsampleColorImages=false
-dDownsampleGrayImages=false
-dDownsampleMonoImages=false
-dColorConversionStrategy=/LeaveColorUnchanged
-dConvertCMYKImagesToRGB=false
-dConvertImagesToIndexed=false
-dPreserveHalftoneInfo=true 
-dPreserveOPIComments=true
-dCompatibilityLevel=1.7


Thanks for help.
Tra
Comment 1 Ken Sharp 2024-04-08 18:26:50 UTC
None of these switches:

-dAntiAliasGrayImage=false
-dAntiAliasMonoImage=false
-dAutoFilterColorImages=false
-dAutoFilterGrayImages=false
-dDownsampleColorImages=false
-dDownsampleGrayImages=false
-dDownsampleMonoImages=false
-dColorConversionStrategy=/LeaveColorUnchanged
-dConvertCMYKImagesToRGB=false
-dConvertImagesToIndexed=false
-dPreserveHalftoneInfo=true 
-dPreserveOPIComments=true
-dCompatibilityLevel=1.7

will have any effect whatsoever with any device except the pdfwrite device.

-dSAFER is the default so you don't need that.

It would be helpful to be clear about exactly which text you see as a problem; what I see is that the (artificially emboldened) bold text has capital Ecircumflex glyphs drawn in place of spaces.
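
As an illustration of the point about the switches, a trimmed command line might look like the sketch below. It keeps only the switches from the report that are not called out above: the pdfwrite-only switches and -dSAFER are dropped. The file names input.pdf and page-%03d.jpg are assumptions, since the report does not show the input or output arguments, and the executable is written as gs (on Windows the console executable is usually gswin64c).

```shell
#!/bin/sh
# Hypothetical file names (not given in the report); substitute your own.
INPUT="input.pdf"
OUTPUT="page-%03d.jpg"   # %03d is replaced by the page number

# Switches kept from the original report: batch mode, device, resolution,
# and JPEG quality. The pdfwrite-only switches and -dSAFER are omitted.
GS_CMD="gs -dNOPAUSE -dBATCH -sDEVICE=jpeggray -r200 -dJPEGQ=100 -sOutputFile=$OUTPUT $INPUT"

# Show the command that would be run.
echo "$GS_CMD"

# Run it only when Ghostscript and the input file are actually present.
if command -v gs >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $GS_CMD
fi
```

This only shortens the command line; it is not expected to change the rendered output, so the glyph problem described in this bug would still appear with an unpatched 10.03.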
Comment 2 Kieu Tra 2024-04-08 19:20:53 UTC
Created attachment 25584
Image converted

I have attached the image converted from the PDF, which shows the incorrectly rendered words.
Comment 3 Kieu Tra 2024-04-08 19:22:20 UTC
Hi, thanks for your reply.
I have added the converted image as an attachment.
Best,
Tra
Comment 4 Chris Liddell (chrisl) 2024-04-16 15:47:14 UTC
Fixed in:

https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=1a5b48e1e295
Comment 5 Kieu Tra 2024-04-16 18:13:19 UTC
(In reply to Chris Liddell (chrisl) from comment #4)
> Fixed in:
> 
> https://git.ghostscript.com/?p=ghostpdl.git;a=commitdiff;h=1a5b48e1e295

Hello Chris Liddell,

Thanks for your work.

Would you please release a new version or patch so that I can update my application?

Sincerely,
Tra KIEU
Comment 6 Chris Liddell (chrisl) 2024-04-17 08:14:57 UTC
The source patch is linked above.

The next scheduled release will be in September.