Bug 693324 - Ghost script 9.04 changes characters on merging two pdf files.
Summary: Ghost script 9.04 changes characters on merging two pdf files.
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Writer
Version: 9.04
Hardware: Macintosh MacOS X
Importance: P1 normal
Assignee: Ken Sharp
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-11 18:15 UTC by scott.matsubara
Modified: 2014-02-17 04:43 UTC

See Also:
Customer: 535
Word Size: ---


Attachments
test files and the merge result (80.17 KB, application/zip)
2012-09-11 18:15 UTC, scott.matsubara
Our test procedure (29.50 KB, application/msword)
2012-09-17 18:54 UTC, scott.matsubara
Our test data (191.54 KB, application/zip)
2012-09-17 18:55 UTC, scott.matsubara
output from current Ghostscript (80.50 KB, application/octet-stream)
2012-09-18 07:34 UTC, Ken Sharp
second set of files (204.40 KB, application/octet-stream)
2012-09-18 09:14 UTC, Ken Sharp
ps&pdf after patch (440.89 KB, application/zip)
2012-09-20 17:37 UTC, scott.matsubara
Build error in GS on Mac (42.00 KB, application/msword)
2012-09-21 21:06 UTC, scott.matsubara
side effect description and query (32.00 KB, application/msword)
2012-10-04 18:00 UTC, scott.matsubara

Description scott.matsubara 2012-09-11 18:15:18 UTC
Created attachment 8924 [details]
test files and the merge result

Ghost script 9.04 changes characters on merging two pdf files.

Attached are the two input PDF files and the merged PDF file.

In the merged file, the character 8 from the original input file is changed to 5 (page no. 8).

The command used is
gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=merged.pdf 9.pdf 8.pdf

When the same command is used to merge the PS files, the merged file is fine and displays the characters correctly.

Please find the attached ps files also in the zip file.

Note: The earlier version 8.54 works fine in both of the above cases.

This bug is also observed in later versions of Ghostscript - 9.05 & 9.06

Kindly provide the solution for this or the date when this issue will be fixed.
Comment 1 Ken Sharp 2012-09-12 08:30:49 UTC
Changed priority for customer issue, and reset the importance to normal. I'm not absolutely certain this is the correct customer number though.
Comment 2 Ken Sharp 2012-09-13 13:59:59 UTC
Basically, this is a 'name collision'. Both PDF files contain subset fonts with the same prefixes, but *different* contents. pdfwrite thinks they are the same (because they have the same name) and so only embeds one copy.

This is always a potential problem.

I have extended the heuristic used to generate the subset prefix so that it is less likely that such collisions will occur in future, but it can't be guaranteed.
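
To illustrate the idea of a content-dependent subset prefix, here is a minimal Python sketch. It is not the code from the commit and not Ghostscript's actual heuristic; the function names `subset_prefix` and `subset_font_name` and the choice of MD5 are assumptions made for the example. The only fixed point is the PDF convention that a subset font name is six uppercase letters, a plus sign, then the base font name.

```python
import hashlib
import string


def subset_prefix(font_stream: bytes) -> str:
    """Derive a six-letter subset tag from the font data itself.

    Hypothetical helper: hashing the font stream means two subsets with
    different contents are very unlikely to get the same tag, although a
    collision can never be ruled out entirely.
    """
    digest = hashlib.md5(font_stream).digest()
    value = int.from_bytes(digest, "big")
    letters = []
    for _ in range(6):
        # Peel off one base-26 digit per letter of the tag.
        value, remainder = divmod(value, 26)
        letters.append(string.ascii_uppercase[remainder])
    return "".join(letters)


def subset_font_name(base_name: str, font_stream: bytes) -> str:
    """Build a name of the form 'ABCDEF+BaseName' (hypothetical helper)."""
    return f"{subset_prefix(font_stream)}+{base_name}"
```

With a scheme along these lines, the same font data always maps to the same prefix, while two subsets of the same base font that contain different glyphs would normally receive different prefixes, so a merge step that matches fonts by name would no longer treat them as one font.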

commit: 44d00dd1bd34e2fb735d4682b73d880e208f92bd contains the patch for this.

Please note that the separate PDF files will have to be regenerated from the PostScript. The separate PDF files supplied are beyond repair, you must generate fresh ones using the patch referenced above.
Comment 3 scott.matsubara 2012-09-17 18:54:16 UTC
Created attachment 8940 [details]
Our test procedure
Comment 4 scott.matsubara 2012-09-17 18:55:14 UTC
Created attachment 8941 [details]
Our test data
Comment 5 scott.matsubara 2012-09-17 19:04:04 UTC
Thank you very much for your quick support.

We downloaded the source files containing the patch, replaced the corresponding files in 9.06, rebuilt, deployed, and verified by creating new PDF files from the PS files.
However, the behavior is unchanged (8 changed to 5, 9 changed to 6).

Can you please check whether the steps we followed are correct?  Also, could you confirm the fix with the files we provided?  Please refer to the attached files for the details.

Thanks in advance.
Comment 6 Ken Sharp 2012-09-18 07:34:25 UTC
Created attachment 8943 [details]
output from current Ghostscript

This issue is resolved from an engineering standpoint. Normally support would test the patch on the relevant older version of Ghostscript and if necessary on a different operating system. I am unable to test this on a Mac.

I have run the tests again, and I see no problems running with the current HEAD version of Ghostscript on a Windows platform. The output files are in the attached Zip archive.
Comment 7 Ken Sharp 2012-09-18 09:14:22 UTC
Created attachment 8944 [details]
second set of files

And here are the new files run through the current HEAD revision of Ghostscript. I have looked at the output and am unable to see any problems, though of course I have no idea what I'm supposed to be looking for.
Comment 8 scott.matsubara 2012-09-18 09:17:07 UTC
This is a Mac specific issue.  On Windows, we have no problem with the same version of GS.
Please test on a Mac.
Comment 9 Ken Sharp 2012-09-18 09:21:47 UTC
(In reply to comment #8)
> This is a Mac specific issue.  On Windows, we have no problem with the same
> version of GS.
> Please test on a Mac.

I am unable to test on a Mac, I'll have to leave it to Marcos.
Comment 10 Ken Sharp 2012-09-18 19:03:35 UTC
(In reply to comment #8)
> This is a Mac specific issue.  On Windows, we have no problem with the same
> version of GS.
> Please test on a Mac.

This has now been tried on a Mac without seeing a problem either. Please upload the intermediate and final PDF files as produced on your Macintosh.
Comment 11 scott.matsubara 2012-09-18 21:49:27 UTC
> This has now been tried on a Mac without seeing a problem either. Please upload
> the intermediate and final PDF files as produced on your Macintosh.

They're already included in the attachment that we sent to you on 9/11.
    8.pdf  -  2nd file to merge
    9.pdf  -  1st file to merge
    8test.ps  -  Intermediate PS file from 8.pdf
    9test.ps  -  Intermediate PS file from 9.pdf
    merged.pdf  -  Final merged PDF which has the problem.  (8 -> 5)

Please let me know what else you need from us.
Comment 12 Ken Sharp 2012-09-19 07:05:47 UTC
(In reply to comment #11)
> > This has now been tried on a Mac without seeing a problem either. Please upload
> > the intermediate and final PDF files as produced on your Macintosh.
> 
> They're already included in the attachment that we sent to you on 9/11.

I want the versions produced by your Macintosh build of Ghostscript *after* applying the patch to resolve the problems. This patch was not committed until the 13th September, so the files from the 11th are not useful.

You also have not supplied the PDF files for 'case 2' at all.


> Please let me know what else you need from us.

Please send the intermediate PDF files, created by the Mac build *with* the patch from the 13th applied, for all 4 input PostScript files.
Comment 13 scott.matsubara 2012-09-20 17:37:24 UTC
Created attachment 8959 [details]
ps&pdf after patch

Please find attached the intermediate files produced after the patch was applied.
Comment 14 Ken Sharp 2012-09-21 07:26:02 UTC
(In reply to comment #13)
> Created an attachment (id=8959) [details]
> ps&pdf after patch
> 
> Please find attached the intermediate files produced after the patch was applied.

The files 8.pdf and 9.pdf contain fonts with the same name and subset prefix, just as the examples before the commit. Since the code should now be using the content of the font stream to affect the hash, this should not occur (or at least be very unlikely).

Indeed, for us, it does not occur on Windows, Linux or Macintosh.

I notice that the Producer is stated as Ghostscript 9.06, not the pre-release 9.07, so this was not produced using the most up to date source code.
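
A rough way to reproduce this kind of check on a pair of intermediate files is sketched below in Python. It is only a sketch under stated assumptions: it scans the raw bytes with regular expressions, so it only works when the font dictionaries and the document information dictionary are stored uncompressed and the Producer is a plain literal string without nested parentheses. The function names are hypothetical and are not part of Ghostscript.

```python
import re
from typing import Optional, Set

# A subset font name is six uppercase letters, '+', then the base name.
SUBSET_NAME = re.compile(
    rb"/(?:BaseFont|FontName)\s*/([A-Z]{6}\+[^\s/<>\[\]()]+)"
)
# Assumes /Producer is a literal string with no nested parentheses.
PRODUCER = re.compile(rb"/Producer\s*\(([^)]*)\)")


def subset_font_names(pdf_bytes: bytes) -> Set[str]:
    """Collect every subset font name found in the raw PDF bytes."""
    return {m.decode("latin-1") for m in SUBSET_NAME.findall(pdf_bytes)}


def producer(pdf_bytes: bytes) -> Optional[str]:
    """Return the first /Producer string found, or None if there is none."""
    match = PRODUCER.search(pdf_bytes)
    return match.group(1).decode("latin-1") if match else None


def colliding_names(first: bytes, second: bytes) -> Set[str]:
    """Subset font names (prefix included) that appear in both files."""
    return subset_font_names(first) & subset_font_names(second)
```

Reading 8.pdf and 9.pdf in binary mode and passing their contents to `colliding_names` would list any font names shared by both files, and `producer` would show which Ghostscript version wrote each file. A shared name does not by itself prove that the font contents differ; it only flags candidates for the collision described in comment 2.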

I hope to be in a position to test this further on a Macintosh at some point soon, but since this works on a Mac for one of my colleagues, at the moment the most likely explanation would seem to be some error in applying the patch. I would suggest taking a snapshot from the Git repository and trying that. Go to :

http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=44d00dd1bd34e2fb735d4682b73d880e208f92bd

and press the 'snapshot' link to retrieve a copy of the code at the time this change was committed.
Comment 15 scott.matsubara 2012-09-21 21:06:24 UTC
Created attachment 8961 [details]
Build error in GS on Mac
Comment 16 scott.matsubara 2012-09-21 21:08:33 UTC
(In reply to comment #14)
> most likely explanation would seem to be some error in applying the patch. 

Okay...
We were unable to build the source downloaded from the link mentioned by Artifex.
Please refer to the details in the attachment.  Thanks.
Comment 17 Chris Liddell (chrisl) 2012-09-21 22:16:45 UTC
The "configure" script is created as part of the release process, and the snapshot download is not a "release" archive, it is purely a snapshot of our in-development code from our source control repository.

For such a snapshot, you will find the Ghostscript source in the "ghostpdl-44d00dd/gs" directory, and in that directory is a script called "autogen.sh" (note: take care, there is a similar script in the top level directory for the PCL and XPS builds).

Running the autogen.sh script in ghostpdl-44d00dd/gs will create and run the configure script for you. You must have autoconf installed.
Comment 18 Ken Sharp 2012-09-25 09:51:56 UTC
I have now set up a Mac test environment, downloaded and built the latest source code for Ghostscript, and retested all the files sent.

For me these now work as expected; I do not see any problems. It is, of course, possible that I am missing the fault, especially for the second set of files, as we still don't know what the perceived problem is with this set. The described problem with 8.ps and 9.ps definitely does not occur for me.

Did you manage to build the current source and test it ? Are you still experiencing a problem ?
Comment 19 scott.matsubara 2012-09-25 18:58:42 UTC
With the latest patch provided by Artifex, we are able to build and verify that the issue is fixed.

We'll integrate it into our product and test further.  If we find any issue or side effect, we'll let you know, but otherwise, we can close.

Thank you very much for your great support!
Comment 20 scott.matsubara 2012-10-04 18:00:29 UTC
Created attachment 8974 [details]
side effect description and query

The originally reported issue (693324) is fixed, but during further testing, we found some issues which look to be a side effect (it was working fine before).  Please find attached more detailed information.

There seems to be an issue with hyphens in file names (single byte, as well as double byte).
Comment 21 Ken Sharp 2012-10-05 03:39:22 UTC
(In reply to comment #20)
> Created an attachment (id=8974) [details]
> side effect description and query
> 
> The originally reported issue (693324) is fixed, but during further testing, we
> found some issues which look to be a side effect (it was working fine before).
>  Please find attached more detailed information.

Could we have a new bug report for a new problem please ?
Comment 22 Ken Sharp 2012-10-05 03:50:12 UTC
Also, please attach the relevant input files and supply a command line. I have to say that these do *not* sound like pdfwrite problems. Most likely any change is due to filename processing: Ghostscript (on Windows) does not handle Unicode filenames, though this has recently been altered.
Comment 23 scott.matsubara 2012-10-05 04:13:36 UTC
> Could we have a new bug report for a new problem please ?

Yes, I was wondering if I should do so...  I am closing this one and have registered a new one as 693367.