Bug 696424 - Improve Appearance synthesis of variable text fields in AcroForms
Status: RESOLVED WONTFIX
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: 9.18
Hardware: PC Windows 7
Importance: P4 enhancement
Assignee: Ken Sharp
Reported: 2015-11-23 01:11 UTC by Björn Palmqvist
Modified: 2015-11-24 06:29 UTC

Attachments
Crappy pdf (32.88 KB, application/pdf)
2015-11-23 01:11 UTC, Björn Palmqvist

Description Björn Palmqvist 2015-11-23 01:11:53 UTC
Created attachment 12148
Crappy pdf

When I'm trying to convert a pdf with editable text fields that was saved with Quartz PDFContext (Mac OS X previewer), to PNG I get warnings about the font with the following log messages (Using GS 9.18):

   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.
   **** Warning: Tf refers to an unknown resource name: Helvetica Assuming it's a font name.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> Mac OS X 10.11.1 Quartz PDFContext <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

The repair also fails, so some of the text in the document is missing or displayed incorrectly. If I open the pdf in Acrobat I can save a new repaired copy of it that will be converted as expected.

We are using imagemagick on top of GS, but as far as I can tell it is the PDF to image conversion that fails.

I attach a test pdf that has this problem.
Comment 1 Ken Sharp 2015-11-23 01:37:36 UTC
(In reply to Björn Palmqvist from comment #0)

> When I'm trying to convert a pdf

What do you mean by 'convert'? You have not supplied a GS command line, so it's not clear at all.


> with Quartz PDFContext (Mac OS X previewer), to PNG I get warnings about the
> font with the following log messages (Using GS 9.18):

That's because the font isn't defined. Note that when you open this file with Acrobat, and then close it, Acrobat offers to 'save the changes'. That's telling you that Acrobat silently modified the file, fixing this problem.
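
To illustrate what "the font isn't defined" means here, the following is a minimal Python sketch of the kind of fallback the warning text describes. It is not Ghostscript's actual code: the function name `resolve_tf_operand`, the plain `dict` standing in for a PDF resource dictionary, and the returned `{"BaseFont": ...}` placeholder are all assumptions made for the example.

```python
def resolve_tf_operand(name, resources):
    """Look up the name given to the Tf operator among the font resources.

    `resources` is a hypothetical dict standing in for a PDF resource
    dictionary. Returns a (font, warning) pair; warning is None when the
    name was found.
    """
    fonts = resources.get("Font", {})
    if name in fonts:
        # Normal case: the name is a key in the Font resource dictionary.
        return fonts[name], None
    # Fallback suggested by the warning text: treat the resource name
    # itself as the name of a font.
    warning = (
        "Tf refers to an unknown resource name: "
        f"{name} Assuming it's a font name."
    )
    return {"BaseFont": name}, warning
```

In this sketch, calling `resolve_tf_operand("Helvetica", {})` takes the fallback branch, which corresponds to the repeated warning in the log above.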

>    **** The file was produced by:
>    **** >>>> Mac OS X 10.11.1 Quartz PDFContext <<<<
>    **** Please notify the author of the software that produced this
>    **** file that it does not conform to Adobe's published PDF
>    **** specification.
> 
> The repair also fails, so some of the text in the document is missing or
> displayed incorrectly.

The key part of the message is ">    **** This file had errors that were repaired or ignored."

Your file is broken. We may or may not be able to do something about it; if we can't, then we ignore bits of it.


> We are using imagemagick on top of GS, but as far as I can tell it is the PDF
> to image conversion that fails.
> 
> I attach a test pdf that has this problem.

Fundamentally you should send this PDF file to the publisher of the software that created it (which could be Apple or Adobe, or possibly something else; I don't know what was used to fill in the fields in the AcroForm) and get them to fix their software. The basic problem is that the PDF file is invalid.

I will see if there's anything we can do to improve our rendering of the content.
Comment 2 Björn Palmqvist 2015-11-23 01:52:12 UTC
Sorry for the missing command.

We are using convert from imagemagick in the normal case to convert the pdf to png. Since the problem is in gs, I get the same appearance in the PNG if I use the following command:

gswin64 -dBATCH -dNOPAUSE -sDEVICE=png16m -r200 -sOutputFile="converted_%03d.png" statement-of-purpose_builtin.pdf

I get the same result on a Linux system as well.
Comment 3 Ken Sharp 2015-11-24 06:29:23 UTC
The text annotation is a multiline text annotation, but the actual text consists of a single line; there are no '\n' or '\r' characters in it. The annotation additionally has no Appearance stream.

In the absence of appearance streams for annotations Ghostscript will attempt to synthesise one. When synthesising an appearance for a multiline text annotation, we split the string data on \r or \n characters, and then fit the strings into the available rectangle.

The 'fit' is performed by starting at the bottom and drawing the last text string and then moving up, drawing the second to last string and so on until we reach the first part of the text. This distributes the text evenly vertically.

Because the data does not contain any newline characters, we can't split it, and so it's drawn as a single line of text.
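
The split-and-fit behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch only, not Ghostscript's implementation. The function names, the fixed `line_height` step, and the choice to treat `\r\n` as a single break are assumptions made for the example.

```python
import re


def split_field_text(text):
    """Split a field value on line breaks.

    Assumption: a \r\n pair counts as one break, so it does not
    produce an empty line between the two characters.
    """
    return re.split(r"\r\n|\r|\n", text)


def layout_bottom_up(text, rect_bottom, line_height):
    """Return (baseline_y, line) pairs in reading order.

    The last line is placed at rect_bottom; each earlier line is placed
    one line_height higher, mirroring the bottom-up fit described above.
    line_height is a hypothetical fixed step chosen for this sketch.
    """
    lines = split_field_text(text)
    placed = []
    y = rect_bottom
    for line in reversed(lines):
        placed.append((y, line))
        y += line_height
    placed.reverse()
    return placed
```

With a value such as `"a\nb\nc"`, a bottom of 0 and a step of 12, the sketch places `c` at 0, `b` at 12 and `a` at 24. With a value that contains no `\r` or `\n`, the split yields a single string, so everything lands on one baseline, which matches the single line seen in the rendered PNG.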

We don't attempt to synthesise appearances for choice fields at all, which is why that doesn't appear to be set. We *only* support text fields in AcroForms.