Bug 695653 - Ghostscript can't read PDF XFA files
Status: NOTIFIED WONTFIX
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: master
Hardware: PC Linux
Importance: P1 enhancement
Assignee: Ken Sharp
Reported: 2014-10-29 21:13 UTC by Marcos H. Woehrmann
Modified: 2014-11-18 15:02 UTC

Customer: 780
Word Size: ---


Description Marcos H. Woehrmann 2014-10-29 21:13:58 UTC
The attached PDF file is an XFA file written by Adobe LiveCycle.  As such, no software other than Adobe Acrobat can read this file; instead, a message telling you to upgrade to Adobe Reader is rendered as the first page.

The customer would prefer if we generate an error when faced with one of these files, rather than reporting a successful rendering.

The command line I'm using for testing:

  bin/gs -sDEVICE=ppmraw -o test.ppm ./001170_InputMix_Test_FPR.pdf
Comment 2 Ken Sharp 2014-10-30 01:36:26 UTC
(In reply to Marcos H. Woehrmann from comment #0)
> The attached PDF file is an XFA file written by Adobe LiveCycle.  As such, no
> software other than Adobe Acrobat can read this file; instead, a message
> telling you to upgrade to Adobe Reader is rendered as the first page.
> 
> The customer would prefer if we generate an error when faced with one of
> these files, rather than reporting a successful rendering.

Fundamentally, we can't. The file is a valid PDF file which just happens to contain some additional information. Adobe applications are able to use that information to override the PDF content, but that doesn't change the fact that it is a valid PDF file.

The additional information is stored as an annotation in an AcroForm, which we generally don't even read. So unless the customer has enabled AcroForm processing by setting -dShowAcroForm, we won't ever see the annotation which includes the XFA definition.

I'm also against throwing an error on a valid PDF file. I'd consider having Ghostscript emit a warning, but since this would only be possible when -dShowAcroForm is true, I'm doubtful as to the utility of doing so.

On balance, I don't think it's worth it.
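
If a hard failure is wanted without any change to Ghostscript, one option is to pre-screen files before they reach it. The following is only a rough sketch under stated assumptions: it relies on the PDF convention that an XFA form is referenced through an /XFA key in the AcroForm dictionary, and it simply scans the raw bytes for that key. The function name looks_like_xfa is made up for illustration and is not part of Ghostscript. The scan will miss the key if the AcroForm dictionary sits inside a compressed object stream, and it can give a false positive if the same bytes happen to occur inside unrelated data.

```python
import re

# Match the PDF name /XFA only when it is not followed by another regular
# name character, so that longer names such as /XFAData are not matched.
_XFA_KEY = re.compile(rb"/XFA(?![^\s\[\]()<>{}/%])")


def looks_like_xfa(pdf_bytes: bytes) -> bool:
    """Heuristic check: True if the raw bytes contain an /XFA key.

    Hypothetical helper for illustration only. It does not parse the PDF,
    so compressed object streams are not inspected.
    """
    return _XFA_KEY.search(pdf_bytes) is not None


def file_looks_like_xfa(path: str) -> bool:
    """Read the whole file and apply the byte-level heuristic."""
    with open(path, "rb") as handle:
        return looks_like_xfa(handle.read())
```

A wrapper script could call file_looks_like_xfa() on each input and refuse to run gs when it returns True, which gives the error behaviour asked for while leaving Ghostscript's handling of valid PDF files unchanged.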
Comment 3 Marcos H. Woehrmann 2014-11-04 09:12:49 UTC
The customer reports that 'Foxit Reader' (version 7.0.3.916) is able to read the file in the same way that Acrobat does, so now they'd like Ghostscript to do so as well.
Comment 4 Ken Sharp 2014-11-04 10:50:54 UTC
(In reply to Marcos H. Woehrmann from comment #3)
> The customer reports that 'Foxit Reader' (version 7.0.3.916) is able to read
> the file in the same way that Acrobat does, so now they'd like Ghostscript
> to do so as well.

We would have to implement an XML parser and layout engine. XFA is an addition to the PDF spec; it isn't part of it.

As a genuine customer request, this should be considered by Sales & Marketing. This really isn't a bug!