Bug 690043

Summary: Can't Create a pdf file with attachment
Product: Ghostscript Reporter: Xi Wang <xwang14>
Component: PDF Writer Assignee: Ken Sharp <ken.sharp>
Status: RESOLVED FIXED    
Severity: enhancement CC: hauser, pipitas
Priority: P4 Keywords: bountiable
Version: 8.62   
Hardware: PC   
OS: Windows XP   
Customer: Word Size: ---

Description Xi Wang 2008-08-28 22:57:26 UTC
 
Comment 1 Xi Wang 2008-08-28 23:02:57 UTC
I can't use the PDF writer to create a PDF file that contains an attachment.

I've tried creating the PDF file from the command line. It seems ps2pdf doesn't
support the pdfmark needed to EMBED a file in a PDF.

In the PostScript I tried the following to make test1.pdf an attachment of the
output PDF file, but I found that test1.pdf is included in the output file as a
stream rather than as an attachment.


 %% Embed a mono version of PDF log in the main PDF file
/F (C:\\test1.pdf) (r) file def
[/_objdef {fstream} /type /stream /OBJ pdfmark
[{fstream} << /Type /EmbeddedFile >> /PUT pdfmark
[{fstream} << /Company (Schlumberger) >> /PUT pdfmark
[{fstream} F  /PUT pdfmark
[/Name (Mono version PDF log for Black & White SLB printers)  /FS 
<< /Type /Filespec /F (MonoVersion.pdf)
 /EF << /F {fstream} >> >> /EMBED pdfmark
[{fstream} /CLOSE pdfmark
Comment 2 Ken Sharp 2008-08-29 01:41:14 UTC
The pdfwrite device supports only a subset of the pdfmark operations. These
are defined in gdevpdfm.c:

	/* Miscellaneous. */
    {"ANN",          pdfmark_ANN,         PDFMARK_NAMEABLE},
    {"LNK",          pdfmark_LNK,         PDFMARK_NAMEABLE},
    {"OUT",          pdfmark_OUT,         0},
    {"ARTICLE",      pdfmark_ARTICLE,     0},
    {"DEST",         pdfmark_DEST,        PDFMARK_NAMEABLE},
    {"PS",           pdfmark_PS,          PDFMARK_NAMEABLE},
    {"PAGES",        pdfmark_PAGES,       0},
    {"PAGE",         pdfmark_PAGE,        0},
    {"PAGELABEL",    pdfmark_PAGELABEL,   0},
    {"DOCINFO",      pdfmark_DOCINFO,     0},
    {"DOCVIEW",      pdfmark_DOCVIEW,     0},
	/* Named objects. */
    {"BP",           pdfmark_BP,          PDFMARK_NAMEABLE | PDFMARK_TRUECTM},
    {"EP",           pdfmark_EP,          0},
    {"SP",           pdfmark_SP,          PDFMARK_ODD_OK | PDFMARK_KEEP_NAME |
                                          PDFMARK_TRUECTM},
    {"OBJ",          pdfmark_OBJ,         PDFMARK_NAMEABLE},
    {"PUT",          pdfmark_PUT,         PDFMARK_ODD_OK | PDFMARK_KEEP_NAME},
    {".PUTDICT",     pdfmark_PUTDICT,     PDFMARK_ODD_OK | PDFMARK_KEEP_NAME},
    {".PUTINTERVAL", pdfmark_PUTINTERVAL, PDFMARK_ODD_OK | PDFMARK_KEEP_NAME},
    {".PUTSTREAM",   pdfmark_PUTSTREAM,   PDFMARK_ODD_OK | PDFMARK_KEEP_NAME |
                                          PDFMARK_NO_REFS},
    {"APPEND",       pdfmark_APPEND,      PDFMARK_KEEP_NAME},
    {"CLOSE",        pdfmark_CLOSE,       PDFMARK_ODD_OK | PDFMARK_KEEP_NAME},
    {"NamespacePush", pdfmark_NamespacePush, 0},
    {"NamespacePop", pdfmark_NamespacePop, 0},
    {"NI",           pdfmark_NI,          PDFMARK_NAMEABLE},
	/* Marked content. */
    {"MP",           pdfmark_MP,          PDFMARK_ODD_OK},
    {"DP",           pdfmark_DP,          0},
    {"BMC",          pdfmark_BMC,         PDFMARK_ODD_OK},
    {"BDC",          pdfmark_BDC,         0},
    {"EMC",          pdfmark_EMC,         0},
	/* Document structure. */
    {"StRoleMap",    pdfmark_StRoleMap,   0},
    {"StClassMap",   pdfmark_StClassMap,  0},
    {"StPNE",        pdfmark_StPNE,       PDFMARK_NAMEABLE},
    {"StBookmarkRoot", pdfmark_StBookmarkRoot, 0},
    {"StPush",       pdfmark_StPush,       0},
    {"StPop",        pdfmark_StPop,        0},
    {"StPopAll",     pdfmark_StPopAll,     0},
    {"StBMC",        pdfmark_StBMC,        0},
    {"StBDC",        pdfmark_StBDC,        0},
    /* EMC is listed under "Marked content" above. */
    {"StOBJ",        pdfmark_StOBJ,        0},
    {"StAttr",       pdfmark_StAttr,       0},
    {"StStore",      pdfmark_StStore,      0},
    {"StRetrieve",   pdfmark_StRetrieve,   0},
	/* End of list. */
    {0, 0}

EMBED is not supported. This is a limitation, not a bug; I'll leave it to
support to decide whether to make this an enhancement request or close the
issue. (There is already an enhancement request for pdfmarks relating to
structure, #687793.)
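
To make the point about the table concrete, here is a minimal sketch of a name lookup against the mark names listed above. It is an illustration only and is not the lookup code in gdevpdfm.c: the array `supported_marks` and the function `mark_is_listed` are hypothetical names, and the handlers and flags from the real table are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified copy of the table quoted above: names only,
   no handler functions and no PDFMARK_* flags. */
static const char *const supported_marks[] = {
    /* Miscellaneous. */
    "ANN", "LNK", "OUT", "ARTICLE", "DEST", "PS", "PAGES", "PAGE",
    "PAGELABEL", "DOCINFO", "DOCVIEW",
    /* Named objects. */
    "BP", "EP", "SP", "OBJ", "PUT", ".PUTDICT", ".PUTINTERVAL",
    ".PUTSTREAM", "APPEND", "CLOSE", "NamespacePush", "NamespacePop", "NI",
    /* Marked content. */
    "MP", "DP", "BMC", "BDC", "EMC",
    /* Document structure. */
    "StRoleMap", "StClassMap", "StPNE", "StBookmarkRoot", "StPush",
    "StPop", "StPopAll", "StBMC", "StBDC", "StOBJ", "StAttr",
    "StStore", "StRetrieve",
    /* End of list. */
    NULL
};

/* Returns 1 if `name` is in the list, 0 otherwise. The comparison is
   case-sensitive, as PostScript names are. */
int mark_is_listed(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; supported_marks[i] != NULL; i++) {
        if (strcmp(supported_marks[i], name) == 0)
            return 1;
    }
    return 0;
}
```

With this list, `mark_is_listed("ANN")` returns 1 and `mark_is_listed("EMBED")` returns 0, which matches the statement that EMBED has no entry. What pdfwrite actually does with a mark that has no entry is not shown here.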
Comment 3 Xi Wang 2008-08-30 06:39:19 UTC
Thanks a lot for your quick response.
Since this isn't made clear in the help documentation, I struggled for a while
trying to achieve this unsupported function.

I think supporting the functions that Distiller supports will encourage more
people to use gs.
Comment 4 Ray Johnston 2008-09-04 09:40:58 UTC
Maybe someone wants to add this support for us. This is an enhancement.
Comment 5 Ralf Hauser 2011-12-15 08:16:29 UTC
for smart-phones, it would be great if embedded files at least could be read

http://code.google.com/p/apv/issues/detail?id=85
Comment 6 Ken Sharp 2011-12-15 08:38:22 UTC
(In reply to comment #5)
> for smart-phones, it would be great if embedded files at least could be read
> 
> http://code.google.com/p/apv/issues/detail?id=85

This sounds like a completely different problem. The enhancement request here is to *create* PDF files which contain an EmbeddedFile, not to read PDF files which contain them.
Comment 7 Ralf Hauser 2011-12-15 11:07:14 UTC
re comment 6 see Bug 692744
Comment 8 Ken Sharp 2014-01-29 09:35:52 UTC
I believe commit a91d2576df0e60f6e691a3bd967b51109ae41f22 will resolve this.

Comment 5 seems irrelevant since it refers to 'reading' embedded files which is completely different to creating a PDF file with an embedded file. Comment #7 doesn't add any illumination since it refers to a MuPDF bug report.

May need to revisit this in future.
Comment 9 Ken Sharp 2014-01-30 02:02:09 UTC
Unfortunately as I indicated in comment #8 the original commit wasn't quite enough. However, in combination with commit 3e5ae4ea39655643ae352cf4723702a164c10417 I believe this does now work as expected.
Comment 10 pipitas 2014-01-30 04:25:06 UTC
I just compiled Git 3e5ae4ea39655643ae352cf4723702a164c10417 on Mac OS X.

I've been using Ghostscript successfully to embed files into PDFs, albeit using '/ANN pdfmark', not '/EMBED pdfmark'.

Here are two example/template text files which I used for this. The first one is "make-embed-with-ANN.variables":

%%%%%%% make-embed-with-ANN.variables %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  /_var_filetoembed    { (GS9_Color_Management.pdf)                                } def
  /_var_moddate        { (D:20140129230250+01'00')                                 } def
  /_var_description    { (Attached File \(embedded\): GS9_Color_Management.pdf)    } def
  /_var_srcpg          { 133                                                       } def
  /_var_title          { (By: Kurt Pfeifle)                                        } def
  /_var_subj           { (embedding files into PDF with 'pdfmark' and Ghostscript) } def
  /_var_rect           { [ 100    100    210    310 ]                              } def
  /_var_border         { [   0      0      0        ]                              } def
  /_var_color          { [   0.08   0.09   0.75     ]                              } def
  /_var_name           { /Paperclip                                                } def
  /_var_mimetype       { (application/pdf)                                         } def
  /_var_desc           { (Description #optional property)                          } def
  /_var_filename       { (GS9_Color_Management.pdf)                                } def
%%%%%%% end file %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
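
The /_var_moddate value above is a PDF date string of the shape D:YYYYMMDDHHmmSS followed by a UTC offset written as +HH'mm'. As a rough sketch, such a string could be generated as follows; `format_pdf_date` is a hypothetical helper written for this illustration and is not part of Ghostscript.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes a PDF date string such as
   D:20140129230250+01'00' into buf.
   utc_offset_minutes is the local offset from UTC (60 means +01:00).
   Returns the number of characters written, or -1 if buf is too small. */
int format_pdf_date(char *buf, size_t size,
                    int year, int month, int day,
                    int hour, int minute, int second,
                    int utc_offset_minutes)
{
    assert(buf != NULL);
    char sign = utc_offset_minutes < 0 ? '-' : '+';
    int off = abs(utc_offset_minutes);
    int n = snprintf(buf, size, "D:%04d%02d%02d%02d%02d%02d%c%02d'%02d'",
                     year, month, day, hour, minute, second,
                     sign, off / 60, off % 60);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling it with 2014, 1, 29, 23, 2, 50 and an offset of 60 minutes reproduces the string used in the variables file.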

The second one is "make-embed-with-ANN.pdfmark":

%%%%%%% make-embed-with-ANN.pdfmark %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  
  [ /_objdef  {MyStreamName}
    /type     /stream
  /OBJ pdfmark
  [ {MyStreamName}
     << /Type /EmbeddedFile>>
  /PUT pdfmark
  [ {MyStreamName} _var_filetoembed (r) file
  /PUT pdfmark
  [ /Type          /Annot
    /Subtype       /FileAttachment
    /ModDate       _var_moddate
    /Contents      _var_description  % Contents of file attachment as a string
    /SrcPg         _var_srcpg        % Sequence number of page where annotation appears 
    /Subj          _var_subj
    /Title         _var_title
    /Rect          _var_rect
    /Border        _var_border
    /Color         _var_color
    /Name          _var_name
    /FS            << 
                     /Type    /Filespec
                     /EF      << /F {MyStreamName} >>
                     /F       _var_filename
                     /Subtype _var_mimetype             % MIME type of file attachment
                     /Desc    _var_desc
                   >> 
  /ANN pdfmark
  [ {MyStreamName}
  /CLOSE pdfmark
%%%%%%% end file %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

To embed GS9_Color_Management.pdf into itself, creating the output file 'embed-example.pdf', I run:

   gs -o embed-example.pdf           \
      -sDEVICE=pdfwrite              \
       make-embed-with-ANN.variables \
       make-embed-with-ANN.pdfmark   \
       GS9_Color_Management.pdf


After reading about this new feature I tried to come up with two analogous files, "make-embed-with-EMBED.{variables,pdfmark}". However, I was not able to wrap my mind around Adobe's documentation (I used "pdfmarkReference_v9.pdf") well enough to come up with a working solution.

Currently I'm using this pdfmark file (with hard-coded stuff instead of the *.variables file):

%%%%%%% make-embed-with-EMBED.pdfmark %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  
  [ /_objdef {MyStreamName} /type /stream              /OBJ pdfmark
  [ {MyStreamName} << /Type /EmbeddedFile >>           /PUT pdfmark
  [ {MyStreamName} (GS9_Color_Management.pdf) (r) file /PUT pdfmark
  [ /Name (UnicodeUniqueName)                      % e.g., <feff 0041 0073> is Unicode for "As" 
    /FS  
       <<   
         /Type /Filespec
         /F    (gs9_color_management.pdf)
         /EF   << /F {MyStreamName} >>
       >>   
                                                     /EMBED pdfmark
  [ {MyStreamName}                                   /CLOSE pdfmark
%%%%%%% end file %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

with this commandline:

    gs -o embed-test.pdf -sDEVICE=pdfwrite  make-embed-with-EMBED.pdfmark  GS9_Color_Management.pdf

The resulting PDF is incomplete (though the GS command line completes and reports 0 errors or warnings): it lacks the trailer and the xref sections (and maybe more).

It may be my own fault, trying to make use of a buggy self-written pdfmark file.

However, it would be nice if you could provide a working example for the new feature.
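
For anyone wanting to spot this kind of truncated output quickly, here is a minimal sketch of a check for the markers that normally close a PDF file ("startxref" and "%%EOF"). It is only a heuristic over the last bytes of the file and does not validate the xref table itself. The function names and the 1024-byte window are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `needle` occurs within the first `len` bytes of `data`. */
static int contains(const unsigned char *data, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    if (n == 0 || n > len)
        return 0;
    for (size_t i = 0; i + n <= len; i++) {
        if (memcmp(data + i, needle, n) == 0)
            return 1;
    }
    return 0;
}

/* Heuristic: inspects at most the last 1024 bytes of the file contents
   and reports whether both "startxref" and "%%EOF" are present.
   The caller is expected to have read the file into memory. */
int pdf_tail_looks_complete(const unsigned char *data, size_t len)
{
    const size_t tail = 1024;  /* arbitrary window size for this sketch */
    assert(data != NULL || len == 0);
    if (len > tail) {
        data += len - tail;
        len = tail;
    }
    return contains(data, len, "startxref") && contains(data, len, "%%EOF");
}
```

A file that stops after the last object, as described above, would make this return 0.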

-------

The newly built Ghostscript still works with my previous method to embed files ("make-embed-with-ANN.{variables,pdfmark}") -- however, the output PDF metadata now contains "Creator: David M. Jones" and "Title: CMBX12". These two values clearly originate from the MBATIN+CMBX12 font subset used by GS9_Color_Management.pdf. So somehow the recent modifications messed something up...
Comment 11 pipitas 2014-01-30 04:26:24 UTC
Sorry for the line-wraps in my previous comment's code examples...
Comment 12 Ken Sharp 2014-01-30 05:18:43 UTC
Oddly part of my commit disappeared. The content of the repository has 2 pairs of lines in the wrong order.

I've fixed that, and also taken the opportunity to slightly improve the error handling there, so that at least we don't end up writing most of the trailer to a temporary file and discarding it.

It's not really possible to do much more than that with the error handling as the pdf_close routine is structured at the moment, as we need to run the whole routine in order to clean up memory usage (this is particularly important when using %d in the output specification).

It should now at least produce a working PDF file even if the embedded content doesn't work. Testing in progress.
Comment 13 Ken Sharp 2014-01-30 07:36:27 UTC
OK commit ea83541f7fcc1af40cf62f0e7457df74e8b427c4 has the proper ordering of the searches, and the additional error handling.

In terms of an example, I used:

/F (d:\\temp\\so.pdf) (r) file def
[/_objdef {fstream} /type /stream /OBJ pdfmark
[{fstream} << /Type /EmbeddedFile >> /PUT pdfmark
[{fstream} << /Company (Schlumberger) >> /PUT pdfmark
[{fstream} F  /PUT pdfmark
[/Name (Mono version PDF log for Black & White SLB printers)  /FS 
<< /Type /Filespec /F (MonoVersion.pdf)
 /EF << /F {fstream} >> >> /EMBED pdfmark
[{fstream} /CLOSE pdfmark

Obviously from a PostScript file. I have no idea if this will work at all if the input is a PDF file, but I don't see why it wouldn't.
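
One detail in the example above is easy to trip over: inside a PostScript string literal a backslash starts an escape sequence, so a Windows path has to be written with doubled backslashes, as in (d:\\temp\\so.pdf). A program that generates such a pdfmark file could apply the escaping with something like the sketch below. `ps_escape_string` is a hypothetical helper and is not part of Ghostscript. It also escapes parentheses, which is always safe, although balanced parentheses would not strictly need it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, putting a backslash in front
   of every backslash and parenthesis so the result can be placed between
   ( and ) in a PostScript file.
   Returns the length of the escaped string, or -1 if `out` is too small. */
int ps_escape_string(const char *in, char *out, size_t size)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;
    for (; *in != '\0'; in++) {
        int special = (*in == '\\' || *in == '(' || *in == ')');
        /* Keep one byte free for the terminating NUL. */
        if (o + (special ? 2 : 1) >= size)
            return -1;
        if (special)
            out[o++] = '\\';
        out[o++] = *in;
    }
    if (o >= size)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

Given the path d:\temp\so.pdf, this produces d:\\temp\\so.pdf, which is the form used in the /F definition above.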
Comment 14 pipitas 2014-01-30 13:16:01 UTC
Thanks a lot. I rebuilt with the newest Git sources on Mac OS X -- now embedding one PDF file into another works with my method as outlined in comment #10 with the "make-embed-with-EMBED.pdfmark" file.

Will be doing more testing in the next few days...
Comment 15 pipitas 2014-01-30 14:03:58 UTC
At the end of my comment #10, I noted that the current Git build of Ghostscript messes up the metadata info.

This is still the case after the recent fix for EMBED pdfmark support. I'll submit a separate bug report for this.