Bug 706069 - Annotation line break
Summary: Annotation line break
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: General
Version: 9.53.3
Hardware: PC Linux
Importance: P4 enhancement
Assignee: Default assignee
URL:
Keywords:
Duplicates: 706068
Depends on:
Blocks:
 
Reported: 2022-11-09 04:43 UTC by Anand
Modified: 2022-11-11 08:27 UTC
0 users

See Also:
Customer:
Word Size: ---


Attachments
GS annotation (41.73 KB, image/png)
2022-11-09 04:44 UTC, Anand
Firefox output (458.85 KB, image/png)
2022-11-09 04:46 UTC, Anand
Input PDF with annotation (3.24 MB, application/pdf)
2022-11-09 09:05 UTC, Anand
PDF reference 1.7 (30.97 MB, application/pdf)
2022-11-10 02:22 UTC, Anand
Page 2 output (15.78 KB, image/png)
2022-11-10 09:58 UTC, Anand
Page 1 output (42.50 KB, image/png)
2022-11-10 14:59 UTC, Anand

Description Anand 2022-11-09 04:43:35 UTC
The annotation is not rendered with proper line breaks with this PDF
https://www.adobe.com/technology/pdfs/presentations/KingPDFTutorial.pdf .

gs -q -dSAFER -dBATCH -dFirstPage=1 -dLastPage=1 -sOutputFile=out.png -sDEVICE=png256 ~/Documents/KingPDFTutorial.pdf
Comment 1 Anand 2022-11-09 04:44:59 UTC
Created attachment 23467 [details]
GS annotation
Comment 2 Anand 2022-11-09 04:46:05 UTC
Created attachment 23468 [details]
Firefox output
Comment 3 Ken Sharp 2022-11-09 08:39:45 UTC
*** Bug 706068 has been marked as a duplicate of this bug. ***
Comment 4 Ken Sharp 2022-11-09 08:50:55 UTC
Please attach specimen files to this bug tracker, rather than giving a URL. Please also indicate which page number in the file you perceive as being at fault (I'm assuming this is a multiple page file).

It can take time before an engineer can investigate a reported problem, and in that time the URL can go stale (Adobe, for example, rejig their site frequently, which is why we no longer point to the PDF Reference there).

In addition the files attached to Bugzilla form part of our history; in future when investigating a similar problem an engineer may need to look into an old report. If the file is attached here then it is still available for us to investigate, which it may not be otherwise.

Finally.... Annotations may, or may not, have an Appearance which describes how the annotation should appear. Adobe Acrobat often (but not always) ignores these and generates a new appearance itself. In the absence of an Appearance stream PDF consumers may take any action, including drawing nothing.

So it is entirely possible that Ghostscript is behaving as expected.
Comment 5 Anand 2022-11-09 09:05:20 UTC
Created attachment 23469 [details]
Input PDF with annotation

The description provides the command used for generating the images, i.e. page 1.
Comment 6 Ken Sharp 2022-11-09 16:02:46 UTC
The annotation in question is a Popup annotation which is described in the Reference as:

"A pop-up annotation (PDF 1.3) displays text in a pop-up window for entry and editing. It typically does not appear alone but is associated with a markup annotation, its parent annotation, and is used for editing the parent’s text."

Since Ghostscript doesn't allow entry or editing of annotations (or any other interactive behaviour) there isn't really a huge amount of point in displaying these. However.... We've had complaints in the past that we didn't render them, so now we do.

But the Popup annotation continues:

"It has no appearance stream or associated actions of its own"

The Popup in this case is associated with a /Text annotation which does have an appearance stream, but it only draws the little yellow speech bubble.

So there is no appearance stream for the actual text of the annotation anywhere. In the absence of an appearance stream, as noted previously, a PDF consumer may take any action, including none.

We choose to synthesise an appearance and, for historical reasons, the current code matches the old PostScript-based PDF interpreter. Text reflow in PostScript is difficult, and altering the old PostScript program is also difficult. Together the two made it infeasible to do a better job with text which does not fit into the annotation Rect.

However, with the new PDF interpreter written in C we have more options. So I've chosen to enhance the appearance synthesis to cope with this situation by applying (very) primitive text reflow when the supplied text won't fit in the /Rect of the annotation.

Commit 3f42b87dd0f531cffb9bd643f37bee1e8295c2b9 implements that.
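
As an illustration only, primitive reflow of the kind described above can be sketched as a greedy word wrap. This is not the code from the commit: the function name `reflow`, the fixed `char_width` advance per character, and the sample values are all assumptions made for the sketch (a real renderer would measure glyph widths from the font).

```python
def reflow(text, rect_width, char_width):
    """Greedy word wrap: break lines so each one fits within rect_width.

    char_width is a hypothetical fixed advance per character; it stands in
    for real font metrics, which this sketch does not model.
    """
    max_chars = max(1, int(rect_width // char_width))
    lines = []
    # Honour explicit line breaks (CR, LF, or CR LF) already in the text.
    for paragraph in text.replace("\r\n", "\n").replace("\r", "\n").split("\n"):
        current = ""
        for word in paragraph.split():
            # Hard-split a single word that is wider than the rectangle.
            while len(word) > max_chars:
                if current:
                    lines.append(current)
                    current = ""
                lines.append(word[:max_chars])
                word = word[max_chars:]
            candidate = word if not current else current + " " + word
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # The word does not fit: close this line and start a new one.
                lines.append(current)
                current = word
        lines.append(current)
    return lines


# Hypothetical values: a 60-unit-wide Rect and 6 units per character.
wrapped = reflow("the quick brown fox", 60, 6)
```

Greedy wrapping never looks ahead, so line lengths can be uneven, which is one reason such output would not be expected to match another viewer's layout exactly.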

Note that the appearance still does not precisely match Acrobat or Firefox for a number of reasons:

1) There is no documented standard as to how the text should be drawn.
2) The /Rect for the Popup annotation (the location and size of the annotation) extends past the CropBox of the page, so it is slightly clipped.
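
The clipping in point 2 amounts to a rectangle intersection. A minimal sketch, assuming both rectangles are already normalised to (llx, lly, urx, ury) order; `clip_rect` is a hypothetical helper and the coordinates are made-up values, not taken from the attached file:

```python
def clip_rect(rect, crop_box):
    """Intersect two rectangles given as (llx, lly, urx, ury).

    Returns the visible part of rect, or None if nothing is visible.
    Assumes both inputs are normalised (lower-left corner first).
    """
    llx = max(rect[0], crop_box[0])
    lly = max(rect[1], crop_box[1])
    urx = min(rect[2], crop_box[2])
    ury = min(rect[3], crop_box[3])
    if llx >= urx or lly >= ury:
        return None
    return (llx, lly, urx, ury)


# Hypothetical values: a Letter-sized CropBox and a Popup Rect that
# extends past its right-hand edge.
crop_box = (0, 0, 612, 792)
popup_rect = (500, 600, 680, 720)
visible = clip_rect(popup_rect, crop_box)
```

With these made-up numbers only the part of the Popup up to x = 612 survives, so any text drawn beyond that edge is lost.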

But that's as far as I think we want to proceed.


NOTE! This commit will NOT be applicable to version 9.53.3 at all (the version reported) and we will not be attempting any enhancement of that code, which uses the old PostScript-based PDF interpreter, for the reasons noted above. This commit is only applicable to code based on 9.56.1 or better using the new C-based PDF interpreter.
Comment 7 Anand 2022-11-09 16:45:38 UTC
H.7 (Page 751) The annotation with object identifier 12 0 illustrates splitting a long text string across multiple lines...

Do you believe this might be missing too? Basically using a \ to split the string.
Comment 8 Ken Sharp 2022-11-09 17:05:04 UTC
(In reply to Anand from comment #7)
> H.7 (Page 751) The annotation with object identifier 12 0 illustrates
> splitting a long text string across multiple lines...
> 
> Do you believe this might be missing too? Basically using a \ to split the
> string.

I'm not sure what you are referring to here. The sample file only has 108 pages, not 751. There is no H.7 in the 1.7 PDF Reference (and your file is a PDF 1.6 file). 

In the ISO PDF 2.0 reference there is an H.7 but it is on page 911.

I see nothing in the spec which indicates that a '\' indicates a newline. If you want a newline, insert a \r or \n.

And I'm disinclined to add code to do so without an example PDF file.
Comment 9 Anand 2022-11-10 02:22:46 UTC
Created attachment 23472 [details]
PDF reference 1.7

In the attached doc, the point I made is listed in page 1082.

Additionally, with the latest fix, page 1 annotation looks fine but not on any other page. Would you know why that might be? e.g.

gs -q -dSAFER -dBATCH -dFirstPage=2 -dLastPage=2 -sOutputFile=a.png -sDEVICE=png256 -DPDFDEBUG ~/Documents/KingPDFTutorial.pdf
Comment 10 Ken Sharp 2022-11-10 08:46:37 UTC
(In reply to Anand from comment #9)
> Created attachment 23472 [details]
> PDF reference 1.7

It should be obvious from my comment #8 that I have access to the PDF Reference (many revisions). So there really is no need to attach a 30MB file....

> In the attached doc, the point I made is listed in page 1082.

So rather than section H.7 on page 751 that is section G.6.4 on page 1052....

I still do not see anything that suggests '\' as a line break. The example has a '\' which is a normal typographical convention that the line should be taken as continuing without interruption even though it has been necessary to wrap it in order to fit the constraints of the media. It does not indicate anything in terms of the PDF format.
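
For what it is worth, this agrees with how PDF literal strings are read: a backslash immediately followed by an end-of-line marker is a continuation and contributes nothing to the string, whereas the \n and \r escapes produce real line-break characters. A minimal Python sketch of that rule follows; `decode_literal_string` is a hypothetical helper written for illustration (it is not Ghostscript code) and it deliberately leaves out octal escapes.

```python
def decode_literal_string(body):
    """Decode the body of a PDF literal string (the text between the parentheses).

    Handles single-character escapes and backslash + end-of-line
    continuation. Octal escapes are omitted to keep the sketch short.
    """
    escapes = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
               "(": "(", ")": ")", "\\": "\\"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        nxt = body[i + 1] if i + 1 < len(body) else ""
        if nxt == "\r" or nxt == "\n":
            # Backslash at end of line: drop both, the string simply continues.
            i += 2
            if nxt == "\r" and body[i:i + 1] == "\n":
                i += 1  # treat CR LF as a single end-of-line marker
        elif nxt in escapes:
            out.append(escapes[nxt])
            i += 2
        else:
            # Unknown escape: the backslash itself is ignored.
            i += 1
    return "".join(out)


# A string wrapped in the file with a trailing backslash has no line break...
continued = decode_literal_string("long text \\\nsplit here")
# ...while an explicit \n escape does produce one.
broken = decode_literal_string("line one\\nline two")
```

Under this reading the wrapped example in the Reference decodes to one unbroken line, so a consumer has no line break to honour at that point.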

If you can supply me a PDF file where this is not the case I will change my mind, but I do not believe that Acrobat will break a line at a '\' character.

 
> Additionally, with the latest fix, page 1 annotation looks fine but not on
> any other page. Would you know why that might be? e.g.

Well no, because your command line specified page 1 and when I asked you specified page 1, so I only looked at page 1.

I've scanned pages 2 onwards (no, I haven't looked at all 108 pages) and I don't see any problems. The appearance seems the same as Acrobat.
Comment 11 Anand 2022-11-10 09:58:51 UTC
Created attachment 23473 [details]
Page 2 output

I meant - with the latest fix and the command mentioned in my last comment (this one is for page 2), the annotation is not rendered. I'm attaching the output here. Though the fix works fine for page 1.
Comment 12 Ken Sharp 2022-11-10 10:47:36 UTC
(In reply to Anand from comment #11)
> Created attachment 23473 [details]
> Page 2 output
> 
> I meant - with the latest fix and the command mentioned in my last comment
> (this one is for page 2), the annotation is not rendered. I'm attaching the
> output here. Though the fix works fine for page 1.

Using current code the annotation (yellow speech bubble) is rendered and it looks pretty much identical to the Acrobat display.
Comment 13 Anand 2022-11-10 14:59:39 UTC
Created attachment 23475 [details]
Page 1 output

Page 1 output for comparison.
Comment 14 Ken Sharp 2022-11-10 15:05:57 UTC
(In reply to Anand from comment #13)
> Created attachment 23475 [details]
> Page 1 output
> 
> Page 1 output for comparison.

You can't compare page 1 with page 2, they are different pages. I already noted that our rendering of the Popup annotation on page 1 does not match Acrobat and explained why.

As far as I can see our rendering of page 2 does match Acrobat, closely enough.
Comment 15 Anand 2022-11-11 03:49:53 UTC
But aren't they both popup annotations - though on different pages? Shouldn't there be consistency in rendering?
Comment 16 Ken Sharp 2022-11-11 08:27:59 UTC
(In reply to Anand from comment #15)
> But aren't they both popup annotations - though on different pages?

No.