Bug 691352 - cairo pdf mis-distilled.
Summary: cairo pdf mis-distilled.
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Writer
Version: master
Hardware: PC Linux
Importance: P4 normal
Assignee: Ken Sharp
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-30 09:34 UTC by Hin-Tak Leung
Modified: 2013-11-06 01:23 UTC

See Also:
Customer:
Word Size: ---


Attachments
pdf file from an LGM attendent (323.63 KB, application/pdf)
2010-05-30 09:34 UTC, Hin-Tak Leung
the svg (13.50 KB, application/octet-stream)
2010-05-31 15:38 UTC, Hin-Tak Leung
on arrow svg (7.95 KB, text/plain)
2010-05-31 16:17 UTC, Hin-Tak Leung
one arrow pdf which is wrongly distilled. (4.48 KB, application/pdf)
2010-05-31 16:20 UTC, Hin-Tak Leung

Description Hin-Tak Leung 2010-05-30 09:34:00 UTC
Created attachment 6337
pdf file from an LGM attendent

svn r11340 (head) misrenders the file with pdfwrite. The x11 device renders it correctly and agrees with acroread. (x11alpha crashes with a range check.)

gs-r11340 -sDEVICE=pdfwrite -o /tmp/a.pdf slide.pdf
Comment 1 Ken Sharp 2010-05-31 07:21:04 UTC
Another example of convoluted PDF creation from Cairo. 

The page has a transparency group, and each object on the page also has its own transparency group. Each arrow is a separate form XObject, each (of course) with its own transparency group. Each arrow is filled not with yellow, but with a type 2 Shading pattern, apparently designed so that the actual fill is a flat yellow.

Just picking the file apart to reduce the complexity of the problem to something which can be reasonably investigated will take some time.
Comment 2 Ken Sharp 2010-05-31 08:23:41 UTC
To add to the fun, and presumably reduce the performance still further, each of the arrows is drawn through a 100% transparent (!) softmask. This seems to be the problem: pdfwrite is emitting the softmask with an identity matrix, which doesn't take into account the fact that the page has a scale factor of 0.1. The result is that the softmask is drawn at 1/10 the required size.
Comment 3 Stani 2010-05-31 10:13:58 UTC
This pdf file is what Inkscape generates with Cairo when you choose 'Save as PDF'. So if you want to have a less complex example, just draw a shape in Inkscape and save it as pdf. Or I can attach the original Inkscape SVG file so you can remove elements easily.

Thanks for solving this,

the LGM attendent
Comment 4 Ken Sharp 2010-05-31 10:23:50 UTC
(In reply to comment #3)
> This pdf file is what Inkscape generates with Cairo when you choose 'Save as
> PDF'.

We know, we've seen Cairo-generated PDF files from Inkscape before :-( To be fair, it could be Inkscape which is the culprit, but in any event the files are unbelievably over-complex for the actual displayed content. This results in large files, which process slowly and are next to impossible to reduce because all the elements are stored inside form objects.

> So if you want to have a less complex example, just draw a shape in
> Inkscape and save  it as pdf. Or I can attach the original Inkscape SVG file so
> you can remove elements easily.

Won't work :-) We actually need the bizarreness to demonstrate the problem; without the weird way the file is drawn, the problem doesn't exist. Also, I have no idea how to use Inkscape; at least I understand PDF files.
  
 
To add to the fun, and presumably reduce the performance still further, each of the arrows is drawn through a 100% transparent (!) softmask, and (even more bizarrely) the softmask is generated via another form which uses a shading dictionary to generate the fill....

This seems to be the problem: pdfwrite is emitting the softmask with a matrix which doesn't take into account the fact that the page has a scale factor of 0.1. The result is that the softmask is drawn at 1/10 the required size.
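
For illustration only, here is a minimal Python sketch of the arithmetic behind that 1/10 figure. It is not pdfwrite code. The only value taken from this report is the 0.1 page scale; the mask width of 200 units and the compensating scale of 10 are assumptions chosen to make the effect visible. Matrices use the PDF [a b c d e f] convention.

```python
def concat(m, n):
    """Return the matrix that applies m first, then n (PDF [a b c d e f] order)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def transform(m, x, y):
    """Map the point (x, y) through matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


PAGE_SCALE = (0.1, 0, 0, 0.1, 0, 0)    # scale factor reported for the page
IDENTITY = (1, 0, 0, 1, 0, 0)          # what pdfwrite emitted for the softmask
COMPENSATING = (10, 0, 0, 10, 0, 0)    # assumption: a matrix that undoes the 0.1


def drawn_width(mask_matrix, width, page=PAGE_SCALE):
    """Width of a horizontal extent once the mask matrix and page scale apply."""
    ctm = concat(mask_matrix, page)
    x0, _ = transform(ctm, 0, 0)
    x1, _ = transform(ctm, width, 0)
    return x1 - x0


emitted = drawn_width(IDENTITY, 200)       # hypothetical mask width of 200 units
required = drawn_width(COMPENSATING, 200)
ratio = emitted / required                 # roughly 0.1, i.e. 1/10 the size
```

Under these assumptions the identity matrix leaves the mask at about a tenth of the width it would have with the page scale accounted for, which matches the symptom described above.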
Comment 5 Hin-Tak Leung 2010-05-31 15:36:30 UTC
(In reply to comment #3)
> Or I can attach the original Inkscape SVG file so you can remove elements easily...

(In reply to comment #4)
> Won't work :-) We actually need the bizarreness to demonstrate the problem;
> without the weird way the file is drawn, the problem doesn't exist. Also, I
> have no idea how to use Inkscape, at least I understand PDF files.

Inkscape is fairly common/standard on most Linux systems. Having worked on XML technology some years ago in a commercial M$ environment, I have read the SVG spec cover-to-cover and can probably still hand-edit SVG files with emacs.

I actually have the original svg from Stani, so I can try to simplify it down to the smallest feature that gets Inkscape to misbehave.
Comment 6 Hin-Tak Leung 2010-05-31 15:38:35 UTC
Created attachment 6339
the svg 

I'll simplify this in time - just in case I forget.
Comment 7 Ken Sharp 2010-05-31 15:48:44 UTC
(In reply to comment #6)
> Created an attachment (id=6339) [details]
> the svg 
> 
> I'll simplify this in time - just in case I forget.

Don't worry, I already have it reduced to a single arrow. The remaining complexity is required to demonstrate the problem. It's something weird about the CTM in force when a type 2 pattern is rendered inside a form, especially when the form is nested inside another form.

The pattern is supposed to have a matrix which reflects the transformation to the default co-ordinate space, not the space in force at the time the pattern is rendered. This is not happening correctly.
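
A rough Python sketch of the distinction drawn above, with invented numbers. Only the 0.1 scale comes from this report, and treating it as the default space is a simplification. The form and pattern matrices are hypothetical, and this is not how Ghostscript implements it.

```python
def concat(m, n):
    """Return the matrix that applies m first, then n (PDF [a b c d e f] order)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def origin(m):
    """Where the matrix m places the point (0, 0)."""
    return (m[4], m[5])


DEFAULT_SPACE = (0.1, 0, 0, 0.1, 0, 0)     # modelled on the page scale in this report
OUTER_FORM = (1, 0, 0, 1, 300, 400)        # hypothetical outer form matrix
INNER_FORM = (2, 0, 0, 2, 50, 50)          # hypothetical nested form matrix
PATTERN = (1, 0, 0, 1, 1000, 2000)         # hypothetical pattern matrix

# CTM in force while the nested form's content is being painted.
ctm_in_force = concat(INNER_FORM, concat(OUTER_FORM, DEFAULT_SPACE))

# Pattern placed relative to the default space, as described above.
expected = concat(PATTERN, DEFAULT_SPACE)

# Pattern placed relative to the CTM in force at paint time.
wrong = concat(PATTERN, ctm_in_force)

expected_origin = origin(expected)
wrong_origin = origin(wrong)
```

With these numbers the two placements differ in both position and scale, and only the first is unaffected by how deeply the forms are nested.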
Comment 8 Hin-Tak Leung 2010-05-31 16:17:19 UTC
Created attachment 6340
on arrow svg

Trimmed the svg down to one of the arrows; it generates a pdf which is wrongly distilled (the pdf will follow).
Comment 9 Hin-Tak Leung 2010-05-31 16:20:46 UTC
Created attachment 6341
one arrow pdf which is wrongly distilled.

This pdf is generated by Inkscape from the last svg (which consists of just a yellow-paint filled arrow). Distilled by r11340, the result shows a yellow-black gradient.
Comment 10 Ken Sharp 2010-05-31 16:30:09 UTC
(In reply to comment #9)

> This pdf is generated by inkscape from the last svg (which consists of just a
> yellow-paint filled arrow). Distilled by r11340, the results shows a
> yellow-black gradient.

Sort of, but not exactly. The yellow gradient simply peters out; this is because it's 1/10 the size it ought to be....

I have a PDF cut down from the original which shows exactly the same behaviour, except that the background is white not black. 

Of course *why* Inkscape/Cairo is converting a yellow flat fill into an axial shading escapes me.
Comment 11 Ken Sharp 2010-06-01 13:15:06 UTC
I have a fix in testing for the yellow arrows. This does not fix the green text, which is a different problem.

In the green text case, the page content stream runs a form (again painted through a 100% transparent soft mask). The form sets up another transparency group and fills a rectangle with a Pattern. The pattern executes a form XObject. The form XObject sets up yet another transparency group, and then (finally!) writes the text.

The PDF is, of course, valid but this is obviously highly inefficient. It would be possible to dispense with all the transparency groups, since none of the objects appears to be other than 100% opaque. We could also dispense with 2 form XObjects and a Pattern colour space. 

All the content appears to be present in the output PDF file, so I suspect this must be another (different) matrix scaling problem relating to the nested forms and so on.
Comment 12 Ken Sharp 2010-06-02 07:54:01 UTC
Revision 11347 partially resolves this issue, patch and log with more details can be found here:

http://ghostscript.com/pipermail/gs-cvs/2010-June/011159.html

This fixes the yellow arrows, but not the green text, which I'll continue to work on, along with the two files from our regression suite which exhibit differences (noted in the revision log).
Comment 13 Ken Sharp 2010-10-18 10:11:37 UTC
In fact, when processing the PDF file produced by pdfwrite, Ghostscript *does* get the expected answer with the green text. I have no idea how though.
Comment 14 Ken Sharp 2010-10-18 10:37:44 UTC
The two files noted in the revision log as requiring more investigation now work correctly:

Bug688807.pdf
Bug689918.pdf

The remaining issues with this file are probably caused by the sheer (unreasonable) complexity. The green text is actually drawn in a Form; the Form draws a rectangle and fills it with a Pattern colour. The Pattern colour draws a Form, the final Form actually draws the text. Each of the Forms includes transparency (as does the parent page) all of which is set to 100% opaque (or 100% transparent in the case of masks).

While this is still a bug, despite the fact that Ghostscript can render the file, it's possibly caused by other issues, e.g. the Resources being described at the Page level instead of the Form level, and requiring inheritance. These other problems need to be addressed before revisiting this report.

Cairo really does produce badly suboptimal PDF files....
Comment 15 Ken Sharp 2013-11-06 01:23:36 UTC
Commit 616ff4cc44b00bcf7df97b64c8bdaebbe0620713 finally fixes the last remaining issue in this old report.