Bug 626295 - transparency test file interpretation error
Summary: transparency test file interpretation error
Status: RESOLVED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: master
Hardware: All All
Importance: P4 normal
Assignee: Michael Vrhel
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2002-10-21 05:02 UTC by Jack Moffitt
Modified: 2017-02-01 08:22 UTC

See Also:
Customer:
Word Size: ---


Attachments
test2.pdf (9.57 KB, application/pdf)
2004-09-23 20:21 UTC, Jack Moffitt
bug626295_simple.pdf (44.08 KB, application/pdf)
2010-12-01 19:10 UTC, Michael Vrhel
patch to implement the proposed fix in comment 9 (1.25 KB, patch)
2013-06-29 09:14 UTC, Ken Sharp

Description Jack Moffitt 2002-10-21 05:02:39 UTC
Originally reported by: nobody@users.sourceforge.net
Running gs on this test file makes Ghostscript clear
the screen immediately after showing it. What you can
see is that it applies transparency to the wrong element.
At least, the output differs from what Adobe Acrobat 5.0 shows;
Acrobat 5.0 has no problems viewing this file.
Best regards,
Carsten Hammer

hammer.carsten@oce.de
Comment 1 Jack Moffitt 2002-10-23 08:51:26 UTC
Comment originally by jackiem@users.sourceforge.net

You are probably using an old version of Ghostscript.  Try
7.21 or later.  I get mostly the correct output.  The only
difference is that the upper set of ampersands are
transparent in GS but not in Acrobat Reader 5.  Other than
that difference, the file renders the same.

Comment 2 Alex Cherepanov 2003-05-20 10:23:31 UTC
GS renders the file as if /AIS was true.
Comment 3 Jack Moffitt 2004-09-23 20:21:50 UTC
Created attachment 912
test2.pdf

Added file from original sourceforge bug.
Comment 4 Alex Cherepanov 2004-12-29 09:03:59 UTC
The PDF file has multiple strings rendered inside a single BT..ET block.
By default TK is on, so later glyphs should knock out earlier glyphs
in the same block. Ghostscript doesn't implement .settextknockout
beyond saving and restoring the flag in the graphics state,
so the strings get blended.

The dependence on the AIS flag reported earlier is an AR5 bug that was fixed in AR6.
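
For illustration only (this is not Ghostscript code): a minimal Python sketch of what happens to one pixel where two partially transparent glyphs overlap, with and without knockout. The function names, the grayscale values and the plain "normal" blend are assumptions chosen to keep the example small.

```python
def over(backdrop, color, alpha):
    """Normal-blend compositing of one grayscale sample over a backdrop."""
    return alpha * color + (1.0 - alpha) * backdrop


def paint_glyphs(page, glyphs, knockout):
    """Composite overlapping glyph samples onto a single page pixel.

    glyphs is a list of (color, alpha) pairs in paint order.
    """
    result = page
    for color, alpha in glyphs:
        # With knockout, every glyph composites against the backdrop that
        # existed when the text object started, so it replaces earlier
        # glyphs instead of blending with them.
        backdrop = page if knockout else result
        result = over(backdrop, color, alpha)
    return result
```

With a white page (1.0), a black glyph and then a light gray glyph (0.8), both at 50% opacity, the blended result keeps a trace of the first glyph while the knockout result depends only on the last glyph and the page.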
Comment 5 Timothy Osborn 2007-07-31 09:20:05 UTC
I have reviewed this bug and it is still happening as described in comment #1 as
of r8156.
Comment 6 Ralph Giles 2009-07-30 10:19:40 UTC
This is still a bug in 8.70. The upper pair of ampersands knock out the text
behind them in Adobe Reader 9 for MacOS, but show as transparent in Ghostscript
8.70rc1 and Apple's Preview.app.

Transferring to Michael.
Comment 7 Michael Vrhel 2010-12-01 19:10:25 UTC
Created attachment 6979
bug626295_simple.pdf

A simplified file.
Comment 8 Michael Vrhel 2010-12-01 19:11:07 UTC
It appears that we are not paying attention to text_knockout.
Comment 9 Michael Vrhel 2010-12-02 17:54:16 UTC
This needs to be solved in the pdf interpreter so I am passing off to Alex.  What needs to be done is very similar to what was put in place for the \B command.  If TK is true and PDFusingtransparency is set, then we need to push and pop a knockout transparency group around the command that renders the affected text.  Again, see what was done around line 460 in pdf_ops.ps where we pushed the knockout group for the \B command so that the stroking was drawn in a knockout manner.
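
The shape of that proposal can be sketched in Python as follows. This is only an illustration of the push/pop pairing, not the PostScript in pdf_ops.ps; RecordingDevice, knockout_group and render_text are hypothetical names, and the device merely logs the calls it receives.

```python
from contextlib import contextmanager


class RecordingDevice:
    """Hypothetical stand-in for a transparency-capable device; it only logs calls."""

    def __init__(self):
        self.calls = []

    def begin_group(self, knockout):
        self.calls.append(("begin_group", knockout))

    def end_group(self):
        self.calls.append(("end_group",))

    def show_text(self, text):
        self.calls.append(("show_text", text))


@contextmanager
def knockout_group(device, enabled):
    """Push a knockout group on entry and always pop it on exit."""
    if enabled:
        device.begin_group(knockout=True)
    try:
        yield
    finally:
        if enabled:
            device.end_group()


def render_text(device, text, text_knockout, uses_transparency):
    # Only wrap the text when TK is true and the page uses transparency.
    with knockout_group(device, text_knockout and uses_transparency):
        device.show_text(text)
```

Tying the pop to the same scope as the push means the two cannot get out of step, which is the property the later comments in this bug are concerned with.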
Comment 10 Shailesh Mistry 2011-07-11 19:19:34 UTC
Bug still reproducible in Ghostscript 9.03
Comment 11 Ken Sharp 2013-06-29 09:14:51 UTC
Created attachment 10032
patch to implement the proposed fix in comment 9

I've tried the approach described by Michael in comment #9 and it works fine for the test file. However it introduces a number of faults in the regression suite, in particular blends.ai.pdf.

I'm not at all sure why but I have a suspicion that putting the text into its own group may not work in all cases.

I'm reassigning this to Michael to look at, as he knows far more about the transparency handling in Ghostscript than I do.

The attached patch pushes a transparency group when we encounter a BT and pops it when we get an ET. The PDFRM states that the text knockout can only be applied to whole text objects (ie between BT and ET) and cannot be changed during an object. I also found it necessary to alter the default value of the text knockout transparency parameter, as it was defaulting to false, and the spec says its initial value is true.
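
A rough Python model of the behaviour described above, for readers who do not want to read the patch. It is not the patch itself: the class and method names are hypothetical, and latching the TK value at BT is an assumption made here to model "cannot be changed during an object".

```python
class GraphicsState:
    def __init__(self):
        # Initial value is true, as the spec requires (see the comment above).
        self.text_knockout = True


class TextInterpreter:
    """Toy model: push a group at BT, pop it at ET, TK fixed per text object."""

    def __init__(self):
        self.gstate = GraphicsState()
        self.in_text = False
        self.active_knockout = None
        self.events = []

    def set_text_knockout(self, value):
        self.gstate.text_knockout = value

    def begin_text(self):  # BT
        self.in_text = True
        # Latch TK for the whole text object.
        self.active_knockout = self.gstate.text_knockout
        if self.active_knockout:
            self.events.append("push knockout group")

    def end_text(self):  # ET
        if self.in_text and self.active_knockout:
            self.events.append("pop knockout group")
        self.in_text = False
        self.active_knockout = None
```

In this model a TK change made between BT and ET only takes effect for the next text object, so the pop always matches the push that opened the current one.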
Comment 12 Michael Vrhel 2016-10-12 09:53:31 UTC
A note on this is that overprint is true (as is OPM).  In such a case, we need to do some special handling of the transparency.  Working with Ray on this.
Comment 13 Michael Vrhel 2016-12-18 19:30:15 UTC
There have been fixes in the knockout group handling and overprint in the pdf14 device.  So I 
thought I would test Ken's patch.  I did not see an issue with blends.ai.pdf,
however there were a couple files that did segv.  Bug695897.pdf was one.  I 
did notice that several times there is a warning thrown:

**** Error: Illegal nested BT operator (inside a gsave) detected.
   Output may be incorrect.
   
Is it possible there is a push of a group and not a pop due to this issue?
I ask because, when I run Bug695897.pdf and look at the group pushes and
pops there is clearly a mismatch coming from the interpreter.  See the 
cleaned up output below where we end up pushing a mask, not popping it and
way later doing an extra group pop which causes the actual crash.

Handing back to Ken to maybe take a quick look (or have an opinion) about
any cases where we might get a mismatch of a push and a pop.  For example
an extra BT with no ET or vice versa???

From Bug695897.pdf  using  -sDEVICE=tiff24nc -Zv -r72  

[v]gs_pdf14_device_push

	[v](0xcd5e60)gx_begin_transparency_group [0 -0 595 842] Num_grp_clr_comp = 0
		 (no CS)  Isolated = 1  Knockout = 1
	[v]gx_end_transparency_group
	
	[v](0xcd5e60)gx_begin_transparency_mask [0 0 1 1]
		  subtype = 1  Background_components = 0 Matte_components = 0 Num_grp_clr_comp = 1 no TR
	[v](0xcd5e60)gx_end_transparency_mask(0)

	[v](0xcd5e60)gx_begin_transparency_group [0 0 1 1] Num_grp_clr_comp = 0
		 (no CS)  Isolated = 1  Knockout = 0
	[v]gx_end_transparency_group

	[v](0xcd5e60)gx_begin_transparency_mask [0 0 0 0] ???????? No end found 
      subtype = 2  Background_components = 0 Matte_components = 0 Num_grp_clr_comp = 0 no TR

		[v](0xcd5e60)gx_begin_transparency_group [0 -0 595 842] Num_grp_clr_comp = 0
			(no CS)  Isolated = 1  Knockout = 1

			[v](0xcd5e60)gx_begin_transparency_group [-161.645 -658 433.355 184] Num_grp_clr_comp = 0
				(no CS)  Isolated = 1  Knockout = 1
			[v]gx_end_transparency_group

			[v](0xcd5e60)gx_begin_transparency_group [1 -7 142 7] Num_grp_clr_comp = 0
				(no CS)  Isolated = 1  Knockout = 1
			[v]gx_end_transparency_group

		[v]gx_end_transparency_group

		[v](0xcd5e60)gx_begin_transparency_group [-161.645 -658 433.355 184] Num_grp_clr_comp = 0
			(no CS)  Isolated = 1  Knockout = 1
		[v]gx_end_transparency_group

		[v](0xcd5e60)gx_begin_transparency_group [1 -1 159.395 11] Num_grp_clr_comp = 0
			(no CS)  Isolated = 1  Knockout = 1
		[v]gx_end_transparency_group

[v]gx_end_transparency_group ????????  Ending for a group that we never begin???????
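
A trace like the one above can be checked mechanically. The following Python sketch counts begin/end events per kind (group or mask) and reports ends that have no open begin, plus anything still open at the end. It relies only on the function names visible in the -Zv output above; check_balance is a hypothetical helper, not part of Ghostscript.

```python
import re

# Matches the begin/end trace lines shown in the -Zv output above.
EVENT = re.compile(r"gx_(begin|end)_transparency_(group|mask)")


def check_balance(lines):
    """Return (unmatched_ends, still_open) for a sequence of trace lines.

    unmatched_ends is a list of (line_number, kind) for each end event seen
    while no begin of that kind was open; still_open maps each kind to the
    number of begins never closed.
    """
    depth = {"group": 0, "mask": 0}
    unmatched_ends = []
    for number, line in enumerate(lines, start=1):
        match = EVENT.search(line)
        if match is None:
            continue
        action, kind = match.groups()
        if action == "begin":
            depth[kind] += 1
        elif depth[kind] == 0:
            unmatched_ends.append((number, kind))
        else:
            depth[kind] -= 1
    return unmatched_ends, depth
```

Counting groups and masks separately is a simplification; it is enough to flag both annotated problems in the trace (a mask that is never ended and a group end with no begin), but it does not verify that the two kinds are properly interleaved.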
Comment 14 Ken Sharp 2016-12-19 00:57:42 UTC
(In reply to Michael Vrhel from comment #13)
> There have been fixes in the knockout group handling and overprint in the
> pdf14 device.  So I 
> thought I would test Ken's patch.  I did not see an issue with blends.ai.pdf,
> however there were a couple files that did segv.  Bug695897.pdf was one.  I 
> did notice that several times there is a warning thrown
> 
> **** Error: Illegal nested BT operator (inside a gsave) detected.
>    Output may be incorrect.
>    
> Is it possible there is a push of a group and not a pop due to this issue?

If you implement the patch, and the file has mismatched BT/ET pairs, then yes, there will be a push and no pop. I don't see any way to address that; there's no way to tell if an 'ET' is missing.

The error above means that we encountered a BT at a time when it wasn't legal to do so (presumably because there was a missing 'ET', so we continued treating everything as text from that point).

Looking at Bug695897.pdf I can see that the producer of the PDF file treats text groups as if they were some sort of form, and 'nests' them. There are the correct number of BT and ET operations, but as the error states, they are nested :

BT
...
...
BT
...
...
ET
...
..
BT
...
..
ET
...
...
ET

which is not legal. I also believe that our PDF interpreter ignores a 'BT' operation when it's already in a text state. I can't recall if it 'balances' BT/ET pairs; I suspect not.
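
To make the nesting concrete, here is a small Python sketch that walks a list of operator names and counts nested BT operators, stray ET operators and text objects left open. It is a toy checker with a hypothetical name (scan_text_objects), not a description of what the Ghostscript PDF interpreter does.

```python
def scan_text_objects(operators):
    """Classify BT/ET problems in a flat sequence of operator names."""
    depth = 0
    nested_bt = 0
    stray_et = 0
    for op in operators:
        if op == "BT":
            if depth > 0:
                # BT while already inside a text object: illegal nesting.
                nested_bt += 1
            depth += 1
        elif op == "ET":
            if depth == 0:
                # ET with no text object open.
                stray_et += 1
            else:
                depth -= 1
    return {"nested_bt": nested_bt, "stray_et": stray_et, "unclosed_bt": depth}
```

For the pattern shown above (BT BT ET BT ET ET) this reports two nested BT operators and nothing unbalanced, which matches the observation that the counts are correct but the structure is not legal.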


> Handing back to Ken to maybe take a quick (or have an opinion) about
> any cases where we might get a mismatch of a push and a pop.  For example
> an extra BT with no ET or vice versa???

Extra BT or ET operations are not unknown; we need to be able to at least not crash when that happens. I don't think it's unreasonable to produce the wrong output in those cases though; the PDF interpreter does flag the problem up and emit a warning, after all.

There's no realistic way for the PDF interpreter to clean up group push and pops if there's a problem with BT/ET matching. It might be possible to address files like this one where there are the correct number of matching pairs, but we've seen files where they *don't* balance. We can't afford to have a seg fault if the number of BT and ET operators don't match, so either we need to fix this at the C level so we don't seg fault, or we can't do a push/pop round the text group.
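
As a sketch of the "don't seg fault" option, the Python below shows a group stack that tolerates unbalanced calls: a pop with nothing open is ignored with a warning, and anything still open can be closed in one go (for example at the end of a page). GroupStack and its methods are hypothetical; the real fix would live in the C transparency code.

```python
class GroupStack:
    """Toy group stack that survives unbalanced push/pop sequences."""

    def __init__(self):
        self._open = []
        self.warnings = []

    def push(self, label):
        self._open.append(label)

    def pop(self):
        if not self._open:
            # Extra pop (e.g. a stray ET): warn and carry on instead of crashing.
            self.warnings.append("group pop without matching push ignored")
            return None
        return self._open.pop()

    def close_all(self):
        """Close every group still open, innermost first, and return their labels."""
        closed = list(reversed(self._open))
        self._open.clear()
        return closed
```

The output for a broken file may still be wrong under this scheme, but an extra pop or a missing pop no longer corrupts the stack.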

If text knockout were available as part of the graphics state, would it be possible for the compositor to pay attention to that, rather than pushing/popping another pair of groups? These cause significant inefficiencies in the pdfwrite output and I'd like to try and avoid them where possible, quite apart from the problem of seg faults.
Comment 15 Michael Vrhel 2017-01-24 16:59:40 UTC
I looked into handling this outside of the interpreter and it is problematic.  The contents of the simple file look like this:

BT
/GS0 gs
/C2_0 1 Tf
10 0 0 10 72 588 Tm
<00030003000300030003000300030003000300030003000300030003000300030003000300030003000300030003005600520003005A0048004C00570048005500030045004C005600030048004C0051000300530044004400550003003D0048004C004F004800510003>Tj
0.8 g
/C2_1 1 Tf
300 0 0 300 100 390 Tm
<0009>Tj
ET

If I try to handle the push of the knockout group via the pdf14 begin text (and the release operation from the text code for the pop), I get the two different text sections (the Tm) as different begin texts.  The group needs to encompass everything from BT to ET.  This is also what Acrobat is doing.  If you look at the AR output preview, it shows a knockout group where the text is drawn even though there is not one in the source content.  So, I will go back to using Ken's fix via the interpreter and make sure the transparency code is robust enough to catch the cases where we have mismatches of BT/ET.
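
To show why the group has to span BT to ET, this Python sketch pulls the Tm and Tj operators out of a content stream and groups them by text object. The regular expression is a deliberately naive assumption (it is not a PDF tokenizer), text_objects is a hypothetical helper, and the sample is the stream above with the first hex string abbreviated.

```python
import re

# Naive operator finder; adequate for this sample, not for arbitrary PDF content.
OPERATOR = re.compile(r"\b(BT|ET|Tm|Tj)\b")


def text_objects(content):
    """Return one list of Tm/Tj operators per BT..ET text object."""
    objects = []
    current = None
    for op in OPERATOR.findall(content):
        if op == "BT":
            current = []
        elif op == "ET":
            if current is not None:
                objects.append(current)
            current = None
        elif current is not None:
            current.append(op)
    return objects


# The simple file's content stream, with the long hex string shortened.
SAMPLE = """BT
/GS0 gs
/C2_0 1 Tf
10 0 0 10 72 588 Tm
<0003>Tj
0.8 g
/C2_1 1 Tf
300 0 0 300 100 390 Tm
<0009>Tj
ET"""
```

Here text_objects(SAMPLE) yields a single text object containing two Tm/Tj pairs, so a group opened per text-begin at the device level would split what the file treats as one knockout unit.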