691378 – ghostPCL memory not undercontrol

Bug 691378 - ghostPCL memory not undercontrol

Summary: ghostPCL memory not undercontrol

Status:	NOTIFIED WORKSFORME

Alias:	None

Product:	GhostPCL
Classification:	Unclassified
Component:	PCL interpreter (show other bugs)
Version:	master
Hardware:	PC Windows XP

Importance:	P4 normal
Assignee:	Henry Stiles

URL:
Keywords:

Depends on:
Blocks:

Reported:	2010-06-10 12:06 UTC by norbert.janssen
Modified:	2011-09-18 21:46 UTC (History)
CC List:	0 users

See Also:
Customer:	661
Word Size:	---

Attachments
analysis of job (breakdown). (256.90 KB, application/octet-stream) 2010-06-11 06:47 UTC, norbert.janssen	Details
adding binarysearch to pl_dict_lookup_entry(). (30.00 KB, application/octet-stream) 2010-06-17 14:01 UTC, norbert.janssen	Details
replacement for previous pldict.tar (30.00 KB, application/octet-stream) 2010-06-17 14:28 UTC, norbert.janssen	Details
Show Obsolete (1) Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description norbert.janssen 2010-06-10 12:06:40 UTC

When printing the testjob cmp_02_08.pcl (uploaded on peeves.ghostscript.com) the memory consumed by the ghostPCL is always rising.
The job consists of a number of macros. In the macros fonts are downloaded (stupid I know, but it's from a real-life application :(). However, it is always the same set of macros and same set of download fonts. So after a while memoryconsumption should stabilize.
Note also: when compared to our other rip, it's 3times slower. (I suspect due to too long lists that have to be searched time over time).

Comment 1 norbert.janssen 2010-06-10 12:10:48 UTC

cmp_02_08.pcl.gz uploaded to peeves.ghostscript.com:${MYHOME}/data4_691378

Comment 2 norbert.janssen 2010-06-10 14:29:56 UTC

e.g. with the debugger attached.
run: pcl6 -sDEVICE=jpeg -sOutputFile=NULL -dNOPAUSE cmp_02_08.pcl (24717 pages)
break point at gx_default_print_page_copies (display pdev->PageCount to show progress).
Have also a display on (gs_malloc_memory_s*)pjl_mem->non_gc_memory (in  plmain.c) and watch used/max_used grow.

with the gdevbit.c (black&white at 600dpi) it is even worse (that's our targetprinter) and more than 256MB of memory is used.

So actually with this testfile we have 2 problems. Performance (3 times slower) and Memoryconsumption (not under control) with respect to our reference-pdl.

Comment 3 Henry Stiles 2010-06-10 20:38:57 UTC

(In reply to comment #0)
> When printing the testjob cmp_02_08.pcl (uploaded on peeves.ghostscript.com)
> the memory consumed by the ghostPCL is always rising.
> The job consists of a number of macros. In the macros fonts are downloaded
> (stupid I know, but it's from a real-life application :(). However, it is
> always the same set of macros and same set of download fonts. So after a while
> memoryconsumption should stabilize.
> Note also: when compared to our other rip, it's 3times slower. (I suspect due
> to too long lists that have to be searched time over time).

But it is not the same set of fonts.  There are thousands of unique font identifiers in the file (I assume associated with a font)  and not a single command to delete a font.  I would expect the memory to grow as it is now.  Are you expecting that we pick fonts to be deleted when memory exhaustion approaches,  is that the approach in the other RIP?

I guess it is possible downloaded fonts are deleted with the associated macro but that is not in the documentation and would seem rather odd.  I'll continue to investigate, let me know if you have anything to add.

Comment 4 norbert.janssen 2010-06-11 06:42:25 UTC

(In reply to comment #3)
> (In reply to comment #0)
> > When printing the testjob cmp_02_08.pcl (uploaded on peeves.ghostscript.com)
> > the memory consumed by the ghostPCL is always rising.
> > The job consists of a number of macros. In the macros fonts are downloaded
> > (stupid I know, but it's from a real-life application :(). However, it is
> > always the same set of macros and same set of download fonts. So after a while
> > memoryconsumption should stabilize.
> > Note also: when compared to our other rip, it's 3times slower. (I suspect due
> > to too long lists that have to be searched time over time).
> 
> But it is not the same set of fonts.  There are thousands of unique font
> identifiers in the file (I assume associated with a font)  and not a single
> command to delete a font.  I would expect the memory to grow as it is now.  > > Are
> you expecting that we pick fonts to be deleted when memory exhaustion
> approaches,  is that the approach in the other RIP?

The total job is 58MB in size, so memory required for the fonts (the downloaddata) should not exceed that. The rasterized characterbitmaps, can always be re-made. So there is no reason to keep the bitmaps in cache forever(i.e. flush from a limitedsized cache with a LRU mechanism?). The characterdescription should be kept.

Perhaps there is an issue with flushing a font from memory/cache when an identical fontid is going to be downloaded?

> 
> I guess it is possible downloaded fonts are deleted with the associated macro
> but that is not in the documentation and would seem rather odd.  I'll continue
> to investigate, let me know if you have anything to add.
I don't think this is according to the PCL spec.

See also BUG 691374 (which is printing same job multiple times) Each time 5-6MB more memory is used. This is also about downloading same set of fonts over and over again.

Comment 5 norbert.janssen 2010-06-11 06:47:03 UTC

Created attachment 6365 [details]
analysis of job (breakdown).

this is an breakdown of the job (i.e. what it is dowing: Define Macro, Download Font etc ...) To help analysis

Comment 6 norbert.janssen 2010-06-11 09:35:52 UTC

(In reply to comment #5)
> Created an attachment (id=6365) [details]
> analysis of job (breakdown).
> 
> this is an breakdown of the job (i.e. what it is dowing: Define Macro, Download
> Font etc ...) To help analysis

I extracted all downloadfonts (not part of a macro) and all macros.
The job consists of 11 downloadfonts (about 32kB data total)
and 1554 macros (that in turn also download fonts) (about 34MB data total).
So remaining part of job is just page-content (24717 pages = 58MB - 34MB).
So this would have to be kept in memory during the total lifetime of the interpreter. There is only an escE at the very start, and no explicit delete of macro/font.
Thus memoryconsumption of the pcl6 rip should not grow substantially above the 34MB (for the download data).

Comment 7 norbert.janssen 2010-06-11 09:41:48 UTC

(In reply to comment #6)
> (In reply to comment #5)
> > Created an attachment (id=6365) [details] [details]
> > analysis of job (breakdown).
> > 
> > this is an breakdown of the job (i.e. what it is dowing: Define Macro, Download
> > Font etc ...) To help analysis
> 
> I extracted all downloadfonts (not part of a macro) and all macros.
> The job consists of 11 downloadfonts (about 32kB data total)
> and 1554 macros (that in turn also download fonts) (about 34MB data total).
> So remaining part of job is just page-content (24717 pages = 58MB - 34MB).
> So this would have to be kept in memory during the total lifetime of the
> interpreter. There is only an escE at the very start, and no explicit delete of
> macro/font.
> Thus memoryconsumption of the pcl6 rip should not grow substantially above the
> 34MB (for the download data).

Uploaded job.gz to peeves.ghostscript.com/${MYHOME}/data4_691378
this is a bunch of pcl jobs (f#.pcl to download a font, m#.pcl to download a macro).

Comment 8 Henry Stiles 2010-06-14 20:08:29 UTC

(In reply to comment #6)
> (In reply to comment #5)
> > Created an attachment (id=6365) [details] [details]
> > analysis of job (breakdown).
> > 
> > this is an breakdown of the job (i.e. what it is dowing: Define Macro, Download
> > Font etc ...) To help analysis
> 
> I extracted all downloadfonts (not part of a macro) and all macros.
> The job consists of 11 downloadfonts (about 32kB data total)
> and 1554 macros (that in turn also download fonts) (about 34MB data total).
> So remaining part of job is just page-content (24717 pages = 58MB - 34MB).
> So this would have to be kept in memory during the total lifetime of the
> interpreter. There is only an escE at the very start, and no explicit delete of
> macro/font.
> Thus memoryconsumption of the pcl6 rip should not grow substantially above the
> 34MB (for the download data).

The fonts are compressed about 2:1 so it will be quite a bit higher, but more importantly the macros are not deleted.  It looks like an error in the pcl.  Each delete command seems to come immediately after creating a fresh macro id for the next macro.  If the delete command was before it would have deleted the previous macro and there would be little memory growth.  I am stumped how this could work in the other rip, every object in memory has been allocated as expected from the pcl and there are no commands delete the objects nor have the objects gone out of scope.

Comment 9 norbert.janssen 2010-06-17 14:01:30 UTC

Created attachment 6383 [details]
adding binarysearch to pl_dict_lookup_entry().

When testing with cmp_02_08.pcl I noticed that pl_dict_lookup_entry() uses a linear search algorithm. Attached a tarfile in which I added a binarysearch algorithm to speed up the lookup in large dictionaries.

I also noticed a bug in pl_dict_free(). When an entry is freeed, it is not unlinked correctly from the pdict->entries list.

What I noticed (but I can't be entirely certain). In pcl_font_header( when a font is reloaded for the second time (after page 5), pcl_delete_soft_font() removes the same font (i.e. the pde entry from the dictionary is deleted (HOWEVER I did not see it getting unlinked from the pdict->entries list). So after added same font (there were 2 entries in the pdict->entries list with the same short-key.
In a linearsearch algorithm the first (last one added) is returned, so this is correct. But the list becomes unnecessary large.

The unlinking should have been done in pl_dict_free(). See attached pdict.tar

Comment 10 norbert.janssen 2010-06-17 14:28:57 UTC

Created attachment 6384 [details]
replacement for previous pldict.tar

I put all additional code between #ifdef USE_BINSEARCH, so it can be easily (not)compiled in.

Comment 11 Henry Stiles 2010-06-17 21:26:48 UTC

(In reply to comment #10)
> Created an attachment (id=6384) [details]
> replacement for previous pldict.tar
> 
> I put all additional code between #ifdef USE_BINSEARCH, so it can be easily
> (not)compiled in.

Hi Norbert,

Thanks for this, I am testing it and don't see any problems yet, I do see the following valgrind complaint with many files, any ideas before I dive into it with the debugger?

==26725== Invalid read of size 4
==26725==    at 0x321E35: pl_dict_lookup (pldict.c:281)
==26725==    by 0x3369F1: pcl_find_symbol_map (pcsymbol.c:289)
==26725==    by 0x32E744: check_support (pcfsel.c:164)
==26725==    by 0x32E8C7: score_match (pcfsel.c:226)
==26725==    by 0x32EDDB: pcl_reselect_font (pcfsel.c:413)
==26725==    by 0x332E05: pcl_recompute_font (pcfont.c:90)
==26725==    by 0x335744: pcl_text (pctext.c:737)
==26725==    by 0x34134F: pcl_process (pcparse.c:474)
==26725==    by 0x2179: pcl_impl_process (pctop.c:461)
==26725==    by 0x32372E: pl_process (pltop.c:162)
==26725==    by 0x3739BD: pl_main (plmain.c:421)
==26725==    by 0x375A66: main (plmain.c:1308)
==26725==  Address 0xbfffe4a4 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes

Comment 12 norbert.janssen 2010-06-18 06:34:51 UTC

(In reply to comment #11)
> (In reply to comment #10)
> > Created an attachment (id=6384) [details] [details]
> > replacement for previous pldict.tar
> > 
> > I put all additional code between #ifdef USE_BINSEARCH, so it can be easily
> > (not)compiled in.
> 
> Hi Norbert,
> 
> Thanks for this, I am testing it and don't see any problems yet, I do see the
> following valgrind complaint with many files, any ideas before I dive into it
> with the debugger?
> 
> ==26725== Invalid read of size 4
> ==26725==    at 0x321E35: pl_dict_lookup (pldict.c:281)
> ==26725==    by 0x3369F1: pcl_find_symbol_map (pcsymbol.c:289)
> ==26725==    by 0x32E744: check_support (pcfsel.c:164)
> ==26725==    by 0x32E8C7: score_match (pcfsel.c:226)
> ==26725==    by 0x32EDDB: pcl_reselect_font (pcfsel.c:413)
> ==26725==    by 0x332E05: pcl_recompute_font (pcfont.c:90)
> ==26725==    by 0x335744: pcl_text (pctext.c:737)
> ==26725==    by 0x34134F: pcl_process (pcparse.c:474)
> ==26725==    by 0x2179: pcl_impl_process (pctop.c:461)
> ==26725==    by 0x32372E: pl_process (pltop.c:162)
> ==26725==    by 0x3739BD: pl_main (plmain.c:421)
> ==26725==    by 0x375A66: main (plmain.c:1308)
> ==26725==  Address 0xbfffe4a4 is just below the stack ptr.  To suppress, use:
> --workaround-gcc296-bugs=yes

Probably the return of ppde in pl_dict_lookup_entry.

    if (i != -1)
    {
	pde = pdict->sorted[i];
	ppde = &pde;
	return ppde;
    }

In the old situation, ppde contains the address of the pde->next pointer (pde is the current entry in the linked-list; then pde is set to the next entry.
What is returned is the address of the previous_pde->next i.e. address of a pointer to the pde to be looked for.

Only difference is that the pde in the new situation is not declared a static, so should not be accessed outside pl_dict_lookup_entry.

So perhaps the following is better:

    if (i != -1)
    {
        pde = pdict->sorted[i]->prev; /* pde (or pdict->entries) before lookup */
        ppde =&pde->next;
        return ppde;
    }


Now the returned value of ppde is the same as in the !USE_BINSEARCH case

Note: the pdict->sorted only grows (never shrinks), and is only released if the dictionary is released.

Comment 13 Henry Stiles 2010-06-24 04:53:47 UTC

(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > Created an attachment (id=6384) [details] [details] [details]
> > > replacement for previous pldict.tar
> > > 
> > > I put all additional code between #ifdef USE_BINSEARCH, so it can be easily
> > > (not)compiled in.
> > 
> > Hi Norbert,
> > 
> > Thanks for this, I am testing it and don't see any problems yet, I do see the
> > following valgrind complaint with many files, any ideas before I dive into it
> > with the debugger?
> > 
> > ==26725== Invalid read of size 4
> > ==26725==    at 0x321E35: pl_dict_lookup (pldict.c:281)
> > ==26725==    by 0x3369F1: pcl_find_symbol_map (pcsymbol.c:289)
> > ==26725==    by 0x32E744: check_support (pcfsel.c:164)
> > ==26725==    by 0x32E8C7: score_match (pcfsel.c:226)
> > ==26725==    by 0x32EDDB: pcl_reselect_font (pcfsel.c:413)
> > ==26725==    by 0x332E05: pcl_recompute_font (pcfont.c:90)
> > ==26725==    by 0x335744: pcl_text (pctext.c:737)
> > ==26725==    by 0x34134F: pcl_process (pcparse.c:474)
> > ==26725==    by 0x2179: pcl_impl_process (pctop.c:461)
> > ==26725==    by 0x32372E: pl_process (pltop.c:162)
> > ==26725==    by 0x3739BD: pl_main (plmain.c:421)
> > ==26725==    by 0x375A66: main (plmain.c:1308)
> > ==26725==  Address 0xbfffe4a4 is just below the stack ptr.  To suppress, use:
> > --workaround-gcc296-bugs=yes
> 
> Probably the return of ppde in pl_dict_lookup_entry.
> 
>     if (i != -1)
>     {
>     pde = pdict->sorted[i];
>     ppde = &pde;
>     return ppde;
>     }
> 
> In the old situation, ppde contains the address of the pde->next pointer (pde
> is the current entry in the linked-list; then pde is set to the next entry.
> What is returned is the address of the previous_pde->next i.e. address of a
> pointer to the pde to be looked for.
> 
> Only difference is that the pde in the new situation is not declared a static,
> so should not be accessed outside pl_dict_lookup_entry.
> 
> So perhaps the following is better:
> 
>     if (i != -1)
>     {
>         pde = pdict->sorted[i]->prev; /* pde (or pdict->entries) before lookup
> */
>         ppde =&pde->next;
>         return ppde;
>     }
> 
> 
> Now the returned value of ppde is the same as in the !USE_BINSEARCH case
> 
> Note: the pdict->sorted only grows (never shrinks), and is only released if the
> dictionary is released.

I don't understand what you mean the return is in the for loop before the postcondition sets ppde = &pde->next.  Actually I worked on this independently I didn't notice your comment here and simply fixed it with:

 return &pdict->sorted[i]

maybe I don't understand something.  That fixes owl.pcl but the code crashes later on with pcl5c fts 0543.  There is code (pl_dict_undef_purge_synonyms) that assumes the code will inserted at the head of a linked list - I don't know if that is the issue with fts 0543.  Have you been working on this?

Comment 14 norbert.janssen 2010-06-24 07:02:31 UTC

(In reply to comment #13)
I was not clear. 
The valgrind warning was about accessing the address of pde (ppde = &pde; which is a localvar on stac), so outside the function this is not valid anymore).
> > Probably the return of ppde in pl_dict_lookup_entry.
> > 
> >     if (i != -1)
> >     {
> >     pde = pdict->sorted[i];
> >     ppde = &pde;
> >     return ppde;
> >     }
> > 

> > So perhaps the following is better:
> > 
> >     if (i != -1)
> >     {
> >         pde = pdict->sorted[i]->prev; /* pde (or pdict->entries) before lookup
> > */
> >         ppde =&pde->next;
> >         return ppde;
> >     }
> > 

> 
> I don't understand what you mean the return is in the for loop before the
> postcondition sets ppde = &pde->next.  Actually I worked on this independently
> I didn't notice your comment here and simply fixed it with:
> 
> 
This is also a good solution.

> maybe I don't understand something.  That fixes owl.pcl but the code crashes
> later on with pcl5c fts 0543.  There is code (pl_dict_undef_purge_synonyms)
> that assumes the code will inserted at the head of a linked list - I don't know
> if that is the issue with fts 0543.  Have you been working on this?

The pl_dict_entry is still inserted at the front of the list, this has not changed. There is just an extra sorted-array of pointers which speeds up the lookup of an entry.
The pl_dict_undef_purge_synonyms, removes entries using pl_dict_undef(), so this should also not be a problem. pl_dict_undef() calls pl_dict_free() that first removes the entry in the sorted-array, and then removes the pl_dict_entry from the list.

Only thing extra in pl_dict_free, is the unlinking of the pde from the list, which should be done before calling gs_free_object(). Or am I missing something? See also my comment#9.

fts 0543 is from QualityLogic suite I assume. Have to see if I have it too.

Comment 15 norbert.janssen 2010-07-30 08:52:13 UTC

I have the memoryconsumption for this job now under control. Also performance is ok now.
So this CR can be closed.

Comment 16 Marcos H. Woehrmann 2011-09-18 21:46:26 UTC

Changing customer bugs that have been resolved more than a year ago to closed.