Bug 689488

Summary: Performance issue: 8.60 slower than 8.53
Product: Ghostscript Reporter: Marcos H. Woehrmann <marcos.woehrmann>
Component: PDF Interpreter Assignee: Alex Cherepanov <alex>
Status: NOTIFIED FIXED    
Severity: normal CC: leonardo
Priority: P2    
Version: 8.60   
Hardware: All   
OS: All   
Customer: 190 Word Size: ---
Attachments: profile.txt
patch
patch for gs8.60
patch

Description Marcos H. Woehrmann 2007-09-28 09:19:32 UTC
The customer reports and I've verified that Ghostscript 8.60 (and head (r8257)) are much slower at 
processing the test file than Ghostscript 8.53.

On my AMD64 Linux box gs8.60 takes 10 minutes vs. 30 seconds for gs8.53.

The file is too large to attach to this bug: casper:/home/support/689488/hartboom.pdf

The command line I used for testing:

  bin/gs -sDEVICE=ppmraw -sOutputFile=test.ppm ./hartboom.pdf
Comment 1 Marcos H. Woehrmann 2007-09-28 09:22:26 UTC
Created attachment 3424 [details]
profile.txt

Customer supplied profile for gs8.60
Comment 2 Marcos H. Woehrmann 2007-09-28 09:36:12 UTC
Created attachment 3425 [details]
patch

This patch fixes the problem.
Comment 3 Marcos H. Woehrmann 2007-09-28 09:37:10 UTC
Please ignore comment #2 (the patch is for a different bug).
Comment 4 Marcos H. Woehrmann 2007-09-29 07:45:47 UTC
I attempted to find the revision responsible for the increase in processing time but my results were not 
definitive:

rev   run time (minutes:seconds)
6375  0:36
7004  0:36
7005  4:30
7158  4:34
7159  7:08
7187  7:18
7188  doesn't build
7189  8:56
7902  9:51
7903 10:48
8257 10:21
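
To see where the time went, the table above can be turned into per-step differences. This is only a rough sketch: the timings are copied from the table, r7188 is left out because it did not build, and the helper names (to_seconds, jumps) are made up for illustration.

```python
# Timings copied from the table above; r7188 is omitted (it did not build).
timings = [
    (6375, "0:36"), (7004, "0:36"), (7005, "4:30"), (7158, "4:34"),
    (7159, "7:08"), (7187, "7:18"), (7189, "8:56"), (7902, "9:51"),
    (7903, "10:48"), (8257, "10:21"),
]


def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def jumps(rows):
    """Return (previous_rev, rev, delta_seconds) for each adjacent pair."""
    secs = [(rev, to_seconds(t)) for rev, t in rows]
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(secs, secs[1:])]


# Print the three largest increases between adjacent measured revisions.
for prev, rev, delta in sorted(jumps(timings), key=lambda j: -j[2])[:3]:
    print(f"r{prev} -> r{rev}: {delta:+d} s")
```

This prints +234 s for r7004 -> r7005, +154 s for r7158 -> r7159 and +98 s for r7187 -> r7189 (the last step spans the revision that did not build).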

Comment 5 Alex Cherepanov 2007-10-02 04:23:15 UTC
The change made in rev. 7005 is the most significant factor.
Reverting this change in the current version gives a 7-fold
speed improvement.
Comment 6 Marcos H. Woehrmann 2007-10-02 10:36:18 UTC
In my testing, reverting the r7005 change gives an even larger improvement:

8.60 stock: 9:32
8.60 without r7005: 0:33

There is no difference in the output.
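
For reference, the ratio implied by the two timings above can be worked out as follows. This is just arithmetic on the quoted numbers; the helper name to_seconds is made up for illustration.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


stock = to_seconds("9:32")     # 8.60 stock
reverted = to_seconds("0:33")  # 8.60 without r7005
speedup = stock / reverted
print(f"{speedup:.1f}x")       # prints 17.3x
```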
Comment 7 Marcos H. Woehrmann 2007-10-02 10:41:51 UTC
Created attachment 3431 [details]
patch for gs8.60
Comment 8 Alex Cherepanov 2007-10-02 17:05:46 UTC
The gstate is placed into the 2nd element of the implementation array
for a purpose - to simulate Adobe interpreters for CET compliance.
It is not used anywhere by Ghostscript.

We can place the same gstate (e.g. one saved during start-up) into all
Implementation arrays.
Comment 9 Alex Cherepanov 2008-05-12 21:00:44 UTC
Created attachment 4006 [details]
patch

Reduce generation of garbage and the time spent on garbage collection by
placing gstate into the 2nd element of the pattern implementation array
in Adobe compatibility mode only. Ghostscript doesn't use this gstate
at all.

The patch is committed as rev. 8727.
Regression testing shows no differences.
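
The idea can be illustrated with a toy model. This is not Ghostscript code: the class GState, the functions build_implementation and count_clones, and the adobe_compat flag are all hypothetical names, and the first array element is just a placeholder string. It only shows why making the gstate copy conditional removes one allocation per pattern.

```python
class GState:
    """Toy stand-in for a graphics state; counts how many copies are made."""
    copies = 0

    def clone(self):
        GState.copies += 1
        return GState()


def build_implementation(instance, gstate, adobe_compat):
    """Return a two-element array; the 2nd element is a gstate copy
    only when the (hypothetical) Adobe compatibility flag is set."""
    saved = gstate.clone() if adobe_compat else None
    return [instance, saved]


def count_clones(n_patterns, adobe_compat):
    """Count gstate copies made while building n_patterns arrays."""
    GState.copies = 0
    current = GState()
    for i in range(n_patterns):
        build_implementation(f"pattern-{i}", current, adobe_compat)
    return GState.copies


print(count_clones(1000, adobe_compat=True))   # prints 1000
print(count_clones(1000, adobe_compat=False))  # prints 0
```

In this model every skipped copy is one less object for the garbage collector to trace and reclaim, which is the effect the patch aims for.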
Comment 10 Alex Cherepanov 2008-05-12 21:07:26 UTC
Profiling shows that most of the time now goes into evaluation
of FunctionType 0. Perhaps Leonardo can comment on the new profile.

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 23.39     18.75    18.75   292343     0.00     0.00  fn_Sd_1arg_linear_monotonic_rec
  8.97     25.94     7.19      564     0.01     0.14  interp
  6.86     31.44     5.50 22009901     0.00     0.00  fn_Sd_evaluate_general
  5.25     35.65     4.21 22009901     0.00     0.00  fn_interpolate_linear
  2.73     37.84     2.19 100041115     0.00     0.00  gx_unit_frac
  2.72     40.02     2.18  8251347     0.00     0.00  function_linearity
  2.11     41.71     1.69 12493371     0.00     0.00  is_dc_nearly_linear
  2.07     43.36     1.66  7294089     0.00     0.00  dstack_find_name_by_index
  2.00     44.97     1.61 96265099     0.00     0.00  fn_gets_8
  1.52     46.19     1.22 25715695     0.00     0.00  cmap_cmyk_direct
  1.47     47.37     1.18                             __divdi3
  1.42     48.51     1.14     3330     0.00     0.00  ht_tiles_reloc_ptrs
  1.24     49.50     0.99  2671943     0.00     0.00  gx_fill_trapezoid_ns_lc
  0.87     50.20     0.70 22202185     0.00     0.00  gx_restrict01_paint_3
  0.86     50.89     0.69   449342     0.00     0.00  gstate_clone
  0.80     51.53     0.64  2986374     0.00     0.00  names_ref
  0.79     52.16     0.63  6045239     0.00     0.00  get_color_index_cache_elem
  0.75     52.76     0.60  5252479     0.00     0.00  gc_trace
  0.71     53.32     0.57 96265099     0.00     0.00  data_source_access_string
  0.68     53.87     0.55   192081     0.00     0.00  fill_quadrangle
  0.68     54.41     0.55  9038352     0.00     0.00  dict_find
  0.62     54.91     0.50 25009847     0.00     0.00  gx_remap_DeviceCMYK
  0.61     55.40     0.49  4245114     0.00     0.00  gx_cspace_is_linear_in_line
  0.61     55.89     0.49  4161726     0.00     0.00  scan_token
  0.57     56.35     0.46 11725605     0.00     0.00  array_get
  0.56     56.80     0.45 22202185     0.00     0.00  gx_restrict01_paint_4
  0.55     57.24     0.44 25715659     0.00     0.00  fwd_map_cmyk_cs
  0.52     57.66     0.42  3995994     0.00     0.00  gx_default_fill_linear_color_scanline
  0.50     58.06     0.40  4250542     0.00     0.00  init_gradient
  0.47     58.44     0.38  1108625     0.00     0.00  fn_Sd_1arg_linear_monotonic
  0.46     58.81     0.37  2894170     0.00     0.00  chunk_locate_ptr
  0.45     59.17     0.36 11285539     0.00     0.00  obj_eq
  0.43     59.51     0.35 12277041     0.00     0.00  igc_reloc_refs
  0.42     59.85     0.34  1528525     0.00     0.00  decompose_linear_color
  0.41     60.18     0.33  6045239     0.00     0.00  gs_cached_color_index
  0.41     60.51     0.33   510756     0.00     0.00  quadrangle_color_change
  0.40     60.83     0.32  1251747     0.00     0.00  gx_cspace_is_linear_in_triangle
  0.39     61.14     0.31    23420     0.00     0.00  inflate_fast
  0.37     61.44     0.30 25715695     0.00     0.00  color_cmyk_to_rgb
  0.37     61.74     0.30    88276     0.00     0.00  gc_trace_chunk
  0.35     62.02     0.28  8081889     0.00     0.00  i_alloc_struct
  0.35     62.30     0.28  7244635     0.00     0.00  i_free_object
  0.35     62.58     0.28  5553158     0.00     0.00  refs_clear_reloc
  0.31     62.83     0.25  7147969     0.00     0.00  refs_clear_marks
  0.30     63.07     0.24    63007     0.00     0.00  gc_objects_compact
  0.29     63.30     0.24   192087     0.00     0.00  patch_fill
  0.27     63.52     0.22  1108625     0.00     0.00  get_scaled_range
  0.27     63.74     0.22   151283     0.00     0.00  gc_objects_clear_marks
  0.26     63.95     0.21 25715659     0.00     0.00  gx_forward_get_color_mapping_procs
  0.25     64.15     0.20  2228571     0.00     0.00  gx_fill_trapezoid_ns_fd
  0.24     64.34     0.19   151283     0.00     0.00  gc_do_reloc
  0.24     64.53     0.19 13480160     0.00     0.00  zexec
  0.24     64.72     0.19 25715709     0.00     0.00  gx_default_rgb_map_rgb_color
  0.23     64.91     0.19   767372     0.00     0.00  triangle_by_4
  0.22     65.09     0.18  7290582     0.00     0.00  ztype
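
To put a number on the share of time spent in function evaluation, the self-time percentages can be summed by name prefix. This is a rough sketch that assumes the fn_ prefix identifies the function evaluation code; only a few rows of the profile above are reproduced, and the helper name percent_by_prefix is made up for illustration.

```python
# A few rows copied from the flat profile above (names unwrapped).
profile = """\
 23.39     18.75    18.75   292343     0.00     0.00  fn_Sd_1arg_linear_monotonic_rec
  8.97     25.94     7.19      564     0.01     0.14  interp
  6.86     31.44     5.50 22009901     0.00     0.00  fn_Sd_evaluate_general
  5.25     35.65     4.21 22009901     0.00     0.00  fn_interpolate_linear
  2.73     37.84     2.19 100041115     0.00     0.00  gx_unit_frac
  2.00     44.97     1.61 96265099     0.00     0.00  fn_gets_8
  1.47     47.37     1.18                             __divdi3
  0.47     58.44     0.38  1108625     0.00     0.00  fn_Sd_1arg_linear_monotonic
"""


def percent_by_prefix(text, prefix):
    """Sum the first column (% time) over rows whose name starts with prefix."""
    total = 0.0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[-1].startswith(prefix):
            total += float(fields[0])
    return total


print(f"{percent_by_prefix(profile, 'fn_'):.2f}%")  # prints 37.97%
```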
Comment 11 Alex Cherepanov 2008-07-16 23:18:52 UTC
Rev. 8842 makes gs about 40% faster than v. 8.53 on the
given sample file.