Bug 689488 - Performance issue: 8.60 slower than 8.53
Summary: Performance issue: 8.60 slower than 8.53
Status: NOTIFIED FIXED
Alias: None
Product: Ghostscript
Classification: Unclassified
Component: PDF Interpreter
Version: 8.60
Hardware: All All
Importance: P2 normal
Assignee: Alex Cherepanov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-09-28 09:19 UTC by Marcos H. Woehrmann
Modified: 2008-12-19 08:31 UTC

See Also:
Customer: 190
Word Size: ---


Attachments
profile.txt (98.58 KB, text/plain)
2007-09-28 09:22 UTC, Marcos H. Woehrmann
patch (927 bytes, patch)
2007-09-28 09:36 UTC, Marcos H. Woehrmann
patch for gs8.60 (906 bytes, patch)
2007-10-02 10:41 UTC, Marcos H. Woehrmann
patch (1.64 KB, patch)
2008-05-12 21:00 UTC, Alex Cherepanov

Description Marcos H. Woehrmann 2007-09-28 09:19:32 UTC
The customer reports and I've verified that Ghostscript 8.60 (and head (r8257)) are much slower at 
processing the test file than Ghostscript 8.53.

On my AMD64 Linux box gs8.60 takes 10 minutes vs. 30 seconds for gs8.53.

The file is too large to attach to this bug: casper:/home/support/689488/hartboom.pdf

The command line I used for testing:

  bin/gs -sDEVICE=ppmraw -sOutputFile=test.ppm ./hartboom.pdf
Comment 1 Marcos H. Woehrmann 2007-09-28 09:22:26 UTC
Created attachment 3424
profile.txt

Customer supplied profile for gs8.60
Comment 2 Marcos H. Woehrmann 2007-09-28 09:36:12 UTC
Created attachment 3425
patch

This patch fixes the problem.
Comment 3 Marcos H. Woehrmann 2007-09-28 09:37:10 UTC
Please ignore comment #2 (the patch is for a different bug).
Comment 4 Marcos H. Woehrmann 2007-09-29 07:45:47 UTC
I attempted to find the revision responsible for the increase in processing time but my results were not 
definitive:

rev   run time (minutes:seconds)
6375  0:36
7004  0:36
7005  4:30
7158  4:34
7159  7:08
7187  7:18
7188  doesn't build
7189  8:56
7902  9:51
7903 10:48
8257 10:21
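
To make the table above easier to read, here is a minimal Python sketch that converts the run times to seconds and ranks the increase between each pair of adjacent tested revisions. The helper names (`to_seconds`, `jumps`) are illustrative only, and r7188 is left out because it did not build, so the r7187 -> r7189 step may cover either of those two revisions.

```python
# Timings copied from the table above as (revision, "m:ss").
# r7188 is omitted because it did not build.
TIMINGS = [
    (6375, "0:36"), (7004, "0:36"), (7005, "4:30"), (7158, "4:34"),
    (7159, "7:08"), (7187, "7:18"), (7189, "8:56"), (7902, "9:51"),
    (7903, "10:48"), (8257, "10:21"),
]


def to_seconds(mmss):
    """Convert an 'm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def jumps(timings):
    """Return (delta_seconds, prev_rev, rev) for adjacent tested revisions,
    sorted with the largest slowdown first."""
    secs = [(rev, to_seconds(t)) for rev, t in timings]
    deltas = [(b[1] - a[1], a[0], b[0]) for a, b in zip(secs, secs[1:])]
    return sorted(deltas, reverse=True)


if __name__ == "__main__":
    for delta, prev_rev, rev in jumps(TIMINGS)[:3]:
        print(f"r{prev_rev} -> r{rev}: +{delta} s")
```

On these numbers the three largest steps are r7004 -> r7005, r7158 -> r7159 and r7187 -> r7189, which suggests more than one revision contributes to the slowdown.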

Comment 5 Alex Cherepanov 2007-10-02 04:23:15 UTC
The change made in rev. 7005 is the most significant factor.
Reverting this change in the current version gives a 7-fold
speed improvement.
Comment 6 Marcos H. Woehrmann 2007-10-02 10:36:18 UTC
In my testing, reverting the r7005 change gives an even larger improvement:

8.60 stock: 9:32
8.60 without r7005: 0:33

There is no difference in the output.
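
For comparison with the 7x figure in comment 5, the ratio implied by these two timings can be worked out as below. This is only arithmetic on the numbers quoted above; the helper name `to_seconds` is illustrative.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


stock = to_seconds("9:32")     # 8.60 stock
reverted = to_seconds("0:33")  # 8.60 without r7005
speedup = stock / reverted

print(f"{stock} s vs {reverted} s: {speedup:.1f}x faster")
```

That is roughly a 17-fold speedup on this machine and test file.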
Comment 7 Marcos H. Woehrmann 2007-10-02 10:41:51 UTC
Created attachment 3431
patch for gs8.60
Comment 8 Alex Cherepanov 2007-10-02 17:05:46 UTC
The gstate is placed into the 2nd element of the implementation array
for a purpose - to simulate Adobe interpreters for CET compliance.
It is not used anywhere by Ghostscript.

We can place the same gstate (e.g. saved during the start-up) to all
Implementation arrays.
Comment 9 Alex Cherepanov 2008-05-12 21:00:44 UTC
Created attachment 4006
patch

Reduce generation of garbage and the time spent on garbage collection by
placing gstate into the 2nd element of the pattern implementation array
in Adobe compatibility mode only. Ghostscript doesn't use this gstate
at all.

The patch is committed as a rev. 8727.
Regression testing shows no differences.
Comment 10 Alex Cherepanov 2008-05-12 21:07:26 UTC
Profiling shows that most of the time now goes into evaluation
of FunctionType 0. Perhaps Leonardo can comment on the new profile.

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 23.39     18.75    18.75   292343     0.00     0.00  fn_Sd_1arg_linear_monotonic_rec
  8.97     25.94     7.19      564     0.01     0.14  interp
  6.86     31.44     5.50 22009901     0.00     0.00  fn_Sd_evaluate_general
  5.25     35.65     4.21 22009901     0.00     0.00  fn_interpolate_linear
  2.73     37.84     2.19 100041115     0.00     0.00  gx_unit_frac
  2.72     40.02     2.18  8251347     0.00     0.00  function_linearity
  2.11     41.71     1.69 12493371     0.00     0.00  is_dc_nearly_linear
  2.07     43.36     1.66  7294089     0.00     0.00  dstack_find_name_by_index
  2.00     44.97     1.61 96265099     0.00     0.00  fn_gets_8
  1.52     46.19     1.22 25715695     0.00     0.00  cmap_cmyk_direct
  1.47     47.37     1.18                             __divdi3
  1.42     48.51     1.14     3330     0.00     0.00  ht_tiles_reloc_ptrs
  1.24     49.50     0.99  2671943     0.00     0.00  gx_fill_trapezoid_ns_lc
  0.87     50.20     0.70 22202185     0.00     0.00  gx_restrict01_paint_3
  0.86     50.89     0.69   449342     0.00     0.00  gstate_clone
  0.80     51.53     0.64  2986374     0.00     0.00  names_ref
  0.79     52.16     0.63  6045239     0.00     0.00  get_color_index_cache_elem
  0.75     52.76     0.60  5252479     0.00     0.00  gc_trace
  0.71     53.32     0.57 96265099     0.00     0.00  data_source_access_string
  0.68     53.87     0.55   192081     0.00     0.00  fill_quadrangle
  0.68     54.41     0.55  9038352     0.00     0.00  dict_find
  0.62     54.91     0.50 25009847     0.00     0.00  gx_remap_DeviceCMYK
  0.61     55.40     0.49  4245114     0.00     0.00  gx_cspace_is_linear_in_line
  0.61     55.89     0.49  4161726     0.00     0.00  scan_token
  0.57     56.35     0.46 11725605     0.00     0.00  array_get
  0.56     56.80     0.45 22202185     0.00     0.00  gx_restrict01_paint_4
  0.55     57.24     0.44 25715659     0.00     0.00  fwd_map_cmyk_cs
  0.52     57.66     0.42  3995994     0.00     0.00  gx_default_fill_linear_color_scanline
  0.50     58.06     0.40  4250542     0.00     0.00  init_gradient
  0.47     58.44     0.38  1108625     0.00     0.00  fn_Sd_1arg_linear_monotonic
  0.46     58.81     0.37  2894170     0.00     0.00  chunk_locate_ptr
  0.45     59.17     0.36 11285539     0.00     0.00  obj_eq
  0.43     59.51     0.35 12277041     0.00     0.00  igc_reloc_refs
  0.42     59.85     0.34  1528525     0.00     0.00  decompose_linear_color
  0.41     60.18     0.33  6045239     0.00     0.00  gs_cached_color_index
  0.41     60.51     0.33   510756     0.00     0.00  quadrangle_color_change
  0.40     60.83     0.32  1251747     0.00     0.00  gx_cspace_is_linear_in_triangle
  0.39     61.14     0.31    23420     0.00     0.00  inflate_fast
  0.37     61.44     0.30 25715695     0.00     0.00  color_cmyk_to_rgb
  0.37     61.74     0.30    88276     0.00     0.00  gc_trace_chunk
  0.35     62.02     0.28  8081889     0.00     0.00  i_alloc_struct
  0.35     62.30     0.28  7244635     0.00     0.00  i_free_object
  0.35     62.58     0.28  5553158     0.00     0.00  refs_clear_reloc
  0.31     62.83     0.25  7147969     0.00     0.00  refs_clear_marks
  0.30     63.07     0.24    63007     0.00     0.00  gc_objects_compact
  0.29     63.30     0.24   192087     0.00     0.00  patch_fill
  0.27     63.52     0.22  1108625     0.00     0.00  get_scaled_range
  0.27     63.74     0.22   151283     0.00     0.00  gc_objects_clear_marks
  0.26     63.95     0.21 25715659     0.00     0.00  gx_forward_get_color_mapping_procs
  0.25     64.15     0.20  2228571     0.00     0.00  gx_fill_trapezoid_ns_fd
  0.24     64.34     0.19   151283     0.00     0.00  gc_do_reloc
  0.24     64.53     0.19 13480160     0.00     0.00  zexec
  0.24     64.72     0.19 25715709     0.00     0.00  gx_default_rgb_map_rgb_color
  0.23     64.91     0.19   767372     0.00     0.00  triangle_by_4
  0.22     65.09     0.18  7290582     0.00     0.00  ztype
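
As a rough check on how much of the profile above is function evaluation, the following Python sketch parses a few rows of the flat profile and sums the "% time" column for names starting with `fn_`. Treating the `fn_` prefix as "function evaluation code" is an assumption made for illustration, and the parser (`parse_flat_profile`) is a hypothetical helper, not part of gprof or Ghostscript.

```python
# A few rows copied from the flat profile above (wrapped names rejoined).
PROFILE = """\
 23.39     18.75    18.75   292343     0.00     0.00  fn_Sd_1arg_linear_monotonic_rec
  8.97     25.94     7.19      564     0.01     0.14  interp
  6.86     31.44     5.50 22009901     0.00     0.00  fn_Sd_evaluate_general
  5.25     35.65     4.21 22009901     0.00     0.00  fn_interpolate_linear
  2.00     44.97     1.61 96265099     0.00     0.00  fn_gets_8
  1.47     47.37     1.18                             __divdi3
  0.47     58.44     0.38  1108625     0.00     0.00  fn_Sd_1arg_linear_monotonic
"""


def parse_flat_profile(text):
    """Return (name, percent_time, self_seconds) for each data row.

    Rows without call counts (such as __divdi3) have fewer columns, so the
    name is taken from the last field rather than a fixed position.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            percent = float(fields[0])
            self_seconds = float(fields[2])
        except ValueError:
            continue  # header or other non-data line
        rows.append((fields[-1], percent, self_seconds))
    return rows


rows = parse_flat_profile(PROFILE)
# Assumption: names with the fn_ prefix belong to function evaluation.
fn_percent = sum(p for name, p, _ in rows if name.startswith("fn_"))
fn_seconds = sum(s for name, _, s in rows if name.startswith("fn_"))

print(f"fn_* share of self time: {fn_percent:.2f}% ({fn_seconds:.2f} s)")
```

On the rows listed in this comment, the `fn_` entries add up to about 38% of self time, more than four times the next entry (`interp`, 8.97%).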
Comment 11 Alex Cherepanov 2008-07-16 23:18:52 UTC
Rev. 8842 makes gs about 40% faster than v. 8.53 on the
given sample file.