The customer reports, and I've verified, that Ghostscript 8.60 (and head (r8257)) are much slower at processing the test file than Ghostscript 8.53. On my AMD64 Linux box gs8.60 takes 10 minutes vs. 30 seconds for gs8.53.
The file is too large to attach to this bug:
    casper:/home/support/689488/hartboom.pdf
The command line I used for testing:
    bin/gs -sDEVICE=ppmraw -sOutputFile=test.ppm ./hartboom.pdf
Created attachment 3424 [details] profile.txt Customer supplied profile for gs8.60
Created attachment 3425 [details] patch This patch fixes the problem.
Please ignore comment #2 (the patch is for a different bug).
I attempted to find the revision responsible for the increase in processing time but my results were not definitive:
    rev     run time (minutes:seconds)
    6375    0:36
    7004    0:36
    7005    4:30
    7158    4:34
    7159    7:08
    7187    7:18
    7188    doesn't build
    7189    8:56
    7902    9:51
    7903    10:48
    8257    10:21
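To make the size of each step easier to see, here is a minimal Python sketch that converts the timings listed above to seconds and ranks adjacent measurements by slowdown ratio. The helper names (`to_seconds`, `slowdown_steps`) are made up for illustration and are not part of any Ghostscript tooling; r7188 is left out because it did not build.

```python
# Timings copied from the measurements above; r7188 is omitted (did not build).
TIMINGS = [
    (6375, "0:36"), (7004, "0:36"), (7005, "4:30"), (7158, "4:34"),
    (7159, "7:08"), (7187, "7:18"), (7189, "8:56"), (7902, "9:51"),
    (7903, "10:48"), (8257, "10:21"),
]


def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_steps(timings):
    """Return (prev_rev, rev, ratio) for each pair of adjacent measurements."""
    steps = []
    for (prev_rev, prev_t), (rev, t) in zip(timings, timings[1:]):
        steps.append((prev_rev, rev, to_seconds(t) / to_seconds(prev_t)))
    return steps


steps = slowdown_steps(TIMINGS)

# Print the three largest jumps between adjacent measured revisions.
for prev_rev, rev, ratio in sorted(steps, key=lambda s: s[2], reverse=True)[:3]:
    print(f"r{prev_rev} -> r{rev}: {ratio:.2f}x")
```

On these numbers the largest single step is r7004 -> r7005, followed by r7158 -> r7159. The r7187 -> r7189 step spans the revision that did not build, so it cannot be attributed to one commit from this data alone.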
The change made in rev. 7005 is the most significant factor. Reverting this change in the current version gives a 7 times speed improvement.
My testing with reverting the 7005 change shows a more significant improvement:
    8.60 stock:          9:32
    8.60 without r7005:  0:33
There is no difference in the output.
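As a quick check of what these two timings imply, the following Python sketch computes the ratio between them. It uses only the two figures reported in this comment; the variable names are illustrative.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


stock = to_seconds("9:32")      # 8.60 stock
reverted = to_seconds("0:33")   # 8.60 without r7005

speedup = stock / reverted
print(f"{stock} s vs. {reverted} s: {speedup:.1f}x faster with r7005 reverted")
```

That works out to roughly a 17x difference on this machine, compared with the 7x figure reported in the previous comment.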
Created attachment 3431 [details] patch for gs8.60
The gstate is placed into the 2nd element of the implementation array for a purpose: to simulate Adobe interpreters for CET compliance. It is not used anywhere by Ghostscript. We can place the same gstate (e.g. one saved during start-up) into all Implementation arrays.
Created attachment 4006 [details] patch
Reduce the generation of garbage and the time spent on garbage collection by placing the gstate into the 2nd element of the pattern implementation array in Adobe compatibility mode only. Ghostscript doesn't use this gstate at all.
The patch is committed as rev. 8727. Regression testing shows no differences.
Profiling shows that most of the time now goes into evaluation of FunctionType 0. Perhaps Leonardo can comment on the new profile.
     % cumulative     self               self    total
  time    seconds  seconds      calls  s/call   s/call  name
 23.39      18.75    18.75     292343    0.00     0.00  fn_Sd_1arg_linear_monotonic_rec
  8.97      25.94     7.19        564    0.01     0.14  interp
  6.86      31.44     5.50   22009901    0.00     0.00  fn_Sd_evaluate_general
  5.25      35.65     4.21   22009901    0.00     0.00  fn_interpolate_linear
  2.73      37.84     2.19  100041115    0.00     0.00  gx_unit_frac
  2.72      40.02     2.18    8251347    0.00     0.00  function_linearity
  2.11      41.71     1.69   12493371    0.00     0.00  is_dc_nearly_linear
  2.07      43.36     1.66    7294089    0.00     0.00  dstack_find_name_by_index
  2.00      44.97     1.61   96265099    0.00     0.00  fn_gets_8
  1.52      46.19     1.22   25715695    0.00     0.00  cmap_cmyk_direct
  1.47      47.37     1.18                              __divdi3
  1.42      48.51     1.14       3330    0.00     0.00  ht_tiles_reloc_ptrs
  1.24      49.50     0.99    2671943    0.00     0.00  gx_fill_trapezoid_ns_lc
  0.87      50.20     0.70   22202185    0.00     0.00  gx_restrict01_paint_3
  0.86      50.89     0.69     449342    0.00     0.00  gstate_clone
  0.80      51.53     0.64    2986374    0.00     0.00  names_ref
  0.79      52.16     0.63    6045239    0.00     0.00  get_color_index_cache_elem
  0.75      52.76     0.60    5252479    0.00     0.00  gc_trace
  0.71      53.32     0.57   96265099    0.00     0.00  data_source_access_string
  0.68      53.87     0.55     192081    0.00     0.00  fill_quadrangle
  0.68      54.41     0.55    9038352    0.00     0.00  dict_find
  0.62      54.91     0.50   25009847    0.00     0.00  gx_remap_DeviceCMYK
  0.61      55.40     0.49    4245114    0.00     0.00  gx_cspace_is_linear_in_line
  0.61      55.89     0.49    4161726    0.00     0.00  scan_token
  0.57      56.35     0.46   11725605    0.00     0.00  array_get
  0.56      56.80     0.45   22202185    0.00     0.00  gx_restrict01_paint_4
  0.55      57.24     0.44   25715659    0.00     0.00  fwd_map_cmyk_cs
  0.52      57.66     0.42    3995994    0.00     0.00  gx_default_fill_linear_color_scanline
  0.50      58.06     0.40    4250542    0.00     0.00  init_gradient
  0.47      58.44     0.38    1108625    0.00     0.00  fn_Sd_1arg_linear_monotonic
  0.46      58.81     0.37    2894170    0.00     0.00  chunk_locate_ptr
  0.45      59.17     0.36   11285539    0.00     0.00  obj_eq
  0.43      59.51     0.35   12277041    0.00     0.00  igc_reloc_refs
  0.42      59.85     0.34    1528525    0.00     0.00  decompose_linear_color
  0.41      60.18     0.33    6045239    0.00     0.00  gs_cached_color_index
  0.41      60.51     0.33     510756    0.00     0.00  quadrangle_color_change
  0.40      60.83     0.32    1251747    0.00     0.00  gx_cspace_is_linear_in_triangle
  0.39      61.14     0.31      23420    0.00     0.00  inflate_fast
  0.37      61.44     0.30   25715695    0.00     0.00  color_cmyk_to_rgb
  0.37      61.74     0.30      88276    0.00     0.00  gc_trace_chunk
  0.35      62.02     0.28    8081889    0.00     0.00  i_alloc_struct
  0.35      62.30     0.28    7244635    0.00     0.00  i_free_object
  0.35      62.58     0.28    5553158    0.00     0.00  refs_clear_reloc
  0.31      62.83     0.25    7147969    0.00     0.00  refs_clear_marks
  0.30      63.07     0.24      63007    0.00     0.00  gc_objects_compact
  0.29      63.30     0.24     192087    0.00     0.00  patch_fill
  0.27      63.52     0.22    1108625    0.00     0.00  get_scaled_range
  0.27      63.74     0.22     151283    0.00     0.00  gc_objects_clear_marks
  0.26      63.95     0.21   25715659    0.00     0.00  gx_forward_get_color_mapping_procs
  0.25      64.15     0.20    2228571    0.00     0.00  gx_fill_trapezoid_ns_fd
  0.24      64.34     0.19     151283    0.00     0.00  gc_do_reloc
  0.24      64.53     0.19   13480160    0.00     0.00  zexec
  0.24      64.72     0.19   25715709    0.00     0.00  gx_default_rgb_map_rgb_color
  0.23      64.91     0.19     767372    0.00     0.00  triangle_by_4
  0.22      65.09     0.18    7290582    0.00     0.00  ztype
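One rough way to quantify the share attributed to function evaluation is to sum the flat-profile rows whose names start with `fn_`. The Python sketch below does that over an excerpt of the rows above. Treating the `fn_` prefix as "function evaluation code" is an assumption made here for illustration, and `parse_rows` / `family_totals` are hypothetical helpers, not gprof or Ghostscript tools.

```python
# Excerpt of the flat profile above: every fn_* row plus a few others for contrast.
PROFILE_EXCERPT = """\
23.39 18.75 18.75 292343 0.00 0.00 fn_Sd_1arg_linear_monotonic_rec
8.97 25.94 7.19 564 0.01 0.14 interp
6.86 31.44 5.50 22009901 0.00 0.00 fn_Sd_evaluate_general
5.25 35.65 4.21 22009901 0.00 0.00 fn_interpolate_linear
2.73 37.84 2.19 100041115 0.00 0.00 gx_unit_frac
2.00 44.97 1.61 96265099 0.00 0.00 fn_gets_8
1.47 47.37 1.18 __divdi3
0.47 58.44 0.38 1108625 0.00 0.00 fn_Sd_1arg_linear_monotonic
"""


def parse_rows(text):
    """Return (name, percent_time, self_seconds) for each profile row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # First field is % time, third is self seconds, last is the name.
        # This also holds for rows without call counts, such as __divdi3.
        rows.append((fields[-1], float(fields[0]), float(fields[2])))
    return rows


def family_totals(rows, prefix):
    """Sum % time and self seconds over rows whose name starts with prefix."""
    pct = sum(p for name, p, _ in rows if name.startswith(prefix))
    secs = sum(s for name, _, s in rows if name.startswith(prefix))
    return round(pct, 2), round(secs, 2)


fn_pct, fn_secs = family_totals(parse_rows(PROFILE_EXCERPT), "fn_")
print(f"fn_* functions: {fn_pct}% of time, {fn_secs} s self time")
```

By this count the `fn_*` rows account for about 38% of the profiled time, with `fn_Sd_1arg_linear_monotonic_rec` alone contributing 23.39%. Related callers that do not carry the prefix are not included in that figure.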
Rev. 8842 makes gs about 40% faster than v. 8.53 on the given sample file.