Before raising feature requests, I'd first like to explain what I intend to do. There might be much better ways to accomplish the same thing that I'm simply not aware of.
I need to render logos (.eps) that primarily contain text into .png/.gif images for display in web applications at high throughput. The current implementation runs on a quad-core Xeon (>3 GHz), invokes multiple "gs" processes concurrently, and keeps a cache of rendered images at hand. However, I'm looking for ways to improve performance beyond that.
The EPS files have PostScript Type 1 fonts embedded. I extracted the fonts and made Ghostscript aware of them by adding them to the Fontmap file, which leaves smaller EPS files to parse. I didn't expect extracting the fonts to make multiple "gs" process invocations faster, but I did expect the gsapi to get faster, because it could reuse glyphs already rasterized from the PostScript Type 1 outlines. Unfortunately that's not the case. On my Core2 Duo 2.5 GHz development machine I achieve about 7 eps->png conversions per second in a single-threaded environment, regardless of whether I use gs processes or the gsapi, and whether the fonts are extracted or embedded. The performance difference is only 10%.
Without having profiled Ghostscript, I expect that rasterizing the PostScript Type 1 outlines is the most expensive task in my environment: an EPS file including the PS Type 1 fonts is 220 KB, an EPS file without them is 12 KB. If those rasterized fonts could be reused, I would expect a dramatic performance boost. Maybe there are other ways to do that (e.g. use the Ghostscript library and render to display buffers: grab the image, clear the page, render the next image to the buffer, grab the image, ...). What's the best way to do that?
I noticed a couple more things hampering ultimate performance:
. "The exported gsapi_*() functions must be called from one thread only."
. "At this stage, Ghostscript does not support multiple instances of the interpreter within a single process."
With separate gs processes I can distribute the work among the available CPU cores. However, I can't do that with the gsapi: the API can't be called from different threads, and I can't instantiate one API instance per available core (think of a Core i7 with 8 hyperthreaded cores). Will the gsapi support multiple threads or instances in the near future to support concurrency?
Please attach a few EPS files or try to run a profiler on your own. It is hard to guess what the bottleneck is. Do you restart Ghostscript for every EPS file you process?
Along with Alex's request to see the EPS files, I have a few other questions and comments.
We've seen that for simple files, using the gsapi interface to process the input files and generate the output is MUCH more efficient, since it skips the initialization of the interpreter and graphics library. Please see bug 690352 http://bugs.ghostscript.com/show_bug.cgi?id=690352 in particular comment #18 http://bugs.ghostscript.com/show_bug.cgi?id=690352#c18 Thus, I am surprised that using the gsapi call didn't help the throughput.
The other issue is that the gsapi call may not be utilizing the font cache and the FontDirectory in a way that lets multiple jobs avoid re-rendering glyphs. If a font is flushed from the FontDirectory by an 'end of job' restore, then the corresponding cache pair won't be used in the next job. Loading fonts at 'server level' will ensure that they are persistent across jobs. Refer to the documentation on 'exitserver' and the job server loop for this, or let me know if you need "snapshot" code to load fonts persistently.
Lastly, if the target resolution is known and the font sizes are consistent, a 'bitmap' (Type 3) font can speed up the font rendering, since all of the hinting and 'fill' operations are pickled into the bitmap font.
I suspect this isn't as much of an enhancement issue as one of "how to use Ghostscript in the most efficient manner for a particular application", which is something I often help our customers and users with.
As I said, I might be using Ghostscript in a very inefficient batch mode, and basically I'm looking for ways to make use of the glyph cache. My benchmarks with the options "-dSAFER -q -dEPSCrop -dNOPAUSE -dBATCH -sDEVICE=pngalpha -r300" on a P8700 2.53 GHz Core2 Duo are these:
EPS with two embedded fonts
    gs cmd 1 thread : 5.27 fps
    gs cmd 2 thread : 9.54 fps
    gs cmd 3 thread : 9.65 fps
    gs api 1 thread : 8 fps
same EPS, but with two fonts in Fontmap file
    gs cmd 1 thread : 7.18 fps
    gs cmd 2 thread : 12.52 fps
    gs cmd 3 thread : 12.56 fps
    gs api 1 thread : 8 fps
The gsapi functions are called in this order:
    gsapi_new_instance()
    for all jobs {
        gsapi_set_stdio()
        gsapi_init_with_args()
        gsapi_exit()
    }
    gsapi_delete_instance()
Obviously this is not sufficient to make use of the glyph cache, and my question boils down to which way to go:
. Use gsapi_set_display_callback() to grab rasterized images and try to avoid new gsapi_init_with_args() calls?
. Use the -dJOBSERVER option? I wasn't aware of this new Ghostscript >= 8.15 feature. Maybe it's exactly what I'm looking for. Is there sample test code? The usage documentation http://www.ghostscript.com/doc/current/Use.htm doesn't tell me whether I can use -sDEVICE=pngalpha in JOBSERVER mode. Will it wrap each job into its own PNG envelope? Or do I need to get the rasterized images and use my own PNG export filters?
http://bugs.ghostscript.com/show_bug.cgi?id=690352#c18 is interesting. Multiple gs processes are used to compensate for the inability of the gsapi to handle multiple threads or instances. How are the different jobs delimited?
Did you try to pre-load the fonts into memory? Ray is a better person to discuss the configuration options with, but many things depend on your EPS files and your fonts. We cannot even check your results without sample files. Please attach a few EPS files. You can mark the attachments "private" to restrict access to the files to Artifex employees and contractors. Help us to help you.
You still haven't uploaded a sample EPS file for us to test with, but generally you will be able to cache glyphs and also avoid reloading fonts by using the API calls differently. Use these options: "-q -dEPSCrop -dNOPAUSE -sDEVICE=pngalpha -r300 -dJOBSERVER". You didn't give a "-sOutputFile=___" or "-o ____", so I'll assume that you want each pngalpha in a separate file.
As mentioned previously, using a single instance of Ghostscript allows data to be retained in the FontDirectory and font glyph cache for multiple jobs. To do this, you use a slightly different calling sequence:
-------------------------------------------------------------------------
gsapi_new_instance()
gsapi_set_stdio()
gsapi_init_with_args()    /* The args are as above */
for all jobs {
    gsapi_run_string(minst,
        "<< /OutputFile (out.png) >> setpagedevice\n"  /* set output file */
        ".locksafe\n"                                   /* enter SAFER mode */
        "(in.eps) run\n"                                /* process the input file */
        "false 0 startjob pop\n",                       /* start a new job which also */
                                                        /* resets to NOSAFER mode and */
                                                        /* allows OutputFile to be changed */
        0, &errcode);
}
gsapi_exit()
gsapi_delete_instance()
-------------------------------------------------------------------------
The -dBATCH is not used since the gsapi_exit() suffices to end the execution. The -dSAFER is not used since the OutputFile setting must be enabled for writing prior to locking the file permissions with ".locksafe". The "false 0 startjob pop" resets the state, including the SAFER mode, to what it was before the job, allowing the next job to set OutputFile (which is not allowed in SAFER mode).
As mentioned previously, any fonts that are to be persistent across jobs should be loaded in the job server VM (exitserver mode), which means that the loading follows a "true 0 startjob pop", and the setting of persistent VM is ended by a "false 0 startjob pop". For example, to pre-load Helvetica, use:
    gsapi_run_string(minst,
        "true 0 startjob pop\n"       /* exit to the job server */
        "/Helvetica findfont pop\n"   /* load Helvetica */
        "false 0 startjob pop\n",     /* start a new job */
        0, &errcode);
Following this, Helvetica (and any other fonts loaded) will be persistent in VM and won't require any searching. I expect most of the performance improvement to come from not restarting Ghostscript each time, but preloading commonly needed fonts couldn't really hurt, since it is done once, outside the "for all jobs" loop.
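The per-job string above uses fixed names (in.eps, out.png). A minimal sketch of how that string might be assembled in C for arbitrary file names follows. The helper names ps_escape and build_job are hypothetical and not part of the gsapi; the only thing they add to the sequence above is backslash-escaping of parentheses and backslashes, which a PostScript (...) string literal requires. It does not deal with '%' in the output name, which Ghostscript interprets in OutputFile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: escape a file name for use inside a PostScript
 * (...) string literal by prefixing '\', '(' and ')' with a backslash.
 * Returns the escaped length, or -1 if dst is too small. */
int ps_escape(char *dst, size_t cap, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    if (cap == 0)
        return -1;
    for (; *src != '\0'; ++src) {
        int special = (*src == '\\' || *src == '(' || *src == ')');
        if (n + (special ? 2u : 1u) >= cap)
            return -1;              /* keep room for the terminator */
        if (special)
            dst[n++] = '\\';
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return (int)n;
}

/* Hypothetical helper: build the per-job PostScript text shown in the
 * calling sequence above. Returns the length written, or -1 on overflow. */
int build_job(char *dst, size_t cap, const char *in_path, const char *out_path)
{
    char in_esc[512], out_esc[512];
    int len;

    if (ps_escape(in_esc, sizeof in_esc, in_path) < 0 ||
        ps_escape(out_esc, sizeof out_esc, out_path) < 0)
        return -1;

    len = snprintf(dst, cap,
                   "<< /OutputFile (%s) >> setpagedevice\n" /* set output file */
                   ".locksafe\n"                            /* enter SAFER mode */
                   "(%s) run\n"                             /* process the input */
                   "false 0 startjob pop\n",                /* start a new job */
                   out_esc, in_esc);
    return (len < 0 || (size_t)len >= cap) ? -1 : len;
}
```

Inside the "for all jobs" loop, the resulting buffer would then be passed as the string argument, e.g. gsapi_run_string(minst, job, 0, &errcode), with minst and errcode as in the sequence above.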
Note that either .locksafe or .setsafe can be used in the previous example, but BOTH allow the input file to start a new job, such as with "false 0 startjob pop", and continue in NOSAFER mode. To run jobs more securely, one would have to use the '.runandhide' operator as described in doc/Language.htm#Miscellaneous so that the save object that restores to NOSAFER mode is inaccessible to the job. It depends on how much isolation is wanted from malicious PostScript input files. Obviously, PDF files can't start new jobs, so this isn't an issue for them.
Performance in this use case went up from 8 fps to 16.5 fps after doing what was suggested above:
> same EPS, but with two fonts in Fontmap file
> gs api 1 thread : 8 fps
I would have to modify my code to use "gs cmd" instead of "gsapi" to utilize multiple cores. I think this should meet my performance requirements. However, just out of curiosity:
> If a Font is flushed from the FontDirectory by an 'end of job' restore,
> then the corresponding cache pair won't be used in the next job. Refer to
> the documentation on 'exitserver' and the job server loop for this, or let me
> know if you need "snapshot" code to load fonts persistently.
Do I get the 8 -> 16 fps performance improvement because of preloaded fonts and avoiding repeated initialization code? Or because it actually reuses rasterized glyphs between jobs? To be honest, I would have expected an even higher performance improvement from a glyph cache.
Sorry for not yet having uploaded an EPS sample. This will take me some time, since I have to replace copyrighted fonts with free fonts and alter the EPS file itself into something that can be shared.
Robustness vs. performance: certainly a tradeoff. I will implement both strategies (isolated gs cmds and -dJOBSERVER). Live evaluation will show whether -dJOBSERVER is sufficiently robust and our EPS files are sufficiently well formed.
Thanks again for pointing me to "3.7.7 Job Execution Environment" in the PostScript specification: http://www.adobe.com/products/postscript/pdfs/PLRM.pdf I wasn't aware of that feature. I think it makes my feature request obsolete, and it may be marked as WONTFIX. I also see that there is still stuff to implement for the job server and lots of other things: http://www.ghostscript.com/doc/current/Projects.htm
Maybe a general "Performance HowTo" that just points to the appropriate references would have helped me a lot. I notice that you give similar advice in Bugzilla repeatedly.
The bleeding edge source code is no longer available to the public, I guess: http://sourceforge.net/project/stats/detail.php?group_id=1897&ugn=ghostscript&type=cvs&mode=12months&year=2010 And forum activity is very poor: http://sourceforge.net/project/stats/detail.php?group_id=1897&ugn=ghostscript&type=forum&mode=60day&forum_id=0 (low-frequency forums/mailing lists have the disadvantage that users can't help each other since there are just too few of them).
Thanks again for your valuable suggestions so far.
Sorry, I just noticed the updated source code location: http://code.google.com/p/ghostscript/source/checkout
The bleeding edge source code continues to be available via our source code repository at svn.ghostscript.com (svn). You can set up a local version using:
    svn checkout http://svn.ghostscript.com/ghostscript/trunk/gs my_local_gs
where 'my_local_gs' is the directory for the top level gs. After that:
    svn update
will update your local gs sources to the latest and greatest.
The active discussions are mostly on IRC at irc.ghostscript.com #ghostscript and on the gs-devel mailing list (see ghostscript.com for links).
I'll keep this open to remind me to collect some of the 'how to' in a doc. Thanks for your willingness to change your modus operandi to get the best performance. I'll also still look into the answer to the font cache issue.
We probably won't be able to dig into the font cache performance issue anytime soon, so I am closing this bug. A question to gs-devel or #ghostscript IRC might help, but I suspect that performance analysis will need to be done by the submitter or someone else who cares about this.
Sometimes performance can be enhanced by turning off garbage collection (with -dNOGC) and disabling IdiomRecognition with: -c "<< /IdiomRecognition false >> setuserparams"
One final note before closing this bug: multiple core rendering _may_ allow some performance enhancement. We have found that the 'PNG' encoding is a significant performance hit, so scattering pages of 'raw' raster (on a RAM disk) to each be converted to PNG may help utilize parallel CPUs. If this helps, and multiple CPUs are desired to render each page faster, then the -dNumRenderingThreads= parameter can be used (probably with the -dMaxBitmap= and -dBufferSpace= parameters to force clist rendering into an appropriate number of bands, so that each band gets one CPU). It is important that the PNG compression be run on more than one CPU since this is often a bottleneck, so multi-threaded rendering from the clist alone doesn't really help.
Closing since we have improved things quite a bit and this really seems to be a specific issue for this user.
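For anyone trying the tuning switches from this comment, here is a rough sketch of how the argument list might be put together in C, for example before a call to gsapi_init_with_args() or an exec of "gs". The function name build_gs_args and the buffer handling are hypothetical. The switches come from this thread; combining them in this order, with a bare "-f" to end the "-c" token list, is my assumption. The thread count is a caller-chosen value, and the output and input file arguments still have to be appended by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill argv with the options from this thread plus
 * the tuning switches (-dNOGC, IdiomRecognition off).
 * If threads > 0, "-dNumRenderingThreads=<threads>" is formatted into
 * thread_buf, which must outlive argv.
 * Returns the number of entries written, or -1 if something is too small. */
int build_gs_args(const char **argv, size_t cap, int threads,
                  char *thread_buf, size_t thread_buf_cap)
{
    static const char *const head[] = {
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pngalpha", "-r300",
        "-dNOGC"                                   /* no garbage collection */
    };
    static const char *const tail[] = {
        "-c", "<< /IdiomRecognition false >> setuserparams",
        "-f"                                       /* ends the -c token list */
    };
    const size_t nhead = sizeof head / sizeof head[0];
    const size_t ntail = sizeof tail / sizeof tail[0];
    size_t n = 0, i;

    assert(argv != NULL);
    if (nhead + ntail + (threads > 0 ? 1u : 0u) > cap)
        return -1;

    for (i = 0; i < nhead; ++i)
        argv[n++] = head[i];

    if (threads > 0) {
        int len = snprintf(thread_buf, thread_buf_cap,
                           "-dNumRenderingThreads=%d", threads);
        if (len < 0 || (size_t)len >= thread_buf_cap)
            return -1;
        argv[n++] = thread_buf;
    }

    for (i = 0; i < ntail; ++i)
        argv[n++] = tail[i];

    return (int)n;
}
```

Whether any of these switches helps for a given set of EPS files would have to be measured; as noted above, the PNG encoding may dominate regardless.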