Steps to reproduce: - Display a file in GSView. The Ghostscript version I used is current master (4a1159926a8094f19bcacddf0503b5a06edd9184) but an older master and ghostscript-9.03 (at 2711662696407b82c2d382ae85567a5bd67f2037) were crashing too, for the same reason. The GSView version is 4.6. - Either change some parameter, for example the resolution in order to zoom in, or drag another file on the GSView's window. - At this point GS crashes. From what I in found, the problem is not with GSView but with Ghostscript. Both GSView and the command line GS executable call the GSAPI functions, but gswin32c.exe does a single gsapi_init_with_args() + gsapi_exit(), while GSView does multiple such cycles. During a gsapi_exit() call, imain.c::gs_main_finit() does a ‘restore past the bottom’ in line #865 ‘code = alloc_restore_all(idmemory);’. The problem here is that 3 tables get freed, but dangling pointers to them remain in gs_lib_ctx. These tables are gs_name_table, io_device_table, and font_dir. I do not know if the new member profiledir is in the same situation (for example if COMPILE_INITS = 0). The crash happens during the next gsapi_init_with_args() because of the dangling io_device_table pointer. - imain.c::gs_main_init1() line #198 calls obj_init() and gs_iodev_init(), in this order. - during obj_init(), gs_setdevice_no_erase() get called, via gsdevice.c::gs_nulldevice() line #615. - ‘if (libctx->io_device_table != NULL’, gs_setdevice_no_erase() goes on with the ICC stuff, which ends with a crash while trying to access the non-existent io_device_table. I’ll attach my patch, which solves the crash by clearing the 3 pointers. What I don’t know/ I’m not sure about: - If the gs_lib_ctx->profiledir (and profiledir_len) may exhibit a similar problem for COMPILE_INITS = 0. - If the initialization order ‘obj_init() then gs_iodev_init()’ is OK, or it should be reversed. Obviously, some initialization is skipped for the initial null device; I have no idea if this initialization, at that point, is important or not.
Created attachment 7696 [details] A patch that solves the crash. Bug #692367: GS crashes when changing file/ resolution/ etc from GSView. Clear 3 pointers in gs_lib_ctx to tables that are freed during the 'restore past the bottom' done by gs_main_finit(). Previously, the dangling io_device_table pointer caused a crash during en eventual 2nd call to gsapi_init_with_args(). (The GS command line and GUI exes don't do this. GSView does, and so may do other GSAPI clients.)
(In reply to comment #1) > Created an attachment (id=7696) [details] > A patch that solves the crash. > > Bug #692367: GS crashes when changing file/ resolution/ etc from GSView. > > Clear 3 pointers in gs_lib_ctx to tables that are freed during the > 'restore past the bottom' done by gs_main_finit(). Previously, the > dangling io_device_table pointer caused a crash during en eventual > 2nd call to gsapi_init_with_args(). (The GS command line and GUI exes > don't do this. GSView does, and so may do other GSAPI clients.) I don't think gs_lib_ctx pointers should point to any memory in non stable local vm. We'll discuss it on IRC today. Thanks for looking into this SaGS
Maybe the type of memory in which these tables are allocated is not OK, I don’t know enough about the sources to confirm or infirm this. However I want to point out some things about this: (a) I’d rule out local-versus-global-VM difference, because the outermost restore affects both global and local VM. (b) It could be stable-versus-non-stable, yes, but see (c). (c.1) If these pointers are to survive gsapi_exit(), then I think the tables must be allocated during gsapi_new_instance() and not during gsapi_init_with_args() as they are now, or at least much earlier then they are now. Otherwise, the same PostScript code sent after the 1st gsapi_init_with_args() will be run in a different context compared to the same PS code sent after a 2nd one; for example the 1st won’t have the DoICCStuff() done for the initial null device because at that moment io_device_table is not yet allocated, while the 2nd will have the DoICCStuff() because it will already have io_device_table non-NULL from the 1st init+exit cycle. (c.2) If these pointers are not to survive gsapi_exit() (independent of why the tables are freed), then they need to be cleared at some point... (d) In comment #0 I mentioned COMPILE_INITS = 0, but the correct thing that could matter is a non-default ICCProfilesDir. I did not investigate this, and don’t know in which type of memory is allocated this string or if it also disappears during the ‘restore past the bottom’. Just want to say not to forget this one.
Yes we really do need a design discussion about the library context, we has been putting it off for too long.
We agree Chris will deliver a complete solution in 9.05 and SaGS workaround will go in 9.04, IRC log follows: <Robin_Watts> henrys: You were talking about bug 692367 right ? [10:09] <henrys> yes I was [10:10] <ray_laptop> calling init_with_args more than once clearly wasn't intended to be supported, but I guess we don't disallow it [10:14] <Robin_Watts> So the lib_ctx is allocated by us calling gs_lib_ctx_init on a gs_memory_t that doesn't already have a lib_ctx in it. <Robin_Watts> and that happens in gs_malloc_init <ray_laptop> the expected sequence is _new_instance ... init_with_args .... delete_instance (gsapi omitted) [10:15] <Robin_Watts> The lib_ctx has to live on (and keep itself valid) for as long as we have a gs_memory_t [10:16] <Robin_Watts> So it seems to me that either we need to be destroying the mem (and the lib_ctx) [10:19] <Robin_Watts> or we need to be destroying the lib_ctx and setting the pointer to it in the mem to NULL. <Robin_Watts> or we need to ensure that the lib_ctx's pointers are correct. <Robin_Watts> The latter seems the easiest route. <Robin_Watts> and sags patch seems to achieve that. [10:20] <henrys> I think we should use stable memory for those pointers and not need sags patch. <Robin_Watts> though it is horrible that we have to 'know' that the restore will have removed those pointers and fix them up later. [10:21] <Robin_Watts> so yes, you may be right. <ray_laptop> Robin_Watts: stable memory is still subject to GC, but that shouldn't hurt [10:22] <Robin_Watts> I have no feeling for the lifecycle of those pointers at all. <henrys> and multiple init_with_args I find quite non-intuitive there must be a better way of reseting the resolution which is what I think russel wants to do. <Robin_Watts> henrys: Do you find that offensive? [10:23] <ray_laptop> you don't have to init_with_args to reset resolution <Robin_Watts> It seems reasonable to me that I should be able to init the dll, then call it several times, for several jobs, then exit. <ray_laptop> but I suppose we don't say that init_with_args ... gsapi_exit ... gsapi_init_with_args is illegal <henrys> init means once to me. <ray_laptop> if he had done the gsapi_delete_instance then gsapi_new_instance it would have been fine [10:24] <ray_laptop> he (Russell) could just use 'put_params' to set the resolution, if that's all he wanted [10:26] <Robin_Watts> I'm new to this, so bear with me. <ray_laptop> but I think that when he 'opens' a new file he wants a fresh interpreter state so that there won't be anything "dangling" from the previous [10:27] <Robin_Watts> We call gsapi_new_instance to make a new instance. <ray_laptop> (GSView predates us having a working -dJOBSERVER) <Robin_Watts> We then call gsapi_init_with_args. [10:28] <Robin_Watts> If that's only intended to ever be called once, then why isn't that part of the gsapi_new_instance call? <ray_laptop> but what GSView is doing is rather moot at this point -- if we didn't forbid it, it should work (IMHO) <Robin_Watts> Anyway, if we move those pointers into stable memory, then the alloc_restore_all won't free them ? [10:29] <Robin_Watts> the lib_ctx will be left with pointers pointing to valid objects. [10:30] <Robin_Watts> Will they correctly get reused when gs is next entered? <henrys> Robin_Watts:I thought you could make other funtions like gsapi_init_with_file which would presumably use something like a .file but I don't know for sure why it isn't one function as you say. <Robin_Watts> (i.e. they won't just get overwritten and leak?) <Robin_Watts> or are you suggesting that in the same place as sags is currently blanking the pointers, we should free the objects (and blank the pointers) anyway? [10:31] <kens> Time for me to go. Night all. <Robin_Watts> night kens. have a good weekend. *** kens (~Miranda@46.208.20.220) has quit: Quit: kens <henrys> my view is we should explicitly free the pointers when the instance is destroyed if we agree 1 libctx per instance. [10:32] <ray_laptop> actually init_with_args is optional. You can call gsapi_run_string_begin and never call init_with_args <ray_laptop> and the reason init_with_args is separate is that we want people to be able to set up the stdio and display callbacks AFTER the instance is created, but before init [10:33] <henrys> oh okay that makes sense [10:34] <Robin_Watts> 1 lib_ctx per instance, yes. 1 lib_ctx per call to init_with_args, no. (I'm sure that's what you meant, I'm restating for my own clarity) <ray_laptop> henrys: putting those pointers in stable memory, then freeing them when we delete the instance makes sense to me <Robin_Watts> ray_laptop: Ah, ok. <henrys> now are we allowing multiple gsapi_exits per instance? [10:35] <ray_laptop> henrys: we didn't say you couldn't, and I thought that is what gsview was doing [10:36] <Robin_Watts> If exit is intended to 'bookend' init, then that's a very poor choice of words. [10:37] <ray_laptop> sort of moot now since we have a published API <chrisl> Robin_Watts, ray_laptop: shouldn't it be non_gc_memory rather than stable_memory for this? [10:38] <Robin_Watts> init matches with a fin. create matches with a destroy. begin matches with end. enter matches with exit. etc. <ray_laptop> but I like it better than the non-word PCL uses 'dnit' <henrys> chrisl:why would the pointer be subject to gc? We are concerned about save/restore? [10:39] <Robin_Watts> ray_laptop: We can always create a gs_api_finalise, that does the same as exit, and deprecate exit. <ray_laptop> chrisl: putting them in non_gc_memory is fine (and I rather like it) <chrisl> henrys: I thought you were talking about taking them *out* of the garbage collector's remit? <Robin_Watts> chrisl: That sounds better to me too. If we're expecting them to never be garbage collected, then why put them in garbage collectable memory. [10:40] <henrys> so stable non gc memory <chrisl> If we intend to explicitly free these pointers at shutdown, then I think that makes sense. [10:41] <ray_laptop> I'd have to see if the GC has any knowledge of them. I don't think it does since they aren't traceable from the 'root'. The pseudo-global gs_lib_ctx is invisible to the GC <henrys> ray_laptop:actually I think you and mvrhel2 already did that for iccdir. <ray_laptop> henrys: yes, and the more stuff we move to non_gc_memory, the better (IMHO) since it will make the GC run faster when it does run <chrisl> ray_laptop: If they are allocated from GC memory, but not accessible from "root" that's *bad*! [10:42] <ray_laptop> the two 'gotchas' with moving stuff in general to non_gc_memory is that each allocation is a separate one so we don't want lots of small allocations and we have to make sure we don't leak. [10:43] <henrys> who want to be the point person (victim) for this one? * Robin_Watts has disconnected. <ray_laptop> chrisl: yes, if they are in a 'chunk' owned by the GC, and they aren't traced, that is bad (assumed lost and are collected) [10:44] <henrys> we've agreed to allocate non gc stable memory, so this isn't an issue right? [10:45] <chrisl> ray_laptop: so the only thing that's saved us, probably, is that they are in the outermost global VM save level, which probably doesn't get swept until we're shutting down. <ray_laptop> chrisl: I don't see where/how the io_device_table (for one) is made known to the GC, but the init process does a GC call before the outermost save and it works. Try: gs -dNOOUTERSAVE -c 2 vmreclaim quit [10:49] <henrys> sags is correct that the outermost restore affect global vm right? That is the issue not gc? [10:50] <henrys> right? <ray_laptop> the outermost restore _does_ restore global AND local VM [10:51] <ray_laptop> henrys: that is intentional in the PS spec and design to allow for a jobserver to isolate VM state between jobs <ray_laptop> but if the tables are in non_gc_memory we have to make sure that they are not EVER traced by the GC [10:52] <chrisl> This is looking like a potentially quite de-stabilizing change so close to a release :-( [10:56] <ray_laptop> chrisl: I agree. <henrys> no this goes in next time. <henrys> next release [10:57] <ray_laptop> anybody have any idea why this didn't show up before ? i.e., does 8.71 fail ? <chrisl> Could we pull in sags's changes for 9.04? <henrys> chrisl:I wouldn't even do that all we have now is a gsview bug [10:58] <henrys> arguably if russel didn't exit and just used set params this would not have come up. [10:59] <ray_laptop> I think that the gs_main_finit change of SaGS' patch is safe enough <ray_laptop> henrys: I think the issue with GSView when a file is dragged to to GSView would do the same -- for that he wants a 'new' interpreter state [11:00] <chrisl> henrys: My worry is a lot of users, customers and potential customer are using/will use GSView as a simple way to play with Ghostscript, it's not a good show to crash, I think. * henrys cringes but won't argue <ray_laptop> but (IMHO) a delete_instance .. new_instance would be preferred * ray_laptop agrees with chrisl [11:01] <chrisl> ray_laptop: do you think we could get Russel to make that change and do a GSView release to approximately coincide with 9.04? That would also solve the problem...... <ray_laptop> but if the pointers are moved to stable_memory, then we haven't touched the GC aspect and the pointers won't get freed by the alloc_restore_all [11:02] <chrisl> ray_laptop: so we could do that 9.04 and do the moving out of GC memory altogether after? [11:03] <ray_laptop> henrys: the problem is that GSView is designed to use newer versions of GS, but there isn't a way to make older GS NOT use a newer one * henrys votes to assign the general problem to be fixed in 9.05 to chrisl as it is a system setup problem and requires postscript kung fu. <henrys> well at least understanding of ps memory * chrisl thinks he's being punished for something........ [11:04] <ray_laptop> chrisl: yes, stable_memory means that it will persist (and we will have to explicitly free the elements somewhere) <chrisl> henrys: okay, I just didn't want to have to try to solve it properly before 9.04. <ray_laptop> chrisl: but SaGS' patch will allow the objects to be freed (as now) and just make sure we know we have to reallocate them, so maybe it is the 'least change' -- no new leaks [11:05] ERC>
Resolved in: http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=f62ce1 SaGS, thanks for the patch, but there was some discomfort about gs_main_finit() "just knowing" that these tables had been garbage collected during the outermost restore. So the solution I've opted for is that those tables now have "finalize" methods, and the finalize method sets the context pointers to NULL when their memory is released.