Summary: | increased rom size | ||
---|---|---|---|
Product: | Ghostscript | Reporter: | Henry Stiles <henry.stiles> |
Component: | General | Assignee: | Peter Cherepanov <sphinx.pinastri> |
Status: | RESOLVED LATER | ||
Severity: | enhancement | CC: | ghostpdl-bugs, marcos.woehrmann, robin.watts, sphinx.pinastri |
Priority: | P4 | Keywords: | bountiable |
Version: | master | ||
Hardware: | PC | ||
OS: | All | ||
Customer: | Word Size: | --- |
Description
Henry Stiles
2011-07-09 01:59:30 UTC
Since this customer is making a (low res) monochrome printer, using really minimal ICC profiles makes sense. I suggest that they convert back to ICC profiles that are that type. Most of that 434K should be reclaimed.

As far as the CMS code, have they removed the imdi code?

(In reply to comment #1)
> Since this customer is making a (low res) monochrome printer, using really
> minimal ICC profiles makes sense. I suggest that they convert back to ICC
> profiles that are that type. Most of that 434K should be reclaimed.
>
> As far as the CMS code, have they removed the imdi code ?

The email to support of July 1 states the following configuration:

    FEATURE_DEVS=$(PSD)psl3.dev $(PSD)pdf.dev
    BAND_LIST_STORAGE=memory
    DEVICE_DEVS=$(DD)bmpmono.dev (all other DEVICE_DEVS* empty)
    GS_SHARED_OBJS=

which does not include imdi. I'm sure other stuff could be removed though.

For this particular customer, I think that "simple" ICC profiles (without the space-hogging multi-dimensional lookup tables) would suffice. This would correspond to the color quality of 8.71 (for the input ColorSpace Resources) and they only need a single 'defaultgray.icc' output profile. Such a low-res device can only approximate gray shades, much less subtleties of CMYK to gray conversion. That should address MUCH of the 436K additional due to iccprofiles.

We can't do much about the new CMS code other than make sure that we aren't including features we don't need (it does seem a little heavy for what we do with ICC profile based color conversion). It may be that this library is not very 'factored' for leaving out unused modules from the link.

I assume that the SHA code is for AES decryption, so this can be left out, but you don't get new features for zero code.

Not sure what the increase in the 'rop module' is about, but it doesn't seem like it should be included in the customer's PS/PDF build. Did ROP support creep in as a "standard" feature?
The memory locking code was always in there (for the gsmalloc.c usage if nothing else) so I don't understand why this could be an "increase".

Dropping CMaps that are not required (if CJK support isn't needed) is a BIG easy win, and if they need the CMaps then prioritizing the project of converting the CMap files as Tor does for mupdf makes sense.

Do we have any statement from the customer as to the goals of the rom size, and how much we are over?

From reviewing the email, it is not clear that the ROM size was an issue as much as the "resident RAM" consumption at the start of processing a job having increased by 2 MB, without clarity as to how much was unpacked ROM code and how much is runtime RAM.

For the 'romfs' the design is such that it is pretty easy to link this into addressable ROM space so that Resources including the Font, CMap, iccprofile, Decoding, etc. files remain in ROM and don't occupy DRAM.

Last year, Ray wrote some code in mkromfs to flatten all the gs PostScript initialisation files into a single file, removing comments etc. This was never actually debugged and used. I have a fixed version locally, together with some more changes to allow other PostScript files to be similarly 'compacted'. This saves 640K.

Unfortunately, ps2write is broken with it; it reports a problem with opdfread.ps - which isn't a file in romfs, rather it lives in a C header. I'll talk to Ken about this on Monday in case he has any ideas.

There is scope for greater compression still; in Ray's code, we should replace \n with space in the compacted PS output (or space with \n). Consistently using a single whitespace char should help the flate stage. In my code, I should look at converting hex numbers to decimal, and hex strings to binary. Converting the hex strings in CMap files would definitely help.
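A minimal sketch of the two text-level ideas above (one consistent whitespace character, hex numbers rewritten as decimal), in Python rather than the C of mkromfs. It is not the mkromfs code: the function names and the sample fragment are made up for illustration, and the comment stripping naively assumes that no '%' appears inside a PostScript string.

```python
import re
import zlib


def compact_ps(source: str) -> str:
    """Drop comments and join all tokens with a single space.

    Assumption: '%' never occurs inside a string literal, which real
    PostScript does not guarantee.
    """
    tokens = []
    for line in source.splitlines():
        code = line.split("%", 1)[0]   # discard everything after '%'
        tokens.extend(code.split())    # split on any run of whitespace
    return " ".join(tokens)


def hex_to_decimal(source: str) -> str:
    """Rewrite radix-16 numbers such as 16#FF as decimal (255)."""
    return re.sub(r"\b16#([0-9A-Fa-f]+)\b",
                  lambda m: str(int(m.group(1), 16)), source)


# Hypothetical initialisation fragment, not taken from the gs sources.
sample = """% initialisation fragment (made up for this example)
/mask 16#FF def      % low byte
/limit   16#1000 def

/clamp { dup limit gt { pop limit } if } def
"""

compacted = hex_to_decimal(compact_ps(sample))

# Compare the deflate output before and after compaction.
raw_size = len(zlib.compress(sample.encode("ascii"), 9))
new_size = len(zlib.compress(compacted.encode("ascii"), 9))
print(compacted)
print(raw_size, new_size)
```

On a fragment this small the deflate sizes say little; the comparison is only meaningful when run over the full set of initialisation files.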
Without understanding the CMap format too well, I looked at the ones we distribute and the begincidrange and beginbfchar are the common cases, but while 16-bit values are common in most, some of them have longer strings (up to 128-bit), so we need some way to represent these (hopefully without adding to the common 16-bit cases).

I'd recommend that if mkromfs is made smarter to process CMaps (probably with a different option, such as -M) then changing the operator would be best (so that imported CMaps don't trip over anything). If we do this, we can still use the end___ operators unchanged, and also can omit the count value before the begin (which we discard anyway).

Since the CMap files are such a large size, even compressed, the extra effort is worth it (IMHO).

With the reduction in CMaps (see 692376) we trim a bit over 2 Meg off the %rom% filesystem.

mkromfs had its compaction further improved in commit 0eaf43f, and this is enabled in 'ascii' mode. The code knows how to use binary compaction too, but this is disabled at the moment pending removal of level 1 operation in the interpreter.

I guess we can continue improving this but the customer should be notified the rom size is now within their requirements.

The customer should be notified we have fixed the rom size to meet their requirement, then the bug can be downgraded to P4 and assigned back to Robin so he can enable the code discussed in comment 7.

I've notified the customer, removed the customer number from the bug report, and changed the priority to P4. Assigning back to Robin per comment #9.

I'll sit on this until Level 1 is removed from the interpreter then.

CMap resources can be compacted by building a single "CMap collection" that stores multi-level differences from a base CMap. If a compact representation of CMap resources is still desirable, I can look into this problem.

Currently, after compression, CMaps account for 1,968,026 bytes in Ghostscript out of the %rom% size of 8,848,736 bytes.
The total 'minimal' executable (64-bit linux, only bbox and bit devices) is 16,343,440 bytes.

We'll discuss this among the staff to see if it sounds like it is worth doing.

We discussed this project, and Chris is going to write up some design and implementation guidelines. We discussed using the mupdf CMap approach that uses a Python script to process CMaps to make use of one CMap from another for sections that are common (so those are only stored once). We'd have the tool in toolbin, and use it to build the result, which would get committed and pushed on the rather rare instance when we want to update a CMap.

For various reasons, we'd rather be able to continue to use the "normal" CMap parsing implementation(s), and also avoid too many problems with PostScript programs that might dig around in Resource instances.

mupdf uses a small group of Python tools that together pull out common mappings from a number of CMaps; they get called from a top level shell script:

https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/runcmapshare.sh

The problem (for gs) is that the files created for shared mappings by the mupdf tools don't conform to the CMap spec - they have none of the PostScript infrastructure, they use begincidchar/endcidchar and have too many mappings (between the single begin/end operators).

For gs, we'd like to follow a similar approach, but having those shared mappings defined in valid (custom named) CMaps that can be pulled into the "real" CMap files with usecmap. By doing that, and using cidranges (instead of cidchars) for the mappings, we ought to be able to get a decent improvement in the overall size, and continue using the existing CMap parsing.

Given how rarely the CMap files need to change, we're not tied to having that CMap processing done at compile time (we can pre-process the CMap files only when they are updated), so if you want to start with the mupdf Python implementation, and work forward from that, that would be fine.
OTOH, if you feel a fresh, C implementation would be better, that would be fine, too. Even with a C implementation, I think I would opt not to include the CMap pre-processing in the build - i.e. we pre-process and commit the optimised CMap files to git when the CMaps need updated.

Hopefully, that's clear enough - if not, feel free to ask questions (either here, or #ghostscript - do bear in mind, I'm in the UK, so timezones!).

Lastly, if that turns out to be less interesting for you (Peter), please assign it back to me.

Per discussion among engineering, marking this as bountiable, as well as assigned to Peter in case he wants to tackle it. Others should not work on this until checking with Artifex (Chris).
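For anyone picking this up, the shared-mapping idea from the guidelines can be sketched roughly as follows. This is an illustration only, not the mupdf scripts or any agreed design: the function names and the sample code-to-CID tables are hypothetical, all codes are assumed to be 2 bytes wide, runs are kept within one high byte, and blocks are capped at 100 entries, both conservative assumptions meant to keep the output acceptable to the existing CMap parsing.

```python
def split_shared(a: dict[int, int], b: dict[int, int]):
    """Separate mappings common to both CMaps from those unique to each."""
    shared = {code: cid for code, cid in a.items() if b.get(code) == cid}
    only_a = {c: v for c, v in a.items() if c not in shared}
    only_b = {c: v for c, v in b.items() if c not in shared}
    return shared, only_a, only_b


def to_cidranges(mapping: dict[int, int]) -> list[tuple[int, int, int]]:
    """Collapse code->CID pairs into (first_code, last_code, first_cid) runs."""
    ranges: list[tuple[int, int, int]] = []
    for code in sorted(mapping):
        cid = mapping[code]
        if ranges:
            lo, hi, base = ranges[-1]
            # Extend the current run only if code and CID are both
            # consecutive and the code stays within the same high byte.
            if (code == hi + 1 and cid == base + (code - lo)
                    and (code >> 8) == (lo >> 8)):
                ranges[-1] = (lo, code, base)
                continue
        ranges.append((code, code, cid))
    return ranges


def format_cidranges(ranges, per_block: int = 100) -> str:
    """Emit begincidrange/endcidrange blocks of at most per_block entries."""
    out = []
    for i in range(0, len(ranges), per_block):
        chunk = ranges[i:i + per_block]
        out.append(f"{len(chunk)} begincidrange")
        out.extend(f"<{lo:04x}> <{hi:04x}> {cid}" for lo, hi, cid in chunk)
        out.append("endcidrange")
    return "\n".join(out)


# Hypothetical horizontal/vertical pair differing in a single mapping.
cmap_h = {0x8140: 633, 0x8141: 634, 0x8142: 635, 0x8150: 700}
cmap_v = {0x8140: 633, 0x8141: 634, 0x8142: 7887, 0x8150: 700}

shared, only_h, only_v = split_shared(cmap_h, cmap_v)
print(format_cidranges(to_cidranges(shared)))
```

The shared block would go into a custom named CMap that each "real" CMap pulls in with usecmap, leaving only the per-CMap differences (`only_h`, `only_v` here) in the individual files.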