Bug 692023 - Possible thread issue. . .
Summary: Possible thread issue. . .
Status: RESOLVED WORKSFORME
Alias: None
Product: MuPDF
Classification: Unclassified
Component: mupdf (show other bugs)
Version: unspecified
Hardware: PC Windows 7
: P4 normal
Assignee: Tor Andersson
URL:
Keywords:
: 692162 (view as bug list)
Depends on:
Blocks:
 
Reported: 2011-03-04 14:40 UTC by Pedro Rivera
Modified: 2011-08-23 01:03 UTC (History)
2 users (show)

See Also:
Customer:
Word Size: ---


Attachments
JMuPdf Source Files (75.74 KB, application/octet-stream)
2011-03-04 14:44 UTC, Pedro Rivera
Details
Updated JMuPdf Sources (28.51 KB, application/octet-stream)
2011-03-07 13:11 UTC, Pedro Rivera
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Pedro Rivera 2011-03-04 14:40:35 UTC
I currently use MuPdf in an open source project called JMuPdf. It is a great tool. Thanks.

Basically JMuPdf is a Java library with JNI interfaces to MuPdf. The only change I  made to MuPdf was to res_pixmap.c.

I came across a snag when stress testing mutli-threaded processing. I wrote a function to create TIF files from PDF documents. When creating them one at a time I have no problems. If I kick off multiple processes, say 20 threads the application crashes.

I was able to overcome this issue by implementing thread locking in my C code. I was able to narrow it down to one function call, fz_executedisplaylist().

What I did was the following:

	(*env)->MonitorEnter(env, obj);
		fz_executedisplaylist(page->list, dev, ctm);
	(*env)->MonitorExit(env, obj);

With the code above I can process as many threads as memory permits without issues. If I take out the Monitor...() lines the application crashes.

My question is why does this happen? I looked at the code for fz_executedisplaylist() and can't see anything that would cause it to crash under a multi-threaded process.

I have attached a zip file with all the source. The ones to look at are (1) pixels.c from the JNI source and (2) Test07.java and Test09.java.

I am still learning how to use MuPdf toolkit so I may be doing something wrong. Any suggestions are more than welcome.

Thanks again,

Pedro
Comment 1 Pedro Rivera 2011-03-04 14:44:39 UTC
Created attachment 7323 [details]
JMuPdf Source Files
Comment 2 zeniko 2011-03-04 16:01:54 UTC
MuPDF isn't thread safe, so you'll have to make sure not to share any data at all between two concurrent threads. E.g.:

>	fz_device *dev = fz_newdrawdevice(jmupdf->glyphcache, pix);

glyphcache isn't thread safe. Using different glyph caches per thread should fix your crash.
Comment 3 zeniko 2011-03-04 16:06:13 UTC
BTW: One of your changes to res_pixmap.c contains an exploitable integer overflow. Make sure not to use JMuPdf to read/convert documents from untrusted sources!
Comment 4 Pedro Rivera 2011-03-04 18:23:21 UTC
(In reply to comment #3)
> BTW: One of your changes to res_pixmap.c contains an exploitable integer
> overflow. Make sure not to use JMuPdf to read/convert documents from untrusted
> sources!

Thanks for the quick reply. I will make the changes you mentioned. I will let you know how it goes.

Thanks!
Comment 5 Pedro Rivera 2011-03-07 13:09:29 UTC
(In reply to comment #2)
> MuPDF isn't thread safe, so you'll have to make sure not to share any data at
> all between two concurrent threads. E.g.:
> 
> >	fz_device *dev = fz_newdrawdevice(jmupdf->glyphcache, pix);
> 
> glyphcache isn't thread safe. Using different glyph caches per thread should
> fix your crash.

I went ahead and made that change. It didn't work. Although I found an issue with my xref_array.c program while testing. The way I was handling the "pdf_handle_array" array was causing issues so I changed it to init a fixed size array instead of a "dynamic" one. As documents are closed I then reuse elements. This solved some other threading issues I was having, so now I am getting much better throughput. The only line left that I can't seem to figure out is in pixels.c in the getPixMap() function. Here is the change I made:

	/*
	 * Draw image
	 */
	fz_glyphcache * glyphcache = fz_newglyphcache();
	fz_device *dev = fz_newdrawdevice(glyphcache, pix);

	if ((*env)->MonitorEnter(env, obj) == JNI_OK)
	{
		fz_executedisplaylist(page->list, dev, ctm);
		(*env)->MonitorExit(env, obj);
	}

Again, if I remove the Monitor...() functions the application crashes. I believe that no data sharing is happening between threads when this function is called.

I noticed that their was a new version of MuPdf toolkit. So I upgraded and retrofitted my code. I retrofitted res_pixmap.c and pixels.c where  fz_clearpixmap() changed to fz_clearpixmapwithcolor(). That's all that changed on upgrading to 0.8. It didn't make a difference though in regards to the threading issue. I am stumped. 

I did some research regarding accessing global variables in a multi-threaded environment. It seems like it's ok as long as I don't try to update the same element. The "page" (pdf_page) variable is cached in the PDF_HANDLE item and discarded only when another page is accessed or the the handle to a document is closed. 

I have included another zip file with just the JNI code and the one MuPdf program I modified. The java code didn't change.

Again I really appreciate all your help,

Pedro
Comment 6 Pedro Rivera 2011-03-07 13:11:09 UTC
Created attachment 7328 [details]
Updated JMuPdf Sources
Comment 7 Tor Andersson 2011-03-07 22:08:35 UTC
One of the few global variables in mupdf is the FT_Library (fz_ftlib) in res_font.c.

I don't know if this may be the cause of your troubles, but it may be worth looking into.
I have less knowledge about some of the other third party libraries, but libjpeg, zlib and
jbig2dec should be safe. I can't vouch for openjpeg though, I'd have to study the source.
Comment 8 Pedro Rivera 2011-03-07 22:52:26 UTC
(In reply to comment #7)
> One of the few global variables in mupdf is the FT_Library (fz_ftlib) in
> res_font.c.
> 
> I don't know if this may be the cause of your troubles, but it may be worth
> looking into.
> I have less knowledge about some of the other third party libraries, but
> libjpeg, zlib and
> jbig2dec should be safe. I can't vouch for openjpeg though, I'd have to study
> the source.

Thanks for pointing that out. I looked at the source and in freetype.h I found this comment where *FT_Library is defined:

"For multi-threading applications each thread should have its own FT_Library object."

So it seems that FT_Library is not thread safe. I'll study the code res_font.c to see how I can solve this problem. 

Thanks again,

Pedro
Comment 9 Pedro Rivera 2011-03-08 03:17:27 UTC
I looked over res_font.c and because of the way it is written I will have to wrap the fz_executedisplaylist() function around the Monitor...() synchronization functions provide by java jni interface. I can live with this for now but it is a big bottleneck.

res_font.c program uses two global variables fz_ftlib and fz_ftlib_refs which are not thread safe. Also I see that fz_ftlib is passed around to other functions possibly updating some of its properties.(?)

At this point there is not much that can be done unless res_font.c is modified so that an FT_Library instance is passed in instead of being declared globally. I don't know how to solve this as I am still figuring out mupdf library. 

In a multi-thread environment one would normally process a unique document handle per thread. Maybe passing in an FT_Library instance to fz_newdrawdevice() could work. One could code something like:

	fz_glyphcache *glyphcache = fz_newglyphcache();
        fz_freetype *ftlib = fz_newfreetype();
	fz_device *dev = fz_newdrawdevicewithfreetype(glyphcache, pix, ftlib);
	fz_executedisplaylist(page->list, dev, ctm);

If fz_newdrawdevice() is used then it could use the globally defined ftlib.

Just thoughts.

Thanks again,

Pedro
Comment 10 Tor Andersson 2011-03-08 09:29:53 UTC
The FT_Library instance is used when creating the FT_Face font objects. As such, the fz_font struct is not thread safe at the moment. You could try putting the FT_Library global in thread local storage, if that is available when using the JNI?
Comment 11 Pedro Rivera 2011-03-08 13:46:46 UTC
(In reply to comment #10)
> The FT_Library instance is used when creating the FT_Face font objects. As
> such, the fz_font struct is not thread safe at the moment. You could try
> putting the FT_Library global in thread local storage, if that is available
> when using the JNI?

I made the following change to res_font.c :

    static __thread FT_Library fz_ftlib = nil;
    static __thread int fz_ftlib_refs = 0;

I reran my tests and it worked like a champ. I converted 1,000 documents multi-threaded with no bottle necks. It ran in 40 seconds compared to ~2 minutes.

Maybe that simple change should make into the library as a permanent fix(?)

I tested this change under Windows, Debian and Ubuntu and it worked very well.

Thank you,

Pedro
Comment 12 zeniko 2011-03-08 20:18:03 UTC
(In reply to comment #11)
> I made the following change to res_font.c :
> 
>     static __thread FT_Library fz_ftlib = nil;
>     static __thread int fz_ftlib_refs = 0;

This wouldn't be compatible with how we currently use MuPDF: We load documents on one thread, render them on a second and search them on a third (all access being synchronized per pdf_xref i.e. per document instance). This leads to loading FreeType on one thread and trying to free it on another. Instead we'd rather need loading FreeType per document/pdf_xref.
Comment 13 Pedro Rivera 2011-03-08 21:31:17 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > I made the following change to res_font.c :
> > 
> >     static __thread FT_Library fz_ftlib = nil;
> >     static __thread int fz_ftlib_refs = 0;
> 
> This wouldn't be compatible with how we currently use MuPDF: We load documents
> on one thread, render them on a second and search them on a third (all access
> being synchronized per pdf_xref i.e. per document instance). This leads to
> loading FreeType on one thread and trying to free it on another. Instead we'd
> rather need loading FreeType per document/pdf_xref.

I agree. It would be better to have FT loaded on a per xref basis with the condition that it only loads it if needed when rendering. I have some processes that is purely text extraction.
 
Also, Isn't FreeType only loaded when processing font files while rendering a page? In this case it would only load and unload on your second thread. Unless of course your second thread isn't persistent for future rendering. I apologize if I misunderstand.

In my case I have multiple threads rendering pages from different documents, e.g typical server side processing. Therefore using __thread or loading FT at xref level would have the same result under this scenario. 

In the meantime can I suggest a happy medium? :) 

#ifdef _FT_LIB_NEED_TLS
     static __thread FT_Library fz_ftlib = nil;
     static __thread int fz_ftlib_refs = 0;
#else
     static FT_Library fz_ftlib = nil;
     static int fz_ftlib_refs = 0;
#endif


Thanks,

Pedro
Comment 14 Pedro Rivera 2011-08-23 00:57:31 UTC
I'm closing this out as I have solved several threading issues by installing various handlers in the MuPDF code. Although I would love to see the library be entirely thread safe, maybe there should be a "wish" list where one could drop legitimate ideas of how to get there. 

In my research I have identified the following programs as being thread unsafe:

1. draw_edge.c       
2. res_font.c
3. base_error.c
4. res_pixmap.c
5. draw_unpack.c

It seems to me that making it them thread safe would not be too difficult. Adding a few more items to struct pdf_xref_s could work:

/* Manage freetype library */
typedef struct fz_free_type_s fz_free_type;
struct fz_free_type_s {
	FT_Library library;
	int refs;
};

/* Manage warning messages */
typedef struct fz_warn_msg_s fz_warn_msg;
struct fz_warn_msg_s {
	char message[LINE_LEN];
	int count;
};

/* Manage error messages */
typedef struct fz_error_msg_s fz_error_msg;
struct fz_error_msg_s {
	char message[LINE_COUNT][LINE_LEN];
	int count;
};

struct pdf_xref_s
{
  ...
  ...
  fz_free_type *free_type;
  fz_warn_msg *warn_msg;
  fz_error_msg *error_msg;
}

+++
Comment 15 Pedro Rivera 2011-08-23 01:03:16 UTC
*** Bug 692162 has been marked as a duplicate of this bug. ***