| <<<Back 1 day (to 2012/02/11) | 2012/02/12 |
notzed | morning, i know it | 00:30.40 |
| s a long shot, but any mupdf devs about? | 00:30.46 |
Robin_Watts | I'm here. | 00:47.34 |
| tor8 and sebras may be. | 00:47.34 |
sebras | Robin_watts: are you there? | 01:05.07 |
Robin_Watts | I am. | 01:05.13 |
sebras | Robin_Watts: yes! :) | 01:05.22 |
| so with the locking in place, do we actually still require that display lists be used in multi-threaded environments? | 01:05.52 |
| what does the locking actually protect? | 01:06.08 |
| despite your ridiculously long commit comment I'm still puzzled. :) | 01:06.41 |
Robin_Watts | sebras: In theory, no. You can now multiple threads that all read from the file. | 01:06.56 |
| BUT... they will all contend, so it'd be a poor use of a multicore system. | 01:07.15 |
sebras | sure, but it should work! | 01:07.23 |
Robin_Watts | There are 4 locks. | 01:07.25 |
sebras | I created a simple program using pthreads. | 01:07.39 |
Robin_Watts | Lock 1: Protects calls to the user supplied allocation functions. | 01:07.40 |
sebras | it crashes spectacularly. | 01:07.53 |
Robin_Watts | (so 2 threads can't call malloc at once). | 01:07.58 |
| Lock 2: Internal one used for store stuff. | 01:08.05 |
sebras | specifically I'm wonding about the file position. | 01:08.08 |
Robin_Watts | Lock 3: File lock. | 01:08.20 |
sebras | is it reset before/after each blocking read? | 01:08.20 |
| s/wonding/wondering/ | 01:08.36 |
| and Lock 4: glyphcache... I know. :) | 01:08.57 |
Robin_Watts | I have asserts in the code that verify that the file lock is taken before every seek. | 01:08.59 |
| and held during every read. | 01:09.06 |
sebras | ok, and is it the case that mupdf does a fz_seek before each fz_read? | 01:09.23 |
| I think not, and I believe it may be the problem. | 01:09.37 |
Robin_Watts | It can: seek, read,read,read | 01:09.42 |
| and we hold the lock throughout that time. | 01:09.52 |
| If we lock, seek, read, unlock, lock, read, unlock then there would be a problem. | 01:10.14 |
sebras | exactly. | 01:10.26 |
Robin_Watts | But unless I've cocked up (which is very possible), that should never happen. | 01:10.34 |
sebras | so my question is -- do we? :) | 01:10.38 |
| http://pastebin.com/WNXM07FT | 01:11.28 |
Robin_Watts | I guess we could add another flag to fz_stream to say we've seeked. Clear that when we take a lock. And then verify on fz_read that it's set. | 01:11.39 |
sebras | running that program in valgrind causes a SIGSEGV. | 01:12.16 |
| on a file with 18 pages. | 01:12.29 |
| on a file with 1 page, everything is fine. | 01:12.35 |
Robin_Watts | sebras: OK. Can you mail me that? I'll give it a whirl and try and add some more debugging to my locking etc. But it won't be until at least monday, possibly later, as I'm supposed to be helping with a gs customer bug. | 01:13.34 |
sebras | of course. | 01:13.53 |
Robin_Watts | Thanks. | 01:14.02 |
sebras | I'll try adding a seekflag and see if it triggers an assert. | 01:14.20 |
| if so we know what the problem is. | 01:14.25 |
Robin_Watts | It's possible that I've dropped and retaken the file lock somewhere. | 01:14.27 |
| or maybe I need a write lock on the xref stuff as well. | 01:15.11 |
sebras | when it's being repaired..? | 01:15.32 |
Robin_Watts | But your example should help find the problem - thanks. | 01:15.35 |
sebras | that's not the case in this situation. | 01:15.41 |
Robin_Watts | When new objects are being read in. | 01:15.53 |
sebras | have you tested multi-threading? | 01:16.15 |
| or am I the first one? :) | 01:16.21 |
Robin_Watts | We have a mupdf customer that wants to make stuff work on multi-core systems. | 01:16.40 |
sebras | right. | 01:17.00 |
Robin_Watts | They were the driving force behind this latest batch of work. | 01:17.01 |
| And they are happy with what they have now. | 01:17.11 |
sebras | oh. then I'm probably doing something wrong. | 01:17.27 |
Robin_Watts | (But maybe they aren't tickling it in quite the same way you are) | 01:17.31 |
| I suspect it's not the file lock at fault. | 01:18.27 |
| I suspect it's two threads trying to access a shared piece of memory at the same time. | 01:18.45 |
| the file lock going wrong would give parsing errors or a deadlock, not a SEGV. | 01:19.09 |
| But I must go to bed now, or my missus will kill me. | 01:19.30 |
sebras | ok. nn. | 01:19.40 |
Robin_Watts | Thanks. Night. | 01:19.46 |
sebras | Robin_Watts: there is not a seek everywhere. | 01:22.36 |
| Robin_Watts: I clear the flag in fz_new_stream() and fz_lock_stream(), and set it at the end of fz_seek(). and then I put an assert for it to be set at the start of fz_read(). | 01:24.01 |
| the asset is triggered... | 01:24.08 |
| and it seems to be when we're loading the stream-contents. | 01:24.43 |
| Robin_Watts: you has mailz. | 02:36.58 |
notzed | ahh, someone is here. hi. i've been working on a binding for mupdf: I noticed a lot of changes recently, do you expect those to continue for a while, or when is the `api' likely to stabilise a bit? | 03:40.13 |
| it's not particularly important, i just don't want to waste time tracking a fluid situation. | 03:41.06 |
sebras | notzed: the changes are preparations for the mupdf 1.0 release. | 10:18.51 |
| notzed: 1.0 is scheduled and the end of feb release with the intent of having a more stable interface than before. | 10:22.25 |
notzed | oh nice, thanks. | 10:28.56 |
filmor | Robin_Watts, now that you redid the locking, is a fz_document_clone possible? :) | 12:06.46 |
| thanks btw | 12:06.53 |
Robin_Watts | filmor: The problem that the document is still associated with a single context still exists. | 12:14.09 |
| BUT... if I can convince myself that the document only uses elements from the context that don't change per cloned context (i.e. locks, store, allocator), then it may be the case that fz_document_clone is no longer necessary. | 12:15.23 |
filmor | that would be fine with me ;) | 12:16.49 |
| if you like I can go through the code | 12:17.07 |
Robin_Watts | no, of course that's not true, as it uses the exception stacks in the code. | 12:27.59 |
| s/code/context/ | 12:28.06 |
| That may be sebras' problem too. | 12:28.53 |
| I think the rule has to be that the 'fz_document' (or pdf_document) can only be used from one thread at a time. | 12:30.50 |
| Unless we pull context out of document - and that's a lot more passing of context around the code. | 12:31.28 |
| I need to think that over and discuss it with tor8. | 12:31.41 |
filmor | passing the context around shouldn't be that big of a problem | 12:36.20 |
| and except of the exception stack I don't see any use that isn't locked anymore, so it's kind of wasted potential :) | 12:36.53 |
sebras | Robin_Watts: since the file posisition is associated with the fd which is part of the stream inside a document, yes that is very likely. | 13:01.38 |
tor8 | sebras, Robin_Watts: each stream object is associated with a context as well. that may also cause trouble if you access a file from more than one thread. | 13:05.02 |
| disassociating the stream objects from a context should be easy enough, but I really don't like the idea of passing a fz_context as well as the pdf_document to all the pdf functions :( | 13:05.39 |
| so the benefit of decoupling the file streams isn't obvious. I'd rather have a fz_clone_stream or fz_reopen_stream to use streams from multiple threads | 13:06.30 |
| after all, the whole idea of a file is based on reading a stream from one thread | 13:06.47 |
| another blocker for pdf_clone_document is that the fz_obj objects are linked to the original document for indirect reference resolving, and not having to pass around a fz_context | 13:07.49 |
sebras | tor8: ok, then currently loading pages from more than one thread would be a problem. | 13:19.53 |
tor8 | sebras: yes, because the pdf_document has one context, with one exception stack | 13:20.34 |
sebras | tor8: but that still doesn't explain why my MT-testcase that loads all pages in one thread and creates display lists that are then rendered in each thread crashes... | 13:20.38 |
tor8 | quite. that one should work. do the test files have type 3 fonts by any chance? | 13:21.05 |
sebras | I don't think so. | 13:21.18 |
tor8 | because that's the only place I know we go back from fitz into the pdf interpreter | 13:21.31 |
sebras | nope, no type3. only type1 and truetype. | 13:21.50 |
| I revise my statement, I get warnings: | 13:23.42 |
| warning: assert: overwrite hash slot | 13:23.44 |
| warning: assert: remove inexistent hash entry | 13:23.48 |
| but it doesn't segfault. | 13:23.55 |
tor8 | looks like race condition or bad locking when protecting the store or glyph cache | 13:24.15 |
sebras | its in fz_hash_insert. | 13:24.30 |
tor8 | the store and the glyph cache are the two users of fz_hash_table | 13:24.51 |
sebras | why are the warnings not assertions in base_hash.c btw? | 13:27.54 |
| it's a problem in the store. | 13:29.55 |
| tor8: I have an idea -- what if two processes wanted to cache the same object, a fontdesc in this instace, twice in the store. then that object would have the same key and be stored in the same place in the hash..? | 13:38.21 |
| I belive that this is what's happening. a quick look revealed that the new resource to store is a font, as is the existing object at with the same key already in the hash. moreover size, ascent, descent, cap_height match exactly. | 13:44.10 |
tor8 | right, I think I saw a comment in a commit somewhere to the effect that there could be a "race to cache objects" where the last one to the store tossed his copy and used the older one | 13:51.19 |
| sebras: a fontdesc, however, shouldn't be cached by the load-in-one-thread-render-in-many example | 13:51.49 |
| shouldn't be cached by multiple threads, that is | 13:51.59 |
| since the fontdescriptors are used by pdf_run_page, not fz_run_display_list | 13:52.28 |
sebras | the code in fz_hash_insert is run by pdf_load_font. | 13:53.17 |
| which is executed inside the rendering thread when it is loading/drawing the page. | 13:54.12 |
| agh, the latency is unbearable just now. sorry if I mistype... :) | 13:54.15 |
| http://pastebin.com/TuQaGYxc | 13:57.11 |
| I added abort()s at the fz_warn("assert:...")s. | 13:57.57 |
| when I look at the key they are of course the same, but when I look at the font descriptors stored at the last line of pdf_load_font() they are also the same. | 13:59.01 |
| oh and this is mt-load-and-draw-in-threads.c | 13:59.31 |
tor8 | sebras: didn't we just arrive at the conclusion that mt-load-and-draw-in-threads.c wouldn't work? | 14:03.55 |
sebras | d'oh! | 14:04.17 |
tor8 | possibly not for this reason, though | 14:04.18 |
sebras | yes. | 14:04.28 |
tor8 | but I'm surprised it even got this far :) | 14:04.42 |
sebras | but mt-load-in-main-draw-in-threads.c dies as well... | 14:07.30 |
| maybe I misunderstood one part in that testcase though. | 14:08.24 |
| should the main thread allocate the pixmaps or should the rendering threads? | 14:08.46 |
| tor8: the segfault happens inside malloc() when freetype tries to allocate something. | 14:10.50 |
tor8 | sebras: are you cloning the context for each rendering thread? | 14:11.38 |
sebras | yes. but the pixmaps are created with the main context. | 14:11.50 |
| I guess that could be a problem. | 14:11.56 |
tor8 | you'd have to ask Robin_Watts the correct behavior there | 14:12.10 |
| but I think it shouldn't matter | 14:12.14 |
| since the allocator is shared when cloning | 14:12.34 |
sebras | I guess my testcase and variations thereof show just what a can of medusa's hair multi-threading is... | 14:14.08 |
Robin_Watts | I'm back from my run, and I've had an idea. | 14:21.30 |
| Our options are: 1) Pass context everywhere. Through all the pdf object code, everywhere. Don't store a ctx in xref, store etc. | 14:22.10 |
| That's a big change, but would lead to the 'cleanest' solution. | 14:22.31 |
| 2) When we open a document, clone the passed context, and store that in the document. | 14:22.47 |
| That becomes the only context that document functions ever use. | 14:22.59 |
| And we have a new lock used to ensure that only one thread is using that context at a time. | 14:23.13 |
| It means you *can* access a document from any thread you like, safely, but more than one at a time will block. | 14:23.37 |
| It means that the system I outlined before (of using the document just from one thread to read pages -> convert to display list) is still the recommended one for efficiency. | 14:24.19 |
tor8 | Robin_Watts: that means adding a fz_lock in *every* pdf_* function though | 14:24.20 |
| and buys us nothing in terms of MT performance | 14:24.43 |
Robin_Watts | tor8: Yes. Every top level function 'document' function, yes. | 14:24.48 |
| At the moment, we have 'you can't use the document from more than one thread - ever'. | 14:25.13 |
| With the extra lock we'd get 'you can use the document from anywhere you want, but for efficiency, just from one'. | 14:25.39 |
| (from one at a time) | 14:25.58 |
tor8 | if you think about why you'd want to use the document from multiple threads, I can only think of one reason | 14:26.13 |
| to speed up parsing by using more cores to parse and decode images | 14:26.26 |
sebras | Robin_Watts: doing that would require listing the public interface as well. | 14:26.34 |
tor8 | not a use case that's solved by this though. | 14:26.43 |
Robin_Watts | tor8: indeed. | 14:26.49 |
sebras | Robin_Watts: that boundary is right now somewhat fluid. | 14:26.52 |
tor8 | parsing is cheap, decoding images is expensive, and if we stick compressed data in a fz_image structure that can be done in the rendering threads | 14:27.27 |
Robin_Watts | tor8: The problem at the moment, is that when we first make a document from a file, it binds the current context into it. | 14:27.32 |
| tor8: Images are a special case that we can do something about. | 14:27.46 |
| We can identify the region of the document that contains the image data. | 14:28.09 |
tor8 | and the underlying file access is serial by nature, changing that is going to be awkward at best | 14:28.12 |
Robin_Watts | together with the filters that need to be applied to it. | 14:28.29 |
| Then that can be accessed from any thread just by holding the FILE lock. | 14:28.45 |
| Not the document lock. | 14:28.49 |
| The document lock is only required for things that start changing xref entries, or resolving objects etc. | 14:29.19 |
| I think the document lock is a good idea, because it's a small change, and it saves people from making silly mistakes. | 14:31.25 |
| Unless you fancy going for 1), which is cleaner, but touches much more code. | 14:32.23 |
| It may be possible to try 1) on a branch and for us to take a look at it. | 14:32.41 |
tor8 | 1) is as you say conceptually cleaner, but you still have the assumption in most of the parsing and lexing code that there is a serial file to read data from. if we go down that approach, something needs to be done about that other than adding more locks and locking calls everywhere. | 14:33.56 |
| anything that changes a data structure would need a lock | 14:34.11 |
| I don't like it. | 14:34.15 |
| having fitz level display list objects created in one thread, consumed as read-only in multiple threads, with locked access around a few shared caches .. that I can live with | 14:35.01 |
| if we want anything fancier, I want my goroutines and channels | 14:35.38 |
Robin_Watts | Fair enough. | 14:36.14 |
tor8 | the file lock we have now, that's primarily (or only) for dealing with the type 3 callbacks? | 14:36.23 |
Robin_Watts | So, sebras; your tests won't work. | 14:36.26 |
sebras | Robin_Watts: no I understand that now. | 14:36.45 |
Robin_Watts | tor8: No. | 14:37.09 |
| It's partly for that. | 14:37.20 |
| but it's also to allow for loading image data direct from the file in non document threads. | 14:37.46 |
sebras | Robin_Watts: except for mt-load-in-main-draw-in-threads.c which should work according to your discussion. | 14:37.51 |
Robin_Watts | (not that I've implemented that yet) | 14:37.55 |
| sebras: Probably. | 14:38.02 |
tor8 | Robin_Watts: right. that's also kind of callback-ish though :) | 14:38.12 |
| Robin_Watts: that trick won't work in the general case in XPS though | 14:38.33 |
| due to the crazy ass interleaved parts crap. not that I've actually seen image parts being split. | 14:39.14 |
| Robin_Watts: so in the fitz namespace, we hide contexts in three places -- streams, fz_objs, and the fz_document | 14:40.32 |
| streams I'm fine with, since they're serial by nature | 14:40.40 |
| the fz_objs really ought to be pdf_obj | 14:40.50 |
sebras | Robin_Watts: mt-load-in-main-draw-in-threads.c doesn't work either. I tried a new variant where I even allocate the pixmap in the rendering thread and take care to free it using the correct context in the main thread, but no luck. freetype misbehaves and I see segfaults all the same. | 14:41.10 |
tor8 | and fz_document is a thin shim around the pdf_document which is also serial by nature | 14:41.15 |
| Robin_Watts: did I miss anything there? | 14:41.35 |
Robin_Watts | Thinking. | 14:41.50 |
| I dislike the fact that objs/streams/etc are tied to 'whatever context was first used to access them'. | 14:42.34 |
| I think I'd still like the document to clone its own context when it's opened, and to always use that when handling the document. | 14:43.08 |
| That means the document can safely be used from any thread - just not more than one at a time. | 14:43.32 |
tor8 | Yeah, that makes sense. | 14:45.41 |
| if you take care to try/catch with the correct exception stack! | 14:46.44 |
| fz_try(my_thread_ctx) { pdf_run_page(doc); } will fail spectacularly with that approach | 14:47.35 |
| because of the separate exception stacks | 14:47.50 |
Robin_Watts | tor8: Yes. We'd have to have a top level exception handler inside document. | 14:48.36 |
| (probably we could do the context cloning/switching etc all at the document level, rather than in each sub class. | 14:49.08 |
| I have just come back from a run, so must dive into shower. I'll keep pondering this | 14:49.34 |
| Why won't it work with the crazy assed interleaved shit? | 14:50.55 |
tor8 | xps has parts potentially split into several zip entries | 14:51.18 |
| with funky naming to find and reassemble them | 14:51.30 |
| we could easily enough reassemble them and special case those fz_images though | 14:51.50 |
Robin_Watts | Right, so we need a 'smarter' image type that can handle not only "single location, plus filters", but "funky shit" too. | 14:52.03 |
tor8 | well, I don't think we'll ever see such cases | 14:52.23 |
| so I'm perfectly happy to trade the memory use of those non-existing-in-the-real-world cases for sanity in the image loader | 14:52.55 |
malc_ | tor8: around? | 18:44.53 |
| Forward 1 day (to 2012/02/13)>>> | |