| 2011/09/15 |
mvrhel2 | weird. That last commit that I did came back with cluster diffs but the cluster push that I had done with it did not show these | 04:09.30 |
| marcosw_: are you around? | 06:06.33 |
| good night all | 06:16.59 |
marcosw_ | mvrhel2: I am around, but not for much longer. | 06:51.57 |
henrys | flight delayed 2 hours just got home ... | 07:11.35 |
kens | Not good.... | 07:11.48 |
henrys | well heading off to bed see you in a few hours. | 07:19.04 |
kens | night henrys | 07:19.10 |
Robin_Watts | rebooty | 10:12.25 |
tor8 | morning | 12:10.16 |
Robin_Watts | hello. | 12:14.58 |
tor8 | I looked at your branch a bit yesterday | 12:15.18 |
Robin_Watts | And how horrified were you? | 12:16.06 |
tor8 | I think, if we're going to use exceptions at all we should be using them for all error management. not mixing and matching error bubbling with exception throwing. | 12:16.25 |
Robin_Watts | tor8: Right. | 12:16.34 |
tor8 | and I don't see a lot of benefit to exception throwing, since almost everywhere we'll need to catch, clean up, and propagate anyway | 12:16.57 |
| it'd be different in a garbage collected environment | 12:17.07 |
| and I'm not quite ready to go down that route yet | 12:17.16 |
Robin_Watts | I'd envisaged us using exceptions within the library, and when we catch them, we convert to errors to pass to the caller. | 12:17.39 |
| But what I've got at the moment, should allow us to change over to that gradually. | 12:17.58 |
| at the moment the structure for exceptions is in place, but unused. | 12:18.09 |
tor8 | I prefer keeping things simple with one "true" way to do it. | 12:19.23 |
Robin_Watts | tor8: So, you've just said you'd like us to use "either exceptions or errors". | 12:20.14 |
| and then you said "and I don't want to use exceptions" | 12:20.25 |
| and you've previously said "I don't want to bubble errors" | 12:20.33 |
| I'm having trouble squaring all those statements. | 12:20.46 |
tor8 | I've previously said "I don't want to bubble errors for malloc, because it's such a pain in the ass" | 12:21.02 |
| I can reconsider the later statement | 12:21.15 |
| bubbling up malloc errors means adding error handling to all constructors as well | 12:21.21 |
Robin_Watts | So you're now considering that bubbling errors is the right thing to do ? | 12:22.00 |
tor8 | I'd rather bubble malloc errors than use two separate error mechanisms | 12:22.32 |
| and given the choice between bubbling and setjmp/longjmp exceptions, which need to basically bubble as well to clean up properly, I'd go with bubbling | 12:23.12 |
Robin_Watts | ok. | 12:23.12 |
tor8 | agreed so far? | 12:23.35 |
Robin_Watts | I could be happy abandoning the setjmp/longjmp stuff in favour of bubbling errors, but before I do so, I'd like to just be sure we're doing it for the right reasons. | 12:24.31 |
| You say that setjmp/longjmp need to bubble too... | 12:24.51 |
tor8 | how about an experiment, take one (largish) file and rewrite it to properly use all three proposed error handling schemes and then see which approach we (hah, I meant "I" of course) like best | 12:25.32 |
| almost all functions have some intermediate allocs that need to be freed or cleaned up | 12:25.55 |
Robin_Watts | Essentially, with exceptions, either you need to expose the exceptions to the callers of the library, or you need to catch the exceptions at the topmost level and convert that to a bubbled error. | 12:26.17 |
tor8 | we could try using the stack more, even going so far as to use alloca | 12:26.24 |
Robin_Watts | I'd fight alloca. | 12:26.35 |
tor8 | I don't think we'd be able to use it much anyway, most things we're allocating is creating data structures and linked lists | 12:27.09 |
Robin_Watts | I'm assuming that the conversion of exceptions to bubbled errors for the callers isn't what offends you? | 12:27.34 |
tor8 | one problem with mupdf is there isn't much separation between topmost level functions and internals | 12:27.40 |
Robin_Watts | It's more the fact that we need cleanup at every intermediate alloc level, regardless of the scheme we pick ? | 12:28.01 |
tor8 | correct. | 12:28.17 |
Robin_Watts | OK. | 12:28.22 |
| Then we are in agreement, that both schemes involve work. | 12:28.41 |
| The question is, which works out easiest. | 12:28.53 |
| and I also agree that doing a rewrite of a file and then looking at it seems sensible. | 12:29.10 |
| So I'll look into that today. | 12:29.28 |
tor8 | I could live with exceptions if we used them everywhere; I could even live with a thin wrapper for external functions that turn exceptions into error codes for external callers. I just don't think we'll see much benefit over simple bubbling. | 12:29.41 |
| but like you said, we should at least try it | 12:29.54 |
| btw, for bubbling with malloc errors, we have one more choice to make. how to bubble malloc and constructor errors. | 12:30.32 |
| now we have: fz_font *fz_new_font() which cannot fail | 12:30.52 |
| a) fz_error fz_new_font(fz_font **fontp) | 12:31.06 |
| b) fz_font *fz_new_font() where NULL return means error | 12:31.26 |
Robin_Watts | Well, personally, I'd prefer a new naming scheme :) | 12:32.05 |
| but putting that aside. | 12:32.22 |
| If we have more than one way it can fail, then a) is best. | 12:32.41 |
| i.e. if it could fail because of memory, or it could fail because it failed to open a given font etc. | 12:33.03 |
| If it's a matter of memory failure or success, then b) wins. | 12:33.17 |
tor8 | I'm inclined to go with the latter, and add two functions fz_(re)throw_null(). a) does allow us to use a special numeric code to signal specific error causes, but is a fair bit uglier. I really wish for multiple return values here... | 12:33.51 |
Robin_Watts | Regardless of which of those we pick, we should have fz_error_mem to indicate a memory allocation failure. | 12:34.12 |
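
The two constructor conventions being weighed here, (a) and (b), could be sketched roughly as follows. This is only an illustration under assumptions: `font_t`, `font_err`, `FONT_ERR_MEM`, `FONT_ERR_BAD_NAME`, `new_font_a` and `new_font_b` are hypothetical names, not the MuPDF API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes for illustration; not MuPDF's fz_error. */
typedef enum { FONT_OK = 0, FONT_ERR_MEM, FONT_ERR_BAD_NAME } font_err;

typedef struct { char name[32]; } font_t;

/* Style (a): the error code is the return value, the object comes back
   through an out-parameter. The caller can tell the failure causes apart. */
font_err new_font_a(font_t **fontp, const char *name)
{
    font_t *font;

    assert(fontp != NULL);
    *fontp = NULL;
    if (name == NULL || name[0] == '\0' || strlen(name) >= sizeof font->name)
        return FONT_ERR_BAD_NAME;
    font = malloc(sizeof *font);
    if (font == NULL)
        return FONT_ERR_MEM;
    strcpy(font->name, name);
    *fontp = font;
    return FONT_OK;
}

/* Style (b): NULL means "failed"; the specific cause is not reported. */
font_t *new_font_b(const char *name)
{
    font_t *font = NULL;
    return new_font_a(&font, name) == FONT_OK ? font : NULL;
}
```

With (a) a caller can branch on `FONT_ERR_MEM` versus `FONT_ERR_BAD_NAME`; with (b) the call site is shorter but only sees pass or fail.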
| tor8: We simply *must* change the fz_throw/fz_catch stuff. | 12:34.38 |
| Unless we are really using exceptions, those names are unforgivable. | 12:34.56 |
tor8 | breaks out the thesaurus | 12:35.29 |
Robin_Watts | fz_error_create, fz_error_destroy | 12:35.51 |
tor8 | too long | 12:35.59 |
kens | lob and grab ? :-) | 12:36.02 |
Robin_Watts | 'too long' is a terrible reason. | 12:36.20 |
| have you seen the mupdf source? | 12:36.28 |
| Clarity wins, as far as I am concerned. | 12:37.25 |
tor8 | and the wrong word order, but that goes with you wanting a different naming scheme altogether ;) | 12:37.46 |
Robin_Watts | Yes indeed. | 12:37.55 |
| I deeply favour type_verb | 12:38.12 |
tor8 | reeks of object orientation ;) | 12:38.26 |
Robin_Watts | I'm taking a 'thing' and 'verb'ing it. | 12:38.47 |
| It means that all functions that handle a 'thing' are grouped together. | 12:39.04 |
| And verbs should always be consistent (create/destroy, init/fin, take/drop) etc | 12:39.41 |
tor8 | I'd agree if we're writing in german. I may have agreed if the world was that simple, what do you do with functions that deal with many types, or don't belong to a single type? | 12:40.22 |
Robin_Watts | Almost always, you have one main type. | 12:41.44 |
| Take, for example, your pdf_store functions. | 12:41.54 |
| Those add or remove, or hunt for objects in a store. | 12:42.11 |
| Those are pdf_store_{add,remove,find} | 12:42.22 |
| because the store is the primary object, and the other objects are being added to/looked for in the store. | 12:42.47 |
| It doesn't strictly need to be by type, but it should certainly be by module. | 12:43.28 |
tor8 | your naming emphasizes the thing, my naming emphasizes the verb. yours is stricter and more hierarchical, mine is more flexible for the odd functions that don't fit or get shuffled around a lot. | 12:43.48 |
Robin_Watts | I disagree with the latter part of that sentence, but I accept that this is not a fight I am going to win (today). | 12:45.17 |
tor8 | I don't disagree with your naming scheme in general, but for mupdf we picked the current naming quite a long time ago and I'm happy with it so far | 12:46.05 |
Robin_Watts | I've got to reboot. VS has soiled itself again. | 12:46.06 |
sebras | tor8: seems like there will be a large patch coming my way soon.. | 12:54.55 |
Robin_Watts | Gah. Even after a reboot, VS is still jammed. | 13:03.14 |
tor8 | reinstall time? ;) | 13:04.33 |
Robin_Watts | I suspect it's a project file problem. | 13:04.56 |
| I cloned the Debug configuration for mupdf as Memento, and added MEMENTO=1 to the defines. | 13:07.52 |
| and now when it tries to build it has trouble opening files in the Memento directory. | 13:10.29 |
| Ah, it failed to clone it properly. | 13:14.51 |
| I told myself I'd copy it in VS rather than hand hacking the files, because it would be less likely to get it wrong... how little I knew. | 13:15.21 |
| ok, tor8... want to suggest a suitable file for cleanup? | 13:21.54 |
| (to try exception handling based cleanup, I mean) | 13:22.07 |
tor8 | pdf_xref.c perhaps? | 13:23.06 |
Robin_Watts | That only has a single malloc. | 13:23.22 |
| Ah, but it has callocs too. | 13:23.41 |
| OK. | 13:23.43 |
tor8 | you want to grep for fz_new as well | 13:23.46 |
| but perhaps it isn't the best file to test with | 13:24.08 |
| I just remember having to do a lot of nested cleanups when catching malloc errors in there before :) | 13:24.29 |
| I have another question for you. in fz_to_int() and similar functions we can run into malloc errors, due to fz_resolve_indirect | 13:25.35 |
| other parsing errors get ignored there, perhaps we should treat a malloc error the same way there and just swallow it and treat it like other parsing errors. | 13:26.02 |
| that saves us a *lot* of error bubbling | 13:26.21 |
Robin_Watts | We could note the failure in the context, and then continue parsing. | 13:27.18 |
| That way we can indicate that the page rendering is only partial. | 13:27.34 |
tor8 | right. | 13:29.25 |
| we should make a note in a context every time we do run out of memory and can't recover by clearing caches, so the caller can try something different at the top level. | 13:30.02 |
Robin_Watts | Yes. | 13:30.11 |
tor8 | which will be either re-render at a lower resolution, or just throw up an error message "no can do" | 13:30.39 |
Robin_Watts | yup. | 13:30.50 |
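
The "note it in the context and carry on" idea might look something like this. A minimal sketch under assumptions: `ctx_t`, its `out_of_memory` flag, and `to_int` are hypothetical stand-ins, and the `simulate_oom` parameter exists only to make the failure path visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context carrying a sticky out-of-memory note. */
typedef struct { int out_of_memory; } ctx_t;

/* Accessor that never reports an error to its caller. Any failure yields 0;
   an allocation failure is additionally recorded in the context so the top
   level can decide to re-render at lower resolution or give up. */
int to_int(ctx_t *ctx, const int *boxed, int simulate_oom)
{
    assert(ctx != NULL);
    if (simulate_oom) {
        ctx->out_of_memory = 1; /* swallowed here, noted for later */
        return 0;
    }
    if (boxed == NULL)
        return 0;               /* other errors are swallowed the same way */
    return *boxed;
}
```

The top-level caller would check the flag once after rendering, rather than every intermediate function bubbling the error.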
tor8 | if we do make a global/context note, we can drop the a) approach and return NULL for errors. ie just pass/fail errors, no details in the actual code. | 13:31.45 |
| which is essentially what we have today anyway. | 13:31.56 |
Robin_Watts | tor8: No, I'd still favour a separate error code for out of memory. | 13:32.40 |
tor8 | okay. just thinking out loud :) | 13:33.12 |
Robin_Watts | If we are in the 'success or alloc_failure' case, then returning NULL is still the right thing to do. | 13:34.00 |
| If we are in 'success, or alloc failure, or some other failure' then it's still no harder to return a different error code, and saves people checking the context. | 13:34.37 |
tor8 | where will the error be raised (first call to fz_throw with the current names) | 13:36.12 |
| do you mean: p = fz_new_foo(); if (!p) return fz_throw("out of memory"); | 13:36.52 |
Robin_Watts | Yes. | 13:36.58 |
| With exceptions, it would be fz_new_foo that threw. | 13:37.19 |
| So the calling code is simpler (no need to check) | 13:37.30 |
tor8 | or: p = fz_new_foo(); if (!p) return fz_rethrow("out of memory"); (with the throw in fz_malloc() two levels deeper) | 13:37.42 |
| right. | 13:38.05 |
Robin_Watts | Cleanup can often be done a function at a time, rather than on every allocation. | 13:38.36 |
| So you get: | 13:38.41 |
tor8 | I usually do: err = foo(); if (err) goto cleanup; with a cleanup: block at the end after the normal return | 13:39.33 |
Robin_Watts | try { foo = fz_malloc(); bar = fz_malloc(); baz = fz_malloc(); use(foo, bar, baz); } catch { } free(foo); free(bar); free(baz); | 13:39.41 |
| Rather than: foo = fz_malloc(); if (!foo) goto cleanup; bar = fz_malloc(); if (!bar) goto cleanup; etc | 13:40.23 |
tor8 | yeah. that code compression may win me over on the exception throwing, actually. | 13:40.50 |
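
For comparison, the bubbling style with a single `cleanup:` label, spelled out in full. A sketch only: `demo_malloc`, `demo_free`, `allocs_until_failure` and `use_three_buffers` are hypothetical names, and the failure counter is there purely so the error path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator with an injectable failure point. */
int allocs_until_failure = -1; /* -1 means never fail */
int live_blocks = 0;           /* blocks currently allocated */

void *demo_malloc(size_t n)
{
    void *p;
    if (allocs_until_failure == 0)
        return NULL;
    if (allocs_until_failure > 0)
        allocs_until_failure--;
    p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

void demo_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);
        live_blocks--;
        free(p);
    }
}

/* Returns 0 on success, -1 if any allocation failed. */
int use_three_buffers(void)
{
    int err = -1;
    char *foo = NULL, *bar = NULL, *baz = NULL;

    foo = demo_malloc(16);
    if (!foo) goto cleanup;
    bar = demo_malloc(16);
    if (!bar) goto cleanup;
    baz = demo_malloc(16);
    if (!baz) goto cleanup;

    foo[0] = bar[0] = baz[0] = 0; /* stand-in for use(foo, bar, baz) */
    err = 0;

cleanup:
    /* Freeing NULL is a no-op, so one block covers every exit path. */
    demo_free(baz);
    demo_free(bar);
    demo_free(foo);
    return err;
}
```

Every allocation needs its own check here, which is the repetition the try/catch version removes.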
| but in that case I still propose removing our old bubbling altogether | 13:41.24 |
Robin_Watts | I think we'd end up in the stage where we had exception handling exclusively at the bottom of our call tree. | 13:42.00 |
| And at some point as we move up, it would return to code bubbling. | 13:42.22 |
| Where ideally that point would be as close to the entry point to the libs as possible. | 13:42.35 |
| (possibly just at the pdfapp level) | 13:42.40 |
tor8 | yeah. I've been messing about with a higher level "document" api for the app level | 13:43.04 |
| one that would abstract away the differences between PDF and XPS, and hide the device interface behind convenience rendering functions and do a lot of the page/image caching | 13:43.36 |
| at that level we could return error codes instead, and most users would access mupdf through that layer | 13:43.55 |
Robin_Watts | I suspect most users would be put off by the idea of having to handle exceptions. | 13:44.16 |
tor8 | you mean like for libjpeg and libpng? ;) | 13:44.35 |
Robin_Watts | tor8: Most users of libjpeg and libpng blindly copy the example code :) | 13:46.36 |
| What did you think of henrys message about making a mupdf demo app on the store? Was he talking about iOS? | 13:47.23 |
tor8 | yeah. I think I should just get my hands dirty and make a proper app for both ios and android that we put on the markets. | 13:48.30 |
Robin_Watts | tor8: How will you get files into the app? | 13:49.05 |
tor8 | so far I've been happy to make just the library and leave the apps to other people | 13:49.20 |
| iTunes has a (fairly well hidden though) document syncing page where you can transfer documents to apps | 13:49.43 |
| for android, we can just browse the sdcard filesystem | 13:49.57 |
Robin_Watts | Right. It was iOS that worried me. | 13:50.06 |
tor8 | so we'll need a big fat help dialog with screenshots "HOW TO SYNC YOUR FILES" | 13:50.34 |
| on iOS | 13:50.39 |
Robin_Watts | Indeed. | 13:50.52 |
henrys | there is no way on these devices to say you want a mime type opened with some app? | 14:20.15 |
tor8 | on android there may be, but not on iOS | 14:20.38 |
henrys | Robin_Watts:to answer your question scott and miles need a non technical demo for management and sales folks they meet at companies, they both have ipads which they tote around and inevitably the folks across the table will have some form of tablet. To them a "development environment" is just "air" (non tangible). | 14:23.46 |
| Robin_Watts, tor8:so I've been playing around with xcode ide - gdb and stuff, it has many of the visual studio features and it keeps the gdb command line available, I am not sure how long this has been around I haven't tried it in quite some time. | 14:34.29 |
| alexcher:are you about? | 14:39.25 |
Robin_Watts | henrys: Well, with what tor8 says about syncing documents through itunes, that's a game change. | 14:40.40 |
| and means it's possible without writing huge reams of code. | 14:41.09 |
henrys | hmm how do all these apps work? http://www.labnol.org/software/ipad-pdf-reader-apps/13807/ | 14:45.31 |
Robin_Watts | Good Reader is a download tool. | 14:47.35 |
| The ones I looked at before all either fetched from the web, or did some WebDAV nonsense to allow people to upload files to the device. | 14:48.18 |
henrys | I see | 14:51.43 |
| Robin_Watts:while I was on the plane I looked at your interp4.pdf file - assuming ray won't get to it for some time - and tripped over something: I don't understand how the ImageMatrix gets initialized for inline images. I seem to be getting bogus data read from a non-existent dictionary value. | 14:53.59 |
Robin_Watts | That's the 'interpolate in landscape' stuff? | 14:54.32 |
henrys | yes | 14:54.39 |
| that's why I was asking if alex was about? | 14:54.56 |
Robin_Watts | That's entirely out of cache for me, I'm afraid, but yes, asking alex would be good. | 14:55.12 |
henrys | right I'll just send him mail. | 14:55.48 |
Robin_Watts | tor8: if you do an obj = fz_dict_gets(); shouldn't you then do an fz_drop_obj(obj) ? | 15:35.34 |
| oh, OK, no. | 15:36.05 |
tor8 | no, that's one of the exceptions. | 15:36.08 |
| basically: if you call something with "new" or "load" in the name, you need to drop | 15:36.22 |
| otherwise you must "keep" to make sure it lives on past your current function call | 15:36.45 |
| you have to call drop if you own the object, and you own it by either creating the object, or keeping it. | 15:37.46 |
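
The ownership rule described here (create or keep means you own a reference and must drop it; a plain lookup only lends you one) could be sketched like this. Assumptions: `obj_t`, `dict_t`, `new_obj`, `keep_obj`, `drop_obj` and `dict_get` are hypothetical simplifications, not the real `fz_obj` functions.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int refs; int value; } obj_t;

int live_objs = 0; /* objects not yet freed, for checking balance */

/* "new": the caller owns the single initial reference. */
obj_t *new_obj(int value)
{
    obj_t *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refs = 1;
    o->value = value;
    live_objs++;
    return o;
}

/* "keep": take an additional owned reference. */
obj_t *keep_obj(obj_t *o)
{
    o->refs++;
    return o;
}

/* "drop": give up one owned reference; free on the last one. */
void drop_obj(obj_t *o)
{
    assert(o->refs > 0);
    if (--o->refs == 0) {
        live_objs--;
        free(o);
    }
}

/* One-slot "dictionary" that owns a reference to its entry. */
typedef struct { obj_t *slot; } dict_t;

/* Lookup returns a borrowed pointer: no refcount change, so the caller
   must not drop it unless it calls keep_obj first. */
obj_t *dict_get(dict_t *d)
{
    return d->slot;
}
```

A borrowed pointer is only safe while its owner stays alive, which is the hazard raised next about purging objects from the xref.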
Robin_Watts | yeah, ok. | 15:38.01 |
tor8 | only cause of problems I can see with that would be multi-threaded access to pdf_xref (not going to happen) | 15:38.45 |
| but also if we add purging objects from the xref when we run low on memory, that might free an object we're using temporarily | 15:39.18 |
Robin_Watts | tor8: Yes, need to think about that. | 15:39.48 |
| but not today :) | 15:39.56 |
tor8 | that can only happen if you follow an indirect reference | 15:39.57 |
| since if you're looking at an object at the top level of the xref, that one you must own since you have to call pdf_load_object | 15:40.29 |
| but if you follow an indirect reference, that will implicitly load the object and fudge with the reference counter | 15:41.05 |
Robin_Watts | It may be that to make multithreaded access work, the easiest thing to do will be to make fz_dict_gets also increment the counter. | 15:41.37 |
tor8 | I was thinking about a low-memory flag in the xref, to prevent us from ever caching an object | 15:41.47 |
| yeah, but that means a lot more refcount shuffling everywhere. we'll take both a development and runtime speed hit doing that :( | 15:42.36 |
| depending on how we use fz_obj's, we may be able to get away with reference counting them altogether if we restructure a few things | 15:43.49 |
| there's a missing "not" somewhere in that sentence | 15:44.05 |
Robin_Watts | tor8: (and anyone else interested) http://ghostscript.com/~robin/pdf_xref.c | 16:02.50 |
| That's a first pass at the file, converting it to use exception handling. | 16:03.09 |
| oops. better version. | 16:04.45 |
henrys | tor8:so scott now has a list of folks to visit once he has an app in hand. So how long do you think it will take to get something presentable rolling? | 16:05.01 |
tor8 | Robin_Watts: what does fz_var() do? | 16:06.57 |
| henrys: a few weeks | 16:07.11 |
Robin_Watts | tor8: When you longjmp, you can lose the state of local variables. | 16:07.31 |
tor8 | + a few weeks for apple's app store approval process | 16:07.36 |
| oh, so volatile? | 16:07.51 |
Robin_Watts | tor8: No. | 16:07.57 |
tor8 | brb, dinner | 16:07.57 |
Robin_Watts | I could have used volatile to avoid it, but that's unnecessarily harsh. | 16:08.13 |
| Instead I call a function with the address of the var cast to a void *. | 16:08.36 |
henrys | very exciting time to be doing mupdf - the message I got from the show is print is going to get smaller and light interactive stuff is here. Everyone is scrambling to get out of things like static printed signs and moving to interactive projector stuff. | 16:09.05 |
kens | Night all. | 16:09.24 |
Robin_Watts | That forces the C compiler to assume that that variable can be updated on every function call, so it ensures that it's printed. | 16:09.27 |
| night kens. | 16:09.29 |
henrys | night kens | 16:09.34 |
Robin_Watts | s/printed/written back to memory/ | 16:09.59 |
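
The hazard under discussion: in standard C, a non-volatile automatic variable that is modified between `setjmp` and the matching `longjmp` has an indeterminate value after the jump. A minimal sketch of the `volatile` option follows; `handler`, `thrown_code`, `throw_code` and `run` are hypothetical names, and this is not the code in the linked pdf_xref.c.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

jmp_buf handler;  /* a single handler; a real scheme would keep a stack */
int thrown_code;  /* carries the error code across the jump */

void throw_code(int code)
{
    thrown_code = code;
    longjmp(handler, 1);
}

/* Returns 0 on success or the thrown code. *had_buf reports whether the
   cleanup code still saw the buffer pointer after the jump. */
int run(int should_fail, int *had_buf)
{
    /* Assigned after setjmp and read after longjmp, so the pointer itself
       is declared volatile to keep its value determinate. */
    char *volatile buf = NULL;
    int result = 0;

    assert(had_buf != NULL);
    if (setjmp(handler) == 0) {
        /* "try" body */
        buf = malloc(64);
        if (buf == NULL)
            throw_code(1);
        if (should_fail)
            throw_code(42);
    } else {
        /* "catch" body */
        result = thrown_code;
    }

    /* Shared cleanup, reached on both paths. */
    *had_buf = (buf != NULL);
    free(buf);
    return result;
}
```

Passing the variable's address to an opaque function, as described above, aims at the same effect without the qualifier; `volatile` is the form the language standard itself specifies.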
| tor8: Most of the try/catching in that file isn't so much as for cleanup as for ensuring we get (more or less) the same debug messages as before. | 16:11.12 |
| henrys: You may be interested in looking at that file too, to see what you think of the exception handling scheme. Look for fz_try/fz_catch/fz_throw etc. | 16:13.07 |
henrys | will do. | 16:13.18 |
marcosw_ | henrys and rayjj: it doesn't look like we need a support meeting this week, only four new bugs and they all seem straightforward. | 16:25.34 |
henrys | right - you did see all the xps bugs right? | 16:36.16 |
| marcosw ^^^ | 16:36.42 |
| or marcosw_ ^^^ | 16:36.49 |
tor8 | Robin_Watts: better hope fz_var doesn't get inlined then :) | 16:37.26 |
Robin_Watts | tor8: Indeed. It calls a function in except.c especially to avoid that. | 16:37.51 |
tor8 | Robin_Watts: I'd still go with volatile though | 16:39.06 |
Robin_Watts | tor8:We | 16:39.15 |
| tor8: We can go with volatile if you'd prefer. | 16:39.26 |
tor8 | Robin_Watts: we could play with the gcc specific stack trace functions at the throw-site to get a reasonable approximation of the current print-out. we'd lose a few bits of detail on the intermediate printouts but nothing I'd consider crucial. | 16:40.27 |
| it's very rare that the rethrow adds more context than the throw in the old scheme. a few places it adds on an object number that isn't available at the throw site. | 16:40.59 |
Robin_Watts | tor8: Yeah, I saw that. | 16:41.38 |
| I'd forgotten how much I like code written in this style :) | 16:41.55 |
tor8 | exception throwing style you mean? :) | 16:42.15 |
Robin_Watts | yes. | 16:42.18 |
| It's been 10+ years since we wrote this lib. | 16:42.27 |
tor8 | seeing it in use, you've almost sold it | 16:42.29 |
| sold me on it | 16:42.34 |
Robin_Watts | It really means that errors *are* the exception, rather than the norm, and the flow of the code reflects that. | 16:43.03 |
tor8 | I also see you hid the context in the big data type structs though :/ not sure if I like that compromise. it always irked me how gs hides the gs_memory here and there | 16:43.10 |
Robin_Watts | tor8: We can change to pass the context around more, if you want. | 16:43.34 |
| Given that xref was only touched within one thread, it seemed safe to hide the ctx there. | 16:44.01 |
| Likewise streams. | 16:44.08 |
| And it saved having to pass an extra pointer everywhere. | 16:44.22 |
tor8 | if we're going to pass it around, we should add it as a first argument to every function. but there is a point to be made for hiding it in pdf_xref, since that's the "context" for mupdf. | 16:44.28 |
| how about fz_context to all fz_ namespace functions, and pdf_xref to the pdf_ functions | 16:44.43 |
Robin_Watts | That sounds nice, if it's practical. | 16:45.06 |
tor8 | and hide it in xps_context for xps (or pass explicitly, I don't remember if the xps context is as pervasive as xref) | 16:45.10 |
Robin_Watts | The worry I have is that in future we may do some clever things to allow for multithreaded operation. | 16:45.36 |
tor8 | otoh, fz_try(xref->ctx) is going to get tiresome to type rather fast | 16:45.40 |
Robin_Watts | Like, say, pulling the image data out of the file into a compressed block of memory, and then doing image decomps in another thread. | 16:46.09 |
tor8 | the MT operations I have in mind are (a) rendering display trees in separate threads, (b) passing off image decoding to worker threads | 16:46.41 |
Robin_Watts | Right. | 16:46.51 |
tor8 | (b) needs mutexes/semaphores to block the renderer until the image is ready | 16:46.58 |
Robin_Watts | We need to ensure that those cope with being called with different contexts. | 16:47.22 |
tor8 | interpreting the pdf content streams I don't imagine we'll ever want multi-threaded | 16:47.24 |
Robin_Watts | Indeed. | 16:47.30 |
| Anything that risks hitting the disc must be single threaded really. | 16:47.50 |
tor8 | decoding images in a separate thread, and multiple threads for rendering should be straightforward enough with fz_context being passed around | 16:47.52 |
Robin_Watts | tor8: yes. I agonised over it for a bit, but decided that ctx in xref and streams made sense. | 16:48.21 |
| but I deliberately didn't put the ctx into the pdf_store. | 16:48.32 |
tor8 | hang on, let me see where you put it in the streams | 16:48.39 |
| all the places the pdf_store is used, we probably also have the pdf_xref? | 16:48.53 |
Robin_Watts | fz_stream | 16:48.55 |
| yes, it gets the ctx from the xref and then passes it to pdf_store_find etc. | 16:49.19 |
| I didn't bake the ctx into the pdf_store itself on a pdf_store_new | 16:49.35 |
tor8 | we could pass the pdf_xref to the pdf_store functions instead of the two arguments, but that's bordering on ugly | 16:50.35 |
Robin_Watts | That's over the edge into ugly, IMHO. | 16:50.58 |
henrys | tor8:I'm thinking that some of the customers scott will be visiting would want to be able to process a movie in a pdf file, something I know little about. I am hoping we'd be able to just hand that off to another component on the system and return. | 16:51.02 |
Robin_Watts | We could have more than one store. | 16:51.03 |
tor8 | I think the pdf_store almost belongs to the fz_ namespace anyway | 16:51.06 |
Robin_Watts | tor8: yes, it's a generic cache. | 16:51.24 |
tor8 | henrys: movies are, I believe, a form of annotation objects | 16:51.34 |
Robin_Watts | I think maybe we should rejig it a bit to recognise that fact. | 16:51.52 |
tor8 | Robin_Watts: with specialised lookups, but in essence it's just a refined hashtable | 16:52.11 |
Robin_Watts | i.e. rename the store to pdf_cache. find becomes 'claim', and we'd need a 'release'. | 16:52.36 |
tor8 | it indexes on object number in the best case, and by object comparisons in the worst case | 16:52.39 |
| yeah, it started out as a store not a cache, since I preloaded all resources before running the page | 16:53.07 |
Robin_Watts | we want a way to lock an object whilst we are using it so it doesn't get kicked out in a multithreaded environment. | 16:53.17 |
tor8 | since we don't do that anymore, it's reasonable to rename it to pdf_cache | 16:53.17 |
Robin_Watts | hence an atomic 'claim' rather than a 'find' and a 'lock'. | 16:53.31 |
tor8 | quite | 16:53.39 |
| oh wait, no, we're being side tracked | 16:53.52 |
Robin_Watts | is quite relieved at this point. | 16:53.58 |
| I was expecting a much larger fight over the context/exceptions stuff. | 16:54.11 |
tor8 | the pdf_cache is only ever accessed from the pdf_interpret loop | 16:54.12 |
| not by the renderer | 16:54.26 |
| it's primarily there so resources (images and fonts) that are used multiple times get reused in the display list | 16:55.04 |
Robin_Watts | tor8: Right, yes, it's definitely a sidetrack. Maybe for the future we might want to keep other things in there (like decoded or scaled images etc) | 16:55.11 |
tor8 | we may end up putting a callback from the fz_pixmap rendering logic back into the pdf interpreter to decode on the fly. but I think we have a ways to go before that becomes a priority. | 16:56.09 |
| like allowing non-alpha and 1-bpp image formats in fz_pixmap | 16:56.29 |
Robin_Watts | henrys: movies in PDF files; the movie data is an annotation, yes, I believe. | 16:56.33 |
tor8 | and storing compressed jpeg data that's decoded on the fly | 16:56.41 |
| and not requiring all decompressed image data in ram when rendering. icky stuff. | 16:57.24 |
Robin_Watts | to then display it, either we need to hook our renderer to be capable of quickly rerendering a given rectangle on each frame, or (more likely), we need to call a system component to tell it to play the movie. | 16:57.26 |
tor8 | henrys: or let them know where on the page the movie annotation appears, and hand them the data as a blob | 16:58.00 |
Robin_Watts | Either the system component will open a 'subwindow' on the device and put the rectangle there, or it will decode and pass back image data. | 16:58.11 |
tor8 | let them deal with decoding and rendering the movie on top | 16:58.12 |
Robin_Watts | The latter used to be more common, but the former is now almost always the case, what with DRM etc. | 16:58.30 |
tor8 | same deal with embedded 3d cad models | 16:58.44 |
Robin_Watts | (they don't want apps getting them to decode video and grabbing it back to recompress) | 16:58.46 |
marcosw_ | henrys: I just got back online, had to leave the building for a fire drill (followed by a spontaneous meeting at the evacuation assembly point). | 17:01.01 |
| what xps bugs are you referring to? | 17:01.33 |
tor8 | Robin_Watts: you may be relieved, but sebras will kill us for introducing exceptions, and zeniko will strangle us for changing all the api:s again... ;) | 17:02.31 |
Robin_Watts | The APIs shouldn't change, right? | 17:02.54 |
| He should only be calling 'above' the exceptions. | 17:03.14 |
marcosw_ | henrys: found the xps bugs, in an email from a customer. | 17:03.15 |
tor8 | we're adding the context to all functions, and taking away the error returns | 17:03.23 |
Robin_Watts | tor8: right, to the internal functions, yes. | 17:03.38 |
tor8 | they're doing a fair bit of low level digging around. and as I said before, there's no clear separation between internal and public functions. | 17:03.59 |
| something we may want to work on at some stage | 17:04.14 |
Robin_Watts | Yes, splitting fitz.h and mupdf.h up into smaller modular headers would be nice. | 17:04.53 |
tor8 | I was thinking primarily of splitting them into internal and public header files. | 17:05.22 |
Robin_Watts | That'd be a start :) | 17:05.29 |
tor8 | and if you want to split them into smaller modular pieces, we can do that later too. | 17:05.58 |
| but I think we should fix the more pressing issues of the context and exception throwing first, oh, and that iOS/android viewer app :) | 17:06.59 |
Robin_Watts | Well, if we're agreed on the principle, I can keep going and push exception handling throughout. | 17:07.41 |
| and you can do the app :) | 17:07.51 |
Robin_Watts | walks dogs. bbs. | 17:08.17 |
| henrys: Looking at Drivers.htm in the strip_copy_rop section. | 18:24.58 |
| As noted above, the source S may be a solid color, a bitmap or a pixmap. If S is a solid color: | 18:25.18 |
| sdata, sourcex, sraster and id are irrelevant | 18:25.33 |
| scolors points to two gx_color_index values; scolors[0] = scolors[1] = the color | 18:25.52 |
| Why 2 ? | 18:25.55 |
| Ah, to allow simplistic handling further on, I bet. | 18:26.20 |
| henrys: mvrhel2 looked at the speed of the slowest plank pcl files, and found that a significant amount of time was spent in planar_to_chunky. | 18:44.21 |
| I'm looking into why that's being called. | 18:44.37 |
| It looks like the file is plotting a monochrome image with lop 0x66. | 18:45.17 |
| (That is, D^S) | 18:45.43 |
| We break the image down into rectangles, and plot each one, again with D^S, even though we know that S is 1 for each of the rectangles we plot. | 18:46.40 |
| So that's basically NOT(D). | 18:46.56 |
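
The reduction from D^S to NOT(D) can be checked with a generic truth-table evaluator. A sketch, assuming the usual ROP3 bit ordering in which bit `(P<<2)|(S<<1)|D` of the code gives the result; `rop3` is a hypothetical helper, not a Ghostscript function.

```c
#include <assert.h>
#include <stdint.h>

/* Apply a three-operand raster op to 8 one-bit pixels at once.
   p = pattern/texture bits, s = source bits, d = destination bits. */
uint8_t rop3(uint8_t code, uint8_t p, uint8_t s, uint8_t d)
{
    uint8_t out = 0;
    for (int bit = 0; bit < 8; bit++) {
        /* Build the truth-table index for this pixel position. */
        int idx = (((p >> bit) & 1) << 2)
                | (((s >> bit) & 1) << 1)
                |  ((d >> bit) & 1);
        assert(idx >= 0 && idx < 8);
        out |= (uint8_t)(((code >> idx) & 1) << bit);
    }
    return out;
}
```

With code 0x66 the pattern operand has no effect and the result is `s ^ d`; when every source bit is 1, that is simply the bitwise complement of the destination.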
| In order to do a rop, we have to 'get bits' the background, and convert to rgb. | 18:47.21 |
| So that's a planar to chunky operation, then a convert to rgb operation, then a not operation, then we have to convert back from rgb to cmyk and write out to the planes. | 18:48.16 |
| Are rops known to be broken on the display device? | 18:52.35 |
henrys | Robin_Watts:sorry just returned. | 20:20.48 |
| I'd imagine they'd be broken if aa were enabled I don't see why it wouldn't work otherwise. | 20:21.33 |
Robin_Watts | Does lj5200_pcl6_mono_PWTW51DC.prn look right to you then ? | 20:42.42 |
| I think it's rops that are killing the plank performance. | 20:44.34 |
henrys | checking the file - well we can do the rops in cmyk and flip the bits right? | 20:45.10 |
Robin_Watts | rops force things into rgb, so that's extra planar to chunky and chunky to planar stages. | 20:45.23 |
henrys | well in cmy. | 20:45.25 |
| yes I am talking about doing away with that. | 20:45.41 |
Robin_Watts | We can, but the results will be wrong :) | 20:45.47 |
henrys | why? | 20:46.50 |
Robin_Watts | Well, take the case used in this file; inverting a region. | 20:47.15 |
| Flipping the CMYK bits does not give the same results as converting to rgb, flipping, and converting back. | 20:47.35 |
| (0001 converts to rgb as 000 inverts to 111, converts back as 0000, which is not the same as 1110) | 20:49.02 |
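
That example can be worked through in code with one bit per component. A sketch under assumptions: the packing (C, M, Y, K from high bit to low) and the conversion rules, in particular full black replacement when C, M and Y are all set, are illustrative choices and not the plank device's actual mapping.

```c
#include <assert.h>

/* 1-bit-per-component CMYK (bits: C M Y K) to RGB (bits: R G B). */
unsigned cmyk_to_rgb(unsigned cmyk)
{
    assert(cmyk <= 0xFu);
    unsigned c = (cmyk >> 3) & 1, m = (cmyk >> 2) & 1;
    unsigned y = (cmyk >> 1) & 1, k = cmyk & 1;
    unsigned r = !(c | k), g = !(m | k), b = !(y | k);
    return (r << 2) | (g << 1) | b;
}

/* RGB back to CMYK, assuming black replaces equal C, M and Y. */
unsigned rgb_to_cmyk(unsigned rgb)
{
    assert(rgb <= 0x7u);
    unsigned c = !((rgb >> 2) & 1), m = !((rgb >> 1) & 1), y = !(rgb & 1);
    unsigned k = c & m & y;
    if (k)
        c = m = y = 0;
    return (c << 3) | (m << 2) | (y << 1) | k;
}

/* Invert by round-tripping through RGB. */
unsigned invert_via_rgb(unsigned cmyk)
{
    return rgb_to_cmyk(~cmyk_to_rgb(cmyk) & 0x7u);
}

/* Invert by flipping each plane independently. */
unsigned invert_planes(unsigned cmyk)
{
    assert(cmyk <= 0xFu);
    return ~cmyk & 0xFu;
}
```

For input 0001, `invert_via_rgb` yields 0000 while `invert_planes` yields 1110, matching the mismatch described above.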
henrys | we can't have 0001 it must be converted 111 | 20:50.19 |
Robin_Watts | Can we assume that we only get 'valid' cmyk in? | 20:51.12 |
henrys | actually the pcl manual specifically says the printer does the ops in CMY space behind the scenes. | 20:51.35 |
Robin_Watts | Are you saying that we'll never see a 0001 pixel? if k is 1, then c,m,y will be 1 too ? | 20:52.49 |
henrys | yes. | 20:53.00 |
| I am saying that whenever we do see 0001 we should treat it as 111 | 20:53.48 |
| it is worse than that you are reading halftoned data right? | 20:54.26 |
Robin_Watts | You're placing dependencies on the pixel data. | 20:54.43 |
henrys | all of this is some crazy approximation. | 20:54.43 |
Robin_Watts | In order to work on planes at a time, I can't afford to say "if k is 1 then treat c as 1". | 20:55.13 |
| From what you're saying, I *do* need to look at all 4 planes at the same time. | 20:57.36 |
| So the best I could do is to do some code that reads a pixel from all 4 planes, operates on it, then writes it back. | 20:58.20 |
| we can't treat the 4 planes independently, but I might be able to avoid slaving through the data so many times in lots of steps. | 20:59.02 |
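A rough sketch of the single-pass idea: read one byte (8 one-bit pixels) from each of the four planes, apply the "K implies C, M and Y" rule, invert in CMY, and write back. The function name, the plane layout, and the black-generation rule (full C+M+Y becomes K only) are assumptions made for illustration, not the plank device's actual behaviour.

```c
#include <stddef.h>

/* Invert n bytes of 1-bit planar CMYK data in one pass over all four planes.
   Each byte holds 8 pixels; bit i of each plane belongs to the same pixel. */
void invert_planes(unsigned char *c, unsigned char *m,
                   unsigned char *y, unsigned char *k, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        unsigned char kk = k[i];
        /* Treat K as C+M+Y, then invert each colorant. */
        unsigned char ci = (unsigned char)~(c[i] | kk);
        unsigned char mi = (unsigned char)~(m[i] | kk);
        unsigned char yi = (unsigned char)~(y[i] | kk);
        /* Assumed black generation: pixels with C, M and Y all set become K only. */
        unsigned char nk = (unsigned char)(ci & mi & yi);
        c[i] = (unsigned char)(ci & ~nk);
        m[i] = (unsigned char)(mi & ~nk);
        y[i] = (unsigned char)(yi & ~nk);
        k[i] = nk;
    }
}
```

This still reads all four planes together, as discussed, but it avoids the separate planar-to-chunky, RGB conversion, and chunky-to-planar passes.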
henrys | why can't we always produce CMY? | 20:59.40 |
| no K | 20:59.43 |
Robin_Watts | Sorry? | 20:59.49 |
henrys | when pcl is active. | 21:00.03 |
| pcl input is always rgb (well, srgb); if the output is always cmy there is no issue, right? | 21:00.27 |
Robin_Watts | you're saying the plank device should be cmy, not cmyk ? | 21:00.33 |
henrys | I am thinking out loud. | 21:00.52 |
| but yes that was what I was thinking. | 21:01.04 |
Robin_Watts | If we only ever used the traditional crap colour mappings, then yes. | 21:01.22 |
| because k is always mechanically derivable from c,m,y. | 21:01.43 |
| BUT a) what about colour management? | 21:01.59 |
| b) that would mean the cmyk4 device would really have 3 components, and lots of gs assumes that 3 components = rgb, 4 components = cmyk, right ? | 21:02.30 |
henrys | PCL is purely business graphics really just srgb. | 21:02.43 |
| so are rops the only performance problems? | 21:04.00 |
Robin_Watts | Pass. | 21:04.18 |
| Did Marcosw ever do the plank vs pamcmyk4 timings for gs rather than pcl ? | 21:04.36 |
| I haven't seen them. | 21:05.25 |
| With that, I'd hope to see a much smaller difference. | 21:05.38 |
henrys | maybe those differences should be the focus while I think about this. I do think the CMY device is the answer. | 21:06.26 |
Robin_Watts | What are the relevant priorities here? | 21:06.39 |
| plank vs mupdf error checking vs Company I. | 21:06.54 |
Robin_Watts | swaps to staring at a screen further away... | 21:08.46 |
henrys | I feel that plank and Company I have higher priority, but it's debatable; all of it should be pushed forward gradually. We really need to talk to michael about cmy. | 21:09.09 |
| I like the exception handling in mupdf | 21:11.45 |
| alexcher? | 21:12.27 |
sebras | Robin_Watts, tor8: AAARGH! ;) (haven't read the whole log yet) | 23:31.27 |
Robin_Watts | Why aargh? | 23:31.36 |
| You don't like exceptions? | 23:31.41 |
sebras | Robin_Watts: no, not really. though I agree that error handling is a Hard<tm> problem to solve in a convenient and elegant way. | 23:32.30 |
| Robin_Watts: this is artifex's/tor's project though so I'm not in a position to complain really. I'll adapt I guess. | 23:33.11 |
| I agree with tor8 that probably multiple return values would have been my desired way. | 23:33.35 |
Robin_Watts | sebras: I was very wary of the idea of exceptions when I was first exposed to it. | 23:33.45 |
| but the code can look a lot cleaner with it. | 23:34.04 |
| And the lib we are using is pure C (no C++ cruft). | 23:34.19 |
sebras | no, that's good at least. | 23:34.28 |
| the first of my two exception gripes is that it is not uncommon to have c++/java/whatever throw exceptions back to the caller without declaring it (which means that they are invisible, i.e. you do not know that you may need to catch anything). my other gripe is that most often you see something similar to try {}catch(AllExceptions) {...} which to me seems like there has been no thought process involved in distinguishing different types of error c | 23:37.41 |
| sorry for the long response, but as you can tell I feel passionately about it. :) | 23:38.05 |
Robin_Watts | sebras: ok, to take those points... | 23:38.20 |
| java methods that throw exceptions are explicitly required to state the exceptions that they throw. | 23:38.43 |
sebras | nope, not all. | 23:38.51 |
| the checked ones in java are shown, not the others. | 23:39.01 |
| and in c++ I believe that you never need to declare them..? | 23:39.13 |
Robin_Watts | Hmm. | 23:39.20 |
| Can't comment on C++, but on Java, I thought we were safe. | 23:39.38 |
sebras | not when I was programming java back in the 1.4 days at least... | 23:40.05 |
Robin_Watts | Certainly I've had problems in java when I haven't declared that functions throw exceptions; the compiler would barf if I hadn't got the declaration right. | 23:40.26 |
sebras | true enough, but you are only guaranteed to see some of them. | 23:40.51 |
| have you ever seen a function declare that it throws NullPointerException? | 23:41.11 |
| that one is unchecked. | 23:41.15 |
Robin_Watts | But yes, having some way of defining at the prototype level what exceptions are thrown (or at least *that* some exceptions are thrown) would be a good thing. | 23:41.17 |
sebras | and need not be declared. | 23:41.20 |
Robin_Watts | sebras: Ah, right. | 23:41.27 |
| Currently the lib doesn't provide that, and I can't see how it could. | 23:41.43 |
sebras | mm, the error handling should be obvious. that is important. | 23:41.44 |
| no, I agree. | 23:41.51 |
| what about my second gripe? | 23:42.10 |
Robin_Watts | (I did debate about a "#define THROWS" so we could put 'THROWS' before any given prototype as an indication) | 23:42.42 |
| The second gripe; in our lib fz_throw ... fz_catch ... the catch catches all exceptions. | 23:43.47 |
| The exception type is still subject to change. | 23:44.19 |
| The exception used to be an int and a character buffer. | 23:44.40 |
sebras | and now..? | 23:44.47 |
| struct? | 23:44.50 |
Robin_Watts | where the int was the exception type, and the character buffer was the formatted string. | 23:44.57 |
| currently it's just a formatted string. | 23:45.08 |
| I'm tempted to go back to the struct (int/char[]) thing. | 23:45.41 |
| That way people can process specific types of exceptions in the catch clause. | 23:45.56 |
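A minimal sketch of the struct (int/char[]) idea in plain C, built on setjmp/longjmp. Every name here (ex_context, ex_throw, try_load, the EX_ constants) is hypothetical and chosen for illustration; this is not the MuPDF exception API.

```c
#include <setjmp.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical exception types, following the two suggested in the chat. */
enum ex_type { EX_NONE = 0, EX_OUT_OF_MEMORY, EX_OTHER };

/* Exception state: jump target, a type code, and the formatted message. */
struct ex_context {
    jmp_buf env;
    enum ex_type type;
    char message[256];
};

/* Record the type and formatted message, then unwind to the setjmp point. */
static void ex_throw(struct ex_context *ctx, enum ex_type type,
                     const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(ctx->message, sizeof ctx->message, fmt, ap);
    va_end(ap);
    ctx->type = type;
    longjmp(ctx->env, 1);
}

/* Stand-in for library code that may fail in different ways. */
static void load_object(struct ex_context *ctx, enum ex_type fail_with)
{
    if (fail_with == EX_OUT_OF_MEMORY)
        ex_throw(ctx, EX_OUT_OF_MEMORY, "cannot allocate %d bytes", 4096);
    if (fail_with == EX_OTHER)
        ex_throw(ctx, EX_OTHER, "broken xref entry %d", 7);
}

/* The caller owns ctx, so no local of this function is modified
   between setjmp and longjmp. Returns the type that was caught. */
enum ex_type try_load(struct ex_context *ctx, enum ex_type fail_with)
{
    ctx->type = EX_NONE;
    ctx->message[0] = '\0';
    if (setjmp(ctx->env) == 0) {
        load_object(ctx, fail_with);   /* "try" body */
        return EX_NONE;
    }
    /* "catch" clause: ctx->type lets the caller treat out-of-memory
       differently from other errors; ctx->message holds the text. */
    return ctx->type;
}
```

With a type code alongside the string, a catch clause can branch on ctx->type (for example, to handle out-of-memory differently) instead of treating every failure the same way.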
sebras | most of the time the throw/catch things in mupdf are not really exceptions, they are mainly there to print stuff prettily. :) | 23:47.07 |
| the malloc()-may-not-return-NULL patch removed _a_ _lot_ of error handling code that one next to never tests. | 23:48.28 |
| I have read a bit about memento and that seems quite useful as it assures that the error code is actually tested. | 23:49.06 |
Robin_Watts | sebras: yes, my hope is that that should ensure we can spot cleanup errors as they are introduced. | 23:52.17 |
| tor8: If you read this, what do you think of the idea of having exceptions having a type and a string, rather than just a string? | 23:53.39 |
| I don't envisage a huge number of possible variations. | 23:54.11 |
| 'out of memory' and 'other error' maybe :) | 23:54.20 |
sebras | Robin_Watts: there may be some early errors as well, e.g. 'unable to open file', etc. | 23:56.18 |
| Robin_Watts: I won't complain too much until I have seen what you (and tor8?) come up with. actually I tried to look at your tree over at gitweb, but none of the exception ideas were there... | 23:57.09 |
Robin_Watts | sebras: The exception routines should be there in my latest version (on the failing_allocs branch) | 23:57.38 |
| But the code doesn't use any exceptions yet. | 23:58.01 |
| http://ghostscript.com/~robin/pdf_xref.c <- That has the exception code used in it. | 23:58.38 |
| That's a single file I converted to get tor8's opinion. | 23:58.55 |
sebras | alright, a bit too tired right now (after an icelandic concert), so I'll defer that until tomorrow. but I will most certainly take a look. | 23:59.30 |