| <<<Back 1 day (to 2018/01/17) | 20180118 |
tor8 | sebras: fredross-perry: there is a function to immediately release all resources for a java Document without forcing the gc to run: Document.destroy() | 10:48.23 |
| that will call the finalizer and set the internal C pointer in the java wrapping object to NULL | 10:48.50 |
| fredross-perry: the fz_set_context function sets a global value. I don't understand how you mean to have it contain a flag for whether the most recently opened stream was an InputStream or OutputStream and then act on that information. | 10:50.32 |
| but that code seems to have gone now, so that's good :) | 10:51.35 |
titanous | tor8: do you mind taking a look at https://github.com/google/oss-fuzz/pull/1067 | 13:31.00 |
tor8 | titanous: in pdf_fuzzer.cc you can just call the fz-prefixed functions. fz_open_document_with_stream, fz_count_pages, and fz_drop_document. | 13:34.39 |
| just pass in "pdf" as the 'magic' argument | 13:35.07 |
| then you don't need to do the &doc->super stuff | 13:35.22 |
titanous | got it | 13:35.27 |
tor8 | and that will then also make it work for the other input formats we support | 13:35.59 |
titanous | what email address should I use for you? and who else should be CCed? they need to be Google accounts | 13:36.17 |
tor8 | titanous: I think sebras was looking into setting up a google account for this. he's been a bit busy the last couple of days so I don't know what's happened with that. | 13:37.04 |
| how much traffic is this going to generate? | 13:37.39 |
| you can also pass 0 to fz_new_pixmap_from_page_number and save a bit of memory | 13:39.06 |
| as the final argument | 13:39.12 |
titanous | there will be one email thread for each bug found, but they are very easy to filter into a folder or whatever | 13:39.30 |
| they will be from Google's monorail issue tracker | 13:39.43 |
| then there is a web UI that can be used to access the stats and bugs found | 13:39.57 |
| the bugs will automatically be closed after the issue is fixed | 13:40.20 |
| everyone that you want to have access to the details should be in the YAML file | 13:41.20 |
| I can add the initial set and then you can send PRs to oss-fuzz in the future | 13:41.34 |
tor8 | titanous: you can put tor.andersson@artifex.com in there to start with | 13:49.57 |
| artifex.com is a google account | 13:50.30 |
titanous | perfect | 13:50.35 |
| do you mind just posting a comment that says it looks good if everything is okay? | 13:50.58 |
| I've pushed all of the changes | 13:51.07 |
tor8 | paulgardiner: you around? | 14:21.21 |
| I finally got around to reviewing your signature-work branch | 14:21.43 |
| Only found some minor issues. Two typos spotted: "verifcation" in a comment in "Further changes to sig..." and "potention" in a commit message | 14:22.42 |
| fz_output_as_stream ought to be named fz_stream_from_output (and the internal callback can be either as_stream, to_stream, or stream_from, I have no strong opinion there) | 14:23.09 |
| I find that X_from_Y naming "x = X_from_Y(y)" reads better than "x = Y_to_X(y)" | 14:24.08 |
| in complete_signatures, you create a new stream from the output for each byte range | 14:24.50 |
| you could hoist that creation outside the for (unsaved_sigs) loop and just seek/read inside the loop | 14:25.10 |
| I have nothing to say about how the openssl API is used; I'll trust you on those bits :) | 14:26.45 |
paulgardiner | That's great. I can sort most of those out. The repeatedly creating a new stream might have to remain. Robin requested the condition that the output is not used while a derived stream is in existence. | 14:34.34 |
Robin_Watts | tor8: yeah, I felt it was important that we not have to cope with output and streams both existing at the same time and being used interleaved. | 14:41.18 |
tor8 | Robin_Watts: paulgardiner: that's not the case here though, or have I missed something? the only functions being called (other than create/seek/read/drop of the stream) in the loop is pdf_obj_parent_num and pdf_dict_getl | 14:46.34 |
Robin_Watts | tor8: I haven't looked at the latest patch, but I thought when I looked before it was an issue. | 14:47.18 |
tor8 | I'm talking about the first loop only | 14:47.25 |
| pdf_write_digest should probably create the stream from its fz_output inside the function itself | 14:48.06 |
| since that's where the interleaving happens | 14:48.12 |
| this case I'm talking about the second loop that does create/pdf_write_digest/drop | 14:48.54 |
paulgardiner | Ah, right. That makes sense | 14:54.00 |
| Robin_Watts: I've sorted the issue you spotted before. Just sounds like I've overdone it on the fix | 14:54.43 |
| tor8: I've addressed all your comments and pushed a new version to the branch. I'll retest on the cluster too. | 15:55.40 |
| cluster says it's okay. | 16:07.53 |
fredross-perry | sebras - you there? | 16:54.09 |
sebras | fredross-perry: I am. | 16:57.32 |
fredross-perry | So I am chasing down unreleased references to the document in my Java code. But I am still somewhat uncomfortable forcing GC. I think maybe we forgo the close() parts of the stream code, and I'll do the right thing with secureFS on my side. | 16:59.23 |
| thoughts? | 17:00.04 |
sebras | I sent this message privately earlier today "I spoke with tor8 and he reminded me about Document.destroy() which will call the finalizer and set the pointer to the native fz_document to NULL. that's something that you could call just prior to setting the last java reference to a document to null in order to trigger it to release all its resources. without having to wait for GC. I had forgotten about it, | 17:00.54 |
| sorry. :)" | 17:01.00 |
| fredross-perry: I have been forcing GC myself when I have needed to control freeing, so I have forced more work on myself as well... | 17:02.10 |
Robin_Watts | sebras: You've been "forcing gc" ? | 17:05.57 |
| By calling System.gc(); ? | 17:06.09 |
| In general, we should never "force gc". | 17:06.52 |
sebras | Robin_Watts: when I have been looking for reference leaks, yes. | 17:07.16 |
fredross-perry | Right. But in my case the only reason I am needing the _close to be called is so I can turn around and tell secureFS to close. I know it's good that we're revealing cleanup that needs to be done, but I also want to reliably call close. We can leave the _close functionality in, and say that my code has the issue. IOW, trying to get something reasonable pushed into mupdf soon. | 17:07.17 |
Robin_Watts | But we should free as much as we can without relying on gc, IMAO. | 17:07.20 |
fredross-perry | Robin - I agree. | 17:07.28 |
Robin_Watts | sebras: Ah, so for debug only, gotcha. | 17:07.31 |
Robin_Watts | scuttles back under rock. | 17:07.41 |
sebras | Robin_Watts: yes, for runtime I don't care if the resources used by a document are freed at a later point. | 17:08.19 |
| Robin_Watts: unless the resources are somewhat limited. | 17:08.26 |
| fredross-perry: so securefs has two close calls? one to close an opened file and one to unmount/close the entire securefs from future use? (until it is mounted/openend again of course) | 17:09.16 |
fredross-perry | sebras - not like that. The model for a single document is open, read, seek, close. Right now close is invoked via the fz_stream we create, which is subject to all the Java refs being cleaned up. I'd rather handle close explicitly on the Java side without relying on mupdf at this point. | 17:11.36 |
sebras | fredross-perry: ok, then I misunderstood you. what happens if you call Document.destroy() at the place where you would want to call Document.close()? | 17:12.51 |
| fredross-perry: given what I read in Document.destroy() this ought to free all its resources. | 17:13.25 |
fredross-perry | That's what I was doing, and the ref count for the document (in mupdf) was 2, not 1. Which has led to me hunting down unreleased refs on my side. | 17:13.59 |
| The other reason I don't want to rely on that is that even if I do find and squash all the refs, there's no guarantee that we'll introduce on in the future, and that seems fragile to me. | 17:14.59 |
sebras | fredross-perry: it was? you didn't mention that yesterday when I proposed nulling the references and forcing GC. | 17:15.02 |
fredross-perry | Don't get me wrong. Nulling references and forcing GC is leading to fixing some issues on my side. I just don't want to rely on getting that right in order for close() to get called. | 17:16.41 |
Robin_Watts | fredross-perry: Yeah, this is why we have the _destroy() calls, IIRC. | 17:17.29 |
sebras | Robin_Watts: and yesterday I had forgotten about that so I never suggested using it. (as I haven't even used it myself.) | 17:19.05 |
fredross-perry | destroy() just calls finalize() directly. | 17:19.05 |
Robin_Watts | fredross-perry: IIRC destroy calls finalize, and sets everything in the java object to NULL. | 17:19.49 |
| That way, if (when) the gc happens and the java object gets thrown away, the finalize call doesn't do anything. | 17:20.13 |
| It means that after a destroy we have a 'zombie' java object, if you like - a husk with no real contents. | 17:20.34 |
sebras | Robin_Watts: but if there is a reference leak of fz_stream somewhere in the native code then that stream (and hence the SeekableStream) would still stay around even if Document is now finalized. | 17:21.01 |
Robin_Watts | sebras: Yes. | 17:21.17 |
sebras | Robin_Watts: and fredross-perry seems to suggest that even if that happens he wants SeekableStream.close() to be called during Document.finalize(). | 17:21.56 |
Robin_Watts | My contention would that if we are calling destroy() reliably then it's easier to rule out that leaks aren't happening just because gc hasn't happened. | 17:22.06 |
sebras | Robin_Watts: i.e. even if we have reference leaks in the native code the java code shouldn't suffer. | 17:22.12 |
Robin_Watts | (cos ISTR that even calling System.gc() is not guaranteed to call all finalizers immediately) | 17:22.22 |
sebras | Robin_Watts: yes, if don't want to wait for GC then we must call .destroy(), but if I get fredross-perry correctly his worry is shielding the java code against reference leaks in the native code. | 17:23.07 |
Robin_Watts | I think we have to work on the basis that if the native code gets reference counts wrong, we're screwed. | 17:23.08 |
fredross-perry | I believe that I've got a stray Java reference to the document that still exists at the time Document_finalize gets called. That's what I am hunting down. | 17:23.11 |
sebras | Robin_Watts: no, you'd have to call java.lang.System.runFinalizers() too. | 17:23.25 |
Robin_Watts | fredross-perry: Right. | 17:23.26 |
fredross-perry | I don't think the issue is in the native code. | 17:23.33 |
Robin_Watts | fredross-perry: I can't remember if I got Memento working with the JNI code. | 17:23.48 |
| If so, that might help track leaks. | 17:24.07 |
| Cos for any given block the details stuff can store the history of where reference counts were changed. | 17:24.33 |
| tor8, sebras: http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=79aca97f74ebf4a9971ffbc02bebf36de4c475fc | 17:25.42 |
fredross-perry | Is there a Java-only way to track references? 'Cause I don't believe I've got a JNI issue at this point. | 17:26.06 |
| I'm going to the office now to fiddle w/miles. Back up later. | 17:26.45 |
sebras | Robin_Watts: so in lcmsart you are basically taking the new upstream and adding two shelly fixes? | 17:30.13 |
| I can't judge the fixes themselves, but the branching seems reasonable to me. | 17:30.32 |
| Robin_Watts: it clusters well I guess? | 17:32.02 |
Robin_Watts | it does. | 17:32.10 |
| sebras: yes, upstream + 2 fixes. | 17:32.20 |
sebras | Robin_Watts: are you familiar with securefs? | 17:33.40 |
Robin_Watts | at a highish level | 17:33.51 |
sebras | Robin_Watts: I want to know why it is important that the underlying stream is closed when we close a document? | 17:34.00 |
Robin_Watts | (i.e. I understand the intent) | 17:34.13 |
sebras | is it just a cleanliness matter or may something actually stop working? | 17:34.30 |
Robin_Watts | sebras: The idea is that an integrator can give us a securefs implementation, and we do file fetching/writing through that. | 17:34.50 |
| so perhaps it could wrap their own cloud service. | 17:35.02 |
| Or maybe it could do DRM etc. | 17:35.11 |
| If we don't close the underlying securefs thing, how can the cloud service know that it can free the connection (or the local cached copy of the file etc)? | 17:35.54 |
sebras | Robin_Watts: so this is similar to on Linux having a program with a file open on a file system that you want to unmount: you can't. | 17:36.29 |
Robin_Watts | sebras: sounds similar, yes. | 17:36.45 |
sebras | Robin_Watts: why would the cloud service want to free the connection before all the files have been closed? | 17:37.34 |
Robin_Watts | sebras: It would want to free the connection when all the files have been closed. | 17:38.58 |
sebras | Robin_Watts: I guess one way around this is to have the SeekableInputStream implementation have a SeekableInputStream.freeAllResourcesAndHaveFutureInterfaceCallsThrowIOException() that can be called...? | 17:39.11 |
Robin_Watts | I'm clearly missing something here in this discussion. | 17:39.31 |
| When we go to open a document, we open a "file" (which is actually a securefs stream), yes? | 17:39.51 |
sebras | Robin_Watts: I'm too so I have the eerie feeling that I'm missing something. | 17:39.55 |
| yes | 17:39.59 |
Robin_Watts | we read the document in. | 17:40.01 |
sebras | yes | 17:40.03 |
Robin_Watts | we keep that stream open all the time we have the document open. | 17:40.14 |
sebras | yes | 17:40.18 |
Robin_Watts | when we close the document, we close the stream. | 17:40.25 |
sebras | yes | 17:40.29 |
| or rather... the document drops its reference | 17:40.40 |
| but the stream is not closed. | 17:40.43 |
Robin_Watts | At that point, we want to make sure that the underlying stream really is closed, rather than it being "closed-pending-a-gc". | 17:41.08 |
sebras | right. | 17:41.19 |
| and barring no errors you would to Document.destroy() and that would happily close the stream. | 17:41.41 |
| but if there is a mistake in the code so that there is still a reference to Document, or in the JNI so that it has leaked a reference to the stream, then the stream would not close. | 17:42.09 |
Robin_Watts | hence rather than the document just doing: doc.stream = null;, we want to do doc.stream.destroy(); doc.stream = null; | 17:42.10 |
| sebras: Yes. | 17:42.19 |
sebras | agree, but fred's suggesting that doing so wouldn't be enough if there are also leaks of Document in java or stream in the native code. | 17:42.42 |
Robin_Watts | So the correct thing to do is to make sure there is no such mistake in the code. | 17:42.50 |
sebras | yes, but fred's suggesting that we might introduce another one. | 17:43.06 |
| and then the stream wouldn't be closed. | 17:43.12 |
Robin_Watts | another mistake? | 17:43.23 |
sebras | yes. | 17:43.26 |
Robin_Watts | Let's try not to introduce such mistakes :) | 17:43.33 |
sebras | there is one now, but even if we fix that we might introduce another one. | 17:43.51 |
Robin_Watts | Indeed. | 17:43.58 |
sebras | and that's true we've messed up reference counting before. | 17:44.00 |
Robin_Watts | but hopefully such things should be easy to spot, because securefs won't close properly. | 17:44.17 |
sebras | but I have a hard time both relying on reference counting to free resources and then subverting it in order to free resources when I want. | 17:44.40 |
| I would hope so. though finding reference leaks is pretty difficult sometimes. especially for complicated documents. | 17:45.30 |
| Robin_Watts: apparently some software can show all java object references: https://stackoverflow.com/questions/6155464/is-it-possible-see-all-the-references-to-an-object-in-execution-time#6155986 | 17:47.00 |
| not sure mupdf would build in netbeans though. | 17:47.11 |
Robin_Watts | sebras: Urm... | 17:47.52 |
| The way I think about this stuff is that all the native stuff should be freed directly, without relying on gc. | 17:48.45 |
| That's why we have .destroy. | 17:48.53 |
| The only thing that gc does for us is to get rid of the 'husk' java objects. | 17:49.23 |
sebras | right. | 17:49.29 |
| I'm not sure we call .destroy() in our own apps at the moment. | 17:49.40 |
Robin_Watts | So I don't feel we are ever "relying on reference counting to free resources and then subverting it in order to free resources when I want" | 17:49.51 |
| Are we not? We should be. | 17:50.01 |
sebras | well, .destroy() is kind of saying that I don't trust the reference couting to free resources for me, so do it NOW! :) | 17:50.23 |
| maybe I'm viewing it incorrectly. :) | 17:50.43 |
Robin_Watts | destroy() is saying "I want to not rely on when finalize() is called", yes. | 17:52.16 |
| IF we always call destroy, then we are NEVER relying on gc. | 17:52.41 |
sebras | true. | 17:52.50 |
Robin_Watts | (for native objects). | 17:52.53 |
| And that's what we should be shooting for, I believe. | 17:53.00 |
sebras | but would we want a Document.destroy_and_disconnect_the_java_objects_from_their_native_counterparts() as an alternative? | 17:53.32 |
Robin_Watts | sebras: Was there a lgtm about that commit ? | 17:54.01 |
sebras | i.e. in this case that the SeekableStream thingies are closed and so when the fz_stream callbacks are called they fz_throw? | 17:54.04 |
| Robin_Watts: lcms2? yes. the branching looks correct to me. I assume that it clustered well..? | 17:54.31 |
Robin_Watts | It did. | 17:54.37 |
sebras | Robin_Watts: LGTM. | 17:54.42 |
Robin_Watts | Ta. | 17:54.44 |
sebras | np. | 17:54.47 |
Robin_Watts | sebras: Document.close(), presumably calls stream.close(), right? | 17:55.30 |
sebras | no. | 17:55.37 |
Robin_Watts | Why not? | 17:55.49 |
sebras | Document.destroy() -> Document.finalize() -> fz_drop_document() -> pdf_drop_document_imp() -> fz_drop_stream() -> close-callback -> SeekableStream.close() | 17:57.03 |
| but if someone leaked a reference to the stream fz_drop_stream() wouldn't call the callback. | 17:57.23 |
Robin_Watts | Right, so let's not leak references to that stream. | 17:57.43 |
sebras | agreed, but then fred needs to find the reference leak that he is facing now. | 17:58.03 |
Robin_Watts | On platforms where Memento runs, and the details stuff works, you can see for any given memory block where the reference counts were updated. | 17:58.28 |
| so you can see precisely where they were leaked. | 17:58.39 |
| (with a C backtrace) | 17:58.59 |
sebras | Robin_Watts: yes, though I'm not sure it works in muso. | 17:59.09 |
Robin_Watts | sebras: Right, but that may be something we can solve. | 17:59.22 |
| fundamentally there is nothing about Memento that shouldn't work on Android. | 17:59.39 |
| We just need to figure out the right backtrace method to use. | 17:59.51 |
sebras | that might help. otherwise it is a lot of breakpoints and checking keep/drops against the backtraces of each triggered breakpoint (i.e. the old fashioned way) | 18:00.02 |
Robin_Watts | sebras: Supposedly android has backtrace and backtrace_symbols_fd, so using MEMENTO_STACKTRACE_METHOD=1 should work. | 18:01.47 |
sebras | that'd be nice. | 18:01.59 |
| then a way to build mupdf using memento is the thing missing. | 18:02.12 |
| Robin_Watts: would we need to include the java backtrace too? | 18:02.35 |
Robin_Watts | sebras: The stacktrace would run back as far as the JNI function, which should be enough I'd hope? | 18:03.20 |
sebras | right, and then point to the JVM code that executes the javabyte code presumably. | 18:03.55 |
Robin_Watts | sebras: I mean, we shouldn't need the java backtrace. | 18:04.20 |
| cos the C backtrace alone should tell us what java calls were being made. | 18:04.44 |
| (just not the callstacks for the java calls) | 18:05.15 |
sebras | Robin_Watts: if there are resource leaks then they mostly be pointing to the member that runs the page. | 18:05.30 |
| right. perhaps they are not needed. | 18:05.50 |
| ok. so we need to figure out how to build memento under android then, got it. | 18:06.06 |
Robin_Watts | -DMEMENTO=1 | 18:06.12 |
fredross-perry | Just to be clear, I don't believe there is an issue in JNI at this point. Rather, there is a Java doc reference that still exists at the time Document_finalize is called. It's related to the way I'm handling page-related objects in NUI. I just have to hunt them down. So when Document_finalize drops the doc, call_SeekableInputStream_close is NOT called, so we've lost the opportunity to tell secureFS to close. I | 18:10.06 |
| can handle getting secureFS closed another way. | 18:10.07 |
sebras | fredross-perry: but calling Document.destroy() ought to freel the internal structure of the object pointed to by that stray Document reference. so then the callback ought to be called. | 18:12.51 |
| despite the stray java Document reference. | 18:13.00 |
| at least the way I understand it. | 18:13.10 |
fredross-perry | that's what's not happening. Remember, Document.destroy() is in Java, which just calls finalize(). Finalize calls fz_drop_document. | 18:14.24 |
sebras | Robin_Watts: wee! the latest version of android NDK deprecates armeabi, mips and mips64! :) | 18:14.44 |
fredross-perry | do we need those? | 18:15.17 |
sebras | fredross-perry: yes, and then fz_drop_document calls pdf_drop_document_imp which calls fz_drop_stream. and that ought to close the stream. | 18:15.21 |
| fredross-perry: re: archs, I bet there aren't many users with devices on those platforms. but it certainly reduces build times! | 18:15.44 |
fredross-perry | it doesn't when I step thru it, though. | 18:15.46 |
Robin_Watts | fredross-perry: So there is another java object that is holding a Stream reference. | 18:16.02 |
sebras | or native code leaking a reference. | 18:16.21 |
Robin_Watts | sebras: Fred's claim is that it's a java object rather than a native one. | 18:16.41 |
| (AIUI) | 18:16.45 |
fredross-perry | that's my claim. | 18:16.50 |
sebras | Robin_Watts: src/mupdf-android-fitz/build/intermediates/ndkBuild/release/obj/local/mips64/libmupdf_core.a(memento.o): In function `Memento_showStacktrace': | 18:17.00 |
| Robin_Watts: src/mupdf-android-fitz/libmupdf/source/fitz/memento.c:794: undefined reference to `__cxa_demangle' | 18:17.10 |
| ugh. | 18:17.13 |
Robin_Watts | So... presumably we should be able to find that java object from Document. | 18:17.23 |
fredross-perry | Right, I am in the process of hunting it down. | 18:17.50 |
| heading back home now, more later. | 18:18.54 |
Robin_Watts | so we can hopefully add a 'destroy' method to that java object so: doc.mysterious_object.destroy() calls mysterious_object.stream.destroy(); and sets mysterious_object.stream = null; | 18:18.57 |
| So, yes, finding that object is key. I agree with your plan :) | 18:19.24 |
| sebras: Ugh. | 18:19.34 |
| MEMENTO is deliberately set up so we can use different stacktrace mechanisms easily. | 18:20.00 |
| possibly we need to copy the standard linux one (the one that uses backtrace etc) and then tweak it a bit. | 18:20.23 |
sebras | Robin_Watts: would we need to link with some special library to get this symbol? | 18:21.34 |
| seems like it ought to be part of libstdc++, but I'm not sure. | 18:22.17 |
Robin_Watts | sebras: not a clue, offhand, sorry. | 18:24.17 |
| We can probably live without demangling. | 18:24.37 |
| It just means we need to mentally decode the function names ourselves. | 18:24.49 |
sebras | right. I'll comment it out. someone suggested compiling __cxa_demangle from the libstdc++ manually, but that seems awful (because android may at one point introduce it!) | 18:25.53 |
Robin_Watts | sebras: #ifndef MEMENTO_ANDROID maybe. | 18:26.50 |
| MEMENTO_ANDROID is set already. | 18:27.10 |
sebras | Robin_Watts: fred: https://pastebin.com/raw/tXx7iEAa with this patch I get a memento build of libmupdf for android. hopefully that can be used to track down the problem easier. | 18:30.51 |
| I need to run out for a while, but when I'm back I'll try to sort that patch out cleanly with ifdefs. | 18:31.15 |
fredross-perry | sebras - just to confirm, in Document_finalize, if I force doc->refs = 1 before dropping the doc, it all works as I expect. So that's my evidence that I've got a stray doc reference. | 19:09.55 |
| I believe I've found my issue. com.artifex.mupdf.fitz.Page is holding the doc reference. I have to call destroy() on it. When I do, all is well, including no forcing GC. | 19:33.50 |
| sebras - OK finally, http://git.ghostscript.com/?p=user/fred/mupdf.git;a=commit;h=48c2ebc8f2a1ec009fd474169d0374b74c0d5304 | 23:46.32 |
| Forward 1 day (to 2018/01/19)>>> | |