MuPDF IRC logs

	<<<Back 1 day (to 2018/03/22)	20180323
fredross-perry	go it, that makes sense. Thanks.	00:41.54
	got it	00:41.58
vtorri_	hello	10:15.17
mubot	Welcome to #mupdf, the channel for MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line.	10:15.17
vtorri_	normally, a PDF file has a magic number : %PDF	10:15.31
	but I have found a file that has not that magic number	10:15.54
	it's ILDA instead	10:16.01
kens	Its not actually a requirement as I recall, just a recommendation	10:16.15
vtorri_	is it a valid magic number for a PDF file ?	10:16.16
kens	Also, PDF files are permitted to have any amount of 'junk' at the start of the file	10:16.42
vtorri_	kens ok, thank you	10:16.56
kens	Though again I think there's a recommendation that the %PDF should appear in the first 1Kb	10:17.01
	The PDF 1.7 spec says 'should be' %PDF....	10:17.50
	But a lot of PDF creators play fast and loose with anything that says 'should' as opposed to 'must'	10:18.21
vtorri_	the file has indeed %PDF-1.5 at the beginning of the file	10:18.52
kens	Umm, so where's the ILDA ?	10:19.11
vtorri_	at the beginning, first 4 bytes	10:19.32
kens	Before the %PDF ? THat's valid	10:19.42
vtorri_	pdf file : https://www.alchemistowl.org/pocorgtfo/pocorgtfo15.pdf	10:19.57
	yes ILDA*********%PDF-1.5	10:20.09
kens	I can't find the refrence in the spec right now, but memory says that the %PDF only has to appear 'somewhere' in the first 1Kb	10:20.43
	Ah, its section 3.4.1, specific to Acrobat viewers	10:25.04
	Compatibility and Implementation Limits	10:25.13
vtorri_	looking	10:26.23
kens	That's appendix I	10:26.43
	Bah I mean H	10:26.53
	page 1102 in my copy of the 1.7 spec	10:27.13
vtorri_	kens https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf <-- that specs ?	10:27.34
kens	Umm, no.	10:27.59
	That's the ISO version, I mean the last one actually released by Adobe itself	10:28.14
	I try not to use the ISO specifications because they are less readable	10:29.00
	That one seems to have dropped the compatibility and implementation limits appendix altogether	10:29.40
	https://www.adobe.com/devnet/pdf/pdf_reference_archive.html	10:32.05
	The link "PDF Reference, Sixth Edition, version 1.7"	10:33.06
vtorri_	yes, i've finally found that link too	10:35.01
	thank you	10:36.31
kens	NP	10:38.45
tor8	Tamir_Evan: http://git.ghostscript.com/?p=user/tor/mupdf.git;a=commitdiff;h=608ee461efa6eb08e75bdf97038c13c08ae35d8f;hp=e44b4ab40a8a1bb69e1684aa4b781bb71d0d998d	10:59.34
	Tamir_Evan: could you try that and see if it solves your last mingw build problem?	10:59.51
Tamir_Evan	tor8: I'll give it a try. It may take a while.	11:03.17
	tor8: O.K., I'm a bit new to this. I cloned your repository, and checked out the 'wip' branch. When I tried to do 'git submodule update --init', it complained about failing to clone 'git://git.ghostscript.com/user/tor/thirdparty-*.git'. How do I init/update the sub-modules?	11:18.36
tor8	Tamir_Evan: it's much easier if you clone the master repository, and add my repository as a second remote	11:19.03
	the thirdparty submodules are using relative paths, so get confused if you clone user repos	11:19.23
Tamir_Evan	O.K., I'll try that.	11:19.41
tor8	$ git remote add tor git://git.ghostscript.com/user/tor/mupdf.git	11:19.57
	$ git remote update tor	11:20.02
	then you should have a tor/wip remote branch	11:20.09
Tamir_Evan1	tor8: Build fails with "undefined reference to `ftello'"/"undefined reference to `fseeko'", when trying to build cmapdump.	11:51.06
	tor8: In my fork of MuPDF I used fseeko64/ftello64 when compiling for MinGW (https://github.com/TamirEvan/mupdf/commit/61f5a397563ae023622322f6c00c6fbdb017799e), but (a) I don't know how to test that it actually works, beyond compiling successfully, (b) Mingw-w64's toolchain builds without a problem to begin with (it's mingw.org's toolchain that doesn't support these functions), ...	11:51.30
	tor8: and (c) I'm not sure on how willing you're going to be to add just for that toolchain.	11:51.44
Tamir_Evan	tor8: ... to add another ifdef just for ..., I meant.	11:53.18
	tor8: Looking at my bug report (https://bugs.ghostscript.com/show_bug.cgi?id=698783), I could have saved my self the trouble of trying to build it, as I said there that mingw.org's MinGW toolchain doesn't support ftello()/fseeko() either.	12:00.30
tor8	Tamir_Evan: not very willing, to be honest	12:00.33
Tamir_Evan	I understand.	12:00.48
tor8	you could probably get around it by adding -D_ftelli64=ftello64 to the XCFLAGS	12:01.03
Tamir_Evan	I'll give it a try.	12:01.48
	O.K., that workaround worked. Building current master with the added 'XCFLAGS+=-D_ftelli64=ftello64 XCFLAGS+=-D_fseeki64=fseeko64' builds successfully the Win32 viewer and mutool, and the viewer seems to work. Thanks.	12:26.47
tor8	Tamir_Evan: great!	12:27.01
Tamir_Evan	tor8: That leaves the need for defining CXX for the cross-compiler toolchains (including the two that target Windows through MinGW), now that HarfBuzz is compiled as C++. I reported a bug for it at https://bugs.ghostscript.com/show_bug.cgi?id=698926, and fixed it in my own fork (for the Mingw-w64 cross-compilers only) at https://github.com/TamirEvan/mupdf/commit/8cc29302edcb7e2f0fce9d4bc224e151a3fb847e.	12:43.48
tor8	Tamir_Evan: ah, yes, that's something that's slipped through the cracks	12:44.53
	the w64_x86-cross-mingw32 and w64_amd64-cross-mingw32 targets are the ones you use I assume?	12:45.12
	is the c++ compiler named "i686-w64-mingw32-g++" (and similar for the other ones?)	12:45.42
	Tamir_Evan1: would the top commit on tor/master work for you?	12:48.01
Tamir_Evan	tor8: Yes, that should work, though I can only really test the w64_x86-cross-mingw32 one. Let me check.	12:50.37
	tor8: The Win32 viewer builds and works, when cross-compiling under Cygwin with 'OS=w64_x86-cross-mingw32', and doesn't show any linkage to the Cygwin dll, so it looks good. The build of mutool fails with "undefined reference to `_mkgmtime32'", but that is not related to the C++ compiler issue. So that commit resolves the bug for me. Somebody else is going to have to test the other cross-toolchains.	13:09.37
vtorri_	Tamir_Evan i'll do later some tests with msys2 + mingw-w64	13:14.03
fredross-perry	robin, sebras - that code generated two logs (see the bug). one is ISGSEGV which I can't figure out. The other is indeed a lock failure, but i think resuled in an unresoled JNI exception which causes problems.	14:11.05
Robin_Watts	fredross-perry: OK, so both of the ones he quotes are SIGABRT	14:12.43
fredross-perry	hmm.	14:13.13
Robin_Watts	So, it does look like it's a problem with the lock of the bitmap failing.	14:14.55
	It vaguely looks like on an abort it lists all the threads in progress.	14:15.14
	Looking at the first SIBABRT...	14:15.39
fredross-perry	There's two log files attached. One of them is clearly a lock failure.	14:15.47
Robin_Watts	We have Thread-53534 which is in new AndroidDrawDevice	14:16.19
fredross-perry	The docs for AndroidBitmap_lockpixels say give four result, one of them can be JNI exception.	14:16.30
Robin_Watts	Thread-53519 looks like the stderr handler, and 53518 is the stdout handler.	14:17.19
fredross-perry	So we probably need to clear that before we proceed to throw a new one. I think that's what the log suggests.	14:17.20
Robin_Watts	We have a RenderThread, which looks like a system thing.	14:18.11
	AsyncTask #1 looks idle.	14:18.38
	so does #2.	14:18.48
	and all the others look OK too.	14:19.33
	OK, so we only have one thread that's actively doing a lock as far as I can see.	14:19.47
fredross-perry	If it's a JNI exception, that makes me wonder if there's something wrong with incoming bitmap itself.	14:20.08
Robin_Watts	fredross-perry: Yeah, I'm just rereading the log to see if I can parse that better.	14:20.33
fredross-perry	If you're looking at log(1), search for "failed to".	14:20.50
	I put the width, height and bbox there just for info. They look fine.	14:21.20
Robin_Watts	hmm.	14:21.24
	I see "Failed to lock bitmap"	14:21.35
	then width, height bbox.	14:21.46
	then "warning: items left on stack in draw device"	14:21.57
	Then I see "Failed to lock bitmap again"	14:22.09
fredross-perry	I don't know what that is.	14:22.14
Robin_Watts	s/again"/"again/	14:22.21
fredross-perry	the first thee lines are me doing LOGE.	14:22.32
	the second 'Failed to lock bitmap' if from fz_throw	14:22.49
Robin_Watts	Eh? We don't fz_throw that.	14:23.13
fredross-perry	if (!info)	14:23.32
	{	14:23.33
	LOGE("Failed to lock bitmap");	14:23.34
	LOGE("width %d height %d ", width, height);	14:23.36
	LOGE("bbox %d %d %d %d ", bbox.x0, bbox.y0, bbox.x1, bbox.y1 );	14:23.37
	fz_throw(ctx, FZ_ERROR_GENERIC, "Failed to lock bitmap");	14:23.38
	}	14:23.39
Robin_Watts	fredross-perry: I thought we had an abort() in there before it would have reached that point?	14:24.08
fredross-perry	I assume that's caugh in fz_catch, no, this is your proposed change.	14:24.34
Robin_Watts	fredross-perry: Right, so I'm confused.	14:24.53
fredross-perry	as am i.	14:25.00
	the lock fails, and we're in if (!info).	14:25.16
	I log a couple of things.	14:25.22
	then we fz_throw.	14:25.29
Robin_Watts	The plan, I thought, was to take on my change (which is a permanent change), but also to add an 'abort();' if the lock fails.	14:25.34
fredross-perry	and in fz_catch, we jni_rethrow.	14:25.41
	My bad, I don't recall it that way.	14:25.59
Robin_Watts	I probably wasn't clear.	14:26.05
	The problem with just taking on the change is that now I don't know where this is all crashing.	14:26.20
	Is it retrying the render and crashing a second time through?	14:26.29
	Let me look at the docs for lockPixels again.	14:27.06
fredross-perry	What it looks like is that the lock fails, but our attempt to jni_rethrow after that if foiled by an outstanding JNI exception.	14:27.25
Robin_Watts	fredross-perry: Right.	14:27.58
	AndroidBitmap_lockPixels() returns an int. I can't see what possible values that is allowed to take in the docs, can you?	14:28.22
fredross-perry	https://developer.android.com/ndk/reference/group___bitmap.html	14:30.50
	there's 4 possible values.	14:31.06
Robin_Watts	Ah!	14:32.36
	OK, yes, I agree then.	14:32.41
	It looks like we are calling it with an exception outstanding.	14:32.49
	So, we need to find out why... brb.	14:32.59
fredross-perry	oh wait. androidDrawDevice_lock does jni_throw when it fails. ??	14:34.28
Robin_Watts	Yes.	14:40.56
	Ah.	14:43.08
	The idea is that stuff calls lockNativeDevice, and if it returns non-zero, an exception has already been raised.	14:43.44
	So in newNativeAndroidDrawDevice, we end up raising 2 exceptions.	14:43.57
	Ahem.	14:44.39
	The idea is that stuff calls lockNativeDevice, and if it returns NULL, an exception has already been raised.	14:44.47
	So in newNativeAndroidDrawDevice, we end up raising 2 exceptions.	14:44.55
sebras	Robin_Watts: correct, I agree.	14:45.12
Robin_Watts	pants....	14:45.17
sebras	Robin_Watts: did i miss that we raise two when I reviewd this yesterday?	14:45.26
	if so I'm sorry.	14:45.28
Robin_Watts	sebras: no, I've spotted a deeper problem.	14:45.46
sebras	but I truely think we should make a build with _only_ abort() and no exception handling.	14:45.51
	Robin_Watts: just to rule out that we've made other changes that make things work.	14:46.04
Robin_Watts	lockNativeDevice is supposed to return a pointer to an info block.	14:46.16
	but that can permissibly be NULL.	14:46.24
	So detecting NULL as meaning "I had a problem" won't work.	14:46.38
sebras	Robin_Watts: when may it be NULL?	14:48.01
Robin_Watts	/* Some devices (like the Displaylist device) need no locking, so have no info. */	14:48.51
sebras	Robin_Watts: oh, when using the Display ListDevice?	14:48.53
	Robin_Watts: but what we want to detect is when info->lock() indicates error.	14:49.10
	ah, but you're saying that the returnvalue from lockNativeDevice() can be NULL so _its_ return value can not be used? that I agree with.	14:49.46
Robin_Watts	Yes.	14:50.13
sebras	Robin_Watts: perhaps we can have aan out argument indicating error?	14:50.22
Robin_Watts	sebras: We could.	14:50.28
sebras	that ought to work.	14:50.29
Robin_Watts	Or we could use a special pointer value.	14:50.34
sebras	Robin_Watts: that seems silly.	14:50.42
Robin_Watts	#define INVALID_INFO ((NativeDeviceInfo *)1)	14:51.02
sebras	that could work.	14:53.24
	but I'm not sure I like it. I need to pop out for a bit.	14:53.39
fredross-perry	When you say 'make a build with _only_ abort()', you mean without your proposed changes?	14:56.54
	and, abort90 only when the lock fails?	14:57.03
	s/abort90/abort()	14:57.15
	ok, i am also popping out, back in 30	14:57.52
Robin_Watts	fredross-perry, sebras: OK, here is my latest attempt:	15:01.14
	http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=7c0efe69eebb4c5e635e82d69df818678403a736	15:01.15
	I hope that should fix the "double raised error" thing.	15:01.32
	and it should cope with the fact that display lists pass NULL on locking too.	15:01.48
	I reckon the right thing to do is for fred to roll this into his new build, but for him to put an 'abort();' call in in addition as soon as possible after each lockPixels() call.	15:02.46
fredross-perry	You mean after each -failing- lockPixels call.	15:22.56
Robin_Watts	yes.	15:24.34
fredross-perry	(trivial) compile error on line 3153 - should return 0	15:30.23
Robin_Watts	fredross-perry: Thanks. I've got my head up gs at the moment.	15:34.14
fredross-perry	Can't unsee that image...	15:34.29
	I'll try and simulate some fails and see what happens.	15:35.02
sebras	I just pushed/popped back in.	16:06.30
fredross-perry	With the new code, and without abort(), I now get an exception up in my Java code that I can handle. I'm testing this by simulating an error every 100 calls or so. When I add abort(), I get a log that I'm not sure is useful? https://pastebin.com/dZ2BNakw	16:15.46
	The abort() is right after the call to AndroidBitmap_lockPixels().	16:15.47
	except that it confirms we're in Java_com_artifex_mupdf_fitz_android_AndroidDrawDevice_newNative	16:21.25
sebras	fredross-perry: it is important because we can clearly see the call to abort() is happening.	16:21.25
	fredross-perry: and if you run ndk-stack we should be able to know in what file/line all calls such as AndroidDrawDevice_newNative+356 originate.	16:22.19
	fredross-perry: by your own testing we only prove that what we _think_ is the problem is resolved by robins proposed fix.	16:23.16
	fredross-perry: but we haven't proved that this is what the reporter is experiencing, right?	16:23.33
	Robin_Watts: if lockNativeDevice() discovers that self is not a NativeDevice, should we set *err = 1 ?	16:24.12
fredross-perry	We have not. Well, based on the last test, we know that they had at least one case of the lock failing.	16:24.14
sebras	Robin_Watts: can that even happen? do we need to check this for _every_ device call?	16:24.46
fredross-perry	good catch.	16:24.46
Robin_Watts	sebras: I'm not sure we're gaining any knowledge given that fred is "simulating failures".	16:24.55
sebras	Robin_Watts: you mean gain knowledge of the reporters issue? we don't.	16:25.57
	Robin_Watts: hence I think adding the abort() is crucial.	16:26.10
fredross-perry	FYI, I am not simulating failures by causing AndroidBitmap_lockPixels to fail. Just going through the 'AndroidBitmap_lockPixels failed' logic.	16:26.28
Robin_Watts	sebras: Some logging, plus the abort, yes.	16:26.40
	fredross-perry: Now I'm even more worried :)	16:27.01
fredross-perry	lol	16:27.08
sebras	Robin_Watts: yup, I don't know what is smart to log, but I think you and fredross-perry is hashed that out yesterday.	16:27.14
Robin_Watts	If you're letting the lockPixels call happen, and then pretending it failed, you'll have bitmaps being left locked.	16:27.27
fredross-perry	So to simulate, I could call AndroidBitmap_lockPixels with a crap object and let it really fail.	16:27.43
Robin_Watts	fredross-perry: No. If you call it with a crap object, it'll crash.	16:28.02
	a) I'm not sure simulation helps us at all.	16:28.13
sebras	fredross-perry: if AndroidBitmap_lockPixels() handle crap objects.	16:28.13
Robin_Watts	b) If you want to simulate fail, then you should fail instead of calling AndroidBitmap_lockPixels.	16:28.38
fredross-perry	That's what I am doing.	16:28.48
	Sorry if I was not clear.	16:28.54
Robin_Watts	right.	16:29.08
	So, that's not how I read what you wrote above. No worries. Sounds like we're on the same page now though.	16:29.33
fredross-perry	If a true call fails, I would expect we won't leave smething unlocked.	16:29.37
	same page.	16:29.46
Robin_Watts	If a true call fails, I would expect we won't leave something locked :)	16:30.06
fredross-perry	agreed.	16:30.19
Robin_Watts	So let's send the abort + logging build to silver and see what he comes back with.	16:30.36
	One thing... in your logging before it was reporting width and height as being the same... really ?	16:30.50
fredross-perry	Yes. So I don't have to reconfig the bitmaps on rotation.	16:31.38
	What do you propose to log, then?	16:32.17
Robin_Watts	It might be good to log the value passed back by lockPixels too.	16:32.27
	We could also try logging "lock" and "unlock" on every lock and unlock, just to be sure we aren't somehow failing to nest properly.	16:33.40
fredross-perry	right. and maybe everything in the 'info' structure.	16:34.04
sebras	do we know if AndroidBitmap_lockPixels() itself throws an exception?	16:34.31
Robin_Watts	Also, how often are we getting the "warning: items left on stack in the draw device" error?	16:34.32
sebras	perhaps we ought to check that too?	16:34.37
	(so we don't accidentally overwrite it)	16:34.46
Robin_Watts	sebras: I don't believe it does throw an exception. Certainly I haven't seen that in the docs.	16:35.09
fredross-perry	sebras - we'll log the return value of AndroidBitmap_lockPixels()	16:35.28
sebras	Robin_Watts: me neither. I'm looking at its source code at the moment.	16:35.39
	it returns ANDROID_BITMAP_RESULT_BAD_PARAMETER if env or bitmap == NULL	16:36.19
	it returns ANDROID_BITMAP_RESULT_JNI_EXCEPTION if the call to android::bitmap::lockPixels() returns NULL	16:36.41
	and ANDROID_BITMAP_RESULT_SUCCESS otherwise.	16:36.48
Robin_Watts	sebras: Right. That's why I suggested logging the return code.	16:37.47
fredross-perry	So I guess the idea here is that either (a) the tester will get ONLY this abort(), or (b) they will get this abort and maybe something else.	16:40.18
	We hope it's (a) but it coud be something else. Maybe we should abort() on unlock too?	16:40.56
sebras	Robin_Watts: fredross-perry: I just verified in the android source that AndroidBitmap_lockPixels() will never throw exceptions of its own.	16:56.44
fredross-perry	ok, that	16:56.58
	's good	16:57.02
sebras	it seems the way it fails is that some internal functions return NULL and then this prompts AndroidBitmap_lockPixels() to return ANDROID_BITMAP_RESULT_JNI_EXCEPTION	16:57.16
Robin_Watts	fredross-perry: unlock can't fail, right?	16:57.27
fredross-perry	I would think not.	16:57.47
Robin_Watts	So aborting on unlock would be bad :)	16:58.05
fredross-perry	but it does return an int.	16:58.29
Robin_Watts	fredross-perry: I'd like to see the tester get the abort, and for us to look at the log to understand what value was returned.	16:58.42
fredross-perry	for unlock?	16:58.54
Robin_Watts	fredross-perry: For lock.	16:58.59
fredross-perry	yes, doing that.	16:59.05
Robin_Watts	but if unlock returns an int, we could do the same there sure.	16:59.12
fredross-perry	except without the abort. ?	17:00.28
Robin_Watts	The abort wouldn't hurt, cos it should never go off, right? :)	17:00.45
sebras	unlock can fail.	17:00.59
Robin_Watts	If we think that lock is failing with -2 (exception already raised), then we could maybe call 'CheckException' beforehand to see whether there really was an exception set on entry.	17:01.00
sebras	but only if the internal state in the bitmap is NULL (Igues sshouldn't happen)	17:01.16
Robin_Watts	sebras: How can it fail?	17:01.16
	right, we should never hit that, I hope.	17:01.29
sebras	Robin_Watts: or if env or bitmap == NULL so basically sanitychecks.	17:02.04
	Robin_Watts: we should take case in newNativeAndroidDrawDevice() so that jbitmap is never null.	17:02.43
	Robin_Watts: ah, we already do in android_AndroidDrawDevice_newNative!	17:03.04
fredross-perry	So to recap, if lock fails, we'll log some stuff and abort. If unlock fails, we'll log some stuff, but not abort.	17:04.19
Robin_Watts	You can abort both sides if you want.	17:15.22
	I just hope we should never see the unlock abort, cos it should never fail.	17:15.36
fredross-perry	ok	17:20.48
	take a look at the branch 'fred/bug699048' on my mupdf tree. It's Robin's new code plus abort() and logging I've put it.	18:05.05
Robin_Watts	http://git.ghostscript.com/?p=user/fred/mupdf.git;a=shortlog;h=refs/heads/bug699048	18:06.24
	fredross-perry: It looks like you've taken more commits from me that you should.	18:06.34
	no, ignore that.	18:06.57
fredross-perry	right.	18:07.02
Robin_Watts	I can't see where you've taken ANY commits from me :)	18:07.16
fredross-perry	that's probably true. Just compare this to what you started with to see what I am doing.	18:08.07
	because of the aborts and logging, I'm considering this temporary anyway.	18:08.33
Robin_Watts	http://git.ghostscript.com/?p=user/fred/mupdf.git;a=blame;f=platform/java/mupdf_native.c;h=42f1a1a816036308f2e64dbb1bb3ebc43498e8e4;hb=f163ae00dfc4ced669713739da92a13b3ef054c6	18:11.10
	That doesn't have any of my changes in.	18:11.17
fredross-perry	try adding me as a remote on your locl system and fetching my branch then.	18:12.01
Robin_Watts	I see no difference.	18:12.56
	What SHA are you on locally?	18:13.13
	git log -1 \| head -1	18:13.28
fredross-perry	I'm in repos/mupdf.git under my login. Is that the issue?	18:14.02
Robin_Watts	bug699048 is set to the same point as master.	18:14.15
	which hasn't changed since february.	18:14.37
	I don't believe you've pushed properly.	18:14.42
fredross-perry	looks that way.	18:15.19
	oh. I am on a detached head.	18:16.16
Robin_Watts	git push -f origin HEAD:bug699048	18:17.22
	(assuming that origin = your repo on casper)	18:17.38
fredross-perry	look again	18:17.51
Robin_Watts	better!	18:18.04
fredross-perry	ok	18:18.07
	I did master:bug699048 before, but my local was detached	18:18.40
Robin_Watts	looks plausible.	18:19.00
fredross-perry	I also am now getting exceptions in the calling Java code, which I handle and can recover from these things. (not with the aborts, of course).	18:20.05
Robin_Watts	cool.	18:20.16
	Forward 1 day (to 2018/03/24)>>>

Log of #mupdf at irc.freenode.net.