MuPDF IRC logs

20180515
tor8	sebras: why do you deep copy the resource dict in pdf_filter_page_contents in the non-sanitize case?	08:10.05
Robin_Watts	tor8: Without that he was seeing resources not being copied across.	08:16.46
tor8	Robin_Watts: different issue, I think. just pdf_keep_obj would suffice, why the deep copy is my question	08:17.20
Robin_Watts	tor8: Oh, I see. I'll shut up.	08:17.37
tor8	and another, larger, question would be "should we also filter out unused resources when only cleaning?"	08:19.18
avih	tor8: do you want the jsarray qsort implementation?	08:22.18
	several weeks ago you said you don't mind taking it if i wrote one, and i've pinged you twice on it, but got no comment from you.	08:23.13
	i don't mind at all if you implement it or modify my code, it's just orders of magnitude faster than the current sort, and with relatively small LOC, so i think mujs would benefit from it.	08:26.20
tor8	avih: I've been on vacation, just got back	08:27.20
avih	it also barely uses additional memory - only additional log2(N) on average	08:27.22
moolc	tor8: got the epub problem e-mail?	08:27.39
avih	oh, hope you enjoyed it :)	08:27.40
tor8	it's supposed to sort in place; I'm not thrilled about creating the temporary 'stack' array	08:28.15
	so I've been meaning to rewrite it, but haven't had time	08:28.28
avih	it does sort in place	08:28.29
	the stack is of yet-to-be-partitioned ranges	08:28.58
tor8	it would be even faster if not using js arrays for the intermediate arrays	08:29.00
	which I can do if I reimplement it in C	08:29.19
avih	correct, and i did have such an implementation, but it's only marginally faster - barely measurable, and you have the additional memory to manage yourself	08:29.33
tor8	avih: did your c version use js array objects?	08:30.14
avih	the original implementation uses a fixed stack array of 2log2(64 bits) items, and just resets the sort on overflow. and because the pivot is random, it'd succeed with this stack size sooner rather than later. however, this doesn't take into account an adversarial compare function - which the user can supply - which can cause it to reset over and over and never finish	08:31.33
	so it has to account for the worst case - so the stack must be growable	08:32.20
	tor8: no, the c version used the js alloc function and threw on errors.	08:32.55
	the js array is actually an elegant approach. you get the memory management for free, and it's actually a small part of the runtime. most of the time is at the partition function anyway. the stack is used an order of magnitude less than comparisons	08:34.53
	and it's a rather tiny array. for 1000 items the stack array is ~20 items	08:37.20
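[Editor's note: a minimal sketch of the idea described above, an in-place quicksort with a random pivot and an explicit stack of yet-to-be-partitioned ranges. It is written in plain JavaScript for illustration and is not avih's code. The names `quicksortInPlace` and `swap` are hypothetical, and deferring the larger side of each partition (which keeps the stack logarithmic) is an assumption of this sketch, not something the log confirms about the original.]

```javascript
// In-place quicksort driven by an explicit stack of pending [lo, hi] ranges.
// Returns the maximum stack length seen, to show how small the stack stays.
function quicksortInPlace(items, compare) {
  var stack = [[0, items.length - 1]]; // yet-to-be-partitioned ranges
  var maxDepth = 1;
  while (stack.length > 0) {
    var range = stack.pop();
    var lo = range[0], hi = range[1];
    while (lo < hi) {
      // Pick a random pivot and move it to the end (Lomuto partition).
      var p = lo + Math.floor(Math.random() * (hi - lo + 1));
      swap(items, p, hi);
      var store = lo;
      for (var i = lo; i < hi; i++) {
        if (compare(items[i], items[hi]) < 0) {
          swap(items, i, store);
          store++;
        }
      }
      swap(items, store, hi); // pivot is now at its final position
      // Defer the larger side on the stack, keep looping on the smaller one.
      if (store - lo < hi - store) {
        stack.push([store + 1, hi]);
        hi = store - 1;
      } else {
        stack.push([lo, store - 1]);
        lo = store + 1;
      }
      if (stack.length > maxDepth) maxDepth = stack.length;
    }
  }
  return maxDepth;
}

function swap(a, i, j) { var t = a[i]; a[i] = a[j]; a[j] = t; }
```

[The items are only ever swapped within the array, so the sort itself is in place; the extra memory is the small stack of ranges.]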
tor8	avih: yeah. a ~20 item array that doesn't use the full blown js_Object array implementation with properties would be even nicer though.	08:38.24
avih	oh? is there a size limit where it becomes "full blown" array?	08:39.10
tor8	no. but even a simple 20-element js array will be pretty heavy and slow compared to what we can get away with in C	08:40.02
	it will always do string-to-number and number-to-string conversions and self balancing binary tree accesses for every array access	08:40.30
avih	yes, i do get this	08:40.37
	but iirc i measured it and it's barely measurable	08:40.48
	the diff, that is	08:40.55
tor8	right. I can see how it could be dwarfed by other things	08:41.10
avih	we're talking (roughly speaking) 300ms vs 320 ms for 10k items array	08:41.29
	(or was it 40k? anyway, very very small diff, and way simpler implementation - worth it in my book)	08:42.13
	sec, let me try to dig up the c implementation and measure again.	08:44.54
	actually, i'll just test it with fixed stack big enough for the sorts i'll try, and throw on overflow. it won't get faster than that with c stack implementation.	08:46.33
moolc	source/pdf/pdf-object.c:1258:16: warning: variable ‘doc’ set but not used [-Wunused-but-set-variable]	08:58.38
tor8	moolc: saw your email, haven't had time to look yet	08:59.38
avih	tor8: the only diff is at the stack macros at the top: https://pastebin.mozilla.org/9085451	09:00.19
	for 80k items, the c code is ~680 ms, and the jsarray code is ~710ms (roughly on average with several repetitions)	09:01.15
moolc	tor8: cool	09:01.33
avih	tor8: obviously it could also be implemented with recursion, but then the stack could grow quite big on worst case scenarios. i think the own-stack solution is way nicer.	09:10.25
	way nicer to use the heap for such things imo.	09:10.53
	(for these 81920 items, the max stack depth was 34)	09:26.32
	fwiw, the array i was testing - without a comparison function, so the internal toString based comparison is used - is this, doubled 11 times:	09:32.44
	var arr = ["1000X Radonius Maximus","10X Radonius","200X Radonius","20X Radonius","20X Radonius Prime","30X Radonius","40X Radonius","Allegia 50 Clasteron","Allegia 500 Clasteron","Allegia 51 Clasteron","Allegia 51B Clasteron","Allegia 52 Clasteron","Allegia 60 Clasteron","Alpha 100","Alpha 2","Alpha 200","Alpha 2A","Alpha 2A-8000","Alpha 2A-900","Callisto Morphamax","Callisto Morphamax 500","Callisto Morphamax 5000","Callisto Morphamax 600","Callisto Morphamax 700","Callisto Morphamax 7000","Callisto Morphamax 7000 SE","Callisto Morphamax 7000 SE2","QRS-60 Intrinsia Machine","QRS-60F Intrinsia Machine","QRS-62 Intrinsia Machine","QRS-62F Intrinsia Machine","Xiph Xlater 10000","Xiph Xlater 2000","Xiph Xlater 300","Xiph Xlater 40","Xiph Xlater 5","Xiph Xlater 50","Xiph Xlater 500","Xiph Xlater 5000","Xiph Xlater 58"];	09:32.46
	hmm.. it looked smaller in my editor :)	09:33.34
moolc	avih: as opposed to what? (IOW where does it look larger?)	09:41.47
avih	looks bigger at the irc paste, i.e. sorry for the spam-ish paste :)	09:42.18
	in my editor it's 4 lines...	09:42.44
moolc	avih: i pity you... my editor _is_ my irc client	09:43.09
	and mail and news...	09:43.16
avih	you pray to the emacs gods?	09:43.47
moolc	avih: hope https://boblycat.org/~malc/scratch/viper2.png answers your question	09:45.35
avih	well, i don't pity you, even if i should :)	09:46.20
	i say everyone uses what's best for them.	09:46.53
moolc	avih: we'd be living in paradise were that assertion true	09:50.38
	last i checked we aren't	09:50.45
tor8	avih: try this on for comparison: https://pastebin.mozilla.org/9085454	09:50.47
	that's using the system qsort with a temporary js_Value array	09:50.59
avih	moolc: correction: what they think is best for them :)	09:51.57
moolc	tor8: http://ix.io/	09:52.03
	avih: warmer! :)	09:52.16
avih	:)	09:52.29
	(though it was implied, TBH :) )	09:53.09
tor8	moolc: hmm, I shall try to remember that one	09:54.03
moolc	avih: irc is notoriously bad at properly conveying sarcasm, innuendo and intentions	09:54.09
	tor8: thought it's right up your alley :)	09:54.26
avih	tor8: wouldn't this break for two js_States in two threads sorting concurrently?	09:54.33
tor8	avih: yes, hence the TODO: qsort_r	09:54.55
avih	ah	09:55.03
tor8	but before I start getting into that nastiness, I figured it'd be worth having your input	09:55.30
avih	sure. sec	09:55.42
	tor8: i believe qsort_r/s are way less available than plain qsort. i did consider using the native qsort, and it's possible too with your own context reachable from the items, just wasn't worth the effort IMO.	09:56.59
	tor8: definitely worth it. about 10x faster - ~70ms for 80k items, and on the face of it the sort seems correct.	10:01.47
	(with mingw gcc 7.3 64)	10:02.30
	now i should look at what it does.	10:03.02
	hmm.. so you use O(N) additional memory	10:05.36
	probably worth it though	10:06.03
	it does use "internal" implementation knowledge though. fair enough for a built in implementation, but i _think_ an implementation which uses the mujs API couldn't get away easily with this approach	10:08.17
	but yeah, definitely nice that it bypasses the "official" accessors	10:09.10
	quite the overhead it seems. about time you implement js array as c array for some subsets of js arrays :)	10:10.42
moolc	tor8: btw. have you ever considered using Symbola instead of Charis? it doesn't have variants, but other than that... (on the plus side your symbols will be covered by it too)	10:13.46
avih	tor8: does your implementation account for empty items? i think it doesn't. for an empty item i think it uses the last non empty one.	10:19.18
	a correct and more efficient implementation would be to: pass #1: count the empty and undefined items and collect the non-empty items. pass #2: sort only the non-empty ones. pass #3: copy the sorted items to the beginning. pass #4: append the number of undefined items. pass #5: clear the rest of the array to the original length	10:21.45
	for very sparse (and big) arrays, such as maybe hash table implementations, sorting only non empty items could yield orders of magnitude faster sort	10:24.15
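[Editor's note: a sketch of the multi-pass scheme avih outlines, written in plain JavaScript for illustration rather than against the mujs C internals. The function name `sortSparse` is hypothetical. It still walks 0..length to find the items, so it shows the ordering (defined, then undefined, then empty) but not the O(defined items) collection step, which would need a way to iterate only the stored keys.]

```javascript
// Sort only the defined items; put undefined next and holes (empty slots) last.
function sortSparse(arr, compare) {
  var len = arr.length;
  var defined = [];
  var undefinedCount = 0;
  // Pass 1: collect defined values, count explicit undefineds, skip holes.
  for (var i = 0; i < len; i++) {
    if (!(i in arr)) continue;          // hole: nothing stored at this index
    if (arr[i] === undefined) undefinedCount++;
    else defined.push(arr[i]);
  }
  // Pass 2: sort just the defined values.
  defined.sort(compare);
  // Pass 3: copy them back to the beginning of the array.
  var k = 0;
  for (; k < defined.length; k++) arr[k] = defined[k];
  // Pass 4: append the undefined values.
  for (var u = 0; u < undefinedCount; u++) arr[k++] = undefined;
  // Pass 5: delete the rest so the tail is empty again; length is unchanged.
  for (; k < len; k++) delete arr[k];
  return arr;
}
```

[Only the defined items take part in comparisons, so the compare function is never called with undefined or for an empty slot.]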
tor8	avih: yes. an "external" implementation can't use the trick of using js_Value directly (since that exposes too many dangerous surfaces for clueless users)	10:24.35
avih	yeah	10:24.44
tor8	sorting sparse arrays is going to be problematic no matter what	10:25.24
avih	yeah, but it'd be O(N) rather than multiplied by log(N). could be very meaningful	10:25.57
tor8	but the 'flattening' when copying into the temporary array could possibly handle that reasonably easily	10:26.01
	handling sparse arrays is 'implementation defined' according to the spec	10:26.15
avih	yes, since you go over them anyway.	10:26.17
tor8	I could iterate over the keys rather than 0..array.length	10:26.32
	but then I should still only be looking at numeric keys	10:26.47
avih	not sure i follow. you do the same allocation (worst case all items are non empty), and same going over the items, but only copy non empty ones, and count empty and undefined.	10:27.17
	copy non empty and non undefined	10:27.41
tor8	comparing undefined items is fast in this implementation	10:28.03
inflex	Does Tamir Evan show up here very frequently?	10:28.32
tor8	handling sparse arrays would create a shorter temporary array	10:28.33
avih	yes, but potentially it multiplies the number of comparison by log(N) for no reason. all undefined go after the defined, and before the empty	10:28.39
inflex	Wanted to just pass him an update to fix the git submodule issue we ran in to last night	10:28.52
avih	but yes, if you count the undefined and empty before mallocing the temp array, then sparse array would only add O(defined items) temp memory	10:30.19
	careful that if you realloc the temp array, then you start getting issues with longjmp protections	10:31.11
	i _think_ volatile won't be enough on such case, as the qsort implementation doesn't necessarily work with volatile types	10:31.49
	(i did go there during my attempts)	10:32.17
	but if you have an efficient way to count the properties, even if it includes undefined but non-empty items, then it would be good enough for sparse arrays.	10:33.50
	tor8: btw, re js arrays as c array, i think i have a reasonably useful approach. not strictly c array - at least not for sparse arrays, but still way more efficient than the current code. the implementation is continuous memory of items ordered by their index value - where the index is part of the item itself. access is binary search for the item, which would be O(1) for non sparse array. splice etc would be implemented with memmove + rewrite of the indices for	10:46.41
	moved items. you'd get O(N) for splice, O(1) for push/pop, O(1) for access of non sparse arrays, and O(log(N)) for access of sparse arrays, but pure integers in c and no need to compare string property names.	10:46.42
	where N is the non empty items. so you get sparse for 100% free	10:50.26
	the nice thing about it is that sparse doesn't need special considerations, and for non sparse it _is_ a c array. i think it's really quite nice.	10:51.32
inflex	That's quite elegant.	10:51.46
avih	and if the array is sparse relatively evenly, then that O(log(N)) access will become typically O(1), because you start the search from a position relative to the array length. e.g. if length is 100 and it has 10 non-empty items spread evenly, and you want to access index 80, you start at index 8, and in 1-2 iterations you find your actual item	10:55.47
	actually, the continuous memory would be just the sorted indices where each points to a jsvalue item. it will have a lot less overhead than the current implementation, and splice/copy etc would just be a matter of memcpy/mov and rewriting a bunch of indices without ever touching the actual values	11:02.30
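[Editor's note: a rough sketch of the storage layout avih describes: the indices of the non-empty items are kept sorted in one contiguous list, each paired with its value, and lookup is a binary search with a direct hit for dense arrays. It is written in JavaScript purely to illustrate the data structure. `IndexedStore` and its methods are hypothetical names, not mujs API, and the sketch leaves out the property attributes (readonly, getters/setters) and the search-start heuristic for evenly sparse arrays.]

```javascript
// Sorted indices in one contiguous list, each paired with its value.
function IndexedStore() {
  this.indices = []; // sorted array indices of the non-empty items
  this.values = [];  // values[k] belongs to indices[k]
}

// Return the slot k where indices[k] === index, or -(insertion point) - 1.
IndexedStore.prototype.find = function (index) {
  var n = this.indices.length;
  // Dense fast path: in a non-sparse array, item i sits in slot i.
  if (index < n && this.indices[index] === index) return index;
  var lo = 0, hi = n - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    if (this.indices[mid] === index) return mid;
    if (this.indices[mid] < index) lo = mid + 1; else hi = mid - 1;
  }
  return -lo - 1;
};

IndexedStore.prototype.get = function (index) {
  var k = this.find(index);
  return k >= 0 ? this.values[k] : undefined;
};

IndexedStore.prototype.set = function (index, value) {
  var k = this.find(index);
  if (k >= 0) { this.values[k] = value; return; }
  k = -k - 1; // insertion point that keeps indices sorted
  this.indices.splice(k, 0, index);
  this.values.splice(k, 0, value);
};
```

[Iterating `indices` visits only the stored items, which is the O(defined items) iteration property mentioned later in the log.]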
tor8	using a c array for js array objects has a few issues that complicate matters	11:03.42
	we still need to store key properties since each property can have metadata attributes, like readonly, no-delete, getter/setter accessor functions, etc.	11:04.42
avih	it's not strictly a c array. just continuous memory of index values which point to the actual values.	11:04.50
	right.	11:05.07
	i'm not very familiar with the internal implementation, but i _think_ the approach, in a nutshell, is relatively solid. of course, the devil is always in the details, but still, solid approach is a nice starting point	11:06.16
tor8	the benefit would be had from accessing the js_Property as a mixture of array and tree lookups, instead of just tree lookups	11:06.18
	take a look in jsvalue.h the struct layouts there should be telling enough	11:06.35
	all values (the stuff stored on the stack) are js_Value structs (which are 16 bytes)	11:07.40
avih	"telling enough" - depends who's listening :)	11:07.49
tor8	values that point to objects point to a js_Object struct which lives on the heap	11:08.01
	and each js_Object has a js_Property *properties binary tree of properties	11:08.23
avih	i roughly know that much, yes	11:08.42
tor8	where each js_Property is a string name, some attributes, and a getter/setter	11:08.43
	doing js arrays as c arrays would mean having two structures for holding properties	11:09.06
	a tree, and an array	11:09.10
avih	correct	11:09.15
tor8	I have experimented with it, but the code got massively more complicated last time I tried	11:09.38
avih	the tree for non array-index properties, and the continuous indices for index items	11:09.48
tor8	and it didn't cope with all the weird corner cases of property attributes	11:09.49
	but something simpler than I tried then might work (I was hoping to avoid the js_Property stuff altogether)	11:10.12
avih	by "tried" you mean the arraybuffer branch (or whatever its name was)?	11:11.10
	also, jsproperty could be enhanced a bit to use the "c array" if the object is an array	11:11.49
	tor8: yeah, that's an unfortunately ugly way for qsort context. regardless, i don't get how it handles empty values. what does js_getindex(J, 0, i); do for an empty item?	11:26.18
	(in what you just pushed)	11:26.33
	wouldn't it just use undefined? and then fill the array with undefined value for every empty value? that would be incorrect IMO	11:28.48
	not to mention way more memory used for sparse arrays after sort	11:29.18
tor8	avih: correct on all points (undefined)	11:31.10
avih	you should just collect the non empty values, then copy them back, then do js_setLength to number of collected items, then back to the original value	11:31.26
tor8	all js_array functions behave similarly; sparse arrays are not handled well by the js spec	11:31.30
avih	all implementations sort defined items first, then undefined, then empty last. and you could do that easily too	11:32.08
	tor8: (untested) https://pastebin.mozilla.org/9085466	11:37.42
tor8	it would mean bloating the sortslot array with yet one more field	11:38.02
	but given its alignment requirements, that's probably not an issue	11:38.38
	so let me give it a try	11:38.42
avih	tor8: sorry https://pastebin.mozilla.org/9085467	11:39.06
	yes, the temp array is O(len) rather than O(non empty), but not worse than your approach, and if you have an efficient way to count the number of non empty items (or even all the own properties) in O(defined properties), then you can allocate a more efficient amount of memory.	11:43.06
	oh, and btw, one of the biggest advantages of my suggested approach for continuous memory arrays is that iterating them is O(defined items).	11:45.25
	which is highly useful for map, filter, etc.	11:46.09
tor8	hm, try tor/master	11:51.34
	avih: ^	11:51.37
avih	tor8: is delindex for the rest more efficient than two setlength?	11:53.17
	(i do understand setlength can imply those delindex or equivalent)	11:53.49
tor8	no, but it is clearer (and it matches the same behavior as the initial pulling-to-temp-array)	11:53.54
	setlength is slightly optimized, and can be faster than the equivalent delindex loop	11:54.18
avih	then IMO put that as a comment and use setlength	11:54.46
tor8	setlength has a special case for handling sparse arrays, other than that it still calls delindex behind the scenes	11:54.48
avih	yeah, i assumed so, but possibly can do that with less searching.	11:55.17
tor8	unfortunately that optimization involves creating an iterator object (which mallocs a lot of stuff, since an iterator has to be stable if properties are deleted while it is running)	11:56.10
avih	behavior wise, i don't think it's different as far as the user can tell.	11:56.11
	gotcha.	11:56.21
tor8	I was going to say -- premature optimization :)	11:57.37
	now if you want to handle sparse arrays properly, you'd iterate the properties instead of the array length when creating the temporary array too	11:58.04
avih	tor8: i _think_ it doesn't behave fully well with empty items.	12:01.21
	(empirically). trying to come up with a test case. in a nutshell though, it seems it can make empty items disappear or become empty strings.	12:01.59
	tor8: no, it's ok. it's concat which removes empty items.	12:05.09
	without concat, it does "put" the empty ones at the end, just after the undefined ones.	12:05.37
	so you push it to master?	12:07.04
	<tor8> now if you want to handle sparse arrays properly, you'd iterate the properties instead of the array length when creating the temporary array too <-- that's always the best way if you can do so efficiently, isn't it? but in all your iteration functions you always go from 0 to len, i.e. including empty items.	12:09.14
tor8	avih: you'll have to define "iterate the properties" to be more specific -- which properties? ;)	12:10.24
	the own properties, or also those of the inherited objects	12:10.33
	the spec is pretty clear about iterating over the integers, not the properties	12:11.01
avih	hmm..	12:12.27
	i never thought of array items as inherited, though i guess inheritance should work here the same as everywhere else	12:13.12
tor8	and the setlength trick will only work for actual Array objects	12:13.41
	not other objects which you can pass to Array.prototype.sort.apply()	12:13.55
	a = Object.create([5,4,3]); a.sort(); a is not an array with magic .length handling	12:15.27
	but it has Array.prototype.sort in its prototype chain	12:15.43
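[Editor's note: tor8's one-liner expanded into a runnable snippet for a standard ECMAScript engine. The variable names other than `a` are introduced here for illustration. It shows that `a` is an ordinary object that merely inherits from an array, so assigning to `.length` does not truncate anything.]

```javascript
var a = Object.create([5, 4, 3]); // plain object whose prototype is an array
a.sort();                         // Array.prototype.sort found via the prototype chain

var isRealArray = Array.isArray(a);                      // false
var ownKeys = Object.keys(a);                            // sorted values land on `a` itself
var protoUntouched = Object.getPrototypeOf(a).join(","); // "5,4,3"

a.length = 0;           // just creates an ordinary own property on `a`
var stillThere = a[0];  // 3: no magic truncation happened
```

[This is why a length-based cleanup only works for actual Array objects, as tor8 notes.]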
avih	huh	12:15.53
tor8	trying to be clever when JS is involved is guaranteed to backfire :)	12:16.36
avih	lol	12:16.42
	so you're pushing the delindex thingy to master?	12:17.00
tor8	I will	12:17.57
	inflex: sorry, missed your question first time around. relative git submodules probably don't work nicely with github's automated buttons.	12:20.20
	then again, github's automatic button voodoo seldom does what I want/expect anyway :)	12:20.53
	moolc/malc/malc_: (for the logs) I need the variants, and it looks more like computer modern than Charter	12:24.07
	not to mention the incompatible licensing	12:26.08
inflex	tor8, already sorted it out, just had to do a bit of manual adjustment to the .gitmodules and it's working fine now	12:37.29
	( made them point directly to the Artifex repos on github - https://github.com/inflex/mupdf/blob/master/.gitmodules )	12:38.19
sebras	tor8: wrt deep copy vs. keep: yes it probably ought to have been keep.	13:00.40
	tor8: wrt the question of cleaning out resources or not... I wasn't entirely sure what we wanted, after a quick discussion with robin I understood it like the clean shouldn't clean out the resources while sanitize should. perhaps I was mistaken?	13:01.40
tor8	'clean' is intended to pretty-print the syntax	13:02.39
	'sanitize' does fancy processing and removes redundant state changes	13:02.54
sebras	ok, is sanitize only operating inside content streams?	13:03.15
	because we do cleaning of duplicate objects etc as well.	13:03.30
tor8	both the 'clean' and 'sanitize' operate on content streams	13:03.32
	and recursively the resources used by the content stream	13:03.58
	so that type3 fonts and patterns and other XObject forms will also have their content streams cleaned/sanitized	13:04.15
sebras	tor8: right, but if we ignore content streams for the time being. clean doesn't remove any other objects that are redundant otherwise, right?	13:08.06
	tor8: if that's the case perhaps we should only clean out redundant resources if we actually sanitize the stream?	13:08.34
	tor8: or add another -d flag..? ;)	13:08.43
tor8	sebras: the -c flag to mutool clean touches every page's content stream data, nothing else	13:09.14
	sebras: yeah, it's probably fine to just leave the resource dict as-is for '-c'	13:09.37
	but then there is no way to get it to remove unused resources other than the full-blown -s sanitizing filter	13:09.59
sebras	tor8: ah, right. I forgot about the -c flag.	13:11.00
tor8	this stuff only happens when asking for -c or -s	13:11.20
moolc	tor8: i wanted to extract fonts from one pdf and failed.. i'm 99% positive some reincarnation of mutool was able to do that, is my memory once again at fault?	13:11.30
sebras	tor8: mmm, so in that case clean without flags leaves resources and content streams intact, clean -c removes redundant resources but leaves the content stream intact, while clean -cs would remove redundant resources and also sanitize the stream. that seems reasonable somehow.	13:11.44
tor8	as of today: -c: pretty-print content streams, leave resource dictionary intact. -s: recreate content stream and remove redundant state changes, and remove unused resources from the resource dictionary.	13:13.01
	-s combined with -g will drop unused resources from the file	13:13.12
	I guess -c and -s are conflicting flags	13:13.42
sebras	tor8 sounds to me like -sg and just -s are not different wrt content streams and their resources (they do affect _other_ type of objects differently of course)	13:15.10
tor8	clean=syntax|state might be a better way to phrase it	13:15.21
	-s only recreates the /Resources dictionary (by removing stuff that is unreferenced from the content stream)	13:15.53
	-g eventually removes unreferenced resource objects from the file, if nothing else uses them	13:16.18
	moolc: it should still work.	13:16.43
sebras	tor8: right, so with -s the resource objects would still be there but no longer referenced by the resource dict.	13:17.31
	tor8: this is a bit of a mess. :)	13:17.40
tor8	yes. it is.	13:17.48
	many of these 'mutool clean' flags should just be separately available operations that act on a pdf_document	13:18.12
	not baked into the magic pdf_save_document options	13:18.22
moolc	tor8: what exactly should? '$ mutool huh pdf'? (i've completely forgotten what "huh" should be ;( )	13:18.33
tor8	mutool extract	13:18.40
moolc	tor8: and object should be? (root?)	13:21.29
tor8	https://mupdf.com/docs/manual-mutool-extract.html	13:22.44
sebras	tor8: do you want me to make another commit to replace pdf_deep_copy_obj() with pdf_keep_obj()?	13:23.39
tor8	sebras: I already have one on tor/master	13:23.55
sebras	tor8: ok.	13:24.04
tor8	I was just wondering if there was a deeper reason that I didn't understand :)	13:24.23
moolc	tor8: well sure, but i started asking because NOTHING is produced when i run mutool extract on this pdf here	13:24.40
sebras	tor8: no, I was probably just confused as usual.	13:24.46
tor8	then there are likely no fonts in it. try mutool info.	13:24.54
moolc	tor8: Fonts (4):	13:25.20
	all four are Type0 if that's of any relevance	13:25.57
tor8	Type3 would be the relevant type ... since they don't have an embedded font file.	13:26.50
	the fonts could also be non-embedded in which case extracting them would be impossible	13:27.13
moolc	tor8: they are embedded	13:27.39
	the one page pdf is a whopping 400K in size, and i can read it just fine	13:27.54
	and llpp reports that they are four subsets of calibri	13:28.13
	Fonts (4):	13:32.45
	1(3 0 R):Type0 'CIDFont+F1' Identity-H (11 0 R)	13:32.45
	1(3 0 R):Type0 'CIDFont+F2' Identity-H (19 0 R)	13:32.45
	1(3 0 R):Type0 'CIDFont+F3' Identity-H (27 0 R)	13:32.45
	1(3 0 R):Type0 'CIDFont+F4' Identity-H (35 0 R)	13:32.48
		13:32.50
	is what mutool info says	13:32.54
tor8	moolc: what does 'mutool show $file 11' say?	13:35.08
moolc	tor8: http://ix.io/1anO	13:35.49
tor8	ah. it's not working because the FontDescriptor is not a numbered object.	13:36.19
	quoth the specification: FontDescriptor dictionary (Required except for the standard 14 fonts; must be an indirect reference)	13:36.55
moolc	tor8: so, in essence, the pdf producer that msword uses blows goats?	13:37.28
tor8	moolc: in a word, yes.	13:38.41
moolc	tack expletive	13:39.00
tor8	moolc: you can get the data using 'mutool show' though	13:39.30
	mutool show -b $file 11/DescendantFonts/FontDescriptor/FontFile2	13:40.01
moolc	tor8: no doubt, but all of this is just a measuring dicks contest between me and an ex co-worker, my cv in pdf form was 10x times smaller than his, and i wanted to know why	13:40.18
	tor8: /tmp	13:41.38
	- ~/x/rcs/git/mupdf/build/native/mutool show -b $file 11/DescendantFonts/FontDescriptor/FontFile2	13:41.39
	null	13:41.39
		13:41.39
	tor8: https://boblycat.org/~malc/scratch/bravocnntypographers.png	13:44.02
sebras	tor8: 10-line bugfix on sebras/master	13:47.40
	tor8: it clusters well.	14:13.20
tor8	sebras: hmm. do you have a file for that?	14:14.43
	sebras: it might make sense to just assume a sane default if it's set to 0 instead	14:14.59
	like 1000 or 2048 (type1/truetype default values)	14:15.09
sebras	tor8: ok, that's why I asked for a review. :)	14:16.11
	tor8: the test file is in the bug report, but it is a fuzzed file, so I doubt it will make you happy.	14:16.32
tor8	if (units_per_EM == 0) units_per_EM = (ft_kind(face) == TRUETYPE) ? 2048 : 1000; should do the trick I think	14:20.56
sebras	tor8: no warning?	14:23.19
tor8	sebras: nah. nobody looks at warnings anyway... :P	14:23.39
	I think you could probably just get away with setting it to 2048 no ft_kind check required	14:24.14
*sebras*	learns that he's a nobody.	14:24.15
tor8	I just need to check the freetype implementation to see where it gets/sets the units_per_EM for non-truetype files	14:24.35
	I suspect it can't be 0 for type1/cff files	14:25.27
	so we only need to worry about the truetype case (where 2048 is a decent fall-back)	14:25.42
	if (units_per_EM == 0) units_per_EM = 2048 (with or without warning)	14:26.06
sebras	tor8: new commit, clustering as we speak.	14:29.26
tor8	sebras: you don't need the (FT_Face) cast. font->ft_face is a void*	14:30.51
sebras	tor8: what cast?	14:31.43
	tor8: look again.	14:31.47
tor8	FT_Face face = (FT_Face) fontdesc->font->ft_face;	14:31.50
sebras	tor8: that's not the code from sebras/master... ;)	14:32.01
tor8	not now it isn't... :)	14:32.29
	sebras: LGTM.	14:37.34
sebras	tor8: done!	14:39.26
avih	tor8: btw, some numbers comparing the new native sort with pure js implementation for 10k items of pure strings: 1. for default toString sort, the native is ~40 times faster than the js sort (8ms / 300ms). 2. for trivial compare function (return a > b ? 1 ... ) it's 10x faster (20 ms / 200 ms - weird, not sure why it's 200 here and 300 without a function). 3. for slightly less trivial but still fairly fast compare function it's only twice as fast (150ms/300ms).	15:22.56
	that's the "internal" compare function which yields 300ms: return (a = ''+a) > (b = ''+b) ? 1 : a == b ? 0 : -1;, and that's the external trivial compare function which yields 200ms: return a > b ? 1 : a == b ? 0 : -1;	15:25.29
	i guess the assignment and creation of new string is expensive-ish, though it's required if one doesn't know in advance the items are strings.	15:26.26
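[Editor's note: the two compare functions quoted above, wrapped in named functions so their behaviour can be compared. The function and variable names are introduced here for illustration. With non-string items the two give different orders, which is why the string conversion is needed when the item types are not known in advance.]

```javascript
// The "internal" compare: coerce both sides to strings first.
function compareAsStrings(a, b) {
  return (a = '' + a) > (b = '' + b) ? 1 : a == b ? 0 : -1;
}

// The trivial "external" compare: compare the values as they are.
function compareDirect(a, b) {
  return a > b ? 1 : a == b ? 0 : -1;
}

var asStrings = [10, 9, 1].sort(compareAsStrings); // [1, 10, 9]
var direct = [10, 9, 1].sort(compareDirect);       // [1, 9, 10]
var builtin = [10, 9, 1].sort();                   // [1, 10, 9], same order as asStrings
```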
	maybe it could be special cased (concat of string and an empty string) to avoid creation of a new value on such case	15:27.32
	anyway, just fyi. thanks for adding qsort :)	15:28.56
inflex	Tamir_Evan, I sorted out that .gitmodules problem for myself. Not sure if you care, but I can share the file if you preferred.	17:23.04
Tamir_Evan	inflex: I saw the changes you made, and will probably do something similar in my own repo in the near future.	17:23.27
inflex	np, thanks again for your work, things have progressed nicely	17:23.46
Tamir_Evan	inflex: Thank you for bringing the issue with the repo forking to my attention, and for making use of my repo.	17:25.49
inflex	Well, wasn't really an issue with your fork per se, more just seems the way the original mupdf was done. btw, what sort of changes did you have to do in order to get MinGW to build the GL version on Windows?	17:27.13
Tamir_Evan	inflex: It was mainly changes to the 'Makethird', and a few lines in the 'Makefile'. They were mainly done here: https://github.com/TamirEvan/mupdf/commit/019f3a09e7adb3b5c023b2067c3e43af8050b33d , but some of the changes were done in other commits (both before and after).	17:42.07
inflex	okay, so not overly dramatic, but certainly important. Surprised it wasn't in there by default.	17:45.24
pihug12	Hi! I was wondering if "mupdf-android-viewer-1.13.0-universal.apk" & "mupdf-android-viewer-mini-1.13.0-universal.apk" on https://mupdf.com/downloads/ were generated from the same Git project?	20:17.33
tor8	pihug12: no, they are created from separate git repositories.	20:30.21
pihug12	- http://git.ghostscript.com/?p=mupdf-android-viewer-mini.git --> last commit for v1.13	20:35.55
	- http://git.ghostscript.com/?p=mupdf-android-viewer.git --> last commit for v1.12	20:36.05
	The first APK is built with the 2nd repository despite the "wrong" version?	20:37.02
tor8	pihug12: try pulling again.	20:37.54
pihug12	Seems good now :)	20:40.16
	And the tag is missing for v1.13 in the "mupdf-android-viewer-mini.git" repository	20:41.00
	The version from the APKs seems to be still v1.12. Some strings may need to be updated in these files:	21:00.14
	- http://git.ghostscript.com/?p=mupdf-android-viewer.git;a=blob;f=app/build.gradle;hb=HEAD	21:00.25
	- http://git.ghostscript.com/?p=mupdf-android-viewer-mini.git;a=blob;f=app/build.gradle;hb=HEAD	21:00.33
tor8	pihug12: hm, yes, seems like a number or two has been missed	21:08.27
pihug12	Thanks!	21:11.35
tor8	I'll rebuild the apk binaries tomorrow.	21:11.56
pihug12	Is it possible to put the 1.13.0 tag on these last 2 commits?	21:11.56
	I think F-Droid builds are based on tags	21:12.42
tor8	pihug12: yeah, no problem.	21:12.51
pihug12	Cool! Perfect!	21:13.49
	Thanks for your time. I will check with F-Droid now.	21:14.13

Log of #mupdf at irc.freenode.net.