Log of #mupdf at irc.freenode.net.

Search:
 <<<Back 1 day (to 2020/01/26)Fwd 1 day (to 2020/01/28)>>>20200127 
ator hisacro: PDF is a random access file format, and mupdf needs a "seekable" file. You have to save to a temporary file and open that, mupdf cannot open content from stdin.10:23.27 
  emfipp: we don't ship an installer. all the mupdf exe files are standalone and don't need any installation, just run them from wherever.10:24.08 
  emfipp: mupdf-gl takes a -r option on both windows and unixes10:24.38 
  emfipp: mupdf-gl does not (due to how it uses FreeGLUT for windowing) support extracting the dpi from the monitor10:25.24 
  not at the same level as mupdf-w32/x11, but we do look at it for scaling the UI10:27.11 
sebras ator: one day I shall install docker in a VM instead of running it native. :-P10:27.16 
avih ator: i've been thinking about optimized object array keys (index) access/storage. that's my rough initial approach: https://0x0.st/ir5D.txt thoughts?12:41.06 
  everything is o(1) and keys never move between optimized and normal storage12:43.05 
  i didn't consider enumeration yet12:49.15 
  the only cost this has (other than more logic) is unused memory if an array becomes sparse after it's been not-sparse. but i think it's a fair tradeoff, because we've already had to pay this price earlier anyway, so it doesn't add a new price, just doesn't has a "no refunds" policy13:02.35 
  just has* a "no refunds"..13:03.06 
  and even this can be modified, e.g. if sparseness becomes extreme, unoptimize the storage once but keep it frozen.13:04.28 
ator avih: with a bit of reshuffling, you/I/we/someone could add a "small integer" property array part in js_Object to go along with the current property tree13:10.12 
  this would end up being somewhat similar to lua's approach to their tables13:10.21 
  where they have an 'integer indexed array' part and a 'general hash table' part13:10.33 
avih ator: why small?13:10.59 
ator in Lua it the "small" part grows if you extend it by one13:11.23 
avih also, i don't think "-1" is common. i think the js definition of index being uint < 2^31-1 is good13:11.45 
ator but add an integer key that is larger than that, will make that key "sparse"13:11.48 
  each entry in the array will still need to be a full property, with attributes and getter and setter13:12.10 
  so basically the same as the current table, but a faster way to index the properties13:12.30 
  s/index/find/13:12.40 
avih of course, my only concern was whether it's stored at the tree or in an optimized c array13:12.47 
ator enumeration and deletion are the "tricky" parts13:13.17 
  a deleted property needs to not exist, and is different from a property with value 'undefined'13:13.34 
  which is where my previous attempts to do a quick hack for this have fallen down13:13.46 
avih ator: i'm not familiar with the lua mechanics especially if a key changes its storage, but i think my approach covers it well, and can cover the lua one too with tunning of the values13:14.03 
ator but at the time, I also attempted to keep the same behavior as chrome/spidermonkey where enumeration returned properties in the same order they were inserted13:14.20 
  I gave that up, which should make the task easier now13:14.39 
avih ator: i considered deleted. not yet enumeration (but should be easy, because order is implementation defined)13:14.41 
ator ... since the order is implementation defined13:14.59 
avih ator: how does lua handle a "non small integer" which later becomes "small" because gaps have been filled?13:15.45 
ator don't know, it may keep track of fill rate and rearrange storage if it grows too sparse13:16.04 
avih well, in my approach it's well defined.13:16.22 
  ator: did you look at my paste?13:16.45 
ator why 'freeze'? (and I assume this is not related to Object.freeze())13:17.04 
avih correct, it's related to optimized storage size13:17.44 
  and i freeze the size because it makes everything else O(1)13:18.06 
  otherwise you need O(N) when you delete13:18.31 
ator in lua, contiguous integer keys starting from 1 (would be 0 in mujs) always go in the array part, everything else goes into the hash part13:18.46 
avih (to maintain count, in case you delete ix size-1)13:18.57 
ator we could maintain a bit in the array part saying 'deleted' and enumeration would skip it13:19.17 
  keep count of how much in the array part is deleted, and if it grows beyond a certain ratio, repack the object properties13:19.35 
  by either shrinking (if they are all at the end) or moving everything past the first hole into the hash/tree part13:20.09 
avih well, i considered it allocated objects with NULL at the c array, but it can also be an array of objects with is_deleted key too13:20.14 
ator the array part will need to be some variant of js_Property struct (missing the name, left, right, and level fields; keeping atts, value, setter, getter)13:20.50 
avih yes13:21.12 
ator one bit in the 'atts' field can store whether the key exists or not13:21.42 
avih re repack, my approach avoids it. it's a one-way approach, but sure, expensive repacking is possible too13:21.52 
ator better than freezing it completely, I think13:22.10 
avih however, i think that the cost is likely to be small even without repacking13:22.21 
ator { 1:'foo', 2:'bar', 5000:'zot' } should still be unfrozen, IMO13:22.37 
avih and it's way simpler without repacking13:22.40 
  ator: in theory i agree (1/2/500), in practice this is hard to maintain, but i aborted this approach early, it might be simpler than i guessed13:23.27 
  the thing is, my approach has everything O(1). repacking etc can be O(N)13:24.26 
  (even just checking if repacking is needed)13:24.41 
  however, i do maintain sparseness in O(1), so if repacking depends on sparseness, checking it is O(1) even if the repacking is O(Nlogn)13:25.40 
  the thing i afraid the most is that some usage patterns can end in frequent expensive repacking.13:27.29 
ator implement without repacking, or check for repacking (which is O(1) -- just compare 'deleted' count with 'added') only when doing GC13:31.15 
avih 'deleted existing' with 'added non-existing', but yes13:33.46 
ator avih: would you keep the array part 2^m sized, and the sparseness check is whether it fits in the current allocation or is exactly one past the edge?13:34.05 
avih i thought the array size is 2^m, but it doesn't have to be. it's not part of the sparseness calc13:34.38 
  the sparseness only counts highest inserted ix and count13:35.05 
ator problem is when growing, the new array part may need to swallow up existing properties in the hash part13:35.08 
  or fall back to looking at the hash part13:35.14 
avih not sure i follow13:35.26 
ator cause if you start with {1:foo, 2:bar, 10:zot}13:35.38 
  you would have an array part with 1 and 2, hash part with 1013:35.48 
avih in my approach (without repacking), keys mover move between optimized and tree13:35.51 
ator but if you keep growing the array part up to 1013:35.53 
avih never*13:36.00 
ator then 10:zot would need to migrate13:36.02 
  yeah, you'd 'freeze' as soon asy ou see the 10 and the array part would never expand, right?13:36.22 
avih yes13:36.30 
  repacking would indeed involve moving between storages13:36.52 
  the thought is that if we're adding items which make it sparse, it's likely to remain sparse13:37.43 
  but as long as it's reasonably dense, we keep it optimized13:38.05 
ator yeah, either you treat it as a compact 'array' and get the benefits13:38.21 
  deviate from that and you get punished13:38.34 
avih it's not impossible to add a repack. but considering it's O(N log n), it must not happen frequently13:38.57 
  where N is the sparse length. e.g. {1:"foo", 1000:"bar"} N is 100013:39.31 
  however, with smart code N can be min(num-existing-keys, sparse-length)13:40.42 
  (you're alternating normal enumeration steps at the tree with n=n+1 at the optimized storage)13:41.25 
  my thought was that storage_base-len is 4 or 8, and grow is *2 or *1.5 or something along these lines13:44.52 
  also note that in my approach, on insertion we don't care about sparseness when it fits the storage. so it can become sparse and then regain density without freezing the size13:46.29 
  i.e. we only consider sparseness just before we're about to grow the storage13:47.39 
  ator: look at it this way: if it becomes sparse, it falls back to current behavior and that's it, except that ix which fit the storage size are still optimizd.13:52.26 
  i don't disagree it's nicer if it's not one-way, but the cost (logic, runtime) can be high13:53.44 
  and in my approach the cost is very reasonable i think13:54.13 
  just "no refunds" for things you've already paid13:54.33 
ator sure. that's not where the real complexity lies though -- that's in getting enumeration, deletion, attribute properties, and the prototype lookup chaining, setters and getters,13:56.38 
  interactions with "classes" special behavior (String, Array), userdata fallback setters and getters13:57.07 
  and object and property level permissions13:57.37 
avih i'm not sure i get that. all those things (maybe except "virtual" properties like String) remain as is. it's just lookup/insertion/enumeration which changes.13:58.49 
  lookup/deletion/insertion are trivial. enumeration should be too (scan the optimized storage, then do normal tree scan)13:59.44 
  virtual keys like String - no idea. i never looked at that code, but i imagine it shouldn't change much either, because it only modifies how a key is accessed IF it's stored14:02.47 
ator avih: if you hide the details in jsV_getproperty and family of friends, yes, you're probably right14:02.56 
  it's been a while, I'm not as familiar with the code as I used to be :)14:03.06 
avih yes, that was my intent (before trying to touch the code)14:03.17 
  heh14:03.22 
  the optimization is a property of the storage. tree or tree and c-array. whether the storage is abstracted well, or if it's leaky (e.g. things elsewhere use left/right) - i don't know14:04.52 
ator nothing else uses the left/right/level in js_Property14:08.21 
avih i imagined the storage as having the API: has_key(s), get_key(s), del_key(s), set_key(s, val) (maybe returns the prev val object existed). and users of the storage should not care how the key is mapped internally to a value. it can be hash, tree, array, whatever14:08.36 
  ator: i don't see a js_property.h, what's the API?14:17.37 
  jsproperty.h *14:17.44 
  would be easier to approach it if the API is clear14:18.10 
  i'm also not seeing the definition of struct js_Property ...14:21.35 
  ah, in jsvalue.h. github omitted this result in its search14:23.07 
  well, left, right, level should not be global. they should be local to a wrapper struct at jsproperty.c /me thinks14:24.25 
  not sure what atts is, but i imagine it's part of the property itself and not part of the storage (i.e. should remain global)14:25.30 
  and not sure if name should be global or not. depending if it's only used for access (so local) or also used elsewhere14:26.14 
  jsG_freeproperty should probably move to jsproperty.c14:28.59 
ator avih: sorry, having internet troubles here15:00.37 
  avih: atts has the DONTENUM, DONTCONF, READONLY bits for the property15:00.56 
  avih: the files are organized to minimize the number of functions that need to be publicly visible, not by data structure15:01.28 
 <<<Back 1 day (to 2020/01/26)Forward 1 day (to 2020/01/28)>>> 
ghostscript.com #ghostscript
Search: