| | 20170519 |
krihsna | quotations from epub file from pages tennyson's sage, aeneas in the meaning of life and death and chapter II from epub available at http://www.gutenberg.org/ebooks/author/47561. i have screenshot w/issue https://s9.postimg.org/rgo7okfgv/quotation.png | 11:27.28 |
| *quotations are not visible, they are spread over a number of pages | 11:28.38 |
kens | If you think you have found a bug, please open a bug report and attach the specimen file there, with a description of the problem, and a command line to reproduce it | 11:28.44 |
avih | tor8: ping re pr? | 11:30.53 |
krihsna | kens: i will file a bug at https://bugs.ghostscript.com/ in a couple of minutes | 11:31.27 |
kens | OK | 11:31.32 |
malc_ | krihsna: do you have a direct link to epub? | 11:32.08 |
krihsna | malc_: http://www.gutenberg.org/ebooks/53829.epub.noimages?session_id=c1f776b919b318e1182d4f7fb0bf1aea7c7033fe | 11:34.40 |
| I am using mini 1.11 | 11:35.30 |
| the book name is Pagan Ideas of Immortality During the Early Roman Empire | 11:36.03 |
malc_ | krihsna: thanks | 11:36.40 |
krihsna | created bug: https://bugs.ghostscript.com/show_bug.cgi?id=697923 | 11:56.48 |
kens | I see krishna can't follow simple instructions, he has not attached the file to the bug report :-( | 12:06.26 |
malc_ | kens: btw. the epub krihsna posted a link to here does show a broken rendering (quotations notwithstanding) starting with page 12 | 12:13.15 |
kens | I'm not saying it's not a bug, but I did say 'attach the file' which he didn't do | 12:13.34 |
tor8 | avih: haven't had time to look in detail yet, but yes, something like that | 12:20.34 |
malc_ | tor8: the document krihsna referenced does indeed show rather severe issues with html (i assume) | 13:14.52 |
tor8 | project gutenberg epubs have a lot of invalid (and frankly, quite shit) XHTML, in my experience. | 13:17.30 |
malc_ | tor8: just thought you might want to take a look | 13:22.27 |
tor8 | of course | 13:22.46 |
| malc_: 'auto' margins strike again... | 13:27.42 |
malc_ | tor8: css rules | 13:28.56 |
avih | tor8: k, mostly on topic, i think that the spec defines "valid array index" as roughly uint32, but js_itoa which getindex et al use is int (and doesn't actually handle negatives). it might be worth streamlining access with "valid index" for possibly further speedups. arrays are important in js for calculations etc, and it can make a huge difference. | 14:09.32 |
| so when trying to access an array index, if it's a number and a valid index, it won't even go through Number.toString and instead use js_hasindex etc | 14:10.35 |
| as an anecdote, i ran some string concatenation speed tests (using +=, array.join, etc), and the array access speed completely overwhelmed and dominated the results. with my patch, it actually shows which is faster and gives meaningful values. people expect arr[i] where i is a valid index to be basically instant | 14:13.08 |
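
The fast path avih describes (skip the generic number stringification when the key is a valid array index) could look roughly like the sketch below. It is not avih's patch and not mujs code: `index_to_string` is a hypothetical helper, and it assumes the ES5 definition of an array index, an integer from 0 to 2^32 - 2. Unlike the patch discussed here, this sketch also handles 0 itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the mujs API.
 * If d is a valid array index (an integer in [0, 2^32 - 2]), write its
 * decimal form into buf and return 1. Otherwise return 0 so the caller
 * can fall back to the generic number-to-string conversion. */
static int index_to_string(double d, char buf[16])
{
    char tmp[16];
    uint32_t u;
    int n = 0, i = 0;

    /* Rejects NaN, negatives, and anything >= 2^32 - 1. */
    if (!(d >= 0 && d < 4294967295.0))
        return 0;

    u = (uint32_t)d;          /* safe: d is within uint32 range here */
    if ((double)u != d)       /* rejects fractions such as 1.5 */
        return 0;

    do {                      /* emit digits in reverse order */
        tmp[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u > 0);

    while (n > 0)             /* reverse into the output buffer */
        buf[i++] = tmp[--n];
    buf[i] = '\0';
    return 1;
}
```

A caller would try `index_to_string` first and only take the slow, fully general stringification path when it returns 0.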
tor8 | avih: I have experimented with fast array implementations, using an actual array. the problem is that so much in the spec relies on the assumption that arrays are just objects, and their properties can have the same attributes as normal object properties. | 14:15.19 |
avih | i don't disagree | 14:15.40 |
tor8 | if you want really fast arrays, you're going to have to lose something -- either code simplicity, or spec compliance. | 14:16.04 |
avih | and yeah, arrays which are actual C arrays, as long as the array is "normal" (starts at 0 with no empty indices) would surely be nice | 14:16.22 |
tor8 | V8 etc go to extreme lengths to detect that code doesn't use any of the weird cases before they start using actual arrays | 14:16.28 |
| but they do the same to map object properties on objects that only use inherited properties to fixed field offsets in a block of memory, etc | 14:17.03 |
avih | well, it's certainly an interesting mini project to make arrays both fast and simple-ish, and integrate them cleanly and nicely with the rest of the code. but at least where an array index is evaluated, just testing if it's a valid index and using a faster tostring is a very good start | 14:17.53 |
tor8 | avih: on tor/typed-arrays branch there's a WIP commit for fast arrays | 14:18.03 |
| it doesn't build cleanly on top of the current master though | 14:18.33 |
avih | i might give it a go at some stage. fwiw, duktape performs about 2x faster than with my patch, and about 40x faster array access than current mujs code | 14:19.34 |
| 2x is already the same ballpark. roughly :) | 14:20.17 |
tor8 | to keep the code simple, I don't plan to get rid of the number -> string -> number round tripping | 14:20.19 |
| but your patch to speed up the test is certainly something I'm going to consider | 14:20.29 |
| and using an array part for properties may be doable | 14:20.47 |
avih | yeah. notice that it leaves 0 to the following code, as isnormal excludes 0 | 14:21.02 |
tor8 | but we still need to allocate properties with all their baggage (such as setters, getters, and flags, etc) for each entry | 14:21.32 |
avih | but i preferred to keep it simple too. just take the bulk off that bigass number stringification | 14:21.44 |
tor8 | but a flat array part would allow for O(1) property lookups for arrays instead of the balanced binary tree of properties | 14:22.46 |
avih | tor8: i'm not familiar enough with the current implementation to discuss that intelligently. i understand what you're saying though, but don't yet see where the complication is | 14:23.13 |
| tor8: you're right, but since it's logn, it probably can be considered O(1) too | 14:24.17 |
| O(log(n)) | 14:24.28 |
tor8 | yes. it's got a large K in there though, since it's doing a lot of string compares | 14:24.47 |
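
To make the trade-off concrete, here is a minimal sketch contrasting the two lookup strategies being discussed: a binary tree keyed by property name, where every level costs one string compare, and a flat array part indexed directly. The types `Node` and `FlatArray` and both functions are hypothetical illustrations, not the mujs data structures, and the tree is assumed to be already balanced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical property tree node, for illustration only. */
typedef struct Node {
    const char *name;
    int value;
    struct Node *left, *right;
} Node;

/* Tree lookup: one strcmp per level, so O(log n) string compares
 * when the tree is balanced. */
static const Node *tree_get(const Node *node, const char *name)
{
    while (node) {
        int c = strcmp(name, node->name);
        if (c == 0)
            return node;
        node = c < 0 ? node->left : node->right;
    }
    return NULL;
}

/* Hypothetical flat array part: a dense index maps straight to a slot. */
typedef struct {
    int *slots;
    size_t length;
} FlatArray;

/* Flat lookup: a bounds check and one memory access, no string compares. */
static const int *flat_get(const FlatArray *a, size_t index)
{
    return index < a->length ? &a->slots[index] : NULL;
}
```

The flat version only works for dense, "normal" arrays, and each slot here holds a bare value; as noted in the discussion, a spec-compliant version would still need per-entry attributes such as getters, setters and flags.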
avih | right. that adds a high multiplier | 14:25.13 |
| tor8: within the code itself, at the bytecode level, does it keep the identifier names, such that essentially property access is O(length of id as string) ? | 14:26.36 |
| so identifiers with 30 chars are 30x faster to resolve than with 1 char? (assuming at the same env level away from the invocation) | 14:27.40 |
| slower* | 14:27.47 |
tor8 | No. It's just a simple binary tree lookup for a string. | 14:28.59 |
| see jsV_getproperty | 14:29.45 |
avih | so a 1 char id can end up as a leaf 3 levels deep with the same probability as 40 chars is? | 14:29.50 |
| id* | 14:30.02 |
tor8 | the binary tree is balanced. | 14:30.46 |
avih | so "yes", right? :) | 14:31.44 |
tor8 | we don't do anything special in regards to the length of the property name. | 14:33.02 |
avih | so access time is O(numParentEnvs * log(average ids per env)) | 14:33.20 |
| average number* of ids per env | 14:33.50 |
tor8 | something like that yes, for object property accesses (where number of envs is really number of prototype parents) | 14:34.21 |
| for local variables, if you don't use inner functions, eval, or the magic 'arguments' variable, there are special fast opcodes that access the stack slots directly | 14:35.08 |
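
The cost estimate for object property access (number of prototype parents times a logarithmic lookup per object) can be illustrated with the sketch below. Everything here is hypothetical: `Prop`, `Obj`, `own_get` and `chain_get` are made-up names, and a sorted array with binary search stands in for the balanced tree, since both give O(log n) string compares per object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types, for illustration only. */
typedef struct {
    const char *name;
    int value;
} Prop;

typedef struct Obj {
    const Prop *props;         /* own properties, sorted by name */
    size_t count;
    const struct Obj *proto;   /* prototype parent, or NULL */
} Obj;

/* Binary search over one object's own properties: O(log n) compares. */
static const Prop *own_get(const Obj *o, const char *name)
{
    size_t lo = 0, hi = o->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(name, o->props[mid].name);
        if (c == 0)
            return &o->props[mid];
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}

/* Walk the prototype chain, repeating the per-object lookup at each
 * level. If depth is non-NULL, it receives the number of prototype
 * hops taken before the property was found. */
static const Prop *chain_get(const Obj *o, const char *name, int *depth)
{
    int d = 0;
    for (; o; o = o->proto, d++) {
        const Prop *p = own_get(o, name);
        if (p) {
            if (depth)
                *depth = d;
            return p;
        }
    }
    return NULL;
}
```

A miss is the worst case, since every object on the chain is searched before giving up.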
avih | huh... isn't everything "inner functions" of the script function? | 14:35.39 |
tor8 | no. script functions are special cases, where their 'locals' are global variables. | 14:36.28 |
avih | right. so basically if all my functions are "flat" on the top level scope of the script file, that's the optimized case? | 14:37.17 |
tor8 | yes. | 14:38.16 |
| if you use closures or fancy 'dynamic' features like the with statement, or eval, we can't optimize the accesses since all bets are off | 14:38.50 |
avih | that's interesting. | 14:39.20 |
| aren't all the array traversal functions like so? | 14:39.37 |
| forEach etc? | 14:39.45 |
| i guess i could give it a reference to an existing "outer" function though | 14:40.17 |
| anyway, gtg now. thx for the info, please push my patch :) later and enjoy your weekend. | 14:41.40 |