| | 20170707 |
avih | tor8: i didn't try to follow the algorithm, but it seems less accurate than the simpler approach even when using uint64_t and higher limit (tried both UINT64_MAX and pow(2, 53) ) | 09:29.58 |
tor8 | avih: it's very simple. it figures out a power of radix which will scale the original number to as large an integer as will fit comfortably in the integer. | 09:31.44 |
avih | it's not that part which is inaccurate. it's the LSDs which are | 09:32.34 |
tor8 | in the end that should result in all the bits of the mantissa being put into the integer in one multiplication | 09:32.43 |
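A minimal sketch of the scaling idea tor8 describes here, not the actual mujs code. The helper name `fmt_scaled`, the 2^53 limit, the cap of 60 on the exponent, and the restriction to 0 <= x < 2^53 are all assumptions made for illustration. It builds the scale by repeated multiplication rather than `pow()`, so the scale itself can pick up rounding error for radices that are not powers of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the mujs implementation.
   Assumes 0 <= x < 2^53, 2 <= radix <= 36, outsz >= 1. */
static void fmt_scaled(double x, int radix, char *out, size_t outsz)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    const double limit = 9007199254740992.0; /* 2^53 */
    char buf[128];
    size_t n = 0, i;
    int e = 0;
    double scale = 1.0, s;
    uint64_t u;

    /* Grow the scale while x * radix^(e+1) still fits under the limit.
       The cap of 60 bounds the output length for very small x. */
    while (e < 60 && x * scale * radix < limit) {
        scale *= radix;
        e++;
    }

    /* One multiplication moves the significand into the integer;
       round the leftover fraction half up. */
    s = x * scale;
    u = (uint64_t)s;
    if (s - (double)u >= 0.5)
        u++;

    /* Trailing zero digits of the fraction carry no information. */
    while (e > 0 && u % (uint64_t)radix == 0) {
        u /= (uint64_t)radix;
        e--;
    }

    /* Peel digits from the least significant end: e fraction digits first. */
    for (; e > 0; e--) {
        buf[n++] = digits[u % (uint64_t)radix];
        u /= (uint64_t)radix;
    }
    if (n > 0)
        buf[n++] = '.';
    do {
        buf[n++] = digits[u % (uint64_t)radix];
        u /= (uint64_t)radix;
    } while (u > 0);

    /* The digits were produced backwards; reverse them into out. */
    for (i = 0; n > 0 && i + 1 < outsz; i++)
        out[i] = buf[--n];
    out[i] = '\0';
}
```

With these assumptions, `fmt_scaled(0.5, 2, buf, sizeof buf)` yields "0.1" and `fmt_scaled(255.0, 16, buf, sizeof buf)` yields "ff". The last digit for inputs such as 0.5 in radix 9 depends on how the scale was rounded, which is the sensitivity discussed in the rest of this log.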
avih | but something fails with it as it's clearly less accurate. try my test program | 09:33.12 |
tor8 | I am using your test program | 09:34.39 |
| any particular case where you're seeing the inaccuracy as compared with firefox? | 09:35.00 |
avih | sec | 09:36.25 |
tor8 | grisu2 failing with -1e-30 is more worrying :( | 09:38.01 |
| but that *could* be our strtod rather than the dtoa operation | 09:38.22 |
avih | yes, iirc it also fails with other similar (non 30) exponents | 09:38.26 |
| tor8: can't reproduce the issue, but maybe it only happens with limit of pow(2, 53). with 54 it's ok. i'm getting this with the double approach: (359999999999999.9).toString(36) --> 3jlxpt2prz | 09:46.19 |
tor8 | avih: I also test with for (var i=2;i<=36;++i) print("pi in", i, Math.PI.toString(i)); | 09:46.23 |
avih | what? | 09:46.52 |
| oh | 09:47.07 |
tor8 | lots of guaranteed bits of mantissa in that sample number :) | 09:47.26 |
avih | lots indeed :) | 09:47.46 |
| though not more than double can hold :p | 09:47.58 |
tor8 | avih: ...ps0 vs ...prz is easily explained by rounding | 09:48.36 |
| same with z.zzzzzzzzzz vs 10 | 09:49.03 |
avih | anyway, new code with limit of pow(2,53) gives this 3jlxpt2ps0 while 54 gives 3jlxpt2prz.w (which is the same result as the double approach with same limit) | 09:49.07 |
| yes, but i expected 53 bits limit to not be worse than can be produced with double, which also has 53 bits significand | 09:49.50 |
| but yes, it can be explained by rounding. | 09:50.48 |
| anyway, i really think you should use 64 bits values and not lose more resolution. or at least 54 bits values :) | 09:51.21 |
tor8 | my floating point voodoo is too weak, but could it be that scaling by a number > 2^53 would round off the lower bits | 09:51.49 |
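One part of this question can be checked directly: a double has a 53-bit significand, so not every integer above 2^53 is representable. The sketch below uses a hypothetical helper, `survives_double`, and assumes n < 2^63 so that the conversion back to an integer is always defined. It shows only the representability limit, not how any particular dtoa code behaves.

```c
#include <stdint.h>

/* Hypothetical check: does n survive a round trip through double?
   Assumes n < 2^63 so the conversion back is always defined. */
static int survives_double(uint64_t n)
{
    volatile double d = (double)n;  /* volatile forces rounding to double */
    return (uint64_t)d == n;
}
```

Under IEEE 754 doubles, 2^53 (9007199254740992) survives, 2^53 + 1 does not, and 2^53 + 2 does, because only even integers are representable between 2^53 and 2^54.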
avih | speed wise, on my set it's similar to the simpler double approach, just with more code :) | 09:51.55 |
tor8 | yes, I'm going to use uint64_t | 09:52.00 |
| 64-bit version is on tor/master | 09:53.01 |
| using 2^53 as the limit | 09:53.07 |
avih | with uint64 it's ~40% slower | 09:53.11 |
| (the 32 bits version has a similar speed as the simple double approach) | 09:54.37 |
tor8 | what cpu? | 09:54.56 |
| I totally expected the bottleneck to be the pow() function calls | 09:55.33 |
avih | native and vm of haswell i7 -xxxxU | 09:55.39 |
tor8 | which could be optimized by use of a lookup table at the cost of bloating the binary | 09:55.49 |
| and then I'd rather have a slow implementation for this hopefully never used function | 09:56.07 |
avih | nah, possibly just multiply and divide by radix instead | 09:56.08 |
tor8 | or division and modulo math for uint64 being 40% slower than the equivalent for uint32, I could believe that | 09:56.59 |
avih | but it's just considerably more code i think. and harder to follow than the simple double approach. and it's tricky with rounding and choosing the fraction point position | 09:57.03 |
| with the simple approach you just multiply by radix and take the LSD of the int part | 09:57.50 |
tor8 | that approach uses two separate ways of dealing with the integer and fractional part, and needs to reverse the integer string to concatenate them at the end | 10:00.46 |
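A rough sketch of the "simple approach" as described in the last few lines: the integer part is converted by repeated division and then reversed, and the fraction is converted by multiplying by the radix and taking the integer part as the next digit. This is not avih's actual code; the name `fmt_simple`, the `maxfrac` parameter, and the restriction to 0 <= x < 2^53 are assumptions. It crops rather than rounds the final digit, and it does not limit digits by significand bits the way the conversation discusses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. Assumes 0 <= x < 2^53, 2 <= radix <= 36, outsz >= 1. */
static void fmt_simple(double x, int radix, int maxfrac, char *out, size_t outsz)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char ibuf[64];
    size_t n = 0, o = 0;
    uint64_t ip = (uint64_t)x;      /* integer part */
    double fp = x - (double)ip;     /* fractional part, exact for x < 2^53 */
    int k;

    /* Integer part: digits come out least significant first. */
    do {
        ibuf[n++] = digits[ip % (uint64_t)radix];
        ip /= (uint64_t)radix;
    } while (ip > 0);

    /* Reverse the integer digits into the output. */
    while (n > 0 && o + 1 < outsz)
        out[o++] = ibuf[--n];

    /* Only write the point if at least one fraction digit will fit. */
    if (fp > 0 && maxfrac > 0 && o + 2 < outsz)
        out[o++] = '.';
    else
        maxfrac = 0;

    /* Fraction: multiply by radix, emit the integer part, keep the rest. */
    for (k = 0; k < maxfrac && fp > 0 && o + 1 < outsz; k++) {
        int d;
        fp *= radix;
        d = (int)fp;
        out[o++] = digits[d];
        fp -= d;
    }
    out[o] = '\0';
}
```

For example, `fmt_simple(255.75, 16, 10, buf, sizeof buf)` produces "ff.c". The two loops and the reversal step are the asymmetry tor8 points out, while the integer-scaling version handles both sides of the point in one pass.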
avih | regardless, i didn't test your latest patch, but if you only changed it to uint64_t then i think a limit of pow(2, 54) is better than 53. and it neither loses nor adds precision | 10:01.07 |
tor8 | with the integer and knowing the point position it's just one loop with the same scaling on both ends | 10:01.15 |
avih | i know. i had such code too, and i do roughly follow what your code does. it's just more tricky | 10:01.52 |
| eventually i decided on the simpler approach as the "fixups" of the integer approach ended up bigger than a whole section dedicated to the fraction part, and it wasn't slower either | 10:02.53 |
| re 53 vs 54, possibly it just uses the limit differently, but 54 matches closer to the simpler approach with 53 bits and never taking a digit with less than full bits precision | 10:05.24 |
tor8 | the simple approach doesn't round the final digit, does it? | 10:06.31 |
avih | correct, your approach rounds the remaining bits which don't cover a full digit, while the double approach crops them | 10:07.56 |
| but the specific example i gave seems to lose bits with the new approach and 53 limit | 10:08.35 |
tor8 | or does the other approach introduce new wrong bits? | 10:10.54 |
avih | it doesn't. it crops the bits which don't cover a digit | 10:11.12 |
tor8 | if I bump the limit to 2^60 I get garbage at the end | 10:11.18 |
| (0.5).toString(9) is emitted as "0.4444444444444444385" if using 60 bits | 10:12.01 |
| which is just plain bad | 10:12.05 |
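For reference, the exact base-9 expansion of 0.5 is a repeating 4, which follows from a geometric series. This is standard arithmetic rather than anything taken from the code under discussion:

```latex
\[
\frac{1}{2} \;=\; \sum_{k=1}^{\infty} \frac{4}{9^{k}}
\;=\; \frac{4/9}{1 - 1/9}
\;=\; 0.444\ldots_{9}
\]
```

Since every exact digit is 4, a tail such as "385" cannot come from the value itself. After any number of digits the remainder is exactly half a unit in the last place, so a rounded result ending in "45" is consistent with round-half-up.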
avih | if the double approach uses 54, then it ends in ...prz.w . if it uses 53 then it becomes just prz and no fraction. the accurate conversion is pwz.weeeee..... | 10:12.24 |
| i didn't say use 60. i was discussing between 53 and 54 | 10:12.42 |
tor8 | with 52 bits and +0.5 rounding when converting to the integer, it becomes "0.4444444444444445" | 10:12.46 |
| with 53 bits the +0.5 rounding has no effect that I can see | 10:13.05 |
avih | sorry, prz.weee... | 10:13.08 |
tor8 | so with 53 bits we get "0.4444...4" | 10:13.37 |
avih | not sure what you mean by "no effect". | 10:13.47 |
tor8 | I get the same results with or without it | 10:13.58 |
avih | the rounding does work well as far as i can tell. it's a good thing. the limit is just too low with the new approach and 53 | 10:14.23 |
| for this specific value, correct, but for other values it does make a useful diff | 10:14.47 |
tor8 | prz.w is wrong though, if we take mozilla's implementation as canon | 10:17.59 |
| 3jlxpt2prz.v is what mozilla returns | 10:18.09 |
avih | it's wrong, according to https://baseconvert.com/high-precision | 10:19.32 |
| (which could be buggy, but i think it's not) | 10:20.17 |
tor8 | wolframalpha agrees, it should be prz.weeeeeeee | 10:21.06 |
avih | however, prz.w does round to prs0 | 10:21.34 |
| sorry, to ps0 | 10:22.43 |
tor8 | and it's the BSD strtod that doesn't quite cope with 1e-30 | 10:23.50 |
| which isn't super surprising, given its simplicity | 10:24.18 |
avih | so upgrade it, and call it a day with the numbers fixes. for now :) | 10:24.46 |
tor8 | yeah... but I like these small diversions from mupdf work :) | 10:25.30 |
| if only system strtod didn't have this crap locale-dependency :( | 10:25.52 |
avih | i don't think i got the mupdf reference.. you mean mupdf already uses this code? and you prefer to not change the mupdf code? | 10:28.15 |
| besides, isn't the locale system just choosing between comma and period for the fraction separator? | 10:29.09 |
| (and thousands separator, which we don't need) | 10:29.27 |
tor8 | it is. and we've been bitten by that in mupdf's pdf parsing. | 10:29.59 |
avih | interesting | 10:30.46 |
tor8 | but mupdf only uses float precision, so doesn't use the same functions | 10:31.53 |
| IIRC it was a german locale and libc that parsed '10.000' as 10000 rather than 10. | 10:32.59 |
| but it was a long time ago | 10:33.07 |
avih | for "code" parsing it should set the locale to C, right? | 10:33.55 |
tor8 | it should, but a library shouldn't mess with that, for obvious reasons. | 10:37.13 |
avih | right | 10:37.56 |
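One way a library can sidestep the locale question without calling `setlocale` is to parse numbers itself. The sketch below is a hypothetical, deliberately tiny parser, `parse_plain_decimal`, that accepts only `digits[.digits]` and always treats '.' as the decimal point. It is an illustration of the idea, not what mupdf or mujs actually do, and it is not correctly rounded for long inputs.

```c
/* Hypothetical parser for plain "digits[.digits]" input; always treats '.'
   as the decimal point, whatever the process locale is.
   Not correctly rounded for long inputs. */
static double parse_plain_decimal(const char *s)
{
    double ipart = 0.0, num = 0.0, den = 1.0;

    /* Integer digits. */
    while (*s >= '0' && *s <= '9')
        ipart = ipart * 10.0 + (*s++ - '0');

    /* Optional fraction: accumulate numerator and power-of-ten denominator. */
    if (*s == '.') {
        s++;
        while (*s >= '0' && *s <= '9') {
            num = num * 10.0 + (*s++ - '0');
            den *= 10.0;
        }
    }
    return ipart + num / den;
}
```

With this sketch, "10.000" parses as 10 regardless of locale, and parsing stops at any other character, so "10,000" also yields 10.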
sebras | Robin_Watts: I think I found a bug in the cmm context cloning. | 19:01.42 |
| Robin_Watts: and also I believe that threading has never really been enabled in mudraw? at least in linux. | 19:02.13 |
| Robin_Watts: I'm not entirely sure my pthread patch is correct, but at least it will trigger discussion. | 19:02.35 |
| oh, but it is fairly late over there, then I'll not wait around for comments tonight. | 19:07.13 |
sebras | sleeps. | 19:07.22 |
mvrhel_laptop | Robin_Watts: I know it's late there but are you available for a sec? | 21:46.26 |
| oops | 21:46.37 |
| brb | 21:46.41 |