| 2014/02/03 |
kens2 | The short form of the problem is that Ray made a commit which changes memory. In the process he moved a call to gs_malloc_release. | 09:24.44 |
| That causes GS to exit with heap corruption if you specify pdfwrite | 09:25.04 |
| If I move the call back to where it was, then it doesn't..... | 09:25.15 |
Robin_Watts | ah. | 09:25.22 |
kens2 | But, the only things between the calls are setting some variables to NULL | 09:25.41 |
| If I set *all* of them to NULL before the call, it's all OK; if I miss any one then it isn't | 09:26.05 |
| OK mail sent. I doubt anyone will have any ideas but I can hope | 09:28.28 |
| Time for coffee | 09:29.41 |
chrisl | kens2: there is a library on Linux that can sometimes help with issues like this - http://www.stlinux.com/devel/debug/mudflap | 09:29.48 |
| But linking in another library and its effect might move the problem again | 09:30.09 |
Robin_Watts | does memento report the problem? | 09:30.29 |
chrisl | memento causes the problem to disappear | 09:30.41 |
Robin_Watts | valgrind? | 09:30.52 |
chrisl | valgrind isn't very good at spotting stack corruption, overflows etc | 09:31.16 |
Robin_Watts | I have to get ready to take helen to the station, but I can try to have a look when I get back later if you want. | 09:31.51 |
kens2 | It only seems to exhibit on Windows | 09:33.24 |
| Given that it's in FreeLibrary, I'm not totally surprised | 09:33.37 |
| And you have to run under a debugger, because it's on exit, so you don't see it normally, only the debug build under a debugger warns about it | 09:34.07 |
| I'm at a loss regards tracking down the real problem. I've suggested Ray moves the call back where it was, because that makes the problem go away. I'm just concerned that it's hiding something really wrong. | 09:35.16 |
Robin_Watts | tor8: ping | 11:57.03 |
| kens2: I'm trying to reproduce that bug here, and can't. | 12:01.58 |
| I'm using a VS2005 debug build of 363f3bc | 12:01.59 |
| Ah. Got it. | 12:02.01 |
tor8 | Robin_Watts: morning. | 12:02.40 |
Robin_Watts | tor8: So, I wondered if you had any thoughts to solve this SVG output thing. | 12:03.02 |
| Namely the crapness of freetype. | 12:03.13 |
tor8 | Robin_Watts: the problem in freetype happens because it first applies the charsize, then runs the hinting process, then scales by the transform | 12:03.40 |
| my first attempt was to set the charsize to 1 and put the entire font matrix in the transform, and ran into that problem | 12:04.09 |
Robin_Watts | Ah, so basically, any bbox we get out is going to be skewed by the hinting. | 12:04.33 |
tor8 | the workaround is (since we don't care about hinting) to set the charsize suitably large, and reduce the scale of the transform by the inverse | 12:04.36 |
Robin_Watts | So we can't get a bbox out that we can scale and guarantee that the hints will stay within it? | 12:05.20 |
tor8 | so I set the charsize to 65536 (rather than the default of 64) and then divide the final transform. it's all fixed point integers. | 12:05.20 |
Robin_Watts | tor8: yeah, but for the file I'm looking at, we end up with bbox values of '3' or so. | 12:05.49 |
tor8 | well, the svg output shouldn't do any hinting now should it? | 12:05.51 |
Robin_Watts | i.e. the rounding is significant. | 12:06.10 |
| the +/- 0.5 we could easily be off by is a significant portion of the size of the char. | 12:06.45 |
tor8 | just a sec, could you point me to the relevant bits in svgdevice.c? | 12:07.11 |
Robin_Watts | sure. Let me forward you a file too. | 12:07.19 |
tor8 | because fz_outline_ft_glyph does the transform coord scaling stuff I just talked about, so I must be confused about what the real problem is | 12:08.06 |
Robin_Watts | svg_dev_text_as_paths_defs | 12:08.35 |
| at around about line 413 calls fz_outline_glyph | 12:09.03 |
tor8 | yeah. | 12:09.23 |
Robin_Watts | to get a path, and that gets passed to svg_dev_path | 12:09.25 |
tor8 | and the fz_bound_glyph bbox doesn't match that, is that what you're saying? | 12:09.36 |
Robin_Watts | There are 2 problems here. | 12:09.45 |
tor8 | or is the glyph path that comes out of it all chunky? | 12:09.49 |
Robin_Watts | The first one is that the code as checked in, gives 'chunky' glyphs. | 12:10.09 |
| I can solve that by asking for one 1000 times larger and then dividing the path down. | 12:10.23 |
| That gives nice smooth glyphs. | 12:10.33 |
tor8 | Robin_Watts: it *shouldn't* give us chunky glyphs, since we're already asking for huge paths and dividing them down | 12:10.37 |
| but it could be the final to-fixed-point step that truncates too much, depending on what you ask for | 12:10.59 |
| in fz_outline_ft_glyph in source/fitz/font.c | 12:11.16 |
| freetype has 26.6 fixed point coordinates coming out | 12:11.41 |
Robin_Watts | But it reveals the second problem - that the fz_bound_glyph a few lines earlier is giving a bbox (from its cached values) that is too small and the edges of the glyph are truncated. | 12:11.44 |
tor8 | so if you ask for a [1 0 0 1] final scaled glyph, you're going to get very chunky glyphs out | 12:12.01 |
Robin_Watts | tor8: Right. Because I define the glyphs myself, I ask for [1 0 0 1] ones. | 12:12.28 |
tor8 | so the correct approach is to ask for a 1000x1000 glyph outline, and scale down in the svg output | 12:12.28 |
Robin_Watts | OK. So I have code to do that. That seems reasonable. | 12:12.38 |
| But the bbox is harder to solve. | 12:12.54 |
tor8 | Robin_Watts: might be because the bounding box calculations are also being done in chunky space? | 12:13.52 |
Robin_Watts | The transform I pass into the fz_bound_glyph thing here is not the problem, because the bbox I get back is a cached one, scaled by my transform. | 12:13.55 |
| hence any chunkyness is not because of the small scale of my transform. | 12:14.16 |
tor8 | because we pass in fz_identity and cache that, and scale up then | 12:14.20 |
| so the fz_identity coming into the (cached) fz_bound_ft_glyph will compute a bbox on the chunky glyph | 12:14.47 |
| and then that'll be upscaled at the tail end of fz_bound_glyph | 12:15.07 |
Robin_Watts | Yes. | 12:15.13 |
| The problem is that the cached bbox value is chunky before I ever get to it. | 12:15.34 |
tor8 | maybe we should ask for a fz_scale(1000) matrix to the cached bbox transform | 12:15.39 |
| and divide by 1000 before applying the final requested transform | 12:15.59 |
Robin_Watts | The first call to fz_bound_glyph comes from pdf_show_char | 12:16.01 |
tor8 | so, in fz_bound_glyph there are two cases we can hit: | 12:16.28 |
| (1) cached, and (2) uncacheable. | 12:16.39 |
| in the uncacheable case, we always use the font bbox which is bound to be inaccurate, but hopefully big enough. | 12:17.10 |
| the cached case, we're always using chunky outlines :( | 12:17.21 |
| we ought to fix that the same way | 12:17.26 |
Robin_Watts | Right. | 12:17.29 |
tor8 | by scaling and unscaling by 1000 | 12:17.38 |
Robin_Watts | Right. | 12:17.46 |
| We should remove the fz_identity from the fz_bound_ft_glyph call, IMHO, as it's only ever called from one place. | 12:18.10 |
| and then internally we should scale and unscale. | 12:18.23 |
tor8 | we're correctly unaffected by the mid-transform-for-hinting truncation but I hadn't considered that we then truncate and cache for further scaling | 12:18.39 |
Robin_Watts | Probably we should scale/unscale by 1024 rather than 1000, as I would hope that FP divide by 1024 should be faster :) | 12:18.44 |
tor8 | Robin_Watts: there is an option to freetype to give us completely unscaled metrics | 12:19.06 |
| 2048 is what truetype usually uses, 1000 is for postscript fonts | 12:19.28 |
Robin_Watts | tor8: Ah, so maybe we should use 2048 too. | 12:19.45 |
tor8 | and then you can get the units_per_EM scaling factor and apply it manually | 12:19.46 |
Robin_Watts | tor8: This is sounding more involved than I'd hoped for, but OK. | 12:20.06 |
| I will have a prod at that a bit later, unless you want to. | 12:20.36 |
tor8 | Robin_Watts: just a sec, it's just another option flag to the FT_Load_Glyph, and then a divide by constant from the face->units_per_EM rather than just a hardcoded value | 12:20.50 |
Robin_Watts | ok. | 12:21.04 |
tor8 | just looking up the details | 12:21.04 |
| Robin_Watts: http://freetype.org/freetype2/docs/reference/ft2-base_interface.html#FT_LOAD_XXX | 12:21.35 |
| FT_LOAD_NO_SCALE (replaces FT_LOAD_NO_HINTING) and the CharSize and Transform stuff is completely ignored | 12:21.53 |
| should be faster too | 12:22.01 |
Robin_Watts | tor8: OK, will give that a whirl. | 12:22.10 |
| that sounds better for our purposes. | 12:22.27 |
tor8 | there may be more places where we could do the same (it's a freetype feature I only discovered recently) | 12:22.51 |
| http://freetype.org/freetype2/docs/reference/ft2-base_interface.html#FT_FaceRec | 12:23.05 |
| and there face->units_per_EM is the scaling factor (commonly 1000, 1024 or 2048) | 12:23.24 |
| Robin_Watts: if you add that flag you can completely ignore the matrix and charsize calculations and setting it | 12:24.19 |
Robin_Watts | yeah, makes sense. Thanks. | 12:24.43 |
| kens2: So I think I see what's the problem. | 12:24.54 |
tor8 | Robin_Watts: oh. but we use the transform to add synthetic boldening and italicizing... | 12:25.18 |
| anyway, scale/unscale will solve the immediate problem | 12:25.37 |
Robin_Watts | hmm. can we do boldening/italicising ourselves when we scale? | 12:26.03 |
tor8 | (but those could both be trivially compensated for by just expanding the bbox after calculating) | 12:26.04 |
| yeah. we embolden by 5% and shear by a consistent amount | 12:26.24 |
Robin_Watts | ooh. | 12:26.31 |
| Do we have the face->units_per_EM before we call Load ? | 12:26.44 |
tor8 | Robin_Watts: yes. that is set when the font object is leaded. | 12:26.57 |
| loaded. | 12:27.00 |
Robin_Watts | If so, we could set the transform up to be (x 0 0 x 0 0) where x = face->units_per_EM. | 12:27.06 |
tor8 | so that is always accessible | 12:27.08 |
| Robin_Watts: yes (I'd set the charsize to units_per_EM) and the transform to identity though | 12:27.47 |
Robin_Watts | ok. | 12:27.55 |
| I will try that. That sounds reasonable. | 12:28.01 |
tor8 | to avoid truncating early | 12:28.02 |
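The "chunky" effect discussed above can be sketched without FreeType at all. The block below is only an illustration: `to_26_6`, `from_26_6` and `quantised_coord` are hypothetical helpers, not MuPDF or FreeType functions, and rounding to nearest is an assumption about how the final to-fixed-point step behaves. It models a coordinate being scaled, stored in 26.6 fixed point (steps of 1/64), and divided back down.

```c
#include <assert.h>

/* Convert to 26.6 fixed point: 6 fractional bits, so steps of 1/64.
   Round-to-nearest is an assumption made for this sketch. */
static long to_26_6(double v)
{
    return (long)(v * 64.0 + (v >= 0.0 ? 0.5 : -0.5));
}

/* Convert a 26.6 fixed point value back to floating point. */
static double from_26_6(long f)
{
    return (double)f / 64.0;
}

/* Request a coordinate (given in em units) at `scale`, let it be
   quantised to 26.6, then divide the result back down by `scale`. */
double quantised_coord(double em_coord, double scale)
{
    assert(scale > 0.0);
    return from_26_6(to_26_6(em_coord * scale)) / scale;
}
```

With these helpers, `quantised_coord(0.4321, 1.0)` comes back as 0.4375, while `quantised_coord(0.4321, 1000.0)` stays within 1e-5 of the input. The worst-case error is 1/(128 * scale), which is why asking for a large outline and scaling down afterwards gives smooth glyphs, and why a bbox computed and cached at identity scale is already coarse before any later transform is applied to it.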
Robin_Watts | yeah. | 12:28.04 |
| kens2: The call to gs_malloc_release(minst->heap); frees ctx. | 12:29.29 |
| hence the lines immediately afterwards that set ctx->blah = NULL; write to deallocated memory. | 12:29.46 |
| So putting the gs_malloc_release back seems exactly the right thing to do. | 12:31.31 |
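The ordering problem described above can be sketched in a few lines of C. Everything here is a hypothetical stand-in: `lib_ctx`, `heap`, `heap_new`, `heap_release` and `instance_shutdown` are invented for illustration and are not Ghostscript's real types or functions. The only point being shown is that fields of an object must be cleared before the call that frees the object, not after.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context whose fields get cleared at shutdown. */
typedef struct {
    void *blah;
    void *other;
} lib_ctx;

/* Hypothetical heap that owns the context. */
typedef struct {
    lib_ctx *ctx;
} heap;

/* Allocate a heap together with the context it owns. */
heap *heap_new(void)
{
    heap *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->ctx = calloc(1, sizeof *h->ctx);
    if (h->ctx == NULL) {
        free(h);
        return NULL;
    }
    return h;
}

/* Release the heap and everything it owns, including ctx. */
void heap_release(heap *h)
{
    free(h->ctx);
    free(h);
}

/* Safe ordering: touch ctx only while it is still allocated. */
void instance_shutdown(heap **hp)
{
    heap *h = *hp;
    assert(h != NULL && h->ctx != NULL);
    h->ctx->blah = NULL;    /* clear first ... */
    h->ctx->other = NULL;
    heap_release(h);        /* ... then free */
    *hp = NULL;             /* drop the now-dangling handle */
    /* Writing h->ctx->blah = NULL at this point would be a write to
       freed memory, which is the bug pattern described in the log. */
}
```

Whether such a write is reported depends on the allocator and tooling in use, which would fit with the problem only showing up in a Windows debug build under a debugger.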
rayjj__ | My IRC client won't connect :-( | 15:07.38 |
Robin_Watts | rayjj__: DDOS of freenode at the moment. | 15:08.20 |
| Far fewer people here than usual. | 15:08.36 |
rayjj__ | kens: Sorry about that. As Robin mentioned earlier, since gs_malloc_release frees ctx, it needs to be cleared first (if at all) | 15:08.37 |
Robin_Watts | rayjj__: kens isn't here. | 15:08.51 |
| He said in email that he'd let you fix it? | 15:08.59 |
rayjj__ | Robin: I haven't looked; have you or ken revised that commit, or should I? | 15:09.20 |
Robin_Watts | we left it for you. | 15:09.28 |
| I can do if it's a problem though. | 15:09.49 |
rayjj__ | Robin_Watts: the only network I'm having a problem with is irc.freenode.net. | 15:10.20 |
| I can do it | 15:10.26 |
Robin_Watts | ok. | 15:10.31 |
rayjj__ | valgrind should have spotted that problem right away. I thought someone mentioned in the logs that valgrind didn't complain | 15:11.17 |
Robin_Watts | Likewise memento should have spotted it. | 15:11.56 |
rayjj__ | Oh, I realize why valgrind didn't show it. The linux build doesn't use the iapi interface. Duh. | 15:13.40 |
chrisl | rayjj__: also, kens described it as a stack corruption problem - valgrind isn't terribly good at spotting stack corruption | 15:14.30 |
rayjj__ | Robin_Watts: Does memento have any "final" run ? because that's probably the last thing done before exit, so maybe memento doesn't get a chance to inspect memory | 15:14.56 |
Robin_Watts | rayjj__: Memento has an atexit handler, yes. | 15:15.13 |
rayjj__ | so that should have been able to see the write to the ctx that had been freed ? | 15:15.46 |
Robin_Watts | but it probably doesn't check for corruption on the atexit, just checks for leaks. | 15:15.52 |
| hmm. It does do a Memento_checkAllMemory. | 15:16.27 |
rayjj__ | Not sure how it could cause stack corruption. | 15:16.36 |
Robin_Watts | I suspect that it wasn't stack corruption at all. | 15:16.59 |
rayjj__ | I'm not going to bother with a clusterpush since it was unique to windoze. | 15:22.11 |
| Robin_Watts: do you want to glance at the change ? It's on my repo | 15:23.43 |
Robin_Watts | rayjj__: Looks good to me. | 15:24.19 |
rayjj__ | Robin_Watts: thanks | 15:24.29 |
Robin_Watts | np. | 15:24.32 |
rayjj__ | OK. pushed | 15:24.43 |
| let's see how many bugs I can put in today ;-) | 15:26.19 |
henrys | finally connected | 18:31.26 |
| I guess marcosw meant significantly different results in his email about filetype:docx | 18:39.58 |
Robin_Watts | henrys: Yes. I am disappointed by how much they differ :( | 19:41.21 |
marcosw | Robin_Watts: one of the big differences with GhostDocs vs. Word is in where lines wrap, presumably caused by differences in fonts. Any thoughts on how to deal with this? Can windows version of da-test make use of fonts in c:\fonts ala Ghostscript? | 19:56.04 |
Robin_Watts | marcosw: I understood that the fonts were supposed to be clones. | 20:29.41 |
| but ray says that postscript fonts have different metrics to windows fonts, so maybe they don't match in the way we'd hope? | 20:30.18 |
| I can build a version of the converter with a different set of fonts pickled in and we can retry? | 20:32.34 |
mvrhel_laptop | bbiaw | 21:10.31 |
marcosw | Robin_Watts: sure, I have automated scripts to generate the ghostdocs bitmaps, the word pdfs, and/or generate the output pdfs, so re-testing is easy. | 21:20.51 |