| 2012/05/24 |
kens | Hmm, back in memory problems again. | 12:47.04 |
| Anyone got any suggestions ? I'm getting a seg fault in restore_finalize, Memento builds don't exhibit the problem but debug and release builds do. -Z? and -Z@ don't show any problems. | 12:47.58 |
| I really need to know what object type it's trying to finalize, and why it's crashing. | 12:48.17 |
| -dNOGC makes no difference either | 12:48.25 |
| What's really annoying is that my altered code isn't even called with this file, so it must be just a memory layout problem. | 12:49.19 |
Robin_Watts | kens: do debug builds show it ? | 13:02.09 |
| sorry, I just reread. | 13:02.36 |
kens | Robin_Watts : yes, certainly do | 13:02.47 |
Robin_Watts | On what platform ? | 13:02.59 |
kens | Also fails on Linux, I'm assuming it's the same problem | 13:03.01 |
Robin_Watts | windows and linux then. | 13:03.07 |
kens | I'm using Windows, but I was prompted by a cluster run showing a seg fault | 13:03.16 |
Robin_Watts | If you run it in the debugger, can't you tell by looking up the stack what type it is? | 13:03.30 |
kens | Cluster = 64-bit Linux, windows = 32-bit | 13:03.34 |
| Robin_Watts : I have no idea what it's telling me | 13:03.44 |
Robin_Watts | Wossa command? | 13:03.55 |
kens | And I'm not sure that it helps. I really want to know what's wrong with it :-) | 13:03.58 |
| Command ? | 13:04.04 |
Robin_Watts | What command can I run here to duplicate the problem? | 13:04.16 |
kens | Oh, you'd need my (small) code modification too, shall I send you a patch ? | 13:04.33 |
Robin_Watts | go for it. | 13:04.38 |
kens | OK couple of minutes, just writing a comment. | 13:04.49 |
| Robin_Watts : on its way now | 13:10.15 |
| By email | 13:10.21 |
Robin_Watts | gottit. Just a mo. | 13:13.05 |
kens | Thanks | 13:13.10 |
| Hmm pre->d.o = 0xf1f1f1... | 13:14.26 |
| Doesn't look good | 13:14.32 |
| Oh no wait, that's size. Still looks wrong though | 13:15.04 |
Robin_Watts | Where is test-setweightvector.ps ? | 13:15.28 |
kens | It's one of the standard tests, I can send that too if you like | 13:15.41 |
| standard cluster tests | 13:15.50 |
Robin_Watts | I need the path to it... | 13:15.58 |
kens | yeah, just getting it | 13:16.06 |
| tests_private/comparefiles/test-setweightvector.ps | 13:16.20 |
Robin_Watts | thanks. | 13:16.33 |
| That runs without failure for me. | 13:17.08 |
kens | On Windows ? | 13:17.15 |
Robin_Watts | yes. | 13:17.40 |
kens | Hmmm I'll update and try again with a clean build | 13:17.53 |
Robin_Watts | If the device is ps2write, why are you going to out.pdf ? | 13:19.32 |
kens | Doesn't matter what the filename is | 13:34.16 |
| Still crashes for me | 13:34.22 |
| Are you using gswin32c ? | 13:34.42 |
| Hmm, both crash for me | 13:35.11 |
| In the debugger of course. | 13:35.36 |
Robin_Watts | I am. | 13:36.21 |
| Ah. | 13:36.44 |
kens | Crashes for me with and without the debugger, and using gswin32c or gswin32 | 13:36.47 |
Robin_Watts | When it comes to the gs prompt, I need to type 'quit' for it to error. | 13:37.05 |
kens | Well, yes | 13:37.15 |
| I guess I should have said | 13:37.22 |
Robin_Watts | GPL Ghostscript GIT PRERELEASE 9.06: .\psi\isave.c(935): Chunk parsing error, 0x281e750 != 0x281e720 | 13:37.31 |
kens | Hmm, that sounds related. | 13:37.44 |
| Let me just try gswin32c instead | 13:37.59 |
| Yes, I see that before the seg fault | 13:38.29 |
| Different pointers of course | 13:38.52 |
Robin_Watts | right, me too. | 13:39.06 |
kens | isave.c 935 is a macro :-( | 13:39.28 |
| END_OBJECTS_SCAN | 13:39.37 |
Robin_Watts | Run with -Zu to get more information about what is being finalised. | 13:40.17 |
kens | Really ? OK | 13:40.26 |
| Looks like fonts | 13:41.01 |
| Immediately before the parsing error is 'unlinking font 0x2b0b528, base=0x2b0b528, prev=0x0, next=0x0' | 13:41.58 |
| Sorry, that's immediately before the chunk parsing error, but there's another one immediately before the seg fault, same message but different pointers | 13:42.47 |
| So it looks like something is corrupting the font memory | 13:43.09 |
Robin_Watts | Right, so SCAN_CHUNK_OBJECTS starts at (cp)->cbase and adds the rounded size each time and expects to reach (cp)->cbot. | 13:44.09 |
| If it doesn't, it gives that warning. | 13:44.24 |
| Once you've had that warning, the pointer should be considered to be in hyperspace, so any SEGV is not surprising. | 13:44.41 |
| So... something is corrupting the size of the chunk data. | 13:44.53 |
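The walk Robin describes can be modelled in a few lines. This is only a toy sketch, not Ghostscript's SCAN_CHUNK_OBJECTS macro: the header size, the alignment, and every name below are assumptions made for illustration. It shows why a single corrupted size makes the scan end somewhere other than cbot.

```python
HEADER_SIZE = 16  # assumed size of an object header (illustrative only)
ALIGN = 8         # assumed allocation alignment (illustrative only)


def round_up(n, align=ALIGN):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def build_chunk(cbase, sizes):
    """Lay objects out back to back; return ({header_addr: size}, cbot)."""
    headers = {}
    addr = cbase
    for size in sizes:
        headers[addr] = size
        addr += HEADER_SIZE + round_up(size)
    return headers, addr


def scan_chunk(headers, cbase, cbot):
    """Step from cbase by header + rounded size; return where the walk stops."""
    addr = cbase
    while addr < cbot:
        size = headers.get(addr)
        if size is None:  # landed between objects: nothing sane to read
            break
        addr += HEADER_SIZE + round_up(size)
    return addr


headers, cbot = build_chunk(0x1000, [24, 100, 40])
ok_end = scan_chunk(headers, 0x1000, cbot)    # intact chunk: ends at cbot

headers[0x10A0] = 40 + 48                     # corrupt the last object's size
bad_end = scan_chunk(headers, 0x1000, cbot)   # walk now overshoots cbot
print(hex(ok_end), hex(bad_end), hex(cbot))
```

In the toy the corrupted walk ends 0x30 bytes past cbot, the same gap as between the two addresses in the "Chunk parsing error" message earlier in the log. Which of those two addresses is the scan position and which is cbot is not stated in the conversation.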
kens | Yeah, I see. So the problem is why is the font not matching | 13:44.59 |
| I guess I need to find out where the memory is allocated | 13:45.46 |
| Hmm, of course it may not be the font. It's 'something' in that chunk | 13:46.52 |
| I wonder if Memento avoids this because it's one object per chunk | 13:47.14 |
Robin_Watts | kens: quite possibly. | 13:48.31 |
| So the fonts are being created from: build_charstring_font | 13:53.45 |
kens | from build_gs_primitive_font I think | 13:54.04 |
| Called by the other one | 13:54.10 |
| Ah, or even build_gs_outline_font | 13:54.40 |
| How many layers ? :-) | 13:54.47 |
| Ah, that calls build_base_font | 13:56.39 |
| Which is actually a parameter from build_gs_primitive_font | 13:57.17 |
Robin_Watts | build_gs_sub_font is where we end up. | 13:57.37 |
kens | :-) Beat me to it | 13:57.48 |
Robin_Watts | zbfont.c line 856 | 13:57.51 |
kens | Hmmm odd, that's a CIDFont, well I guess that's possible | 13:58.30 |
| Maybe it's a different font that causes the problem | 13:58.50 |
Robin_Watts | Oh, well, I could be wrong. Let me keep debugging for a bit. | 13:59.13 |
kens | It's a Multiple Master test file, so I wouldn't expect it to be a CIDFont, but I could be wrong | 13:59.37 |
| That looks like maybe it's the DroidSansFallback | 13:59.47 |
| Oh OK, it doesn't have to be a CIDFont, it's just a font. | 14:01.52 |
| If I do -dNOGX the warning goes away, but it still crashes in the same place | 14:07.19 |
| NOGC | 14:07.25 |
Robin_Watts | hmm. I've tried putting prints where the font objects are made... and I clearly haven't caught them all. | 14:08.05 |
kens | Not easy to find them all I don't think. | 14:08.24 |
Robin_Watts | I can see 6 builds, and at least 9 finalises. | 14:16.24 |
kens | buildfonts ? I see 10 I think | 14:16.37 |
Robin_Watts | Hmm. I've put breakpoints everywhere I can see a gs_font_type1 structure descriptor being used. | 14:18.28 |
| and the only one that gets called is build_charstring_font. | 14:18.39 |
kens | And no luck ? | 14:18.40 |
Robin_Watts | And that gets called 5 times. | 14:18.44 |
| (sorry, I can't count to 6 correctly. I see 5 builds for gs_font_type1's) | 14:19.08 |
kens | OK the rest are type 3 I think | 14:19.29 |
Robin_Watts | gs_font_base ? | 14:19.54 |
kens | The rest of the calls I see to build_gs_sub_font are for type 3 fonts I mean | 14:20.39 |
Robin_Watts | ok. | 14:20.47 |
kens | I see 10 calls there, 5 of them for type 1/2/4 | 14:20.48 |
Robin_Watts | Which all result in a gs_font_type1 structure? | 14:21.06 |
kens | That's more debatable | 14:21.17 |
Robin_Watts | Cos it's a gs_font_type1 structure that has the wrong size on finalise. | 14:21.24 |
kens | They result in a 'pfont' which is declared as a gs_font_type1 pointer | 14:21.52 |
Robin_Watts | Right. | 14:21.59 |
kens | But it's not obvious to me where the structure is allocated | 14:22.06 |
| (yet) | 14:22.12 |
Robin_Watts | BUT... I can never see an allocation for the particular one that is finalised. | 14:22.20 |
kens | Indeed, that's why I tried with -dNOGC to prevent pointers being moved | 14:22.43 |
Robin_Watts | I see the 5 creations at the start of the page, then 3 at the end. | 14:23.15 |
kens | I think the 3 at the end are (probably) where pdfwrite copies the font | 14:23.39 |
| The ones it needs | 14:23.51 |
Robin_Watts | Where do such copies occur ? | 14:25.34 |
kens | Where, in the code ? | 14:25.57 |
Robin_Watts | indeed. | 14:26.00 |
kens | the copies are made in gxfcopy.c | 14:26.11 |
| IIRC pdfwrite always makes 2 copies of a font | 14:27.18 |
| Look at copy_font_type1 at line 987 | 14:27.49 |
Guest91523 | hello, i was having a lot of system slowdown on an Ubuntu machine, ran top and it looked like gs was causing it. killed the process and it sped right up. any ideas of the cause? | 14:27.50 |
| if this is the wrong place to ask, i'll go XD | 14:28.16 |
kens | GS shouldn't be running unless you are printing | 14:28.49 |
| Or doing a convert with ImageMagick, or something else which invokes GS | 14:29.20 |
Robin_Watts | Guest91523: GS is used as part of the printing system on Ubuntu, so if you (or another user) were printing something that would explain it. | 14:30.18 |
kens | Robin_Watts : gs_copy_font looks like a possible culprit | 14:30.19 |
Guest91523 | as i thought. problem is we weren't doing anything like that. and it would come up right after the system booted. the user said it's been slow for three days and she...doesn't know enough to really be much of a help tracking down the start point | 14:30.24 |
kens | The only reason I can think of for it to come up after startup would be if there was a print job pending somehow. That would be CUPS and I don't know much about that | 14:31.09 |
Guest91523 | I know some about it. still, it's a start. thanks for the help | 14:31.32 |
Robin_Watts | Guest91523: Check back here in a while. | 14:32.08 |
| The ubuntu printer guru is frequently here. | 14:32.22 |
Guest91523 | aye sir. how long is "a while?" | 14:32.23 |
Robin_Watts | I can't say. Whenever tkamppeter next appears. | 14:32.42 |
kens | Hard to say. You want tkamppeter | 14:32.45 |
| He's here a lot, but not guaranteed | 14:32.56 |
Guest91523 | alright, thank you very much. i'll be seeing ya | 14:33.10 |
Robin_Watts | np. | 14:33.14 |
kens | Robin_Watts : as far as I can tell the copied font structures get a label which describes them as copies, so I don't *think* it's any of those | 14:34.05 |
Robin_Watts | Well, then I'm confused as to where it's getting these type1's to free. | 14:34.32 |
kens | Me too :-( | 14:34.40 |
Robin_Watts | The odd thing is that lots of them are freeing successfully. | 14:34.41 |
kens | The pointers don't match at all | 14:34.46 |
| Robin_Watts : well if I'm stomping on memory that would be feasible | 14:35.02 |
| But the pointers make no sense to me | 14:35.12 |
Robin_Watts | OK. I have the following debug: | 14:37.22 |
| gs_copy_font 29af3e0->2a19258 | 14:37.31 |
| and then later on, the one that SEGVs is: | 14:37.39 |
| [u]restore finalizing gs_font_type1 0x2a19258 | 14:37.50 |
kens | Ah, that's interesting | 14:38.04 |
Robin_Watts | So copying *is* maintaining the structure type. | 14:38.21 |
kens | I can't claim to be an expert :-) | 14:38.36 |
| And it does look like it's the copied font being trashed | 14:38.56 |
| Or is that the other way around ? | 14:39.02 |
| Which pointer is first in your printout there, *font or **pfont_new ? | 14:40.00 |
guest9123 | checked the print queue, two PDF jobs in there from the other machine user | 14:40.25 |
kens | OK so I guess that's why it starts up GS immediately | 14:40.43 |
| In order to process the jobs in the queue. | 14:40.55 |
| Presumably one of them is causing GS to enter a loop | 14:41.06 |
Robin_Watts | guest9123: Right, and they are probably hugely complex ones or something which is why they never complete. | 14:41.10 |
guest9123 | yep, cleared it out, a quick reboot to test and it's quick as a slicked up speed boat. | 14:41.11 |
Robin_Watts | (or they could be malicious postscript that goes into an infinite loop) | 14:41.25 |
kens | PDF | 14:41.31 |
Robin_Watts | oh, right. | 14:41.44 |
kens | :-) | 14:41.50 |
| What version of GS guest9123 ? | 14:42.03 |
guest9123 | this is what i love about linux. got a problem, bounce into the IRC network of the program that's goofing | 14:42.21 |
| and. i..am not sure. i can go check if you like. | 14:42.30 |
kens | It's not vital, depends if you want us to look into why the print job was lopping | 14:42.47 |
| looping* | 14:42.53 |
guest9123 | for the sake of debugging, i will. back in a moment | 14:43.24 |
Robin_Watts | kens: Let me add some code to walk the chunk to test it BEFORE we start finalising - that will tell us if it's the finalisation that's killing it or whether it's broken already. | 14:44.05 |
guest9123 | man page said 8.71 | 14:44.49 |
kens | Robin_Watts : OK thanks | 14:44.57 |
| guest9123 : That's *really* old, the current version is 9.05 | 14:45.16 |
guest9123 | o_o wow. okay. xD | 14:45.30 |
kens | I guess that's the version shipping with the LTS version of Ubuntu though | 14:45.46 |
| The problem may have been fixed by now | 14:46.06 |
guest9123 | yea, pretty sure it's 10.04 on there | 14:46.06 |
kens | If you still have the PDF file we can look at it and see if it's fixed. | 14:46.29 |
guest9123 | pretty sure they were just printed from firefox. so a reboot would ta-.. e-o if they set the print job, rebooted. that'd remove the file from Temp right? and without the file, gs would freeze? | 14:47.49 |
| well. tmp. but you get the idea | 14:48.00 |
kens | if there's no file GS should bomb out with an error | 14:48.19 |
| Which CUPS should intercept and 'deal' with | 14:48.29 |
chrisl | guest9123: the print queue isn't in tmp | 14:48.38 |
guest9123 | true.. | 14:49.06 |
kens | Robin_Watts : the pointers I'm seeing don't match the fault, I must be doing something wrong.... | 14:50.42 |
chrisl | guest9123: you might be able to do "cancel -a" at a terminal, that should remove any pending jobs from the print queue. | 14:52.24 |
Robin_Watts | kens: Right, the chunk is corrupted. | 14:54.57 |
| before any of the finalisation calls are made. | 14:55.07 |
kens | Well, I guess that's not surprising, the only question is what's corrupting it ? :-) | 14:55.23 |
Robin_Watts | I wonder if there are debugging calls we can make to make the chunk self-check. | 14:55.33 |
kens | It's 'probably' the memory freeing stuff in pdfwrite, though it must be fairly subtle, since it hasn't shown up before | 14:56.08 |
| It's possible it's a long-standing problem | 14:56.16 |
guest9123 | chrisl: i already cleared the queue, it's working fine again. was just trying to figure out why it happened in the first place | 14:56.20 |
kens | Robin_Watts : if you can figure out what the corruption is, I could set a memory watchpoint to see the corruption take place | 14:56.44 |
guest9123 | kens: ubuntu is saying 8.71 is the most up to date by its standards. so..i don't know. sorry i can't help you guys figure out the cause any more than that | 14:57.12 |
chrisl | guest9123: ah, sorry, I spotted the discussion rather late..... | 14:57.36 |
kens | guest9123 : Not a huge problem, it's probably because you are on a long term support version. | 14:57.48 |
guest9123 | chrisl: no worries xD i know what it's like to drop into a convo late | 14:57.58 |
| kens: alrighty. i'll take my leave now then. thank you all a bunch, keep up the good work | 14:58.19 |
kens | bye guest9123 | 14:58.32 |
ray_laptop | Robin_Watts: kens: -Z? does the 'validate_pointers' stuff | 15:57.30 |
henrys | kens:I guess it would be good to have a bugzilla enhancement to check implement your parameter list with the other languages. I do fear featuritis though - are we really going to improve the product by making these options functional? | 15:57.35 |
kens | ray_laptop : didn't show me any problems, I tried -Z? | 15:57.46 |
| henrys, I worry about which ones people are going to demand 'because you can do it with PostScript' | 15:58.11 |
| How do we explain to customers that some features are not available when you aren't running a PS interpreter. Especially ones that aren't obviously PostScript like Overprint and stuff | 15:58.53 |
| Robin_Watts : can you send me that chunk validation code, and the pointer printout you were doing that showed the problem, please ? | 15:59.19 |
Robin_Watts | kens: I'm still bashing on it. | 15:59.32 |
kens | Oh, OK don't let me joggle your elbow then | 15:59.46 |
ray_laptop | it finds things for me sometimes but it may not be run when you need it. | 16:00.03 |
kens | ray_laptop : I did try it, and -Z@ | 16:00.13 |
| And Memento :-) | 16:00.19 |
ray_laptop | kens: Robin_Watts: can you add a call to 'ialloc_validate_spaces' | 16:00.28 |
kens | I tried everything I could think of first | 16:00.30 |
| henrys if I get time to figure out which of the 'important' parameters do and don't work I'll add them to the existing enhancement bug #693058 | 16:01.21 |
| Actually maybe I'll just tidy up the text file and add it as is | 16:01.40 |
henrys | kens:okay | 16:01.59 |
kens | I have to be off I'm afraid, got to try and assemble a bed before taking Melanie riding. | 16:02.03 |
| Goodnight all | 16:02.13 |
henrys | bye kens | 16:02.28 |
| HP layoff - wow! | 16:02.54 |
| that's a lot of people | 16:03.13 |
ray_laptop | henrys: yeah, I heard 27k people on the news yesterday | 16:03.33 |
Robin_Watts | It begs the question, what the hell were HP doing with 27K people?! | 16:05.09 |
mvrhel | wow | 16:09.58 |
| that is huge | 16:10.01 |
ray_laptop | 27k is only 8% of their global workforce -- appalling | 16:10.38 |
mvrhel | oh ray_laptop: Getting overprint to work with cust 532 additive RGB device may be tricky | 16:11.00 |
| at least I can't think of a quick and dirty fix for it | 16:11.46 |
| wow that is even crazier. I would have thought hp was smaller than that these days | 16:12.24 |
ray_laptop | mvrhel: if they switch to CMY will overprint work ? | 16:12.42 |
mvrhel | ray_laptop: yes | 16:12.50 |
| oh | 16:12.56 |
| CMY | 16:12.58 |
ray_laptop | mvrhel: so the problem is related to overprint | 16:13.06 |
mvrhel | yes it should | 16:13.06 |
henrys | Robin_Watts:current revenues are 127 billion, if you do the math and compare it to artifex revenues per employee the numbers are probably similar. | 16:13.43 |
mvrhel | I am pretty certain. That overlapping circle in the Altona file is an overprint test and if the device is additive, overprint is not used per the spec | 16:13.49 |
| overprint will only work for the colorants that are supported though | 16:14.38 |
| with the lack of K, I am sure there could be issues when compared to a CMYK device for some file out there | 16:15.02 |
ray_laptop | mvrhel: what does that mean ? | 16:15.03 |
mvrhel | what does what mean | 16:15.10 |
ray_laptop | mvrhel: I see -- you were the lack of a K plane ? | 16:15.34 |
| s/were/were concerned/ | 16:15.44 |
mvrhel | yes | 16:15.47 |
ray_laptop | do we have a CMY device laying around to test with ? | 16:16.12 |
mvrhel | I have not seen one, and I am curious how it would behave today since we don't have a CMY ICC profile laying around to go with it | 16:17.02 |
| actually we could use CMYK still | 16:17.22 |
| although I am certain something would explode | 16:17.42 |
| I am talking about the ICC profile now | 16:17.54 |
ray_laptop | mvrhel: well, their code predates ICC color | 16:18.22 |
mvrhel | yes. so not an issue for them | 16:18.36 |
| just thinking about other issues (as if we don't have enough) | 16:19.04 |
ray_laptop | mvrhel: it isn't hard to modify an RGB device to make a CMY variant (change color_info.polarity and the encode/decode color), but we should do it with 8.71 code | 16:20.52 |
mvrhel | right | 16:21.33 |
| ray_laptop: is this in my lap to do or is marcos going to do some testing? | 16:26.42 |
ray_laptop | mvrhel: unfortunately the 'dci' macros assume that the polarity is additive unless the num_components is >= 4 | 16:27.09 |
mvrhel | ugh | 16:27.19 |
Robin_Watts | henrys: Wow. That maths is... staggering. | 16:27.31 |
| (337500 total worldwide workforce = 376000 per employee) | 16:28.49 |
| Net earnings are $1.6Billion for the last quarter, so say 6.4billion per year? | 16:30.15 |
ray_laptop | mvrhel: but doing without the macros is not too bad. The psdrgb device might be an OK starting point since it has the color info specified directly: psd_device_body(spot_rgb_procs, "psdrgb", 3, GX_CINFO_POLARITY_ADDITIVE, 24, 255, 255, GX_CINFO_SEP_LIN, "DeviceRGB"), | 16:30.25 |
Robin_Watts | which is more like 19000 per employee. | 16:31.23 |
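The back-of-envelope figures quoted above can be reproduced as follows. This is just a check of the arithmetic using the numbers given in the conversation (27k layoffs as 8% of staff, 127 billion revenue, 1.6 billion quarterly net earnings); the variable names are illustrative.

```python
layoffs = 27_000
layoff_percent = 8                           # "27k is only 8%" per the chat
workforce = layoffs * 100 // layoff_percent  # 337500

revenue = 127_000_000_000                    # "current revenues are 127 billion"
quarterly_net = 1_600_000_000                # "$1.6Billion for the last quarter"
annual_net = 4 * quarterly_net               # "so say 6.4billion per year"

revenue_per_employee = revenue / workforce
net_per_employee = annual_net / workforce

print(workforce, round(revenue_per_employee), round(net_per_employee))
```

These come out at roughly 376,000 of revenue and roughly 19,000 of net earnings per employee, matching the rounded figures in the chat.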
ray_laptop | darn. Now my phone crashed (a Motorola Droid Razr). maybe it caught something from being next to my laptop. | 16:33.59 |
| I was listening to voice mail and the keyboard froze, then it just rebooted | 16:35.18 |
mvrhel | ray_laptop: so do you need me to do any of this stuff for cust 532? | 16:38.05 |
Robin_Watts | ray_laptop: Can you think of any reason why an ivalidate_spaces(); at the top of gs_main_finit should fail? | 16:38.59 |
ray_laptop | mvrhel: if you could test a cmy device with 8.71 that would help, but the real test needs to be on their code. Do you want to get "dirty" with their simulator ? | 16:39.16 |
mvrhel | ray_laptop: I am really not interested in getting sucked into that one. I really don't even like back pedaling to 8.71.... | 16:40.15 |
| but if it needs to be done | 16:40.27 |
| what svn version is it that you check out? | 16:40.58 |
Robin_Watts | git checkout ghostpdl-8.71 | 16:41.43 |
ray_laptop | Robin_Watts: no -- as long as init_done >= 1 | 16:41.51 |
Robin_Watts | ? | 16:41.52 |
mvrhel | no, I thought there was a later version | 16:41.56 |
| ray_laptop? | 16:42.01 |
| isnt there code slightly match to something like in May of the 8.71 release | 16:42.24 |
| do we have a git version of 8.71? | 16:42.46 |
Robin_Watts | mvrhel: Yes. | 16:42.51 |
mvrhel | oh | 16:42.54 |
| good deal | 16:42.57 |
ray_laptop | mvrhel: let me check (I have a 8.72-pre-icc sandbox that I use for testing "standard" gs) | 16:42.59 |
Robin_Watts | Tor worked magic to ensure that all our history is in git. | 16:43.11 |
ray_laptop | mvrhel: I'll check the date... | 16:43.12 |
mvrhel | that is great about git being available for this | 16:43.30 |
ray_laptop | mvrhel: the 8.72-pre-icc version I have is from Jul 2 2010 | 16:44.27 |
mvrhel | ok | 16:44.34 |
| let me see if I can get git to get this | 16:45.41 |
Robin_Watts | We have a ghostpdl-8.71 tag then the next I can see is ghostpdl-9.00 | 16:46.45 |
| 6a82ae2 is where icc_work went back into the trunk. | 16:47.07 |
mvrhel | oh. so there is no way to get intermediate items just the tag items | 16:47.13 |
ray_laptop | mvrhel: you may want to checkout 36aaf13c1472a03dacce62805758f93482fdb016 | 16:47.29 |
mvrhel | ok cool | 16:47.35 |
Robin_Watts | mvrhel: If you know the hash numbers, you can go there. | 16:47.38 |
mvrhel | i know it now | 16:47.49 |
ray_laptop | mvrhel: so the pre-icc code you need would be 254698b4562e97cc6ad8a83fac951742e6257a85 right ? | 16:56.42 |
mvrhel | wait I thought you said 36aaf13c1472a03dacce62805758f93482fdb016 | 16:57.07 |
ray_laptop | mvrhel: I don't know why my files are dated Jul 2, since the icc_work branch merge was May 24 2010 | 16:57.35 |
Robin_Watts | mvrhel: When exactly are you looking for? | 16:58.02 |
ray_laptop | mvrhel: sorry for the confusion. | 16:58.03 |
mvrhel | Robin_Watts: I am looking for what ever ray_laptop tells me I need | 16:58.21 |
Robin_Watts | Before icc_work was created? Or before it was merged back ? | 16:58.22 |
ray_laptop | mvrhel: the 254698b commit is the one prior to the icc_work branch merge (that Robin_Watts mentioned: 6a82ae2 ) | 16:59.02 |
mvrhel | ok that should be just fine for what we want to do | 16:59.24 |
Robin_Watts | Right. That seems sane to me - but it might not correspond exactly to what customer 532 are using. | 16:59.37 |
mvrhel | well it will be close enough | 16:59.49 |
Robin_Watts | especially because my gs_2_colors stuff went in around that time | 16:59.51 |
mvrhel | for this testing | 16:59.53 |
ray_laptop | mvrhel: are you sure you don't want the simulator code ? | 16:59.59 |
mvrhel | ray_laptop: if you think it would be a good idea or easier that is fine | 17:00.54 |
| ray_laptop: iirc you had sent out some instructions on the simulator? | 17:06.34 |
Robin_Watts | mvrhel: I suspect you'd be well advised to avoid the simulator if you can. | 17:09.26 |
mvrhel | ok thanks | 17:12.09 |
| trying to make a new clone for this and having network problems for some reason | 17:12.55 |
Robin_Watts | mvrhel: Just copy the clone you have ? | 17:20.49 |
| From the msys bash shell: | 17:21.14 |
| cp -pr ghostpdl ghostpdl2 | 17:21.24 |
mvrhel | ok. much better | 17:25.20 |
| now I am at 254698b4562e97cc6ad8a83fac951742e6257a85 | 17:26.45 |
| argh. marcos is seeing the opposite timings that I am getting | 17:29.35 |
| oh crap | 17:35.17 |
| apparently my testing was screwed up | 17:35.25 |
| Robin_Watts: so we are going to have to add in some path complexity test | 17:35.41 |
| perhaps I should pass you my patch and let you take a look at adding this? | 17:37.39 |
| although I need to do a few more things to the patch | 17:38.09 |
| since we will still want lop_pdf14 when we are not in a knockout | 17:39.57 |
Robin_Watts | ok. | 17:40.01 |
| feel free to kick it back to me when you feel you've got to a suitable place. | 17:40.25 |
mvrhel | Robin_Watts: ok. it is probably going to be next week since I have this cust 532 issue and I am gone friday-monday | 17:40.52 |
Robin_Watts | ok. Or feel free to kick it back sooner if you think I can finish it. | 17:41.20 |
| (hmm, I thought I'd deleted everything after "ok" before hitting return :) ) | 17:41.46 |
mvrhel | :) | 17:41.54 |
ray_laptop | mvrhel: thanks for picking up the cust 532 issue (at least the first part). | 17:49.00 |
Robin_Watts | With kens memory corruption problem, I've put an ivalidate_spaces() call into do_call_operator before the call to op_proc | 17:50.03 |
| [!]operator 1length | 17:50.29 |
| [!]operator 3getinterval | 17:50.31 |
| [!]operator 1cvn | 17:50.33 |
| [!]operator 0%file_continue | 17:50.35 |
| [!]operator 0%for_pos_int_continue | 17:50.36 |
ray_laptop | mvrhel: my concern is that the pdf14 device does stuff with polarity and the num_components. | 17:50.37 |
Robin_Watts | [!]operator 1copy | 17:50.38 |
| [!]operator 2get | 17:50.39 |
| [!]operator 3filenameforall | 17:50.42 |
| [!]operator 0%for_pos_int_continue | 17:50.44 |
| [!]operator 1.dicttomark | 17:50.45 |
| [!]operator 2get | 17:50.47 |
| GPL Ghostscript GIT PRERELEASE 9.06: .\psi\ilocate.c(551): Bad object 0x1d71148(30871912), ssize = 0, in chunk 0x204f430! | 17:50.49 |
| So that looks to me like the .dicttomark operator is causing the stack corruption. | 17:50.58 |
ray_laptop | Robin_Watts: that's strange | 17:51.37 |
Robin_Watts | indeed. | 17:52.01 |
ray_laptop | so now you need to figure out _which_ .dicttomark is doing it (it's a heavily used operator) | 17:53.15 |
Robin_Watts | I'm hoping that filenameforall is less used. | 17:53.54 |
ray_laptop | Robin_Watts: it's mostly used by the resourceforall | 17:57.30 |
Robin_Watts | Ok, so I have to step past 7 filenameforall's and it's the next .dicttomark. | 17:57.52 |
| And the chunk problem occurs when we dict_create | 17:58.20 |
| (ivalidate_spaces() works before that call, and fails afterwards) | 17:58.36 |
| Damn. No ray. | 18:27.14 |
ray_laptop | Robin_Watts: sorry -- I was in transit from the coffee shop | 18:34.50 |
Robin_Watts | No worries. | 18:35.00 |
| So, I'm trying to understand this chunk corruption. | 18:35.12 |
| Am I right in thinking that ivalidate_spaces will walk the chunks checking that the objects in the chunks are all correctly sized etc? | 18:35.51 |
| (i.e. it'll check that for each chunk, you can walk through the blocks by starting at the chunk base, and adding rounded(block_size) each time until you get to the top of the chunk.) | 18:37.12 |
| AIUI refs are a separate thing. Does ivalidate_spaces() check the refs too ? | 18:37.54 |
ray_laptop | Robin_Watts: refs are on the stacks and in some types of objects, but AIUI validation doesn't trace the stacks like the GC does | 18:44.09 |
Robin_Watts | sorry, I referred to "stack corruption" earlier, when I meant "chunk corruption" | 18:44.47 |
| The call to dict_create() in zdicttomark is the thing that breaks everything. | 18:45.16 |
ray_laptop | Robin_Watts: right -- I remember you said that. | 18:45.38 |
Robin_Watts | and that starts by doing a gs_alloc_ref_array | 18:45.44 |
| which fiddles with mem->cc.{rtop/rbot}, so I wondered if that was something that ivalidate_spaces would have checked. | 18:46.39 |
| if that was corrupt, it could explain why we are seeing random memory overwritten. | 18:47.01 |
| which would mean that the bug wasn't here, but was really elsewhere. | 18:47.14 |
ray_laptop | Robin_Watts: can you put a data breakpoint on the location of the bad 'ssize' to see who writes it ? (how consistent is the address of the bad object) | 18:59.18 |
Robin_Watts | windows 7 randomises addresses. | 19:00.19 |
| but I'll try. | 19:00.29 |
| Ah. | 19:01.27 |
| pre->d.o.size == 0x1ef113c | 19:01.40 |
ray_laptop | Robin_Watts: in my experience, the top bits get randomized, but if you know the "base" of something, the low bits stay consistent | 19:01.41 |
| that's a pretty large size | 19:02.07 |
| and looks like an address | 19:02.22 |
Robin_Watts | Sorry | 19:02.35 |
| &pre->d.o.size == 0x1ef113c | 19:02.40 |
| and in the gs_alloc_ref_array we just did, on entry: mem->cc.rtop == mem->cc.cbot == 0x1ef1138 | 19:03.12 |
| ah, and the final result of that allocation was 0x1ef1130 | 19:04.28 |
| so it's the object that we just allocated that causes ivalidate_spaces to die. | 19:05.02 |
| No, sorry. | 19:06.32 |
| It's the object immediately after the one we just allocated. | 19:06.47 |
| 0x1ef1148 | 19:06.56 |
| that has ssize = 0 | 19:07.03 |
ray_laptop | Robin_Watts: what is the 'oname' (or the otype) | 19:09.43 |
Robin_Watts | of ? | 19:09.56 |
ray_laptop | of the object at 0x1ef1148 | 19:10.12 |
Robin_Watts | What expression are you asking for? If I put (blah *)0x1ef1148 into the watch window, what should blah be? | 19:11.10 |
ray_laptop | Robin_Watts: (obj_header_t *) | 19:11.50 |
| you may need to use (obj_header_s *) | 19:12.15 |
Robin_Watts | (obj_header_t *)optr = 0x1ef1148 and I can select fields from that | 19:13.01 |
| d.o.t.type = 0x3d0800 (ssize=0, sname=-1, ....) | 19:13.24 |
ray_laptop | Robin_Watts: if you use the 'immediate' window, it dumps all fields | 19:13.44 |
Robin_Watts | I think the previous block in the chunk (that it's just walked through) is duff. | 19:14.04 |
| Let's look at (obj_header_t *)0x1ef1130 | 19:14.17 |
ray_laptop | Robin_Watts: you think its size is wrong so it goes to the wrong place ? | 19:17.21 |
Robin_Watts | That's my current idea, yes. | 19:17.39 |
| In gs_alloc_ref_array we went into the topmost branch of the if. | 19:18.06 |
| and obj was set to 0x1ef1130 | 19:18.13 |
| so should I expect to see sane things when I look at (obj_header_t *)0x1ef1130 ? | 19:18.31 |
ray_laptop | Robin_Watts: what is rcur in the chunk ? | 19:18.57 |
ray_laptop | expects 0x1ef1130 | 19:19.35 |
Robin_Watts | rcur = 0, rtop =0 | 19:19.41 |
| no, sorry, was too high on the stack. | 19:21.23 |
| rcur == 1ef1020 | 19:21.30 |
| rtop=1ef1178 | 19:21.38 |
ray_laptop | Robin_Watts: sorry net glitch. | 19:26.26 |
Robin_Watts | rcur == 1ef1020 | 19:26.38 |
| rtop==1ef1178 | 19:26.42 |
ray_laptop | right, so who called validate_object ? (validate_ref ?) | 19:28.20 |
Robin_Watts | spaces->memory ->chunk->ref_packed ->ref->object | 19:28.45 |
ray_laptop | since you are in the middle of the ref array | 19:28.58 |
| Robin_Watts: also, what is (obj_header_s *)rcur | 19:31.29 |
| i.e. does its ssize look correct | 19:32.05 |
Robin_Watts | (obj_header_s *)cp->rcur = 0x1ef1020 | 19:32.41 |
| and d.o.size = 0xc842c881 | 19:32.58 |
| d.o.t.type = garbage, so ssize = ???? | 19:33.12 |
| Should we be looking at (obj_header_t *)cp->rcur-1 ? | 19:34.01 |
| That looks much more reasonable. | 19:34.16 |
| d.o.size = 0x158 | 19:34.24 |
| d.o.t.type = _st_refs ssize=8 | 19:34.44 |
ray_laptop | I agree those values look better, but I don't understand why you need rcur-1 (rcur is obj_header_t *) | 19:37.06 |
Robin_Watts | Looking at gs_alloc_ref_array, it uses mem->cc.rcur[-1].o_size | 19:37.38 |
| The [-1] thing may be a trick to 'prepend' the obj_header onto the start of the block. I use that in memento all the time | 19:38.32 |
ray_laptop | Robin_Watts: I see. So rcur is the pointer to the first ref and its type is obj_header_t * to allow the [-1] to get the header. Funky | 19:39.27 |
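The [-1] trick can be illustrated outside C with a byte buffer. This is a hedged sketch, not Ghostscript's allocator: the two-field header layout, the `alloc` and `header_of` helpers, and the type id 7 are all invented for the example. The point is only that the pointer handed out addresses the payload, and the header lives immediately before it.

```python
import struct

# Hypothetical header layout for illustration: (size, type_id), 8 bytes total.
HEADER = struct.Struct("<II")


def alloc(heap, offset, size, type_id):
    """Write a header at offset and return the payload offset just after it."""
    HEADER.pack_into(heap, offset, size, type_id)
    return offset + HEADER.size


def header_of(heap, payload):
    """Equivalent of ptr[-1] in C: step back one header from the payload."""
    return HEADER.unpack_from(heap, payload - HEADER.size)


heap = bytearray(0x200)
payload = alloc(heap, 0x10, 0x158, 7)

# Put some payload data in place, then read a "header" from the wrong spot.
struct.pack_into("<I", heap, payload, 0xC842C881)
wrong = HEADER.unpack_from(heap, payload)  # payload bytes misread as a header
right = header_of(heap, payload)           # the real header, one step back

print(hex(wrong[0]), hex(right[0]))
```

Reading a header at the payload address yields whatever data happens to be stored there, which is consistent with the garbage size seen at rcur in the log, while stepping back one header gives the plausible 0x158.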
| Robin_Watts: did you answer: who called validate_object ? | 19:40.51 |
Robin_Watts | spaces->memory ->chunk->ref_packed ->ref->object | 19:41.00 |
ray_laptop | oh, sorry I wasn't reading that as a call stack | 19:41.16 |
Robin_Watts | Sorry. | 19:41.24 |
ray_laptop | Robin_Watts: so that means that the ref array is (probably) OK, but the contents the 'dict' is packed with has a bad entry | 19:42.15 |
Robin_Watts | ok... | 19:43.08 |
| The exact error I'm getting is: | 19:43.14 |
| GPL Ghostscript GIT PRERELEASE 9.06: .\psi\ilocate.c(551): Bad object 0x1ef1148(32444776), ssize = 0, in chunk 0x1f6f430! | 19:43.41 |
| 1ef1148 seems dangerously close to where the refs are. | 19:44.24 |
| cp->rcur-1 = 1ef1010 and d.o.size = 0x158 | 19:45.14 |
ray_laptop | right, so validate_ref_packed is going through the ref array and seeing an invalid object | 19:45.27 |
Robin_Watts | so the next object should be at 1ef1010+158 = 1ef1168 ? | 19:46.09 |
ray_laptop | Robin_Watts: I think so | 19:46.59 |
| or 178 | 19:47.19 |
| if there is a next one (not at ctop) | 19:47.32 |
Robin_Watts | Ah, so we have an array of refs. One of which is pointing to 1ef1148, and the error is the validation saying "that's not a valid object" ? | 19:47.40 |
ray_laptop | or is it cbot ? | 19:47.56 |
Robin_Watts | cbot == 1ef1178 | 19:48.21 |
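The two candidate end addresses (…168 versus …178) can be checked directly from the values quoted in the session. The sketch below assumes, without confirmation from the log, that d.o.size counts only the payload and not the header; under that assumption the ref block ends exactly at cbot.

```python
# Values quoted in the debugging session above.
rcur = 0x1EF1020    # first ref in the current ref block
header = 0x1EF1010  # rcur - 1, where the sane-looking header was found
size = 0x158        # d.o.size read from that header
cbot = 0x1EF1178
bad = 0x1EF1148     # address reported as "Bad object" by the validator

header_size = rcur - header   # distance between header and payload
end_from_header = header + size
end_from_payload = rcur + size  # assumes size excludes the header

print(hex(header_size), hex(end_from_header), hex(end_from_payload))
```

On this reading the bad address 0x1ef1148 lies inside the ref block itself (between rcur and rcur + size), which fits Robin's remark that it is "dangerously close to where the refs are".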
ray_laptop | Robin_Watts: a packed array of refs are all just contiguous refs | 19:48.31 |
Robin_Watts | right. And a ref is what? A pointer to a chunk object? | 19:48.51 |
ray_laptop | OK cbot is the free area, so there is nothing past the array of refs in that chunk. Reasonable | 19:48.55 |
| Robin_Watts: a ref_s | 19:49.36 |
Robin_Watts | ok, a ref can have lots of different things in it according to the type in the structure. | 19:50.11 |
ray_laptop | right -- according to the tas.type_attrs (16 bits) | 19:50.35 |
Robin_Watts | this is a t_fontID or a t_struct or a t_astruct. | 19:50.47 |
| and it's supposed to point to an object. And it's not doing so. | 19:51.11 |
| Is this something we could debug by using -ZA ? | 19:51.34 |
| That shows all the allocations, right? So I could look for something that used to be at that address but got freed later or something? | 19:52.05 |
ray_laptop | Robin_Watts: -ZA should show who last allocated (and maybe freed) the object pointed to in the ref array | 19:52.40 |
| -ZA is your (loquacious) friend :-) | 19:53.22 |
Robin_Watts | OK. I've had it for this evening. | 19:53.31 |
ray_laptop | I don't blame you. | 19:53.40 |
Robin_Watts | I'll look back into it more tomorrow, with -ZA. | 19:53.47 |
| Thanks ray! | 19:53.50 |
ray_laptop | no problem. I need to dig into this stuff anyway. | 19:54.33 |
| Robin_Watts: if you know the o_stack contents at the .dicttomark, then you can see which thing is bogus (0x1148 - 0x1020) / sizeof(ref_s) | 19:57.41 |
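ray's formula can be evaluated once a ref size is chosen. The sketch assumes sizeof(ref_s) is 8 bytes on this 32-bit build, inferred from the "ssize=8" reported for _st_refs earlier in the log; that inference is an assumption, not something the conversation confirms.

```python
REF_SIZE = 8           # assumed sizeof(ref_s) on the 32-bit build

low_bad = 0x1148       # low bits of the bad address 0x1ef1148
low_rcur = 0x1020      # low bits of rcur, the start of the ref array

index, remainder = divmod(low_bad - low_rcur, REF_SIZE)
print(index, remainder)
```

With these numbers the offset is 0x128 bytes, an exact multiple of 8, so the address would correspond to slot 37 counting from zero if the assumed ref size is right.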
Robin_Watts | Ah. | 19:58.01 |
ray_laptop | unfortunately, -Z! doesn't print the stack top and depth as it goes along | 19:58.45 |
Robin_Watts | i_ctx_p->op_stack.stack->p | 19:59.41 |
| I have to run. Will look tomorrow. Thanks. | 20:01.21 |
ray_laptop | IMHO, the ref_stack_count(o_stack) (and e_stack) and debug_print_ref(imemory, iosp) are more useful than the stack addresses dumped by do_call_operator when SHOW_STACK_DEPTHS is enabled | 20:05.14 |
| and to me, the word "operator " on every line of output is a waste | 20:06.16 |
| bbiaw. | 20:07.37 |
Robin_Watts | There is an alternative to -Z! - anyone remember what it is? | 23:21.20 |