| <<<Back 1 day (to 2013/01/08) | 2013/01/09 |
henrys | mvrhel_laptop:well now I have fixed several issues and I can convert my xps to tiff using c++ and c sharp using the windows apis and conversion tools but the xps viewer will not display the xps file. Ever run into that one? | 02:34.03 |
mvrhel_laptop | oh wow | 02:34.22 |
| will acrobat open it? | 02:34.43 |
henrys | ms acrobat open xps? | 02:35.12 |
mvrhel_laptop | I think so | 02:35.16 |
henrys | I only have a license for the mac | 02:35.22 |
| let me try it. | 02:35.40 |
mvrhel_laptop | yes. AR opens xps here on my machine | 02:36.39 |
| and wants to save it as a pdf | 02:37.02 |
| I am fighting with the stupid file access limitations in the windows app developement | 02:38.13 |
henrys | agh microsoft word says something intelligent about it doesn't like / in the filename | 02:38.37 |
mvrhel_laptop | one's app is rather limited on what files it can open and where | 02:38.39 |
henrys | (s) | 02:38.43 |
| mvrhel_laptop: we've tripped over that every mobile platform, to be expected. | 02:40.33 |
mvrhel_laptop | ok. I can't seem to figure it out. I am going to take a break from this for a bit | 02:41.03 |
sebras | Robin_Watts: tor8: there's a patch over at sebras/master (fa32ede) which you might want to add to master... | 07:55.43 |
kens | Right, off for a couple of hours. | 10:20.39 |
Robin_Watts | Hey mvrhel_laptop | 15:26.40 |
| Can you let me know when you have a few minutes to talk about this MT threading stuff please? | 15:27.21 |
mlilja | question about Segmentation fault | 15:51.28 |
| how to see what is causing it? are there any logs? I'm running on linux | 15:52.53 |
kens | Seg fault is a crash, its caused by a code bug | 15:53.09 |
| Is this MuPDF or Ghostscript ? | 15:54.44 |
mlilja | it's gs | 15:58.05 |
kens | WHich version ? | 15:58.12 |
mlilja | 8.71 | 15:58.42 |
kens | Then the firt thing you shold do is upgrade the current verswion is 9.06 ands 8.71 is 3 years old | 15:59.06 |
| Most likely its been fixed. If the problem persists with current code you should report the bug at http://bugs.ghostscript.com | 15:59.39 |
mlilja | ok, thanks, I will do the upgrade | 16:03.30 |
Robin_Watts | mvrhel_laptop: ping! | 16:09.40 |
henrys | mt comes up and everyone heads for the hills ;-) | 16:56.24 |
Robin_Watts | henrys: indeed. | 16:57.01 |
mvrhel_laptop | hi Robin_Watts . just got back from dropping off the kids | 16:59.01 |
kens | And when hebrys wakes up, so does everyone else (Bagpuss refrence) | 16:59.45 |
Robin_Watts | mvrhel_laptop: See my mails. I'm very confused. | 17:00.00 |
mvrhel_laptop | ok hold on reading.... | 17:00.12 |
kens | Oh look a support mail for henrys :-) | 17:01.35 |
henrys | I am pretty sure we don't have hebrew or arabic built in. | 17:03.55 |
| they probably wouldn't get along well anyway ;-) | 17:04.23 |
JakeSays | i have two paths that basically draw the same object - one to fill, one to stroke. in mupdf, when the stroke/path driver functions are called, is there anything that links those two paths together? | 17:05.02 |
Robin_Watts | henrys: Yeah, historically I'd suspect that any font that contained glyphs from both sets would explode at random intervals. | 17:06.04 |
JakeSays | hmm. or is it the same path object? | 17:06.26 |
kens | JakeSyas, depnds on the path rendering mode | 17:06.38 |
Robin_Watts | JakeSays: The caller *could* use the same path object, but there is no guarantee. | 17:06.44 |
JakeSays | hmm. dangit. | 17:07.00 |
mvrhel_laptop | Robin_Watts: I don't quit understand what I am looking at with ray's traceback | 17:07.15 |
| how do I tell that #12 has an issue | 17:07.29 |
Robin_Watts | mvrhel_laptop: Basically, I think ray has missed the point. | 17:07.41 |
mvrhel_laptop | I agree with your reply to him | 17:07.57 |
| I am trying to understand where the allocation problem occurs | 17:08.13 |
| but looking at what he attached I can't tell what is going on | 17:08.36 |
Robin_Watts | With his traceback, he claims that the memory pointer in #12 (the ContextID, which is actually a gs_memory_t *) is the non threadsafe basic one for the main thread. | 17:08.38 |
| And that the one higher up the stack is one of the thread specific ones. | 17:09.12 |
mvrhel_laptop | ok | 17:09.24 |
Robin_Watts | So, if that's true, then we have a mismatch of thread and non thread ones, and that's indicative of a problem. | 17:09.30 |
mvrhel_laptop | right | 17:09.35 |
Robin_Watts | We don't have enough information in his callback to make that determination for ourselves though. | 17:09.50 |
mvrhel_laptop | ok good. I was feeling a little dumb | 17:10.00 |
Robin_Watts | So, the thing that's confusing me, is that I've tried some REALLY simple debugging, and I'm seeing problems that I can't explain. | 17:10.42 |
| Specifically, I've made the lcms malloc/free/realloc functions print the mem * and the threadID in use. | 17:11.11 |
| My plan was that I should run the program, and if I saw a mem * being used from more than one thread, that would be a good indication that there was a problem. | 17:11.55 |
| Make sense ? | 17:11.58 |
mvrhel_laptop | that makes sense | 17:12.36 |
| did you run this particular file? | 17:12.51 |
Robin_Watts | thesis10.pcl at 600dpi at 4 rendering threads, yes. | 17:13.10 |
mvrhel_laptop | ok. where is this file | 17:13.33 |
Robin_Watts | So I run it, and I see some logging, and I get the first indication that there are problems. | 17:13.40 |
| BUT... at no point prior to seeing those indications do I ever see any lcms allocations done by anything other than the main thread. | 17:14.10 |
| tests/pcl/thesis10.pcl | 17:14.38 |
ray_laptop | Robin_Watts: mvrhel_laptop: sorry -- I just noticed you guys were working here. | 17:16.06 |
mvrhel_laptop | Robin_Watts: ok. well there may be some allocations by the main thread setting up the icc manager etc | 17:16.34 |
Robin_Watts | ray_laptop: Hi. I've sent a couple of mails. | 17:16.39 |
| mvrhel_laptop: And indeed, I see those. | 17:16.53 |
ray_laptop | Robin_Watts: right. I've seen them. | 17:16.59 |
mvrhel_laptop | but that manager is not used during the band playback | 17:17.21 |
Robin_Watts | mvrhel_laptop: If you look in http://ghostscript.com/~robin/log.txt .... | 17:17.34 |
ray_laptop | Robin_Watts: the problem is that the 'sandboxing' as you call it, of lcms2 doesn't allow for different allocators -- it uses our allocator, but in some functions just pulls it from a previous structure. | 17:18.03 |
Robin_Watts | You'll see where we call 'Wrapping base allocator %x for thread %d - got %x' | 17:18.10 |
mvrhel_laptop | yes | 17:18.36 |
Robin_Watts | ray_laptop: No, you've missed the point, I think. | 17:18.40 |
ray_laptop | if the 'src' structure was created during the main thread, it will have a non-thread-safe chunk allocator (on PCL) | 17:18.41 |
| and a rendering thread is supposed to use the thread's allocator | 17:19.15 |
Robin_Watts | ray_laptop: It is true that links and profiles have an allocator pickled into them. | 17:19.28 |
| But mvrhel_laptop designed the system so that all the links/profiles used by a given thread should have the same allocator. And that allocator is supposed to be specific to that thread. | 17:20.28 |
| hence the fact that that allocator isn't a general thread safe one is NOT a problem. | 17:20.49 |
ray_laptop | Robin_Watts: without serious forking of lcms2 to always pass a ContextID through the call stack, we can't keep it from using the non-threadsafe allocator | 17:21.25 |
Robin_Watts | The thing we need to explain is HOW we ended up with 2 cms objects (links/profiles/whatever) that have different allocators pickled into them. | 17:21.27 |
mvrhel_laptop | that is a puzzle | 17:21.44 |
Robin_Watts | Either ray_laptop or I are missing something here. | 17:22.27 |
ray_laptop | Robin_Watts: an object that is created during the main thread is being referenced during the rendering -- that is not uncommon. | 17:22.30 |
Robin_Watts | ray_laptop: It absolutely should be uncommon! | 17:22.47 |
| How can a cms object that is created during the main thread be getting through to the rendering thread? | 17:23.06 |
ray_laptop | Robin_Watts: the device profile _is_ created during the main thread and used by all rendering threads | 17:23.31 |
| to form link profiles | 17:23.36 |
Robin_Watts | ray_laptop: Not AIUI. | 17:23.44 |
| mvrhel_laptop: would you care to jump in here? Have I got that wrong? | 17:23.57 |
ray_laptop | mvrhel_laptop: I thought you agreed with that point | 17:23.58 |
mvrhel_laptop | let me look at how the device profiles are shared amongst the threads | 17:24.18 |
| I forget | 17:24.27 |
Robin_Watts | My understanding is that the device profiles are recreated for each rendering thread. | 17:24.55 |
kens | Night folks, enjoy MT :-) | 17:25.10 |
ray_laptop | I have the debugger frozen in the 'gotcha' state. I see gsicc_get_link_profile. How can I tell what profiles it is trying to use to form the link profile ? | 17:25.26 |
| it has an gs_input_profile and a gs_output_profile, it is being called from gx_concretize_ICC | 17:28.33 |
mvrhel_laptop | ray_laptop: it depends. are either of these items NULL? | 17:29.09 |
| so in gxclthrd.c | 17:29.58 |
| line 150 | 17:30.01 |
| I see that if CMM_THREAD_SAFE is set | 17:30.12 |
| we just use the same device profiles across all the threads | 17:30.30 |
| and do reference counting | 17:30.36 |
ray_laptop | mvrhel_laptop: correct | 17:30.53 |
mvrhel_laptop | but what happens otherwise.... | 17:31.07 |
| is not so clear reading this. | 17:32.04 |
| I would need to run it and see | 17:32.14 |
Robin_Watts | to be clear, at the moment CMM_THREAD_SAFE is NOT set, right? | 17:33.24 |
mvrhel_laptop | right | 17:33.29 |
| so we don't do that now | 17:33.35 |
ray_laptop | I will verify that | 17:33.42 |
| the code at 151 (and 152) is skipped over in the debugger | 17:34.29 |
mvrhel_laptop | ok. hold on | 17:37.52 |
| ok. | 17:39.00 |
| so the ndev gets its own icc structure | 17:39.08 |
| during put_params | 17:39.41 |
| there is a separate allocation | 17:39.51 |
Robin_Watts | mvrhel_laptop: Right. That matches up with what I see with my debugging | 17:40.01 |
mvrhel_laptop | during line 154 | 17:40.04 |
| if it had previously been set in line 151/152 then it would not have done this | 17:40.34 |
| so from this viewpoint, each thread should have source and destination profiles only allocated from its thread | 17:41.19 |
Robin_Watts | I see us 'wrap' the allocator, then use that allocator within the main thread to do the 4 allocations required within the gscms_get_profile_handle_mem call. Then I see the rendering thread started. | 17:41.32 |
ray_laptop | mvrhel_laptop: the copydevice that created ndev gets thread_memory | 17:42.03 |
Robin_Watts | Then I see NO LCMS OPERATIONS AT ALL during the running of the thread, and then I see everything being shut down. | 17:42.07 |
| and (on some runs) I see memory corruption at that point. | 17:42.34 |
mvrhel_laptop | all the source colors are probably RGB and the device is RGB so no conversions | 17:42.49 |
Robin_Watts | If you look at the particular run that I have online, it first spots a problem after band 7. | 17:43.25 |
| and I can't see lcms being used at all during any of those bands. | 17:43.53 |
| which makes me wonder if lcms is a red herring. | 17:44.04 |
mvrhel_laptop | Robin_Watts: where in your file do I see the problem? | 17:44.35 |
Robin_Watts | See the "Attempt to free block" line ? | 17:44.48 |
ray_laptop | Robin_Watts: It is DEFINITELY being used during rendering !!! | 17:44.55 |
Robin_Watts | That's Memento objecting to a block being freed that isn't in its list of allocated blocks. | 17:45.27 |
ray_laptop | Robin_Watts: I set a bp in gsmchunk.c line 418 and line 621 | 17:45.45 |
| this detects 'conflict' use. | 17:46.12 |
Robin_Watts | ray_laptop: My debugging is not showing any use of lcms memory. I'd love for someone to tell me that I'm being dim (and why!) | 17:46.17 |
ray_laptop | Robin_Watts: what do you mean "use of lcms memory" ? | 17:47.05 |
Robin_Watts | As I described in my second mail of today, all lcms memory allocations/deallocations etc go through gs_lcms2_{malloc,free,realloc} | 17:47.59 |
ray_laptop | The call stack I put in the bug shows TWO DIFFERENT ContextID -- the 0x130b320 is the main thread's allocator. Farther up in the stack you see the ContextID for the threa | 17:48.31 |
| d | 17:48.34 |
Robin_Watts | so I have code in there that 1) prints the current threadID and the current memory pointer. | 17:48.57 |
ray_laptop | clearly the rendering thread is allocating objects in order to build a link profile for the rendering thread. | 17:49.18 |
Robin_Watts | and 2) (just in case some prints aren't making it out), I also assert that the current threadID is the same as the MAIN thread id. | 17:49.37 |
ray_laptop | Robin_Watts: printing is problematic since it drastically alters the thread flow | 17:49.44 |
Robin_Watts | ray_laptop: Right, and if I was getting clean runs with the prints in we might think that that was the reason. | 17:50.21 |
| but as it is, I am seeing the same memory corruption you are, without LCMS ever making any allocations other than on the main thread. | 17:50.49 |
ray_laptop | Robin_Watts: if you set breakpoints where I mentioned (line 418 and 621) you can catch it failing | 17:50.58 |
| note that the conflicting use of the main thread's chunk allocator is random -- not at all repeatable between debug runs. Nothing like "always in band 7" | 17:52.18 |
mvrhel_laptop | that is weird. how can it be random? | 17:52.42 |
Robin_Watts | ray_laptop: Right. I could abandon all my logging and go back to vanilla code, and catch it failing using your breakpoints. Doing that would not help explain how I am seeing memory corruption prior to lcms2 being involved though. | 17:53.01 |
| (My working theory is that if I managed to run through more bands, I might hit some where lcms2 usage becomes an issue.) | 17:54.23 |
ray_laptop | Robin_Watts: did you describe above about what you are seeing as 'memory corruption' ? | 17:55.17 |
| Robin_Watts: i.e., what are you testing for ? | 17:55.35 |
Robin_Watts | ray_laptop: I'm watching for either Memento giving me an error, or the chunk memory manager complaining of leaks. | 17:56.03 |
ray_laptop | I am NOT worried about the leaks (at this time). | 17:56.22 |
| what Memento error ? | 17:56.30 |
Robin_Watts | (or Windows complaining with a dialogue box that the heap has been corrupted) | 17:56.30 |
| Memento sometimes complains that someone is trying to free a block that hasn't previously been allocated. | 17:56.55 |
ray_laptop | Robin_Watts: I am on linux since the addresses remain consistent across runs it makes it easier to identify allocators | 17:57.13 |
Robin_Watts | basically I'm seeing the kind of things you'd expect to see if we have thread unsafe access to an allocator. | 17:57.39 |
ray_laptop | Robin_Watts: only with NumRenderingThreads > 0 ? And which allocator and which thread ? | 17:57.56 |
Robin_Watts | Only with NumRenderingThreads > 0. | 17:58.11 |
ray_laptop | Robin_Watts: and this always happens in band 7 ??? | 17:58.33 |
Robin_Watts | No. | 17:58.40 |
ray_laptop | oh, OK. | 17:58.47 |
Robin_Watts | It happens in various different places. In the particular run I have online, it happens after band 7. | 17:59.08 |
| but in all cases, I can see the corruption without having ever seen lcms2 allocations on anything other than the main thread. | 17:59.38 |
ray_laptop | Robin_Watts: OK. And this 'corruption' is always attempt to free a block that wasn't allocated, or does it differ ? | 18:00.46 |
Robin_Watts | It differs, as you'd imagine. | 18:00.57 |
ray_laptop | let's agree to ignore leaks for now | 18:01.02 |
| Robin_Watts: do you ever get a bp at line 418 or 621 ? | 18:01.21 |
Robin_Watts | Urm... I thought that the chunk memory manager complaining of a 'leak' was an indication that someone had tried to use a chunk after it was supposedly finished with? | 18:01.46 |
| i.e. they aren't leaks, they are indications of thread unsafe usage ? | 18:02.07 |
| 418 and 621 of what file? | 18:02.42 |
ray_laptop | Robin_Watts: no, it can just be things that were allocated by a thread that are still allocated when the thread shuts down | 18:03.02 |
Robin_Watts | I have additional debugging in various files. | 18:03.07 |
ray_laptop | base/gsmchunk.c | 18:03.10 |
| they are dprintf statements | 18:03.25 |
Robin_Watts | Right. The "not idle" ones. | 18:03.49 |
ray_laptop | dmprintf1(mem, "*** chunk_obj_alloc: this memory allocator is not idle, used for: %s\n", | 18:04.08 |
| right | 18:04.11 |
Robin_Watts | I get a crash before ever seeing those trigger. | 18:04.28 |
ray_laptop | I've never seen that except with the main thread allocator, and when a rendering thread is doing an allocation, or has just done one (the rendering thread may have gone past the allocation by the time I bp) | 18:05.23 |
Robin_Watts | ray_laptop: But, sorry, probably I should shut up at this point, because you/mvrhel were making progress. | 18:05.24 |
mvrhel_laptop | I moved on.... :) | 18:05.46 |
ray_laptop | were we ? | 18:05.48 |
mvrhel_laptop | I figured you two had this figured out | 18:05.52 |
Robin_Watts | I figure there are 2 possibilities here. | 18:06.20 |
ray_laptop | I don't understand if there is a way to get a rendering thread to use the correct allocator. | 18:06.21 |
| in the lcms2, that is | 18:06.40 |
| it seems not to have the thread's ContextID down in some of the routines that are allocating using ContextID from a hProfile | 18:07.34 |
Robin_Watts | Option 1) I have a flaw in my debugging. In this case, it seems likely that we'll make progress by examining the stuff that ray is hitting. | 18:07.39 |
| The problem in this case would be that somehow we are mixing thread and non-thread cms objects. | 18:08.03 |
ray_laptop | Robin_Watts: right, and I don't know the lcms2 code enough to know which it's using (i.e. is it due to the device profile ?) | 18:09.04 |
Robin_Watts | Option 2) My debugging is indicative that we're getting problems with lcms2 not even doing anything on the rendering threads, and so maybe lcms2 is being run over by something here. | 18:09.07 |
| So, let's work with option 1 for now, as mvrhel_laptop and you seemed to be making progress on that. | 18:09.26 |
mvrhel_laptop | I haven't done anything yet | 18:10.02 |
| but I will start looking at this if you want | 18:10.15 |
Robin_Watts | ray_laptop: Part of the reason I started to add my printfs was so that I could see WHICH lcms objects were created in which threads in the hopes I could see where the wrong one was used in a different thread. | 18:10.49 |
| ray_laptop: So do you have a failing case caught in your debugger? | 18:11.55 |
ray_laptop | mvrhel_laptop: if you can help determine if it is possible to get a thread to always use the right allocator, that would help | 18:12.19 |
| Robin_Watts: yes, but this is not the same as the stack I posted. | 18:12.44 |
| in this case the rendering thread has advanced past the allocation | 18:13.17 |
Robin_Watts | ray_laptop: If we ensure that we don't mix objects from different threads then you will always have them using the right allocator, I believe. | 18:13.42 |
ray_laptop | but I can re-run to get something similar to what's posted. In this case the "DefaultICCintents" passing an hProfile in | 18:14.15 |
Robin_Watts | Let's work with what you have at the moment. | 18:14.39 |
| Can you paste the callstack somewhere? | 18:14.47 |
ray_laptop | Robin_Watts: actually, let me re-run, then I'll paste the stack... | 18:15.35 |
Robin_Watts | ok. | 18:15.42 |
ray_laptop | OK, now I have one where the bp was detected in a rendering thread, chunk_obj_alloc with cname="lcms" | 18:16.46 |
| once again we are down below DefaultICCintents | 18:17.34 |
Robin_Watts | Can you pastebin the callstack so we have something to refer to ? | 18:18.15 |
ray_laptop | Robin_Watts: doing that now... just a sec | 18:20.01 |
rayjj | http://pastebin.com/fS0K0jzy | 18:20.26 |
Robin_Watts | right, so 0x1aa6bb0 is the 'correct' memory pointer for the thread. | 18:21.39 |
ray_laptop | so in this case it is doing cmsPipelineAlloc with the wrong allocator (the main threads' is 0x130b320) | 18:22.03 |
rayjj | The BuildRGBOutputMatrixShaper is getting the ContextID from cmsGetProfileContextID(hProfile) | 18:23.12 |
Robin_Watts | OK, so the profile that has the wrong ContextID is 0x135c2b8 | 18:23.39 |
ray_laptop | correc | 18:23.58 |
Robin_Watts | i.e. lcms_deshandle | 18:24.04 |
| mvrhel_laptop: Are you with us? | 18:24.20 |
mvrhel_laptop | I am trying to follow | 18:24.28 |
Robin_Watts | This is where I was hoping to be in the position of looking back up my logging to see what thread 0x135c2b8 was created in :( | 18:25.04 |
mvrhel_laptop | to be honest, I am a little puzzled why we would be having conversions occurring in pcl | 18:25.28 |
| I am going to grab the file and run it my self | 18:25.40 |
ray_laptop | mvrhel_laptop: I can examine the hProfile. What do I look at to see where this came from (or what it is) ? | 18:25.55 |
rayjj | ColorSpace is cmsSigRgbData | 18:26.54 |
mvrhel_laptop | hProfile is an lcms object | 18:27.09 |
rayjj | right, I cast it to (_cmsICCPROFILE *) in order to look at it | 18:27.36 |
mvrhel_laptop | I don't know. The only place that I could see where a profile could come from and have an issue like this is if it came from the interpreter itself | 18:28.12 |
| but this is during playback of the band yes? | 18:28.27 |
| is the the source or the destination profile ? | 18:28.41 |
rayjj | can I tell if it is a device profile or something else ? | 18:28.44 |
Robin_Watts | lcms_deshandle | 18:28.54 |
rayjj | mvrhel_laptop: how do I tell if it is the src or dest | 18:29.01 |
mvrhel_laptop | well do you have the stack that called in | 18:29.02 |
Robin_Watts | http://pastebin.com/fS0K0jzy | 18:29.15 |
mvrhel_laptop | uh | 18:29.38 |
Robin_Watts | See line 30. | 18:29.39 |
| lcms_deshandle=0x135c2b8 | 18:29.57 |
rayjj | in this case it is from gx_concretize_ICC which was called from gx_concretize_Indexed | 18:30.03 |
mvrhel_laptop | what size is the source handle? | 18:30.22 |
| that is what color space | 18:30.27 |
rayjj | mvrhel_laptop: so is deshandle mean the destination ? | 18:30.28 |
mvrhel_laptop | yes | 18:30.32 |
| srchandle means source | 18:30.40 |
| what color space is lcms_srchandle in? | 18:30.54 |
rayjj | mvrhel_laptop: since it is concretize_ICC I assume the src is an ICC profile | 18:31.06 |
mvrhel_laptop | yes. | 18:31.25 |
| can you look at what colorspace lcms_srchandle is | 18:31.39 |
| I am curious | 18:31.42 |
| well if lcms_deshandle is the problem, that is going to be the device profile | 18:32.23 |
| Can you look at ndev->icc_struct | 18:32.43 |
rayjj | mvrhel_laptop: the lcms_srchandle has ColorSpace = cmsSigRgbData also (but has the ContextID correct for the thread) | 18:34.10 |
mvrhel_laptop | and in particulare device_profile[0] | 18:34.13 |
| rayjj: can you look at dev->icc_struct->device_profile[0]? | 18:34.37 |
rayjj | mvrhel_laptop: OK. navigating up to where I have the dev | 18:34.45 |
mvrhel_laptop | weird that it is trying to do an RGB to RGB conversion in PCL | 18:35.19 |
| actually, it is probably going to ignore and not do the transform | 18:35.40 |
rayjj | mvrhel_laptop: the device is image24 and the dev->icc_struct == 0 | 18:35.56 |
mvrhel_laptop | oh you need to drill down | 18:36.11 |
rayjj | drill down ??? | 18:36.24 |
mvrhel_laptop | hmm | 18:36.24 |
| I mean get to the real device | 18:36.38 |
| let me get the file and run | 18:36.58 |
rayjj | mvrhel_laptop: the file is in tests/pcl/thesis10.pcl | 18:37.18 |
mvrhel_laptop | where did the desprofile come from in the concretize_ICC code | 18:37.42 |
| that should have come from the device | 18:37.48 |
rayjj | looking at gsicc_get_link. The pcs_in has the cmm_icc_profile_data with the lcms_srchandle (i.e 0x24e8338) | 18:40.23 |
| mvrhel_laptop: in gsicc_get_link, the dev_profile has memory=0x130b320 and device_profile[0]=0x1360438 | 18:43.34 |
mvrhel_laptop | ok and those numbers mean...... | 18:44.24 |
rayjj | device_profile[0].name = "default_rgb.icc" | 18:44.33 |
mvrhel_laptop | are they allocated how you would expect them to be? | 18:44.52 |
rayjj | yes, for the device's profile. So the problem is that the 'link' logic is using the memory allocator from this profile to allocate from within a thread | 18:45.59 |
mvrhel_laptop | ok you just confused me | 18:46.38 |
| you said its ok and then its not ok | 18:46.45 |
rayjj | and as I said before, the only thing I can see to do is to have ALL profiles be created with memory->non_gc_memory->thread_safe_memory | 18:46.48 |
| mvrhel_laptop: it is expected that the device profile will have been created by the main thread, right ? | 18:47.31 |
Robin_Watts | rayjj: NO. | 18:47.38 |
mvrhel_laptop | each thread has its own allocation of its profile | 18:47.55 |
Robin_Watts | It is expected that the device profile will have been created by the rendering thread. | 18:47.59 |
mvrhel_laptop | I thought we just went through that | 18:48.01 |
| during the put params | 18:48.26 |
rayjj | I thought the device profile is shared | 18:48.26 |
mvrhel_laptop | hehe | 18:48.30 |
Robin_Watts | ALL the profiles used withing a rendering thread should have been created by the rendering thread. | 18:48.40 |
mvrhel_laptop | only if CMM_THREAD_SAFE is set | 18:48.57 |
| which it is not | 18:48.59 |
| with CMM_THREAD_SAFE we share the profiles amongst the thread devices | 18:49.24 |
| and ref count | 18:49.27 |
| otherwise, they each do their own allocation | 18:49.35 |
| perhaps something odd is occurring in pcl | 18:49.55 |
rayjj | This profile has a rc.count = 1 (but the rc.memory and the device->memory are that of the MAIN THREAD. So i need to look at this 'image24' device's allocation | 18:51.17 |
mvrhel_laptop | that is odd. I would have thought your device would be one of the thread devices | 18:52.10 |
rayjj | the thread has cldev = 0x24e2938, but it renders into a 'buffer device' where the bits are rendered. The image24 buffer device is 0x24e2938 | 18:54.33 |
| oops. the buffer device is 0x24e68b8 | 18:54.53 |
| (cut and paste error) | 18:54.59 |
Robin_Watts | mvrhel_laptop: So where do the thread specific device profiles live? new_dev->icc_struct ? | 18:56.05 |
mvrhel_laptop | ndev->icc_struct | 18:56.26 |
| ndev->icc_struct->device_profile[0] | 18:56.36 |
rayjj | I need to look at where the buffer device is allocated and set up. I'll see if that is the issue... | 18:57.30 |
mvrhel_laptop | yes | 18:58.03 |
Robin_Watts | dev->icc_struct = gsicc_new_device_profile_array(dev->memory); | 18:59.23 |
| and dev->memory is the wrapped base allocator (i.e. the one to be used for the thread). | 19:00.01 |
| so that looks right to me. | 19:00.03 |
mvrhel_laptop | Robin_Watts: so that is the structure that holds a bunch of the profiles | 19:00.40 |
Robin_Watts | Right. | 19:00.47 |
mvrhel_laptop | the profile itself is then allocated during the put params | 19:00.54 |
Robin_Watts | And the device_profile[0] is non NULL here. | 19:01.06 |
mvrhel_laptop | since put params will find that the profile is missing | 19:01.08 |
| oh then it has been set | 19:01.15 |
Robin_Watts | right. I'm in put params now. | 19:01.22 |
mvrhel_laptop | oh ok | 19:01.27 |
Robin_Watts | and memory looks fine in that. | 19:01.46 |
mvrhel_laptop | I am trying to see when we have a call with image24 to the link code | 19:01.58 |
| I just get ppmraw | 19:02.19 |
| let me check for gx_concretize_Indexed | 19:02.44 |
Robin_Watts | Thread->bdev ends up being an image24 thing. | 19:05.33 |
ray_laptop | right and gdev_create_buf_device is called with thread->memory | 19:06.21 |
| oh, I bet I know what's going on :-( | 19:09.13 |
mvrhel_laptop | I am relieved someone does | 19:09.33 |
Robin_Watts | <FX: Drum roll...> | 19:09.35 |
ray_laptop | there are multiple buffer devices. One for the main thread and one for each rendering thread | 19:09.45 |
mvrhel_laptop | shoot I need to head out | 19:10.10 |
| need to pick up my son early from school | 19:10.21 |
ray_laptop | mvrhel_laptop: nope -- still looking... | 19:10.40 |
Robin_Watts | can continue to play cardboard programmer for ray :) | 19:10.49 |
mvrhel_laptop | i will be back... | 19:10.49 |
Robin_Watts | OK, I've backed all my debugging out, and just put a line in to try to catch the call to cmsCreateTransformTHR where the input and/or output profiles stored ContextID differs from the supplied on. | 19:21.45 |
| s/on/one/ | 19:21.49 |
| ray_laptop: With all my debug taken out, except for a check for the mismatching ContextID, I still get memory corruption before I hit my check. | 19:27.43 |
| Ok, if I change to using debug rather than memento, I can get it to trigger. | 19:32.25 |
rayjj | Robin_Watts: the check for the ContextID ? | 19:34.37 |
Robin_Watts | right. | 19:34.45 |
rayjj | so is it the Input that has the main thread's allocator ? | 19:37.27 |
| and what's your callstack | 19:37.46 |
Robin_Watts | Rerunning now. | 19:38.16 |
| I thought it was the Output (the "des") that had the main threads allocator. | 19:38.33 |
rayjj | are you on Windoze or linux ? | 19:38.53 |
Robin_Watts | doze. | 19:39.01 |
| now it's refusing to fail. | 19:39.46 |
| ok, I have a failure in the debugger. | 19:42.07 |
rayjj | which failure ? | 19:42.26 |
Robin_Watts | Output->ContextID is the wrong one. | 19:42.27 |
| a call to cmsCreateTransformTHR with ContextID != Output->ContextID | 19:42.55 |
rayjj | Robin_Watts: that should be OK, shouldn't it -- isn't it going to fill in the profiles ? Isn't that just left over data ? | 19:43.42 |
Robin_Watts | No, cmsCreateTransformTHR takes 2 profiles and outputs a link. | 19:44.20 |
rayjj | Oh, I didn't read the code right | 19:44.47 |
Robin_Watts | Both those transforms SHOULD be share the same memory pointer (the thread specific one) | 19:44.47 |
| This is exactly the case you've captured, I believe. | 19:45.07 |
rayjj | are you in a rendering thread ? | 19:45.08 |
Robin_Watts | I am. | 19:45.14 |
rayjj | and what is the call stack ? | 19:45.21 |
| so all this tells you is that we are trying to link two profiles and one of them was created with the wrong mem ? | 19:46.36 |
Robin_Watts | yes. | 19:46.47 |
| It confirms that I'm seeing a profile created in the main thread and then used in a rendering thread. | 19:47.38 |
rayjj | once we start rendering, we shouldn't see the ContextID for any profiles ever be the main thread's, right ? | 19:47.41 |
Robin_Watts | indeed not. | 19:47.52 |
| So in gsicc_get_link_profile, we are finding cms_output_profile from gs_output_profile->profile_handle | 19:51.04 |
rayjj | Robin_Watts: can you tell if it came from the clist ? | 19:55.41 |
Robin_Watts | We get the gs_output_profile by doing a 'get_profile' device call. | 19:56.15 |
rayjj | Robin_Watts: you don't have src_dev_link set do you ? | 19:56.20 |
Robin_Watts | I don't think so. | 19:56.31 |
| The device in question is an image24 device, which forwards to the ppmraw device. | 19:56.40 |
rayjj | what is the image24's memory value | 19:57.27 |
Robin_Watts | The image24s memory value is 3d48710 (the thread one) | 19:58.01 |
| The ppmraw device to which it forwards has a memory of 2cb1bb0 (the main threads one) | 19:58.27 |
| So that's the problem. | 19:58.33 |
| The buffered memory device forwards to the main device, not the thread specific copy. | 19:59.02 |
| Does that make sense ? | 19:59.32 |
rayjj | Robin_Watts: buffer devices usually forward most functions to the underlying clist target device (things like encode_color, etc) | 20:00.29 |
Robin_Watts | right, but we copy the device for each thread, right? | 20:01.14 |
rayjj | but forwarding 'procs' s/b OK. The dev->memory is still correct for any allocations | 20:01.17 |
| we copy the device, then 'fill_in_procs' | 20:01.31 |
Robin_Watts | rayjj: The problem is that when we ask the image24 device for a profile, it doesn't go to the render threads device to get the profile, it goes to the main device to get the profile. | 20:02.01 |
rayjj | and the buffer device fills in procs from the thread's copy | 20:02.05 |
| OH. | 20:02.28 |
Robin_Watts | gxclthrd.c line 207 | 20:04.04 |
| if ((code = gdev_create_buf_device(cdev->buf_procs.create_buf_device, | 20:04.15 |
| &(thread->bdev), cdev->target, | 20:04.17 |
| band*crdev->page_band_height, NULL, | 20:04.19 |
| thread->memory, clist_get_band_complexity(dev,y)) < 0)) | 20:04.20 |
| Instead of cdev->target can we make that ndev ? | 20:04.57 |
rayjj | Robin_Watts: I don't see where the 'target' is used in gx_default_create_buf_device for anything other than color_info and fill_in_procs | 20:07.55 |
| and the ndev has the same procs filled in, and the same color_info, right ? | 20:08.19 |
Robin_Watts | rayjj: The image24 is the buffer device, right? | 20:08.53 |
rayjj | where do we 'ask the image24 device for a profile' ? | 20:08.59 |
Robin_Watts | We call 'get_profile' on the image24. | 20:09.05 |
rayjj | from where ? | 20:09.13 |
Robin_Watts | and that's implemented by gx_forward_get_profile (or whatever) | 20:09.29 |
| let me try and find where, hold on... | 20:09.47 |
| gsicc_cache.c line 572 | 20:10.32 |
rayjj | oops. I have to run.. A guy is waiting at my house to replace the garage door. I'll check back later | 20:10.32 |
Robin_Watts | ok. | 20:11.16 |
| I'll email mvrhel to explain where we got to. | 20:11.30 |
mvrhel_laptop | Robin_Watts: so it looks like the get_profile is grabbing the wrong profile | 20:23.30 |
Robin_Watts | sends email :) | 20:23.45 |
mvrhel_laptop | ok | 20:23.48 |
| I will readit | 20:23.52 |
Robin_Watts | mvrhel_laptop: testing a fix now :) | 20:23.53 |
mvrhel_laptop | need to eat now | 20:23.58 |
Robin_Watts | likewise I'll be heading off in a mo. | 20:24.23 |
| balls. test didn't work. Hopefully you might be able to sort it though. | 20:48.09 |
ray_laptop | Robin_Watts: what did you try ? | 20:56.26 |
Robin_Watts | ray_laptop: I tried what I mentioned in the email. | 23:57.07 |
| Forward 1 day (to 2013/01/10)>>> | |