Log of #ghostscript at irc.freenode.net.

Search:
 <<<Back 1 day (to 2018/03/27)20180328 
ray_laptop I've pushed a patch to rename lcms2art to lcms2mt to my repository. Any comments (RobinWatts and chrisl in particular)04:56.41 
kens Well GhostXPS comprehensively corrupts non-Latin filenames :-(08:28.48 
  I think because it passes them through a param list08:29.15 
  It looks like the same effect happens for PDF files too, which is more than unfortunate08:29.35 
chrisl In such cases, they should be converted to UTF8 before being handed onto the "internals"08:32.02 
kens Yeah it is converted to UTF-808:35.35 
  But I don't think its converted back again when we try to open the file. I may be mistaken08:35.49 
  Really I only just started looking at the problem.08:36.01 
chrisl Hmm, that's strange.... not sure how the device end would know the difference08:37.09 
kens Its not the device end, its the front end opening the file. Or at least, I think it is08:37.40 
  I'm at pl_cursor_open and the filename is UTF-8, about to see where this goes08:38.08 
chrisl Ah, I bet it doesn't go through the gp_* API08:38.38 
kens Very probably not, debugging it in a second.....08:38.51 
  Well, it calls gp_fopen()08:40.38 
  Oh, no it doesn't08:40.49 
chrisl And I think it's the gp_fopen that converts from UTF-8 to wchar_t08:41.34 
kens My bad, it does call gp_fopen, which returns NULL08:41.53 
  Round again to see why08:42.00 
  utf8_to_wchar() returns garbage it seems08:42.46 
chrisl Well, that's probably not good....08:43.24 
kens I'm going to have to guess that's the problem :)08:43.35 
  Its possible the UTF-8 is garbage before I get there of course, I'll need to check that out in a minute08:44.00 
  Ah, looks like utf8_to_wchar() simply isn''t dealing properly with non-Latin characters. Looks like it stops when it finds one. Odd, I'd have though GS used all this too08:44.49 
  Well it definitely isn't processing all theinput characters08:47.05 
  OK so in gp_wutf8.c, utf8_to_wchar(), line 42:08:49.58 
  if ((c & 0xC0) != 0x80)08:49.58 
  return -1;08:49.58 
  That's where the code stops08:50.04 
chrisl Huh, I rather thought gs used the same code08:51.00 
kens and returns an error, which (of course!) we don't check08:51.01 
  Yes I thought that too08:51.07 
  I'm going to run the same filename through GS in a minute and see what's different.08:51.21 
  We do detect the fact the the fopen failed, because the FILE* is NULL, so we correctly return an ioerror08:52.17 
  OK lets see what GS does08:52.28 
  OK so I'm clearly mistaken somewhere, GS does call the same code with the same result08:54.13 
  And gives me the same error as the PDF interpreter did.08:55.41 
  I suspect this may be due to me running the windowed binary, lets try the command line08:56.09 
chrisl And that code hasn't changed since 201108:56.24 
kens OK so if I run the command line code, then teh filename is different08:57.12 
  And the utf8_to_wchar works as expected08:58.25 
  So I have to beleive that the UTF8 in GXPS is in fact wrong.08:58.39 
chrisl I don't follow - command line code??08:58.59 
kens gswin32c instead of gswin3208:59.10 
chrisl Oh, that's not good08:59.26 
kens Nope :-)08:59.34 
  Running gswin32c <filename> appears to be subtly different from running gswin32 <filename>08:59.55 
  gswin32c works09:00.02 
chrisl Eek!09:00.06 
kens It 'looks like' the UTF8 when I run gswin32 is wrong the same way as when I run gxps so I think I'll go back to gxps because the code is simpler to follow09:01.07 
  Need to see where the filename is converted to UTF8 because it does appear that whatever we're seeing is not correct for some reason.09:01.36 
  Ignore me while I burble randomly.....09:01.56 
  So it 'loooks like' the filename is UTF8 when we run gswin32c and its wchar when we run gxps and gswin32c. Unsurprisingly, trying to take a wchar, treat it as UTF8 and create a wchar from it results in bad voodoo09:12.40 
  I can understnad that when we deal with filenames in the PostScript world "(....) (r) file" then the string has to be UTF8. I'm less clear on why it *isn''t* UTF8 when we run gswin32, but is when we run gswin32c. It also looks like the GXPS code needs to convert the wchar to UTF8 and its not doing that step......09:14.19 
  This is, of course, Windows-specific code, Linux already gives us UTF8 filenames09:14.40 
  Hmm, the GXPS code looks like it does convert to UTF8. So what am I missing ? :-(09:15.42 
chrisl gswin32c converts to utf809:15.51 
kens Yeah I think gswin32 doesn't though09:16.09 
  I don't know why.....09:16.14 
  The filenames are different when I look in gp_fopen()09:16.30 
  In GS we end up here from zlibsystemfile or something like that09:16.44 
chrisl I think gswin32 conversion to UTF8 maybe wrong09:17.01 
kens That could be it, I'm sort of concentrating o ngxsp at the moment09:17.23 
  Becaus eon the whole it looks simpler :)09:17.30 
  gxps definitely converts the filename to UTF8 :-(09:18.21 
chrisl It is wchar_t to start with, though?09:18.45 
kens Yes it is09:18.59 
  It seems like there's a double conversion going on somewhere, by the time I get to gp_fopen() the filename has changed again09:19.24 
  I need to find out where that is happening09:19.31 
  Trying to remember what sequence of binary is correct (out of three different ones) is taxing me this morning. I shoudl probably write them all down09:20.15 
chrisl kens: I'd look at what's happening in base/gsargs.c09:22.59 
kens Yeah I'm in there now :-(09:23.22 
  I'm thinking that;s the problem09:23.32 
  I hate param lists :-(09:24.12 
  The filename is definitely b*ggered by the time we try to process it in pl_main_process_options09:25.48 
  OOps, I shoudln't be here!09:25.54 
  Back in a couple of hours09:25.59 
Robin_Watts ray_laptop: (For the logs) Commit looks good.10:37.07 
  ray_laptop: (For the logs) We need to update the WhyThisFork.txt file to mention that we changed cmsChangeBuffersFormat.10:37.41 
kens Anyone know if the JPEG devices are actually capable of writing to stdout ?12:49.57 
  If not I'll try and find out in a bit.12:50.19 
  Hmm, well a quick look doesn't seem like it should be a problem. Oh well, another problem to look at.12:53.01 
  OK so the problem with the Cyrillic filename is (as Chris suggested) that in gsargs.c, arg_next() we do :12:57.42 
  i += codepoint_to_utf8(&cstr[i], c);12:57.42 
  Since the filename argument has laready been converted from wchar to UTF8, this pretty much corrupts it I think.12:57.42 
Robin_Watts codepoint_to_utf8 should do nothing in the case where the arg encoding is UTF8.13:03.17 
kens How does it know ?13:03.28 
Robin_Watts hmm. ignore me maybe.13:04.38 
  Does this only go wrong with gxps builds?13:04.47 
kens It definitely does change the content13:04.49 
  At the moment I can say it goes wrong with gxps and gswin32 on Windows, it does not go wrong with gswin32c on Windows (no idea why yet), I have not yet tried gpcl13:05.24 
Robin_Watts Are gswin32 and gswin32c entered differently?13:06.02 
kens The symptom for gswin32 and gxps are the same, the filename is corrupted, its the same filename in each case, corrutped the same way, and it appears to be due to applying UTF8 encoding twice13:06.02 
  I don't know yet, I decided to concentrate o gxps because the code is easier to follow. Ghostscritp gets the arguments from PostScript....13:06.29 
  Don't feel you have to respond to me BTW I'm just thinking out loud here.13:06.50 
chrisl It'll go wrong with gpcl6, too because it uses the same front end as gxps13:06.56 
Robin_Watts yeah, WinMain vs wmain13:07.01 
kens Yes I expected that, but since I hadn't tried it, I didn't like to say13:07.09 
  My suspicion woudl be that gswin32c doesn't UTF8 encode the filename twice :-)13:07.34 
  Let me puddle onwards....13:07.43 
  Looks like its not that simple :-(13:08.49 
  If I don't do the doube encoding I stil get the wrong answer out of utf8_to_wchar13:09.10 
Robin_Watts I suspect new_main in dwmain.c should be setting the arg encoding to UTF8.13:09.53 
kens Sorry that's a bit of code I'm not looking at right now. I didn't even know we could state the arg encoding13:10.21 
Robin_Watts WinMain (and wmain) are both called with wchar_t's.13:10.41 
kens Right13:10.50 
Robin_Watts Both convert the wchar_t's into utf8.13:10.53 
kens Yeah same as the gxps code13:11.01 
Robin_Watts That then gets fed into the gsapi_init_with_args stuff.13:11.09 
kens gp_fopen converts the UTF8 back to wchar13:11.17 
Robin_Watts kens: as it should.13:11.25 
kens and uses Windows calls to open the file13:11.30 
  Yes, when the name is correctly UTF9 encoded, this works (gswin32c)13:11.42 
Robin_Watts the difference, I think between the WinMain and wmain routes is that wmain calls main_utf8, which calls gsdll.set_arg_encoding(instance, GS_ARG_ENCODING_UTF8);13:12.13 
  and the new_main route does not call that.13:12.24 
  That call should happen before the init_with_args call.13:12.33 
kens Well that sort of sounds like it13:12.40 
Robin_Watts see dwmain.c line 68013:12.53 
kens Though I don't think that will hel pwith gxps13:12.55 
Robin_Watts see dwmainc.c line 68013:13.00 
  I bet I forgot to call it in the gxps case too.13:13.13 
kens Possibly, but gsargs doesn't seem to cater for it. Unless possibly the copdepoin_to_utf8 does ?13:13.51 
  I haven't actually looked in that routine13:13.59 
Robin_Watts kens: That code is an utter mindfuck.13:14.15 
kens Its certainly challenging me this morning :-)13:14.28 
Robin_Watts Take if from me that if the code is passing in UTF8, and hasn't set the arg encoding to UTF8 that routine will go wrong.13:14.56 
kens Hmm, I see the call to gsdll.set_arg_encoding13:14.56 
  Yeah it does go wrong that's for certain :-D13:15.07 
  I'll poke this with a shaper stick for a bit, thanks13:15.17 
Robin_Watts Good luck, and thanks.13:15.25 
kens Ah Henry came to the same conclusion about the PCL problem as I did, the enmuerated value is totally weird.13:22.54 
  Afternoon HenryStiles I notice you concluded the same as I did about that PCL report. That's a very peculiar value13:26.13 
  I checked the latest PXL spec I could find and it only has the same three values13:26.28 
HenryStiles kens: yeah I really hate putting exceptions like that in the interpreter, but my printer works.13:27.20 
kens Oh, I never thought to try my printer, I guess I shoul dhave13:27.34 
  Do you think HP ignores it ?13:28.03 
HenryStiles seems to do that13:28.15 
kens Oh, well I suppose that's understandable at least13:28.29 
chrisl That seems like the kind of PCL5 crap they were trying to get away from with PXL - ignore crap input, and carry on regardless :-(13:29.01 
kens Hmm so we set (gs_main_instance *) minst->get_codepoint based on the type of argument encoding. Which is fine. Presumably we use that somewhere to 'decode' the UTF8 and because we set it to NULL, it simply doesn;'t so anything. So at least the first part of the problem is that the code in gsargs doesn't do that, its just calls codepoint_to_utf813:30.38 
HenryStiles chrisl: yeah for the most part XL does a good job with that, but there are exceptions.13:30.48 
kens Oooh, but some code in gsargs *does* use get-codepoint.13:31.41 
HenryStiles chrisl, kens : I thought Robin_Watts fixed all the filename argument stuff sometime ago.13:31.51 
kens Yeah it seems not13:31.58 
Robin_Watts kens: Is the arg encoding being set to utf8 ?13:32.10 
kens Robin_Watts : not in gxps, but at the moment the gxps code isn't calling anything which uses minst->get_codepoint either13:32.40 
  I was just rambling through the gswin32c code seeing what setting the encoding actually did.13:33.03 
Robin_Watts kens: Ok. I would strongly suggest that you make it be set to utf8 before you do anything else, otherwise it will never work.13:33.14 
kens Yes, one of the reasons for doing this was exactly to see how to do it :-)13:33.41 
Robin_Watts I'm perfectly prepared to believe there are cases in there that are still wrong.13:33.46 
kens Its sort of weird certainly13:33.57 
  Its definitly a complicated area13:34.12 
Robin_Watts The code is an utter mindfuck. it has to cope with args being decoded, then pushed back into the arg list, so you mustn't decode them a second time.13:34.34 
kens plmain.c seems to set gp_local_arg_encoding() which is probably not correct :-)13:34.47 
  But it looks like gxps doesn't set it at all13:35.10 
chrisl There's a coffee out there with my name on it - I'm going to track it down.......13:35.46 
kens Aha, I see the get_codepoint call13:35.52 
Robin_Watts I am being defeated by simple ****ing arithmetic at the moment.13:36.32 
kens Ah, its set by arg_init. Right, finally feel I'm making some forward progress instead of floundering13:36.34 
  Arithmetic is always a trial for real mathematicians :-)13:36.58 
  OK looks like getting the encoding correct will do the job, manually hacking it seems to work. Thanks for the hint Robin_Watts13:46.52 
Robin_Watts kens: No worries.13:47.00 
kens Robin_Watts : car to cast a quick eye over:14:25.46 
  http://git.ghostscript.com/?p=user/ken/ghostpdl.git;a=commit;h=33701ac07115cb2f634a40ab73a5c127ca870ad814:25.46 
  Particularly the change in plmain.c. I think dwmain.c is non-controversial because its a copy of what's in dwmainc.c14:25.46 
Robin_Watts I can't really comment on plmain.c without some digging, cos I have no idea what the code there does.14:31.04 
kens Oh it sets arg_codepoint, which is broadly the same as the set_encoding done in dwminc.c14:31.43 
Robin_Watts Do gxps etc now set the arg encoding with this change?14:31.43 
kens It doesn't set the encoding as such, but arg_init gets set up with the correct 'encoding'14:32.10 
Robin_Watts right.14:32.15 
kens Normally that is set from minst->encoding (or something liek that0 which is set by set_encoding14:32.26 
  This has the same effect overall14:32.33 
Robin_Watts It looks to me like arg_get_codepoint will now be set to NULL in almost all cases.14:32.54 
kens Yes14:33.05 
Robin_Watts cos we "never" build with GS_NO_UTF814:33.11 
kens You would have to set GS_NO_UTF8 to get it non-NULL14:33.17 
Robin_Watts right. Well, if this works, I'm happy.14:33.27 
kens It definitely works for me, and it fixes gxps and gpcl at the same time14:33.42 
  OK so I'll commit it. We may need further work, but its defintiely better than it was.14:33.57 
  I guess this is something we may also need to address with language switching. I don't htink the current PL approach is going to be sufficient, its too primitive14:34.31 
Robin_Watts mvrhel_laptop: ping.14:38.40 
kens OK on to JPEG output to stdout.14:40.00 
  Coffee first I fancy14:40.06 
  I see you scared Michael off Robin_Watts14:40.18 
Robin_Watts Indeed.14:40.33 
  mvrhel_laptop: ping ?15:01.02 
mvrhel_laptop Morning Robin_Watts 15:01.12 
Robin_Watts morning.15:01.16 
  Quick question about the thresholding stuff.15:01.32 
mvrhel_laptop ok15:01.53 
Robin_Watts You apply a sign fix to both contone and screen before doing the subs.15:02.03 
mvrhel_laptop ok15:02.13 
Robin_Watts Is there a reason why you don't preapply that to the screen data?15:02.18 
mvrhel_laptop no15:02.29 
Robin_Watts good answer :)15:02.44 
  Also, did we try doing the bit reversal in SSE?15:03.34 
mvrhel_laptop I thought you or Steve did15:03.45 
Robin_Watts By using the shuffle instruction before we movemask?15:03.49 
mvrhel_laptop and it was slower15:03.53 
  but I dont know15:03.56 
  for sure15:03.59 
Robin_Watts I have this memory of trying it, but I'm trying to think how it could have been slower! :)15:04.09 
ray_laptop Robin_Watts: Thanks for the comments on the lcms2mt rename15:13.45 
  Robin_Watts: oh, and I haven't yet gotten rid of cmsChangeBuffersFormat -- I've added the comment about "replacing" it with the Clone function in the "Why" doc. I'll get rid of the Change function and push an amended change15:23.32 
  later, however. I have an errand to run.15:23.54 
Robin_Watts ray_laptop: Yeah, but we should document the new function in the "Why" doc, and say why it's a problem.15:24.46 
  I am happy to do that after you put your commit in though if you'd rather.15:25.28 
ray_laptop Robin_Watts: (part of) the explanation I had was "replace cmsChangeBuffersFormat which cannot be used to alter a link profile that might be in use by another thread." It doesn't explain "why" it can't be used by another thread. Should I state that as well?16:27.57 
Robin_Watts ray_laptop: I'd say something like "The goal of this work is to allow transforms to be used in multiple threads at the same time. Sadly cmsChangeBuffersFormat conflicts with this aim. Accordingly, we've replaced it with cmsCloneTransformWithChangedFormats (or whatever it was), which allows safe operation with (almost) as low a cost."16:30.19 
ray_laptop Robin_Watts: mvrhel_laptop: I've moved the relevant part of cmsChangeBuffersFormat into CreateNamedColorDevicelink where it was used.16:51.59 
mvrhel_laptop ray_laptop: sounds ood16:52.10 
ray_laptop and gotten rid of the former16:52.12 
mvrhel_laptop and even good16:52.14 
Robin_Watts cool.16:52.16 
mvrhel_laptop Robin_Watts, and ray_laptop: ok another version now on casper17:24.13 
Robin_Watts looking.17:24.21 
mvrhel_laptop hopefully we are pretty much there now17:24.26 
ray_laptop Robin_Watts: I didn't use your wording exactly, but I added more information that discusses the problem and reasoning for the change.17:27.00 
  mvrhel_laptop: looking...17:27.06 
Robin_Watts cool.17:27.09 
ray_laptop mvrhel_laptop: well, actually still downloading.17:28.25 
mvrhel_laptop Yeah I need to replace that pdf of the images with just a single image17:29.08 
  I did that with the altona files17:29.23 
ray_laptop mvrhel_laptop: oh, yeah, that might help. It doesn't need 95M pixels :-)17:29.44 
mvrhel_laptop including the v2 altona one natively really made it large (and slow)17:29.46 
Robin_Watts So, what name did we settle on? :)17:31.38 
  Introduction, paragraph 3, "with out"17:32.49 
mvrhel_laptop name17:33.06 
  ?17:33.07 
Robin_Watts lcms-mt or lcms2-mt ?17:33.21 
  There was discussion of this yesterday on here.17:33.34 
mvrhel_laptop ok. Not that I was part of17:33.43 
  :)17:33.45 
  I am fine with the code having an lcms2 designation if that is what lcms has17:34.08 
  ie. lcms2 become lcms2mt17:34.20 
Robin_Watts "This severely limits performance capabilities of applications..." "performance capabilities" ? not just "performance" ?17:34.24 
mvrhel_laptop Robin_Watts: I would prefer if you would write up your comments in an email if you could17:35.11 
  it would make it easier for me17:35.15 
Robin_Watts Sure.17:35.19 
  On the website he calls it "LittleCMS"17:35.30 
mvrhel_laptop yes17:35.33 
Robin_Watts but all the downloads etc are "lcms2-2.9", and the directories are "lcms2-2.9" etc.17:35.51 
  so either we should be "LittleCMS-MT" or we should be "lcms2-mt", IMAO.17:36.18 
mvrhel_laptop I might be ok with LittleCMS-MT. lcms2-mt I am not ok with in the paper17:37.10 
ray_laptop I think LittleCMS-MT is consistent with Marti's docs and usage17:38.52 
mvrhel_laptop yes 17:39.04 
  we will go with LittleCMS-MT17:39.38 
Robin_Watts cool.17:40.19 
ray_laptop chrisl: I'd appreciate it if you could give the configure.ac and makefile changes a look. I know it builds on linux, but that is the extent of the validation I have (my repos for renaming to lcms2mt)17:44.39 
  Robin_Watts: please have a look at the WhyThisFork in what I just pushed, please (mvrhel, too??)17:48.08 
Robin_Watts Will do, still reading.17:48.20 
mvrhel_laptop Robin_Watts: getting ready to head out for a bit. I will look at what you have when I return...17:59.37 
ray_laptop mvrhel_laptop: I don't see problems with the paper now18:33.13 
chrisl ray_laptop: Oops, sorry I forgot to reply - I looked the lcms2mt changes earlier, and I think it's okay19:16.13 
ray_laptop chrisl: thanks19:20.50 
mvrhel_laptop Robin_Watts, or ray_laptop you around?20:52.31 
  or HenryStiles ...20:52.59 
HenryStiles yup20:55.32 
mvrhel_laptop HenryStiles: could you bring chains off the regression testing for me while I run some timing tests?20:55.55 
  I dont have super user ability20:56.02 
  to do this20:56.05 
  there it goes. Thanks HenryStiles 21:00.36 
  I will let you know when I am done21:00.46 
HenryStiles okay I never did it before and had to figure it out21:00.48 
  mvrhel_laptop: leave a note here and I'll enable it later, have an errand21:27.03 
mvrhel_laptop HenryStiles: ok thanks. It is going to be awhile21:27.24 
  bbiab22:10.10 
 Forward 1 day (to 2018/03/29)>>> 
ghostscript.com #mupdf
Search: