IRC Logs

Log of #ghostscript at irc.freenode.net.

2012/12/12
tkamppeter chrisl, you asked for me yesterday?09:15.22 
chrisl tkamppeter: yeh, we've got a cups bug that's rumbling on in launchpad.....09:15.54 
  Well, cups with gs, I should say09:16.07 
  tkamppeter: https://bugs.launchpad.net/bugs/978120 09:16.30 
tkamppeter chrisl, I have seen the last comments. Can you post on the bug with which parameters you generated the working PostScript file? Then I can add the appropriate exception to the pdftops filter, for Toshiba printers.09:18.58 
chrisl tkamppeter: I didn't generate it, I modified it by hand. But, IIRC, we already have a couple of printers that disable *all* the compression from ps2write, so it should be same as those.09:20.50 
  But I suggest we wait to hear back from a few other people (hopefully) trying it out09:21.23 
  tkamppeter: as long as that bug is "on your radar" again, I don't think we need anything from you until we get a few more people running the test job.....09:35.39 
  and now I'm off to play squash09:35.51 
Robin_Watts samples_mupdf_001.zip and samples_gs_001.zip are now uploaded to my casper home dir.14:45.13 
  Hi marcosw.16:33.01 
marcosw morning Robin_Watts 16:33.08 
Robin_Watts You know the company we visited last week? Well, their security team has been looking at gs and mupdf (independent of our visit).16:33.36 
  They've sent a couple of archives of files that cause crashes.16:33.54 
  I've uploaded them to casper in my home directory.16:34.09 
  samples_{gs,mupdf}_001.zip 16:34.17 
  chrisl (or anyone else)... so we have the gsapi interface designed for people to drive the gs lib.16:36.31 
  We also have a gsdll interface.16:36.39 
  Which seems to wrap the gsapi one.16:36.56 
  Is gsdll windows specific ?16:37.01 
marcosw Robin_Watts: There appear to be several :-) files in each archive16:37.23 
Robin_Watts no, I'm seeing macos stuff in there too...16:37.31 
  marcosw: Oh yes. All with "unique stack traces" apparently.16:37.49 
marcosw so I should enter one bug for each file and assign them all to you?16:38.22 
Robin_Watts marcosw: I have no idea. I've not had a chance to look at any of them yet.16:38.46 
  But henrys said I should share the archives with you, so... there you are.16:39.01 
henrys marcosw:they should be treated like customer bugs16:39.21 
Robin_Watts henrys: There are quite a few files...16:39.42 
marcosw something close to 2000 PDF files16:39.56 
  though that's both the mupdf and gs archives, so there may be overlap.16:40.15 
  henrys: I'll ask miles/joann to generate a customer number of the potential customer, so that we can track the bugs.16:40.48 
henrys marcosw:okay.16:41.43 
  the dll is supposed to work on linux, windows and mac16:42.42 
  last I looked at it.16:42.52 
Robin_Watts ok.16:42.56 
henrys marcosw:since there are so many do you want to split up the tests among the staff?16:48.19 
Robin_Watts henrys, marcosw: Presumably marcosw is going to look at the gs ones and leave the mupdf ones to me ?16:50.02 
marcosw henrys: presumably the mupdf ones are the most important, since that's what they are discussing licensing, so I was going to look at those first. Is there anybody other than tor8 and Robin_Watts I should assign bugs to?16:50.05 
henrys not for mupdf bugs16:50.54 
marcosw Robin_Watts: I was going to go through all the bugs and check to see if they can be duplicated with master before entering them.16:51.16 
Robin_Watts If it turns out that some/most of the bugs have been fixed already, then that would be appreciated, as it will save me/tor8 time.16:51.47 
  If on the other hand, every file causes a bug to be opened, it may be more efficient to just have us do it as we go.16:52.15 
marcosw do we know what version of mupdf they tested with?16:52.24 
Robin_Watts marcosw: I would imagine 1.1 or earlier.16:52.39 
marcosw so there is hope that some have been fixed. 16:53.08 
henrys were all these bugs produced in an android environment?16:53.14 
marcosw is there any suggestion that some of these are valid pdf files? i.e. are they expecting output or just not a segfault?16:54.07 
Robin_Watts henrys: The crash logs show what looks like x86 assembly to me.16:54.55 
henrys Was the associated email mailed to support and I missed it? 16:55.05 
marcosw a lot of these are crashing in j2k_decode16:55.16 
Robin_Watts And it is indeed Mupdf-1.116:55.48 
marcosw or jbig2 or fz_paint16:55.52 
henrys I am certainly hoping they haven't found 2000 unique crashes, that would be very bad news.16:56.00 
marcosw only 1246 for mupdf :-(16:56.27 
  what does "asan" mean? (the files all have SIGSEGV or asan as part of the filename).16:57.00 
Robin_Watts "address sanitiser"16:57.25 
  That's the tool they've used to detect problems.16:57.36 
marcosw actually many of the files are duplicated, i.e. 16:57.42 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:40 925.pdf.asan.13.4249 16:57.44 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:41 925.pdf.asan.38.4249 16:57.45 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:40 925.pdf.asan.40.4249 16:57.46 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:41 925.pdf.asan.50.4249 16:57.48 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:40 925.pdf.asan.6b.4249 16:57.49 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:41 925.pdf.asan.8.4249 16:57.51 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:41 925.pdf.SIGSEGV.48.4249 16:57.52 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:40 925.pdf.SIGSEGV.5fa.4249 16:57.54 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:40 925.pdf.SIGSEGV.745.4249 16:57.55 
  340 -rw-r----- 1 marcos marcos 340409 2012-12-01 12:41 925.pdf.SIGSEGV.f4c.4249 16:57.55 
Robin_Watts http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer 16:58.23 
marcosw any idea what that means? Does 925.pdf crash in 10 different ways?16:58.34 
henrys 4249 is likely the process number so how many of those do we have?16:58.46 
Robin_Watts I suspect that each file is for a different detected callstack?16:59.15 
henrys with the same exact size?16:59.30 
marcosw there appear to be 542 unique PDF files.16:59.45 
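A minimal sketch of how a count of unique files like the one above could be obtained: group the files in a directory by a hash of their content, so that copies saved under different crash-signature names collapse into one entry. The function names are hypothetical, and nothing in the log says which tool marcosw actually used.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_content(directory):
    """Map SHA-256 digest -> sorted list of file names sharing that content."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    return dict(groups)


def count_unique(directory):
    """Number of distinct file contents found directly inside the directory."""
    return len(group_by_content(directory))
```

Each group with more than one name is a single input file reported under several crash signatures, so only one representative per group needs to be run.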
Robin_Watts but honestly, I've just downloaded the zips and reuploaded them at this point. I haven't done any digging beyond checking that each file is indeed a PDF file, and that the first one does indeed crash mupdf.17:00.02 
  I'm trying to avoid being distracted from unicode hell.17:00.15 
henrys see, in 10 minutes we've reduced the workload to 1/4; at this rate we should be done in a few more minutes ;-)17:00.27 
marcosw Robin_Watts: give me a couple of hours to look at the files; i'll send an email when I have something to report.17:01.04 
Robin_Watts marcosw: Great, thanks.17:01.16 
marcosw Oops, it's 9:00, I have to run. Back online in a bit.17:01.19 
Robin_Watts I'll dig up their emails and post them to support.17:01.32 
henrys Robin_Watts: thanks17:01.55 
kens2 time to go, goodnight all17:03.16 
Robin_Watts Night kens2 17:03.32 
  henrys, ray, anyone else interested: I've just put a new version of my Unicode changes up on bug 692381. My very limited testing suggests it works, but it would be good to have it sanity checked before I go any further.17:21.33 
  If any of you have the time to look at the proposed change and decide whether you think it's reasonable etc, I would be very grateful.17:22.04 
henrys okay let's get chrisl to weigh in also before a commit17:23.57 
Robin_Watts I'd kinda like Sags to OK it too, given that he's the one who has been finding faults so far.17:24.39 
  ok. Now to document GS_THREADSAFE etc.17:25.26 
mvrhel_laptop Hi Robin_Watts 17:30.45 
  so does the PAM format support 16 bit values? Do you just set the max value to 65535 and do you use big or little endian encoding?17:31.29 
Robin_Watts mvrhel_laptop: Hi17:38.52 
  Hold on...17:38.54 
  http://netpbm.sourceforge.net/doc/pam.html 17:39.39 
marcosw Robin_Watts and henrys: the good news is that the crashes are all easily reproducible: mupdf <filename> is sufficient in all cases I've tried. Also none of the files appear to be valid, i.e. Acrobat can't read them. Also in many cases mupdf prints warnings and errors before crashing, i.e.:17:39.42 
  mupdf(7372) malloc: *** mmap(size=1010085888) failed (error code=12)17:39.45 
  *** error: can't allocate region17:39.45 
  *** set a breakpoint in malloc_error_break to debug17:39.47 
  error: malloc of array (-1047476 x 3136 bytes) failed (integer overflow)17:39.48 
  error: out of memory17:39.50 
  error: cannot draw xobject/image17:39.51 
  warning: Ignoring errors during rendering17:39.52 
  Bus error17:39.53 
  Exit 138 17:39.54 
Robin_Watts That suggests that maxval can be 65535.17:40.06 
marcosw you'd think "out of memory" would not be a "Ignoring errors during rendering" condition :-)17:40.22 
mvrhel_laptop Robin_Watts: yes. OK thanks17:40.25 
marcosw I think I'll triage the files based on where the crash occurs (jp2k, jbig, fz_, etc). and open one bug for each category.17:40.39 
Robin_Watts and the data should be most significant byte first (i.e. the wrong way round :) )17:40.40 
mvrhel_laptop alright. I am trying to nudge Max then to use this format. The one that he sent me does not include any depth (number of colorants) in the header. I was just not sure about the 16 bit handling of PAM but it would appear to be just fine17:41.46 
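A minimal sketch of the PAM layout discussed above, assuming the header fields described on the netpbm page linked earlier: the header carries DEPTH (the number of channels per pixel), MAXVAL may be as large as 65535, and samples above 8 bits are written as two bytes, most significant byte first. The function name and parameters are hypothetical.

```python
import struct


def encode_pam(width, height, depth, samples, maxval=65535, tupltype=None):
    """Return PAM (P7) bytes for a flat, row-major list of sample values."""
    if not 1 <= maxval <= 65535:
        raise ValueError("PAM MAXVAL must be between 1 and 65535")
    if len(samples) != width * height * depth:
        raise ValueError("sample count does not match WIDTH * HEIGHT * DEPTH")
    if any(not 0 <= s <= maxval for s in samples):
        raise ValueError("sample value outside 0..MAXVAL")

    header = [
        "P7",
        f"WIDTH {width}",
        f"HEIGHT {height}",
        f"DEPTH {depth}",
        f"MAXVAL {maxval}",
    ]
    if tupltype:
        header.append(f"TUPLTYPE {tupltype}")
    header.append("ENDHDR")
    head = ("\n".join(header) + "\n").encode("ascii")

    # One byte per sample up to MAXVAL 255, otherwise two bytes, big-endian.
    code = "H" if maxval > 255 else "B"
    body = struct.pack(f">{len(samples)}{code}", *samples)
    return head + body
```

For example, `encode_pam(1, 1, 2, [1, 65535])` produces a one-pixel, two-channel image whose raster is the four bytes `00 01 FF FF`.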
Robin_Watts marcosw: The idea is that in the case of such errors we ignore them and continue as best we can, so the user gets *something* on the screen.17:41.48 
henrys Robin_Watts:the patch seems reasonable to me except the trivial nit that I don't like the term "rune"17:41.57 
Robin_Watts but we leave an indication there that the rendering is incomplete so that callers can expose that to the user somehow.17:42.40 
mvrhel_laptop rats. I am at the coffee shop and left my external drive with the PDF FTS files at home needed it to work on my 2 P1 customer bugs17:42.46 
Robin_Watts henrys: I should perhaps have used 'codepoint'17:42.55 
mvrhel_laptop with SVN I should be able to get the individual files though17:43.06 
Robin_Watts mvrhel_laptop: If there is a particular file, I could mail it.17:43.11 
mvrhel_laptop oh that would work 17:43.19 
  hold on17:43.22 
  fts_25_2526.pdf 17:43.41 
Robin_Watts or you can scp from peeves if you have that set up.17:43.54 
mvrhel_laptop and fts_14_1418.pdf 17:44.00 
  I used to be able to get to peeves is that where they are17:44.16 
Robin_Watts They are on every cluster node.17:44.28 
mvrhel_laptop let me try peeves17:44.36 
Robin_Watts /home/marcos/cluster/tests/... or /home/marcos/cluster/tests_private/...17:44.53 
mvrhel_laptop ok I am on peeves17:44.57 
  ok. this should work just fine. thanks Robin_Watts 17:46.13 
Robin_Watts np.17:46.19 
  hey sags17:50.57 
sags @Robin_Watts, about the GSDLL interface (for the logs): A comment in base\gsdll.h says /* This interface is deprecated and will be removed in future ghostscript releases. Use the interface described in API.htm and iapi.h. */.17:50.59 
  However, I have absolutely no idea how many people use (or rely on) it. Maybe it's better to keep it.17:51.07 
Robin_Watts sags: Yeah, I added the obvious entrypoint just in case.17:51.34 
  Does the proposed patch meet with your approval?17:52.01 
sags I didn't know there is a new patch until now, so have not looked.17:52.33 
Robin_Watts sags: Ah. I only uploaded it 20 minutes ago or so. I thought that was what had prompted you to appear :)17:53.19 
mvrhel_laptop hmm my ubuntu under hyper-v seems snappier today17:53.39 
sags Anyway, I'm thinking about a different way to handle the @files charset, which eliminates the need to store it in the context/ etc -- "sniff" the charset.17:53.41 
  As a side effect, it solves a problem that I always forget about, the BOM. On Windows, it's usual that UTF-8 files contain a BOM at the beginning, and this has to be skipped. Also UTF-16 files need a BOM to know the endianness.17:55.29 
Robin_Watts sags: I'm not sure I follow. With the proposed patch, we set the encoding up front. "sniffing" implies guesswork to me...17:55.37 
  Possibly we should update the patch to skip the BOM if it is met (and is the right type), and to error out if the wrong type is met.17:56.48 
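A minimal sketch of the behaviour Robin_Watts suggests here, assuming the caller already knows which encoding was configured: a BOM of the matching type is skipped, and a BOM of a different type is an error. The function name and the encoding labels are hypothetical and are not taken from the patch on the bug; UTF-32 BOMs are not handled.

```python
import codecs

# BOM byte sequences and the encoding each one announces.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def strip_expected_bom(data, expected):
    """Drop a leading BOM that matches `expected`; reject any other BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            if name != expected:
                raise ValueError(
                    f"@file starts with a {name} BOM but {expected} was configured"
                )
            return data[len(bom):]
    # No BOM at all: leave the bytes untouched.
    return data
```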
sags Yes, some guesswork, but I think it's better overall. In the end, an @file can be created by a different tool than the GSAPI client (for example be more of a "config" file), so there's not an absolute connection between the GSAPI function used and the encoding of @files.17:57.28 
Robin_Watts Well, there is if we define there to be :)17:58.11 
  I could conceive of a system where we assume that the format of the @ files is the same as the configured encoding format, but where we swap to a different encoding for the @ file being processed if we hit a BOM.17:59.41 
marcosw occasionally gets confused about which window the pointer is in and types shell commands into chrome; surprisingly this often works (i.e. 'man sed') 17:59.55 
sags If the BOM is present there's no guessing. If there's no kind of BOM, then we can either:18:00.37 
  (1) assume host encoding (which translates to ANSI on Windows, UTF-8 on Linux)18:01.06 
  or (2) Verify if there are any NUL bytes in the 1st 1024 bytes or so of the file. If yes, consider the file as UTF-16 "HE" "host endianness". If no, assume native host encoding.18:02.21 
  Then convert string to UTF-8 in the wrapper function, and you don't need to handle varying encodings anywhere else.18:03.43 
  (The JIT-recoding is still needed but only for the @files.)18:04.28 
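A minimal sketch of the sniffing scheme sags outlines above: trust a BOM when there is one, otherwise treat NUL bytes in the first 1024 bytes as a sign of host-endian UTF-16, otherwise fall back to the host encoding, and then convert to UTF-8. The function names are hypothetical, and `locale.getpreferredencoding` stands in for "host encoding" as an assumption.

```python
import codecs
import locale
import sys


def sniff_encoding(data, host_encoding=None):
    """Guess the encoding of an @file from its leading bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # this codec strips the BOM while decoding
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"      # this codec reads the BOM to pick the byte order
    if b"\x00" in data[:1024]:
        # No BOM, but NUL bytes: assume UTF-16 in host byte order.
        return "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return host_encoding or locale.getpreferredencoding(False)


def at_file_to_utf8(data, host_encoding=None):
    """Decode with the sniffed encoding and re-encode as UTF-8 without a BOM."""
    return data.decode(sniff_encoding(data, host_encoding)).encode("utf-8")
```

With this shape, everything after the wrapper sees UTF-8 only, which is the point of step "convert string to UTF-8 in the wrapper function".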
Robin_Watts sags: I am tempted to go with what we have in the patch for now. We can add 'sniffing' later if we have a call for it.18:07.52 
  The problem with anything that involves guesswork is that someone will find a case where it fails, and complain.18:08.58 
  (like someone will come up with an @ file that contains a single character (> 128) and no BOM, and we'll pick the wrong encoding)18:09.51 
sags As long as it will not need any change to the GSAPI interface, yes it can be added later. Even if the behaviour will change a bit, that will be "documented as an enhancement" :)18:10.12 
Robin_Watts sags: I believe it should not need any API changes.18:10.36 
  Have a look over the patch at your leisure and let me know what you think.18:10.51 
  I hope it addresses all your concerns.18:10.58 
sags Yes, there definitely are cases when the encoding cannot be guessed. Example: a UTF-16 @file containing a single Chinese filename, without drive/ directory/ extension and without a terminating newline and without a BOM. So no character in the range U+0000..U+00FF. There nobody can tell whether it's a UTF-16 or an ANSI one.18:13.15 
  Ok, I'll take a look at that patch, most likely this weekend.18:14.04 
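A small illustration of the ambiguous case sags describes, assuming "ANSI" means Windows code page 1252: a short Chinese name encoded as UTF-16 without a BOM can contain no NUL bytes and still decode cleanly as a single-byte code page. The helper name and the sample string are hypothetical.

```python
def plausible_decodings(data, candidates=("utf-16-le", "cp1252")):
    """Return every candidate encoding that decodes `data` without error."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            pass  # this candidate is ruled out by the bytes themselves
    return results


# Two CJK characters, UTF-16 little-endian, no BOM: bytes 2D 4E 87 65.
raw = "\u4e2d\u6587".encode("utf-16-le")
readings = plausible_decodings(raw)
```

Both candidates survive, so the bytes alone cannot settle which reading was intended.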
mvrhel_laptop bbiaw19:13.49 
marcosw cut down the number of problem mupdf files by another 23, they had included some ghostscript errors in the mupdf .zip file (haven't looked into the ghostscript .zip file to see if they made the opposite error as well, so this may be a short lived reduction).20:28.31 