Log of #mupdf at irc.freenode.net.

Search:
 <<<Back 1 day (to 2018/08/21)20180822 
moriarty hey folks00:25.04 
paulgardiner I have a file with a too-long name. When I view it, the problem object is loaded via pdf_resolve_indirect, which catches the error and the resulting render looks AFAICT the same as with AR. When I try to save the file, the problem object is loaded via pdf_load_object, nothing catches the error and the save fails. Any thoughts?09:40.52 
tor8 paulgardiner: how long is the name?09:44.51 
  the spec limits names to 127 bytes09:45.14 
paulgardiner Not sure. At least twice that, I think.09:46.06 
tor8 Hm. I think we should probably return PDF_TOK_ERROR instead of throwing. Let me cook up a patch for you to try.09:58.45 
  paulgardiner: try the commit on the top of tor/master10:00.56 
paulgardiner Oh great. Thanks Tor10:01.11 
tor8 I have no idea if that will help, but I think it will make it consistent with other types of syntax errors at least10:01.36 
paulgardiner Ah, now pdf_parse_dict throws "invalid key in dict" because it's got a PDF_TOK_ERROR where it expects a PDF_TOK_NAME.10:09.22 
  I wonder if it's worth my trying catching the original error where the save function preloads all the objects... whether saving would complete with a missing object.10:11.00 
tor8 oh ... that just moves the error to a higher level, when parsing not lexing.10:11.19 
paulgardiner yeah10:11.33 
tor8 paulgardiner: or we could tweak the parsing to not throw syntax errors, and just put 'null' in the place when assembling duff arrays and dictionaries10:13.22 
paulgardiner Yes, I wondered that. Just wasn't sure how the rest of the code might behave in the presence of those 'null's.10:15.24 
  I'm just can try out catching the errors in the preload10:15.49 
sebras it was me adding the 127 byte limit. I remember having a file that had a too long name that caused trouble, but I didn't note which one so we might be causing that problem to reappear.10:17.10 
tor8 paulgardiner: try the next commit on tor/master too10:18.29 
  sebras: we're not removing the limit. I changed the error handling from fz_throw() to return PDF_TOK_ERROR10:18.55 
sebras tor8: ok, I was worried that you were considering removing it altogether.10:19.24 
paulgardiner tor8: Whoop! That works.10:24.36 
  I think I may have seen this before and opened a bug for it.10:25.02 
  699620, although I seem to be suggesting the error is inappropriate there. I think it may be that a single sequence of chars is used both as a name and as a string within the document, so that although the name is ignored, the string still appears on the page.10:28.44 
tor8 paulgardiner: that content object stream has a dictionary with a name that exceeds the max length: /Yes:#20If#20"Yes",#20print#20your10:31.41 
  #20former#20name#20exactly#20as#20it#20appears#20on#20your#20present#20license#20or#20non-driver#20ID#20#28Identification#29#20card.10:31.41 
  with the patches on tor/master it goes through and jsut drops that dictionary key/value pair10:32.22 
paulgardiner That string still appears in the document even with your change though, so I think it also occurs as a string.10:33.27 
  Actually, maybe not that string, but the one I first ran into as a too-long name: "Yes - (If you marked Yes the vehicle..."10:37.12 
tor8 paulgardiner: the offending name is the name of one of the AP/D states10:39.15 
  annotation object 1672 in that file10:39.27 
  has an /AP/D dictionary with an /Off key and a /YES#20-#20Complete#2010:39.51 
  Voter#20Registration#20Application#20Section#20#28Not#20necessary#20if#20you#2010:39.51 
  bring#20this#20form#20to#20a#20DMV#20#28Department#20of#20Motor#20Vehicles#29#210:39.51 
  0office#29. 1574 0 R>>/N<</YES#20-#20Complete#20Voter#20Registration#20Applicat10:39.51 
  ion#20Section#20#28Not#20necessary#20if#20you#20bring#20this#20form#20to#20a#2010:39.51 
  DMV#20#28Department#20of#20Motor#20Vehicles#29#20office#29. key10:39.51 
paulgardiner Yes. All I'm saying is I think the same sequence of characters occurs elsewhere in the document as a string, which is what confused me into suggesting that the error was inappropriate in the bug.10:40.37 
tor8 now, I see two ways of handling this -- either we drop the offending too long name completely or we truncate it... I think dropping it is probably better.10:40.42 
  paulgardiner: yes, the same string shows up in the document text10:40.56 
  but it does mean we cannot actually select the "YES -10:42.22 
  Co10:42.22 
  mp10:42.22 
  lete Vo10:42.22 
  ter10:42.22 
  Registr10:42.23 
  atio10:42.25 
  n10:42.27 
  Ap10:42.29 
  p10:42.31 
  licatio10:42.33 
  n10:42.35 
  Sectio10:42.37 
  n10:42.39 
  (10:42.41 
  No10:42.43 
  t n10:42.45 
  ecessar10:42.47 
  y if yo10:42.49 
  u br10:42.51 
  in10:42.53 
  g this fo10:42.55 
paulgardiner Your current patch fixes both the occurrences of this I've so far run into.10:42.56 
tor8 r10:42.57 
  m to10:42.59 
  a DMV o10:43.01 
  ffice)" checkbox10:43.03 
  oops, copy paste didn't go well from that document10:43.05 
sebras tor8: I'm surprised you weren't rate limited. :)10:43.28 
tor8 paulgardiner: I hope you mean "now fixed" in bug 69480411:27.51 
sebras tor8: hm... I get bmpcmp differences with the nested_depth removal.11:33.10 
tor8 sebras: yes. I'm getting other errors too.11:34.08 
sebras tor8: I've gotten a few random timetouts.11:34.39 
  tor8: but my bmpcmp results are consistent Bug688796b.pdf is drawn differently.11:34.59 
  I can reproduce that locally.11:35.03 
tor8 yeah. that's possibly due to the change in pdf-op-run.c where the condition for render_direct is different11:35.35 
sebras ah, yes. I did notice that expression last night, but had forgotten about ti.11:39.17 
  tor8: I'm guessing that we can consider this one a progression? if so I'll go ahead and push sebras/master..?11:39.46 
  well the first 5 patches.11:40.07 
tor8 sebras: I would like to figure out the other problems first, but up to and including "Bug 699666: Forbid cycles in Type3 font CharProcs." are good to push11:40.35 
sebras tor8: I only get the timeouts on some of the machines.11:44.46 
  tor8: locally I could reproduce a long rendering time for one of the files (11:45.09 
tor8 did it get longer?11:45.23 
sebras but the other two PDFs both 0 bytes)11:45.29 
  tor8: I think it was just statistical variation. I did run it inside gdb and looked at the backtrace and it seemed to be rendering a 2000x5000 pixmap11:46.28 
  tor8: rerunning this I can see that Bug692933.pdf results in a 5760x2268 pixmap, so I'm not surprised that it takes 4 sec on my machine to render. and possibly too long, triggering a time out, on some of the cluster's slower(?) machines..11:50.15 
  Bug690237.pdf and Bug692160.pdf are both 0 byte files, so why these triggered timeouts I'm not sure.11:50.47 
tor8 that sounds like a cluster problem -- those files are not 0 bytes11:52.47 
sebras tor8: oh really? I downloaded them from the cluster.11:55.45 
sebras takes a look locally.11:55.50 
tor8 ~/src/tests_private/comparefiles $ ll Bug690237.pdf11:56.10 
  -rw-r--r-- 1 tor tor 12M Nov 4 2016 Bug690237.pdf11:56.10 
sebras tor8: redownloading helped for that file. I'll try the other.11:58.30 
tor8 your report says they stopped timing out, not that they started timing out...11:58.56 
  and they are huge pages, so no wonder they're on the edge of timing out11:59.23 
sebras oh! I see what happened now.12:01.42 
  tor8: I grabbed the files from my download folder before they were fully downloaded. firefox creates a file.pdf.part and a file.pdf which is empty while it is being downloaded. then once the download is done the .part-file is renamed to the correct name.12:02.34 
  tor8: I guess my location doesn't help with the download time...12:05.44 
  tor8: so with that I believe this is now sorted..?12:16.48 
  i.e. I was just being silly.12:17.03 
paulgardiner tor8: looks like that too-long name breaks the form and stops the checkboxes from being checked, even with your fix (probably unsurprisingly).12:38.17 
tor8 paulgardiner: yeah. at least the content shows up with the fix :)12:46.35 
paulgardiner Yes,definitely an improvement. :-)12:47.12 
  Somehow AR is working around it.12:47.21 
tor8 maybe they ignore the limit in the spec12:47.36 
paulgardiner Yeah, could be.12:47.44 
sebras tor8: so if we convert such a file, do we shorten the name to make it conform to the spec or do we emit a broken PDF?12:50.38 
  or clean or whatever.12:50.46 
tor8 sebras: with my current patch, we drop the offending name (turn it into 'null')12:51.13 
  another approach would be to truncate and warn12:51.32 
paulgardiner I have a file that wont clean because of the issue12:51.47 
  .. resulting file wont load in AR12:52.14 
sebras can we test if AR has any upper limit at all? what about a 1M character name?12:53.23 
  perhaps they just raised it a bit.12:53.35 
tor8 paulgardiner: with or without the patches on tor/master?12:56.21 
paulgardiner I tried both, I believe.12:57.18 
sebras pdfref20 mentions the 127 char name limit as an architectural consideration, not a strict limit.12:57.23 
paulgardiner Without gave an empty file, With gave a sensibly sized file that wouldn't load.12:57.59 
tor8 it's probably missing a vital part. have you got the file somewhere?12:58.20 
paulgardiner Just emailed it.13:02.44 
sebras tor8: may I bother you with a review of sebras/master?13:58.37 
tor8 sebras: sebras/master LGTM14:03.22 
sebras tor8: it clusters well, I'll go ahead and push then.14:04.56 
tor8 sebras: go for it.14:05.10 
  paulgardiner: yeah, I see some weird stuff going on with mutool clean with that patch14:10.03 
  it's dropping '0' bytes and truncating output14:10.14 
paulgardiner Does sound weird14:11.06 
tor8 maybe encryption-related14:12.50 
paulgardiner Oh, I hadn't noticed it was encrypted.14:13.29 
tor8 but with the 'truncation' patch, it goes through properly...14:14.11 
  valgrind says it's writing uninitialized data14:16.45 
  paulgardiner: I'm inclined to believe that the encryption writing is what's breaking the cleaning of that file14:39.41 
  the fact that it also has too long names is just the icing on the cake14:39.55 
paulgardiner Yes, now you mention it. I believe I've seen other cases where encryption breaks cleaning.14:40.47 
  There are some checkboxes on the second page that don't work though, probably because of the too-long names.14:41.56 
tor8 yes. truncating instead of dropping those names make the checkboxes work.14:42.24 
  but cleaning is still broke14:42.30 
paulgardiner Ah okay. That's fine then.14:42.51 
sebras tor8: how do you feel about "Remember to end groups/softmasks even upon exception."?14:46.38 
  tor8: that's on sebras/master and will be the last thing I'll do today.15:14.19 
kens tor8 can you answer this question ?:15:21.30 
  https://stackoverflow.com/questions/51969592/get-mutool-to-output-structured-text-as-xml15:21.30 
sebras tor8: ok, I'll leave that for another day, need to go. bye!15:34.27 
tor8 kens: answered15:36.28 
kens THaniks tor, I wasn't absolutely sure what the question meant....15:36.43 
tor8 sebras: my brain is too mushy today, I'll tackle your last commit tomorrow (or when you're back from HK)15:36.50 
sebras tor8 (for the logs): after is good enough, thank you. :)22:51.59 
 Forward 1 day (to 2018/08/23)>>> 
ghostscript.com #ghostscript
Search: