Log of #mupdf at irc.freenode.net.

20170810
Hejsan I'm getting a "jbig2 decoder FATAL ERROR: Invalid SYMWIDTH value (-40) at symbol 758 (segment 0x00)" on one of my computers but not the other for the same pdf file09:55.14 
  any idea? I was thinking maybe missing fonts09:55.38 
kens Nope09:55.45 
  jbig2 is an image format09:55.53 
  It's hard to see how you could get an error on one computer and not another if you are using the same setup on each09:56.10
Hejsan thats what im thinking09:56.27 
  the pdf consists of scanned images containing text09:56.37 
kens Scanned images can't contain text :-) They might be pictures of text though09:56.56 
  jbig2 is a not uncommon compression format for monochrome images09:57.13
  especially images of text09:57.19 
  Same OS, same version of MuPDF ?09:57.57 
  You could try the current development code, there have been some jbig2dec fixes09:58.32 
avih tor8: i'm getting some intermittent segfault in regex match (and replace, which is also in match) on one system - alpine linux, but i'm not sure how to get accurate trace. in gdb (which i'm very unproficient with) i see only function names once the code gets into mujs. i tried to build and link with the debug lib, and i've been successful as far as i can tell, but gdb still only shows function names for mujs code (but in mpv code i do see exact line number11:01.56
  inside functions). ideas?11:01.56
  this happens with and without compiler optimization11:03.22 
  also, it's hard for me to reproduce it in mujs REPL. it did repro for a while, but not anymore. i _think_ it's not a bug in my code, and the regex is quite simple ( .match(/\\/) or .match(/^~~/) ), so i'm a bit stuck..11:07.19 
Hejsan @kens same OS (ubuntu 12.04.5) same mupdf11:11.40 
kens Hejsan : then to be honest, I can only assume it's something like memory corruption11:12.09
  Though I'd expect that to be intermittent11:12.20 
Hejsan does mupdf use the OS installed fonts at all?11:18.05 
kens I think not, no11:19.19
  I'm reasonably certain this is the case11:19.28
  But if the PDF only has scanned images, then MuPDF won't need any fonts anyway11:20.05
Hejsan its scanned, but OCRed by the scanner11:20.24 
  I can even select the handwriting in adobe11:20.42 
kens Unless it draws glyphs (and not in Text rendering mode 3), doesn't matter11:20.44 
tor8 avih: I assume you've also tried with valgrind?11:20.45 
avih tor8: i have not. i'm not sure it's available on alpine though11:21.11 
tor8 kens: Hejsan: mupdf never uses system fonts (unless you're a customer and have added specific hooks for it -- the code as shipped does not)11:21.25 
kens Yeah I was reasonably sure that was the case11:21.49 
avih tor8: in general my linux debugging skills are very weak. the most i can typically do is add printf's and get a stacktrace with gdb11:21.58 
tor8 valgrind will print a partial stack trace as well, and often catches the problem earlier than gdb11:22.41 
avih i didn't claim it was an unuseful tool :)11:23.40 
  tor8: back to my original question though, do you know how to get line number in stack traces for the mujs code? i tried make debug, and i'm quite sure the debug libs were used (i removed the global installed mujs lib, pointed the pc file to my debug build dir, and verified that if i rename the debug lib then mpv fails to link), but i still only get function names in backtrace11:31.46 
  for the mujs parts that is. for mpv parts i do get line numbers inside functions too11:32.43
  hmm.. actually, even if line numbers did work, they might be in one.c meh11:33.16 
  i really dislike that one.c . it's ok to generate it if someone wants to embed in their project easier, but it's a PITA everywhere else11:34.02 
tor8 avih: did the cc/clang build command include -g?11:34.43 
  line numbers will track over #include from one.c11:35.02 
avih i could check, but libmujs.a did end up 3 times bigger when i used make debug rather than just make11:35.15 
tor8 one.c allows function inlining over module borders -- which can be a significant speedup11:35.23 
  of course it's going to be 3x bigger -- it includes debug symbols and mappings to line numbers11:35.59 
avih supposedly, but it didn't show up in gdb. it was a reply to your "did it have -g?"11:36.30 
tor8 the default if you just type 'make' is a release build with optimizations and no line numbers11:36.51 
avih of course, but i ran "make debug", the lib did end up bigger, so everything pointing to it being a correct debug lib, yet line numbers didn't show up.11:37.26 
  maybe it was stripped elsewhere?11:37.32 
tor8 possibly the final link step doesn't include -g so would not include the debug symbols11:38.11 
avih would libmujs.a still end up 3 times bigger on such case?11:38.43 
  or you mean for mpv? but mpv functions do show with line numbers at the very same trace. the line numbers disappear halfway through the trace when the code enters mujs11:39.31
tor8 on my 64-bit linux, build/debug/libmujs.a is 1.1M and build/release/libmujs.a is 631K11:40.15 
avih on alpine 64 (musl) it's 450k and 160k, respectively11:41.11 
tor8 yeah. gcc/clang has really bloated debug info.11:41.23 
avih it is gcc, just without glibc11:41.39 
tor8 the final linked binary 'mujs' shell is 514K vs 208K for debug/release11:41.51 
avih sec. let me fire up that machine11:42.07 
  however, the size wasn't a point. it was just an indication debug is being built.11:42.47 
tor8 I get line numbers in build/debug/mujs11:43.49 
  gdb build/debug/mujs ; break js_throw ; run ; type in some garbage ; bt11:44.18 
avih will check. in sec11:44.36 
  here 'mujs' is 450/180k for debug/normal. but anyway, it's been said before (by some), the size doesn't matter :)11:46.51 
  it was just an indication.11:46.57 
tor8 avih: yes, I get that. I suspect some step in the build either strips or neglects the -g.11:48.29 
avih (minor spam)11:48.50 
  Breakpoint 1 at 0x25c31: file jsrun.c, line 1207.11:48.52 
  (gdb) run11:48.52 
  Starting program: /home/user/dev/mpv/mujs.github/build/debug/mujs11:48.52 
  warning: Cannot call inferior functions, Linux kernel PaX protection forbids return to non-executable pages!11:48.52 
  Warning:11:48.52 
  Cannot insert breakpoint 1.11:48.53 
  Cannot access memory at address 0x52be8373c3111:48.54 
  i'll try to boot the non hardened kernel. i did try it before and it segfaulted there too.11:49.32 
  anyway, yes, i do get line number when breaking on js_throw.11:53.29 
sebras Hejsan: given the error you see it seems like either the jbig2 image inside your pdf is broken or perhaps that the jbig2dec library used to decode it has a bug in it that manifests differently on the two computers (perhaps one is 64 bit and one 32 bit?).13:34.00 
  Hejsan: if you can reproduce this and are able to share the document you might want to report a bug at bugs.ghostscript.com and attach the document allowing someone to try to reproduce this and see if we can spot any bugs.13:35.06 
malc_ It must be really perplexing to be a swede and address someone with a nick Hejsan13:35.53 
Hejsan Not allowed to share the document unfortunately13:36.25 
  both 64 bit13:36.35 
  I saw that I was missing the "Type1/Symbol" font on the one that crashed, however since it shouldn't use system fonts I'm not sure if installing it will fix anything - waiting until office hours are over until i test13:37.23 
  https://gist.github.com/anonymous/7da5853ba8550de7cdce3e9edd750244 here is the stacktrace13:40.23 
  As you can see I'm using jmupdf which is based on an old version of mupdf13:40.36
  license issues13:40.46 
kens Oh, well you're probably stuffed then. We have done a load of fixes in jbig2dec which might well have resolved this, but they won't (obviously) be in an old version of the code13:41.30 
Hejsan I understand13:42.06 
kens That stack trace looks like memory corruption to me13:42.13 
  Because it's crashing trying to free the memory buffer13:42.29
Hejsan What would that mean, broken RAM?13:42.59 
sebras Hejsan: also I noticed that mupdf's callback printing the error message is not used, instead jbig2dec's standard callback is used. nowadays we use our own callback indicating that there are likely a lot of changes in the jbig2dec related code.13:43.56 
kens No, it means that there was a bug in the code which overran (or possibly underran) a buffer, corrupting memory. When you try to free it, the memory manager tries to use some of the structure members, which have been overwritten, causing a crash13:43.59
  sebras is one of the developers who fixed stuff in that library :-)13:44.52
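The failure mode kens describes can be illustrated with a toy model. This is only a sketch: `ToyHeap`, its 4-byte size header, and the bounds check in `free` are hypothetical and do not reflect how glibc, jbig2dec, or MuPDF manage memory. It shows an overrun in one block silently clobbering the bookkeeping of its neighbour, with the damage surfacing only when the neighbour is freed.

```python
import struct

HEADER = struct.Struct("<I")  # hypothetical 4-byte size header before each payload


class ToyHeap:
    """Toy bump allocator over one bytearray (illustration only)."""

    def __init__(self, size):
        self.arena = bytearray(size)
        self.top = 0

    def malloc(self, n):
        # Store the block size immediately before the payload.
        HEADER.pack_into(self.arena, self.top, n)
        payload = self.top + HEADER.size
        self.top = payload + n
        return payload

    def write(self, payload, data):
        # Like memcpy in C: nothing checks the write against the block size.
        self.arena[payload:payload + len(data)] = data

    def free(self, payload):
        # The allocator trusts its header; a clobbered one is noticed only here.
        (size,) = HEADER.unpack_from(self.arena, payload - HEADER.size)
        if payload + size > self.top:
            raise MemoryError(f"corrupted header: size={size:#x}")


heap = ToyHeap(64)
a = heap.malloc(8)
b = heap.malloc(8)
heap.write(a, b"A" * 12)  # 4 bytes too many: spills into b's header
heap.free(a)              # a's own header is intact, so this passes
try:
    heap.free(b)
    outcome = "no error"
except MemoryError as exc:
    outcome = str(exc)
print(outcome)
```

The faulty write and the reported error are separated in time and location, which is why a backtrace ending in `free` says little about where the real bug is.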
malc_ kens: I think it's free code in glibc noticing lack of sentinel or some such, not that it(free) follows some rogue pointer13:45.20 
sebras Hejsan: btw, what are the licensing issues you are facing?13:46.59 
kens malc_ it depends if it's *our* memory manager....13:47.12
  and that probably depends on what version of jbig2dec and MuPDF is in use13:47.28 
sebras kens: malc_: it most likely isn't our allocator since I committed a patch to set jbig2's custom allocator august 6th...13:48.23 
kens overwriting sentinels (or guard bands) doesn't usually lead to a segv though, so I sort of suspect it might be our memory management13:48.23
Hejsan sebras: jmupdf is based on the old mupdf still on GPL13:48.27
avih tor8: well, it doesn't crash in valgrind... and it doesn't show errors (1) which don't show when not trying to use mujs. i've reduced the mujs related code in mpv to the bare minimum (js_newstate, js try, ploadstring of some regex -> crash), and it still crashes when running directly or in gdb, but not in valgrind. the crash happens inside js_pcall13:55.36 
sebras Hejsan: I understand. you (or your customer) are developing a web service rendering PDFs and you are not able to release the source under AGPL (or compatible). please be advised that this old jmupdf version will only grow older and will not receive any bug fixes. if this is something you (or customer) depend on then you might want to consider getting a commercial license of mupdf.13:56.19
avih ploadstring succeeds as well as the following pushglobal, but then js_pcall(0) crashes13:56.29 
  js_call (inside an outer js_try) crashes too without reaching the "catch" of my js_try13:57.02 
sebras Hejsan: at least this is the situation as I'd expect it to be. :)13:57.18 
Hejsan sebras: it is under consideration & discussion13:57.48 
kens Hejsan you could always *try* the current code to see if it resolves your problem. If it doesn't then you can report a bug, if you can find a file which you can release. Note that we can mark bugs private in our tracker which makes them unavailable publicly (but obviously we can see them). If you want to do that please talk to one of us first.14:00.41 
sebras Hejsan: excellent, just to make sure you know, please contact sales@artifex.com for negotiating commercial licenses.14:00.42 
Hejsan sebras: already contacted, could not get a quote without giving away more details about our company than our founder is willing to - I'll keep pitching it, failing files will only help my case14:02.28
kens Especially if they don't fail in current code ? :-D14:02.59
Hejsan kens: Will do, the thing is the file is working fine on a similar setup (obv I'm missing something)14:04.35 
sebras Hejsan: it might be that, despite the bug existing on both systems, the memory allocation on one system is more forgiving because of pure chance (where buffers are allocated etc). so it might not be something you can easily influence.14:05.59 
kens Hejsan like I said, it may be very slight differences in the memory layout when the executable is run. Memory corruption bugs are like that, they are the devil to track down and fix. Changing anything often makes them go away14:06.43
kens reads what sebras wrote and nods14:07.01 
  Recently we've been working hard to resolve these kinds of problems in all our products.14:07.52 
Hejsan sebras: kens: thank you, sorry for bothering you with already solved bugs14:08.04 
avih tor8: does regex match abuse the stack?14:08.07 
malc_ jbig2 detected an error, then malloc printed an error, and the backtrace is pivoted on free... fun fun14:08.18 
kens Hejsan, always welcome to chat14:08.21 
  If we're busy we just won't answer :-)14:08.32 
Hejsan :)14:08.37 
avih one thing with musl's default (and alpine, to a slightly lesser degree) is that it has relatively small stack size by default14:08.58 
malc_ avih: small = ?14:10.15 
sebras Hejsan: maybe you can test your document in the latest desktop linux or android apps and see if it works consistently there. perhaps that helps your case. :)14:10.51 
  Hejsan: MuPDF is available as an app on Google Play.14:11.39
avih malc_: i don't recall. iirc glibc gives 2M or 8M by default, and musl _iirc_ ~80k, and alpine uses a bit higher default14:13.41
malc_ avih: I see, thanks (FWIW it's 8M)14:14.37 
tor8 avih: the regexp matcher only recurses for lookahead rules, everything else is a flat loop14:14.39 
sebras tor8: the patches on sebras/master have been reworked a bit. are they to your liking now?14:14.41 
  tor8: most importantly the first one fixing a reported bug of course.14:14.54 
tor8 avih: that's the (?= and (?! stuff14:15.12 
  avih: if your regexp doesn't use those constructs, matching is just a loop with no stack use14:16.15 
avih tor8: the only two patterns i tested are, specifically: ("polyfills").replace(/\\/g, "/") and ("polyfills").match(/^~~/) and both crash14:16.54 
  only in mpv. and it's right at the beginning of the thread run.14:17.25
  and relatively complex scripts run just fine. i don't think it's the stack, but i'm out of ideas. gonna try to build an older musl now.14:18.01 
tor8 avih: the only thing I can do is reinforce my suggestion to use valgrind.14:18.47 
  or do a make build=sanitize14:19.04 
avih tor8: i said i did, it didn't show up errors, except the one i posted above, which seemingly is a general musl thing which i do not need to worry about14:19.28 
malc_ avih: btw. just tried alpine iso.. ulimit -s claims 8M14:19.36 
tor8 avih: ah, sorry, I missed your message in the backlog :)14:19.57 
avih no :)14:20.03 
  malc_: i said i don't think it's the stack. much bigger (mujs) things run fine. but it's another vector to consider.14:20.26 
  malc_: are you sure that's not 8k? i see $ ulimit -s --> 819214:21.55 
  (i don't know what's the unit)14:22.20 
malc_ avih: unit is kilobytes14:23.33 
avih hmm.. so it definitely shouldn't be the stack then14:23.47 
  malc_: quoting the main maintainer of alpine: "<ncopa> avih: its not ulimit -s by default wiht musl, its 80k by default."14:27.57
malc_ avih: dalias?14:28.52
  uhm.. no dalias is the nick of the author of musl14:29.51
avih malc_: he's musl's maintainer. ncopa is alpine's14:30.04 
malc_ avih: raises the question why would ash lie (ulimit is a builtin).. but i guess i'd better ask on #musl|alpine14:34.24
avih i'm using bash, but which ulimit gives empty output, so not sure where it's defined. possibly busybox' ash indeed14:35.32 
malc_ avih: `ulimit -s' or -a if you want all limits14:36.36 
  avih: ulimit is a builtin in both (ash/bash) and returns the same 8192 (kilobytes) in both under alpine/musl14:38.54 
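One possible reading of the mismatch between the 8192 that `ulimit -s` prints and the 80k that ncopa mentions is that the two figures describe different stacks: `ulimit -s` reports `RLIMIT_STACK`, which bounds the main thread, while threads created later get their stack size from the C library's default or from an explicit attribute. The sketch below prints both values as Python sees them. `describe_limit` is a hypothetical helper, and the output depends on the system it runs on.

```python
import resource
import threading


def describe_limit(value):
    """Render an rlimit value (bytes) in the KiB units that `ulimit -s` uses."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return str(value // 1024)


soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("main-thread stack limit (ulimit -s):", describe_limit(soft))

# threading.stack_size() returns 0 when new threads use the platform default,
# which the C library chooses and which `ulimit -s` need not reflect.
print("configured stack size for new threads:", threading.stack_size())
```

Since avih notes that the crash happens right at the beginning of a thread run in mpv, the per-thread default is the figure that matters for this bug, not the shell's limit.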
avih fwiw, preliminary results show it might have to do with the stack size. it fails with the default or explicit 80k, and doesn't fail with 512k14:39.51
  possibly the regex function match abuses the stack in unexpected ways imo.14:40.07 
  but that's a very preliminary guess14:40.15 
  tor8: my experiments suggest that a stack size of ~280k (270*1024 still fails) is the minimum for the following inside js_try: js_loadstring(J, "@/test.js", "('polyfills').match(/^~~/);"); js_pushglobal(J); js_pcall(J, 0);14:56.19 
  does that sound reasonable to you?14:56.29 
tor8 avih: try reducing the MAXTHREAD constant in regexp.c14:56.52 
  gotta go, back in a couple of hours14:56.57 
avih k, thx14:57.04 
  wait, all the threads use the stack of the thread which called it?14:57.25 
malc_ avih: all15:01.08 
avih tor8: reduced from 1000 (?) to 10 and now it works with a 80k default stack. well, part 1 done, we know where it comes from.15:01.11 
  malc_: btw, considering musl is relatively lean, if it needs 280k stack before it gets to the first line of c-code at match() (and before spawning up to 1000 threads), it's unlikely to be useful on more constrained systems, such as embedded ones with little ram.15:26.23 
sebras Robin_Watts: do you mind reviewing e73593152 on sebras/master?15:28.41 
  Robin_Watts: it fixed a reported bug.15:28.46 
malc_ avih: not sure i understand what you are saying15:29.54 
avih hmmm Rethread[1000], where each item is 2 pointers and a struct of int and 32 pointers. so yeah, that can be a lot15:30.13
  malc_: mujs is designed to fit with embedded systems as far as i understand, and this specific case suggests it might not be very usable in such cases15:30.52
  hmmm.. 1000 * 35 (mostly pointers) * 8 bytes per value (64 bits system) is exactly 280000 bytes. pretty much my observation. good confirmation.15:32.35 
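avih's arithmetic can be checked against the stack sizes tried earlier in the log. The sketch below takes the figures of 35 values per thread and 8 bytes per value from avih's message as given. They are not verified against regexp.c, and `rethread_array_bytes` is a hypothetical name.

```python
def rethread_array_bytes(threads, values_per_thread=35, bytes_per_value=8):
    """Estimated size of the stack-allocated thread array, using avih's figures."""
    return threads * values_per_thread * bytes_per_value


full = rethread_array_bytes(1000)
print(full)                      # 280000
print(full > 80 * 1024)          # True: does not fit the 80k default
print(full > 270 * 1024)         # True: 270*1024 is 276480, still too small
print(rethread_array_bytes(10))  # 2800: fits easily once reduced to 10
```

This matches the observations: 270*1024 bytes fails, roughly 280k is the minimum, and reducing the thread count to 10 lets the match run within an 80k stack.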
malc_ avih: oh, i have nothing to do with mujs, nor do i know the expected setting and audience... tor8 is your man15:32.37 
avih malc_: i know he is :)15:33.00 
malc_ avih: you should use v8... or maybe not15:34.25 
avih why would i? i embed mujs in a video player. i like that it's small.15:34.45
malc_ avih: word of caution.. tor8 likes lua much more than he likes js...15:35.13 
avih but i bet node doesn't fail on a simple regex on alpine ;)15:35.21 
  i can do lua, but prefer js...15:35.46 
malc_ mpv already has lua, yet you want to breed it with js.. g'd luck15:36.16 
avih malc_: it's already in 0.26 released some weeks ago15:37.10 
  (and was in for about 3 years in my fork, which didn't get merged due to mujs' prior license)15:37.31 
  (mpv)15:37.47 
malc_ avih: hopefully it will not affect me much.. i have only 17 lines of lua in my .config/mpv/scripts/my.lua15:38.40 
avih malc_: i suggest giving js scripting a try https://mpv.io/manual/master/#javascript15:39.20
  (i changed the dir name from mpv/lua to mpv/scripts before we were about to merge the js support some years ago, but then we had the mujs license issues after i merged the renamed folder name)15:40.19 
malc_ heh, the only script i have there is to workaround an issue with intel opengl.. it wasn't even written by me, someone suggested the idea in the bugtracker, i just did cosmetic changes..15:41.57 
avih now is the time to port it to js :)15:42.57 
malc_ can't say that i'm a fan of scheme in c disguise either, so uhm... no, not until i'm forced to15:43.44
avih well, scripting in mpv is extremely powerful. you can even sort of use it now as a node.js replacement, since it has a compliant 'require', but regardless, you can do a _lot_ of things with scripting in mpv15:44.53
malc_ avih: my needs are rather basic15:52.22 
avih sex and drugs?15:52.36 
malc_ food and excretion15:55.12 
avih imagine how much better your life could have been with some juicy mpv scripting!15:56.23 
malc_ avih: but lua already covers that16:21.03 