| 2013/12/19 |
Robin_Watts | kens, chrisl... | 11:24.34 |
kens | pong | 11:24.39 |
Robin_Watts | You know the " **** File has unbalanced q/Q operators (too many Q's) ****" message | 11:24.53 |
| ? | 11:24.55 |
kens | Yep | 11:24.57 |
Robin_Watts | is there any chance we could make it so that it only gives that message once per file? Or once per page at most ? | 11:25.18 |
kens | Only with difficulty | 11:25.33 |
Robin_Watts | I tested a fix last night, and the cluster fell over due to oversized logs. | 11:25.54 |
| And the cause for that was that line (which was wrong, btw) being spewed 8 billion times. | 11:26.17 |
kens | It's caused by catching an error from restore/grestore as I recall, so we'd need to set a flag, and reset it on every page content stream | 11:26.25 |
Robin_Watts | and you can't see the logs when they are oversized, or even what files caused them. | 11:26.32 |
kens | Robin_Watts : the unbalanced q/Q may be caused by some other error in the file | 11:26.47 |
Robin_Watts | kens: That sounds like fairly minimal pain compared to the pain of 8 billion repeated lines. | 11:27.04 |
kens | If we abort a stream then the save stack is wrong, and you get that error, even though the q/Q may match | 11:27.10 |
| Robin_Watts : if you want to code for it, be my guest | 11:27.25 |
Robin_Watts | That line is produced by the postscript ? | 11:27.40 |
kens | It's produced by the PDF interpreter | 11:27.56 |
Robin_Watts | Right. | 11:28.06 |
| Do we have any global state at that point? | 11:28.17 |
| Can't we just have a flag "I have spewed this meaningless drivel already" and check/set that before printing ? | 11:28.46 |
kens | Yes, but off the top of my head I don't know what, or how you would access it, I'd have to go look | 11:28.48 |
| Robin_Watts : and where would you reset it ? | 11:29.01 |
chrisl | I'll have a look - I'm at a bit of a loose end just now | 11:29.10 |
Robin_Watts | I don't care about resetting it, frankly :) | 11:29.14 |
kens | is looking at a memory problem atm | 11:29.21 |
Robin_Watts | But possibly we could reset it when we print the "This file was created by Bob the Builder" message. | 11:29.42 |
kens | call the PDF interpreter with PDFSTOPONERROR and you won't get this | 11:29.52 |
chrisl | I wonder why we issue the warning in two different places..... | 11:31.11 |
kens | Probably two different contexts | 11:31.23 |
chrisl | Oh, I see why..... | 11:32.02 |
| One is for too many q's and one for too many Q's | 11:32.17 |
kens | OK | 11:32.26 |
Robin_Watts | kens: That's not an acceptable answer, IMHO. | 11:37.15 |
| That's a workaround for a deficiency in our software. | 11:37.23 |
kens | what ? | 11:37.24 |
Robin_Watts | using PDFSTOPONERROR | 11:37.36 |
kens | the presence of the warning is a work-around for broken PDF files | 11:37.41 |
Robin_Watts | Giving the warning once is a feature. | 11:38.02 |
| Spewing it endlessly is a bug. | 11:38.08 |
kens | Robin_Watts : I really don't care, and I'm busy with something else. If you want to change it go ahead, I'm not prepared to argue with you about whether it's acceptable or not. That's the way it is, and the way it has been for ages | 11:38.45 |
Robin_Watts | kens: Sure. I'm not expecting you to drop stuff to deal with this. | 11:39.06 |
| But "that's just the way it is" is a song by Bruce Hornsby, not a valid argument for something being right. | 11:39.46 |
kens | refuses to be drawn into an argument | 11:40.06 |
chrisl | I believe I have a solution | 11:40.40 |
Robin_Watts | chrisl: Ah, excellent. Thanks. | 11:40.58 |
chrisl | Luckily we have a joyously awful custom operator called ".forceput", which lets us store values in read-only dictionaries | 11:41.45 |
| So the question is, should the message be once per problem page, or once per problem file? | 11:43.00 |
kens | Per page | 11:43.10 |
chrisl | In which case, it's even simpler..... | 11:44.16 |
Robin_Watts | Either Per Page or Per Stream or Per File would be fine, I think. | 11:44.31 |
chrisl | It's per stream now, that's what you're complaining about | 11:44.56 |
Robin_Watts | oh, ok. | 11:45.10 |
| Then Per Stream is bad :) | 11:45.18 |
chrisl | Okay, now I need to break a file to test this | 11:48.20 |
Robin_Watts | chrisl: I can give you such a breakage. | 11:48.45 |
kens | just decompress almost any file and replace a 'q' with white space | 11:48.46 |
chrisl | Robin_Watts: is it the one that borked the cluster? | 11:49.08 |
Robin_Watts | Yes. | 11:49.13 |
chrisl | That'll do! | 11:49.18 |
Robin_Watts | In gdevp14.c search for if (!sep_target) | 11:49.35 |
| should be around line 7100 | 11:49.50 |
| After that if, we do: code = pdf14_update_device_color_procs_push_c | 11:50.13 |
| the value of code is never used, which is just as well, because if you use it you get an infinite number of those messages. | 11:50.41 |
chrisl | With what test file? | 11:51.04 |
Robin_Watts | so I added if (code < 0) return code; (and appropriate { } ) and that was what triggered the cluster to die. | 11:51.07 |
| The command line is: | 11:51.11 |
| gs/debugbin/gswin32c.exe -r300 -o out.pbm -sDEVICE=pbmraw ../ghostpcl/tests_private/pdf/sumatra/1900_-_cairo_transparency_inefficiency.pdf | 11:51.31 |
chrisl | Hrm, no, I don't - it runs to completion for me...... | 11:54.56 |
Robin_Watts | hmm. | 11:56.20 |
chrisl | possibly depends on other changes? | 11:58.08 |
Robin_Watts | possibly. I will investigate some more. | 12:00.13 |
| Do you want to give me your patch to test? | 12:00.37 |
kens | If I may suggest ? Send Robin the patch and let him test it | 12:00.40 |
| Or take any PDF file with multiple streams per page, and hack out one q/Q from the first stream | 12:01.22 |
chrisl | I'd like to test it at least on a simple case first | 12:01.25 |
Robin_Watts | chrisl: I have a plumber here at the moment, so may lag, but I'll gladly do any testing you feel is required. Thanks for this. | 12:04.53 |
chrisl | Robin_Watts: okay, I'm struggling to break a file so it issues lots of warnings, so..... | 12:05.26 |
Robin_Watts | Oh, I bet the error reported there gets ignored higher up in the current code. | 12:11.39 |
| I've added a load of code to propagate such errors. | 12:11.58 |
| that would explain it. | 12:12.08 |
chrisl | No, I've got it now - the code I was working with was *way* out of date | 12:12.20 |
Robin_Watts | oh, ok. | 12:13.11 |
chrisl | Mind, it's taking a *hell* of a long time to run. We may have hit some kind of infinite loop :-( | 12:15.36 |
kens | The cluster kills infinite loops IIRC | 12:16.15 |
chrisl | Yep, it's just it points to more being wrong here than spewing a lot of warnings | 12:17.54 |
kens | I think that's true anyway, the code is broken if you return that. I think Robin was just complaining that the logs got broken. | 12:18.31 |
| To my mind it's not terribly worrying which way the error shows up, but it seemed to bother Robin | 12:18.53 |
chrisl | TBH, it's bothered me before - not with the cluster, but with my own scripts. I don't like the test reports eating up *all* my disk space! | 12:19.52 |
kens | I don't log the output, so it doesn't worry me | 12:20.55 |
chrisl | I'm often looking for errors or warnings.... | 12:22.14 |
Robin_Watts | chrisl: Are you cluster testing with that extra thing in to break it ? | 12:22.59 |
chrisl | Robin_Watts: no, I'm just having a think about how I've implemented this | 12:23.33 |
Robin_Watts | plumber has fixed heating (I hope). | 12:28.19 |
Robin_Watts | heads for run, bbs. | 12:28.26 |
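The "warn once per page" idea discussed above can be sketched in C. This is only an illustration of the flag logic: the fix chrisl describes lives in the PostScript-based PDF interpreter and uses .forceput, and the names warn_state, begin_page and warn_unbalanced_qq below are hypothetical, not Ghostscript identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-interpreter state; not a real Ghostscript structure. */
typedef struct {
    bool qq_warned; /* true once the q/Q warning was printed for this page */
} warn_state;

/* Reset the flag at the start of each page, so each bad page warns once. */
void begin_page(warn_state *st)
{
    st->qq_warned = false;
}

/* Print the warning only if it has not been printed for the current page.
 * Returns true when the message was actually written. */
bool warn_unbalanced_qq(warn_state *st, FILE *log)
{
    if (st->qq_warned)
        return false;
    st->qq_warned = true;
    fputs(" **** File has unbalanced q/Q operators (too many Q's) ****\n", log);
    return true;
}

/* Small self-check: two hits on one page print once, a new page prints again. */
void warn_once_self_check(void)
{
    warn_state st = { false };
    begin_page(&st);
    assert(warn_unbalanced_qq(&st, stderr));  /* first hit on the page prints */
    assert(!warn_unbalanced_qq(&st, stderr)); /* repeats are swallowed */
    begin_page(&st);
    assert(warn_unbalanced_qq(&st, stderr));  /* next page warns again */
}
```

Resetting in begin_page gives the per-page behaviour kens asked for; dropping that reset would give the once-per-file variant Robin said he could also live with.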
fnodeuser | hello, are there any plans to improve the situation with pdf files with filesizes of over 500 MB? the sumatrapdf developer said that it depends on you, the mupdf team, to solve the problem of the high memory usage | 12:34.04 |
tor8 | Robin_Watts: I think I've figured out why commit da277059b37380d57028ff79a636f4d725c96e8f is broken | 12:59.10 |
| what happens to your quantisation tricks if the coordinates are negative? | 12:59.29 |
tkamppeter | chrisl, kens, I want to modify gdevcups.c so that one can use both "cups" and "pwgraster" as device name (and if the user uses the latter, PWG Raster mode should be selected). | 13:31.20 |
kens | I'm not sure what your question is Till | 13:33.03 |
| If you want two names I think you need 2 devices, but the code body can be identical | 13:33.18 |
tkamppeter | chrisl, kens, problem is that I have to duplicate the huge data structure gs_cups_device to appear also as gs_pwgraster_device and I tried with macros and this did not really work. | 13:34.16 |
| kens, I can e-mail you the files, so that you can see the problem. | 13:34.49 |
kens | Yes that's what I do with pdfwrite/ps2write/eps2write | 13:34.51 |
tkamppeter | kens, if I have the huge structure twice in the code, all works, if I use macros, it does not build. | 13:35.39 |
kens | tkamppeter then that sounds like a macro problem, since the expansion of the macro should be identical to having the code there | 13:36.05 |
| The way pdfwrite does this is to have the structure defined in an include file (gdevpdfb.h) and include it multiple times. | 13:38.13 |
| Not the structure definition, but the instantiation of the device which uses the structure | 13:39.40 |
tkamppeter | kens, I have sent you the files by mail now. | 13:40.54 |
kens | Till, it's not really my area, more like Chris's | 13:41.11 |
tkamppeter | chrisl, are you around? | 13:41.31 |
kens | probably at lunch | 13:41.45 |
chrisl | I'm eating. I'm not especially good with device definitions - mail me the files, and I'll look this afternoon | 13:42.21 |
tkamppeter | chrisl, done. | 13:56.07 |
Robin_Watts | tor8: Dunno. Will look in a mo. | 14:05.38 |
tor8 | Robin_Watts: I think the "Reassemble the complete transform" step is where things go pear shaped | 14:09.19 |
| if I revert the quantisation logic to what we had before, it all comes together. but my cold is making my head muzzy enough that I can't quite grasp the details of the rounding and truncation combinations your new stuff does. | 14:09.53 |
| Robin_Watts: the "fixed" reversion of logic is up at tor/foo | 14:10.26 |
| but that doesn't have the nice properties of the rounding you added | 14:10.48 |
Robin_Watts | tor8: leave it with me for a bit. | 14:11.05 |
chrisl | tkamppeter: check your mail...... | 14:38.20 |
tkamppeter | chrisl, thank you very much. | 14:41.46 |
chrisl | tkamppeter: no problem | 14:42.18 |
tkamppeter | chrisl, did a quick check, works. | 14:45.53 |
chrisl | tkamppeter: cool. It is a very stupid restriction on the pre-processor :-( | 14:46.48 |
henrys | Robin_Watts: supernatural season 9 underway | 14:58.51 |
Robin_Watts | henrys: I've seen 8, but 9 hasn't been shown here yet. | 15:10.17 |
| http://pdfliberation.wordpress.com/2013/11/15/hackathon/ How come we aren't mentioned? | 15:17.38 |
chrisl | Robin_Watts: I accidentally cluster tested with the gdevp14.c change, and it ran through, and the cluster didn't complain, so I think that's a good sign | 15:18.01 |
Robin_Watts | chrisl: Your last cluster test had many files starting to produce errors, didn't it? | 15:20.00 |
| oh, but that's probably because of the gdevp14.c change. | 15:20.26 |
chrisl | Oh, crap, no I didn't see that - how come it didn't come up before? I better revert that :_( | 15:21.05 |
paulgardiner | henrys: Linda and I watched s09e09 last night, but now have to endure the xmas break. | 15:21.36 |
Robin_Watts | chrisl: You haven't pushed yet, have you? | 15:22.13 |
chrisl | No, not pushed - I was worried I had, though | 15:23.30 |
Robin_Watts | mvrhel_laptop: The patch I sent last night for your consideration... | 15:23.41 |
| 1) The line you were worried about was actually the line that started the whole thing - I got a warning from the cluster telling me that the value assigned there was never used. | 15:24.16 |
| 2) The cluster tests failed when I tested it, because of a bug in the pdf14 stuff that caused an infinite amount of crap to be spewed to the logs. Chris has kindly been fixing the infinite amount of crap, but the pdf14 bug remains. | 15:25.12 |
| I'll get you full details of that problem in a mo, but the short version is that pdf14_update_device_color_procs_push_c can be called with an unknown group_color and we get a rangecheck back. | 15:26.47 |
| We only survive that at the moment, because we ignore the return value. | 15:27.01 |
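The difference between ignoring and propagating the return code can be sketched as follows. All names here are hypothetical stand-ins (this is not the gdevp14.c code), and the error value is an arbitrary negative number chosen for illustration; the only convention taken from the discussion is that a negative code means failure.

```c
#include <assert.h>

/* Illustrative codes only: negative means error, as in "if (code < 0)". */
enum { DEMO_OK = 0, DEMO_RANGECHECK = -1 };

/* Hypothetical stand-in for the push call: rejects an unknown group color. */
int demo_push_color_procs(int group_color)
{
    return (group_color >= 0 && group_color <= 2) ? DEMO_OK : DEMO_RANGECHECK;
}

/* Behaviour described as current: the result is assigned but never used. */
int demo_begin_group_ignoring(int group_color, int sep_target)
{
    int code = DEMO_OK;
    if (!sep_target)
        code = demo_push_color_procs(group_color);
    (void)code; /* error silently dropped */
    return DEMO_OK;
}

/* Behaviour after adding "if (code < 0) return code;". */
int demo_begin_group_checked(int group_color, int sep_target)
{
    if (!sep_target) {
        int code = demo_push_color_procs(group_color);
        if (code < 0)
            return code; /* caller now sees the rangecheck */
    }
    return DEMO_OK;
}

/* Self-check: the same bad input is hidden by one version, surfaced by the other. */
void demo_self_check(void)
{
    assert(demo_begin_group_ignoring(7, 0) == DEMO_OK);
    assert(demo_begin_group_checked(7, 0) == DEMO_RANGECHECK);
}
```

The checked version is the more honest one, but as the log shows, surfacing an error that was previously swallowed can expose callers that were relying on it being ignored.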
henrys | paulgardiner: how long can it drag on? | 15:29.01 |
paulgardiner | I don't know. They've shown several "last" series. I'd guess (average life expectancy - Dean's current age) / series duration | 15:34.55 |
Robin_Watts | henrys: The only 'bad' series I remember was the start of 6, while they struggled to get it back on track after tying it up so nicely at the end of 5. | 15:47.05 |
| paulgardiner: Scott has just forwarded a question about mupdf annotations to support. | 15:53.37 |
| Do you want to handle that, or should I ? | 15:53.47 |
paulgardiner | Robin_Watts: okay ta | 15:53.50 |
mvrhel_laptop | Robin_Watts: ok. if you can give me a file that has the issue I will look into it | 16:06.41 |
| or is it on the dashboard someplace | 16:07.00 |
Robin_Watts | mvrhel_laptop: gs/debugbin/gswin32c.exe -r300 -o out.pbm -sDEVICE=pbmraw ../ghostpcl/tests_private/pdf/sumatra/1900_-_cairo_transparency_inefficiency.pdf | 16:07.11 |
| I believe there are many files like that. | 16:07.29 |
mvrhel_laptop | is there a better one than a cairo one.... | 16:07.30 |
Robin_Watts | That's just one example. | 16:07.39 |
| That's the only one I actually know about. | 16:07.45 |
mvrhel_laptop | those are such badly set up files | 16:07.48 |
Robin_Watts | When chris gets his patch in, I can rerun and get a list of files that have the problem, probably. | 16:08.23 |
mvrhel_laptop | ok | 16:08.25 |
| I will look at that file | 16:08.35 |
Robin_Watts | We can easily ignore it until chrisl's patch goes in. | 16:08.50 |
| I know you are still battling the clist issues. | 16:08.58 |
| clist/transparency/pattern/etc | 16:09.10 |
mvrhel_laptop | ok. yes. I would like to chat with ray about this. any idea how he is doing? | 16:09.26 |
Robin_Watts | I haven't spoken to him since yesterday. He seemed OK, but I think the novelty has worn off. | 16:09.53 |
| tor8: Testing with -ve values for e and f seems to give the right results for me. | 16:39.01 |
| Or at least, they look right to me in my test program. | 16:39.11 |
| I wonder if it's a and/or d that need to be -ve for it to go wrong? | 16:39.36 |
| tor8: Can you give me a file where this causes a problem please? | 16:58.28 |
tor8 | Robin_Watts: pdfref17.pdf, rotate left (with 'a' key) 5 times | 17:03.29 |
chrisl | Robin_Watts: I've pushed the q/Q change - I need to head out, but I'll check the logs when I get back just in case | 17:04.13 |
Robin_Watts | tor8: I see it, thanks. | 17:04.31 |
| I was looking for it in orthogonal text :) | 17:04.52 |
tor8 | right. it shows up in pretty much any text once you rotate left 90 degrees (which makes the y coordinates negative) | 17:05.33 |
Robin_Watts | tor8: To get a 90 degree rotation, that's 6 presses. | 17:06.32 |
tor8 | I expect just translating a page so that it ends up in negative coordinate space should work, but since we 'normalise' page coords so that they start at 0,0 and extend in the positive direction, that'll be trickier to test | 17:06.44 |
Robin_Watts | Just to be clear are you saying I should be able to see it at 90 degrees ? | 17:06.51 |
tor8 | Robin_Watts: yes. you should be able to see it at 90 degrees (as long as it's in the left-rotated quadrant) | 17:07.09 |
Robin_Watts | yeah, maybe in november. | 17:07.19 |
tor8 | you should see very uneven character spacing | 17:07.22 |
Robin_Watts | OK. Thanks. | 17:07.29 |
tor8 | go forward to page 2 | 17:07.33 |
| and it's stunningly obvious | 17:07.40 |
Robin_Watts | ew, yes. | 17:07.48 |
| OK. I can work from this. Thanks. | 17:08.04 |
ray_laptop | morning, mvrhel_laptop | 17:13.38 |
| I'm home now | 17:13.45 |
mvrhel_laptop | hi ray_laptop | 17:13.51 |
| ok I am stepping through the clist writing phase right now | 17:14.02 |
| seeing when the cmd put colors occur and what band they go in | 17:14.15 |
| hold on a sec | 17:14.20 |
ray_laptop | mvrhel_laptop: so which file are you on (I understand that 'simple' works OK) | 17:14.26 |
mvrhel_laptop | easy5 | 17:15.03 |
| ray_laptop | 17:15.13 |
| and then using your command line | 17:15.14 |
ray_laptop | mvrhel_laptop: thx | 17:15.16 |
mvrhel_laptop | put a counting break point at | 17:15.29 |
| gx_pattern_cache_ensure_space line 2046 in gsptype1.c for 22. 23 is the "extra read of the pattern" | 17:16.24 |
| excuse me. "using your command line" ? why did I type that | 17:16.57 |
| using visual studio i mean | 17:17.15 |
| ray_laptop ^^ | 17:17.23 |
Robin_Watts | mvrhel_laptop: Presumably you meant "using the command line options given in the bug" | 17:17.48 |
ray_laptop | mvrhel_laptop: oops. I need to git pull this sandbox. just a minute. | 17:18.04 |
mvrhel_laptop | oh yes that is what I meant | 17:18.17 |
ray_laptop | rebuilding after update... | 17:21.03 |
Robin_Watts | We are now listed on the hackathon page. (http://pdfliberation.wordpress.com/2013/11/15/hackathon/) | 18:19.07 |
| http://www.quietroom.co.uk/santa_brandbook | 18:38.00 |
ray_laptop | Robin_Watts: the description seems a bit limited. I wonder who wrote that ? It also doesn't mention any of the apps or anything other than 'basic text extraction'. I suspect that we'd get more interest with a better description | 18:46.04 |
Robin_Watts | ray_laptop: The author of the site wrote the description. | 18:46.49 |
| The hackathon is specifically about pdf text extraction, hence him focussing on that. | 18:47.28 |
| I pointed him at the xml output modes of the text extraction, but he said "hmm, doesn't seem like it would be easy to build anything from that." | 18:47.58 |
ray_laptop | someone (Robin?) should contact them and give a bit more info. For instance, svg output and image extraction, and rendering to popular formats with full transparency. But... | 18:48.35 |
| Robin_Watts: oh, well. | 18:49.07 |
Robin_Watts | I countered with "I would imagine that anyone trying to do any serious text analysis would need at least that much information, and XML is easy to work with with free tools out there" | 18:49.14 |
| ray_laptop: Yeah, the non text extraction based features aren't going to get mentioned on this page as they aren't relevant. | 18:49.46 |
mvrhel_laptop | ray_laptop: I am back | 18:50.27 |
Robin_Watts | ray_laptop: I think if we decided to sponsor them we could set our own challenge. Probably within that we could specify the use of MuPDF ? | 18:55.05 |
mvrhel_laptop | ray_laptop: so sure enough, on the last image, when it does the write out to band 14 it writes out to all the bands | 18:55.06 |
| ray_laptop: and *NOT* doing the write out to all bands fixes the issue | 18:56.15 |
Robin_Watts | mvrhel_laptop: That sounds like progress. | 18:59.16 |
mvrhel_laptop | yes | 18:59.24 |
| ray_laptop: so pre->nbands = 1 for band 17 when we write out the full pattern. later when we do band 14 pre->nbands = 5 so we decide to write out to all bands | 19:21.32 |
| either one of two things needs to be fixed here. 1) we need to avoid writing out the same pattern to the same bands or 2) I need to figure out a better place to store the pointer to the group buffer that the tile is filling | 19:22.54 |
| so that I can restore it | 19:23.03 |
| when we blow away the current tile | 19:23.11 |
| 1) seems like a better approach to me, | 19:23.39 |
| actually, getting it to simply write out to all bands when it does 17 would seem to make sense | 19:27.52 |
| ray_laptop: that also fixes the issue | 19:29.24 |
| essentially, a problem is that we start the image, writing out the pattern to a single band, then later decide we will also write it out to all the bands | 19:30.04 |
| so I think the test | 19:30.30 |
| if (!all_bands && dc_size * pre->nbands > 1024*1024 /* arbitrary */) | 19:30.32 |
| all_bands = true; | 19:30.34 |
| with the use of pre->nbands needs a little work | 19:30.49 |
| ray_laptop: I have to head out for a bit. bbiaw | 19:32.43 |
Robin_Watts | marcos: I'm guessing that you're playing with the cluster with a new user called robin.mhw ? :) | 20:33.54 |
| tor8: (For the logs) The subpixel rendering thing is cured with http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=0d837b4a07681331648016cca7093402df68ca9f | 20:34.38 |
| It turns out that (int)-98.5 = -98. Whereas floorf(-98.5) = -99. | 20:35.02 |
| The actual subpixel adjustment code was correct. | 20:35.13 |
| It was the code after that that was wrong. The subpixel adjustment just had the effect of pulling more things onto an integer boundary which caused the problem. | 20:35.51 |
mvrhel_laptop | hi ray_laptop | 21:54.19 |
| did you see my comments from earlier? | 21:54.26 |
ray_laptop | "conibuation" -- that's a new one. Someone needs a spell checker ;-) | 23:56.45 |