| 2012/06/13 |
sebras | and now rebased on top of a fetched master. *sigh* I should have a githook that tells me if my master is behind origin/master... | 00:35.25 |
Derelict | Hi all. Trying to use gs to convert a tiff to PCL for a macro. When I run this | 00:45.26 |
| gs -sDevice=pcl3 -sColorModel=Gray -s/tmp/OutputFile=9101-header-grayscale.pcl -dNODISPLAY 9101-header-grayscale.ps | 00:45.26 |
| I never get an output file. Any thoughts? | 00:45.26 |
| That's not what I'm running. I'm running -sOutputFile=/tmp/filename | 00:45.59 |
| gs -sDevice=pcl3 -sColorModel=Gray -sOutputFile=/tmp/9101-header-grayscale.pcl -dNODISPLAY 9101-header-grayscale.ps | 00:47.00 |
| All I get is the GS> prompt. If I don't include -dNODISPLAY I also get an X11 window with the ps file displayed. | 00:48.31 |
| Mac OS X, ghostscript from macports. | 00:49.59 |
alexcher | Derelict: You need -dNOPAUSE -dBATCH to suppress all prompts. | 01:03.42 |
| Derelict: You also need to have showpage operator somewhere. | 01:04.36 |
| Derelict: -dNODILPLAY option selects an output device that produces no output. You need a different device. | 01:06.34 |
| Derelict: gs options are case-sensitive. You need -sDEVICE= | 01:07.29 |
Derelict | let me try that. | 01:07.48 |
alexcher | Derelict: s/-dNODILPLAY/-dNODISPLAY/ | 01:08.40 |
Derelict | There it is. With -sDEVICE=pcl3 and removing -dNODISPLAY I get the file on showpage. thanks a bunch. The old web pages I was looking at had -sDevice and I didn't notice the all caps in the man pages. | 01:10.28 |
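The fix worked out above can be summarised in a short sketch. It assumes Python is used only to assemble the command line; the helper name `gs_convert_args` is hypothetical, while the options themselves (`-sDEVICE`, `-sOutputFile`, `-dNOPAUSE`, `-dBATCH`) are the ones named in the conversation.

```python
def gs_convert_args(device, output_path, input_path):
    """Assemble a Ghostscript command line for a non-interactive conversion.

    Hypothetical helper: it only builds the argument list, it does not run gs.
    """
    return [
        "gs",
        f"-sDEVICE={device}",            # option names are case-sensitive
        f"-sOutputFile={output_path}",
        "-dNOPAUSE",                     # do not pause after each page
        "-dBATCH",                       # exit after the last file, no GS> prompt
        input_path,
    ]


args = gs_convert_args("pcl3", "/tmp/9101-header-grayscale.pcl",
                       "9101-header-grayscale.ps")
print(" ".join(args))
```

Note that `-dNODISPLAY` is deliberately absent, since it selects a device that produces no output, and that the input file still needs a `showpage` for a page to be emitted.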
chrisl | kens: I've found the problem with that vertical writing issue in pdfwrite...... | 07:34.42 |
| zchar42_set_cache() does the same fake vertical metrics "trick" that FAPI_do_char() used to :-( | 07:35.36 |
kens | That was quick chrisl | 07:35.51 |
| I had only got as far as downloading the file. | 07:36.02 |
chrisl | I knew what I was looking for, just not where to look - made a big difference! | 07:36.24 |
kens | I guess that FAPI copied that fake code out of the zchar42_set_cache code.... | 07:36.25 |
chrisl | Yeh, it looks that way | 07:36.41 |
kens | chrisl does that mean you are going to take over the bug ? | 07:36.48 |
chrisl | I guess I might as well. The only thing I need to work out is how to find if the Truetype font is embedded or read off disk, then I can fix it | 07:37.45 |
kens | Seems like you are well ahead of me... | 07:38.00 |
chrisl | Well, from working on the FAPI code, it seems that we generate the "fake" vertical metrics to handle the case where a TTF is used to substitute a vertical writing CIDFont, which makes sense | 07:39.00 |
kens | Yes, that seems reasonable, but we shouldn't have to do that if the font is embedded. | 07:39.37 |
chrisl | *But* a real CIDFont with wmode 1 but without vertical metrics should be treated as a wmode 0 font | 07:39.52 |
kens | Ah... | 07:40.03 |
| That sounds like some kind of Adobe kludge | 07:40.27 |
| Presumably to allow you to use a horizontal font in vertical writing | 07:40.48 |
chrisl | Yeh, it's pretty naff, IIRC, it is actually documented that way | 07:40.59 |
| Okay, that works for the test job, now I'll need to check the right thing happens when the font isn't embedded - I think I'll get more coffee first, though..... | 07:50.01 |
ray_laptop | g'nite (not the best way to show up) :-) | 08:30.27 |
kens | night ray_laptop | 08:30.35 |
| Kind of late.... | 08:30.40 |
ray_laptop | kens: yep -- just whar | 08:31.03 |
| kens: just what I was realizing | 08:31.28 |
kens | :) | 08:31.35 |
ray_laptop | bbiaw | 08:32.17 |
Derelict | I'm using this to convert a grayscale tiff to PCL for use in a macro. I get ugly monochrome with dithering out. Any suggestions on flags to use? | 08:35.22 |
| tiff2ps -h 11 -w 8.5 9101-Invoice-Macro-Image.tif > 9101-Invoice-Macro-Image.ps | 08:35.23 |
| gs -sDEVICE=ljet4 -sOutputFile=9101-Invoice-Macro-Image.pcl -dNOPAUSE -dBATCH -q 9101-Invoice-Macro-Image.ps | 08:35.23 |
| printing the ps file looks fine. | 08:35.23 |
kens | tiff2ps converts to PostScript, not PCL.... | 08:36.31 |
Derelict | yeah. then I'm using gs to convert the ps to PCL | 08:36.48 |
kens | The ljet4 device is (probably) a monochrome device | 08:36.50 |
| So the result of sending greyscale to it will be halftoned | 08:37.05 |
Derelict | It does grayscale. | 08:37.08 |
kens | The printer might be, but is the GS device ? | 08:37.29 |
| I would suggest trying a colour device. | 08:37.42 |
Derelict | I'm being dumb. there's an lj5gray device. let me try that. | 08:38.59 |
| Hmm. my printer barfs on that. | 08:40.36 |
| ljet4pjl works. thanks. | 08:45.12 |
kens | Can't say we did much :) | 08:45.35 |
d3c | will MuPDF use multiple cores when processing PDFs? | 11:01.42 |
tor8 | d3c: not out of the box, no. | 11:03.20 |
d3c | tor8: ok, this is for Amazon EC2, so if I pick an instance with more cores, it wouldn't speed up processing time? | 11:03.54 |
tor8 | mudraw doesn't use multiple threads, but the mupdf library has support for some concurrency | 11:04.36 |
| parsing always happens single-threaded, but rendering can be split up and done in multiple threads | 11:05.08 |
d3c | tor8: ok. seems like I have to do some things myself to make it work? | 11:06.21 |
| tor8: (for rendering) | 11:06.26 |
tor8 | you need to supply threading primitive callbacks to the mupdf library (we're threading library agnostic) then you can spawn new contexts for the worker threads | 11:07.07 |
| each thread needs its own context | 11:07.18 |
| then you parse to a display list, and pass the display list to your worker threads to render the page in tiles that you assemble at output | 11:07.51 |
| d3c: you can look in doc/multi-threaded.c for an example | 11:09.31 |
d3c | tor8: ok… I'm using the mudraw command line tool, though. I'm not sure how to do what you say. | 11:09.38 |
| tor8: but not your problem. I'll see what I can do :) | 11:09.55 |
tor8 | d3c: you'll need to modify the mudraw command line source... | 11:10.00 |
| or bash the multi-threaded example into something similar to mudraw. the example is basically the same as mudraw minus all the options. | 11:10.31 |
| the example is also for stress testing since it spawns one thread per page, so you'll need to change that | 11:11.28 |
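The tiling scheme tor8 describes (parse once, render tiles on worker threads, assemble at output) can be sketched generically. This is not the MuPDF API: `tile_rects`, `render_tile` and `render_page` are hypothetical names, and `render_tile` is a stand-in that only reports the tile's pixel count. In the real library each worker would need its own context and would render from the shared display list, as stated above.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_rects(width, height, tile):
    """Split a width x height page into (x0, y0, x1, y1) tiles of at most tile x tile."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def render_tile(rect):
    """Stand-in for a real renderer: returns the rect and its pixel count."""
    x0, y0, x1, y1 = rect
    return rect, (x1 - x0) * (y1 - y0)


def render_page(width, height, tile, workers=4):
    """Render every tile on a worker thread and collect the results in tile order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_tile, tile_rects(width, height, tile)))


results = render_page(1000, 700, 256)
print(len(results), sum(pixels for _, pixels in results))
```

A fixed-size pool is used here rather than one thread per page, which is the change tor8 suggests making to the stress-test example.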
d3c | tor8: ok. I just ran mudraw on a really big EC2 instance. it is actually able to utilize 100 % of the CPU | 11:12.51 |
| tor8: out of the box | 11:12.55 |
chrisl | d3c: have you got files with lots of pages, or lots of smaller files to process? | 11:13.15 |
d3c | chrisl: ~4000 PDFs with ~50-100 pages each | 11:13.36 |
| chrisl: 10-150M each | 11:13.54 |
chrisl | So, you might be better to consider running multiple instances of the mudraw executable, instead of a since, multithreaded executable | 11:14.06 |
| s/since/single | 11:14.22 |
d3c | chrisl: yeah, that may be right. will give it a try | 11:14.37 |
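chrisl's suggestion of running several mudraw processes side by side might look like the sketch below. The helper names are hypothetical, and the `%d` page-number pattern in the `-o` argument is an assumption about mudraw's output naming; only `-o` itself appears in this log. The `run` parameter exists so the scheduling can be exercised without mudraw installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def mudraw_args(pdf_path, out_dir):
    """Build a mudraw command writing one PNG per page (output pattern is an assumption)."""
    stem = Path(pdf_path).stem
    return ["mudraw", "-o", str(Path(out_dir) / f"{stem}-%d.png"), str(pdf_path)]


def convert_all(pdf_paths, out_dir, jobs, run=subprocess.call):
    """Run up to `jobs` mudraw processes at once; returns {pdf_path: exit_code}."""
    pdf_paths = list(pdf_paths)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Each worker thread just waits on one external process at a time.
        codes = list(pool.map(lambda p: run(mudraw_args(p, out_dir)), pdf_paths))
    return dict(zip(pdf_paths, codes))
```

Calling `convert_all(paths, "out", jobs=8)` would keep eight processes busy; a reasonable starting point for `jobs` is the number of cores on the instance.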
paulgardiner | tor8: You asked what was the purpose of pdf_get_stream. It turns out the answer is "to add unnecessarily to the code base and provide a way that I can achieve something in 3 calls rather than in just one." :-) | 11:16.42 |
tor8 | paulgardiner: morning. I thought pdf_load_stream would be what you wanted? | 11:18.21 |
paulgardiner | Possibly, although my only use currently is to obtain an fz_stream, so perhaps I should call pdf_open_stream? | 11:19.08 |
tor8 | if you just want the stream (and not load it into a buffer) open_stream is the way to go, yes | 11:19.38 |
paulgardiner | Will that access stmbuf in the case that the stream has been preloaded into memory? ... if that makes sense? | 11:19.41 |
tor8 | yeah, that's part of the changes I resurrected | 11:19.54 |
| so if you've updated the stream with pdf_update_stream it'll read the data back with pdf_open_stream | 11:20.13 |
paulgardiner | Great. Glad you spotted that. | 11:20.29 |
| I'll update it and let you know when I've pushed again to casper. | 11:20.56 |
tor8 | also, I'd like if you renamed pdf_xobject_set_contents to pdf_update_xobject_contents (or set, if you insist) | 11:21.15 |
paulgardiner | Sure | 11:21.28 |
tor8 | Robin_Watts cherry-picked and put your patches onto master | 11:21.48 |
paulgardiner | Ah. I'll stick my fix on the end of master. | 11:22.16 |
tor8 | I thought I'd get search working today in the gtk+ viewer, then figure out how to do linking. once that's done I'll bug you about placing form widgets onto the page :) | 11:23.08 |
paulgardiner | Right. | 11:24.22 |
tor8 | can you compile/run gtk stuff on your platform? if not I'll take a stab at porting it to win32 as well. | 11:24.58 |
| I've never tried compiling gtk on windows, I'm guessing it's a nightmare | 11:25.27 |
paulgardiner | I have opensuse on a headless server and vnc access | 11:25.49 |
| placing form widgets on the page? Are we looking to place native widgets over the document area of the widget? On windows so far, I've just been invoking a central dialog box. | 11:27.26 |
chrisl | building gtk on Windows was a nightmare the last time I tried it (it may be better now), but at that time there were developer packages available to download - I'm not sure how often they update them, though. | 11:27.53 |
paulgardiner | My biggest worry concerning moving permanently away from windows for development would be the loss of VS debugging. | 11:29.06 |
chrisl | paulgardiner: if the plan is to use gtk, it would be wise to do that development on Windows, anyway, because not all gtk capabilities are available on Windows. It would be a pain to trip over any of those porting it over to Windows. | 11:34.19 |
paulgardiner | Yeah, that's a good point. | 11:35.06 |
| tor8: once you have it working on linux, I'd be happy to see if I can get it to build for Windows. | 11:36.31 |
tor8 | paulgardiner: great. I'll let you do that :) hopefully it'll be as easy as dropping in a .lib somewhere | 11:38.12 |
chrisl | slinks off to lunch after mammoth commit........ | 11:38.29 |
tor8 | if not, cooking up a native win32 version shouldn't be too much work | 11:38.41 |
Robin_Watts | paulgardiner: I wrote mujstest last night (or the guts of it) | 11:43.23 |
| It's basically a hacked win_main.c that wraps pdfapp.c (with all the windows specific stuff removed) | 11:43.54 |
paulgardiner | Robin_Watts: streuth. Not hanging about then?! | 11:44.10 |
Robin_Watts | Am I right in thinking that currently, if you click in a text field, you get a dialog box pop up. Everything freezes while you enter some text in that box, and when you hit return it goes back to the app? | 11:44.52 |
| paulgardiner: It all fell out far faster than I expected. | 11:45.04 |
| I haven't tested it at all of course :) | 11:45.11 |
paulgardiner | Robin_Watts: yes that's right about the dialog box. although I wasn't necessarily thinking that we'd use the same way to test. Could call the library entry points without the dialog | 11:46.18 |
| Robin_Watts: but if that works nicely... | 11:47.18 |
Robin_Watts | no, it doesn't produce a dialogue. | 11:47.29 |
| but when you click somewhere, it calls wingettext (or whatever). mujstest implements that to just return a string. | 11:48.01 |
| So the scripts will do: TEXT "blah"\nCLICK 100 200 | 11:48.21 |
| and then on that click, it'll get "blah" entered into the field at 100 200. | 11:48.35 |
paulgardiner | Ah yes. That makes sense | 11:48.37 |
Robin_Watts | It's slightly ass backwards, but it should work. | 11:48.50 |
| paulgardiner: So today I will need to write some test scripts. | 11:50.49 |
paulgardiner | I think we'll also need another type of test where we enumerate all the buttons and emulate a click on each, but there is no enumeration method at the moment. | 11:51.14 |
Robin_Watts | I was hoping to hack something together where we would throw out coords of all the fields when we first render them. | 11:52.00 |
paulgardiner | That test could be applied to all 2500 test files without having to set up specifically for each. | 11:52.01 |
Robin_Watts | Then we could run each file once, capture the output, and munge that into a mujstest script for each file. | 11:52.34 |
paulgardiner | Oh yeah. Automatically create a set of tests | 11:53.22 |
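Turning captured field coordinates into a script, as proposed above, could be as simple as the following sketch. It uses only the two commands quoted in this conversation (`TEXT` and `CLICK`); the real mujstest script format may differ, and `make_script` and its input tuples are hypothetical.

```python
def make_script(fields):
    """Emit TEXT/CLICK line pairs for each (x, y, text) field.

    Only the two commands quoted in the chat are used; the real mujstest
    script format may differ.
    """
    lines = []
    for x, y, text in fields:
        lines.append(f'TEXT "{text}"')   # text to return when the field asks for input
        lines.append(f"CLICK {x} {y}")   # click that triggers the text entry
    return "\n".join(lines) + "\n"


script = make_script([(100, 200, "blah")])
print(script, end="")
```

The `TEXT` line comes before the `CLICK` because the text is consumed when the click lands on the field, which is the ordering Robin_Watts describes.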
Robin_Watts | So, what are the best files for me to test? | 11:53.40 |
| You say you have 2500... how can I get that set from you? | 11:53.54 |
| Can you copy them onto your server, and I'll wget them ? | 11:54.36 |
paulgardiner | Sure | 11:55.16 |
kens | tor8 ping | 12:24.10 |
Robin_Watts | tor8: ping #2 | 12:24.29 |
tor8 | double pong! | 12:25.35 |
Robin_Watts | tor8: Yesterday you mentioned a problem with transparency in ghostxps, and I said maybe it was the same problem as I was seeing with SMasks in PDF. Maybe it is the same problem, but it won't be cured by the same fix; the fix for mine will be in the PDF interpreter, which clearly won't affect the xps problem. | 12:25.44 |
tor8 | Robin_Watts: okay. then I'll need to dig more. | 12:26.11 |
Robin_Watts | That occurred to me while running earlier, so thought I should tell you. I'll shut up and leave you with kens now :) | 12:26.11 |
kens | tor8 a long time ago when working with XPS I seem to remember there was a way to tell gxps to use a non-zipped directory or something, rather than having to keep unzipping and rezipping a directory tree. But I can't make it work, can you remind me please ? | 12:26.50 |
tor8 | kens: point it to the _rels/.rels file | 12:27.34 |
kens | Ah, thanks tor8 | 12:27.41 |
paulgardiner | Robin_Watts: zipping the test files didn't shrink them much (unsurprisingly I guess) and there's 2GB of them. I've instead put them here http://intranet.glidos.net/~paul in their original directory structure. Does that work for you? | 12:28.40 |
kens | That was it tor8 thanks :-) | 12:28.50 |
Robin_Watts | paulgardiner: Perfect. | 12:28.50 |
paulgardiner | If you wget the lot then if possible cap transfer rate at 500Kbits | 12:29.50 |
Robin_Watts | wget is refusing to recurse for me... | 12:42.53 |
| No, I'm having no luck with wget at all :( | 12:51.13 |
paulgardiner | tor8: Merging seems to have broken a few things, so it might take a bit longer than I expected to a get to the changes we discussed. | 12:52.31 |
Robin_Watts | paulgardiner: Can you make the file available under ftp ? | 12:52.59 |
| s/file/files/ ? Sorry. | 12:53.05 |
| or zip 'em, and I can wget a single tgz ? | 12:53.24 |
paulgardiner | 2GB though | 12:53.38 |
Robin_Watts | That's fine. | 12:54.08 |
paulgardiner | ok | 12:54.12 |
Robin_Watts | (well, it'll have to be fine :) ) | 12:54.33 |
chrisl | Robin_Watts: wget might work with the "-m" option | 12:56.01 |
Robin_Watts | chrisl: I tried that. | 12:56.08 |
chrisl | Oh, well, that's supposed to work.... :-( | 12:56.22 |
Robin_Watts | yeah :( | 12:56.26 |
paulgardiner | Robin_Watts: Ok. Try now | 12:59.11 |
Robin_Watts | Fetching now. thanks. | 13:00.10 |
d3c | what's the maximum size PNGs MuPDF can generate from a PDF? any hard limit somewhere? I'm getting segmentation faults when trying to generate a 20000x20000 PNG (where I afterwards need to crop an area from the PNG in a big resolution.) | 13:28.10 |
Robin_Watts | d3c: We have a hard limit of 4gig for the unpacked size. | 13:44.45 |
d3c | Robin_Watts: ah, right. that's what the bug was about. can I calculate at what dimensions I'll hit that limit? | 13:45.38 |
Robin_Watts | 4*w*h must be less than (1<<32) | 13:45.59 |
d3c | Robin_Watts: thanks a lot. | 13:47.06 |
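The limit Robin_Watts gives can be checked with a few lines of arithmetic. This sketch assumes the formula exactly as stated (4 bytes per pixel, total strictly below `1<<32`); the function names are hypothetical.

```python
from math import isqrt

LIMIT = 1 << 32          # unpacked size must stay below this many bytes
BYTES_PER_PIXEL = 4      # the factor 4 in the formula above


def fits(width, height):
    """True if 4*w*h is strictly below the stated limit."""
    return BYTES_PER_PIXEL * width * height < LIMIT


def max_square_side():
    """Largest s for which an s x s image still satisfies the formula."""
    return isqrt((LIMIT - 1) // BYTES_PER_PIXEL)


print(fits(20000, 20000), max_square_side())
```

Under that reading, 20000 x 20000 needs 1,600,000,000 bytes and passes the check, and the largest square that passes is 32767 pixels on a side.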
Robin_Watts | chrisl: Should we be offering lzma compression for tiffs? | 14:11.05 |
chrisl | Robin_Watts: why? | 14:11.42 |
kens | non-standard, not in TIF 6 spec | 14:11.54 |
Robin_Watts | Well, we offer all the other compression schemes ? | 14:11.57 |
chrisl | No we don't | 14:12.04 |
Robin_Watts | s/all the/many of the/ | 14:12.21 |
| lzma = better than lzw, and we offer lzw. | 14:12.38 |
chrisl | We offer all the baseline compressions, none of the optional/supplemental ones | 14:12.53 |
Robin_Watts | I believe lzw is the best lossless compression we offer for non 1bpp tiffs. | 14:12.57 |
tor8 | Robin_Watts: lzma certainly isn't 'baseline' | 14:13.11 |
Robin_Watts | Fair enough. I won't fight it. Just wondered if it was worth adding rather than disabling it in libtiff. | 14:13.50 |
tor8 | well, by default we shouldn't be producing them anyway | 14:14.10 |
alex11 | hi all, can I have a question regarding mudraw vs gs? | 14:15.51 |
chrisl | FWIW, I don't see the advantage, given how cheap bandwidth and disk space are these days, of offering a non-standard, quite slow compression, to get a slightly smaller file. | 14:15.52 |
Robin_Watts | chrisl: Fair enough. I was pondering the 2 (or 4) gig filesize thing. | 14:16.42 |
| alex11: Sure. shoot. | 14:16.49 |
chrisl | Ugh, there are also some license "funnies" that would make including liblzma a bit of a maintenance nightmare...... | 14:17.34 |
alex11 | I have a quite detailed image embedded in a pdf file - it's kind of technical drawing on a millimeter paper. I compared using gs and mudraw to save a page of this pdf to a png | 14:17.45 |
chrisl | Robin_Watts: If we're going to allow tiff extensions, then a better solution would be to allow BigTIFF files, which allow 64 bit offsets | 14:18.13 |
Robin_Watts | Sure. Like I say, it was just an idle thought. | 14:18.36 |
alex11 | ...and the thing is: gs is more precise (I guess mudraw's antialiasing blurs things a bit) but it has a nasty blue-ish color to it as compared to mudraw's | 14:18.59 |
Robin_Watts | alex11: You can disable antialiasing in mudraw. | 14:19.20 |
alex11 | what I'd like to have is gs precision with mudraw's colors - can I have it? | 14:19.24 |
Robin_Watts | gs uses color management, so its colors should be 'better' than mudraw. | 14:20.06 |
| but that does assume you have color management set up correctly. | 14:20.19 |
| Use mudraw -b 0 | 14:20.44 |
alex11 | in gs I use "-dGraphicsAlphaBits=4 -dTextAlphaBits=4" - is it comparable? | 14:21.30 |
kens | What version of GS ? | 14:21.54 |
Robin_Watts | Then you should get comparable output from gs and mupdf. | 14:21.56 |
alex11 | 9.05 | 14:22.06 |
kens | What ICC profile are you using for the output management ? | 14:22.07 |
tor8 | alex11: if you render with gs at a much higher resolution, then downscale the image using imagemagick or pnmscale you will get the best of both worlds, with the worst speed of any solution | 14:22.15 |
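tor8's render-large-then-downscale approach could be scripted along these lines. The sketch only builds the two command lines and does not run them; `supersample_commands` and the intermediate file name are hypothetical, the choice of the `png16m` device is an assumption, and ImageMagick's `convert ... -resize` is used for the downscale step.

```python
def supersample_commands(pdf_path, out_png, dpi, factor, tmp_png="big.png"):
    """Return (gs_cmd, resize_cmd): render at dpi*factor, then shrink by 1/factor."""
    gs_cmd = [
        "gs",
        "-sDEVICE=png16m",              # 24-bit RGB PNG device (assumed choice)
        f"-r{dpi * factor}",            # render at a multiple of the target resolution
        "-dNOPAUSE",
        "-dBATCH",
        f"-sOutputFile={tmp_png}",
        pdf_path,
    ]
    # ImageMagick scales the oversized render back down to the target size.
    resize_cmd = ["convert", tmp_png, "-resize", f"{100 / factor:g}%", out_png]
    return gs_cmd, resize_cmd


gs_cmd, resize_cmd = supersample_commands("in.pdf", "out.png", dpi=72, factor=4)
print(" ".join(gs_cmd))
print(" ".join(resize_cmd))
```

As tor8 notes, this trades speed for quality: the intermediate image has `factor` squared times as many pixels as the final one.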
alex11 | kens: I'm running on defaults | 14:22.21 |
tor8 | Robin_Watts: mupdf without antialiasing suffers from not having any dropout prevention though | 14:22.46 |
kens | alex11 well the GS colour ought to be correct, but colour management is tricky, and it depends a lot on the source colour space and destination space | 14:22.49 |
| You can modify the conversion by using a different ICC profile | 14:23.01 |
Robin_Watts | tor8: true. | 14:23.05 |
paulgardiner | tor8: Removed pdf_get_stream, did the requested renamings and it's all back working again. There's a commit on my master branch. | 14:24.31 |
tor8 | paulgardiner: fab. | 14:24.41 |
alex11 | kens: this is all on-screen thing; I have an app that produces a png that is later embedded in the pdf; the pdf and the mudraw-generated png have the same colors as the original png, but the one generated with gs is blueshifted. are there any available ICC profiles I could experiment with? any suggestions on some starting points? | 14:25.42 |
kens | alex11 colour management isn't really my area, and our colour expert is at a trade show this week | 14:26.41 |
| I think there are some words about colour management in the documentation | 14:26.57 |
| OK in the GS/doc directory is a PDF named 'GS9_color_management.pdf' | 14:27.54 |
| There are some words on usage in there | 14:28.11 |
| Its also possible this is a bug | 14:28.51 |
| There is an open bug report that sounds 'similar' to yours relating to whitePoint calibration | 14:29.14 |
alex11 | thanks, I have this and crunching through already, just thought that it could be some kind of simple switch that is by default set in mudraw that you guys know about, so that the colors in mudraw are more round-trippish than in gs | 14:29.42 |
kens | http://bugs.ghostscript.com/show_bug.cgi?id=692825 | 14:29.44 |
chrisl | When did -dUseFastColor come in? That should be in 9.05, I think | 14:30.04 |
kens | Yes it is | 14:30.10 |
| Err, maybe not actually | 14:30.19 |
chrisl | Also, I think 9.05 will embed an ICC profile in the png output, which could be having an effect | 14:30.43 |
kens | alex11 try rendering the file in that bug report above and see if it looks like your problem | 14:30.45 |
alex11 | kens yes, it looks very similar | 14:33.32 |
kens | Then it is probably that bug :) | 14:33.56 |
alex11 | that's very comforting :-) | 14:34.06 |
| how's that it is not present in mudraw? isn't mudraw using gs? | 14:34.22 |
kens | Michael will get to it eventually | 14:34.23 |
| No, mudraw is totally doivorced from GS | 14:34.33 |
| divorced* | 14:34.40 |
alex11 | OK, that makes sense | 14:35.09 |
| thank you all for your help, bye | 14:35.37 |
kens | bye | 14:35.41 |
Robin_Watts | paulgardiner: When I run: win32/Debug/mudraw.exe -o out.png ../MyTests/calc.pdf | 15:15.26 |
| on the forms branch I get lots of: warning: assert: index 0 > length 0 etc | 15:15.43 |
| is that expected? or cured by your latest fix? | 15:16.05 |
chrisl | Why am I suddenly getting a load of pcl/pxl errors in my clusterpush............ when all I've changed is a parameter to libtiff's configure :-( | 15:16.31 |
kens | chrisl I got a similar problem | 15:16.59 |
chrisl | Hmm, I've gone back several commits, and still get the error, too | 15:17.43 |
kens | Well, at least that explains why I'm getting seg faults and errors that *can't* be related to my changes. | 15:18.42 |
paulgardiner | Robin_Watts: not expected. I haven't seen asserts when running the app. | 15:18.52 |
kens | I just did a cluster push of HEAD in confusion | 15:19.03 |
Robin_Watts | chrisl, kens: That is to be expected. | 15:19.15 |
kens | ?? | 15:19.20 |
chrisl | Huh???? | 15:19.37 |
Robin_Watts | marcosw is 600 miles away from the nearest land of any description, for at least a week, so of course the cluster is going to go wrong :) | 15:19.46 |
kens | rofl | 15:19.54 |
chrisl | :-) | 15:20.14 |
| I get errors running the files locally, too, so I guess the cluster is just confused thinking these are "new" errors? | 15:21.26 |
Robin_Watts | paulgardiner: You wouldn't see them while running mupdf, only mudraw. | 15:21.49 |
| And they aren't 'asserts', just warning prints. | 15:22.02 |
| And it's pdf_new_rect at fault. | 15:23.02 |
| You should be calling pdf_array_push, I think. | 15:24.04 |
kens | chrisl I'm seeing seg faults, and they are definitely new | 15:24.51 |
paulgardiner | Robin_Watts: I guess we don't have the windows app sending output to the console | 15:25.13 |
Robin_Watts | kens: Did clusterpushing head work? | 15:25.21 |
| paulgardiner: Indeed not. | 15:25.25 |
kens | Robin_Watts : will let you know when its finished | 15:25.34 |
Robin_Watts | paulgardiner: I am preparing fixes for it here - don't worry about it. | 15:26.41 |
paulgardiner | Robin_Watts: Ah yes. pdf_array_push probably. Thanks. | 15:26.53 |
kens | Huh | 15:31.34 |
| The cluster aborted my run | 15:31.39 |
Robin_Watts | paulgardiner: Fix is on my "forms" branch. | 15:41.31 |
| What is stopping us publishing the forms branch to the main repo on casper again? | 15:41.55 |
| If more than one of us is working on the same branch, I think it makes sense to do that now... | 15:42.12 |
paulgardiner | Probably should be committed to master on the main repo, since pdf_new_rect is there | 15:42.13 |
kens | Hmm cluster seems totally borked | 15:42.19 |
Robin_Watts | kens: Let's leave it for 20 minutes and see if it self heals. | 15:42.46 |
kens | It tried | 15:42.54 |
| Now its aborting again | 15:42.58 |
| But I was going to leave it again | 15:43.08 |
chrisl | kens: can you try some of the erroring/seg faulting files locally? | 15:43.14 |
kens | chrisl I suppose I can yes, but I am not on Linux | 15:43.32 |
Robin_Watts | paulgardiner: Will take care of putting it on master. | 15:43.35 |
paulgardiner | ta | 15:43.49 |
kens | OK it just flushed my cluster test | 15:44.03 |
chrisl | kens: given the extent of the apparent problems, it will be interesting to find what happens on Windows... | 15:44.07 |
kens | Its gone to MuPDF now | 15:44.11 |
| chrisl give me a couple of minutes | 15:44.20 |
| Need to rebuild various binaries | 15:44.28 |
Robin_Watts | paulgardiner: but my question to you and tor8 about publishing forms remains. | 15:44.40 |
kens | chrisl my test was slightly off from trunk, I was missing your idict.h fix | 15:45.40 |
tor8 | Robin_Watts: I'm okay with putting a forms branch on the master repo | 15:46.02 |
| or gold repo, or origin repo, whichever term is appropriate | 15:46.27 |
paulgardiner | Oh. Don't know. Now it's fairly long, I no longer rebase it, so it would be no hardship to be unable to. | 15:46.29 |
chrisl | kens: that shouldn't matter, there should be no functional difference | 15:46.36 |
kens | Yes, I know, just thought I would mention it | 15:46.48 |
| The report I got back said there were no problems :) | 15:47.01 |
| Though reading carefully, it did say it ran 0 tests | 15:47.19 |
| Well MuPDF seems to be behaving | 15:48.56 |
Robin_Watts | I need to update the cluster w.r.t muPDF. Since I made mudraw return a non-zero error code on incomplete renders, the cluster registers lots of mupdf things as failures that aren't really. But that should be independent of the gs problems. | 15:50.00 |
kens | MuPDF is working, GS apparently isn't so it seems like the cluster is OK but the source is not.... | 15:50.38 |
| I wonder if the 'user' sources are insane in some way | 15:51.10 |
| chrisl was yours a user run, as opposed to commit ? | 15:51.23 |
chrisl | Yes | 15:51.34 |
kens | Mine too, but your commit runs seem to be OK | 15:51.46 |
| So I wonder if its something to do with that. | 15:51.53 |
Robin_Watts | Well, the diff at the end of the clusterpush report looks reasonable. | 15:52.12 |
kens | Your one Robin_Watts ? | 15:52.30 |
Robin_Watts | no, yours. | 15:53.01 |
kens | Good grief, 442-01.ps takes a *long* time to render at 300 spi | 15:53.11 |
| Robin_Watts : yes, the diff is tiny | 15:53.26 |
| But as you can see, it ran 0 tests :) | 15:53.34 |
| Because it aborted twice | 15:53.40 |
| chrisl the first 'seg fault' file I tried works fine here | 15:54.05 |
chrisl | Hmm, I wonder if I can force a complete upload of the tree, and try that...... | 15:55.06 |
kens | Ah, but the second one I try does indeed seg fault | 15:55.09 |
chrisl | Which file? | 15:55.37 |
kens | Bug692217.pdf 300 dpi, psdcmyk | 15:55.59 |
| And that was with a debug build too | 15:56.21 |
| I haven't finished building the release binaries yet | 15:56.45 |
henrys | chrisl:have you checked on a local originating machine the "Error_reading_input_file" errors | 15:57.53 |
| ? | 15:57.54 |
chrisl | henrys: some of them, yes | 15:58.09 |
henrys | and they reproduce? | 15:58.37 |
Robin_Watts | tor8, paulgardiner: forms branch is now on golden repo. | 15:58.56 |
chrisl | henrys: Yes | 15:59.10 |
| kens: that file is seg faulting for me on source from the 8th of June..... | 16:01.00 |
kens | chrisl for me Bug692217.pdf seg faults in gx_pattern_size_estimate because pinst is NULL | 16:01.29 |
chrisl | Same here | 16:01.36 |
kens | chrisl so why has the seg fault not shown up before ? Puzzled.... | 16:01.44 |
| Looks like a clist problem that one | 16:01.58 |
chrisl | I'm going to try some git hammering, just to be sure about this..... | 16:02.22 |
henrys | chrisl:have you changed your build since the error report on the dashboard? pushed again? | 16:02.39 |
kens | Well your commit to 'squash a warning' does not show a seg fault with that file.... | 16:02.46 |
chrisl | henrys: no, other people have been using the cluster, so I've been poking at things locally | 16:03.29 |
| kens: that surprises me | 16:03.43 |
kens | chrisl me too, but I checked the summary | 16:03.58 |
| I suppose I should check the detailed log | 16:04.05 |
| Umm the detailed log seems to be empty | 16:04.45 |
chrisl | I need to push another fix for that - I forgot that dict_find_string() takes a ref *....... | 16:05.15 |
henrys | yes something is terribly broken in pcl land. | 16:06.30 |
| chrisl:in the current code not including your local patch. | 16:09.11 |
kens | chrisl I've looked at a couple of the 'passed' logs and they both have huge numbers of test files being removed | 16:09.23 |
| And they run *very* few tests | 16:09.44 |
chrisl | henrys: my local patch doesn't affect pcl | 16:09.48 |
| kens: huh? Files removed? | 16:10.11 |
kens | Look at the 'passed' link for commit 39433fc | 16:10.31 |
chrisl | Yeh, I see it now - wtf does that mean? | 16:10.51 |
kens | The following 64626 regression file(s) have been removed: | 16:10.53 |
| THe next commit has something simila | 16:11.05 |
| similar | 16:11.08 |
| And if you look at the number of tests actually run, there were very few | 16:11.20 |
| ran 486 tests in 3273 seconds on 10 nodes | 16:11.34 |
| No differences in 361 non-pdfwrite/ps2write tests | 16:11.34 |
| No differences in 65 pdfwrite tests | 16:11.34 |
| No differences in 60 ps2write tests | 16:11.34 |
| It 'looks' like your huge header commit broke something, but I have no idea what it would be | 16:12.05 |
| Or why | 16:12.13 |
Robin_Watts | Let me check the logic in the cluster for such "removed" files. | 16:12.31 |
ray_laptop | 3273 seconds on 10 nodes for only 486 files ? Seems slow | 16:13.55 |
kens | I think most of that time is the build | 16:14.09 |
| But even so | 16:14.23 |
chrisl | Well, it would not surprise me to learn that I'd messed something up in changing the headers, but why would a clusterpush not show the problem? | 16:14.25 |
Robin_Watts | Hmm. It looks like there is a skip.lst file that lists files not to test. | 16:14.32 |
ray_laptop | kens: really ? gs builds in < 60 seconds on peeves. I know we build more than just gs, but ... | 16:15.23 |
kens | ray_laptop : seems to take longer than that on the cluster | 16:15.42 |
| But it still seems a long time | 16:15.48 |
Robin_Watts | Is that 327.3 seconds on each of 10 nodes ? | 16:16.06 |
kens | Beats me | 16:16.12 |
Robin_Watts | I can't see any code in the cluster to ever delete skip.lst | 16:16.31 |
chrisl | The long time would probably be because it tries to run all the files, before adding them to the skip.lst | 16:18.08 |
Robin_Watts | I think the skip.lst may be a red herring. | 16:19.24 |
chrisl | Well, wherever it actually holds the information of tests to "remove" | 16:20.08 |
Robin_Watts | but something is causing files to be moved into the filesRemoved perl array. | 16:20.14 |
ray_laptop | hmm... I just did a make clean ; make on peeves with a clean and it gets errors during libtiff | 16:21.32 |
chrisl | ray_laptop: missing lzma? | 16:21.57 |
Robin_Watts | Ah. | 16:24.13 |
| It tested all those files, but didn't find a record of them in the 'previous' tab. | 16:24.33 |
| It tested all those files, but didn't find a record of them in the 'previous' results table, so it can't compare them to see if they passed or failed. | 16:24.56 |
| hence they are "removed" from the results. | 16:25.08 |
chrisl | That's clearly wrong..... | 16:25.44 |
Hohlraum | hey guys i'm breaking apart a pdf into individual pages and the resulting pages added up are 2x the total file size of the original. Ideas? Here is the command for the first page: "gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=page1.pdf allpages.pdf" | 16:25.52 |
Robin_Watts | Hohlraum: Why is that unexpected? | 16:26.10 |
ray_laptop | chrisl: -c ../gs/tiff//libtiff/tif_aux.c | 16:26.18 |
| In file included from ../gs/tiff//libtiff/tiffio.h:33, | 16:26.20 |
| from ../gs/tiff//libtiff/tiffiop.h:75, | 16:26.21 |
| from ../gs/tiff//libtiff/tif_aux.c:32: | 16:26.23 |
| ../gs/tiff//libtiff/tiff.h:68: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'int8' | 16:26.24 |
Robin_Watts | If a font (say) is used on multiple pages then it will only be included once in the complete file, but in each file when split up. | 16:26.57 |
Hohlraum | because the contents of this pdf is basically nothing but scanned images. guessing that maybe they aren't being re-compressed in any way. | 16:27.06 |
chrisl | ray_laptop: new on me - peeves ran the cluster test on the tiff update just fine | 16:27.34 |
ray_laptop | chrisl: I'm trying again after running autogen.sh | 16:28.48 |
chrisl | Ah, that would be an issue, yes | 16:29.10 |
| ray_laptop: if it doesn't work this time, let me know and I'll log into peeves and try it myself | 16:29.40 |
Hohlraum | Robin_Watts: or I guess I should just ask, why would the combined size of the individual pages of a PDF broken out into individual pages be larger than the same pages combined into a single pdf? | 16:30.14 |
kens | Several reasons | 16:30.48 |
| 1) There is 'boiler plate' in each PDF file | 16:30.58 |
| The xref table, object definitions etc. | 16:31.07 |
| 2) Reuse of objects on multiple pages | 16:31.17 |
ray_laptop | Hohlraum: if there are images that are the same on multiple pages, I think we just re-use them. Same thing with fonts (as Robin_Watts mentions) | 16:31.21 |
kens | 3) Differences in decompressing/recompressing | 16:31.28 |
henrys | well about half the xl cet files just crash with the current code. | 16:32.01 |
kens | Without seeing the original PDF file I can't really comment | 16:32.10 |
ray_laptop | Hohlraum: if the original pages were JPX or JPEG and we end up choosing Flate | 16:32.18 |
chrisl | henrys: error out, you mean? | 16:32.25 |
henrys | time to bisect | 16:32.26 |
| chrisl:no I see a crash in the sse2 code. | 16:32.39 |
Hohlraum | Robin_Watts & kens: They are different images on each page. Basically someone is scanning documents as images and saving them as a pdf. | 16:32.40 |
| Robin_Watts: Is there a way to force compression? | 16:32.59 |
kens | Hohlraum it may depend on the compression, and the size of each page | 16:33.04 |
| Hohlraum see docs/ps2pdf.htm | 16:33.17 |
Robin_Watts | Hohlraum: So, it's possible that they are in the original version as (say) JPEGs, and they might be going out as (say) LZW. | 16:33.35 |
| As kens says, see docs/ps2pdf.htm | 16:33.42 |
kens | Or they may be JBIG2 | 16:33.46 |
henrys | chrisl:with pbmraw or any halftoning device. | 16:33.54 |
kens | Since they are scanned pages | 16:33.56 |
| And it depends on the size of each page, as I've said | 16:34.18 |
| Creating a PDF file has an unavoidable size, even with no actual content | 16:34.36 |
henrys | maybe it's my mac | 16:34.41 |
chrisl | henrys: I just tried code from last week, and got similar errors from the pcl/pxl tests | 16:34.54 |
kens | If each page is small (in kbytes) then the overhead becomes significant | 16:35.03 |
Hohlraum | The original PDF is 9 pages of scanned images and is 905k. Broken out the individual pages total 1.7MB. | 16:35.16 |
henrys | chrisl:are you seeing segmentation faults with pbmraw? | 16:35.29 |
kens | I'd have to see the file | 16:35.29 |
Hohlraum | reading the docs now. | 16:35.30 |
ray_laptop | Hohlraum: if you run gs with the -dPDFDEBUG option it will spew out what it is interpreting so you can see what's going on with the original and the result | 16:35.37 |
Hohlraum | I'll give that a try | 16:35.51 |
chrisl | henrys: I wasn't looking for seg faults, I was looking for erroring out. | 16:35.59 |
ray_laptop | chrisl: autogen.sh took care of it. Sorry for the confusion | 16:36.50 |
chrisl | ray_laptop: no worries - I should have sent a mail round reminding everyone, I just forgot | 16:37.13 |
ray_laptop | chrisl: I usually am able to get by without it, so I forgot | 16:37.58 |
kens | chrisl your commit fae7be4 doesn't show these errors, doesn't have lots of files being removed, and runs what seems to be a decent number of tests without error | 16:38.15 |
henrys | chrisl:customer says 692365 is still wrong - I'll forward to support. | 16:38.19 |
chrisl | henrys: can you give me a specific file that seg faults? | 16:38.20 |
kens | will be back later | 16:39.22 |
henrys | ./pcl6 -sDEVICE=pbmraw -o /dev/null ~/tests_private/tests_private/xl/pcl6cet/c102.bin | 16:39.43 |
| Segmentation fault: 11 | 16:39.43 |
| chrisl:I can't imagine your change did that. | 16:41.25 |
chrisl | henrys: unless the script did something nasty, and I didn't notice when I eyeballed the changes..... | 16:42.20 |
| henrys: I don't get a seg fault, I get "Warning interpreter exited with error code -953" | 16:43.14 |
ray_laptop | ahh, the old -953 error ;-) | 16:43.39 |
Hohlraum | Here is a debug output of the job that loops over the pdf pages and dumps each one. http://pastebin.com/fbb72YrV | 16:43.52 |
chrisl | Yeh, can't we get meaningful error messages from pcl? | 16:44.03 |
Hohlraum | I'm guessing /BitsPerComponent 8 /ColorSpace /DeviceRGB /Filter /JPXDecode /Height 1639 /Length 5 0 R is the meaningful line? | 16:44.43 |
henrys | the old -953 trick, I'll have to call support on my shoe phone ;-) | 16:44.45 |
| many of the files are supposed to error. but that doesn't explain your list of "new errors" in the local push log | 16:45.57 |
chrisl | Do you know if c102.bin is supposed to error? | 16:46.35 |
henrys | checking now. | 16:47.22 |
chrisl | Hmm, well it errors with 9.05, so I assume it is supposed to | 16:49.46 |
Robin_Watts | henrys: Oh, how did Sabrina's art market thing go? | 16:49.55 |
| (that was last week, right?) | 16:50.01 |
henrys | the HP doesn't error out no. I'll bisect it, it's not likely you broke this I just wish the cluster wouldn't "trick us" | 16:50.14 |
| Robin_Watts:hot - upper 90's day 1 better day 2 though. She sold quite a bit, not enough to support me yet. | 16:50.52 |
kens | Hohlraum : That means it's using JPEG2000 compression | 16:51.04 |
| We can't use that, encoders need a licence, and it's not free | 16:51.18 |
Robin_Watts | henrys: well, tell her to keep working at it :) | 16:51.48 |
kens | Also, applying JPEG2000 to a JPEG2000 image has the same unpleasant results as applying JPEG to JPEG, so you wouldn't want to do that anyway | 16:51.59 |
| This almost certainly explains your size differences | 16:52.13 |
Hohlraum | kens: OK, any recommendations on things to try to reduce the size? | 16:52.32 |
henrys | the problem with painting is that to make good money you have to be dead. | 16:52.33 |
kens | If all you want to do is make several PDF files I would not recommend using pdfwrite for this task | 16:52.35 |
| Hohlraum : use a different tool, I would suggest pdftk | 16:52.50 |
Robin_Watts | henrys: As the spouse of the artist, you're in a position to capitalise :) | 16:53.07 |
Hohlraum | kens: Alright. Much appreciated. | 16:53.37 |
kens | chrisl you should probably reassign #692365 back to me | 16:54.04 |
| But I can't believe it is urgent if they have not noticed for a year | 16:54.16 |
chrisl | kens: okay, I was going to check the rendering, too, though | 16:55.04 |
kens | Hmm, OK makes sense | 16:55.12 |
henrys | kens, chrisl:we have now reported that to him as fixed 2x; can we have a closer look before the next iteration? ;-) | 16:55.40 |
Hohlraum | kens: while I have your attention. Any idea why ubuntu/debian's ghostscript packages seem to shit themselves when it comes to processing these PDFs? I'm guessing it's because the custom build I'm using now uses some included library and ubuntu/debian are using their own library. Just not sure which one. jasper maybe? | 16:56.12 |
kens | Hohlraum : No idea, and you would need to be more specific about the problem ;-) | 16:56.45 |
Hohlraum | kens: Doing any kind of processing on these specific PDFs takes several minutes. very very slow. | 16:57.17 |
kens | GS needs to use a JPEG2000 decoder, we used to use JasPer, I believe we use openJPEG now (anyone remember for sure ?) | 16:57.27 |
Hohlraum | kens: with my own compile it is very very fast. | 16:57.33 |
kens | Hohlraum : JPEG2000 *is* slow | 16:57.37 |
| Hohlraum : The distros insist on using the system shared libraries | 16:58.02 |
| You are probably using our own version of the libraries | 16:58.17 |
Robin_Watts | kens: Sounds right to me. (jasper -> openJPEG) | 16:58.18 |
kens | really isn't here now | 16:58.34 |
Hohlraum | kens: yes I am. | 16:58.39 |
chrisl | As we can't currently share OpenJPEG, they'll be using Jasper, which almost certainly explains it | 16:59.07 |
| kens: the rendering looks correct to me, so I'll reassign it to you | 16:59.23 |
Hohlraum | kens: pdftk resulted in 9 pages that are nearly the same size total as the original. thanks | 17:00.14 |
henrys | chrisl:you are going to reopen and assign to kens or should I? | 17:00.22 |
chrisl | henrys: I'm doing it now | 17:00.39 |
Robin_Watts | Hohlraum: If you just want to split the files out, you can use mupdfclean. | 17:02.31 |
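For anyone wanting to reproduce Hohlraum's measurement, here is a minimal sketch that builds the per-page Ghostscript commands with the same flags quoted earlier in the log and compares total sizes. The helper names (`split_commands`, `size_ratio`) and the `pageN.pdf` naming are assumptions made for illustration. As kens notes above, pdfwrite re-encodes the content, so pdftk is the better choice for plain splitting.

```python
def split_commands(src, pages, gs="gs"):
    """Build one gs command per page, mirroring the flags quoted in the log."""
    commands = []
    for page in range(1, pages + 1):
        commands.append([
            gs, "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH", "-dSAFER",
            f"-dFirstPage={page}", f"-dLastPage={page}",
            f"-sOutputFile=page{page}.pdf", src,
        ])
    return commands


def size_ratio(original_bytes, page_bytes):
    """Total size of the split pages relative to the original file."""
    return sum(page_bytes) / original_bytes
```

Each command list can be passed to `subprocess.run(cmd, check=True)`, and `os.path.getsize` supplies the byte counts for `size_ratio`.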
henrys | chrisl:like you I don't see the seg fault on linux just mac. | 17:05.00 |
chrisl | henrys: could you quickly try source from before the headers update, just to put my mind at rest? I need to go out shortly...... | 17:06.17 |
Gigs- | openjpeg has improved a lot | 17:07.04 |
| second life had a lot to do with that | 17:07.09 |
| they used kakadu decoding jpeg2000 for texture transport, when they open sourced the second life client people needed to make openjpeg suck less | 17:07.42 |
henrys | chrisl:I did. | 17:07.49 |
chrisl | henrys: and it still seg faults? | 17:08.00 |
Gigs- | kakadu is still the king but it's proprietary | 17:08.01 |
henrys | yes trying 9.05 now. | 17:08.12 |
Gigs- | iirc jasper is pretty much a reference implementation, not really designed for real use | 17:09.00 |
chrisl | Gigs-: even as a reference implementation, it's poor - I'm pretty sure there are parts of the spec it doesn't work correctly on | 17:11.09 |
Gigs- | hmm, well by virtue of it being incorporated into the spec, one could argue that it is the gospel regardless | 17:12.58 |
| though I guess that's just a semantic argument | 17:13.12 |
chrisl | Dratted X crashed on me :-( | 17:32.53 |
henrys | chrisl:works okay on the mac pro - just my laptop has something strange going on. | 17:37.53 |
| which is why the clusters didn't flag it. | 17:38.07 |
chrisl | henrys: Hmm, strange. I'm also seeing weird errors with the XPS interpreter...... | 17:38.49 |
| I have to head out - I'll check later in case anything comes up | 17:42.14 |
Robin_Watts | paulgardiner, tor8: 2 commits on my forms branch. | 18:09.29 |
| The first one adds mujstest - hopefully free from any contentious stuff. | 18:10.02 |
| The second one adds a new flag to mudraw that enables mudraw to output simple mujstest scripts for given files. | 18:10.31 |
| tor8: I can imagine you may dislike the bloating of mudraw, and the addition of the new function to get the page rectangles for all the annotations on a page. | 18:11.05 |
| I am open to better ideas for how to do it. | 18:11.21 |
paulgardiner | Robin_Watts: I thought you needed the generation of page rectangles only temporarily to create the initial set of test files for mujstest. | 18:24.48 |
Robin_Watts | paulgardiner: Yes. | 18:25.02 |
| But I've done that initial generation by adding a new flag to mudraw. | 18:25.26 |
| and to implement it, I need some way to walk the annotations to get the page rectangles out. | 18:25.49 |
paulgardiner | So once there, why not keep it? Right | 18:25.52 |
Robin_Watts | yeah. | 18:25.59 |
| 17 minutes to go on the download, btw. | 18:26.15 |
paulgardiner | I have a few things like that kept on separate branches, but whether they'd still work if rebased is of course the issue with that way... | 18:27.16 |
Robin_Watts | It doesn't seem like a huge bloat to me, and it's always possible that someone will give us more files to test in future. | 18:28.33 |
tor8 | Robin_Watts: we should add more bloat to mudraw anyway -- banded rendering so that problems like d3c is having don't crop up as often | 18:49.31 |
d3c | tor8: I've modified my script that fires mudraw so it will never use a width and height that makes 4wh>(1<<32) like Robin_Watts mentioned. I don't have anymore memory problems now | 18:51.13 |
| tor8: just FYI | 18:51.33 |
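The guard d3c describes can be sketched as below. This assumes the "4" in `4wh` means 4 bytes per pixel; the function name `buffer_fits` is hypothetical and is not part of mudraw or d3c's script.

```python
LIMIT = 1 << 32  # the bound quoted by d3c


def buffer_fits(width, height, bytes_per_pixel=4):
    """Return True if a width x height pixel buffer stays under LIMIT bytes.

    Uses a strict '<', slightly stricter than the quoted '4wh > (1<<32)'
    condition, because a size of exactly 1<<32 does not fit in 32 bits.
    """
    return bytes_per_pixel * width * height < LIMIT
```

A wrapper script could call this before launching mudraw and lower the resolution when it returns False.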
| Robin_Watts: http://bugs.ghostscript.com/show_bug.cgi?id=693118 :) | 19:03.55 |
Robin_Watts | ooh. | 19:07.12 |
| Oh, ffs. | 19:23.25 |
| I just deleted the file I spent 6 hours downloading from paul. | 19:23.55 |
| all cos of svn. | 19:24.01 |
| off we go again then... | 19:24.44 |
kens | Ooops :( | 19:39.01 |
henrys | oops that should have been CLUSTER UNTESTED | 20:26.50 |
| Forward 1 day (to 2012/06/14)>>> | |