| 2011/10/12 |
ray_laptop | this channel seems to be picking up some new people (at least nicknames I don't recognize). | 01:47.24 |
LaoLang_cool | ray_laptop: yes, new mupdf fan | 01:58.48 |
arthurf | ray_laptop: Yes, it's an interesting channel - watching rip and pdl developers chat about all sorts of interesting details / issues. | 03:03.03 |
AlecTaylor | is here to lurk | 04:06.50 |
henrys | back from Chicago! | 04:13.41 |
mvrhel2 | how was it | 04:19.32 |
| did the weather cool down? | 04:19.42 |
ray_laptop | mvrhel2: I was looking into 692158, from the "why is this _so_ slow when the pattern clist is used?" aspect | 04:25.03 |
mvrhel2 | let me look at the bug | 04:25.53 |
ray_laptop | the bug is discussing the VMerror when MaxPatternBitmap is really big -- that's expected and all we can do is run in 64-bit mode | 04:26.49 |
mvrhel2 | did you run any profiling on it? | 04:27.16 |
ray_laptop | but when we use pattern clist it is TOTALLY sucky slow | 04:27.22 |
mvrhel2 | I can check into it with the windows profiler if you would like | 04:28.11 |
ray_laptop | mvrhel2: try at a lower res, maybe. | 04:28.56 |
mvrhel2 | hmm. AR seems to be a dog at this one too | 04:29.22 |
| its drawing in slow-mo | 04:29.43 |
| the river is slowly filling in | 04:29.53 |
ray_laptop | it is doing LOTS of clist playbacks due to tile_by_steps being used to fill an image mask. The thing is that it's doing it during the parsing. | 04:30.09 |
mvrhel2 | have you watched AR draw it at low resolution? | 04:31.18 |
ray_laptop | for each 'mask' op (a 16x16 1-bit mask) it (1) accumulates the mask as a clip path, (2) fills the area covered by the mask with the pattern color using the clip path that was accumulated. | 04:32.08 |
| it seems like it _should_ be putting the clip path into the clist, then doing the 'fill', but it's not. It ends up doing the fill which runs 'tile_by_steps' over the mask area so that parts that are visible through the mask get written to the main page clist. | 04:34.03 |
| it is _really_ horrendous. For instance, for each pixel of the mask, it searches the rectangle list for the clip path to find out what part (if any) is visible. | 04:35.42 |
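One alternative to a per-pixel lookup is to walk each row of the 1-bit mask once and hand out whole runs of set bits as rectangles. The following is only a sketch of that idea, not Ghostscript code: `mask_to_runs`, `fill_run_fn` and `sum_area` are hypothetical names, and the mask is assumed to be packed MSB-first with `stride` bytes per row.

```c
#include <assert.h>

/* Hypothetical callback: fill the one-pixel-high rectangle [x0, x1) on row y. */
typedef void (*fill_run_fn)(void *ctx, int x0, int x1, int y);

/* Walk a 1-bit mask (MSB first, `stride` bytes per row) and report each
 * horizontal run of set bits exactly once. Returns the number of runs. */
int mask_to_runs(const unsigned char *mask, int w, int h, int stride,
                 fill_run_fn fill, void *ctx)
{
    int x, y, runs = 0;
    assert(mask != 0 && fill != 0 && stride * 8 >= w);
    for (y = 0; y < h; y++) {
        const unsigned char *row = mask + (long)y * stride;
        int start = -1;                     /* -1 means "not inside a run" */
        for (x = 0; x <= w; x++) {
            /* x == w acts as a sentinel that closes a run at the row end. */
            int bit = (x < w) && ((row[x >> 3] >> (7 - (x & 7))) & 1);
            if (bit && start < 0) {
                start = x;
            } else if (!bit && start >= 0) {
                fill(ctx, start, x, y);
                runs++;
                start = -1;
            }
        }
    }
    return runs;
}

/* Example callback: add up the number of pixels covered by the runs. */
void sum_area(void *ctx, int x0, int x1, int y)
{
    (void)y;
    *(int *)ctx += x1 - x0;
}
```

Called with a 16x16 mask, this issues at most a handful of fills per row, and the cost no longer depends on searching a rectangle list for every pixel.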
mvrhel2 | ray_laptop: I may need to chat with you about this one tomorrow. I am starting to fade from too little sleep last night | 04:36.04 |
| ick | 04:36.28 |
henrys | mvrhel2:very hot had a slow race, but had fun nonetheless. | 04:36.30 |
ray_laptop | why we don't just paint the rectangle list is one question -- don't bother installing a clip device | 04:36.35 |
| mvrhel2: OK. I'm pretty tired too, so tomorrow is better. | 04:37.04 |
mvrhel2 | ok. talk to you then. I am calling it a night then. | 04:37.31 |
| henrys: was the race in the downtown area? | 04:38.06 |
| along the lake at all? | 04:38.22 |
ray_laptop | henrys: sorry to hear it was hot. Not good for running -- also unusual for this late in the year. | 04:38.24 |
henrys | no it started and ended near grant park - you never really get close to the lake. | 04:39.19 |
| http://www.chicagomarathon.com/CMS400Min/uploadedFiles/Chicago_Marathon/Runner_Information/11%20Course%20Map%2009-26-11.pdf | 04:40.59 |
| The course record was broken those guys finish before it gets hot! | 04:42.29 |
| Unfortunately somebody died in the last 500 meters. | 04:44.26 |
mvrhel2 | oh no | 04:44.48 |
henrys | yeah really weird 35 years old, veteran marathoner, inconclusive autopsy... | 04:48.42 |
| off to unpack good night see you tomorrow. | 04:52.57 |
Robin_Watts | kammerer: So fz_scale_pixmap never fails to take less than 90% of the CPU time on your android device ? | 09:42.14 |
kammerer | i test it on several books, they contain scanned images, each page - one big image | 09:44.34 |
Robin_Watts | Well, this points to 2 possible routes. | 09:45.08 |
| 1) Make fz_scale_pixmap faster. | 09:45.24 |
| 2) Call it less. | 09:45.26 |
| 1) is easy enough; I just need to ARM code the cores. | 09:45.37 |
kammerer | in my case it was called one time | 09:45.52 |
Robin_Watts | 2) is harder, but ultimately will probably give better results. | 09:45.52 |
| If I can introduce a cache of scaled images (so that subsequent repaints of a screen don't rescale the picture) we'd be better off on panning etc. | 09:46.40 |
| Are the scanned images black and white or color ? | 09:47.23 |
kammerer | gray | 09:47.37 |
| black and white | 09:47.44 |
Robin_Watts | I'll stick it on my list of things to look at, thanks. That's very helpful feedback. | 09:48.21 |
| kammerer: Would you object to me sharing the link with your timing results here? | 09:48.40 |
kammerer | yes | 09:49.01 |
| and one more - i don't know if it is an arm issue or maybe the slow sd card of my nook - but precaching my scanned page in list_device also takes several seconds | 09:51.01 |
| link to test results: https://docs.google.com/spreadsheet/ccc?key=0AiZOsHXQgyTWdDNpbFZXcXRfZkJmeURZemJnYkxYenc&hl=en_US | 10:08.29 |
Robin_Watts | kammerer: Thanks. That way others can see it, and it's in the log so I won't lose it. | 10:14.38 |
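A minimal sketch of the "call it less" route: a tiny cache of scaled images keyed by source image and target size, so a repaint at the same zoom can reuse the earlier result. Everything here is hypothetical (`scaled_cache`, `cache_lookup`, `cache_store`, the slot count, least-recently-used eviction); it is not MuPDF's API and says nothing about how MuPDF would actually implement this.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4            /* hypothetical size, chosen for illustration */

typedef struct {
    int image_id;                /* 0 marks an empty slot */
    int w, h;                    /* scaled dimensions this entry was built for */
    unsigned long stamp;         /* last-use tick, for LRU eviction */
    void *pixels;                /* scaled pixels; ownership stays with the caller */
} scaled_entry;

typedef struct {
    scaled_entry slot[CACHE_SLOTS];
    unsigned long tick;
} scaled_cache;

void cache_init(scaled_cache *c)
{
    int i;
    assert(c != NULL);
    c->tick = 0;
    for (i = 0; i < CACHE_SLOTS; i++) {
        c->slot[i].image_id = 0;
        c->slot[i].w = c->slot[i].h = 0;
        c->slot[i].stamp = 0;
        c->slot[i].pixels = NULL;
    }
}

/* Return the cached pixels for (image, w, h), or NULL on a miss. */
void *cache_lookup(scaled_cache *c, int image_id, int w, int h)
{
    int i;
    assert(c != NULL && image_id != 0);
    for (i = 0; i < CACHE_SLOTS; i++) {
        scaled_entry *e = &c->slot[i];
        if (e->image_id == image_id && e->w == w && e->h == h) {
            e->stamp = ++c->tick;           /* mark as recently used */
            return e->pixels;
        }
    }
    return NULL;
}

/* Store a scaled result. Returns the pixels that were evicted to make
 * room (so the caller can free them), or NULL if a slot was free. */
void *cache_store(scaled_cache *c, int image_id, int w, int h, void *pixels)
{
    scaled_entry *victim;
    void *evicted;
    int i;
    assert(c != NULL && image_id != 0 && pixels != NULL);
    victim = &c->slot[0];
    for (i = 0; i < CACHE_SLOTS; i++) {
        scaled_entry *e = &c->slot[i];
        if (e->image_id == 0) { victim = e; break; }   /* free slot wins */
        if (e->stamp < victim->stamp) victim = e;      /* else oldest entry */
    }
    evicted = victim->pixels;               /* NULL when the slot was empty */
    victim->image_id = image_id;
    victim->w = w;
    victim->h = h;
    victim->pixels = pixels;
    victim->stamp = ++c->tick;
    return evicted;
}
```

A caller would try `cache_lookup` first and only run the scaler on a miss, then pass the fresh result to `cache_store`. Panning at a fixed zoom keeps hitting the same key, which is the case the cache is meant to help.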
AlecTaylor | "Hmm. MuPDF, bless their hearts, is a cool bit of tech, but MUCH less sophisticated than Poppler. If I found the right project, pdfdraw is no exception -- a very small piece of code that doesn't do any structure analysis; it looks like it just spits out whatever blobs are natively in the PDF. If you find that I'm wrong about that, please let me know." - Josh Richardson (poppler newsgroup) | 10:58.45 |
| Robin_Watts: Is his assumption correct? | 11:01.30 |
tor8 | AlecTayler: we don't do much analysis, that is correct. you may want to look at the "text" branch of mupdf, which changes the text api:s and does a bit more thorough sorting and grouping of text than the master | 11:23.58 |
| I haven't got around to merging yet, because there are still some performance problems with it | 11:24.12 |
| AlecTaylor^ | 11:24.20 |
| still no analysis of what is what though, just better sorting and lines grouped into paragraphs | 11:24.58 |
AlecTaylor | pos tagging? | 11:25.28 |
tor8 | the text structs are annotated with font and size information, and has bounding boxes, just like now | 11:26.20 |
AlecTaylor | Would it be as useful as what the pdf2html project uses in terms of bounding boxes? | 11:26.50 |
tor8 | I'm not very familiar with pdftohtml output | 11:27.19 |
AlecTaylor | would like to draw his own rectangle around potential headers and footers, check them against succeeding pages until arrive at actual header & footer by recognising the pattern | 11:27.33 |
tor8 | well, the "text" branch has grouped the text into a hierarchy of blocks, lines, spans, chars and has bounding boxes at each level | 11:30.24 |
AlecTaylor | Where do I get it from; and how can I link it to pdfdraw? | 11:31.55 |
tor8 | run "git checkout text" in your git repo | 11:33.37 |
AlecTaylor | C:\libraries\MuPDF>git checkout text | 11:36.44 |
| Branch text set up to track remote branch text from origin. | 11:36.44 |
| Switched to a new branch 'text' | 11:36.44 |
| Have you updated the pdfdraw tool for it? | 11:36.59 |
tor8 | yes. | 11:37.32 |
AlecTaylor | Great, I'll give it a go | 11:38.27 |
tor8 | try the various -t -tt -ttt flags to pdfdraw on the text branch | 12:02.30 |
| one of them will give you styled html output | 12:02.35 |
Robin_Watts | tor8: Did you see kammerers link above? | 12:12.48 |
tor8 | the pixmap scaling timing? | 12:13.36 |
Robin_Watts | yeah. | 12:13.41 |
tor8 | yes. I wonder if there's a smarter way to deal with 1bpp b&w images. | 12:18.03 |
Robin_Watts | I suspect those are greyscale, not 1bpp. | 12:18.55 |
kens | Ah, I see all the cluster nodes are up, good I shouldn't have to wait too long for Chris's run to complete | 13:34.02 |
AlecTaylor | :] | 13:34.25 |
chrisl | kens: that's my revenge for you getting in before me earlier this morning..... ;-) | 13:34.44 |
AlecTaylor | Wow, the repo is really fast today | 13:36.54 |
| Took me 20 minutes yesterday, 2 minutes today (git clone) | 13:37.05 |
kens | chrisl given the number of tests I'm running I suspect I'm getting in everyone's way | 13:37.24 |
chrisl | kens: yep, but I'm really only worried about me ;-) | 13:38.14 |
ray_laptop | yuck. Another email flood -- this time from tor8 :-( | 14:02.21 |
AlecTaylor | yay | 14:05.34 |
| *dances* | 14:05.38 |
kens | Excellent, Mupdf's pdfclean fixes a file which Acrobat can't :-) | 14:06.46 |
AlecTaylor | :D | 14:07.01 |
| What happened to the file? | 14:07.09 |
kens | I edited it :-) | 14:07.19 |
| To be honest, I removed 3.5 Mb of content | 14:07.32 |
AlecTaylor | hahah | 14:07.41 |
kens | from the page stream | 14:07.42 |
AlecTaylor | bich | 14:07.43 |
| Don't be mean to PDF files! - They're very friendly once you get to know them | 14:08.10 |
kens | I've got a problem with pdfwrite and the linewidth of stroked text with this file, every time I try and reduce the PDF with Acrobat it 'fixes' the problem. | 14:08.30 |
| So I decompressed it, removed most of the extraneous stuff and then used MuPDF to fix it | 14:08.53 |
LaoLang_cool | Is there mailing list for mupdf? | 14:09.02 |
kens | Still goes wrong and with luck I can reduce it further so I can debug my problem | 14:09.11 |
AlecTaylor | LaoLang_cool: Just use the ps one | 14:10.00 |
| or the pdf one | 14:10.12 |
LaoLang_cool | AlecTaylor: got it, maybe irc is better... | 14:11.10 |
Robin_Watts | LaoLang_cool: This is probably the best place to ask questions about mupdf, yes. | 14:11.31 |
LaoLang_cool | Robin_Watts: yes, many gurus about here! Hello~ | 14:11.56 |
| can mupdf add a printing function on windows? | 14:14.17 |
Robin_Watts | SumatraPDF is built on top of mupdf, and that does printing. | 14:14.35 |
LaoLang_cool | Robin_Watts: but still like mupdf more, and sumatra lacks some feature, for example rotating the page. | 14:15.39 |
| anyway, I have both on my system :) | 14:15.48 |
Robin_Watts | My point is that printing can be done with what mupdf provides already. | 14:16.00 |
LaoLang_cool | Robin_Watts: I don't understand, pdfdraw to picture then print? | 14:16.40 |
Robin_Watts | That's one way. | 14:16.53 |
| Or you can hook the mupdf device interface. | 14:17.04 |
LaoLang_cool | Robin_Watts: can you explain in more detail? Don't know what is ``mupdf device interface''.. | 14:17.39 |
Robin_Watts | LaoLang_cool: I'm buried in stuff at the moment, and I'm not at all familiar with how Sumatra does it. It's open source, so your best bet is to look. | 14:18.18 |
LaoLang_cool | Robin_Watts: oh, thank you, I will! | 14:18.36 |
| take your time :) | 14:18.57 |
Robin_Watts | Welcome back henrys. How was it? (Besides hot) | 14:20.25 |
henrys | It's a great race one of the best I've done. | 14:21.08 |
| had a really good time in chicago too. | 14:21.25 |
AlecTaylor | MuPDF Device Interface... hmm, sounds more like a job for the boost libraries interfacing with CUPS and Windows stuff as well... agreed? | 14:21.49 |
Robin_Watts | henrys: A good race time? | 14:21.53 |
henrys | no the heat got to me around mile 20 - I ran with the 3:50 pace group up until then but faded back to the 4:00 group for the finish. | 14:22.40 |
Robin_Watts | AlecTaylor: I'd need to understand what 'boost libraries' were before I could comment on that. | 14:23.03 |
AlecTaylor | fooo | 14:23.32 |
Robin_Watts | henrys: Well, that sounds like a good time to me :) | 14:23.32 |
AlecTaylor | MuPDF is C | 14:23.36 |
| I keep forgetting | 14:23.39 |
| :\ | 14:23.40 |
Robin_Watts | The mupdf device interface lets you provide your own implementation for each type of object. | 14:24.22 |
| so you can reroute the rendering into GDI calls etc if that's what you want to do. | 14:24.42 |
AlecTaylor | k | 14:24.49 |
Robin_Watts | Or you can do text extraction etc. | 14:24.56 |
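The shape of such an interface can be sketched as a struct of callbacks, one per object type, where a device fills in only the hooks it cares about. This is a simplified illustration, not MuPDF's actual device struct: `toy_device`, `text_sink`, `make_text_device` and `run_page` are hypothetical names, and the callback signatures are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device: one callback per object type. */
typedef struct toy_device {
    void *user;                                      /* device-private state */
    void (*fill_path)(void *user, int npoints);
    void (*fill_text)(void *user, const char *utf8);
    void (*fill_image)(void *user, int w, int h);
} toy_device;

/* State for a device that only collects text. */
typedef struct {
    char buf[64];
    size_t len;
} text_sink;

static void sink_text(void *user, const char *utf8)
{
    text_sink *s = user;
    /* Append as much as fits, always leaving room for the terminator. */
    while (*utf8 && s->len + 1 < sizeof s->buf)
        s->buf[s->len++] = *utf8++;
    s->buf[s->len] = '\0';
}

/* Build a text-extraction device: paths and images are simply ignored. */
toy_device make_text_device(text_sink *s)
{
    toy_device dev = { 0 };
    assert(s != NULL);
    s->len = 0;
    s->buf[0] = '\0';
    dev.user = s;
    dev.fill_text = sink_text;      /* other hooks stay NULL */
    return dev;
}

/* Stand-in for the interpreter: call whichever hooks the device provides. */
void run_page(const toy_device *dev)
{
    assert(dev != NULL);
    if (dev->fill_path)  dev->fill_path(dev->user, 4);
    if (dev->fill_text)  dev->fill_text(dev->user, "Hello");
    if (dev->fill_image) dev->fill_image(dev->user, 16, 16);
    if (dev->fill_text)  dev->fill_text(dev->user, " world");
}
```

A printing device would follow the same pattern but point the hooks at functions that issue GDI (or other platform) drawing calls instead of appending to a buffer.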
kens | just create a printer Device Context, attach the MuPDF bitmap to it and tell it to print | 14:25.10 |
AlecTaylor | kk | 14:25.22 |
| Hmm, quick unrelated question; which XML library does MuPDF use? | 14:25.45 |
henrys | Robin_Watts:I saw you had a problem reducing a gl/2 problem if that is still needed to be done send it to me. | 14:25.59 |
Robin_Watts | I have at least 2 classes of problems left with the plank vs pamcmyk4. | 14:26.24 |
| One of those cases *may* be a bug in the existing code. | 14:26.40 |
henrys | hmm I read all my mail last night and now I have 106 messages. | 14:26.44 |
| hard to believe ;-^ | 14:26.56 |
kens | 81 from Tor at lunch time | 14:27.19 |
Robin_Watts | I suspect we have a problem with 'small' rectangles getting the phase of halftone tiles wrong. | 14:27.40 |
| But the other problem is much bigger, and (hopefully) more obvious, so I'm looking at that first. | 14:28.22 |
tor3 | kens: odd, I only expected 6 or so emails from that push... | 15:13.51 |
kens | way more than that | 15:14.27 |
tor3 | which ones were sent? 0.9.3-0.9.9 for mupdf are the ones I expected. | 15:15.25 |
| my "version" folder is full of unread emails so I can't really tell which are new | 15:15.48 |
| AlecTaylor: MuPDF does not use an XML library | 15:17.04 |
kens | 0.8.167->0.9.230 then 0.9.1 -> 0.9.20 or thereabouts | 15:17.44 |
tor3 | LaoLang_cool: sumatrapdf have added a GDI+ device to MuPDF that they use for printing. | 15:18.06 |
Robin_Watts | tor3: All the 'text' branch stuff is newly released to the world, right? | 15:19.00 |
tor3 | http://code.google.com/p/sumatrapdf/source/browse/trunk/mupdf/fitz/dev_gdiplus.cpp | 15:19.10 |
| Robin_Watts: yeah. it's been sitting in my local repo on casper for a while. | 15:19.49 |
| but not really advertised | 15:19.56 |
Robin_Watts | That's what caused the flood. | 15:19.58 |
tor3 | kens: oh. that's a big flood! | 15:20.26 |
kens | It certainly was. | 15:20.42 |
tor3 | wonder why it did that | 15:21.20 |
AlecTaylor | tor3: Why doesn't MuPDF use an XML library? | 15:27.22 |
Robin_Watts | AlecTaylor: Sledgehammer. Nut. | 15:27.45 |
AlecTaylor | Mini-XML? | 15:28.01 |
| oh | 15:28.05 |
| xD | 15:28.05 |
| Hate those analogies | 15:28.09 |
| Line 74 in pdfdraw.c should be bool not int | 15:40.03 |
tor3 | bool is C++ | 15:44.00 |
AlecTaylor | um | 15:53.00 |
| seriously?! | 15:53.05 |
| xD | 15:53.05 |
kens | yes, C does not have a bool | 15:53.14 |
AlecTaylor | #include <stdbool.h> | 15:53.39 |
| :P | 15:53.41 |
AlecTaylor | broke SumatraPDF | 15:53.53 |
ab5tract | hi again all | 15:54.02 |
kens | You can call a header anything you like, and you can typedef a data type, but its not part of the language | 15:54.05 |
ab5tract | back with a strange problem | 15:54.28 |
| has anyone experienced ps2pdf mangling clipping paths? | 15:54.43 |
Robin_Watts | But if you can time travel back to before Ansi C appeared, and make the lib a standard one, we might accept it. | 15:54.51 |
kens | ab5tract : I assume you mean Ghostscript, and the answer is no, or we would have fixed it. | 15:55.20 |
ab5tract | from a latex document including a ps file, when i send the postscript file (generated from dvi) directly to printer, it works perfectly | 15:55.22 |
AlecTaylor | kens: It's part of C99 | 15:55.26 |
| in <stdbool.h> | 15:55.30 |
kens | Well, that's new then. | 15:55.34 |
AlecTaylor | aye | 15:55.37 |
Robin_Watts | AlecTaylor: As I said... time travel back to Ansi C... | 15:55.42 |
AlecTaylor | Is MuPDF using C99? | 15:55.43 |
ab5tract | kens, ps2pdf wraps ghostscript, correct? | 15:55.48 |
kens | I think Robin_Watts pointed out we really use C89 | 15:55.48 |
| ab5tract : its a shell script that executes Ghostscript, I wouldn't use it | 15:56.07 |
AlecTaylor | When was MuPDF first written? | 15:56.08 |
tor3 | no, MSVC doesn't support c99 and it looks like it never will | 15:56.11 |
| and we need to support MSVC | 15:56.23 |
AlecTaylor | wahh | 15:56.26 |
| MSVC sucks then, doesn't it :P | 15:56.33 |
kens | Which is the reason for old versions of C :-) | 15:56.34 |
Robin_Watts | AlecTaylor: It's not a question of when it was first written. It's a question of keeping portability. | 15:56.44 |
kens | Its not the only one, embedded C implementations are usually well behind the times | 15:56.48 |
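For code that has to build with both C89-only and C99 compilers, a common pattern is to pick the boolean type at compile time. A minimal sketch, assuming a hypothetical `toy_bool` typedef and helper names (this is not what MuPDF does; per the discussion above it simply uses int):

```c
#include <assert.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdbool.h>            /* C99 and later: bool, true, false */
typedef bool toy_bool;
#else
typedef int toy_bool;           /* C89 fallback: plain int, 0 or 1 by convention */
#endif

#define TOY_TRUE  1
#define TOY_FALSE 0

/* Normalise any int to exactly 0 or 1 so both branches behave alike.
 * (A C99 bool does this on conversion; a plain int does not.) */
toy_bool toy_flag(int value)
{
    return value != 0 ? TOY_TRUE : TOY_FALSE;
}

/* Count the non-zero entries of an array using the flag type. */
int toy_count_set(const int *values, int n)
{
    int i, count = 0;
    assert(values != 0 && n >= 0);
    for (i = 0; i < n; i++)
        if (toy_flag(values[i]))
            count++;
    return count;
}
```

The explicit normalisation in `toy_flag` matters only for the fallback branch, where a value such as 42 would otherwise be stored as-is and compare unequal to `TOY_TRUE`.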
ab5tract | kens, its a standard tool in generating latex pdfs. and it uses ghostscript. so if it mangles the clipping paths, thats a problem in ghostscript, is it not? | 15:57.18 |
kens | ab5tract : I'm just saying I wouldn't use it. | 15:57.38 |
| I'd invoke Ghostscript directly. | 15:57.51 |
| And its more likely a problem in the pdfwrite device than in Ghostscript | 15:58.11 |
ab5tract | fair enough. i can write a custom shell script for the client | 15:58.12 |
| interesting.. | 15:58.30 |
sebras | AlecTaylor: My oldest repo says it was started in 2002, but I know that tor[38] started it earlier than that... | 15:58.41 |
Robin_Watts | ab5tract: There are many many tools out there that use ghostscript. If you want us to spend time investigating a problem then we ask you (and it's only polite) to report it to us in a form that uses ghostscript itself, rather than any wrappers. It saves us a lot of time, and avoids us hunting for bugs that lie in other peoples wrappers. | 15:59.36 |
ab5tract | Robin_Watts, no problem | 15:59.51 |
| ill check no | 15:59.59 |
| now | 16:00.00 |
kens | ps2pdf might be *our* shell script though :-) We do have one by that name | 16:00.03 |
Robin_Watts | kens: Maybe, but there are others out there too. | 16:00.20 |
kens | I still wouldn't use it though | 16:00.28 |
| Oops look at the time, oh my paws and whiskers! | 16:01.28 |
| Night all. | 16:01.31 |
Robin_Watts | Night kens. | 16:01.43 |
| mvrhel2: Do you have that effect on many people? :) | 16:01.55 |
mvrhel2 | you mean making people leave? | 16:03.15 |
Robin_Watts | Turning people into Lewis Carroll characters :) | 16:04.08 |
mvrhel2 | ah | 16:04.19 |
Robin_Watts | Did you see marcosw_'s mail? Does seem a bit too good to be true. | 16:05.24 |
mvrhel2 | yes. that seems a bit odd. especially since we did some optimizations that I *think* had minor differences | 16:05.56 |
ab5tract | Robin_Watts, kens: the ps2pdf on my system is from ghostscript | 16:07.28 |
| and directly invoking gs results in the same problem | 16:07.40 |
henrys | Robin_Watts, mvrhel2:how's the planar stuff coming? my pcl changes have gone much deeper than expected: 62 files changed, 4476 insertions(+), 1008 deletions(-) I'm going to say at least another 2 weeks before I have pcl ready to be reintegrated and I'd like that new code to be in before the "planar product" | 16:07.45 |
ab5tract | so it is likely a problem with the pdfwrite device, as you said | 16:07.55 |
Robin_Watts | henrys: So I should flounder around for at least 2 more weeks? Easy... :( | 16:08.11 |
| I fixed a problem in there today, but that leaves me with at least 2 more, I think. | 16:08.57 |
ab5tract | Robin_Watts, the weirdest thing is that other programs which use gs to read postscript (scribus, libreoffice) do not exhibit the same behavior (ie it works as expected) | 16:09.41 |
mvrhel2 | henrys: I have been working on screen creation but I will do a spot check to see how the fast color halftoning is behaving with the planar device | 16:09.49 |
| there is one issue with small images that I need to track down | 16:10.02 |
| but for the most part it seemed to be working when I did some earlier testing | 16:10.24 |
AlecTaylor | I just ran pdfdraw from the "text" branch with the -ttt switch on http://ia600307.us.archive.org/8/items/lawofthehayes00ewinrich/lawofthehayes00ewinrich.pdf. There's nothing useful in the output! - How do I get useful output? | 16:10.49 |
henrys | is there anything left to do with pdf/ps for planar or are we down to pcl rop and trans problems? | 16:12.01 |
Robin_Watts | no, and yes, respectively, I think. | 16:12.22 |
| certainly all the problems I've had recently are rop based. | 16:12.45 |
| (and specifically are to do with trying to shoehorn planar textures through the clist (from patterns) for strip_copy_rop. | 16:13.18 |
| ) | 16:13.25 |
henrys | well that's really great can we show off some fast numbers for ps/pdf in planar mvrhel2 - is planar halftoning faster? | 16:13.27 |
mvrhel2 | henrys: I did not do any timing for pdf/ps. I will take a look at that today | 16:14.42 |
Robin_Watts | marcosw has done various performance testing runs for us, but all but the original ones have had the fast thresholding disabled. | 16:14.51 |
| (at my request - wanted to fix the hotspots without masking them by michaels new code) | 16:15.10 |
henrys | no hurry I think we can safely say this project will be late anyway ;-) | 16:16.31 |
chrisl | mvrhel2: thanks for looking at the patch for 692550 - as I said in the bug, didn't expect you to get to it for a while! (I just pushed the patch) | 16:17.15 |
mvrhel2 | chrisl: ok. that is good. thanks for fixing that | 16:18.52 |
tor3 | AlecTaylor: looking into it | 16:19.05 |
chrisl | mvrhel2: NP. As I was the only one who could reproduce it, I felt I should volunteer.... | 16:20.08 |
tor3 | AlecTaylor: what output do you see, and why isn't it useful? | 16:20.19 |
ray_laptop | henrys: do you want to respond to Horiana ? I was going to, but didn't want to duplicate the response. Ken might be the best to respond, but he's gone for the day (AFAK) | 16:23.38 |
| s/AFAK/AFAIK/ | 16:23.48 |
henrys | why isn't marcos responding to this stuff? He said he wanted to do it. | 16:24.07 |
AlecTaylor | tor3: This is a page: http://pastebin.com/SW1LBbsJ | 16:24.47 |
ab5tract | Robin_Watts, internet hiccup, not sure if you said anything to me since my last visible transmit | 16:24.49 |
ray_laptop | alexcher: will you have a chance to look over my pdf_main.ps patch for PDFFitPage this AM ? Phil had called me, so I think it is pretty urgent | 16:24.53 |
ab5tract | "gs -sDEVICE=pdfwrite -sOutputFile=myfile.pdf -dBATCH -dNOPAUSE myfile.ps" is a sane invocation for pdf generation, correct? | 16:25.01 |
| the output of that exhibits the same problem | 16:25.19 |
Robin_Watts | ab5tract: I hadn't seen that before :) | 16:25.20 |
ray_laptop | henrys: I think Marcos wanted to be the interface for problem reports, not necessarily info requests | 16:25.24 |
AlecTaylor | I just ran pdfdraw from the "text" branch with the -ttt switch on http://ia600307.us.archive.org/8/items/lawofthehayes00ewinrich/lawofthehayes00ewinrich.pdf. http://pastebin.com/SW1LBbsJ <-- one page from it, how is that useful? | 16:25.25 |
ab5tract | ok | 16:25.32 |
Robin_Watts | But yes, that looks sane. | 16:25.37 |
ab5tract | so that means it is a problem either in ghostscript or pdfwrite device | 16:25.54 |
henrys | ray_laptop:okay I'll clarify that with him and in the mean time respond to horiana | 16:26.03 |
ab5tract | where would i file a bug report? | 16:26.04 |
Robin_Watts | ab5tract: Can you create a bug at bugs.ghostscript.com, add that invocation and the input file, and make sure it's pointed at pdfwrite please? | 16:26.29 |
| kens is the lucky chap that has to look at that, but he's gone for the night. | 16:27.21 |
tor3 | AlecTaylor: yeah. for some reason it believes the font size is 0 and that's why the characters aren't assembled into lines | 16:27.35 |
| since it believes they are infinitely far apart, the algorithm works by scaling with the font size | 16:27.54 |
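Why a zero font size breaks line assembly can be seen in a small sketch of a size-relative gap test. The threshold, the names `same_line` and `same_line_guarded`, and the fallback value are assumptions made for illustration; this is not the actual code on the text branch, where the real fix was elsewhere.

```c
#include <assert.h>

/* Hypothetical threshold: characters whose gap is below this many
 * font-size units are treated as belonging to the same line. */
#define MAX_GAP_EM 1.0f

/* Hypothetical fallback used when the reported size is unusable. */
#define DEFAULT_FONT_SIZE 1.0f

/* Size-relative test. With font_size == 0 the allowed gap is 0, so no
 * pair of characters ever qualifies: they all look "infinitely far apart". */
int same_line(float gap, float font_size)
{
    assert(gap >= 0);
    return gap < MAX_GAP_EM * font_size;
}

/* Guarded variant: substitute a default for a non-positive size so a bad
 * size degrades the grouping instead of disabling it entirely. */
int same_line_guarded(float gap, float font_size)
{
    if (!(font_size > 0))
        font_size = DEFAULT_FONT_SIZE;
    return same_line(gap, font_size);
}
```

With a 12 unit font, a gap of 3 passes the test; with a size of 0 the same gap fails in `same_line` and every character ends up in a line of its own, which matches the symptom in the pastebin output.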
ray_laptop | henrys: I didn't parse your statement: was it "I'll ... AND respond." or "I'll ... and (you should) respond" ? | 16:28.21 |
henrys | I'll clarify with marcosw what he wants his role to be but I'll respond to Horiana first. I'm writing the response to her now. | 16:29.29 |
AlecTaylor | tor3: Hmm... so what do I do; can I turn scaling off? | 16:29.42 |
Robin_Watts | AlecTaylor: It's a bug in mupdf. tor is looking. Give him time... | 16:32.55 |
AlecTaylor | kk | 16:34.44 |
tor3 | AlecTaylor: you can pull a fix from git | 16:36.05 |
AlecTaylor | Great, which branch? | 16:36.19 |
| ahh | 16:37.09 |
| got it | 16:37.10 |
tor3 | the text branch, obviously...? | 16:37.14 |
AlecTaylor | So you made float size 1.0f :] | 16:37.22 |
| (you know the code pretty well :D) | 16:37.34 |
tor3 | not only, I changed the default to something more sane so if we hit a similar issue it won't be as bad | 16:37.52 |
| the real fix is the missing argument | 16:38.13 |
AlecTaylor | hmm | 16:38.23 |
| Why is there a brace there... | 16:38.27 |
| 7>..\apps\xpsdraw.c(130): error C2059: syntax error : '{' | 16:38.37 |
| fz_rect mediabox = (fz_rect){0, 0, page->width * 72 / 96, page->height * 72 / 96}; | 16:38.46 |
| overloaded operators? | 16:38.49 |
tor3 | overloaded operators are C++ | 16:39.22 |
AlecTaylor | ah, right. | 16:39.31 |
tor3 | don't try compiling C code as C++ if that's what you're doing | 16:39.38 |
| they're not compatible | 16:39.50 |
AlecTaylor | tor3: I know I know! - I'm just a C++ guy, so a lot of the C paradigms are confusing to me | 16:40.07 |
tor3 | that brace is part of the struct initializer | 16:40.17 |
Robin_Watts | Why the cast? | 16:40.28 |
tor3 | good question. | 16:40.58 |
AlecTaylor | oh wait | 16:41.20 |
| no worries | 16:41.22 |
AlecTaylor | cleaned the project, it now compiles | 16:41.29 |
| (damn, gotta remember to do that with each pull) | 16:41.36 |
tor3 | Robin_Watts: probably because you need the cast in expressions (like "return (struct foo){1,2,3};") | 16:41.48 |
ab5tract | Robin_Watts, http://bugs.ghostscript.com/show_bug.cgi?id=692585 | 16:42.05 |
Robin_Watts | tor3: Right. | 16:42.12 |
tor3 | and my fingers are on autopilot! | 16:42.24 |
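The `(type){...}` form is a C99 compound literal rather than a cast, so a compiler limited to C89 may reject it. A sketch of the same computation written so it builds either way, assuming a hypothetical `toy_rect` as a stand-in for fz_rect and hypothetical helper names; the 72/96 factor is taken from the quoted line and is assumed to convert 1/96 inch units to points:

```c
#include <assert.h>

/* Stand-in for fz_rect; the field names are an assumption for this sketch. */
typedef struct { float x0, y0, x1, y1; } toy_rect;

/* Portable to C89: build the struct in a local and return it by value. */
toy_rect make_rect(float x0, float y0, float x1, float y1)
{
    toy_rect r;
    assert(x0 <= x1 && y0 <= y1);
    r.x0 = x0;
    r.y0 = y0;
    r.x1 = x1;
    r.y1 = y1;
    return r;
}

/* Convert a page size in 1/96 inch units to a media box in points. */
toy_rect page_mediabox(float width, float height)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    /* C99 compound literal: an unnamed toy_rect used as an expression. */
    return (toy_rect){ 0, 0, width * 72 / 96, height * 72 / 96 };
#else
    /* C89 path: same values through the helper. */
    return make_rect(0, 0, width * 72 / 96, height * 72 / 96);
#endif
}
```

In a plain declaration the type prefix can be dropped and an ordinary brace initializer used; the compound literal is only needed where the struct value appears inside an expression, such as a return statement.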
Robin_Watts | ab5tract: It would help us a lot if you could give us an example that doesn't vary each time. | 16:43.06 |
ab5tract | ill try to create a more minimal example | 16:43.34 |
Robin_Watts | Thanks. | 16:43.37 |
ab5tract | but do you see the issue at your end? | 16:43.44 |
Robin_Watts | I haven't checked,and can't at the moment. | 16:43.59 |
ab5tract | Robin_Watts, ok | 16:45.15 |
| i've added a non-generative example | 16:45.24 |
ray_laptop | mvrhel2: are you available for consultation / brainstorming on 692158 ? | 16:45.25 |
ab5tract | thanks for your consideration! | 16:45.36 |
Robin_Watts | ab5tract: Thanks for that. | 16:46.18 |
mvrhel2 | ray_laptop: yes. I am at the coffee shop though. it is loud here today... | 16:46.19 |
AlecTaylor | An excerpt (one page) from the PDF I extracted: http://pastebin.com/KcPmYbdU - How is this useful? - Basically I'm trying to extract text+geometric-layout-info from each page of a PDF... | 16:48.08 |
chrisl | ab5tract: what are you using to view the PDF output? Both Ghostscript and Acrobat render the PDF correctly, but evince shows spurious stroked lines. | 16:51.28 |
AlecTaylor | How do I extract text+geometric-layout-info from each page of a PDF? - pdfdraw -ttt *.pdf just gives me: http://pastebin.com/KcPmYbdU (1 page excerpt) | 16:51.38 |
ray_laptop | AlecTaylor: I was going to look at what ghostscript's 'textwrite' gave for the file, but it appears totally broken (I just opened a bug) | 16:52.25 |
Robin_Watts | tor8: The block bboxes look broken in that output. | 16:53.23 |
| AlecTaylor: What geometric layout info do you want? | 16:54.02 |
| That file contains bboxes for each line. | 16:54.17 |
| (and span, and char) | 16:54.27 |
AlecTaylor | Robin_Watts: I want to know where (and in what order) it appears on the page | 16:54.56 |
Robin_Watts | AlecTaylor: Well, that information is there, isn't it? | 16:55.17 |
AlecTaylor | foo | 16:55.58 |
| I see it now | 16:56.00 |
| Sorry, 4AM here | 16:56.06 |
| Bit off :P | 16:56.10 |
AlecTaylor | is off to learn an XML library | 16:57.44 |
| Thanks for the help :] | 16:57.50 |
Robin_Watts | np. | 16:57.56 |
AlecTaylor | Hmm... actually I have class in 6 hours, I'll sleep now; learn the XML library tomorrow :) | 16:58.21 |
mvrhel2 | ah. the life of a student... | 17:02.49 |
ray_laptop | wonders if we should normalize the way mupdf and ghostscript output text info. | 17:03.14 |
| let ken and tor duke it out for whose format is better ;-) | 17:03.47 |
Robin_Watts | http://www.googlefight.com/index.php?lang=en_GB&word1=Ken+Sharp&word2=Tor+Andersson | 17:04.50 |
mvrhel2 | ray_laptop: so for the dot profile or spot function how do you think I should enable that to be specified in my code? have a few sample canned ones that people can look at and code up their own if they want? | 17:06.06 |
ray_laptop | mvrhel2: probably just a few 'sample' functions makes sense -- round, elliptical, diamond, ... (any of the common ones you have seen) | 17:08.09 |
mvrhel2 | ok | 17:08.52 |
ray_laptop | mvrhel2: you had made a comment that I'm not sure about: that the minimum dot size is taken care of by the TRC. | 17:09.39 |
mvrhel2 | well, in ordered screens I would argue yes. In clustered no | 17:10.06 |
ray_laptop | mvrhel2: for really sparse (highlight) shades where not all of cells are filled in, then I think you _do_ need to impose a minimum dot size | 17:10.40 |
| mvrhel2: what is a "clustered" screen ? | 17:10.54 |
mvrhel2 | sorry not clustered I mean stochastic | 17:11.18 |
ray_laptop | mvrhel2: with stochastic, then minimum dot size (and possibly shape) is required since we are dispersing the dots as much as possible | 17:12.09 |
mvrhel2 | with my clustered dots, at *some* point we will reach the turn on point | 17:12.11 |
| this will be *noticed* during the creation of the TRC | 17:12.27 |
| and should be accounted for in that process | 17:12.36 |
ray_laptop | mvrhel2: creation of the TRC ? I assumed that is done by printing with a strict linear choice, then measuring with a densitometer. | 17:13.41 |
mvrhel2 | yes. | 17:13.48 |
| but if no dots are showing up for the first 10% then the TRC that you created will take care of this | 17:14.11 |
| in that it will push the dots to be larger sooner | 17:14.27 |
| this wont work in the stochastic case | 17:14.52 |
ray_laptop | mvrhel2: oh, I think I see what you are getting at. If the dots don't image, then the densitometer won't see them | 17:15.01 |
mvrhel2 | yes | 17:15.06 |
| and since I am clustering my dots it is OK | 17:15.18 |
| in the stochastic case that wont work though | 17:15.31 |
| you could find that you put down a lot of small dots that never show anything | 17:15.56 |
| and then suddenly wham | 17:16.00 |
| you have a spider web of yuck on the papter | 17:16.10 |
| paper | 17:16.11 |
ray_laptop | mvrhel2: the trouble is that the reliability of trying to image individual dots is highly variable from machine to machine, and on a machine depending on the environment, and the age of the imaging components, etc. | 17:16.17 |
| mvrhel2: so manufacturers decide to impose a 'minimum' that can be reliably imaged, and work from there | 17:16.47 |
| mvrhel2: so, I think we agree on the stochastic case, but with ordered dither, we may still need the parameter | 17:18.11 |
mvrhel2 | We def. agree in the stochastic case. But I do firmly believe that when dealing with clustered ordered screens, it is not an issue like it is with stochastic screens. | 17:18.43 |
Robin_Watts | ray_laptop: What constitutes a "minimum dot size" in general? If the minimum dot size is 2, does that meant a 2x2 square? | 17:19.07 |
| or would a 2x1 block count? | 17:19.29 |
mvrhel2 | Robin_Watts: it may depend upon how the vertical and horizontal resolution are related too | 17:19.43 |
ray_laptop | Robin_Watts: with genpat I have a minimum dot shape that can be chosen from a menu: 1x2, 2x1, 2x2 with the corner missing, 2x2, ... | 17:20.33 |
Robin_Watts | ray_laptop: Right. | 17:20.49 |
mvrhel2 | that would be easy enough for me to add in. | 17:21.08 |
ray_laptop | Robin_Watts: some laser engines have different imaging characteristics between vertical and horizontal | 17:21.30 |
mvrhel2 | ray_laptop: are those shapes before or after accounting for different vertical and horizontal resolutions? | 17:21.57 |
ray_laptop | mvrhel2: those are actual dots, independent of resolution. The resolution is where I take into the account the density contribution | 17:23.04 |
| mvrhel2: e.g., a 1200x600 resolution means that the density contribution is 1/2 that of a 600x600 dot | 17:24.20 |
mvrhel2 | hmm. but a 1x2 is different than a 2x1 with 1200x600? | 17:25.24 |
Robin_Watts | Same area covered, so presumably the same density contribution? | 17:26.13 |
mvrhel2 | but if I am determining a turn on sequence, they are different | 17:26.32 |
Robin_Watts | Why? | 17:26.51 |
mvrhel2 | 2x1 is like a 600x600 1x1 | 17:26.53 |
Robin_Watts | Right. | 17:26.58 |
mvrhel2 | ok. density is the same | 17:27.28 |
| but the shape is radially different. | 17:27.38 |
Robin_Watts | Regardless of the resolution, every pixel covers the same proportion of the cell. | 17:27.39 |
| shape is different, yes. | 17:27.48 |
mvrhel2 | and in fact, I don't think the density would be the same | 17:27.54 |
| but that is a dot gain nonlinearity issue | 17:28.06 |
ray_laptop | mvrhel2: "ideal" density is the same | 17:28.13 |
Robin_Watts | I don't follow how density comes into it. | 17:28.21 |
| distribution throughout the cell is different (and we'd like an even distribution for non clustered stuff) | 17:28.59 |
mvrhel2 | anyway, forget about density for a sec, ray_laptop. with the 1200x600 case, when you specify a 2x1 min dot, that will appear as an (ideally) square dot, yes? | 17:29.13 |
ray_laptop | mvrhel2: correct | 17:29.29 |
mvrhel2 | ok. that is all I needed | 17:29.36 |
| then I should easily do what you did with respect to this | 17:30.01 |
| first though, let me add in the dot profile and the different resolutions | 17:30.21 |
ray_laptop | mvrhel2: np. | 17:30.45 |
mvrhel2 | I guess with respect to the dot profile / min dot shape interaction, we just pick up from the min dot shape and go from there | 17:30.51 |
ray_laptop | mvrhel2: yes, that makes sense | 17:31.09 |
AlecTaylor | mvrhel2: Yup, life of a student :P - gotta love it | 17:32.02 |
ray_laptop | mvrhel2: at/near minimum dot size, the user can't expect too much 'shape' to be discernible | 17:32.04 |
mvrhel2 | right | 17:32.16 |
| :) | 17:32.24 |
AlecTaylor | :P | 17:32.33 |
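The 1200x600 arithmetic from the exchange above can be checked with a minimal sketch. The function names and the width-by-height reading of "2x1" (2 pixels along the 1200 dpi axis, 1 along the 600 dpi axis) are assumptions made for illustration; this is not code from genpat or Ghostscript.

```python
from fractions import Fraction


def dot_physical_size(width_px, height_px, xres_dpi, yres_dpi):
    """Physical (width, height) in inches of a width_px x height_px dot."""
    return Fraction(width_px, xres_dpi), Fraction(height_px, yres_dpi)


def relative_density(xres_dpi, yres_dpi, ref_dpi=600):
    """Ideal coverage of one device pixel relative to one ref_dpi x ref_dpi pixel."""
    return Fraction(ref_dpi * ref_dpi, xres_dpi * yres_dpi)


# A 2x1 minimum dot on a 1200x600 device is ideally a 1/600 inch square,
# i.e. the same shape as a 1x1 dot on a 600x600 device.
square = dot_physical_size(2, 1, 1200, 600)

# A 1x2 dot on the same device covers the same area but is tall and thin.
tall = dot_physical_size(1, 2, 1200, 600)

# One 1200x600 pixel contributes half the ideal density of a 600x600 pixel.
half = relative_density(1200, 600)
```

Exact fractions are used so that the "same area, different shape" point shows up without floating-point noise. Dot gain nonlinearity, which mvrhel2 mentions, is deliberately left out of this ideal model.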
ab5tract | thanks for the quick turnaround on the bug! | 17:46.29 |
| should have tried different pdf viewers earlier, sorry for the bother | 17:46.50 |
chrisl | ab5tract: you agree with my conclusion, then? I should close the bug? | 17:51.50 |
chrisl | closes bug, as it's time to finish...... | 17:53.52 |
Snerf | is there a way to set a image size when converting from pdf to png? | 17:57.57 |
Robin_Watts | Snerf: Change the resolution ? | 18:00.25 |
Snerf | well, the pdf is 800x600, how could I make the png file 100x75, for instance | 18:01.27 |
ray_laptop | Snerf: mupdf or gs ? | 18:01.36 |
Snerf | gs | 18:01.43 |
ray_laptop | -dPDFFitPage -g100x75 | 18:01.58 |
Snerf | thanks, will give it a try | 18:02.08 |
ray_laptop | note that there is a bug with PDFFitPage (that I'm waiting on to commit) that shows a problem if the target orientation (landscape / portrait) doesn't match the PDF. | 18:03.10 |
| Snerf: my pending patch will auto-rotate for 'best fit' | 18:03.28 |
| Snerf: but for your example, it should be fine | 18:03.43 |
Snerf | will try it | 18:04.40 |
| wow, terrible quality.. | 18:05.24 |
Robin_Watts | ray_laptop: I've found a problem in my clist stuff. | 18:05.31 |
| and you may be able to quickly point me to a solution. | 18:05.43 |
Snerf | think I will pass on that idea | 18:05.49 |
Robin_Watts | Snerf: Try antialiasing? | 18:06.03 |
Snerf | Robin_Watts, wouldnt know how.. this is my line.. | 18:06.44 |
ray_laptop | Snerf: you may get (slightly) better results with -dGraphicsAlphaBits=4 -dTextAlphaBits=4 (turns on anti-aliasing) | 18:06.58 |
Robin_Watts | Snerf: What ray said :) | 18:07.11 |
Snerf | gs -dPDFFitPage -g180x73 -dUseCIEColor -dSAFER -dBATCH -dNOPAUSE -r300 -dNOPLATFONTS -sDEVICE=png16m -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -sOutputFile=12193.thumb.png 12193.pdf | 18:07.14 |
Robin_Watts | oh, right. | 18:07.30 |
Snerf | I guess I am then :) | 18:07.31 |
Robin_Watts | Try using mupdf ? | 18:07.39 |
ray_laptop | Snerf: then that's about what you can get | 18:07.45 |
Snerf | no, I havent. I used imagemagick always, but it for some reason has broken, and not creating the png's properly | 18:08.16 |
ray_laptop | doubts that mupdf will do much different, but it's worth a try (manually setting the resolution down). | 18:08.24 |
| I don't think there is an equivalent to PDFFitPage is there ? | 18:08.45 |
Snerf | it always worked perfectly fine in imagick, until today | 18:09.01 |
ray_laptop | imagemagick uses GS for PS and PDF, I think | 18:10.00 |
Snerf | yes it does, its just weird how it broke | 18:10.21 |
ray_laptop | I laughed at the typo "imagick" when the image is icky :-) | 18:10.56 |
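For reference, the options suggested to Snerf above can be gathered into one argument list. This is a sketch only: the helper name and its parameters are hypothetical, and it uses just the gs switches that appear in this conversation.

```python
def gs_thumbnail_args(pdf_path, png_path, width_px, height_px, alpha_bits=4):
    """Build a gs command line that fits a PDF page into a fixed-size PNG."""
    return [
        "gs",
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        f"-g{width_px}x{height_px}",          # output size in pixels
        "-dPDFFitPage",                       # scale the page to that size
        f"-dTextAlphaBits={alpha_bits}",      # anti-alias text
        f"-dGraphicsAlphaBits={alpha_bits}",  # anti-alias graphics
        f"-sOutputFile={png_path}",
        pdf_path,
    ]


args = gs_thumbnail_args("12193.pdf", "12193.thumb.png", 100, 75)
# To run it: subprocess.run(args, check=True)
```

Building the list separately from running it makes the command easy to inspect or log before handing it to `subprocess.run`.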
Robin_Watts | ray_laptop: I'm putting planar data for tiles into the clist. | 18:11.07 |
ray_laptop | Robin_Watts: right. | 18:11.19 |
Robin_Watts | and it's looking like my tiles data is being expanded out OK at the reading side. | 18:11.31 |
| but then it's being overwritten by the next tile that comes in. | 18:11.47 |
| (i.e. only the first plane remains uncorrupted). | 18:11.57 |
| This smells to me as if I've got a size calculation not taking account of the number of planes. | 18:12.19 |
ray_laptop | Robin_Watts: yeah, sounds like that to me, too | 18:12.35 |
Robin_Watts | Any hints as to where to look for such a calculation? | 18:12.50 |
ray_laptop | goes off to see if I can find it... | 18:13.30 |
Robin_Watts | Is that the 'offset' field that's stored in the clist ? | 18:13.47 |
| read_set_bits writes the data to 'data', which seems to be set from (slot + 1), and slot comes from cdev->chunk.data + offset | 18:15.02 |
| so 'offset' should be being calculated at writing time? | 18:15.21 |
ray_laptop | Robin_Watts: maybe 'clist_find_bits' ??? | 18:16.07 |
| the whole business of the writer keeping track of the tile cache that is used by the reader seems funky, and I haven't dug into it. | 18:18.17 |
Robin_Watts | can't see a calculation in there? | 18:18.18 |
| it surprised me :) | 18:18.31 |
ray_laptop | Robin_Watts: in clist_change_tile check the logic circa line 627 | 18:20.57 |
Robin_Watts | cmd_size_tile_params = How many bytes to send the tile params. | 18:21.33 |
| not the data. | 18:21.35 |
ray_laptop | that comment block is particularly concerning | 18:21.40 |
Robin_Watts | It's getting the offset from loc.tile. | 18:23.05 |
ray_laptop | Robin_Watts: but in the call to cmd_put_bits, it _looks_ like it takes the num_planes into account | 18:23.05 |
Robin_Watts | Yes, that writes the data. I've already updated that. | 18:23.22 |
| The problem is that the *next* tile goes in at an offset that's too small. | 18:23.43 |
| So I was hoping to find a line like: offset += size_of_this_tiles_data; | 18:24.02 |
| I don't see how it can know the number of bytes that the tile will take until after it calls cmd_put_bits. | 18:25.04 |
| ignore that, it's not that size that's required. | 18:26.32 |
ray_laptop | Robin_Watts: what's that FIXME in clist_change_bits ??? | 18:29.15 |
Robin_Watts | FIXME: Send more planes? | 18:29.26 |
| I've fixed that :) | 18:29.30 |
ray_laptop | gxclbits line 737 | 18:29.34 |
| Oh, guess I'm out of date. Just a sec.... | 18:29.44 |
Robin_Watts | Oh, it's clist_add_tile, probably. | 18:30.42 |
| Might have found it. | 18:31.33 |
| Thanks! | 18:31.34 |
ray_laptop | Robin_Watts: I didn't help ! | 18:31.49 |
Robin_Watts | (Test, then dinner, or dinner then test? The choice between the possibility of happiness, or the certainty of disappointment :) ) | 18:32.28 |
| Damn. Still failed. | 18:32.49 |
| Will keep looking later. | 18:32.53 |
| worked! Woo Hoo! | 18:35.27 |
| will commit later. | 18:35.32 |
ray_laptop | Robin_Watts: you should have had dinner first ;-) | 18:38.04 |
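The symptom Robin_Watts describes (only the first plane survives) is what you would expect if slot offsets advance by a single plane's worth of bytes. The sketch below is a toy model of that bookkeeping, not the actual Ghostscript clist tile cache; every name in it is hypothetical.

```python
def tile_data_size(raster_bytes, height, num_planes=1):
    """Bytes needed to store one tile: one raster x height block per plane."""
    return raster_bytes * height * num_planes


def slot_offsets(tiles, planar_aware=True):
    """Assign consecutive offsets to tiles given as (raster, height, planes)."""
    offsets = []
    offset = 0
    for raster, height, planes in tiles:
        offsets.append(offset)
        # The suspected bug: advancing as if every tile had one plane.
        offset += tile_data_size(raster, height, planes if planar_aware else 1)
    return offsets


def first_clobbered_tile(tiles, offsets):
    """Index of the first tile whose data runs into the next slot, or None."""
    for i in range(len(tiles) - 1):
        raster, height, planes = tiles[i]
        end = offsets[i] + tile_data_size(raster, height, planes)
        if end > offsets[i + 1]:
            return i
    return None


# Two 16x16 1-bit tiles (2 bytes per row) with 4 planes each.
tiles = [(2, 16, 4), (2, 16, 4)]
good = slot_offsets(tiles)                     # second tile starts at 128
bad = slot_offsets(tiles, planar_aware=False)  # second tile starts at 32
```

With the plane-unaware offsets, the second tile lands at byte 32, directly after the first tile's first plane, so planes 2 to 4 of the first tile get overwritten.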
mvrhel2 | bbiab | 18:48.18 |
Robin_Watts | ray_laptop: (For the logs). I'm running a test job on peeves. Feel free to nice it if it's taking too much CPU etc. | 19:29.20 |
| reboot. | 19:29.23 |
Snerf | ok, downloaded the ghostscript-9.04.tar.gz file, and when I tar xzvf it, it errors with crc errors, and such, is the link bad? | 20:23.39 |
| thats from downloads.ghostscript.com | 20:36.57 |
ray_laptop | Robin_Watts: that's what peeves is there for. It seems to have finished because the load ave is only .14 and I don't see anything of yours running except ssh-agent. | 21:00.36 |
| Robin_Watts: today I don't need to heat up the office, however. It is currently 97 degrees (it feels hot anytime the temp is > a right angle) | 21:02.05 |
| Snerf: that's somewhat distressing (about the CRC errors). I usually use gzip -dc ghostscript-9.04.tar.gz | tar -xf - | 21:03.45 |
| Snerf: I'll download it and check it right now | 21:04.00 |
| Snerf: you can also go to: http://www.ghostscript.com/download/ | 21:06.33 |
| and get a binary (linux or Windows) or source package | 21:07.26 |
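One way to tell a damaged download from a tar problem is to read the gzip stream to the end, since the gzip CRC is checked when the stream finishes. This is a generic sketch using the Python standard library; the function name is hypothetical and it is not part of any Ghostscript tooling.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses and its CRC matches."""
    try:
        with gzip.open(path, "rb") as f:
            # Reading to EOF forces the trailer CRC and length checks.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


# Example: gzip_is_intact("ghostscript-9.04.tar.gz")
```

If this returns False, the file itself is corrupt or truncated and downloading it again is the likely fix.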
BusError | what became of --with-system-libtiff, --without-jbig2dec, --without-jasper, --with-install-cups, --disable-compile-inits ? | 21:23.46 |
| trying to upgrade from 9.04 to git, to try to solve my cups problems | 21:29.38 |
| ah. figured it out | 21:41.00 |
| seems I can no longer really use the system lcms2 code | 21:41.11 |
| just to reply to my problem about gs taking 140MB ram on my 64MB board with cups, a couple of days ago... | 22:56.48 |
| turns out there is a RIPCache that defaults to "128m", thus the problem. I set it to 50m and it prints nicely now | 22:56.56 |
| a "complex" pdf page takes about 1:30m or so, which is not bad at all | 22:57.05 |
Robin_Watts | Has any decision been made about where we are having the December meeting? | 23:49.01 |
| If Miles doesn't announce before the end of this week, it'll be the end of the month (because of the race) at earliest, and we'll then be booking flights only about a month ahead... | 23:50.19 |