| <<<Back 1 day (to 2014/03/03) | 2014/03/04 |
Robin_Watts | paulgardiner: Morning | 10:03.51 |
paulgardiner | Hi Robin_Watts | 10:03.57 |
Robin_Watts | I got the go ahead from tor to commit the pdf_process stuff yesterday, but since then I've split the commit into separate parts. | 10:04.25 |
| Could you give it a once over before I push it please? | 10:04.34 |
paulgardiner | Yeah sure | 10:04.41 |
Robin_Watts | Thanks. from "Add pdf_process interface." to "Add pdf_process for filtering operator streams" | 10:05.14 |
paulgardiner | Small thing. Commit message of first commit covers all three commits. | 10:12.55 |
Robin_Watts | damn, I edited that at one point. Will fix, thanks. | 10:15.30 |
paulgardiner | I wonder about pdf_process_init instead of pdf_process_run, but I may be misunderstanding its purpose. | 10:17.48 |
| Oh i am | 10:18.06 |
| Got it | 10:18.08 |
| Besides that, LGTM | 10:21.42 |
chrisl | kens2: ping | 10:48.21 |
kens2 | pong | 10:48.27 |
chrisl | kens2: with Bug693711.pdf I get the warning about UseCIEColor | 10:49.00 |
kens2 | wwwIs that a problem ? | 10:49.09 |
chrisl | And I'm not setting it | 10:49.10 |
kens2 | Hmm, let me look | 10:49.16 |
chrisl | And I'm not using %d this time, either | 10:49.34 |
kens2 | OK just the usual pdfwrite command line ? | 10:50.36 |
chrisl | -sDEVICE=pdfwrite -o stuff.pdf ../../../svn-private/ghostpcl/trunk/tests_private/comparefiles/Bug693711.pdf | 10:51.09 |
kens2 | Hmm yes that does trigger for me | 10:51.40 |
| I wonder why | 10:51.46 |
chrisl | It might be a good idea to grep through the cluster logs, and see if other files are triggering the warning | 10:52.36 |
kens2 | One will be neough I guess | 10:52.46 |
| It must be coming form the PDF interpreter | 10:52.53 |
| I must admit I can't see it immediately, I'll get back to it later | 10:53.32 |
| Might keep it for the lfight | 10:53.53 |
chrisl | It's not urgent | 10:56.34 |
kens2 | chrisl the file uses a DefaultRGB colour space, so it does indeed set UseCIEColor. Its legitimate in this case, but kind of nasty. The occurence will be rare to extremely rare I think | 11:28.58 |
chrisl | kens2: it's not great given there's no action I can take based on the warning | 11:31.15 |
kens2 | chrisl well Marcos wanted the warning.... I can't tell the difference between setuserparams and .setuseciecolor | 11:33.24 |
| Like I said, it will be rare to extremely rare that this crops up | 11:33.39 |
| Its a consequence of the PDF spec,if there is a Defautl* color space in the PDF file then we have to systematically convert all colours into a device-independent colour space, ie set /UseCIEColor to true.Unlike PostScript, there's no real way for the user to say 'don't do that' | 11:35.36 |
chrisl | kens2: yeh, it's just the fact that the warning is effectively saying "don't do that", and I didn't do it..... | 11:37.31 |
kens2 | chrisl there really isn't much I can do, I'd need to know the input was a PDF file to prevent it. | 11:37.52 |
chrisl | Sure, if it's that rare, then let's not worry about it | 11:38.19 |
kens2 | It should be very rare indeed. | 11:38.28 |
| If it wasn't for the fact that the crappy code we had before required UseCIEColor, this wouldn't be such a problem :-( | 11:39.00 |
Robin_Watts | ok. vsnprintf, I can put you off no more. | 12:12.24 |
| Morning tor8. | 12:50.41 |
| I found the following commit lying around. | 12:50.49 |
| http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=40757a1ea0d81b73274856fe1d7aa9022bfe148d | 12:50.59 |
| As I recall you didn't like the use of a device hint. | 12:51.10 |
tor8 | Robin_Watts: yeah, but I'm starting to think I was wrong about that. | 13:08.24 |
Robin_Watts | ah, fab. :) | 13:08.38 |
tor8 | the device hints are inaptly named; they really are interpreter hints | 13:08.38 |
Robin_Watts | yeah. | 13:09.00 |
tor8 | but originally at least the idea was that they were related to the device/interpreter specifics | 13:09.14 |
Robin_Watts | And while we don't currently have the idea of a "no cache" hint for non PDF things, we could do in future. | 13:09.23 |
tor8 | and I believe my complaint was that the no-cache hint was unrelated to the device | 13:09.34 |
Robin_Watts | I think so. | 13:09.41 |
tor8 | my remaining question though is, why not make this behaviour default? | 13:10.23 |
| it looks a bit odd to create the device and set the hint there, when it really looks like it should be a flag to fz_run_xxx | 13:11.16 |
kens2 | Hmm for marcosw and maybe Robin_Watts. I just ran a bmpcmp and got these messages in the returned email: | 13:11.21 |
| Use of uninitialized value in concatenation (.) or string at run.pl line 906. | 13:11.22 |
| Use of uninitialized value $t2 in concatenation (.) or string at run.pl line 373. | 13:11.22 |
| Use of uninitialized value $t2 in scalar chomp at run.pl line 370, <F> line 1. | 13:11.22 |
Robin_Watts | kens2: Thanks. If I get a mo, I'll look later. | 13:12.00 |
tor8 | Robin_Watts: maybe we should have a general concept of interpreter flags, which is the bitwise or of a flags argument to fz_run_xxx and something retrieved from the device? | 13:12.15 |
kens2 | No rush Robin_Watts I'mnot even sure whose scripts those are | 13:12.15 |
Robin_Watts | run.pl is the script that the cluster master sends to the cluster node to run. | 13:12.43 |
tor8 | so that the device can always say "hey, I don't use images" and we can also from the app layer (without messing with the device hints) say, "don't send images" to the interpreter | 13:12.54 |
Robin_Watts | tor8: That would mean changing the existing API, right? | 13:13.56 |
tor8 | Robin_Watts: yes, it would. | 13:14.08 |
| or adding another entry point | 13:14.14 |
| I think fz_run_with_flags could be added without breaking the existing API should we want to | 13:14.46 |
Robin_Watts | What does this give us that we don't currently get by abusing the device hints? | 13:15.01 |
tor8 | and all the internal "run" functions we ought to be able to change with no concern for breaking, I reckon most people shouldn't be using those | 13:15.10 |
| a cleaner separation of concerns :) | 13:15.20 |
Robin_Watts | I'm wondering if there is a "renaming" we can do. | 13:15.31 |
| Stop calling them "device_hints" and call them... something else. | 13:15.51 |
tor8 | s/device_hints/interpreter_flags/? | 13:15.53 |
| interpreter_control | 13:15.58 |
| or something like that | 13:16.01 |
Robin_Watts | We could keep calling them 'hints' and define bits from 0 upwards to be device ones, and from 31 downwards to be interpreter ones. | 13:17.05 |
tor8 | Robin_Watts: but really, they're all interpreter ones. the only difference is the source. | 13:18.41 |
| whether they're set by default from current device, or manually by the caller | 13:19.00 |
Robin_Watts | OK, so interpreter_control or interpreter_flags is fine by me. | 13:19.16 |
tor8 | or interpreter_hints if you prefer that | 13:19.32 |
Robin_Watts | What are the functions that set/get them? | 13:19.45 |
tor8 | fz_enable/disable_device_hints set them | 13:20.10 |
Robin_Watts | Right, so we should rename those? And leave a #define there so old users get passed through OK ? | 13:20.33 |
tor8 | the ugly part is where we get them (dev->hints) | 13:20.34 |
| if I had my way I'd remove them and pass them to the fz_run functions instead | 13:21.25 |
Robin_Watts | Hmm. | 13:21.48 |
tor8 | the only code that uses them now is mudraw | 13:22.07 |
Robin_Watts | For the text extraction device, it runs by default without images. (i.e. it has FZ_HINT_NO_IMAGES or whatever set) | 13:22.23 |
tor8 | fz_run_page(..., flags) would do flags |= dev->hints at the top and then use that | 13:22.39 |
Robin_Watts | But we can tweak the device to turn that off. | 13:22.43 |
| So we get extracted images in the produced HTML. | 13:22.58 |
| so, we'd still need a way of setting/getting the device hints. | 13:23.10 |
| because that one is both an instruction for the device and for the interpreter. | 13:23.27 |
tor8 | Robin_Watts: we could pass a flag when creating the text extraction device there | 13:23.33 |
Robin_Watts | We could. | 13:23.41 |
tor8 | whether to create images or not, that seems like a cleaner solution interface wise | 13:23.53 |
Robin_Watts | ok. | 13:24.01 |
tor8 | if you want I can bash on it, I've had my fill of javascript date parsing/formatting crap for the week | 13:24.28 |
Robin_Watts | tor8: please feel free. | 13:24.37 |
tor8 | feel free to push the commit as is now in that case, and I'll revamp after that | 13:24.54 |
Robin_Watts | I'm knee deep in fz_vsnprintf. | 13:24.55 |
| tor8: Ta. | 13:25.01 |
tor8 | Robin_Watts: that's my next step... I got a complaint that Number.toExponential is outputting leading zeros on the exponent | 13:25.21 |
Robin_Watts | I'm taking the view that I don't need to support everything. Just enough for us. | 13:25.46 |
| so the only thing other than %x %c %d %f that I'm supporting is %02x. | 13:26.05 |
tor8 | Robin_Watts: that's a very sane choice | 13:26.08 |
| %C to utf-encode a unicode character? pleeeeeease! | 13:27.00 |
| with sugar and a cherry on top! | 13:27.05 |
| or just utf8-encode all %c? | 13:27.27 |
Robin_Watts | Do we need it? | 13:28.04 |
| I'll gladly add it if we do. | 13:28.09 |
tor8 | I'd love to have it for the text extraction XML generation | 13:28.23 |
| and I've found myself wishing for it on more occasions | 13:28.37 |
Robin_Watts | ok. | 13:29.07 |
| I am tempted by %m for a matrix (pointer to an fz_matrix) | 13:29.24 |
tor8 | Robin_Watts: and %r for rects? | 13:29.38 |
| %b for irects? | 13:29.46 |
Robin_Watts | That would make sense too. | 13:29.50 |
tor8 | definitely a %m though! | 13:29.56 |
Robin_Watts | Not sure we need a %b do we? | 13:30.12 |
tor8 | probably not | 13:30.17 |
| %m and %r should take us a long way | 13:30.28 |
Robin_Watts | %i for irects, keeping %b for bools if we need them. | 13:30.31 |
tor8 | the only problem I see here is that we lose gcc's handy printflike warnings | 13:30.47 |
Robin_Watts | yeah. | 13:30.56 |
tor8 | but I can live without that if we get %m: ) | 13:31.10 |
| it might make sense to use uppercase for our custom %-escapes | 13:31.40 |
| less risk of confusion for new users | 13:31.47 |
Robin_Watts | yeah. | 13:31.59 |
tor8 | %M, %R, etc | 13:32.01 |
| I'd have said %O for pdf_obj if it was still a fz_obj :) | 13:32.12 |
Robin_Watts | tor8: OK, pushed as is. | 13:47.08 |
sebras | tor8: oh, so no %N for 0 0 R either? :-/ | 14:02.50 |
kens | chrisl I thought about the CIEColor thing some more. On reflection I'm unconvinced that a high level device should honour the Default* colour spaces. The result of doing so will be (for example) a PDF file which has ICCBased spaces where originally it had device spaces. | 15:01.44 |
| I did a bit of digging and I've turned off the Default* colour spaces if the output device is a high level device, you cxan see the diffs in my latest bmpcmp run | 15:02.15 |
| COUld you take a look and give me an opinion ? | 15:02.28 |
| There are a very few files which are different | 15:02.47 |
chrisl | kens: I'll check through them in a mo - just on the phone.... | 15:05.21 |
kens | No rush | 15:05.27 |
chrisl | kens: okay, so I'm a little confused by the first diffs in your bmpcmp - the Ghent tests | 15:15.01 |
kens | They contain opverprint IIRC let me just look | 15:15.14 |
| Yes they do. So previously the CMYK is converted to ICC and the ovepritn is lost. Now we maintain the CMYHK and so its not. Note that Acrobat displays the overprint | 15:15.55 |
| Posibvly because the Default space is DefaultCMYK | 15:16.15 |
chrisl | Ah, okay so they are different to other tests - the usual ones I expect the "X" not to be visible | 15:16.43 |
kens | Ah, I believe the 'X' should be visible if overprint is allowed, but of course, it depends on how you interpret the Default* spaces | 15:17.15 |
| All that I can see fomr the spec is that we should convert to ICC, and therefore should not overprint. | 15:17.41 |
| But to be perfectly honest, I'm willing to bet that if someone turned on overpritn they are going to be annoyed if the result is not overprinted just because they specified a Default* space. | 15:18.20 |
| Note that in our implementation you can still get the same result by deliberately setting UseCIEColor | 15:18.41 |
| And you will get the non-overprinted version if you render. We only drop the 'UseCEIColor' PDF thingy for high-level devices | 15:19.11 |
| Its a bit confusing that one | 15:19.17 |
chrisl | The actual colour shifts are about what I would expect to see, though | 15:19.37 |
kens | Sort of. There are some places where we now have a device colour, whereas before it was ICC so we no longer see halftoning | 15:20.11 |
| I think overall its a small improvement, and I'll argue the case with anyone who actually *wants* pdfwrite and friends to maintain the ICC colours (they can just turn on /UseCIEColor) | 15:20.46 |
chrisl | kens: yeh, I meant the colour shifts are similar to those I've seen comparing UseCIEColor vs not, which is what I'd expect with your change | 15:21.31 |
kens | Pretty much yes. | 15:21.40 |
| What do you think ? It gets rid of the warning, and I think I can explain the differences to anyone who notices (or cares) and I have a solution for them if they really want that to happen | 15:22.23 |
chrisl | I think it makes more sense this way, personally | 15:23.06 |
kens | Yeah on balance that was what I though. I'll go ahead an commit it then, thanks | 15:23.22 |
Robin_Watts | https://www.google.co.uk/search?q=PC+world&oq=PC+world&aqs=chrome..69i57j0l5.1446j0j8&sourceid=chrome&espv=210&es_sm=122&ie=UTF-8 | 16:30.55 |
| Look carefully at the logo on the right hand side of the page | 16:31.09 |
kens | map data 2014 ? | 16:31.26 |
Robin_Watts | Under that... | 16:31.36 |
kens | Umm there's nothing under that for me | 16:32.07 |
Robin_Watts | Helen and I both get a PC World logo. | 16:32.41 |
| or what looks like a PC world logo. | 16:32.49 |
kens | Not me | 16:32.54 |
Robin_Watts | What browser? | 16:33.11 |
kens | Firefox | 16:33.15 |
| I'll try chrome | 16:33.28 |
chrisl | Would be funny, but too true..... | 16:33.36 |
Robin_Watts | I get it on firefox too. | 16:33.40 |
kens | Hmm not seeing it | 16:34.07 |
mvrhel_laptop | I get one | 16:34.33 |
Robin_Watts | http://www.bitterwallet.com/ | 16:34.41 |
kens | Guess my browser is too locked down | 16:34.45 |
| OK if I look at the ads in Chrome, tehn it has htat logo | 16:35.37 |
chrisl | TBH, the biggest shock for me the claim that there's a PC World in Andover - but there isn't, it's really Currys | 16:36.08 |
henrys | chrisl: wiki says it merged into currys | 16:37.59 |
chrisl | henrys: yes, but they mostly still have separate stores, and different target audiences | 16:38.42 |
mvrhel_laptop | off to dentist | 16:55.20 |
| bbiaw | 16:55.22 |
Robin_Watts | tor8: ping | 16:58.55 |
| Would you object to an fz_mins/fz_maxs ? For size_t rather than int ? | 16:59.16 |
kens | OK I have ot go, parent's evening to attend... Goodnight all | 17:02.28 |
paulgardiner | Robin_Watts: finally, the WinRT utf8 thing is fixed. It's on paul/master tagged onto the end of your commit. | 17:05.31 |
Robin_Watts | paulgardiner: Cool. Will look now. | 17:05.43 |
| 2 commits ? | 17:06.59 |
| oh, right, I see, yes. | 17:07.48 |
| paulgardiner: Pushed. Thanks. | 17:16.14 |
henrys | paulgardiner: you have your first SOT customer | 17:59.12 |
ray_laptop | henrys: an OEM ? | 18:01.05 |
henrys | ray_laptop: yes miles sent out email | 18:04.13 |
ray_laptop | henrys: I hadn't seen it | 18:10.31 |
| mvrhel_laptop: are you available for a call with cust 532 ? | 18:10.40 |
Robin_Watts | I still haven't seen it :( | 18:11.47 |
ray_laptop | oops just hung up on cust 532 | 18:12.43 |
henrys | Robin_Watts: you must get support email? | 18:15.20 |
Robin_Watts | henrys: Normally, yes. | 18:15.50 |
henrys | do we know how to release an SOT "library" for iOS and android, I've seen an android app built. | 18:21.35 |
Robin_Watts | henrys: Offhand, no. | 18:21.58 |
henrys | paulgardiner, Robin_Watts : unfortunately Bipartate will not be of help to us for at least 60 days. They've told us they're "reorganizing". | 18:22.46 |
Robin_Watts | henrys: Well, emobix are still a possibility. | 18:23.21 |
| henrys: I see marcosw's reply, but not the original. How odd. | 18:34.08 |
henrys | Robin_Watts: do we know if there is only 1 library or are there many? | 18:34.11 |
Robin_Watts | I don't know for sure. | 18:34.31 |
henrys | Miles responded before I could stop him sorry | 18:34.57 |
Robin_Watts | But if memory serves, there is a 'picsel library' build product that I saw while staring at the build files. | 18:35.12 |
zeniko | Robin_Watts: I'm unclear on ownership of the new pdf_process::state | 18:45.13 |
Robin_Watts | zeniko: You put a pdf_process structure on the stack. | 18:45.47 |
| You call pdf_process_run (or similar) and pass in a pointer to it to initialise it. | 18:46.07 |
| Then you pass that into pdf_process_page_contents (or similar) | 18:46.25 |
| pdf_process_run will set up some private state, and hide an opaque pointer to it in the structure. | 18:47.00 |
| It's owned by the pdf_process structure itself, effectively. | 18:47.16 |
zeniko | however it's freed by pdf_free_csi? | 18:47.29 |
Robin_Watts | pdf_process_page_contents calls pdf_new_csi | 18:47.50 |
| That takes the pdf_process structure and takes ownership of its contents. | 18:48.09 |
| hence it's freed by pdf_free_csi. | 18:48.18 |
zeniko | what's supposed to happen in the case where pdf_new_csi is never called? | 18:48.28 |
| (as it happens in pdf_process_annot for invisible and hidden annotations) | 18:48.40 |
Robin_Watts | Oh, ass. | 18:49.13 |
| That's a bug. | 18:49.17 |
| I will fix it, thanks. | 18:49.30 |
zeniko | and WRT the fix for bug 691691: have you been able to run any tests to verify that the patch works as intended? | 18:50.33 |
Robin_Watts | I assume I did when I wrote it. It's been sat there for a while. | 18:51.49 |
| and my memory is poor. | 18:51.56 |
zeniko | IIRC, when I did some tests back when you first posted that patch didn't show any improvements | 18:52.45 |
Robin_Watts | zeniko: OK. I will revisit that too. | 18:53.21 |
zeniko | Robin_Watts: I'll look into measuring again now that you've landed it | 18:54.01 |
Robin_Watts | Thanks. | 18:54.15 |
| If you can spot anything stupid that I've cocked up, please say :) | 18:54.26 |
zeniko | just to confirm: all I'd have to do is add an fz_enable_device_hints(dev, FZ_NO_CACHE) in mudraw for the text case so that I should be able to notice the improvements? | 18:55.39 |
Robin_Watts | zeniko: Yes. (IIRC) | 18:56.22 |
henrys | Robin_Watts: wow was talking to miles these oem's are responding. | 18:57.56 |
Robin_Watts | henrys: cool. I think. | 19:03.14 |
| Ah, and the backlog of support stuff just arrived. | 19:08.23 |
henrys | can you tell from the headers where it got delayed? | 19:08.59 |
zeniko | Robin_Watts: I've added a counter to pdf_clear_xref_to_mark for how many objects were cleared for each page - and the answer is always 0 | 19:10.33 |
| unfortunately, I don't see yet why it doesn't work | 19:11.49 |
Robin_Watts | henrys: An hour and 5 minutes delay between 10.60.44.232 and mail-oa0-f53.google.com it seems | 19:13.01 |
zeniko | Robin_Watts: could it be that pdf_load_page causes most objects to be loaded already so that doing an FZ_NO_CACHE text extraction run simply doesn't have to load any additional objects? | 19:15.08 |
Robin_Watts | zeniko: I'd have hoped not. | 19:15.39 |
| But clearly something is going wrong :( | 19:16.13 |
tor8 | Robin_Watts: okay to fz_min/max for size_t (or fz_minu for unsigned?) | 19:37.24 |
Robin_Watts | tor8: Actually, it now looks as though it might not be needed. | 19:44.06 |
| tor8: You still here? | 20:03.35 |
| git push -f origin master | 20:03.41 |
| bah. | 20:04.07 |
| tor8, sebras, paulgardiner: 2 leak fixes on robin/master | 20:04.33 |
zeniko | Robin_Watts: isn't it possible to free the pdf_process without creating a csi? | 20:06.50 |
Robin_Watts | zeniko: Not neatly, currently. | 20:07.09 |
zeniko | else that fix LGTM (except for a whitespace inconsistency in the flags check) | 20:07.13 |
Robin_Watts | Ta. Whitespace fixed. | 20:08.46 |
zeniko | Robin_Watts: your fz_vsnprintf change will change the behavior of fz_buffer_printf which we currently use as well | 20:14.31 |
Robin_Watts | zeniko: My intention is for it to change the behaviour of fz_buffer_printf. | 20:14.54 |
| Is there anything that you rely on that I am not supporting? | 20:15.22 |
zeniko | AFAICT, the only decorations we use are %.4f and %03o | 20:16.18 |
Robin_Watts | What does %02o do ? | 20:17.09 |
| What does %03o do ? | 20:17.12 |
| Is o octal? | 20:17.19 |
zeniko | o is used for printing octal escapes in PDF strings | 20:18.09 |
Robin_Watts | I can add support for o and %03o | 20:18.21 |
| assuming I can ever get the damn thing working. | 20:18.28 |
| What is the %.4f for ? | 20:18.37 |
zeniko | and %.4f is used for getting a reasonable amout of precision for floating point numbers | 20:18.57 |
Robin_Watts | The reason for this work is so I can reliably use %f and get no loss of data. | 20:19.02 |
| hence you might be able to drop the .4 | 20:19.15 |
zeniko | most likely, so the octal escape would be the only one missing for us | 20:19.55 |
Robin_Watts | I will add that, assuming I can ever get it working :( | 20:20.48 |
zeniko | another concern: will fz_output::printf still behave consistently independent of whether fz_output is for a buffer or a file? | 20:21.47 |
Robin_Watts | zeniko: Good question. | 20:22.36 |
| I will endeavour to ensure it does. | 20:23.09 |
zeniko | it might also be worth considering changing the name of fz_output::printf so that anybody implementing their own fz_output notices the change (even though I'd assume that there's nobody actually doing that) | 20:24.38 |
mvrhel_laptop | ray_laptop: back from dentist | 20:24.49 |
| sorry I missed the call | 20:24.54 |
zeniko | (then again, we already have an fz_stream for reading from a Win32 IStream, so might also eventually get an fz_output for writing to one) | 20:25.36 |
henrys | Robin_Watts: would processing all this doxygen stuff in epage be worthwhile. Was that actively maintained, sometimes things like that get stale and reading it is worse than not. | 20:36.38 |
Robin_Watts | henrys: When I was there it was a required part of commits that the doxygen should be updated. | 20:38.28 |
| But I'm not sure I ever saw the results of a doxygen run :) | 20:38.40 |
| At the very least, it serves as documentation for entry/exit conditions of all functions. | 20:39.16 |
henrys | docbuild.py I suppose holds the magic to run that | 20:41.07 |
Robin_Watts | It is possible that there was a nightly ATS task to run that? | 20:43.54 |
mvrhel_laptop | bbiaw | 21:18.58 |
henrys | doxygen works fine. I like the python, a hell of a lot better than make hell | 21:24.25 |
| Forward 1 day (to 2014/03/05)>>> | |