| 2012/04/18 |
kens | chrisl when I do a cluster test, do you have any idea where the source is stored on peeves for my user ? | 08:03.09 |
chrisl | kens: no, sorry | 08:08.29 |
kens | Well, I may have to wait for marcos :-) | 08:12.13 |
chrisl | kens: can't you use casper? | 08:12.17 |
kens | Hmm, maybe. I want to copy stuff to my own directory from the cluster and run over ssh | 08:12.49 |
chrisl | Ah, I don't think casper has stuff like ddd on it..... | 08:13.45 |
kens | Well, can't use it then :-( | 08:13.59 |
chrisl | Huh, I thought casper didn't have any X stuff on it, but I can run an xterm. I guess I could install ddd for you | 08:15.36 |
kens | If it's not there that would help me a lot. | 08:15.51 |
| I can find the source I need there | 08:15.57 |
| The problem only seems to occur on a 64-bit build with -r600, so it's a memory problem of some kind. | 08:16.57 |
chrisl | /home/regression/cluster/users/ken/ghostpdl ? | 08:17.06 |
kens | Yes, that's it | 08:17.17 |
| But I've copied it to /home/ken | 08:17.32 |
chrisl | Okay, give me a couple of minutes | 08:17.33 |
| kens: ddd is installed - you probably want to disable the splash windows and go into options and enable "Suppress X warnings". | 08:22.56 |
kens | chrisl still building :-) | 08:23.09 |
| I'll try it as soon as it's finished | 08:23.19 |
| How do you disable the splash screen ? | 08:23.34 |
chrisl | When you first start ddd you get the splash screen and the "tip" dialogue - each of those has a check box to disable them. | 08:24.22 |
kens | AH right, I knew about the tip box | 08:24.34 |
| Can't see a check box on the splash screen though | 08:24.52 |
| (on my VM) | 08:24.58 |
| OK, it's in preferences | 08:25.51 |
chrisl | just about to say that..... | 08:26.02 |
kens | :-) | 08:26.12 |
| Still waiting for build to finish.... | 08:26.31 |
chrisl | Given the bandwidth limitations, they are a pointless waste of time drawing onto the screen. | 08:26.44 |
kens | Yes, definitely | 08:26.54 |
Robin_Watts | kens: /home/marcos/cluster/users/ken | 08:30.49 |
kens | Robin_Watts : Nope | 08:30.59 |
| It's not the code the cluster is using | 08:31.12 |
Robin_Watts | Maybe /home/marcos/cluster/ghostpdl then? | 08:32.12 |
| but that will be "whatever the last code run was" | 08:32.19 |
kens | Hmm, I'll look | 08:32.22 |
| Last code run won't help me | 08:32.30 |
| Well, it will right now | 08:32.35 |
chrisl | I thought each node *only* had the last run code on them? | 08:33.22 |
kens | I thought it kept the 'user' versions | 08:33.54 |
chrisl | Hmm, I thought it was only "clustermaster" that kept each users code | 08:34.27 |
kens | doesn't know | 08:34.40 |
Robin_Watts | If the code in user isn't your latest, then there is only 1 other copy. | 08:35.10 |
kens | Robin_Watts : neither of those directories has the code I last ran in it | 08:35.24 |
| On Peeves | 08:35.35 |
Robin_Watts | (Do 'locate ghostscript.vcproj' and look at the locations shown in /home/marcos/cluster/) | 08:35.39 |
kens | The casper code is correct though | 08:35.41 |
Robin_Watts | How can I tell if it's your code or not ? | 08:36.28 |
kens | If you look at gdevpdf.c | 08:36.41 |
| date/time stamp should be enough | 08:36.59 |
chrisl | kens: are you planning to get an x64 Linux VM in the near future? | 08:37.16 |
kens | chrisl I wasn't planning to | 08:37.37 |
| I think I need to update Windows to 64-bit first | 08:37.51 |
Robin_Watts | kens: No. you can run a 64bit VM on a 32bit host. | 08:38.07 |
chrisl | I wonder if I should do a bit of digging and find out about using gdbserver, then | 08:38.08 |
| Robin_Watts: don't go there - kens doesn't believe in such voodoo! | 08:38.40 |
kens | Robin_Watts : You can also look in gdevpdf.c in the function pdf_close. | 08:38.51 |
| Robin_Watts : Have you tried that on VMWare ? | 08:39.04 |
chrisl | It also depends on your hardware | 08:39.32 |
Robin_Watts | I may have. Certainly it was VMware's capability I was talking about. | 08:39.38 |
| chrisl: Yes, your hardware needs to be 64bit capable. | 08:39.48 |
kens | hardware is capable, Windows isn't | 08:40.00 |
Robin_Watts | I updated to Windows 7 64bit a while ago, and it's great. | 08:40.23 |
| tor8: Hey! Welcome back! | 08:40.28 |
chrisl | Robin_Watts: there's more to it than that - you need some of the virtualisation extensions, which not all 64 bit capable CPUs include. | 08:40.30 |
tor8 | Robin_Watts: *yawn* morning | 08:40.39 |
kens | would prefer to avoid Windows 7. But may have to in order to avoid Windows 8 | 08:40.48 |
Robin_Watts | When did you get back? | 08:40.53 |
tor8 | just past midnight yesterday | 08:41.06 |
Robin_Watts | Windows 7 is WAY better than vista, and is pretty much as good as XP. | 08:41.25 |
| How are your feet ? | 08:41.51 |
tor8 | still a bit swollen from all the walking :) | 08:42.11 |
| I'm planning to stay indoors for a week now! | 08:42.59 |
chrisl | tor8: that's normal for you, isn't it? Indoors for weeks at a time? | 08:43.31 |
tor8 | chrisl: cave dweller I am, yes. | 08:44.02 |
chrisl | ;-) | 08:44.06 |
tor8 | The Sun! It burns! | 08:44.17 |
chrisl | You do, however, take holidays, which is an almost entirely alien concept for me...... | 08:45.28 |
Robin_Watts | tor8: Time to investigate online shopping deliveries. | 08:45.33 |
tor8 | Robin_Watts: eh? | 08:46.13 |
| ah! | 08:46.34 |
Robin_Watts | I find that the advent of Tescos home delivery corresponded nicely with me needing to leave the house less. | 08:46.56 |
| tor8: We had some issues reported with the rc1. | 08:47.49 |
| Firstly, the windows binary release missed out mupdfclean, mupdfextract etc. | 08:48.32 |
tor8 | Robin_Watts: I saw you have some fixes on the master. One thing I want to fix before 1.0 proper is not linking in the font and cmap data with pdfclean. | 08:48.35 |
Robin_Watts | tor8: I looked into that. | 08:48.44 |
| And it's not easy. | 08:48.52 |
kens | Ah crap. | 08:49.03 |
| Casper doesn't have the same problem as peeves. | 08:49.13 |
tor8 | I bet fz_run_page -> pdf_run_page is the culprit | 08:49.22 |
kens | Same source, same file, same configuration works properly | 08:49.30 |
Robin_Watts | No, it's the fz/pdf_document thing that's the problem. | 08:49.47 |
tor8 | pdf_init_document() adds the dependency... | 08:50.14 |
Robin_Watts | Because all the pdf specifics hang off pdf_document, it means as soon as you use any part of that interface, it pulls the whole interface in. | 08:50.25 |
tor8 | yeah. | 08:50.36 |
| ugh. | 08:50.38 |
Robin_Watts | Frankly, I don't consider it a problem. | 08:50.48 |
| I added a metadata interface and put back "Document Properties..." | 08:52.16 |
| And there is some crufty trig stuff in there now that eliminates (I hope) the indeterminisms between cluster nodes. | 08:53.02 |
| Someone also suggested we could run upx on the binaries we ship. | 08:53.38 |
tor8 | Robin_Watts: the comment on FZ_META_INFO ... UCS string? | 08:53.58 |
| do you mean UTF8 or UTF16/UCS2? | 08:54.06 |
| personally I would have gone with strings as keys rather than enums | 08:55.08 |
Robin_Watts | UTF8, sorry, nice spot. | 08:55.16 |
| They are strings, not enums ? | 08:55.27 |
| Oh, you mean for the meta interface itself? | 08:55.37 |
tor8 | yeah, for the key enums | 08:55.46 |
Robin_Watts | I disagree. string comparisons are slow. | 08:55.56 |
tor8 | meta data querying is rarely a performance bottleneck, and the PDF objects are all about string comparisons with the dictionaries :) | 08:56.27 |
Robin_Watts | You can always use enums like: 0x494E464F /* 'INFO' */ | 08:56.37 |
| But the meta data interface is NOT just for PDF operations. | 08:56.53 |
tor8 | string keys are more easily extensible without changing the enum definitions | 08:57.15 |
| but I'm not particularly opinionated about it | 08:57.26 |
Robin_Watts | for enums you just add new ones to the list, and they go on the switch. | 08:57.38 |
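The "enum that still reads like a string" idea can be sketched as a four-character tag packed into a 32-bit integer. This is a minimal illustration only: `TAG4`, `META_KEY_INFO`, `META_KEY_FORMAT` and `tag_to_string` are hypothetical names, not part of MuPDF, and the byte order chosen (first character in the most significant byte) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four ASCII characters into one 32-bit tag, first character in the
 * most significant byte. */
#define TAG4(a, b, c, d) \
    (((uint32_t)(unsigned char)(a) << 24) | \
     ((uint32_t)(unsigned char)(b) << 16) | \
     ((uint32_t)(unsigned char)(c) << 8) | \
     (uint32_t)(unsigned char)(d))

/* Hypothetical metadata keys; the names are illustrative, not MuPDF's. */
enum meta_key {
    META_KEY_INFO = TAG4('I', 'N', 'F', 'O'),
    META_KEY_FORMAT = TAG4('F', 'R', 'M', 'T')
};

/* Unpack a tag back into a NUL-terminated string, e.g. for debug output. */
void tag_to_string(uint32_t tag, char out[5])
{
    assert(out != NULL);
    out[0] = (char)((tag >> 24) & 0xFF);
    out[1] = (char)((tag >> 16) & 0xFF);
    out[2] = (char)((tag >> 8) & 0xFF);
    out[3] = (char)(tag & 0xFF);
    out[4] = '\0';
}
```

Keys built this way still compare and switch as plain integers, while remaining readable in a debugger or hex dump. They do not address tor8's extensibility point: a new key still means a new enumerator.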
tor8 | hm, is there a metadata querying interface for the javascript bindings? | 08:57.45 |
Robin_Watts | No idea. | 08:58.01 |
| but I see where you're going. If there is, we should look at it. | 08:58.13 |
tor8 | that may be worth looking into (or asking paulgardiner about) | 08:58.17 |
Robin_Watts | I'll change the UCS -> UTF8 and add a note that this interface is still very experimental and is subject to change. | 08:59.35 |
| And we have an issue with font matching. | 09:01.00 |
| It seems that MuPDF treats matching fonts and matching cid fonts in the same way. | 09:01.22 |
| and that's bad, cos fonts and cid fonts have different encodings. Or something like that. Chrisl ? | 09:01.45 |
tor8 | Robin_Watts: we have separate lookup_substitute_font for cjk and simple fonts | 09:02.20 |
| so not sure what you mean there | 09:02.36 |
chrisl | This isn't a CJK font - it's Arial, but Arial as a CID font | 09:02.47 |
| But MuPDF substitutes this Arial CIDFont with the *font* Arial, and bad things happen | 09:03.38 |
Robin_Watts | tor8:http://bugs.ghostscript.com/show_bug.cgi?id=692943 | 09:11.48 |
| I was vaguely planning to have a prod at that today, but you may be better placed than me. | 09:12.06 |
| But if you want to ease back into it, I can have a look. | 09:12.49 |
tor8 | Robin_Watts: ugh. non-embedded Identity-H truetypes. that's NEVER going to work well. | 09:14.54 |
Robin_Watts | Actually, I should look at: http://bugs.ghostscript.com/show_bug.cgi?id=692990 | 09:15.12 |
tor8 | Robin_Watts: I have a suspicion about that one | 09:15.57 |
Robin_Watts | go on... | 09:16.18 |
tor8 | it depends on how stroking is implemented | 09:16.26 |
| the "naive" way that 90% of software uses (including mupdf) gives the result as shown | 09:16.48 |
| adobe reader seems to use a different algorithm for stroking, which deals with these special cases | 09:17.06 |
Robin_Watts | I'm going to give into the shower. will take the file to bits afterwards. | 09:17.38 |
tor8 | we've had this bug before. apple preview gives the same (broken) rendering as mupdf here. | 09:17.56 |
| the way we join line segments makes little loops, that in most cases are fine due to the winding, but when the line width is too big and the segment too short and the angle too steep, the loop overlaps itself and causes the white areas | 09:19.12 |
| IIRC, the loop from one join partially overlaps the loop on a neighboring join | 09:19.49 |
miha | sorry for being noob, but how to build mupdf without graphic interface.. no X there | 09:20.55 |
| i just need command line | 09:21.04 |
kens | 'make' ? | 09:23.43 |
tor8 | miha: make NOX11=yes | 09:24.52 |
kens | Time for me to go, see you all later | 09:25.39 |
Robin_Watts | tor8: So... why are we drawing strokes with even odd windings? | 09:26.48 |
| actually, ignore that question. I need to look at the code. | 09:27.28 |
| tor8: So, when we stroke we're working at the gel level rather than the path level (i.e. the output of the stroking phase is an updated gel, not a new path) | 09:45.39 |
tor8 | Robin_Watts: the stroking goes from path to edge list directly, not via another path | 09:47.59 |
Robin_Watts | So for each line segment of the path we add 2 'edges', plus various edges for the joins/caps | 09:48.17 |
tor8 | yup | 09:48.27 |
Robin_Watts | Still can't visualise the problem. | 09:52.41 |
| Do you have a simple example ? | 09:52.48 |
| (If not, don't worry, I'll take the example file to bits.) | 09:53.09 |
tor8 | I'll see if I can cook up an example, gimme a few mins | 09:54.33 |
Robin_Watts | Morning paulgardiner | 10:04.58 |
paulgardiner | Hi Robin | 10:05.08 |
Robin_Watts | You know we added the fz_document interface to mupdf to abstract all the file formats into a single interface? | 10:05.44 |
| Well, in doing so, we lost the ability to do PDF specific operations, like poking around in the xref->info stuff to find out who the creator was etc. | 10:06.26 |
| so I've added a metadata interface to the fz_document stuff, so we can get that info back so we can once again have a "Document Properties..." dialog on the viewer. | 10:07.18 |
| tor8 raised the idea that there might be a metadata interface in the javascript stuff? If so we should probably use a similar formulation. | 10:07.50 |
| Have you seen any such thing in your trips through the spec ? | 10:08.02 |
paulgardiner | No, but then I'm not sure I'm yet completely understanding what sort of things one might interrogate through the metadata. | 10:09.23 |
| I hadn't noticed the Document Properties dialog on the viewer. | 10:10.10 |
Robin_Watts | Well, I see it as an arbitrarily extensible interface, but to start with, we have things like: | 10:10.10 |
| Me either. I only noticed it when people complained it had gone :) | 10:10.27 |
paulgardiner | :-) | 10:10.34 |
Robin_Watts | 1) "What format is the document?" (answer: "PDF 1.7" or "XPS" etc) | 10:10.52 |
| 2) "What encryption is used?" (answer "None" or "AES 128bit" etc) | 10:11.23 |
| 3) "What is the Creator of this document?" (answer "Acrobat Distiller") | 10:11.48 |
paulgardiner | So far I've been adding the forms API at the fz level, but I guess it would apply only to PDF. Maybe that's bad. | 10:11.59 |
Robin_Watts | 4) "What is the Modified date?" etc. | 10:12.07 |
| paulgardiner: It would be bad to add stuff that is PDF specific (i.e. to do with the actual structure of a PDF file) at the fz level. | 10:12.41 |
paulgardiner | Another approach is to be able to ask an fz_document for other interfaces, which may be present or not. | 10:13.13 |
Robin_Watts | but, suppose we added HTML or an XML forms format reader to MuPDF. | 10:13.17 |
| paulgardiner: COM, yeah. | 10:13.23 |
| I'm not a HUGE fan of COM, but I can see it may have its uses. | 10:13.43 |
paulgardiner | I had a very simple version of COM in Unidisp and it worked really well. On the other hand the UE2 stuff was a nightmare | 10:14.30 |
Robin_Watts | Yes, and yes :) | 10:14.39 |
tor8 | Robin_Watts: %! | 10:14.49 |
| 50 setlinewidth 100 100 moveto 110 100 lineto 105 110 lineto stroke | 10:14.49 |
| showpage | 10:14.49 |
paulgardiner | Oh Hi Tor. Welcome back. | 10:15.17 |
Robin_Watts | My first cut is a simple metadata interface as used in EStreams (or in ghostscript in gxdso). | 10:15.28 |
tor8 | Hi Paul! | 10:15.29 |
Robin_Watts | int fz_meta(fz_document, int operation, void *ptr, int size); | 10:16.08 |
| So you can do COM with that. | 10:16.39 |
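A call of the shape `int fz_meta(fz_document, int operation, void *ptr, int size)` can be sketched as a single dispatch function per document type that fills a caller-supplied buffer. Everything below is hypothetical: the `demo_` names, the key values and the return convention (bytes written, or -1 for an unsupported key or a too-small buffer) are assumptions made for illustration, not MuPDF's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation keys. */
enum { DEMO_META_FORMAT = 1, DEMO_META_ENCRYPTION = 2 };

typedef struct demo_document demo_document;
struct demo_document {
    /* Each document type supplies its own handler, or NULL for none. */
    int (*meta)(demo_document *doc, int key, void *ptr, int size);
};

/* Copy a string (with its terminator) into the caller's buffer. */
static int copy_string(const char *s, void *ptr, int size)
{
    size_t n = strlen(s) + 1;
    if (ptr == NULL || size < 0 || (size_t)size < n)
        return -1;
    memcpy(ptr, s, n);
    return (int)n;
}

/* A PDF-flavoured handler answering two of the questions listed above. */
static int pdf_meta(demo_document *doc, int key, void *ptr, int size)
{
    (void)doc;
    switch (key) {
    case DEMO_META_FORMAT:     return copy_string("PDF 1.7", ptr, size);
    case DEMO_META_ENCRYPTION: return copy_string("None", ptr, size);
    default:                   return -1; /* key not understood */
    }
}

/* Format-independent entry point used by the viewer. */
int demo_meta(demo_document *doc, int key, void *ptr, int size)
{
    assert(doc != NULL);
    if (doc->meta == NULL)
        return -1;
    return doc->meta(doc, key, ptr, size);
}
```

A viewer would call `demo_meta(&doc, DEMO_META_FORMAT, buf, sizeof buf)` and treat -1 as "not available for this document". Because `ptr` is untyped, the same entry point could also hand back a pointer to a table of functions, which is presumably the sense in which such a call can "do COM".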
paulgardiner | How do you mean? | 10:17.03 |
| Do you mean a fz_meta call could return an interface if we wished? | 10:17.43 |
| I'd imagine that some operations would work best with metadata and other best with COM. If you just want to quickly grab some attribute of the document, then metadata is best. If you want to make an extended sequence of document-type-specific calls, then COM would be better. | 10:22.09 |
| I was tempted to suggest putting in a COM like thing for forms. The idea was to first ask for an "interaction" interface. That would come back non-null only if the underlying doc was interactive (e.g., had forms). | 10:23.57 |
Robin_Watts | paulgardiner: Yes, I was imagining that we could have a fz_meta call to get a particular interface. | 10:24.20 |
paulgardiner | Ok. Makes sense. | 10:24.46 |
Robin_Watts | but then, if we like COM, it may make more sense to go for COM from the get go. | 10:25.08 |
| The only risk I see with COM, I think is that we can end up with a proliferation of different interfaces. | 10:25.37 |
| Take for example, the question of things like "modified date", "creator", "producer" etc. | 10:25.56 |
paulgardiner | Lots of interfaces can be nice if they really are orthogonal - if each one can be understood in its own right. | 10:26.49 |
Robin_Watts | With PDF the temptation would be to have an interface to get the xref->info dictionary, which we'd then do PDF specific calls on to get the results. | 10:26.53 |
paulgardiner | Yes. That might be nice. | 10:27.20 |
Robin_Watts | But if we then came to add that functionality to XPS, we'd end up with a different beast. | 10:27.42 |
| I guess it's a matter of picking the nicest interface. | 10:28.06 |
| I'd be up for using COM (or a COM-like interface style) I think. It's not a style that we've embraced throughout MuPDF to date though, but that probably doesn't matter. | 10:29.29 |
paulgardiner | It's difficult to predict up front whether it would be OTT or not. For Unidisp it was an easy choice because there were already loads of slightly different clients all carrying around each others baggage. | 10:31.02 |
Robin_Watts | So... to try this out, I'd need to change fz_meta to be fz_get_interface and to define a first interface that PDF would implement to do the stuff currently done by fz_meta calls. | 10:35.06 |
| Forgive the newbie COM question, but... In COM (and COM-like) things I've seen before, *every* interface has a get_interface call. | 10:36.13 |
| Do we need that? | 10:36.26 |
paulgardiner | Hang on. I've forgotten how it worked. | 10:36.52 |
Robin_Watts | Ha! I'm glad I'm not the only one :) | 10:37.30 |
paulgardiner | I don't think we need that. Probably just the fz_document needs the get interface call. | 10:37.54 |
| Maybe every interface needs a "drop" or "destroy" call. | 10:38.17 |
Robin_Watts | Why? | 10:44.13 |
paulgardiner | ... or maybe it doesn't. :-) | 10:44.40 |
Robin_Watts | Many interfaces won't need it (if you're returning a static set of pointers etc) | 10:44.47 |
| Some might, but they can provide it. | 10:44.59 |
paulgardiner | Yes I think you're right. | 10:45.17 |
Robin_Watts | OK. | 10:45.25 |
| I could imagine that having everything offer 'getInterface', 'dropInterface' would be useful if you wanted to be able to do an automated crawl of the available interfaces, but.... | 10:46.00 |
| You'd probably want 'enumerateInterface' in that case too though. | 10:46.22 |
tor8 | my KISS alarm just went off... | 10:46.26 |
Robin_Watts | yeah. | 10:46.32 |
paulgardiner | Best keep it as simple as possible. I'm trying to remember some of the tricks used with Unidisp. There were some very useful things that could be done by overriding interfaces of existing objects, but I think it's best to ignore any of that until a use for it comes up | 10:46.37 |
Robin_Watts | tor8: I think "simple COM" can be very simple. Do you have a feeling for this? | 10:47.17 |
paulgardiner | In its simplest form it's just returning a struct full of function pointers. | 10:47.50 |
| more or less | 10:48.00 |
tor8 | not very familiar with COM, but from what I recall of it, it's basically opaque objects (as in OO) with methods that you can enumerate at runtime? | 10:48.28 |
paulgardiner | I think in typical COM, an enumeration interface is just one more thing that an object might or might not provide. | 10:48.51 |
tor8 | I guess its main use is as a layer to maintain binary compatibility between DLLs and drivers and client/server RPC/IPC stuff | 10:49.22 |
| but I haven't drunk from the OO koolaid well in over 10 years, so my memory of these things is a bit hazy | 10:49.56 |
Robin_Watts | tor8: I think what we're proposing here, is that instead of fz_meta, we have fz_get_interface(); where we can say "Give me the 'Forms' interface on this document" or "Give me the 'Information' interface on this document". The interfaces are identified from an enumerated type. | 10:50.36 |
paulgardiner | tor8: Oh yeah, binary compatibility. I forgot about that. I think Robin and I have been talking about the interfaces aspect of it rather than that part. | 10:50.57 |
Robin_Watts | and different document types will only implement an interface if they need it. | 10:51.18 |
tor8 | if you didn't care about binary (and dynamic linking with different implementations) you'd just slap on an 'interface' declaration on your java class and be done with it :) | 10:51.40 |
Robin_Watts | (and if the document in question supports it - so a forms interface would only be available for documents with forms etc). | 10:52.05 |
paulgardiner | tor8: maybe that's what we mean really: the equivalent in C of Java's interfaces. | 10:52.44 |
tor8 | fz_metadata *meta = fz_metadata(doc); title = fz_title(meta); ... is that the sort of scheme you're considering? | 10:52.49 |
| where fz_metadata could be a struct of function pointers or similar | 10:53.35 |
Robin_Watts | fz_doc_info *doc_info = fz_get_interface(doc, FZ_INTERFACE_DOC_INFO); title = fz_title(doc_info); | 10:53.53 |
| yes | 10:53.55 |
tor8 | ugh, why the get_interface abstraction though? | 10:54.12 |
Robin_Watts | The key thing is that we can extend the interfaces available without changing the fz_document class. | 10:54.57 |
| i.e. if we want to add a forms interface, that's a new FZ_INTERFACE_FORMS enumeration key, but fz_document doesn't change at all. | 10:55.25 |
paulgardiner | So fz_title would probably be fz_doc_info_title and could be defined as part of the separate fz_doc_info interface | 10:55.43 |
tor8 | so basically, taking binary compatibility to an extreme :) | 10:55.49 |
| not sure if it's worth it | 10:55.55 |
Robin_Watts | tor8: The alternative would be to have fz_document_get_forms_interface(), fz_document_get_doc_info_interface(), fz_document_get_some_other_interface() etc. | 10:57.05 |
paulgardiner | Ah, I've just remembered why I shied away from this for forms when considering it before: at the implementation level all you gain is avoidance of indirecting all these calls through the xref struct of virtual functions. | 10:57.21 |
| So you avoid making xref big. That matters if you envisage having 1000s of xref objects around but less so if only a few. | 10:58.18 |
Robin_Watts | paulgardiner: Right, but with the fz_document abstraction, you don't have an xref object that the viewer can call. | 10:58.34 |
| And we really don't want to bloat the fz_document more than we have to. | 10:58.48 |
| COM keeps interfaces small and well defined. | 10:58.59 |
paulgardiner | Sorry I meant fz_document. | 10:59.04 |
tor8 | Robin_Watts: alternatively adding a lot more functions to fz_document, one for each interface. but having separate interface function pointer structs (interface objects) does seem like a nice idea. | 10:59.09 |
paulgardiner | That's the thing. At the implementation level all we avoid is bloating fz_document. | 10:59.38 |
Robin_Watts | tor8: I was unconvinced by COM initially, but having used it in unidisp, it worked really nicely. | 10:59.50 |
| The 'cost' of COM is the fz_get_interface(doc, INTERFACE_ENUMERATION) call. | 11:00.40 |
tor8 | paulgardiner: a second thing is lessening the amount of work to write new object types; not having to implement all interface functions. but that can be lessened (as we already do) by providing no-op default implementations. | 11:00.45 |
paulgardiner | Yes true. | 11:00.58 |
Robin_Watts | but that's a small cost compared to its benefits I think. | 11:01.16 |
tor8 | having each 'interface' as a separate thing, you could easily just say a certain type doesn't implement the interface and all is easy | 11:01.26 |
paulgardiner | And maybe conceptually it's cleaner not to have lots of fz_document functions that in many cases don't apply | 11:01.30 |
Robin_Watts | I have to pop back to vets to pick up Ruskin. back in a bit. | 11:01.36 |
tor8 | but that's hardly relevant. I think I'd prefer having separate interfaces for code cleanliness reasons. | 11:01.56 |
Robin_Watts | tor8: That's what you get with COM. You ask for an interface on a document, and if it's not implemented... you get NULL. | 11:01.57 |
tor8 | but not convinced about get_interface with an enum ... too much faffing about with architecture and patterns and crap and not enough about just getting things done in the simplest way | 11:03.11 |
| if we didn't have MuXPS we probably wouldn't even be having this discussion though | 11:03.45 |
paulgardiner | I was putting negative arguments only as devil's advocate really. I mostly like the idea of separate interfaces. My first cut of the forms API had a call to get an fz_interactive interface from the fz_document, and then the calls to access widgets took fz_interactive in place of fz_document. | 11:03.48 |
tor8 | I think separate interface types for things like the forms support are the way to go | 11:04.37 |
| if only to keep them out of fz_document | 11:04.48 |
paulgardiner | Oh ok good. | 11:04.58 |
Robin_Watts | So, the question is how we get those interfaces, right? | 11:05.39 |
tor8 | so a fz_form interface with all the interactive form functions is good, and a fz_document function to retrieve it | 11:05.41 |
paulgardiner | The enumerated type may have advantages: it means we need only one get-interface call added to fz_document, rather than one per interface | 11:05.53 |
Robin_Watts | what paul said. | 11:06.05 |
tor8 | at the moment, I'd just fz_document_<interface-name>() because it's the most straight forward | 11:06.27 |
Robin_Watts | So you prefer multiple get_interface calls, and I prefer just 1 with an enumeration. | 11:07.29 |
paulgardiner | Could make that a macro that calls get_interface with the correct enum maybe | 11:07.33 |
Robin_Watts | so it sounds like we're close at least. | 11:07.46 |
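The two positions (one enum-keyed lookup versus one getter per interface) can be combined roughly as follows. This is a sketch under stated assumptions, not MuPDF code: the `demo_` types, the enum values and the wrapper macro are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface identifiers. */
enum demo_interface_id { DEMO_INTERFACE_DOC_INFO, DEMO_INTERFACE_FORMS };

typedef struct demo_doc demo_doc;

/* One small interface: just a struct of function pointers. */
typedef struct {
    const char *(*title)(demo_doc *doc);
} demo_doc_info;

struct demo_doc {
    const char *title;
    /* The only hook the document type itself has to provide. */
    const void *(*get_interface)(demo_doc *doc, enum demo_interface_id id);
};

/* A PDF-like document type that implements doc info but not forms. */
static const char *pdf_title(demo_doc *doc) { return doc->title; }
static const demo_doc_info pdf_doc_info = { pdf_title };

static const void *pdf_get_interface(demo_doc *doc, enum demo_interface_id id)
{
    (void)doc;
    if (id == DEMO_INTERFACE_DOC_INFO)
        return &pdf_doc_info;
    return NULL; /* interface not implemented for this document */
}

/* Single enum-keyed lookup. */
const void *demo_get_interface(demo_doc *doc, enum demo_interface_id id)
{
    assert(doc != NULL);
    return doc->get_interface ? doc->get_interface(doc, id) : NULL;
}

/* Per-interface getter written as a macro over the single lookup. */
#define demo_get_doc_info(doc) \
    ((const demo_doc_info *)demo_get_interface((doc), DEMO_INTERFACE_DOC_INFO))
```

Callers get the typed, per-interface spelling, while adding a new interface only adds an enumerator and a macro; the document struct itself does not grow. Asking for an interface the document lacks simply yields NULL.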
paulgardiner | The other thing is whether to declare each interface in a separate file, or stick with just fitz.h | 11:09.42 |
tor8 | Robin_Watts: we may want to make the fz_document an interface in the same style, to avoid the "fz_document super" voodoo. but one thing at a time. | 11:09.50 |
| paulgardiner: for now, just fitz.h | 11:10.06 |
| we have plans to split it later | 11:10.19 |
paulgardiner | Ah right. ok | 11:10.29 |
tor8 | personally I don't like libraries that come with a dozen header files | 11:10.52 |
| but then, mupdf is a larger library than things like zlib and libjpeg | 11:11.17 |
Robin_Watts | back from the second vets visit of the day. 1 more to go. | 11:50.39 |
| tor8: The stroking example you gave earlier... what sort of join/cap gives problems with that? | 11:51.21 |
tor8 | Robin_Watts: convert to pdf and compare gs and mupdf rendering | 11:54.20 |
| Robin_Watts: open sample.ps on macosx will convert to a pdf you can save back out as pdf | 11:55.03 |
Robin_Watts | perfect, thanks. | 11:55.16 |
tor8 | Robin_Watts: I think that sample is related to the bug report, but it may be a similar but different problem. | 11:56.02 |
Robin_Watts | Ah, right, I understand the problem | 11:59.21 |
| partly cos I fiddled in this area in gs. | 11:59.37 |
| There is a simple fix. | 12:03.24 |
| but it involves adding more lines to the gel (roughly doubling the number) | 12:03.43 |
| Which may have a disastrous effect on rendering speed. | 12:04.53 |
tor8 | worth attempting and measuring :) | 12:12.39 |
| depends on how complicated the non-simple fix is | 12:13.13 |
naveen_ | Hi Robin, I'm using the following code to get md5 in MuPDF | 12:22.29 |
| unsigned char digest[16]; int blockNum = 0; fz_md5 md5; fz_md5_init(&md5); fz_md5_update(&md5,(const unsigned char*)blockNum,32); fz_md5_final(&md5, digest); | 12:22.39 |
| is this correct? | 12:22.55 |
Robin_Watts | Looks reasonable to me. | 12:23.04 |
| That's basically what I do in fz_md5_pixmap | 12:24.11 |
naveen_ | I'm getting segmentation fault with the above code... | 12:25.03 |
| in fz_md5_update | 12:25.20 |
Robin_Watts | is blockNum a pointer ? | 12:26.07 |
naveen_ | No it is int | 12:26.27 |
Robin_Watts | Ah, well, in the above code, you'd need blockNum to be a pointer to 32bytes of memory. | 12:26.52 |
| Maybe you mean fz_md5_update(&md5, (const unsigned char *)&blockNum, sizeof(int)); ? | 12:27.22 |
| (presumably you're going to have more going into the md5 than that, right?) | 12:27.49 |
naveen_ | yes.. | 12:27.49 |
| I'm a java guy with no experience in C.....how can I convert int to unsigned char *? | 12:29.27 |
Robin_Watts | If you do &blockNum, that will give you an (int *) (i.e. a pointer to the int) | 12:30.38 |
| chars are smaller things than ints, so it's safe to do (unsigned char *)&blockNum, which will give you a byte pointer to the storage used for the int. | 12:31.39 |
| In C "&foo" means "Address of the storage used for foo". | 12:32.10 |
| So: fz_md5_update(&md5, (const unsigned char *)&blockNum, sizeof(int)); | 12:33.13 |
| will feed the block of bytes used to store blockNum into the md5 update function. (i.e. it will include the value of blockNum in the calculated md5 sum). | 12:33.53 |
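The address-of and cast steps described above can be seen in isolation in a small helper that copies out the bytes an `int` is stored in. `int_native_bytes` is a hypothetical name used only for this sketch; the order of the bytes it produces depends on the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the bytes that store `value` into `out`, in whatever order this
 * machine keeps them. Returns the number of bytes copied, or 0 if the
 * output buffer is too small. */
size_t int_native_bytes(int value, unsigned char *out, size_t out_size)
{
    /* &value is an (int *); the cast views the same storage as bytes. */
    const unsigned char *p = (const unsigned char *)&value;
    size_t i;

    assert(out != NULL);
    if (out_size < sizeof value)
        return 0;
    for (i = 0; i < sizeof value; i++)
        out[i] = p[i];
    return sizeof value;
}
```

Passing `(const unsigned char *)&blockNum` and `sizeof(int)` to a hashing routine feeds it exactly these bytes, so the resulting digest is tied to the size and byte layout of `int` on that platform.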
naveen_ | I tried this and got the values in md5 digest for blockNum =0 as 241 211 255 132 67 41 119 50 134 45 242 29 196 229 114 98 | 12:36.04 |
Robin_Watts | ok... | 12:36.30 |
naveen_ | but in java I get 74 -25 19 54 -28 75 -7 -65 121 -46 117 46 35 72 24 -91 | 12:36.40 |
Robin_Watts | Do you need the C to match the java? | 12:37.32 |
| The representations for ints may be different in C and java. | 12:37.50 |
naveen_ | ohh...I was assuming both the md5 will be same.. | 12:38.08 |
Robin_Watts | For C you'll be relying on whether it's big or little endian storage. | 12:38.20 |
| Java uses big endian always. For C it will vary between machines. On an x85/amd64 machine it will be little. | 12:39.06 |
| oops, x86, sorry. | 12:39.18 |
| Do you need the two to match? | 12:39.25 |
naveen_ | ohh..I didn't know this... | 12:40.09 |
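One way to take the machine's byte order out of the picture is to build the bytes explicitly with shifts before hashing them. The helper below is a minimal sketch with a hypothetical name (`u32_to_big_endian`); it emits the most significant byte first, which is the order Java's `DataOutputStream.writeInt` and a default `ByteBuffer` use. Whether the Java side in this conversation hashes exactly these four bytes is not known from the log, so matching digests are not guaranteed by this step alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write `value` as four bytes, most significant first, regardless of how
 * the host stores integers in memory. */
void u32_to_big_endian(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)(value & 0xFF);
}
```

The four bytes in `out` can then be passed to the hash update call with a length of 4, giving the same input on little-endian and big-endian machines.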
| I'm using the above calculated digest to decrypt a block encrypted in AES using the following code | 12:40.36 |
| fz_aes aes; aes_setkey_dec(&aes,(const unsigned char*) m_key, sizeof(m_key)); unsigned char encryptedBuffer[64*1024]; unsigned char decryptedBuffer[64*1024]; aes_crypt_cbc(&aes,AES_DECRYPT,bytes_read,digest,encryptedBuffer,decryptedBuffer); | 12:40.39 |
| is this correct? | 12:41.16 |
Robin_Watts | I am not familiar with the aes code really, but it looks plausible. | 12:42.07 |
| So walk me through this... | 12:42.26 |
| You get a blockNum given to you. | 12:42.32 |
| You make a digest from the blockNum. | 12:42.39 |
| then you feed the digest into the AES decryption code as a key. | 12:42.58 |
naveen_ | digest is not fed as key..it is fed as IV | 12:43.45 |
Robin_Watts | IV? | 12:44.12 |
naveen_ | InitializationVector | 12:44.26 |
Robin_Watts | oh, ok. Sorry, my bad terminology. | 12:44.41 |
| I meant it's passed into the AES decryption thing as part of its initial state. | 12:44.57 |
| So, is the blockNum the SOLE contributor to the digest ? | 12:45.25 |
naveen_ | yes.. | 12:45.33 |
Shyren | chrisl: someone mentioned yesterday that hpdj340 supports 300 ppi but when I give -r300 it says it doesn't support that. | 12:45.34 |
Robin_Watts | naveen_: OK. Give me a mo and I'll get you some code. | 12:45.54 |
naveen_ | ok thanks.. | 12:46.07 |
Robin_Watts | Shyren: I said that the HP DJ 340 printer supports 300dpi color. Not that the device does. | 12:46.20 |
kens | Shyren we are not HP support | 12:46.23 |
Shyren | Robin_Watts: sorry I misunderstood | 12:47.02 |
kens | Also as I think has been mentioned, that device is a contributed device and we cannot support it. | 12:47.15 |
Robin_Watts | naveen_: http://pastebin.com/KKP0cbb5 | 12:48.57 |
| That's a crappy routine that should help. You can do: fz_md5_int(&md5, blockNum); | 12:49.30 |
| Oh, wait... | 12:49.32 |
| naveen_: Fixed version: http://pastebin.com/6kXyLg1d | 12:50.13 |
| tor8: http://bugs.ghostscript.com/show_bug.cgi?id=688655 <- That's the ghostscript bug for the same thing. | 12:51.12 |
chrisl | Shyren: I really think your best bet is to either try to contact one of the authors of those hpdj devices, or take on the task of working out what needs added to them to support your device. | 12:51.56 |
Robin_Watts | I made some simple examples and looked at how acrobat rendered them, and it seems that acrobat doesn't do the 'underjoin'. | 12:52.31 |
| Most rendering apps assume that they can do a simple bevel join on the underside of the join. | 12:52.52 |
| And 99% of the time it works fine. | 12:53.02 |
tor8 | right | 12:54.07 |
Robin_Watts | But in actual fact, we shouldn't do the underjoin. | 12:54.44 |
| So I think the fix is quite simple and fairly cheap - we just draw back to the centre point. | 12:55.03 |
| I'll try that after lunch. | 12:55.08 |
| naveen_: I'll be back in 30 mins or so. Let me know how you get on. | 12:55.45 |
naveen_ | ok Robin.. | 12:56.44 |
Shyren | chrisl: Thanks. I was able to print one of the documents with decent output using hpdj1120c with the settings you mentioned. On other documents it gives a "pdf14_compressed_encode_color devn_params not from pdf14 device" error. Same for other devices. Any ideas how to resolve this? Also how would I go about writing my own devices or changing existing ones...any pointers? | 12:59.39 |
chrisl | Shyren: as I said before, I would try converting to PS and then using the PS to generate the PCL3e to send to the printer. | 13:01.17 |
naveen_ | Robin, the fz_md5_int function is not compiling. It says argument of type "char *" is incompatible with parameter of type "const unsigned char *" | 13:04.07 |
chrisl | Shyren: as far as developing your own device is concerned - I wouldn't! I would look at the gdevdjtc.c and gdevdjet.c and work out what needs changed in there to work with your device. They are *much* simpler devices than the pcl3 one, and take a much simpler route to getting results. | 13:05.44 |
naveen_ | sorry Robin, I changed char c to unsigned char c..and it works...the md5 produced by this is the same as from the previously used fz_md5_update(&md5,(const unsigned char*)&blockNum,sizeof(int)); | 13:11.14 |
| But still i'm not able to decrypt the content...using the following code | 13:13.03 |
| fz_aes aes; aes_setkey_dec(&aes,(const unsigned char*) m_key, sizeof(m_key)); unsigned char encryptedBuffer[64*1024]; unsigned char decryptedBuffer[64*1024]; aes_crypt_cbc(&aes,AES_DECRYPT,bytes_read,digest,encryptedBuffer,decryptedBuffer); | 13:13.05 |
| I'm getting segmentation fault errors in aes_crypt_cbc function.. | 13:13.28 |
Robin_Watts | naveen_: back. | 13:32.06 |
| So, you have the md5 digest working, and coming out the same as from the java? | 13:32.24 |
naveen_ | No md5 is not same as java | 13:36.06 |
| it is different | 13:36.12 |
| in java I'm using the little_endian format | 13:37.24 |
Robin_Watts | oh. | 13:38.13 |
| Well, I'd imagine that not having it matching is a complete non-starter for this, right? | 13:38.56 |
| So... can you paste your java code? | 13:39.04 |
naveen_ | yaa | 13:40.14 |
| long[] md5_key = { blockNum, blockNum }; MessageDigest md = MessageDigest.getInstance("MD5"); byte[] md5KeyBytes = AgentUtils.convertLongToByteArray(md5_key); md.update(md5KeyBytes, 0, md5KeyBytes.length); return md.digest(); | 13:40.45 |
| public static byte[] convertLongToByteArray(long inputLong) { ByteBuffer byteBuffer = ByteBuffer.allocate(8); byteBuffer.order(ByteOrder.LITTLE_ENDIAN); LongBuffer longBuffer = byteBuffer.asLongBuffer(); longBuffer.put(inputLong); return byteBuffer.array(); } | 13:41.00 |
| this is how I do it in java | 13:41.09 |
Robin_Watts | OK, so in java, you're using 8 bytes. | 13:41.30 |
naveen_ | yaa | 13:41.51 |
Robin_Watts | So you want to call fz_md5_update(&md5, (const unsigned char *)&blockNum, sizeof(int)); twice then. | 13:42.59 |
| Oh, wait, no... | 13:43.43 |
| In java, what type is blockNum ? | 13:43.56 |
| long is 64bit signed in java. | 13:44.34 |
| So long [] md5_key = {blockNum, blockNum}; actually produces a 16 byte long object. | 13:45.03 |
Shyren | chrisl: Converting to PS and then using hpdj1120c gives decent output. I am only able to use 300ppi, but is there any way to use 600ppi as the device mentions? In pcl3 the device mentions 600ppi resolution. Again I know pcl3 is also contributed so any help is appreciated. | 13:45.29 |
Robin_Watts | If blockNum is a long in java, and an int in C, then you run the risk of not being able to represent all the required values in C. | 13:45.53 |
naveen_ | yaa.. thats right..so lets make blockNum as long in C too | 13:46.33 |
Robin_Watts | What platform are you on? | 13:46.53 |
naveen_ | windows | 13:47.02 |
Robin_Watts | long is not 64bits in C in many cases. | 13:47.02 |
| It's not 64bits on windows. | 13:47.10 |
chrisl | Shyren: I don't know, I'm afraid. Did you say you'd tried the "bare" pcl3 driver? | 13:47.22 |
naveen_ | I'm running the program in cygwin | 13:47.22 |
Robin_Watts | naveen_: But ultimately, where is this going to run ? | 13:47.46 |
naveen_ | android | 13:47.53 |
Robin_Watts | OK, so you want to be using int64_t really. | 13:48.09 |
| So you want int64_t blockNum; and then: | 13:48.36 |
| fz_md5_update(&md5, (const unsigned char *)&blockNum, sizeof(int64_t)); | 13:48.52 |
naveen_ | do I need to include any header for this? | 13:49.18 |
Robin_Watts | stdint.h probably. | 13:49.27 |
| #include <stdint.h> | 13:49.43 |
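A note on byte order: the Java side fills its buffer with ByteOrder.LITTLE_ENDIAN, while hashing `&blockNum` directly in C feeds MD5 whatever byte order the host CPU uses. The two agree on little-endian targets, but a minimal sketch that writes the eight bytes out explicitly avoids relying on that. `put_le64` is a hypothetical helper name, not part of MuPDF, and the `fz_md5_update` call appears only in a comment, using the signature quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a MuPDF function): store a 64-bit value as
 * 8 bytes, least significant byte first, independent of host endianness. */
static void put_le64(unsigned char out[8], int64_t value)
{
    uint64_t v = (uint64_t)value; /* well-defined for negative values too */
    size_t i;

    assert(out != NULL);
    for (i = 0; i < 8; i++) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Intended use, assuming the fz_md5_update signature quoted above:
 *
 *     unsigned char le[8];
 *     put_le64(le, blockNum);
 *     fz_md5_update(&md5, le, sizeof le);
 */
```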
Shyren | chrisl: with bare pcl3 the printer gets stuck so I used sub device hpdj1120c | 13:52.30 |
chrisl | Shyren: Okay, so look at the sources and find out what settings are implied by hpdj1120c and try to work out a combination that lets you still print, *and* use 600dpi. | 13:53.41 |
naveen_ | I'm getting the md5 same as in java except for negative numbers | 13:54.29 |
Robin_Watts | naveen_: That's fine. | 13:55.23 |
| In java bytes are signed. | 13:55.42 |
| so while in C we see 0...255, in java we see 0..127 then -128 to -1 | 13:56.02 |
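When comparing digests by eye, it can help to print the C bytes the way Java shows them. The following is a small sketch of that conversion; `as_java_byte` and `print_digest_java_style` are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reinterpret one byte the way Java's signed `byte` type displays it:
 * 0..127 stay as they are, 128..255 become -128..-1. */
static int as_java_byte(unsigned char b)
{
    return b < 128 ? (int)b : (int)b - 256;
}

/* Print a digest as space-separated signed values, one line per digest. */
static void print_digest_java_style(const unsigned char *digest, size_t len)
{
    size_t i;

    assert(digest != NULL);
    for (i = 0; i < len; i++)
        printf("%d%s", as_java_byte(digest[i]), i + 1 < len ? " " : "\n");
}
```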
naveen_ | yup..I forgot.. | 13:56.05 |
Robin_Watts | np. | 13:56.10 |
| So the md5 matches... now you said you were getting SEGVs ? | 13:56.28 |
| "GPL Ghostscript Lite 8.70" ??? | 13:58.25 |
kens | Yeah, dumb question | 13:59.03 |
naveen_ | yaa...I'm getting segmentation fault errors.. | 13:59.11 |
kens | I was going to say 'you aren't a commercial customer so you have not and have never had any support;' | 13:59.19 |
| But I thought I'd leave it to Marcos today | 13:59.55 |
| I guess it ought to go to Scott | 14:00.04 |
Robin_Watts | naveen_: OK, so paste the code ? | 14:00.32 |
chrisl | Oh, yeah, we're going to get a *lot* of cash out of Greece....... ;-) | 14:00.34 |
kens | :-) | 14:00.42 |
Robin_Watts | chrisl: We might, but it'd be in euros just before it all collapses. | 14:00.59 |
chrisl | I doubt they've got any euros left, either | 14:01.21 |
Robin_Watts | chrisl: Technically, it's Euro, not Euros. Specifically because "Euros" means "Urine" in greek, and they didn't think that would be good as a name for a currency. | 14:02.14 |
kens | Marcos is on it :-) | 14:02.52 |
chrisl | Robin_Watts: Hmm, "do you take the p*ss?", maybe "euros" *is* the right name for the currency..... | 14:04.08 |
naveen_ | fz_aes aes; aes_setkey_dec(&aes,(const unsigned char*) m_key, sizeof(m_key)); unsigned char encryptedBuffer[block_size]; unsigned char decryptedBuffer[block_size]; int bytes_read = 0; FILE *rp = fopen("C://WorkSpace//tmp//pdf.s2g", "r"); bytes_read = fread(encryptedBuffer,1,block_size,rp) aes_crypt_cbc(&aes,AES_DECRYPT,bytes_read,digest,encryptedBuffer,decryptedBuffer); | 14:04.32 |
| I'm getting error in aes_crypt_cbc function.. | 14:04.53 |
Robin_Watts | What type is m_key ? | 14:06.50 |
naveen_ | char m_key[16]; | 14:07.04 |
Robin_Watts | That looks plausible. | 14:08.59 |
| What is block_size ? | 14:09.08 |
naveen_ | is there any limitation on block size? | 14:09.59 |
Robin_Watts | Should be a multiple of 16. | 14:10.08 |
naveen_ | yaa it is a multiple of 16.. | 14:10.20 |
Robin_Watts | I can't immediately see why that would fail. | 14:10.57 |
| but I don't believe your paste. | 14:11.09 |
naveen_ | that is the exact code I'm trying here.. | 14:11.49 |
Robin_Watts | Then you're missing a ; after the fread. | 14:12.05 |
naveen_ | ok.. that was in a loop..I just copied it from there.. | 14:13.09 |
Robin_Watts | Right. | 14:13.16 |
| So, can you paste the ACTUAL code into a pastebin please? | 14:13.31 |
| not just the edited highlights :) | 14:13.45 |
naveen_ | how to paste in pastebin? | 14:13.48 |
Robin_Watts | open pastebin.com in a browser. | 14:14.05 |
naveen_ | I mean ..right here | 14:14.13 |
Robin_Watts | select the code in your editor, copy, move to pastebin.com, paste. Click the button and then copy/paste the URL into here. | 14:14.34 |
naveen_ | http://pastebin.com/8XvABqYr | 14:15.05 |
Robin_Watts | Add a printf("bytes_read=%d\n", bytes_read); before the aes_crypt_cbc call ? | 14:17.02 |
| block_num1 is never used. Nor is j, right ? | 14:17.34 |
naveen_ | I was getting the value in bytes_read.. | 14:17.35 |
Robin_Watts | what value ? | 14:17.41 |
| I want to know when it's crashing; is it dying on the first block? or on subsequent ones ? | 14:18.07 |
| I don't see an aes_set_key_dec call. | 14:18.37 |
naveen_ | 65536 | 14:18.38 |
| I only have one block in the file.. | 14:18.51 |
| so it is failing the very first time.. | 14:18.58 |
Robin_Watts | ok. | 14:19.02 |
naveen_ | I'm calling aes_setkey_dec before the loop..that was there in the code I previously pasted.. | 14:19.44 |
Robin_Watts | Can you paste the WHOLE code please. Just drop in the whole file. | 14:20.14 |
naveen_ | I'll paste the whole code except the key logic.. | 14:21.41 |
Robin_Watts | ok, miss out any commercially sensitive stuff, sure :) | 14:22.00 |
| tor8: I have a fix locally that makes that file work. Just cluster testing it now. | 14:22.34 |
naveen_ | http://pastebin.com/M6fXzzK4 | 14:23.08 |
Robin_Watts | naveen_: Humour me, and change the aes_setkey_dec line to: | 14:24.58 |
| aes_setkey_dec(&aes,(const unsigned char*) &m_key[0], 16); | 14:25.24 |
naveen_ | still I'm getting Segmentation fault (core dumped) error | 14:27.17 |
Robin_Watts | OK, I can't immediately explain why. | 14:27.39 |
| I can suggest a few tests? | 14:28.02 |
naveen_ | yes please.. | 14:28.09 |
Robin_Watts | before the aes_crypt_cbc call do: | 14:28.26 |
| printf("1\n"); | 14:29.27 |
| memset(encryptedBuffer, 0, bytes_read); | 14:29.29 |
| printf("2\n"); | 14:29.31 |
| memset(decryptedBuffer, 0, bytes_read); | 14:29.33 |
| printf("3\n"); | 14:29.34 |
naveen_ | In the case of error I'm not getting any printf statements displayed... :( | 14:30.31 |
Robin_Watts | ok, so put an fflush(stdout); after every printf. | 14:31.30 |
naveen_ | ok.. | 14:31.40 |
Robin_Watts | You are allocating both encryptedBuffer and decryptedBuffer off the stack - that's 128K. | 14:32.22 |
| That's bad form. | 14:32.32 |
| It probably doesn't explain the crash, but... | 14:32.55 |
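One way to take the two 64K buffers off the stack is to allocate them on the heap once, before the read loop. This is only a sketch: `struct block_buffers` and the two functions are hypothetical names, and the 64K block size is taken from the code pasted earlier.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK_SIZE (64 * 1024)

/* Hypothetical holder for the two working buffers. */
struct block_buffers {
    unsigned char *encrypted;
    unsigned char *decrypted;
};

/* Allocate both buffers; returns 0 on success, -1 if either malloc fails
 * (in which case nothing is left allocated). */
static int block_buffers_init(struct block_buffers *b)
{
    assert(b != NULL);
    b->encrypted = malloc(BLOCK_SIZE);
    b->decrypted = malloc(BLOCK_SIZE);
    if (b->encrypted == NULL || b->decrypted == NULL) {
        free(b->encrypted);
        free(b->decrypted);
        b->encrypted = NULL;
        b->decrypted = NULL;
        return -1;
    }
    return 0;
}

/* Release both buffers; safe to call after a failed init. */
static void block_buffers_free(struct block_buffers *b)
{
    assert(b != NULL);
    free(b->encrypted);
    free(b->decrypted);
    b->encrypted = NULL;
    b->decrypted = NULL;
}
```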
naveen_ | I got all 1,2,3 printed..with the Segmentation fault error | 14:34.07 |
Robin_Watts | ok. | 14:34.18 |
| Ah! | 14:36.01 |
| Our aes code expects keys to be of size 128,192 or 256. | 14:36.35 |
naveen_ | :( .. | 14:36.49 |
Robin_Watts | Those are in bits. | 14:37.28 |
| You're passing in a keysize of 16, meaning 16 bytes. | 14:37.42 |
naveen_ | yes | 14:37.49 |
Robin_Watts | 16*8 = 128 | 14:37.50 |
naveen_ | yaa | 14:37.54 |
Robin_Watts | sizeof(m_key)*8 ? | 14:38.07 |
| or just 128. | 14:38.12 |
| then try it again. | 14:38.18 |
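To make the bytes-versus-bits distinction hard to get wrong, the conversion can be wrapped in a tiny helper. This is a sketch only; `aes_key_bits` is a hypothetical name, and the accepted sizes are the three mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The byte-to-bit conversion below assumes 8-bit bytes. */
static_assert(CHAR_BIT == 8, "key sizes assume 8-bit bytes");

/* Hypothetical helper: convert a key length in bytes to the bit count the
 * AES key setup expects (128, 192 or 256); returns 0 for any other length. */
static unsigned int aes_key_bits(size_t key_bytes)
{
    switch (key_bytes) {
    case 16:
    case 24:
    case 32:
        return (unsigned int)(key_bytes * CHAR_BIT);
    default:
        return 0;
    }
}
```

With `char m_key[16]`, `aes_key_bits(sizeof(m_key))` gives 128, which is the value to pass as the third argument of `aes_setkey_dec`; a result of 0 signals a key length the code should refuse.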
| This AES code comes from somewhere else, otherwise I like to think we'd cope a bit more gracefully. | 14:38.52 |
| tor8: 72 pages of bitmap diffs :( | 14:39.17 |
henrys | welcome back tor8 | 14:39.46 |
naveen_ | Error is gone Robin...I need to check if the decrypted file is good.. | 14:41.02 |
Robin_Watts | you've removed my test code, right? :) | 14:41.19 |
naveen_ | No.. | 14:41.37 |
Robin_Watts | Well, it'll be bad with the memsets in :) | 14:41.51 |
naveen_ | yup :) I'll remove it.. | 14:42.14 |
| Thanks for all the help Robin...I'm sorry for bugging you so frequently...I'll check the decrypted file and let you know... | 14:46.12 |
Robin_Watts | naveen_: No worries. | 14:46.28 |
naveen_ | Robin,I encrypt using "AES/CBC/NoPadding" algo...Now after decrypting I'm getting 47 kb of extra data in the file....Any ideas why this is happening? | 14:59.33 |
Robin_Watts | How big is the original file, and how big is the final file ? (in bytes) | 15:00.11 |
| aes encryption spits out the same number of bytes you give it each time, AIUI. So you're feeding in (up to) 64K of data per block and you should get the same amount out | 15:03.25 |
naveen_ | encrypted file & decrypted file are 3,27,680 bytes and original file is 2,79,575 bytes | 15:03.38 |
Robin_Watts | OK, so that's 5 64K blocks going in. | 15:04.03 |
| and hence you get 5 64K blocks out. | 15:04.12 |
| The original file fits into 4 and a bit blocks, so that makes sense. | 15:04.54 |
| Presumably you should have access to the original file size somewhere so you know how to truncate the last block of data. | 15:05.28 |
| That's not down to the AES code itself. | 15:05.43 |
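The truncation step can be sketched as a per-block calculation, assuming the original file size is available from somewhere outside the ciphertext (a header field, for instance; that part is an assumption). `payload_bytes_in_block` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many bytes of decrypted block `index`
 * (0-based) belong to the original file. */
static size_t payload_bytes_in_block(size_t original_size, size_t block_size,
                                     size_t index)
{
    size_t start;

    assert(block_size > 0);
    if (index > original_size / block_size)
        return 0;                       /* block lies entirely past the end */
    start = index * block_size;
    if (original_size - start >= block_size)
        return block_size;              /* full block */
    return original_size - start;       /* final, partial block */
}
```

With the sizes quoted above (279575 bytes original, 65536-byte blocks), blocks 0 to 3 are full and block 4 carries 17431 real bytes; the other 48105 bytes of that block, about 47 KB, are the extra data.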
naveen_ | hmm...ok..I'll debug that....I'm not able to open the decrypted pdf...adobe reader says it is corrupt.. | 15:06.16 |
Robin_Watts | Load the decrypted file into an editor, and remove the extra off the end, then save it out. Will it open then ? | 15:06.48 |
| The PDF spec says that readers should tolerate up to 1K of rubbish at the start/finish, if memory serves. | 15:07.17 |
| Does the file start with: %PDF- ? | 15:07.31 |
naveen_ | No the decrypted file does not start with %PDF-1.5 but the actual file does.. | 15:08.59 |
Robin_Watts | ok, so the decrypted file isn't right. | 15:09.35 |
| Does %PDF-1.5 appear within the decrypted file? | 15:09.52 |
| (i.e. is there a prefix of rubbish?) | 15:10.00 |
naveen_ | No | 15:10.31 |
Robin_Watts | OK, so something is wrong. | 15:11.30 |
| I'd add some printfs to the java/c versions and compare that the key etc is being formed right. | 15:11.56 |
naveen_ | yes..I'll do that.....and bug you tomorrow with the update :) | 15:13.31 |
Robin_Watts | ok. | 15:14.39 |
| what timezone are you in, btw ? | 15:14.46 |
naveen_ | IST | 15:14.54 |
| I work from 12 - 9 IST | 15:15.05 |
Robin_Watts | Ah, so this is late! | 15:15.10 |
naveen_ | how abt you? | 15:15.19 |
Robin_Watts | UK. | 15:15.25 |
| so about 5-6 hours behind you. | 15:15.49 |
naveen_ | hmm..Ok..You really have helped me step by step Robin..Thanks for all the help... | 15:16.07 |
Robin_Watts | no worries. | 15:16.18 |
kens | Well, I've found the source of my mysterious crashes, but I don't know how to fix it, need to cogitate a bit.... | 15:18.36 |
Robin_Watts | read that as "coagulate a bit" :) | 15:18.52 |
kens | Could do that too I guess | 15:21.19 |
Robin_Watts | kens: I'm sure there is a joke to be made about "leaking" somewhere :) | 15:23.40 |
| tor8: OK, the fix seems to work. Some genuine progressions too, which is nice. | 15:30.10 |
| Just retesting a final version now, where I reduce the flatness on beziers by the linewidth. | 15:30.42 |
| ok, back to the vets again. bbs. | 15:42.40 |
kens | OK time to go, goodnight everyone | 16:17.47 |
br2zz | hello! i have a question regarding mupdf (git version): in earlier versions i was able to access the trailer object from pdf_document directly since the structure was exposed in the header file (now you need mupdf-internal.h) - is there any other way to get this object or any other way to read out the pdf document information (author, title...)? | 16:27.03 |
Robin_Watts | br2zz: There is another way, yes. | 16:27.47 |
| but it's in flux at the moment. | 16:27.59 |
| See fz_meta | 16:28.25 |
br2zz | Robin_Watts: Okay, thanks. I will take a look | 16:29.19 |
Robin_Watts | br2zz: But it's likely to change before the end of the week. | 16:30.37 |
br2zz | Robin_Watts: Good to know; then I will wait before changing my code | 16:31.48 |
Robin_Watts | paulgardiner, tor8: Did we reach a decision in the end? | 16:32.58 |
| Rip out fz_meta and replace it with a document_info interface? | 16:33.47 |
tor8 | Robin_Watts: there is also the metadata xml stream in the PDF document we may want to support | 16:34.16 |
| IMO we should keep the metadata interface out of 1.0 | 16:35.20 |
Robin_Watts | tor8: It would be a shame to ship without the "Document Properties..." thing in the viewer. | 16:35.45 |
| given we've had it before. | 16:35.54 |
tor8 | considering only 0.1% of people found it... I dunno. and it doesn't work for XPS documents as it should anyway. | 16:36.24 |
Robin_Watts | I might be tempted to ship with fz_meta left in, but with a big note that it will be reworked entirely for the next version, so don't rely on it. | 16:36.34 |
| tor8: Did you start looking at the font/cidfont thing ? | 16:37.23 |
| (if not, I will have a quick look) | 16:38.19 |
tor8 | I don't see how it can possibly work without adding a bunch of workarounds for missing fonts that should be embedded if you are to follow the spec | 16:38.24 |
| it's an Identity-H encoded CID font. those are ALWAYS supposed to be embedded. | 16:38.46 |
Robin_Watts | tor8: I was (in my ignorance) considering passing a flag through the substitution stuff saying whether it was a cid font or not. | 16:39.07 |
tor8 | since the Identity-H encoding refers to raw glyph indices | 16:39.09 |
Robin_Watts | and if it's a cid font, not match arial etc. | 16:39.22 |
| (i.e. cid fonts would always fall back to just droidsans) | 16:39.52 |
| but I have no idea whether that's possible or not. | 16:40.08 |
tor8 | it wouldn't change anything | 16:40.21 |
| to support it we need to either have the real font at hand, or get lucky and find a font with the same glyph ordering | 16:40.49 |
| or add a table with glyph -> glyph substitute font remappings for Arial (and the other windows fonts that people never seem to embed) | 16:41.24 |
Robin_Watts | In chrisl's test of the bug in question, he found that avoiding the normal fonts as substitutions was enough to make it work. | 16:41.42 |
tor8 | or do really ugly workarounds with the ToUnicode cmap | 16:41.44 |
| did chrisl do the workaround with gs or mupdf? | 16:43.17 |
Robin_Watts | mupdf | 16:43.27 |
| gs gets it right, | 16:43.30 |
tor8 | right. I guess I need to take a deeper look then. been a bit jetlagged and zombiefied today. | 16:43.52 |
| chrisl: bug 692943, what did you nobble with the CIDFont loading? | 16:46.38 |
Robin_Watts | Comment 4 paragraph 3 and 4 ? | 16:47.17 |
tor8 | yes. I'd like to reproduce it. | 16:48.17 |
br2zz | i have another small question: earlier i could access the page obj with xref->page_objs[index]; can someone tell me what i can do now to achieve the same? i want to access the resources object, so that i can read out the images that the page contains. | 16:58.21 |
Robin_Watts | br2zz: OK. At that point you currently need to #include "mupdf-internal.h" | 16:59.11 |
| We haven't made the internals of the xref available through the document interface yet. | 16:59.50 |
br2zz | Robin_Watts: okay; are there any plans for the future to circumvent that? | 17:00.04 |
Robin_Watts | Yes, but not concrete ones as yet. | 17:00.23 |
br2zz | okay | 17:00.31 |
Robin_Watts | cid collection: Adobe-Identity - does that help us at all? | 17:18.30 |
| I have a hack that works for that file. | 17:20.39 |
| Let me tidy it up. | 17:20.43 |
| tor8: http://ghostscript.com/~robin/0001-ForTor.patch | 17:27.08 |
| marcosw: If a cluster bmpcmp completes with no differences found, does the user's bmpcmp directory not get updated? | 17:28.07 |
| oh, wait... ignore that. | 17:29.16 |
tor8 | Robin_Watts: it's just pure luck for that patch to work. droid sans fallback must be using the exact same glyph order as arial. | 17:35.52 |
| Robin_Watts: and actually, that patch still causes garbled text for all characters outside the basic ascii set | 17:37.56 |
Robin_Watts | OK... | 17:42.45 |
| The collection appears to be "Adobe-Identity". That's not one of the known ones in MuPDF. | 17:43.30 |
| So you've modified the file with some "out of ascii" chars and it renders wrongly? | 17:47.25 |
| Does gs get it right ? | 17:47.28 |
tor8 | Robin_Watts: no, gs also gets it wrong | 18:13.56 |
Robin_Watts | ok, fair enough. | 18:14.24 |
| That kicks my understanding of the problem then. | 18:14.52 |
tor8 | Robin_Watts: pdfref17.pdf page 445 last paragraph | 18:15.43 |
marcosw | Robin_Watts: the bmpcmp directory should be empty if there are no differences found. | 18:15.48 |
tor8 | "Additionally, when used in conjunction with a Type 2 CIDFont whose CIDToGIDMap entry is Identity, the 2-byte CID values represent glyph indices for the glyph descriptions in the True- Type font program. This works only if the TrueType font program is embedded in the PDF file." | 18:15.54 |
marcosw | marcosw: I guess you can ignore that too :-) | 18:16.08 |
mvrhel | finally my new version of strip tile rect makes it through the clist | 18:18.11 |
tor8 | Robin_Watts: the short summary is: file broken. it assumes the system has access to arial.ttf. | 18:18.38 |
Robin_Watts | tor8: Fair enough. | 18:18.54 |
| So, I'll leave it to you to close that bug. | 18:19.05 |
tor8 | I can think of a workaround, but that won't be pretty (it means adding an original font glyph -> substitute font glyph mapping table for Microsoft's fonts) | 18:19.49 |
| I don't consider that worth pursuing; better to use arial.ttf if available like sumatrapdf does | 18:20.27 |
| though that will likely not work on android and ios devices. | 18:20.50 |
Robin_Watts | Any thoughts on 692924 ? | 18:21.02 |
tor8 | nothing good, file magic detecting can't distinguish cbz from xps :( | 18:23.19 |
| though we could make it fall back to pdf if extension is not recognized easily enough | 18:23.46 |
mvrhel | hi ray_laptop: did you see my reply to your question about the soft mask | 18:25.29 |
| oh you already replied | 18:25.52 |
| oh the use of the mask_is_image makes sense | 18:26.34 |
| makes sense to do the checking about the actual device also | 18:27.50 |
| bbiaw | 18:28.48 |
Robin_Watts | For mupdfinfo, I think it's more a question of the logic in main. It assumes that if an arg ends in .pdf or .PDF it's a file spec, otherwise it's a page range specifier. | 18:37.00 |
ray_laptop | for the logs: mvrhel: this code is only used for clist (rendered raster type) devices. I don't understand what "checking about the actual device also" means. | 18:38.00 |
ray_laptop | is curious if there will be any differences as a result of my experiment to the cropping | 18:39.27 |
| I'm proceeding to collecting per-band info if any marking is done with messy alpha or SMask. | 18:40.59 |
| I'm basically assuming that any SMask will contain non-trivial alpha somewhere (not just a PS ImageType 3 or 4 masked image) | 18:42.27 |
| mvrhel: just saw your cluster submission -- are you back ? | 18:44.39 |
ccotton_ | Howdy, I'm new here. but I had a question about MuPDF. I couldn't find any specifics in the code that I looked at. I'm interested in whether or not it supports the user being able to select a region of text and to copy out that text. | 20:43.58 |
| (specifically we are looking at licensing MuPDF for Android and possibly iOS) | 20:45.14 |
oy | mvrhel: hello | 20:48.10 |
mvrhel | oy: hello | 20:48.22 |
oy | mvrhel: this spring I had lots of events and will not come to the summit | 20:49.12 |
mvrhel | oy: ok | 20:49.27 |
oy | mvrhel: preparing some slides about the proposed concepts would be fine | 20:50.25 |
mvrhel | ray_laptop: for the logs. not sure what you mean by "checking about the actual device also" | 20:50.31 |
| oy: ok. I dont know if you can coordinate something with tkamppeter so that you can present something from afar | 20:51.05 |
oy | mvrhel: would be good, but so far have onotime seen such thing working - kind of ;-) | 20:52.19 |
| *have no time | 20:52.47 |
mvrhel | oy: yes, it can be difficult. | 20:52.59 |
oy | we can try, the slides will be useful anyway | 20:53.29 |
mvrhel | oy: at minimum an overview of your work and how it fits into the whole printing solution would be useful | 20:55.50 |
oy | mvrhel: will do a comparison of the PDF/CUPS CM paths | 21:01.14 |
mvrhel | souncs good | 21:01.28 |
| sounds good | 21:01.33 |
tkamppeter | oy, we have often had remote speakers. You will send me the slides as PDF and Mike will put them into the webcast. So you will see the slide which is on our conference room screen and also on the webcast screens of the other phone participants. | 21:11.51 |
| oy, you are talking through the phone and hear us through the phone. | 21:12.29 |
| The session is 9-12am in SF so it will be in the afternoon/evening for you, not in the middle of the night and not while you are doing your usual day job. | 21:13.27 |
mvrhel | finally my change compiles on the clusters. off to run an errand. bbiab | 21:20.10 |
tkamppeter | oy, Are you in Germany next week? Because of the phone costs. In Germany it is cheapest. | 21:22.55 |
oy | tkamppeter: yes, I will call in from Germany, the time appears fortunate for me | 21:40.36 |
tkamppeter | oy, so send me (and Mike Sweet) the slides in PDF format so that we have them available. | 21:49.45 |
| Oy, I have added you to the list of call-in participants on the Summit's web page. | 21:51.12 |
Robin_Watts | ccotton_: Hi | 23:13.32 |
| MuPDF has a text extraction device. It's still experimental, but it works in most common cases - we use it to do searching etc. | 23:14.11 |
| You can certainly use that together with the list device to get the text for a specific region of the screen. | 23:14.36 |
| And indeed, the windows/linux viewers have that functionality built in. | 23:14.54 |
henrys | Robin_Watts:hope you know what you signed up for - sabrina is not my idea of a light packer. Mine should be easy, I'll definitely owe you a dinner or something. | 23:41.48 |
Robin_Watts | henrys: Well, as long as it fits in the car. | 23:42.04 |
| And it can't be worse than Helen's :) | 23:42.12 |
| Sabrina told Helen she wasn't packing Wellies. | 23:42.40 |
| While wellies are overkill, if you're planning to visit stonehenge you'll want to have some shoes that can cope with walking on (possibly wet) grass. | 23:43.24 |
henrys | sabrina is concerned my jeans and sneakers might be a bit too hippie like for some of the places we are going. | 23:44.13 |
Robin_Watts | Urm... The step up from Jeans for me is a suit :) | 23:45.04 |
| But honestly, I can't imagine jeans and sneakers being a problem anywhere. | 23:48.30 |
mvrhel | marcosw: I like the bmpcmp output report | 23:54.55 |
| or Robin_Watts. not sure who did this | 23:55.09 |
henrys | that's a relief... I'll probably dress nice a few days though - going out to dinner and such. | 23:55.14 |
Robin_Watts | mvrhel: The 'no differences found in the following files?' - that was Marcosw. | 23:55.48 |
mvrhel | ok | 23:55.56 |
| it also reports when it can't compare the psdcmyk files | 23:56.15 |
| due to bit depth diffs | 23:56.21 |
Robin_Watts | henrys: Dressing nice involves upgrading the sneakers to shoes and swapping t-shirt for shirt. | 23:56.29 |
| Ah, right, yes. Also marcosw. | 23:56.41 |
| I'm supposed to pull that into the bmpcmp webpage. | 23:56.49 |
| but don't currently. | 23:56.57 |
mvrhel | I am hoping I am through the worst part on this project. the strip_tile stuff seems to be working. I had missed adding the proc to a few of the clip devices. added that and now testing once more | 23:58.19 |
| looks like something odd going on with the transfer function through the clist... | 23:59.11 |
Robin_Watts | nice. | 23:59.17 |
| and ouch. | 23:59.22 |
| The clist still scares me. | 23:59.28 |
mvrhel | it is a rube goldberg mess | 23:59.41 |