| 20180316 |
malc_ | tor8: arxiv pdfs (for example: https://arxiv.org/pdf/1803.02796) do not contain .pdf suffix, so the routine becomes download the paper(X), rename it to(X.pdf), run something that uses fz_open on the X.pdf. Question is there a way to have a fallback(to PDF) document handler (assuming that unrecognized document is a PDF was the default in fitz a while ago) | 10:41.20 |
tor8 | malc_: you can always call pdf_open_document as a fallback yourself. I have on my todo list to look at the first couple of bytes of a file to detect the file type in the absence of a mime-type/suffix | 10:58.53 |
| you can freely cast from pdf_document* to fz_document* | 10:59.12 |
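The byte-sniffing idea tor8 mentions can be sketched in plain C. This is a minimal illustration, not MuPDF code: the function name `looks_like_pdf` is hypothetical, and searching the first 1024 bytes (rather than only offset 0) is an assumption modelled on the leniency many PDF readers show towards leading junk.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of MuPDF): returns 1 if the "%PDF-" marker
 * appears within the first 1024 bytes of the buffer, 0 otherwise.
 * The 1024-byte window is an assumption; a strict check would only
 * accept the marker at offset 0.
 */
int looks_like_pdf(const unsigned char *buf, size_t len)
{
	static const char magic[] = "%PDF-";
	const size_t magic_len = sizeof magic - 1;
	size_t limit = len < 1024 ? len : 1024;
	size_t i;

	if (limit < magic_len)
		return 0;
	for (i = 0; i + magic_len <= limit; i++)
		if (memcmp(buf + i, magic, magic_len) == 0)
			return 1;
	return 0;
}
```

A caller could read the first block of a suffix-less download such as the arxiv one, run this check, and only then fall back to pdf_open_document as suggested above.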
malc_ | tor8: yeah i know, but was more thinking of registering special(fallback) document handler | 11:01.51 |
tor8 | malc_: see the example commit on tor/for-malc, maybe something like that should work for you | 11:09.47 |
| it breaks some other things so I'm not completely happy about it | 11:09.57 |
malc_ | it's beautifully tiny | 11:11.10 |
| tack | 11:11.18 |
tor8 | malc_: there's a better fix on tor/master | 11:24.48 |
malc_ | trying | 11:30.58 |
| works | 11:32.48 |
paulgardiner | tor8: that fix for the invisible form text looks to have worked, thanks. I'm seeing other problems, like I'm no longer able to detect I need to update a form button when its appearance changes, but I think you may have warned me about that. | 11:43.19 |
sebras | Robin_Watts: are you happied about the encryption commits on sebras/master now? I tried to explain in more detail. | 11:51.12 |
Robin_Watts | I'm happy with the first one. | 11:52.43 |
| The second one, I am still disquieted. | 11:52.51 |
sebras | looks up disquieted. | 11:53.03 |
| oh!, an 1830s word. :) | 11:53.37 |
Robin_Watts | When we do the 50 loop, we are (pointlessly, it seems to me, but what do I know, I didn't write the spec) munging the key output 50 times. | 11:54.14 |
| The input key to the first iteration may be n bytes long. | 11:54.35 |
| but the output from the fz_md5_final will be 16 bytes long. | 11:54.49 |
| so the input to the second iteration will be 16 bytes long. | 11:55.04 |
| Do you have any example files that use this code with n != 16 ? | 11:55.32 |
sebras | Robin_Watts: a fuzzed file where we overstep the initialized part of the key buffer, yes. | 11:56.06 |
| Robin_Watts: A real file? no. | 11:56.19 |
tor8 | paulgardiner: I've got some fixes for that too on tor/master | 11:56.35 |
sebras | the fuzzed file is part of bug 699085 | 11:56.37 |
paulgardiner | tor8: oh okay great | 11:57.47 |
sebras | Robin_Watts: pdfref17 page 125, algorithm 3.2 step 8 states: "Do the following 50 times: Take the output from the previous MD5 hash and pass the first n bytes of the output as input into a new MD5 hash, where n is the number of bytes of the encryption key as defined by the value of the encryption dictionary's Length entry." | 11:58.28 |
| Robin_Watts: previously we never cared about the Length entry, we always passed 16 | 11:58.56 |
| Robin_Watts: in other parts of the code we did care about the Length entry, but we never limited it to the size of the MD5 hash, so we stepped outside of the initialized part of the buffer. | 11:59.36 |
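The point sebras makes about the Length entry can be sketched roughly as follows. This is not the code on sebras/master: `key_bytes_from_length`, `rehash_50_times` and the `digest16_fn` callback are hypothetical names, the callback stands in for a real MD5 routine, and raising values below 40 bits to 40 is an assumption based on the 40 to 128 bit range the PDF reference gives for Length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MD5_DIGEST_SIZE 16

/* Hypothetical callback standing in for MD5: hashes in[0..len) into a 16-byte digest. */
typedef void (*digest16_fn)(const unsigned char *in, size_t len,
	unsigned char out[MD5_DIGEST_SIZE]);

/*
 * Convert the encryption dictionary's Length (in bits) to a byte count n,
 * never larger than the 16 bytes an MD5 digest actually provides.
 * Treating values below 40 as 40 is an assumption of this sketch.
 */
size_t key_bytes_from_length(int length_bits)
{
	size_t n;

	if (length_bits < 40)
		length_bits = 40;
	n = (size_t)length_bits / 8;
	if (n > MD5_DIGEST_SIZE)
		n = MD5_DIGEST_SIZE;
	return n;
}

/*
 * Step 8 of algorithm 3.2 as quoted above: 50 times, feed the first n bytes
 * of the previous digest into a new hash. key holds the previous digest on
 * entry and the final digest on return.
 */
void rehash_50_times(digest16_fn hash, unsigned char key[MD5_DIGEST_SIZE], size_t n)
{
	unsigned char tmp[MD5_DIGEST_SIZE];
	int i;

	assert(n <= MD5_DIGEST_SIZE); /* reading past the digest is the overstep being fixed */
	for (i = 0; i < 50; i++)
	{
		hash(key, n, tmp);
		memcpy(key, tmp, MD5_DIGEST_SIZE);
	}
}
```

With n limited this way, a fuzzed Length larger than 128 can no longer make the loop read beyond the 16 initialized digest bytes.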
tor8 | paulgardiner: Robin_Watts: stuff on tor/master for review (including a possible fix for paul's form button appearance change update thingies) | 12:02.55 |
sebras | tor8: "Don't throw when fz_is_directory is called on a non-existent path." does what is advertized, but why do we want it? | 12:04.25 |
tor8 | sebras: so I can test if a path is a directory, without it throwing an exception if the path doesn't exist | 12:04.45 |
sebras | tor8: and that path is a manually entered path in the open file dialog? | 12:05.30 |
| tor8: if so I see the point. :) | 12:05.38 |
Robin_Watts | sebras: OK, that seems clearly to be what your new code does. | 12:05.47 |
| lgtm then. | 12:05.49 |
tor8 | sebras: yes. | 12:05.49 |
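The behaviour described for fz_is_directory (answer "no" for a missing path instead of throwing) can be sketched with POSIX stat(). This is an illustration only, not MuPDF's implementation, and `path_is_directory` is a hypothetical name.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if path names an existing directory.
 * A path that does not exist (or cannot be examined) yields 0 rather
 * than an error, so a caller can probe hand-typed paths safely.
 */
int path_is_directory(const char *path)
{
	struct stat info;

	if (stat(path, &info) != 0)
		return 0;
	return S_ISDIR(info.st_mode) ? 1 : 0;
}
```

This suits the open-file dialog case mentioned above, where the user may type a path that is not there yet.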
sebras | tor8: using pdf_set_annot_quadding() wouldn't you be able to create Q entries for annotations where these are not defined? | 12:13.22 |
| tor8: pdf_set_annot_opacity() doesn't have this problem because it is defined for all markup annotations. | 12:15.11 |
tor8 | sebras: I don't see that as a significant issue | 12:15.49 |
sebras | tor8: in that case I propose check_allowed_subtypes() to be removed. | 12:16.16 |
tor8 | sebras: it would simplify (and improve performance) :) | 12:16.34 |
| sebras: the annot_has_property functions are also a bit misleading; there are types where it is defined to have the property, but it makes no sense | 12:17.04 |
sebras | tor8: that is true of all code I write. | 12:17.07 |
tor8 | like borders on Text annotations, which are just icons | 12:17.19 |
| and line endings on polygon annotations (which are closed polygons -- no line endings) | 12:17.48 |
| so I ended up needing to switch on the annotation type in the gui anyway, and not rely on the pdf_annot_has_xxx functions :( | 12:18.35 |
malc_ | tor8: 'Use KOI8-U for Cyrillic, and ISO 8859-7 for Greek.' any particular reason for -U vs -R? | 12:18.37 |
tor8 | malc_: -U is a strict superset of -R if you don't care about the block drawing characters (which I don't) | 12:19.01 |
sebras | tor8: you did use it for setting the author field though. | 12:19.03 |
tor8 | malc_: but if you have a better suggestion for a good encoding to use, I'm all ears | 12:19.19 |
| this is a bit beyond my area of expertise | 12:19.26 |
paulgardiner | tor8: that doesn't look to be working. | 12:19.40 |
tor8 | paulgardiner: hmm. okay. what problems are you seeing, and can I reproduce them with mupdf-x11? | 12:20.03 |
malc_ | tor8: my expertise is limited to... Ukrainians has letters we don't _AND_ vice versa | 12:20.31 |
| ы is и in Ukrainian etc | 12:20.46 |
| s;has;have | 12:21.24 |
tor8 | malc_: is that U+44b and U+418 respectively? | 12:22.17 |
malc_ | 44b and 438 | 12:22.53 |
paulgardiner | tor8: have you separated annotations and widgets into two different enumerations? | 12:23.05 |
tor8 | malc_: both characters are in koi8-r and koi8-u at the same places | 12:23.26 |
| paulgardiner: not yet, no. | 12:23.32 |
paulgardiner | Okay. Just checking. That could have been the reason muso wasn't picking up the changes | 12:24.00 |
malc_ | tor8: https://en.wikipedia.org/wiki/Koi8-u - KOI8-U (RFC 2319) is an 8-bit character encoding, designed to cover Ukrainian, which uses a Cyrillic alphabet. It is based on KOI8-R, which covers Russian and Bulgarian, but replaces eight graphic characters with four Ukrainian letters Ґ, Є, І, and Ї in both upper case and lower case. | 12:24.27 |
| "replaces" is what bothers me | 12:24.33 |
tor8 | paulgardiner: do you check has_new_ap to trigger a re-render? | 12:24.39 |
paulgardiner | tor8: ah no, I don't | 12:24.52 |
malc_ | but if those are boxes which you care nothing about.. | 12:25.15 |
| well.. | 12:25.16 |
tor8 | malc_: yeah, these are the characters that old DOS programs like Turbo Pascal use to draw boxes | 12:25.43 |
| our base14 fonts don't have those characters anyway | 12:26.26 |
malc_ | tor8: links (the browser) still has an option to use KOI8-R for frames :) | 12:27.01 |
tor8 | and I doubt I've ever seen them used in a PDF file | 12:27.30 |
| malc_: they were very useful characters :) | 12:28.10 |
paulgardiner | tor8: sorry, I am testing has_new_ap. I'd forgotten making that change. | 12:29.56 |
tor8 | paulgardiner: don't forget to reset it to '0' once you've used it (nothing in the core resets the flag, it just sets it for the UI's convenience) | 12:30.35 |
paulgardiner | I seem to be doing that too | 12:30.47 |
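The check-then-reset pattern tor8 describes can be sketched with a mock structure. Only the flag name has_new_ap comes from the discussion; `struct mock_annot` and `consume_new_appearances` are hypothetical stand-ins, not MuPDF types or functions.

```c
#include <stddef.h>

/* Mock stand-in for an annotation; the real type carries far more state. */
struct mock_annot
{
	int has_new_ap;            /* set by the core when the appearance changed */
	struct mock_annot *next;   /* next annotation on the same page */
};

/*
 * Walk the annotation list, count the entries whose appearance changed,
 * and clear each flag so the same change is not reported again on the
 * next pass. Nothing else resets the flag in this model.
 */
int consume_new_appearances(struct mock_annot *first)
{
	struct mock_annot *a;
	int changed = 0;

	for (a = first; a != NULL; a = a->next)
	{
		if (a->has_new_ap)
		{
			a->has_new_ap = 0;
			changed++;
		}
	}
	return changed;
}
```

A UI would typically schedule a re-render of the affected annotation's bounds at the point where the flag is cleared.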
sebras | tor8: pdf_add_cjk_font() has a few potential leaks of dictionaries. | 12:31.02 |
| tor8: but it requires pdf_dict_put_name() and friends to throw of course. | 12:31.16 |
paulgardiner | tor8: I seem to be detecting the change, but still rendering the old copy | 12:31.40 |
tor8 | sebras: hm, I thought I'd checked that it wouldn't leak. which objects? | 12:31.50 |
sebras | font = pdf_new_dict(ctx, doc, 5); | 12:32.03 |
tor8 | is taken and dropped by pdf_add_object_drop | 12:32.21 |
sebras | tor8: yes, if you get that far. | 12:32.36 |
tor8 | sebras: ah, you should look at the next commit too. | 12:32.36 |
| it might leak in the first commit, sorry. | 12:32.44 |
sebras | no worries. | 12:32.52 |
tor8 | the 'Simplify PDF font creation code' reworks and plugs a lot of possible leaks and simplifies the structure to be similar everywhere | 12:33.06 |
malc_ | tor8: :) | 12:38.03 |
sebras | tor8: in the cyrillic commit message "argumnet" | 12:38.35 |
| tor8: also I'd like for these two commits to patch docs/manual-mutool-create.html explaining how these directives can be sued. | 12:40.10 |
| used. | 12:40.13 |
malc_ | darn... now i recognize bad kerning arising from "argumnet" in my face of choice.. sigh | 12:40.27 |
tor8 | malc_: what is your font face of choice? | 12:40.57 |
malc_ | tor8: inconsolata with fallback to anonymous pro for Cyrillic. it seems inconsolata is VEI VEI close to misc-fixed.. (spent a considerable amount of time recently finding replacement for Fira Mono after S9 59 debacle) | 12:42.38 |
paulgardiner | tor8: I think possibly it is working okay. For some reason muso is not updating the correct area of the page. Used to be fine, and I have no recollection of making a change in that area, but that looks to be the problem | 12:42.42 |
tor8 | malc_: heh, I use bitmap fonts in my xterms... | 12:43.36 |
malc_ | tor8: i irc and edit in x11 emacs :) and my font of choice in terms in pt mono (least amount of kerning and look alike glyph problems) | 12:44.26 |
| s;in;is | 12:44.33 |
| tor8: i would have still used bitmap misc-fixed.. but it's too small for this largeish monitor :( | 12:45.23 |
tor8 | malc_: gnu unifont isn't too bad, and it's very useful when working on unicode stuff | 12:45.43 |
malc_ | tor8: gnu unifont is one of the few typefaces i have installed :) | 12:46.12 |
tor8 | and it looks a little bit similar to the old 6x13 | 12:46.13 |
malc_ | tor8: it's _based_ on the old 6x13 :) | 12:46.26 |
tor8 | https://ghostscript.com/~tor/stuff/fonts/codec/ I use this most of the time though, but it only does latin-1 | 12:46.32 |
malc_ | deal breaker here | 12:46.46 |
tor8 | yeah, I assumed it would be :Q) | 12:47.02 |
| ehm, that Q shouldn't be there | 12:47.09 |
| I blame fat fingers and dvorak | 12:47.22 |
sebras | tor8: int r = nelem(koi8u_from_unicode) / 2 - 1; why are you dividing by 2 here and in pdf_greek_from_unicode()? all other implementations of binary search start with r = nelem() - 1; | 12:47.29 |
malc_ | tor8: http://repo.or.cz/llpp.git/blob/2eb3314b6c5946ac3fe49ae336e5fdd02ed224c9:/misc/llppac (i guess can be massaged to use mupdf instead of llpp) | 12:47.54 |
| allows one quickly "see" the faces | 12:48.04 |
tor8 | sebras: ooops. bad copy & paste :) | 12:48.12 |
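The bound sebras questions can be illustrated with a small binary search over an array of structs. The table and names here are hypothetical, not the koi8u_from_unicode table from the commit; it holds only the eight Ukrainian letters that KOI8-U (RFC 2319) places where KOI8-R has box-drawing characters.

```c
#include <stddef.h>

#define nelem(x) (sizeof(x) / sizeof((x)[0]))

struct ucs_to_byte
{
	unsigned short ucs;   /* Unicode code point */
	unsigned char byte;   /* KOI8-U byte */
};

/* Sorted by code point so it can be binary searched. */
static const struct ucs_to_byte koi8u_ukrainian[] = {
	{ 0x0404, 0xB4 }, /* Є */
	{ 0x0406, 0xB6 }, /* І */
	{ 0x0407, 0xB7 }, /* Ї */
	{ 0x0454, 0xA4 }, /* є */
	{ 0x0456, 0xA6 }, /* і */
	{ 0x0457, 0xA7 }, /* ї */
	{ 0x0490, 0xBD }, /* Ґ */
	{ 0x0491, 0xAD }, /* ґ */
};

/* Returns the KOI8-U byte for u, or -1 if u is not in this small table. */
int koi8u_ukrainian_from_unicode(int u)
{
	int l = 0;
	int r = (int)nelem(koi8u_ukrainian) - 1; /* one element per entry, so no division by 2 */

	while (l <= r)
	{
		int m = (l + r) >> 1;
		if (u < koi8u_ukrainian[m].ucs)
			r = m - 1;
		else if (u > koi8u_ukrainian[m].ucs)
			l = m + 1;
		else
			return koi8u_ukrainian[m].byte;
	}
	return -1;
}
```

Starting from r = nelem / 2 - 1 on a table laid out like this would leave the upper half unreachable, so a lookup of U+0491 would fail. Halving is only appropriate when the table is a flat array of alternating key and value entries.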
sebras | tor8: in pdf_add_simple_font_encoding_imp() why reserve 129 spaces in PDF_NAME_Differences? you will only populate 128 at most..? | 12:49.42 |
tor8 | sebras: one for the initial length | 12:50.05 |
| plus 128 entries | 12:50.09 |
| but in reality it's fewer than that | 12:50.17 |
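The 129 figure follows from how a PDF Differences array is laid out: one starting code followed by the glyph names for consecutive codes, so a single run covering 128 codes needs 1 + 128 entries. The sketch below writes the textual form of such a run; `write_differences` is a hypothetical helper and does not reflect how pdf_add_simple_font_encoding_imp() builds its array.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "[ first_code /name0 /name1 ... ]" into out.
 * Returns the number of array entries (1 for the code plus one per name),
 * or -1 if the buffer is too small.
 */
int write_differences(char *out, size_t cap, int first_code,
	const char *const *names, int count)
{
	size_t used;
	int i, n;

	n = snprintf(out, cap, "[ %d", first_code);
	if (n < 0 || (size_t)n >= cap)
		return -1;
	used = (size_t)n;

	for (i = 0; i < count; i++)
	{
		n = snprintf(out + used, cap - used, " /%s", names[i]);
		if (n < 0 || (size_t)n >= cap - used)
			return -1;
		used += (size_t)n;
	}

	n = snprintf(out + used, cap - used, " ]");
	if (n < 0 || (size_t)n >= cap - used)
		return -1;
	return 1 + count;
}
```

For a full upper half starting at code 128 the return value would be 129, matching the reservation discussed above.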
paulgardiner | tor8: I'd found a place not far back in muso development where button updates were working, so I should be able to find out the cause of the problem soon | 12:55.10 |
sebras | tor8: "Add simple fonts with 8-bit greek and cyrillic encodings." | 12:58.20 |
| does not update mupdf_native.c | 12:58.29 |
| needed because pdf_add_simple_font() adds an argument. | 12:58.52 |
tor8 | sebras: right. will fix. | 12:59.07 |
sebras | tor8: apart from what I have mentioned everything up to (but not including) "Simplify PDF font creation code." LGTM. | 13:00.24 |
| tor8: I'm going to go on a hunch in "Simplify PDF font creation code." and assume that the contents of pdf_standard, pdf_mac_roman, pdf_mac_expert, pdf_win_ansi, pdf_glyph_name_from_koi8u and pdf_glyph_name_from_iso8859_7 never really change... | 13:02.50 |
tor8 | sebras: yeah, I just reformatted them so they'd be easier to compare | 13:03.10 |
| and added the ascii bits to koi8u and iso8859-7 | 13:03.21 |
paulgardiner | tor8: something between mupdf commits dd76e14 and 4839427 broke form button updates, but I guess that might not have been brokenness in the same way as now. | 13:06.26 |
| It has to be changes in mupdf because I've done nothing much on muso other than resync mupdf since then. | 13:07.59 |
tor8 | paulgardiner: is it commit 9f8e07247? | 13:09.14 |
paulgardiner | Possibly. I think you said you expected it to cause problems. The strange thing is that now - with your latest changes - I can detect widgets have changed, but I'm updating the wrong rectangles of the page. | 13:12.00 |
| I assumed that was some slip in one of the muso commits, but I've done nothing to muso other than resync mupdf since it was working. | 13:12.53 |
| I'm getting various rendering glitches, although possibly all form related. | 13:13.30 |
| Oh hang on. I should be able to try just the mupdf update at the point at which it was working | 13:17.03 |
| How did we ever survive before git | 13:20.31 |
tor8 | paulgardiner: uphill, in the snow, both ways ;) | 13:21.34 |
sebras | tor8: this simplification will take me while to review. | 13:23.55 |
tor8 | sebras: might be better to just review the result than the diffs | 13:26.07 |
| I would like to pull that code out of pdf-font.c into its own file | 13:26.35 |
| it has nothing to do with font loading, which is what the rest of that file does | 13:26.43 |
paulgardiner | Okay I do now have a working and non-working version which differ only in mupdf, the non-working one using your current master branch | 13:33.12 |
| Something is provoking muso to update the wrong page area and I'm seeing other rendering glitches I wasn't before. | 13:34.02 |
| I realise that doesn't entirely confirm it isn't a muso problem. | 13:36.35 |
tor8 | paulgardiner: try working at the commit I mentioned before, and the commit before it | 13:37.21 |
| 9f8e07247 and the one preceding it | 13:37.38 |
paulgardiner | But don't we know that that breaks it? | 13:37.50 |
tor8 | are you sure it's that commit? | 13:38.01 |
| I suspect it may be, but it would be good to know for certain before I start chasing shadows | 13:38.15 |
paulgardiner | I may have misunderstood. I thought you made a change that you expected would stop form updates being detected and then later you fixed it. | 13:39.06 |
| I'll try what you suggest in any case | 13:42.28 |
tor8 | nothing on tor/master should stop form dates being updated | 13:48.35 |
paulgardiner | dates? | 13:49.06 |
tor8 | update being detected | 13:49.25 |
| brain fart :) | 13:49.27 |
paulgardiner | I'm just building at 8ebca3d | 13:51.08 |
| That worked fine | 13:53.40 |
tor8 | and 9f8e07247 doesn't work? | 13:54.39 |
| sebras: commits on tor/master updated to fix your comments | 13:55.55 |
paulgardiner | tor8: testing now | 13:56.05 |
| Yes that breaks it. | 14:01.55 |
tor8 | what is the nature of the breakage? I assume you can detect that things have changed via has_new_ap correctly? | 14:24.17 |
| are the transforms wrong? | 14:24.42 |
titanous | sebras: hey, just realized I forgot to give you the CVEs for the upcoming release: CVE-2018-1000036, CVE-2018-1000037, CVE-2018-1000038, CVE-2018-1000039, CVE-2018-1000040 | 14:25.49 |
| any idea what date it will be announced on? | 14:26.13 |
tor8 | paulgardiner: that commit's main effect is removing the pdf_xobject struct and its 'iteration' and replacing it with the has_new_ap field in the pdf_annot | 14:26.38 |
| paulgardiner: does it break calc.pdf or do I need to look at another form? | 14:28.15 |
paulgardiner | It's difficult to be sure, but that's how it looks. I'm certainly detecting many annots with has_new_ap set. More than I'd expect for a single click | 14:28.24 |
tor8 | paulgardiner: yeah. I think it was set too often before the commit on tor/master "4b3a206d1 Load most annotations, even if they are missing appearances." | 14:29.21 |
| which did fix some bugs in that regard | 14:29.26 |
paulgardiner | Strangely calc.pdf seems okay | 14:29.31 |
tor8 | well, calc.pdf is my main test here so... | 14:29.47 |
paulgardiner | I'll send you a file | 14:31.08 |
tor8 | paulgardiner: hm, it seems to work with mupdf-x11 on both tor/master and origin/master | 14:36.34 |
paulgardiner | Mysterious. Just taking on that commit, I get all sorts of glitches | 14:46.53 |
tor8 | that is the commit that changed how to detect when an annotation has a new appearance | 14:48.20 |
| but just "glitch" is a bit vague for me to go on... | 14:48.34 |
paulgardiner | The check boxes don't update and then changing zoom level causes momentary incorrect renders of parts of the page. Perhaps I have a timing related bug in muso redraw. | 14:50.13 |
tor8 | paulgardiner: maybe you're calling pdf_update_page too rarely or something else not triggering the detection? | 14:53.18 |
paulgardiner | The detection is fine... overly fine. Tapping just one check box causes a load of has_new_ap flags to be set | 14:54.03 |
tor8 | paulgardiner: yes. that should be less so in tor/master. | 14:57.50 |
| how do you check which regions to update? | 14:58.06 |
paulgardiner | The annotation's bounds. | 15:01.39 |
| I think I have a bug in muso and this change is just tickling it. | 15:01.56 |
| After tapping a check box, the rendering during zooming is completely broken for a while and then it recovers until the next check box. | 15:03.01 |
| I can't see how the mupdf change alone can do that without there also being a bug in muso | 15:05.15 |
| ... unless it's memory corruption | 15:08.50 |
tor8 | valgrind says nothing with mupdf-x11, so I think I'll leave you to hunt this one yourself for a while. | 15:10.49 |
jogux | paulgardiner: There's at least one muso bug that must be memory corruption (I presume either inside mupdf or related to the way we drive mupdf) | 15:28.29 |
paulgardiner | In bugzilla? | 15:29.38 |
| I think the most likely cause of this is that I'm forgetting to redispatch onto the UI thread somewhere. | 15:31.18 |
| The algorithms that are falling apart are common to the epage and muso branches | 15:32.14 |
| Not just the algorithms. It's common code | 15:32.30 |
jogux | paul: yep - https://bugs.ghostscript.com/show_bug.cgi?id=698931 | 15:46.01 |
| (actually I don't know for sure it's memory corruption) | 15:46.33 |
fredross-perry | tor8, sebras - I saw in the logs you did some work that relates to our conversation about annotations. I'm digging into that now, but I see that the current master doesn't build, because it needs a fix that I think is only on tor's master (number of arguments for to_Annotation_safe). I'll make that one fix and proceed, but you might want to move that fix to the main master. | 16:12.17 |
paulgardiner | I think I can see what it is. It's the number of updates along with the fact that I haven't implemented render abort | 16:12.19 |
fredross-perry | tor8, sebras - if I use tor's latest (which includes that fix), and I call updateAppearance() when creating annotations, then it works without my having to reload the page from the doc. Nice. Would we not always update the appearance each time you make an annotation? In which case, push that inside the createAnnotation() api ? | 19:50.45 |
tor8 | fredross-perry: it would be wasteful to create the appearance automatically. if the use case is create it and then fill in some values, then we'd create the appearance twice. | 21:26.43 |
fredross-perry | OK, I understand that. So let me know when everything is golden and I'll update muso. Thanks. | 21:28.44 |
tor8 | fredross-perry: sure. I'll get the stuff on tor/master reviewed and hopefully pushed on monday. | 21:34.01 |
fredross-perry | sweet. | 21:34.14 |
tor8 | or if you approve the "Fix java build." commit I can push that now so you don't need to wait | 21:34.26 |
| there are a few other issues that later commits fix, so if you're not in a rush you could just as well wait until monday | 21:34.46 |
velix | tor5: Thanks for analytics. I'll reply soon. Sorry, got involved in a huge project. | 23:43.13 |
| tor8: this was for you ;) | 23:43.22 |
| Who's tor5? :) | 23:43.27 |