| | 20180114 |
tomasz | hi | 11:33.41 |
mubot | Welcome to #mupdf, the channel for MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line. | 11:33.41 |
tomasz | wanted to ask if MuPDF is able to print PDF docs to printer? | 11:34.01 |
titanous | sebras: hey, I'm the author of the fuzzing stuff, happy to discuss | 17:07.44 |
sebras | titanous: ok. so when you wrote that the email addresses must have a google account, would G Suite addresses that are hosted by Google but use our company domain work? | 17:19.10 |
titanous | yeah, that's fine | 17:19.25 |
sebras | titanous: non Google hosted addresses are out of the question? | 17:19.35 |
| titanous: it is not obvious to me why that would be, though. :) | 17:19.48 |
titanous | you can sign up for a Google account with any address (if it isn't already hosted by G Suite) | 17:20.07 |
| (the access control system for the oss-fuzz issue tracker uses Google sign-in) | 17:20.35 |
sebras | titanous: right. | 17:20.42 |
| titanous: let me get back to you with a proper address next week. most of my colleagues are away at the moment. | 17:21.13 |
titanous | great | 17:21.23 |
sebras | titanous: I'm happy that someone set up the oss-fuzz thingy though. because I was thinking about it when it was launched. | 17:21.50 |
titanous | awesome! | 17:22.09 |
sebras | titanous: I also saw a bug that oss-fuzz appears to already have reported. | 17:22.30 |
titanous | yeah, that's just from running the fuzzer locally | 17:22.41 |
| it hangs pretty quickly | 17:22.49 |
sebras | titanous: right. I'm tied up in a few other bugs at the moment so it will not get immediate attention, but it is on my radar at least. :) | 17:23.05 |
titanous | cool | 17:23.12 |
sebras | titanous: btw, when you are testing are you testing using git HEAD or the latest release? | 17:23.27 |
titanous | git HEAD | 17:23.46 |
| oss-fuzz will track HEAD and automatically update the fuzzers | 17:24.02 |
| and then close any issues that have been fixed | 17:24.10 |
| do you have a CI system for mupdf? if so we can integrate regression testing there | 17:24.52 |
sebras | excellent! we've had fuzzing people run on the latest release, only for us to find out that we had already fixed the bug (but not done a new release). | 17:25.04 |
| titanous: if you want to find a number of new pdfs to use to seed the fuzzer one option would be to harvest the PDFs from e.g. poppler's bugzilla or so. | 17:25.44 |
| really, any PDF parser's bugzilla. | 17:26.00 |
| but the one from pdf.js is indeed a good start. | 17:26.13 |
titanous | hmm, good idea, I might put together a crawler | 17:26.25 |
sebras | titanous: though quite often the PDFs are hosted at the bugzillas of the UIs, like okular and evince. and for mupdf we previously got lots of reports via sumatrapdf. | 17:27.18 |
titanous | mupdf supports some other formats too, right? any idea where to get corpuses for them? | 17:27.30 |
sebras | we do. | 17:27.39 |
| for epub there is a public test suite. | 17:27.51 |
| http://epubtest.org/testsuite/ | 17:28.06 |
| that's something to start with. | 17:28.11 |
titanous | great | 17:28.22 |
| how do I send patches? I started with the fuzzer in oss-fuzz because it's easier, but they much prefer that all of the code goes upstream | 17:29.26 |
sebras | but we do support parsing xps/oxps, svg, cbz/cbt, bmp/gif/j2k/jpeg/jpx/jxr/pam/pbm/pgm/ppm/pnm/png/tiff, fb2, html/xhtml/xml as well. | 17:30.25 |
| we usually take patches via bugs.ghostscript.com | 17:30.57 |
titanous | k | 17:32.03 |
sebras | perhaps ccxvii (tor8 here) has enabled github issues and pull requests. but that's not the normal path for contributions. :) | 17:32.41 |
titanous | yeah, I'm much more familiar with that flow, but if you prefer Bugzilla I can make it work | 17:33.12 |
| I will work on some more fuzzers and start sending patches to integrate them | 17:33.35 |
sebras | we will certainly look at them, but depending on how invasive the patches are we may need to reconsider taking them on. I don't know what to expect at this point. :) | 17:34.28 |
titanous | I don't expect anything particularly invasive, mostly just new files and a script to compile everything, and there may occasionally be an internal hook that allows bypassing problematic work, for example a way to parse an epub that's already unzipped (if that doesn't already exist) | 17:36.12 |
| each of the fuzzers will just be one file and only a few lines | 17:37.28 |
sebras | sticking them into one of the subdirectories would probably help acceptance I suspect. | 17:37.59 |
titanous | yeah, I was planning to put them all in src/fuzz or similar | 17:38.19 |
sebras | personally I prefer if these tool specific things are not residing in the code repository, but in another repo. hm. maybe it would be possible to set up a repo that has mupdf as a submodule? if the patches are not accepted that might be one way forward. | 17:39.34 |
| I don't know what tor8's (nor others') position is on these things yet. | 17:39.56 |
| I think the bugs that may be found are useful to us, hence my interest in it happening. :) | 17:40.45 |
titanous | yeah, it doesn't really matter, there are two goals: 1) make sure the fuzzers don't bitrot if APIs or dependencies get changed slightly and 2) have a way of running regression tests against all past finds | 17:40.52 |
| so that bugs don't get reintroduced | 17:41.02 |
| the way we accomplish those goals doesn't matter | 17:41.18 |
sebras | may I ask what prompted you to do this? | 17:41.53 |
titanous | so if a separate repo is better that's fine (and easier for me) as long as you can commit to just compiling the fuzzers and running the regression tests as part of your process | 17:41.56 |
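[editor's note] The second goal above, re-running every past find so that fixed bugs do not come back, can be sketched roughly as follows. This is only an illustration, not the oss-fuzz mechanism or anything from the MuPDF tree: the function name `run_corpus`, the corpus directory layout (one saved input per file), and the 10 second timeout are all assumptions. An ordinary non-zero exit is treated as the tool cleanly rejecting a malformed file; only hangs and signal deaths (POSIX) are reported.

```python
import subprocess
import sys
from pathlib import Path


def run_corpus(command, corpus_dir, timeout_s=10.0):
    """Run `command + [input_file]` once for every file in corpus_dir.

    Returns a list of (file_name, reason) for inputs that hang or that
    kill the process with a signal. `command` is a list of arguments;
    both it and the corpus layout are assumptions of this sketch.
    """
    failures = []
    for path in sorted(Path(corpus_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            proc = subprocess.run(
                list(command) + [str(path)],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising, so no cleanup needed.
            failures.append((path.name, "timeout"))
            continue
        # On POSIX a negative return code means "terminated by signal N".
        if proc.returncode < 0:
            failures.append((path.name, f"signal {-proc.returncode}"))
    return failures
```

A CI job could call something like `run_corpus(["mutool", "draw"], "corpus/")` (assuming `mutool` is on the PATH and past finds are stored under a hypothetical `corpus/` directory) and fail the build when the returned list is non-empty.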
| the new version of Ruby on Rails is going to use mupdf for generating PDF previews | 17:42.30 |
| and I find memory-unsafe code extremely scary especially when it processes untrusted input | 17:42.47 |
| so this is just my attempt to stop the bleeding a bit | 17:42.58 |
sebras | oh, I see. just to be clear, everyone knows that MuPDF is licensed under AGPLv3, and what that implies, right? | 17:46.46 |
titanous | I don't work with Rails at all, I just have indirect dependencies on it | 17:48.13 |
| they are calling mutool | 17:48.22 |
| so I don't think AGPL matters | 17:48.27 |
| as there is no direct linkage | 17:48.44 |
sebras | ok, I see. I wasn't aware of how it interfaced with mupdf. :) | 17:49.26 |
| it is refreshing to hear that people actually take licensing into consideration. :) | 17:49.50 |
titanous | well, I do :) but as I said, I don't know anything about what's going on with Rails other than that they are using this in their new beta | 17:50.22 |
| and it's not sandboxed by default | 17:50.40 |
sebras | titanous: right, but depending on your level of paranoia there might be ways to escape the sandbox due to bugs in the sandbox. oh well, I appreciate your starting to set up mupdf for oss-fuzz anyway. | 17:52.50 |
titanous | oh for sure, though seccomp-bpf would mitigate most breakout concerns pretty well at this point as the number of syscalls required is very small and well-defined | 17:54.12 |