| 20180806 |
kens | sebras: without seeing your PDF file it's hard to comment, but why are you even bothering with parsing this? It sounds to me like it's a fundamentally broken PDF file. | 06:53.29 |
sebras | kens: it's a fuzzed file that we should handle somehow without crashing. so the question is more along the lines of: which one is the most fault tolerant way of parsing hex strings. | 10:36.22 |
| I'm not yet convinced the current approach is the best one. | 10:36.41 |
kens | Not crashing is fair enough, but an error seems like the right thing to me | 10:36.47 |
sebras | kens: we crashed at a later stage because these strange numbers were out of bounds and we didn't handle that. | 10:37.43 |
kens | Yeah I'd be inclined to error out when you encounter the bad hex in the CMap | 10:38.26 |
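The two policies discussed above (error out on bad hex versus tolerate it) can be sketched as follows. This is a minimal illustration and not MuPDF's or Ghostscript's actual code: the function name `parse_hex_string` and the `strict` flag are hypothetical. The whitespace and odd-digit handling follow the PDF specification's rules for hex strings; skipping invalid characters in lenient mode is just one possible recovery policy.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
PDF_WHITESPACE = "\x00\t\n\x0c\r "


def parse_hex_string(body: str, strict: bool = True) -> bytes:
    """Decode the text found between '<' and '>' of a PDF hex string.

    Hypothetical helper: strict=True raises on a bad character,
    strict=False skips it and keeps going.
    """
    digits = []
    for ch in body:
        if ch in PDF_WHITESPACE:
            continue  # whitespace inside a hex string is ignored
        if ch in HEX_DIGITS:
            digits.append(ch)
        elif strict:
            raise ValueError(f"invalid character {ch!r} in hex string")
        # lenient mode: drop the bad character (an assumed policy)
    if len(digits) % 2:
        digits.append("0")  # a missing final digit is treated as 0
    return bytes.fromhex("".join(digits))
```

With `strict=True` a fuzzed CMap is rejected early; with `strict=False` parsing continues, so the resulting values still need bounds checks afterwards.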
sebras | kens: there's a begincodespacerange/endcodespacerange and so I decided to have the ranges inside beginbfrange/endbfrange be limited to the codespace ranges in begincodespacerange. | 10:39.31 |
kens | What does GS do with the file ? | 10:39.32 |
sebras | I believe this is the right thing to do. | 10:39.39 |
sebras | is still looking for the file. | 10:39.48 |
kens | Yes that seems reasonable | 10:39.48 |
| But I'd throw an error if they aren't | 10:39.57 |
| Unless you can show that Acrobat does 'something else' | 10:40.10 |
sebras | kens: I decided to just ignore that particular range. | 10:40.42 |
kens | What then happens if a text string uses the CID in that range? | 10:41.18 |
sebras | _if_ it does I think it should fail, but it might not. | 10:42.17 |
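The approach described above, dropping any bfrange entry that does not fit inside a declared codespace range, might look roughly like this. It is a sketch under stated assumptions, not the actual MuPDF fix: `CodeRange`, `in_codespace` and `filter_bfranges` are hypothetical names, and codes are compared as whole integers for brevity, whereas multi-byte codespace ranges are normally matched byte by byte.

```python
from typing import List, NamedTuple, Tuple


class CodeRange(NamedTuple):
    nbytes: int  # number of bytes in the code (1 to 4)
    lo: int      # first code in the range
    hi: int      # last code in the range


def in_codespace(rng: CodeRange, codespaces: List[CodeRange]) -> bool:
    """True if rng is well-formed and lies inside one codespace range."""
    return any(
        cs.nbytes == rng.nbytes and cs.lo <= rng.lo <= rng.hi <= cs.hi
        for cs in codespaces
    )


def filter_bfranges(
    bfranges: List[CodeRange], codespaces: List[CodeRange]
) -> Tuple[List[CodeRange], List[CodeRange]]:
    """Split bfrange entries into (kept, dropped) lists."""
    kept, dropped = [], []
    for rng in bfranges:
        (kept if in_codespace(rng, codespaces) else dropped).append(rng)
    return kept, dropped
```

Returning the dropped entries separately leaves the caller free to either warn and continue or raise an error, which is the open question in the exchange above.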
kens | Well obviously I know nothing about MuPDF, I'd like to think you'd get an error, but it's probably worth checking | 10:42.45 |
sebras | kens: the file is 9374*.pdf in my home directory. it comes from oss-fuzz. | 10:43.00 |
kens | OK let me look | 10:43.12 |
| My that's a long name..... | 10:44.21 |
sebras | kens: gs prints a warning but doesn't fail, the page is blank. I think that's fine. | 10:44.24 |
kens | Seems reasonable | 10:44.35 |
sebras | kens: the first part is the oss-fuzz bug number, the second part is the mupdf SHA1 where it was first reproduced according to oss-fuzz. | 10:44.54 |
kens | Hmm current GS code renders some text | 10:45.40 |
| "Issue " then a load of text all on top of itself | 10:45.56 |
| along with a warning about the font Widths array being smaller than the character range | 10:46.19 |
sebras | I just run gs 9374*.pdf and get a blank page. | 10:46.34 |
kens | as well as complaining about the xref | 10:46.35 |
| What version of GS? | 10:46.53 |
| Because that's all I'm doing | 10:47.02 |
sebras | updates. | 10:47.17 |
kens | I get the same with the 9.23 release code | 10:47.29 |
sebras | I was on 1c12d01a2 | 10:47.36 |
kens | I've not found a good way to figure out what the SHA relates to in the logs, other than checking it out.... | 10:48.14 |
| But I'm getting text out, though GS does complain about the file | 10:48.30 |
sebras | there's a 9374.png with the rendering that I get out from mupdf after my fixes. | 10:48.32 |
| I still get nothing from gs... | 10:48.52 |
| kens: nothing rendered, just a blank page. the warnings you've mentioned I also get. | 10:49.14 |
kens | The size of the MuPDF png matches GS but not the text I see | 10:49.23 |
| Acrobat refuses even to open it :-) | 10:49.48 |
| So it's not important anyway | 10:49.53 |
sebras | kens: ok. I haven't even tried that. at first I just wanted mupdf not to crash. | 10:51.07 |
kens | Yep, not crashing is important :) | 10:51.34 |
sebras | kens: then I took a look at object 1 and saw where the issue stemmed from. | 10:52.25 |
Hufokus | Hey! How do I run_page with a specific dpi? | 21:06.46 |
sebras | Hufokus: see fz_transform_page() as called from transform_page() in platform/gl/gl-main.c | 21:12.19 |
| the second argument is expressed in dpi, and you can then use the resulting matrix to transform your page (and even get rotation if you like). | 21:13.10 |
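The idea behind turning a dpi value into a page transform can be sketched without calling MuPDF at all. PDF user space is 72 units per inch, so the scale factor is dpi / 72. The helper names `page_matrix`, `transform_point` and `pixel_size` below are hypothetical, and the real fz_transform_page() may additionally adjust translation or rounding, which this sketch does not attempt to reproduce.

```python
import math


def page_matrix(dpi: float, rotate_degrees: float = 0.0):
    """Return a PDF-style matrix (a, b, c, d, e, f) that scales by dpi/72
    and rotates by rotate_degrees. Translation is left at zero."""
    scale = dpi / 72.0  # 72 PDF user-space units per inch
    theta = math.radians(rotate_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (scale * cos_t, scale * sin_t, -scale * sin_t, scale * cos_t, 0.0, 0.0)


def transform_point(m, x: float, y: float):
    """Apply matrix m to the point (x, y)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def pixel_size(mediabox, dpi: float, rotate_degrees: float = 0.0):
    """Width and height in pixels of a (x0, y0, x1, y1) mediabox at dpi."""
    x0, y0, x1, y1 = mediabox
    m = page_matrix(dpi, rotate_degrees)
    corners = [transform_point(m, x, y) for x in (x0, x1) for y in (y0, y1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (round(max(xs) - min(xs)), round(max(ys) - min(ys)))
```

For a US Letter mediabox of (0, 0, 612, 792), 144 dpi doubles both dimensions, and a 90 degree rotation swaps width and height.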
sebras | sleeps. | 21:13.28 |