| <<<Back 1 day (to 2016/09/04) | 20160905 |
sebras | tor8 (for the logs): I have a commit on top of sebras/wip that may cure the issues that I have been seeing with handling luminosity softmasks with transparency groups whose colorspace has gone missing. basically we skip them. I put a few test pdfs on casper. | 02:01.55 |
| I'm quite unsure about if this is a good approach though. is it better to assume DeviceGray or similar? I guess this is more fault tolerant. | 02:02.25 |
| also I'm in deep water because I only have a very rough understanding of what softmasks are. :) | 02:02.47 |
| i.e.... help? :) | 02:02.54 |
Robin_Watts | sebras: morning. | 09:41.43 |
tor8 | sebras: grayscale would be a perfect colorspace to assume when a Luminosity softmask doesn't list one | 10:57.06 |
| because after rendering a luminosity softmask, we convert it to grayscale and then use the grayscale values as alpha | 10:57.42 |
Robin_Watts | tor8: It would be good to understand how we get into that state though. | 11:17.42 |
| We should never get into the state of not having a colorspace, unless maybe we're in the process of clearing up from an earlier failed error. | 11:18.16 |
| And if we're in the process of clearing up from such an error, then skipping rendering may be exactly the right thing to do (to ensure we fail as fast as possible). | 11:18.42 |
chrisl | Robin_Watts: Images don't inherit the color space | 11:23.30 |
Robin_Watts | chrisl: This is not about PDF images. | 11:29.17 |
| This is about the internals of the mupdf draw device state. | 11:29.33 |
chrisl | Ah, I'd assumed it was about handling broken files... sorry | 11:30.24 |
tor8 | Robin_Watts: actually, I think sebras was talking about the pdf interpreter state when the PDF does not have a colorspace entry | 11:36.50 |
Robin_Watts | I think broken files are the issue here. | 11:36.57 |
tor8 | but I could be mistaken. it's a monday... | 11:37.05 |
Robin_Watts | tor8: Yeah, me too, I'm still trying to get up to speed after 2 weeks with no Internet... | 11:37.37 |
| I stood by a satellite dish to try and absorb some IP packets about half way through the holiday to stave off the DTs. | 11:38.15 |
tor8 | Robin_Watts: hehe. welcome back! | 11:41.34 |
HenryStiles | US holiday today, but I'll check email a couple times today for support stuff... | 13:28.05 |
Robin_Watts | HenryStiles: Anything vital come in for me while I was away? | 13:32.06 |
| I didn't see anything when I ran through the email, but I could easily have missed stuff. | 13:32.24 |
HenryStiles | Robin_Watts: no I don't think so. | 13:43.28 |
Robin_Watts | Fab. | 13:43.53 |
| tor8: I've just found a bug in mutool clean. | 14:25.25 |
| I've got a 1bpc mono image that's 2x1 in size. | 14:25.48 |
| The object has the data flate compressed, so it's 6 bytes long. | 14:26.09 |
| /Length is 1 (the uncompressed length of the stream) | 14:26.29 |
tor8 | hm, shouldn't /Length be the compressed length of the stream? | 14:27.30 |
Robin_Watts | pdf_load_raw_stream reads /Length as 1, and calls pdf_open_raw_stream to ignore the filtering, and then calls fz_read_all | 14:27.32 |
| Yeah, you might be right. | 14:28.51 |
| which means the problem is elsewhere in mutool clean. | 14:29.02 |
tor8 | is the input file correct (i.e. have /Length 6)? | 14:29.26 |
Robin_Watts | I've run the file through twice. | 14:29.42 |
| The ORIGINAL file has /Type/XObject/Subtype/Image/Width 2/Height 1/ColorSpace/DeviceGray/BitsPerComponent 1/Length 1 | 14:29.59 |
| and just 1 uncompressed byte of data. | 14:30.07 |
tor8 | oh, so when we compress it we're saving out the wrong Length | 14:30.12 |
| rats | 14:30.16 |
Robin_Watts | After it's been through pdfclean -difggg, we compress it using flate, but write a length of 1. | 14:30.29 |
tor8 | agh, I think I see where the problem could be | 14:31.57 |
| in copystream, we have the code that updates the Length inside if (do_ascii), unlike in expandstream where we always do it | 14:32.26 |
| that looks like a copy/paste or merge error | 14:32.34 |
| last commit on tor/master | 14:33.42 |
Robin_Watts | tor8: Would it be possible to only compress if it actually gets smaller ? | 14:34.41 |
| yeah, just a bit of fiddling after the deflatebuf call, right? | 14:35.24 |
tor8 | should be a trivial fix in copystream | 14:35.25 |
| yeah, if the resulting buffer is bigger than the original, don't add the FlateDecode filter | 14:36.15 |
| I am a bit unsure how robust this copystream stuff is though | 14:37.11 |
| oh wait, no, if it already has a filter we assume it is already compressed so don't do anything to it. | 14:37.43 |
| so my worry is unfounded | 14:37.49 |
| Robin_Watts: don't forget the same test in expandstream if you do add it | 14:38.54 |
| expandstream decompresses before recompressing | 14:39.14 |
| whereas copystream only compresses | 14:39.23 |
Robin_Watts | expandstream doesn't recompress? | 14:39.26 |
| oh, it does. | 14:39.52 |
| ok. | 14:39.54 |
tor8 | expandstream recompresses if asked to, so there we should also test for whether compression benefits if we add the test to copystream | 14:39.54 |
Robin_Watts | tor8: agreed. Have done here. | 14:40.52 |
| I'll roll your fix into mine here, and put a commit up for review, OK? | 14:41.22 |
tor8 | okay. | 14:42.56 |
Robin_Watts | http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=796be86b078c59bed0d67c4322072317a9fafb06 | 14:44.39 |
| Urgh. Now I'm getting "Not a dict (<NULL>)". I thought I saw a fix for that whizz past ? | 14:48.20 |
tor8 | I fixed at least one, fix should be on master | 14:49.01 |
| if there are others, that's code that's been behaving badly all along. | 14:49.21 |
| fix LGTM | 14:50.16 |
Robin_Watts | http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=c648955c79231d750dc43a0514f3612b41e35294 | 14:52.45 |
| There are 2 more commits on robin/master to do with type3 fonts that date from before I went away. | 14:53.11 |
| I thought they had been committed, but evidently not... maybe I was waiting for you to have a chance to comment? | 14:53.46 |
tor8 | in the new fz_bound_ft_glyph, you set bounds = &font->bbox_table[gid] | 14:54.46 |
| not all fonts have a bbox_table | 14:54.50 |
Robin_Watts | tor8: Which is very odd, cos I never use bounds. | 14:56.36 |
| I shall remove that line :) | 14:57.09 |
tor8 | I wouldn't bother keeping SVG_SEND_REPEATED_IMAGES | 14:57.14 |
| Robin_Watts: um, bounds is used on line 942 | 14:57.40 |
Robin_Watts | yeah, I can't read diffs. | 14:58.17 |
tor8 | the commit removes the fz_rect *bounds input argument, presumably thinking it can always use the bbox_table instead | 14:58.26 |
Robin_Watts | yeah, I'll rework that. | 14:58.34 |
tor8 | but for CJK fonts (or fonts with >LIMIT number of glyphs) we don't create a bbox_table | 14:58.43 |
Robin_Watts | yeah, I see the problem now. | 14:59.01 |
| tor8: So, the SVG_SEND_REPEATED_IMAGES thing... | 14:59.14 |
tor8 | you mention browsers crashing? | 14:59.44 |
| do <symbol> tags need to live inside a <def>? | 14:59.58 |
| my SVG is rusty | 15:00.00 |
Robin_Watts | The problem is that some consumers of SVG files (firefox and edge for example) die horribly when given the new style files (with a significant amount of reuse in them). | 15:00.35 |
| Mine was never shiny. | 15:00.50 |
tor8 | IIRC it's strongly recommended that <symbol> lives inside of a <defs>, but it's not required by the spec | 15:03.03 |
Robin_Watts | Did you nod at the outlines fix? | 15:03.53 |
tor8 | the outlines fix LGTM | 15:06.07 |
Robin_Watts | Thanks. | 15:07.53 |
| oh, and I have some comment fixes from holiday... let me do those now before I forget them. | 15:08.39 |
sebras | Hi ghostbost! | 15:13.42 |
| Hi Robin! | 15:13.47 |
| Hi tor8! | 15:13.49 |
Robin_Watts | wonders if sebras needs to cut down on the caffiene :) | 15:14.31 |
sebras | Robin_Watts: the file I was looking at has a content stream that sets an extgstate that specifies a softmask. | 15:14.47 |
| this softmask is luminosity and uses a proper transparency group xobject. | 15:15.04 |
| next the content stream attempts to draw an xobject. | 15:15.16 |
| err... I was wrong the transparency group xobject is broken in that is doesn't have a colorspace set. the softmask xobject. | 15:17.16 |
| the normal form xobject that is drawn _does_ however have a colorspace. | 15:17.38 |
| so what happens is that when processing the form xobject everything is going well up until we decide to call begin_softmask and set the softmask pushed in the gstate. | 15:19.43 |
| because then we attempt to query the softmask xobject what colorspace to use (at least we do if it is luminosity, for alpha I don't know). | 15:20.12 |
| and this is what causes us to end up in fz_draw_being_mask() with cs == NULL. | 15:20.27 |
| (because there is no cs specified in the softmask) | 15:20.38 |
| Robin_Watts: does that make sense? | 15:20.41 |
Robin_Watts | That does, I think. | 15:30.40 |
| I need to look at the spec to see if transparency group xobjects are required to have a colorspace. | 15:31.02 |
sebras | Robin_Watts: so the problem is that softmasks _must_ have a colorspace, but this particular one does not. | 15:31.04 |
| Robin_Watts: pdfref17.pdf page 557 | 15:31.17 |
| Robin_Watts: for luminosity they are required. | 15:31.30 |
| for alpha I don't know (probalby not?) | 15:31.38 |
Robin_Watts | So the case you're seeing is a transparency group that is that /G entry in the softmask dictionary of subtype Luminosity ? | 15:33.27 |
sebras | Robin_Watts: yes, and the transparency group is missing a colorspace. | 15:34.39 |
Robin_Watts | That would indeed appear to be required. | 15:34.50 |
| Assuming grey there in the luminosity case would seem reasonable then. | 15:35.09 |
sebras | right, because if I assume gray then I don't have to throw and attempt to handle it. | 15:35.36 |
Robin_Watts | it might be nicer to try and 'correct' the transparency group at load time, rather than to try and catch it at every point during rendering where it isn't defined. | 15:35.58 |
sebras | Robin_Watts: we don't normally do that sort of thing though..? | 15:37.12 |
| Robin_Watts: are you worried that this softmask would be used so often that the workaround if-statement would be expensive? :) | 15:37.57 |
Robin_Watts | sebras: No, it's not a speed thing. | 15:39.02 |
| sebras: I'd really like for the draw device entry points to be able to assert that they have the required colorspace. | 15:39.29 |
| and for us to fix it in one place. | 15:39.39 |
| That way if the asserts start tripping because of another issue, we haven't accidently masked it in our attempts to cope with this case. | 15:40.13 |
sebras | Robin_Watts: so... that means that you would advice against detecting a null colorspace in being_softmask() and replacing it with devicegray? (that function is i pdf-op-run.c) | 15:42.00 |
Robin_Watts | sebras: No, I think doing it in begin_softmask is probably the right place. | 15:45.52 |
sebras | Robin_Watts: but the idea is to permanantly aalter the xobject? | 15:46.17 |
Robin_Watts | Cos then all the following draw device calls will get called with a proper colorspace. | 15:46.19 |
sebras | should that xobject be used again in the future? | 15:46.31 |
Robin_Watts | sebras: Urm... does the xobject ever get used again? I thought we only looked at the pixmap colorspace after that call? | 15:47.17 |
| The point is to ensure we create the pixmap with a correct colorspace rather than with a NULL one, and then all the other calls should work OK. | 15:47.45 |
sebras | Robin_Watts: well, the reason we trip up the code right now is because we are calling Do on _another_ xobject. | 15:49.23 |
| what if I had two such operators in the stream? in that case we'd need to call begin_softmask() twice, no..? | 15:49.50 |
Robin_Watts | and the code for that is looking at pixmap->cs right? | 15:49.52 |
sebras | Robin_Watts: yes, if I have 5 x Do in the content stream then begin_softmask is called 5 times. | 15:51.47 |
| Robin_Watts: so if I just do an if there, then we have to execute that code 5 times once per call to being_softmask(). and I thought that this is what you wanted to avoid? | 15:52.16 |
| Robin_Watts: I'm not sure I understand what pixmap you are refering to. | 15:53.05 |
Robin_Watts | sebras: no. I'm clearly not being clear. Let me look at the code. | 15:54.42 |
sebras | Robin_Watts: also I'm not sure I understand this area very well at all. source. shape. mask... I remember having understood this once! :) | 15:55.20 |
Robin_Watts | sebras: Right, so in begin_softmask the problem is the mask_softmask is NULL? | 15:56.26 |
sebras | Robin_Watts: correct. | 15:56.35 |
Robin_Watts | and that results in fz_begin_mask being called with NULL, which results in the device begin_mask entrypoint being called with NULL. | 15:57.14 |
| I would prefer us to fix mask_softwask to be non-NULL in begin_softmask. | 15:57.35 |
sebras | Robin_Watts: right. so now I added if (gstate->luminosity && mask_colorspace == NULL) fz_throw()... | 15:57.48 |
Robin_Watts | I think throwing is harsh. | 15:58.20 |
sebras | Robin_Watts: would it be sufficient to just set mask_colorspace = fz_device_gray()? | 15:58.33 |
Robin_Watts | I'd just do mask_colorspace = fz_device_gray(); | 15:58.33 |
| Yeah. | 15:58.39 |
sebras | ok, I understood that you wanted me to manipulate the xobject itself and _add_ /Colorspace /DeviceGray to it. | 15:59.11 |
Robin_Watts | When we get down into fz_draw_begin_mask, we create a pixmap to draw onto, and that goes onto the draw stack. | 15:59.18 |
| What I didn't want to happen was for that pixmap to be created without a colorspace, and for us to have to have every subsequent draw device method have to cope with that. | 15:59.53 |
| This way we've kept the fix purely to one place. | 16:00.13 |
sebras | oh, the pixmap is in fz_draw_begin_mask() I see. | 16:00.47 |
Robin_Watts | sebras: Yeah, sorry, should have been clearer. | 16:01.02 |
sebras | Robin_Watts: hm... but then this connects to the BC field in the softmask too. | 16:02.24 |
| because the number of components in the colorspace relates to the number of entries in the BC array. | 16:02.50 |
Robin_Watts | pdf_process_extgstate assumes colorspace_n = 1 if no colorspace, so that's fine. | 16:04.52 |
| so does pdf_run_gs_SMask | 16:05.16 |
sebras | Robin_Watts: yes, and I'm tracing the calls to run_gs_SMask right now. | 16:05.47 |
Robin_Watts | possibly it would be neater to pull the softmask_bc handling into begin_softmask too. Dunno if that's possible or not offhand. | 16:07.33 |
sebras | Robin_Watts: I can toy with that in a separate patch. :) | 16:08.10 |
| Robin_Watts: ok... so I added an abort() in fz_draw_begin_mask() if colorspace == NULL. | 16:26.29 |
| Robin_Watts: after assumeing fz_device_gray() the cluster no longer segfaults. | 16:26.48 |
Robin_Watts | Fab. | 16:26.59 |
sebras | Robin_Watts: and the files look rasonable. | 16:27.24 |
| tor8: two patches for review at sebras/master. I don't think they'll pass the literary test though. | 16:29.59 |
sebras | sleeps. | 16:44.42 |
| Forward 1 day (to 2016/09/06)>>> | |