Log of #ghostscript at irc.freenode.net.

20161223
sebras tor8: there are two commits on sebras/master, do they make sense to you?12:12.23 
tor8 sebras: I'd rather fix the segfault to allow for NULL resource dicts13:05.48 
  resource dictionaries in pages are optional, after all13:06.00 
sebras tor8: notice that this is NULL, not null.13:07.43 
tor8 should not matter. pdf_dict_get(page_obj, "Resources") will return NULL not null.13:08.38 
  ...if the resource dictionary is missing13:09.04 
  resource dictionaries are not required to be indirect objects either, as you state in the commit comment.13:09.37 
  they can be inline dictionaries13:09.43 
  and that is what you create in your patch as well...13:10.00 
sebras in the file that was originally created, we create an indirect reference to an object that does not exist.13:11.42 
  this causes pdf_load_page to call pdf_resource_use_blending which eventually tries to resolve the object.13:12.16 
tor8 sebras: I'm confused. We don't create a file in make_fake_doc.13:12.25 
sebras this cannot be done because there is no file backing it so fz_seek() segfaults in the end.13:12.29 
tor8 we pass NULL to pdf_add_page13:12.42 
sebras tor8: exactly, so NULL is stored in the xref.13:13.09 
tor8 huh, why are we trying to seek in a newly created document?!13:13.19 
sebras xref entry->obj13:13.20 
tor8 ah, pdf_add_page creates an indirect object for the resources dictionary13:13.48 
sebras let me switch irc computer...13:13.49 
tor8 if (resources) pdf_dict_put_drop(...Resources) then?13:13.58 
  same with Contents, they are also optional13:14.03 
sebras that might work.13:15.16 
  since this is an in-memory-only object we cannot afford to seek in the non-existing stream.13:15.43 
tor8 hm. reading the spec more closely, it says that if there are no resources we should create an empty dictionary (or else it will go looking for inherited resource dictionaries in the page tree)13:16.34 
sebras tor8: right, so those if-statements in pdf_add_page() also fix it.13:16.35 
tor8 sebras: I think the core issue here is calling pdf_add_object with NULL13:16.51 
  but crashing on such errors is also bad13:17.24 
sebras tor8: maybe pdf_add_page() should be able to handle a NULL resource argument and create the empty dict in that case?13:17.43 
tor8 yeah. I think that would be best.13:18.35 
  and it should possibly avoid creating a new indirect object as well13:18.55 
sebras OTOH, if we do that you cannot add pages which you want to inherit the resource dict from the page tree..13:19.16 
tor8 no. we don't want to get involved in that insanity :)13:19.44 
  that's just a brain dead bit of spec which we shall avoid at all costs when creating our own files.13:20.17 
sebras this is actually what the file I'm using for testing happens to use. :)13:20.37 
tor8 we don't allow access to the page tree so there's no way for anyone to even do that13:20.41 
  using the pdf_add_page and pdf_insert_page apis13:20.51 
sebras right, then the path to resolving this is clear.13:21.02 
tor8 yes. the only question is whether to put the resource dictionary inline or create a new numbered object for it.13:21.42 
  most files tend to use numbered objects for it13:22.19 
sebras inline if it is empty and indirect if it is provided?13:22.23 
tor8 and that will let our garbage collection dupe detection coalesce them13:22.34 
  yeah, inline if it is empty and indirect if it is provided and not already indirect is probably best.13:22.51 
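The rule agreed on above (inline if empty, indirect if provided and not already indirect) can be sketched as a small decision function. This is a minimal sketch on a toy object model, not the MuPDF implementation: `toy_obj`, `toy_doc` and `toy_page_resources` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for PDF objects; these are NOT the MuPDF types. */
enum toy_kind { TOY_DICT, TOY_REF };

struct toy_obj {
    enum toy_kind kind;
    int num;     /* object number, meaningful when kind == TOY_REF */
    int entries; /* number of dictionary entries behind this object */
};

struct toy_doc {
    int next_num; /* next unused object number */
};

/* Decide what a new page should store under /Resources. */
struct toy_obj toy_page_resources(struct toy_doc *doc, const struct toy_obj *res)
{
    struct toy_obj out;

    assert(doc != NULL);

    if (res == NULL) {
        /* Nothing provided: store an inline empty dictionary so the page
         * does not go looking for inherited resources in the page tree. */
        out.kind = TOY_DICT;
        out.num = 0;
        out.entries = 0;
        return out;
    }

    if (res->kind == TOY_REF)
        return *res; /* already indirect: reuse the reference as-is */

    /* Provided as a direct dictionary: give it a new numbered object. */
    out.kind = TOY_REF;
    out.num = doc->next_num++;
    out.entries = res->entries;
    return out;
}
```

Passing `NULL` yields an inline empty dictionary and allocates no object number; passing a direct dictionary consumes exactly one new object number.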
sebras so the issue causing all of this is that there is no stream backing the fake document.13:25.05 
  this means that having the resource dict be an indirect reference to a non-existing object in a normal file ought to work fine.13:25.35 
tor8 yeah. there is normally no way to create a new document and have a bogus indirect reference that will try to load the underlying file.13:25.49 
sebras no, but you could have a file we parse that says /Resources 99 0 R but not have 99 0 obj in there, and in that case we'll error out on "object out of range" which results e.g. in "cannot find Font dictionary".13:27.17 
tor8 if it's a non-existent object, we'll throw exceptions based on it being out of range of xref->len and not try to load13:27.27 
  if it's created with pdf_create_object it'll be a 'f' free entry and thus not loaded13:27.42 
  and if it's updated with pdf_update_object then it'll have a proper object13:27.54 
  *but* if we call pdf_update_object with NULL all hell breaks loose13:28.07 
  so we should catch that right there13:28.12 
  and add a 'null' object13:28.20 
  or just no-op it13:28.24 
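The cases listed above can be modelled with a toy xref table. This is a hedged sketch only: `toy_xref`, `toy_update_object` and `toy_needs_file` are hypothetical names, the fixed table length is arbitrary, and the sketch picks the "store a 'null' object" option rather than the no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_XREF_LEN 8 /* arbitrary size for the sketch */

struct toy_entry {
    char type;       /* 'f' = free entry, 'n' = in use */
    const char *obj; /* in-memory object, or NULL if it would have to be
                      * read from the underlying file */
};

struct toy_xref {
    struct toy_entry entries[TOY_XREF_LEN];
};

/* Start with every entry free, as for a freshly created object. */
void toy_xref_init(struct toy_xref *xref)
{
    int i;
    assert(xref != NULL);
    for (i = 0; i < TOY_XREF_LEN; i++) {
        xref->entries[i].type = 'f';
        xref->entries[i].obj = NULL;
    }
}

/* Returns 0 on success, -1 if num is out of range. */
int toy_update_object(struct toy_xref *xref, int num, const char *obj)
{
    assert(xref != NULL);
    if (num < 1 || num >= TOY_XREF_LEN)
        return -1;
    if (obj == NULL)
        obj = "null"; /* catch NULL here instead of storing it */
    xref->entries[num].type = 'n';
    xref->entries[num].obj = obj;
    return 0;
}

/* Returns 1 if resolving num would try to read from the file. */
int toy_needs_file(const struct toy_xref *xref, int num)
{
    assert(xref != NULL);
    if (num < 1 || num >= TOY_XREF_LEN)
        return 0; /* out of range: reported as an error, nothing is loaded */
    return xref->entries[num].type == 'n' && xref->entries[num].obj == NULL;
}
```

With the guard in `toy_update_object`, an in-use entry never ends up with a NULL object, so an in-memory document never reaches the "seek in a non-existing stream" path.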
  sebras: try the two top commits on tor/master13:31.16 
sebras that works for the tests I have been doing. mention bug 697183 in the commit message though.13:36.47 
  I'm worried about what will happen if csi->rdb is a null object and we look for a resource other than a font, though.13:37.33 
tor8 sebras: in the page interpretation?13:38.00 
sebras yes.13:38.04 
  do we handle not having a resource dict gracefully everywhere?13:38.37 
  in pdf-interpret.c13:38.43 
tor8 sebras: we'll just get NULL13:39.50 
  see pdf_process_Do where we look for image resources13:40.04 
  xres = dict_get(csi->rdb, "XObject") will return NULL if rdb is NULL13:40.29 
sebras yes, we check xres for NULL.13:40.35 
tor8 and even if it were not checked, we'd eventually end up in pdf_load_image with xobj==NULL where we'd fail on width==013:41.44 
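The NULL-propagating lookup described above can be illustrated with a toy dictionary. This is a sketch under stated assumptions: `toy_dict`, `toy_dict_get` and `toy_lookup_xobject` are hypothetical and only mimic the behaviour discussed, where a lookup on a NULL dictionary returns NULL instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_dict;

struct toy_pair {
    const char *key;
    const struct toy_dict *val;
};

struct toy_dict {
    const struct toy_pair *pairs;
    size_t len;
};

/* Look up key in d; a NULL dictionary simply yields NULL. */
const struct toy_dict *toy_dict_get(const struct toy_dict *d, const char *key)
{
    size_t i;
    assert(key != NULL);
    if (d == NULL)
        return NULL;
    for (i = 0; i < d->len; i++)
        if (strcmp(d->pairs[i].key, key) == 0)
            return d->pairs[i].val;
    return NULL;
}

/* Returns 1 if the named XObject exists, 0 if it is missing at any level. */
int toy_lookup_xobject(const struct toy_dict *rdb, const char *name)
{
    /* rdb may be NULL; xres is then NULL, and so is xobj. */
    const struct toy_dict *xres = toy_dict_get(rdb, "XObject");
    const struct toy_dict *xobj = toy_dict_get(xres, name);
    return xobj != NULL;
}
```

Because every level tolerates NULL, a missing resource dictionary and a missing entry end up on the same "not found" path.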
  sebras: was there somewhere specific you were looking at?13:42.00 
sebras no, I'm looking at all uses of csi->rdb13:42.18 
  to make sure they're all robust.13:42.43 
tor8 resolve_properties is the only one that doesn't have an explicit check13:43.22 
sebras true but then again the caller checks for !cooked.13:43.49 
tor8 which is for the marked content stuff used for optional content groups13:43.53 
  but that can cope with blank and missing entries well enough13:44.05 
  yeah.13:44.28 
sebras load_font_or_hail_mary() is the only one I haven't looked at yet.13:44.45 
tor8 sebras: the other fixes in your make_fake_doc look reasonable13:44.54 
  missing the 's' stroke command on the content stream makes me wonder if this has ever been properly tested :)13:45.17 
sebras yeah, I'll bring your commits over and create separate commits for that.13:45.24 
  I think not.13:45.35 
  also the bogus error handling seems to suggest that no... ;)13:45.50 
tor8 sebras: there is one other place where we create a 'fake' document13:45.51 
  in pdfportfolio.c when creating a new portfolio pdf13:45.59 
sebras right, I'm actually attempting to add portfolio support to the x11 viewer just as an experiment.13:46.22 
tor8 but there we add some actual text and not just a cross13:46.33 
  sebras: okay. as an extra command line argument which portfolio entry to open?13:47.04 
sebras oh, pdf_load_type3_font() checks for rdb being NULL. so I think we handle NULL everywhere in that case.13:47.09 
tor8 sebras: yes, we should. the loading code is pretty robust these days.13:47.31 
  it's the creation code that's had less field testing...13:47.40 
sebras tor8: no, I wanted to generate a fake page with GoToE links. this is why i was looking into the make_fake_doc() code.13:48.11 
tor8 take a look at pdfportfolio.c13:48.31 
sebras I did.13:48.35 
tor8 okay.13:48.39 
sebras as of now, when you press F, you'll get a fake page with the names of the embedded files13:49.03 
  but as of yet no GoToE links.13:49.12 
  we don't support those, but they would be handy in this case.13:49.33 
tor8 sebras: right. we don't support those...13:49.41 
sebras in order to support them fully we'd need to keep a hierarchy of pdf's in memory. :-P13:50.17 
tor8 I think if we allow mupdf to be called with a file:// url argument and crack that to go to the correct file, embedded doc, and page13:50.24 
  as an external URL13:50.34 
sebras yes, I have already implemented an ugly hack to support mupdf-x11 file:///absolute/path/to/document.pdf#page=4213:51.03 
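Pulling the page number out of a URL of the form shown above might look roughly like this. It is a minimal sketch, not the hack sebras mentions: `toy_page_from_url` is a hypothetical helper, and it only understands a fragment that is exactly `#page=N`.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Returns the 1-based page number from a "#page=N" fragment,
 * or 0 if the fragment is absent or malformed. */
int toy_page_from_url(const char *url)
{
    const char *frag;
    const char *digits;
    char *end;
    long n;

    assert(url != NULL);

    frag = strchr(url, '#');
    if (frag == NULL || strncmp(frag + 1, "page=", 5) != 0)
        return 0;

    digits = frag + 6; /* skip "#page=" */
    if (*digits < '0' || *digits > '9')
        return 0; /* reject signs, spaces and empty values */

    n = strtol(digits, &end, 10);
    if (*end != '\0' || n < 1 || n > INT_MAX)
        return 0; /* trailing junk, page 0, or out of range */

    return (int)n;
}
```

For `file:///absolute/path/to/document.pdf#page=42` the helper returns 42; a URL without a fragment returns 0, which a caller could treat as "open at the first page".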
tor8 sebras: the other alternative is to have a pdf portfolio container fz_document which just puts all the portfolio entries in sequence13:51.10 
sebras along with an initial fake page used as index?13:52.15 
  if we want.13:52.19 
tor8 yeah.13:52.49 
sebras tor8: so pdfportfolio leaks a lot where the fake document is created, in case of error.13:53.07 
tor8 BUT ... pdf portfolios are such a bad idea I don't fancy the idea of wasting effort supporting them.13:53.15 
  sebras: yeah, but it's a command line tool so we don't really care.13:53.28 
  but if you want to clean it up, go ahead! :)13:53.51 
sebras the resource dict is being fully populated though so there are no issues there.13:54.23 
tor8 if we get an error, we'll end up calling abort() with uncaught exception13:54.27 
sebras yeah, I see that.13:54.39 
  tor8: I have another thing pending where I think the gif decoder needs to enable old_tiff LZW-decoding.14:05.59 
  tor8: but I don't have access to my entire gif suite now so I'm not sure if there will be regressions.14:06.21 
tor8 sebras: https://helpx.adobe.com/acrobat/kb/link-html-pdf-page-acrobat.html for specific pdf page numbers and named destinations14:11.26 
  but I can't find an equivalent syntax to go to an embedded portfolio file14:11.36 
sebras yeah I found that link in git log and read it.14:12.16 
tor8 https://forums.adobe.com/thread/2111745 looks like the answer is "no can do"14:14.07 
  but that doesn't prevent us inventing our own syntax, like #subfile=N,page=M14:14.51 
  or #portfolio=N to be clearer14:15.21 
sebras is N the name or a number though?14:23.00 
  tor8: the commits on sebras/master clustered fine, final LGTM?14:23.40 
serge111 Thanks for the guidance Robin_Watts, I'll fiddle with MuPDF and see what I can do. 14:34.40 
karl_ any mupdf developer here ?22:02.41 