Log of #mupdf at irc.freenode.net.

20201105 
ator sebras: how far did you get in reviewing tor/master?13:05.30 
sam_ Robin_Watts: looks like you got there? :)13:05.55 
  Sorry, I conked out last night.13:06.01 
sebras ator: Re "Remove support for Luratech JBIG2 and JPEG2000 decoders." I asked, should we go first before gs?13:06.19 
ator sebras: ah, I missed that q13:06.34 
Robin_Watts sam_: No problem. Touch wood, it appears to be working. Thanks for the suggestion.13:06.36 
sebras ator: we could, but what's the plan for gs?13:06.48 
  ator: LGTM "mutool run: Support image mask when creating images from file names."13:06.54 
sam_ Robin_Watts: no problem, let me know if I can help further13:07.04 
Robin_Watts sebras: I believe we didn't supply luratech this time in the commercial release, but the support code is there.13:07.31 
ator maybe we should get HenryStiles to weigh in on that?13:07.37 
Robin_Watts The plan is to remove the support code either next release, or the one after.13:07.43 
sebras ator: Re "Add 'embedded' option to JBIG2 compressed buffers." I got stuck because I got confused about what this change is actually affecting since I constantly confuse the entire image api. :)P13:07.55 
  Robin_Watts: I know that, but what is the plan for removing the support code?13:08.10 
  ator: so we're adding support for the embedded option in the jbig2 stream API, i.e. source/fitz/filter-jbig2.c but not in the file API in source/fitz/load-jbig2.c13:10.23 
  and then in the next commit "Only use jbig2 subimage loading for multi-page images." you change muimg to use source/fitz/load-jbig2.c for multipage files and source/fitz/filter-jbig2.c for single page files..?13:11.17 
Robin_Watts sebras: Those two messages just crossed, right?13:11.21 
sebras Robin_Watts: yes, I didn't see yours. sorry.13:11.46 
Robin_Watts no worries. :)13:12.00 
ator sebras: yes. the first change is so that we can use fz_new_image_from_buffer for non-embedded non-PDF jbig2 images too13:13.21 
  sebras: the second side-steps the "load everything as a pixmap" behavior of multi-page images that mucbz has special cased code for13:13.44 
  so that when we open single-page jbig2 images, we use the compressed_buffer code for them, instead of always loading via a fz_pixmap13:14.34 
sebras ator: I need more hand holding. so when the PDF-stuff normally decodes a jpeg stream it ends up in pdf_load_image_imp()13:28.48 
  this then calls pdf_load_compressed_stream() that decodes the stream into a buffer. this decoded buffer is then passed to fz_new_image_from_compressed()?13:29.32 
  let's skip inline images right now.13:29.45 
  hm... maybe that's not possible because what I just said is true if cstm == NULL.13:31.04 
ator pdf_load_image -> ... -> fz_new_image_from_compressed_buffer13:31.31 
sebras if cstm != NULL then we first call fz_new_image_from_compressed_buffer() and _then_ call pdf_load_compressed_inline_image().13:31.42 
ator and pdf_load_compressed_stream, which does the actual work of creating the compressed buffer13:32.07 
  pdf_load_image_imp calls pdf_load_compressed_stream, and then fz_new_image_from_compressed_buffer wraps the compressed_buffer into a fz_image13:32.50 
sebras ah, shit. there's fz_compressed_buffer too. I expected fz_buffer.13:33.17 
  and was confused as to why we decoded it.13:33.32 
ator pdf_load_compressed_stream -> pdf_load_image_stream -> pdf_open_image_stream -> pdf_open_filter -> build_filter -> fz_open_image_decomp_stream -> fz_open_jbig2d13:34.19 
sebras right, and what is the corresponding path for muimg?13:34.43 
ator a fz_image comes in several flavors. one is backed with compressed_buffer, another is backed with fz_pixmap.13:34.53 
sebras after your patch for single page and multi page jbig2 files.13:34.58 
ator fz_new_image_from_buffer -> fz_load_jbig2_info + malloc + fz_new_image_from_compressed_buffer13:36.14 
  however, in the PDF case, since we have a compressed_buffer, build_filter short-stops when it gets to the jbig2 filter (since it knows that the compressed_buffer will take over from there)13:38.24 
  sebras: so there are 3 paths for us to read jbig2 data13:39.09 
sebras mm, but then compressed_image_get_pixmap -> fz_open_image_decomp_stream_from_buffer + fz_decomp_image_from_stream ->? argghghghghghg.13:39.19 
ator 1) by just opening the PDF stream13:39.20 
  2) by wrapping the PDF stream in a compressed_buffer stuffed in a fz_image13:39.33 
  3) by using the multi-page fz_load_jbig2_subimage13:40.06 
sebras when are 1) and 2) used?13:40.20 
  in what cases.13:40.32 
  this is _utterly_ confusing to me.13:40.40 
ator mutool show uses (1)13:40.43 
  to just decode the stream, without knowing or caring that it's an image13:40.54 
  case (2) is when we know we're decoding an image, and want to pass the compressed image data to the device interface without decompressing it in the interpreter13:41.22 
  case (3) you added.13:41.28 
sebras yes, case 3 I know I added. :)13:41.39 
  doesn't mean I understand what I did. :)13:41.57 
ator both case (1) and case (2) end up with fz_open_jbig2d13:41.57 
  the compressed_buffer is just a way to "pause" the chain of filters at a known level13:42.15 
sebras right, and the latter ends up in fz_load_jbig2().13:42.16 
ator and then we can resume the chain when we want to use the image data, or we can use the compressed data as is13:42.38 
  this is so we can go from an encrypted + ASCII85 + DCT encoded image to actual JPEG data stored in a fz_image + compressed_buffer13:43.35 
  rather than having to decode the full image into a fz_pixmap when we encounter it13:43.45 
sebras yes.13:43.55 
ator then the device can decide whether it wants a pixmap, in which case it'd resume decoding the DCT/jbig2/whatever13:44.19 
  or if it wants to say write a new PDF file on the output, it can copy the compressed image data13:44.31 
sebras so in build_compression_params() you set + params->u.jbig2.embedded = 1; /* jbig2 streams are always embedded without file headers */13:44.53 
  ARGH113:44.56 
ator why is that argh?13:45.11 
sebras (pasted the wrong thing) you set params->u.jbig2.embedded = 1 because that is the location where all PDF stuff will go through.13:45.24 
ator jbig2 images in PDF files are always embedded, so we want that to be 113:45.36 
sebras but the file case will not pass through build_filter()13:45.39 
ator fz_new_image_from_buffer takes a standalone JBIG2 image, and there you don't want that flag13:45.57 
  this is so we can pass the correct flag to jbig2dec13:46.17 
sebras right, but at the right spot.13:46.35 
  and that's what I was trying to understand.13:46.46 
  i.e. would we require embedded jbig2 bitstreams in the file case too?13:47.03 
  but since build_filter() will not be called in that case, then we won't.13:47.13 
  because muimg will end up in fz_new_image_from_buffer().13:47.23 
  and not in pdf_load_compressed_stream().13:47.35 
  did I get that right?13:47.44 
ator fz_new_image_from_buffer leaves the flag zero from fz_malloc_struct(fz_compressed_buffer)13:47.48 
  yes.13:47.58 
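
The flag handling agreed on above can be sketched roughly as follows. This is a toy model, not MuPDF code: `toy_jbig2_params`, `toy_params_for_pdf_stream` and `toy_params_for_file` are hypothetical names made up for illustration. Only the idea comes from the conversation: the PDF path sets `embedded` to 1, while the standalone-file path gets 0 simply because the struct is allocated zeroed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the jbig2 part of the compression params. */
struct toy_jbig2_params {
	int embedded; /* 1: bare embedded stream, 0: standalone file with header */
};

/* PDF path (assumption: models what build_compression_params does).
 * JBIG2 streams inside a PDF never carry a file header. */
struct toy_jbig2_params *toy_params_for_pdf_stream(void)
{
	struct toy_jbig2_params *p = calloc(1, sizeof(struct toy_jbig2_params));
	if (p)
		p->embedded = 1;
	return p;
}

/* File path (assumption: models fz_new_image_from_buffer).
 * The zeroed allocation leaves embedded == 0 without any explicit store. */
struct toy_jbig2_params *toy_params_for_file(void)
{
	return calloc(1, sizeof(struct toy_jbig2_params));
}
```

The decoder would then be told whether to expect a file header based on this one flag, which is why it matters that it is set on the PDF path only.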
  we could get rid of a bit of spaghetti, by hooking the multi-page image loaders into the main ball of pasta :)13:50.01 
  then there would be no case (3)13:50.11 
  fz_load_pnm could go via either a compressed_buffer, or a fz_image.get_pixmap13:50.52 
sebras and in the muimg case we end up in fz_open_image_decomp_stream_from_buffer() only for two formats: jpeg and jbig2...?13:51.32 
  yes, adding the multipage thing there might be the better option.13:52.03 
ator sebras: no, we end up there for all PDF images, I think13:52.24 
sebras well compressed_image_get_pixmap() (used by muimg) loads png/gif/bmp/tiff/pnm/jxr/jpx but jpeg and jbig2 are handled by calling fz_open_image_decomp_stream_from_buffer().13:54.02 
ator raw, flate, ccittfax, etc.13:54.03 
sebras yes, in the pdf case, I meant in the muimg case.13:54.16 
ator I think it would be possible to add a fz_load_jbig2_subimage case to compressed_image_get_pixmap13:55.30 
sebras we don't open raw, flate, ccittfax, etc. in the muimg case..?13:55.31 
ator but we don't for historic reasons (since we need to be able to decode as a non-image anyway)13:55.42 
  fz_new_image_from_buffer cannot create those13:56.24 
sebras ok, so LGTM "Add 'embedded' option to JBIG2 compressed buffers." and LGTM "Only use jbig2 subimage loading for multi-page images."13:57.01 
ator since those are not standalone image formats, there's no image width/height/colorspace etc headers13:57.02 
sebras because w/h/cs is specified in PDF dicts.13:57.43 
ator yup.13:57.53 
  all PDF images end up in the default: case of compressed_image_get_pixmap13:58.47 
sebras and xps/svg/fb2/html/etc use fz_new_image_from_buffer() since they don't use embedded streams like PDF, but rather actual _files_.13:59.19 
ator y.13:59.26 
  sebras: did you nod at "mutool run: Support image mask when creating images from file names." already?14:01.58 
sebras y14:05.07 
  "Add JBIG2 support to pdf_add_image." is next.14:05.20 
ator sebras: I think I missed a drop so we leak the globals->data buffer14:09.44 
  fixup for that on tor/master14:10.16 
Wizzup I pulled in the 5 JBIG2 patches, rebuild mupdf locally, and I've been able to generate the PDFs I wanted to generate with JBIG2 images instead of ccitt - thank you!14:38.39 
malc_ sebras, ator: is it (or will ever be) possible to load img-pdf.jbig2?14:45.04 
Robin_Watts malc_: IIRC, the big problem is that jbig2 doesn't have a header on it.15:00.39 
  so you can't immediately spot that something is a jbig2 from looking at the file.15:01.00 
malc_ Robin_Watts: perhaps, it's still a bit puzzling (for an outsider - me) that img.jbig2 is okay but img-pdf.jbig2 is problematic15:02.44 
Robin_Watts ok, there is clearly stuff I don't follow here, so ignore my comments.15:03.23 
malc_ can do15:05.26 
ator malc_: there's no way for us to know that it is a jbig2 file -- no magic signature15:12.53 
malc_ ator: gotcha15:14.18 
  ator: img.jbig2 has this JB2 near the start of the file, "non mandatory" filler?15:16.40 
ator malc_: "JB2\r\n\x1a\n" to be accurate15:18.28 
malc_ ator: now pray tell how my description WASN'T accurate?15:20.12 
ator img-15:21.27 
  malc_: there are 3 formats of jbig2 data. 2 have a file header, one is embedded data that is not a valid standalone file.15:22.21 
  img-pdf.jbig2 is the latter.15:22.38 
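
The signature check being discussed can be sketched as below. This is an illustration, not MuPDF or jbig2dec code: `looks_like_jbig2_file` is a hypothetical name. It assumes, from my reading of the JBIG2 file format (ITU-T T.88 Annex D), that the full 8-byte ID string is the "JB2\r\n\x1a\n" quoted above preceded by a 0x97 byte. An embedded stream such as img-pdf.jbig2 starts directly with a segment header, so the check fails for it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 8-byte ID string assumed to start a standalone JBIG2 file:
 * 0x97 'J' 'B' '2' CR LF 0x1A LF */
static const unsigned char jbig2_file_id[8] = {
	0x97, 0x4A, 0x42, 0x32, 0x0D, 0x0A, 0x1A, 0x0A
};

/* Return 1 if buf starts with the JBIG2 file ID string, 0 otherwise. */
int looks_like_jbig2_file(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len < sizeof jbig2_file_id)
		return 0;
	return memcmp(buf, jbig2_file_id, sizeof jbig2_file_id) == 0;
}
```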
malc_ ator: tack15:23.09 
sebras so if one wants to add a jbig2 _file_ to a PDF, then this file header must be stripped out, and the naked jbig2 _bitstream_ is put inside the PDF.15:40.38 
  this is what the next patch to review is all about I think.15:40.51 
ator sebras: yes.16:06.05 
  the next patch is all about stripping the file header and maybe reorganizing the segment headers to form an embedded jbig2 stream16:06.25 
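
As a rough illustration of the "strip the file header" step, the sketch below computes how many bytes the file header occupies so the caller can skip them. It is not the code from the patch under review: `jbig2_file_header_length` is a hypothetical helper, and the layout is an assumption based on my reading of ITU-T T.88 Annex D (8-byte ID, 1 flags byte, then a 4-byte page count unless flag bit 1 says the number of pages is unknown). It does not reorder segment headers, which the random-access organisation would additionally need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the JBIG2 file header at the start of buf,
 * or 0 if buf does not start with a complete file header. */
size_t jbig2_file_header_length(const unsigned char *buf, size_t len)
{
	static const unsigned char id[8] = {
		0x97, 0x4A, 0x42, 0x32, 0x0D, 0x0A, 0x1A, 0x0A
	};
	size_t header_len;

	assert(buf != NULL || len == 0);

	/* Need at least the ID string plus the flags byte. */
	if (len < 9 || memcmp(buf, id, sizeof id) != 0)
		return 0;

	/* Bit 1 set: unknown number of pages, so no page count field follows. */
	header_len = (buf[8] & 0x02) ? 9 : 13;

	return len >= header_len ? header_len : 0;
}
```

Everything after the returned offset would be segment data in either sequential or random-access order, depending on flag bit 0.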
sebras yeah, I'm looking.16:08.15 
  ator: in pdf_parse_jbig2_segment_header, where does 11 in the first length check come from? 4 byte segment number + 1 byte flags + 4 bytes referred to segment count ought to be the worst case..?16:30.01 
  the message for the last throw in pdf_copy_jbig2_random_segments() should probably refer to "segment data" not "segment header".16:39.28 
  ator: I believe that the end of file segment is optional for files that are sequential.16:42.52 
  nvm the last message, I mixed up the random and sequential cases in your reordering.16:56.07 
  ator: right, so in the future we'd want add_image_res() to supply a image index number to fz_new_image_from_file() so we can ask for a particular page.17:18.08 
  but that means the entire fz_image must be multipage.17:18.17 
ator sebras: or that the image/compressed buffer has a page number associated with it17:18.55 
sebras ator: Re "Add JBIG2 support to pdf_add_image." I wonder about the 11 and the error messages (2 of them) mentioned above. otherwise LGTM17:19.16 
  also: JBIG2 is a convenient format. :-/17:19.38 
  ator: anyhow multipage is next patch. :)17:20.40 
ator sebras: 11 is the minimum file header size17:20.57 
  the second segment header error is an error from parsing the header, even if we're at the data17:21.29 
  the header said there would be more data than there is17:21.37 
  they could of course be reworded better, error messages can always be improved :)17:22.00 
sebras yes please.17:22.06 
  why does the minimum file header size matter when you are parsing a segment header?17:22.47 
  and isn't the file header size 9 bytes (8 byte id + 1 byte flags (indicating unknown number of pages))?17:24.08 
  the minimum segment header appears to be 10 bytes? (4 byte segment number + 1 byte segment flags + 1 byte referred to segment count + 4 byte segment data length)...17:26.08 
  I must be missing something.17:26.17 
  ator: LGTM "Bug 703077: Confirm reload if the document has unsaved changes."17:30.30 
Robin_Watts test18:13.12 
ator sebras: you missed the return to segment data18:22.25 
  sebras: in any case, I took the number from jbig2dec jbig2_segment.c line 5518:23.28 
sebras ator: return to segment data?18:24.17 
  ator: referred to segment data? but "count of referred-to segments" might be < 4 in which case the field in 7.2.4 only occupies a single byte which includes both the count _and_ the retain bits.18:31.46 
  I think giles counted them wrong.18:32.16 
  too18:32.22 
ator I thought so too, and recounted and got 11 but I don't remember how anymore18:43.25 
sebras ator: at least it is consistent with jbig2dec.18:43.54 
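
One way to arrive at 11 is to count a segment header that also includes the segment page association field. The sketch below does that count for the short form only. It is an illustration under assumptions, not code from MuPDF or jbig2dec: `jbig2_segment_header_length` is a hypothetical name, and the field layout is my reading of ITU-T T.88 section 7.2 (4-byte segment number, 1-byte flags, 1-byte count/retain field when at most 4 segments are referred to, referred-to segment numbers, 1- or 4-byte page association, 4-byte data length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the length of the segment header at buf, or 0 if it is
 * truncated or uses the long referred-to form (not handled here). */
size_t jbig2_segment_header_length(const unsigned char *buf, size_t len)
{
	uint32_t segnum;
	unsigned int flags, count;
	size_t ref_size, page_size, total;

	assert(buf != NULL || len == 0);

	/* Smallest case: 4 + 1 + 1 + 0 + 1 + 4 = 11 bytes. */
	if (len < 11)
		return 0;

	segnum = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
		((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
	flags = buf[4];

	/* Top 3 bits hold the count; the low 5 bits are retain flags. */
	count = buf[5] >> 5;
	if (count > 4)
		return 0; /* 7 selects the long form, 5 and 6 are not valid */

	/* Width of each referred-to segment number depends on this segment's number. */
	ref_size = segnum <= 256 ? 1 : segnum <= 65536 ? 2 : 4;

	/* Flag bit 6 selects a 4-byte page association instead of 1 byte. */
	page_size = (flags & 0x40) ? 4 : 1;

	total = 4 + 1 + 1 + count * ref_size + page_size + 4;
	return len >= total ? total : 0;
}
```

With no referred-to segments and a 1-byte page association this comes out at 11, which matches the length check being discussed.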
  ator: do you still have your WIP Makelists commit somewhere?18:44.39 
  ator: I reviewed that one, but want to see what you have changed since if anything.18:44.58 
  I've lost it and neither git fsck --dangling nor git reflog finds it.18:45.25 
ator sebras: there's an older version of it on tor/wip418:46.15 
sebras ator: thanks.18:56.45 
  ator: why does HAVE_LEPTONICA, HAVE_TESSERACT and HAVE_EXTRACT all end up in CFLAGS when other libraries like jpegxr get their flags added to THIRD_CFLAGS?18:59.08 
  ator: CURL's variables are not set similarly to those of any other libraries.19:02.15 
  I know it is the odd one out since we never build it ourselves.19:02.32 
  but still.19:02.43 
  ator: the part that you really wanted my opinion on was the android parts, and to me they look good. I have not testcompiled. should I?19:07.16 
Robin_Watts HAVE_LEPTONICA is not a flag for leptonica.19:17.26 
  It's a flag that the main code needs to get to determine whether or not it uses the appropriate bits of thirdparty code.19:18.05 
sebras I know, but then again HAVE_JPEGXR wouldn't be a flag for jpegxr either.19:18.09 
Robin_Watts OK. I can't argue that.19:18.19 
sebras HAVE_JPEGXR is supplied to THIRD_CFLAGS, HAVE_LEPTONICA to CFLAGS. my observation is that it is inconsistent and my question is "why?".19:18.55 
  not sure if it is an oversight. :)19:19.21 