Log of #mupdf at irc.freenode.net.

20171220 
sebras Robin_Watts: if you are around, maybe you can take a look at sebras/master?13:17.50 
  Robin_Watts: I tried to fix a couple of issues I've noticed.13:17.58 
Robin_Watts sebras: Sure.13:17.59 
sebras Robin_Watts: I found a couple of pdfs over at http://www.pdfill.com and decided to run mudraw over those. that tripped ASAN/valgrind a bit.13:19.07 
Robin_Watts All 3 look good to me.13:20.42 
  A repository for sick PDFs? :)13:21.06 
sebras perhaps, but I don't think so.13:21.41 
  the company develops some kind of windows app for signing pdfs (and editing?)13:21.57 
Robin_Watts pdf "ill". Never mind :)13:22.45 
sebras Robin_Watts: I'm not sure I appreciate the GUI they show off on their landing page (you really ought to take a look...)13:23.53 
  Robin_Watts: on a more serious note: in source/pdf/pdf-op-filter.c:filter_push:102 we keep the text fonts in pending and sent.13:25.03 
Robin_Watts Yeah, ribbon overload.13:25.06 
sebras Robin_Watts: shouldn't we equally keep the cs, pat and shd parts in cs, CS, sc and SC?13:25.25 
  Robin_Watts: seems to me like the new_gstate will otherwise just borrow those references from the next gstate.13:25.48 
  perhaps that is ok, but if it is, then why keep the fonts?13:26.00 
  I looked at all *variable = *variable2 assignments after having found the reference counting mistake in fz_new_pixmap_from_pixmap().13:26.41 
  so I'd like to add something like fz_keep_colorspace(ctx, new_gstate->pending.cs.cs);13:27.17 
  but perhaps I'm missing something.13:27.22 
Robin_Watts sebras: You may be right. I am a bit buried in other stuff at the mo.13:27.26 
sebras Robin_Watts: ok, I'll look further, you continue to stay buried. I wouldn't want to dig you up... ;)13:28.00 
Robin_Watts I'm reluctant to take references more than we need to.13:28.03 
  cos colorspaces might only have signed 8 bit reference counts.13:28.20 
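The trade-off discussed above (a plain struct assignment only borrows the pointers it copies, while taking an extra reference costs one count out of a small counter) can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_colorspace`, `toy_gstate`, `toy_keep` and `toy_drop` are stand-ins invented for illustration, not MuPDF's types or functions, and refusing to share when the counter is full is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference-counted object; NOT MuPDF's real colorspace. */
typedef struct {
    int8_t refs; /* signed 8-bit count: at most INT8_MAX (127) owners */
} toy_colorspace;

/* Hypothetical graphics state holding one colorspace pointer. */
typedef struct {
    toy_colorspace *cs;
} toy_gstate;

/* Take a reference; return NULL rather than overflow the counter. */
static toy_colorspace *toy_keep(toy_colorspace *cs)
{
    if (cs == NULL || cs->refs == INT8_MAX)
        return NULL;
    cs->refs++;
    return cs;
}

/* Drop a reference; return 1 when the last owner has let go. */
static int toy_drop(toy_colorspace *cs)
{
    assert(cs != NULL && cs->refs > 0);
    cs->refs--;
    return cs->refs == 0;
}

/* Copy a gstate so that the copy owns its own reference. */
static int toy_gstate_copy(toy_gstate *dst, const toy_gstate *src)
{
    *dst = *src; /* plain assignment only borrows src->cs */
    if (dst->cs != NULL && toy_keep(dst->cs) == NULL) {
        dst->cs = NULL; /* counter is saturated: refuse to share */
        return 0;
    }
    return 1;
}
```

With a counter this small, every additional owner matters, which is the reason given above for not keeping references that are not strictly needed.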
sebras ah.13:28.42 
  Robin_Watts: are you still buried?17:54.54 
KnoteAI Hello guys!18:28.39 
Robin_Watts sebras: I can unbury myself for a bit, sure.18:29.31 
  KnoteAI: Hi18:29.42 
sebras Robin_Watts: fz_get_pixmap_from_image() can be used to decode subareas of an image.18:31.17 
Robin_Watts sebras: Sorry, I've just been called away for the next 15 mins or so. she who must be obeyed.18:31.45 
sebras Robin_Watts: first we look for the requested subarea recomputed into a rect.18:31.59 
  no worries.18:32.04 
  we look for this subarea in the store.18:32.16 
  if we don't find it we decide to decode the image.18:32.24 
  when we decode the image we change the rect that represents the subarea and then use this as part of the key when storing the decoded image into the store.18:33.15 
  so the next time we request another part of the image we will try the same thing, i.e. we will fail to find the fully decoded image in the store since we initially look only for the subarea we wanted.18:33.55 
  of course we don't find it, then we *again* decide to decode the entire image and then we *again* try to store the fully decoded image into the store, thereby overwriting the existing entry.18:34.40 
  that seems a bit silly.18:34.44 
  I'm thinking we could first look for the subarea we want and if we find it, use it.18:35.02 
  if we don't find it, try to look for not a subarea but the entire image fully decoded in the store.18:35.23 
  if we find it, use the full image and adapt ctm correspondingly18:35.45 
  and if we can't find that either, only *then* will we attempt to decode the full image and try to put it into the store.18:36.11 
Robin_Watts sebras: back.18:44.51 
  That does indeed seem suboptimal.18:45.02 
  What we ought to be doing is looking for an entry in the store that includes the given subarea.18:45.44 
  That was the original intent.18:45.53 
  To do that we'd need to form the hash key from everything *without* the rectangle, and allow multiple entries in the hash store with the same key.18:47.00 
  Then we'd linearly probe through those entries to look for one with an appropriate rectangle18:47.32 
sebras Robin_Watts: ok, because we are guaranteed that values with the same key are stored linearly in the hash.18:51.01 
  i see.18:51.04 
  wouldn't we still need to have the rect though to know what part was actually decoded?18:51.43 
Robin_Watts sebras: The mechanism isn't important really. It's the idea of looking for a set of possibilities.18:51.45 
sebras yes, I realize that.18:51.55 
Robin_Watts The rect would still be stored, it just wouldn't form part of the actual key.18:52.11 
sebras right so fz_make_hash_image_key would still store the rect in the key, but fz_cmp_image_key() wouldn't take it into account.18:53.23 
  if the rect is not stored there, I'm lost as to where it would be stored.18:53.35 
Robin_Watts sebras: Is it not stored in the fz_pixmap ?18:53.48 
  x, y, x+width, y+height should be the rect, right?18:54.07 
sebras I don't think we get the correct x,y from the pixmap though.18:55.11 
  but the width/height is certainly stored there.18:55.19 
  well, or larger.18:55.33 
  also I have noticed that in compressed_image_get_pixmap() we may invert the CMYK jpegs for XPS.18:56.01 
Robin_Watts sebras: Why wouldn't we get the correct x/y ?18:56.18 
sebras but this is not present in the key, so we could end up wanting a non-inverted variant and the store would give us the inverted one.18:56.24 
Robin_Watts one problem at a time, eh? :)18:56.41 
sebras Robin_Watts: sure, but I need to blurt it out before I forget! :)18:56.54 
Robin_Watts I can understand that :)18:57.02 
sebras Robin_Watts: the pixmap wouldn't know what x,y it should be at since it might be used in several locations.18:57.20 
Robin_Watts sebras: The pixmap x/y should be the position of the rectangle within the source that is decoded.18:57.48 
sebras when we finished decoding the image I imagine that x == y == 018:57.49 
Robin_Watts i.e. if we have a 300x300 image, and we decode just the middle 100x100 of it, I'd expect x,y,w,h to all be 100.18:58.25 
  Where, and at what scale that is displayed shouldn't affect the rectangle.18:58.53 
sebras ok, not the x,y on the page, got it.18:59.56 
  Robin_Watts: I think you try to do the corresponding thing with the subsampling factor, you try to look in the store for the most appropriate factor first.19:01.22 
  if that is not found, you decrease it and try to look again.19:01.36 
  until the subsampling is 0.19:01.42 
Robin_Watts yeah, with subsample factors, there are a limited number of possibilities, so I can check 'em all.19:01.58 
  I can't check all the possible rectangles that would satisfy me though.19:02.11 
sebras depends on how we implement the compare function, right?19:02.28 
  what if the comparison would say FOUND! as long as the desired rectangle is covered..?19:02.44 
  i.e. even if the pixmap is larger than you originally wanted.19:02.59 
  but that would only solve the linear probing.19:03.32 
  ok.19:03.33 
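The "covered" comparison and the 300x300 example from the discussion can be written down as a small sketch. The names `toy_irect`, `toy_rect_from_xywh` and `toy_rect_covers` are made up for illustration; they only assume that a decoded region is described by x, y, width and height within the source image.

```c
/* Hypothetical integer rectangle: [x0, x1) by [y0, y1) in source-image pixels. */
typedef struct {
    int x0, y0, x1, y1;
} toy_irect;

/* Build the rectangle from a decoded region's x, y, width and height. */
static toy_irect toy_rect_from_xywh(int x, int y, int w, int h)
{
    toy_irect r = { x, y, x + w, y + h };
    return r;
}

/* Return 1 if 'have' covers all of 'want', even when 'have' is larger. */
static int toy_rect_covers(toy_irect have, toy_irect want)
{
    return have.x0 <= want.x0 && have.y0 <= want.y0 &&
           have.x1 >= want.x1 && have.y1 >= want.y1;
}
```

For the middle 100x100 of a 300x300 image, `toy_rect_from_xywh(100, 100, 100, 100)` gives (100, 100, 200, 200); a fully decoded image (0, 0, 300, 300) covers it, but not the other way round.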
KnoteAI any plan to change the xmin, ymin, xmax, ymax to x,y,w,h ???19:03.41 
Robin_Watts KnoteAI: What, where?19:04.10 
sebras KnoteAI: you need much more context to get a decent answer. what xmin/ymin/etc..?19:04.23 
Robin_Watts sebras: The intent is that the comparison should say "Found!" as long as the desired rectangle is covered.19:04.47 
sebras right.19:04.55 
Robin_Watts But that can't work with the current way we're driving the hash table.19:05.15 
  We need the "check multiple entries" stuff (which we can do by linear probing, I hope)19:05.37 
sebras 019:05.44 
  if the hash is the same the entries will be stored linearly without gaps as far as I know.19:06.06 
Robin_Watts sebras: That is my understanding too.19:06.17 
  We could have a hash_nextMatch or something.19:06.28 
sebras we don't have a hash table interface to say, ok I have this entry, give me the next one.19:06.34 
  yes.19:06.38 
Robin_Watts but in fz_underscore_stylee19:06.53 
sebras fz_hash_find_next() I assume.19:07.06 
Robin_Watts sebras: That'd work for me I think.19:07.17 
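A rough sketch of the "find next match" idea, under stated assumptions: this is a toy open-addressing table, not MuPDF's hash table, and `toy_find_next` is a hypothetical shape for the proposed fz_hash_find_next(). The rectangle is stored in the entry but left out of both the hash and the key comparison, and the toy table never deletes entries, so a run of colliding entries has no gaps.

```c
#include <stddef.h>

enum { TOY_SLOTS = 16 };

typedef struct { int x0, y0, x1, y1; } toy_rect;

typedef struct {
    int used;
    int image_id; /* hashed and compared */
    int l2factor; /* hashed and compared */
    toy_rect rect; /* stored only: NOT part of the hash or comparison */
} toy_entry;

typedef struct { toy_entry slots[TOY_SLOTS]; } toy_table;

static unsigned toy_hash(int image_id, int l2factor)
{
    return ((unsigned)image_id * 31u + (unsigned)l2factor) % TOY_SLOTS;
}

/* Insert without replacing: several entries may share the same key. */
static int toy_insert(toy_table *t, int image_id, int l2factor, toy_rect rect)
{
    unsigned home = toy_hash(image_id, l2factor);
    for (unsigned i = 0; i < TOY_SLOTS; i++) {
        toy_entry *e = &t->slots[(home + i) % TOY_SLOTS];
        if (!e->used) {
            e->used = 1;
            e->image_id = image_id;
            e->l2factor = l2factor;
            e->rect = rect;
            return 1;
        }
    }
    return 0; /* table full */
}

/* Pass prev == NULL for the first match, or the previous result to continue. */
static toy_entry *toy_find_next(toy_table *t, int image_id, int l2factor,
                                const toy_entry *prev)
{
    unsigned home = toy_hash(image_id, l2factor);
    unsigned i = 0;
    if (prev != NULL)
        i = ((unsigned)(prev - t->slots) + TOY_SLOTS - home) % TOY_SLOTS + 1;
    for (; i < TOY_SLOTS; i++) {
        toy_entry *e = &t->slots[(home + i) % TOY_SLOTS];
        if (!e->used)
            return NULL; /* a gap ends the probe run */
        if (e->image_id == image_id && e->l2factor == l2factor)
            return e;
    }
    return NULL;
}

/* Walk all entries with this key until one covers the wanted rectangle. */
static toy_entry *toy_find_covering(toy_table *t, int image_id, int l2factor,
                                    toy_rect want)
{
    toy_entry *e = NULL;
    while ((e = toy_find_next(t, image_id, l2factor, e)) != NULL) {
        if (e->rect.x0 <= want.x0 && e->rect.y0 <= want.y0 &&
            e->rect.x1 >= want.x1 && e->rect.y1 >= want.y1)
            return e;
    }
    return NULL;
}
```

A real store would also have to cope with deletions and eviction, which this sketch deliberately ignores.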
  I'm being distracted by my SSE4.1 intrinsics almost, but not quite working.19:07.39 
KnoteAI Sorry I was talking about the span bounding box, I am wondering if you were on the point to change the bbox xmin, ymin, xmax, ymax to x,y,w,h? I'm just curious.19:08.15 
sebras Robin_Watts: I discovered this while trying to help malc_ debug a problem in llpp, he's doing tiling when he's rendering, even if he's rendering a PNG, hence he triggered this. mupdf doesn't seem to be doing that so we haven't seen it.19:08.20 
  KnoteAI: I haven't heard of any such plans.19:08.47 
KnoteAI ok ok thank you for your answer. I just made a bad conclusion with your message (<@Robin_Watts> i.e. if we have a 300x300 image, and we decode just the middle 100x100 of it, I'd expect x,y,w,h to all be 100.)19:10.01 
sebras KnoteAI: that is about an entirely different data structure.19:10.41 
KnoteAI got it thanks ;)19:11.18 
sebras Robin_Watts: SSE4.1 intrinsics is for SO?19:12.12 
  Robin_Watts: thanks for setting me on the right track, I'll try to do something smart with this.19:12.52 
Robin_Watts sebras: gs, image scaling.19:22.29 
Diemex Hi!19:27.12 
  Couple of quick questions:19:27.21 
  Is there a way to get the source/javadoc from the maven repository?19:27.45 
Robin_Watts Diemex: What source? What javadoc? What maven repository? :)19:28.50 
Diemex https://mupdf.com/docs/android-sdk.html19:29.03 
  I can't see the javadoc in Android Studio for MuPdf methods19:29.39 
Robin_Watts Everything is in the git repos listed on that page.19:29.40 
  It's entirely possible that there IS no javadoc as yet.19:30.02 
Diemex Would be cool if that could be included in the repo if possible19:30.45 
  I'm using MuPdf from Android. If I hold on to the DisplayLists can I call MuPdf from multiple Java Threads to render multiple pages at once?19:31.44 
  If yes is there anything I need to watch out for?19:32.42 
sebras Diemex: once you have a displaylist you should be able to have multiple threads read that list and render it.19:34.40 
  Diemex: e.g. if you have two threads and one renders the upper half while the other renders the lower half.19:35.01 
  be sure to only use one thread to parse the document though.19:35.17 
  Diemex: also, no there is no javadoc at all.19:35.31 
Robin_Watts Diemex: If you want to write it for us, we'll add it, sure.19:35.34 
sebras Robin_Watts: nice timing. :)19:35.46 
Robin_Watts Diemex: You may find: http://ghostscript.com/~robin/mupdf_explored.pdf of interest.19:36.15 
  That describes the C API, upon which the Java API is based.19:36.37 
  I haven't written the chapters on the java bindings yet.19:36.48 
Diemex Robin_Watts: Nice! Thanks for that19:36.57 
  I just returned to Android Dev from an absence of over a year. It's a breeze. Tools have improved a lot. And Kotlin is just WOW19:38.41 
  A walk in the park with a yummy icecream compared to FPGA dev XD19:39.13 
Robin_Watts did FPGA dev 20 years ago.19:40.07 
Diemex Niiiice. Which FPGA/vendor?19:40.42 
Robin_Watts Diemex: We used Xilinx mostly, but then we were writing the tools to take software descriptions of programs, and compile them to netlists that then got laid out for the chips.19:41.37 
  so I was insulated from the hardware itself most of the time.19:41.47 
  Gotta go, sorry!19:41.53 
Diemex Robin_Watts: cu :-)19:42.09 
  Does loading a page and holding on to the page object require a lot of memory? I have just "benchmarked" the loadPage function and it only requires about 0.3 ms per call for the PDF I tried. I expected it to take longer. Is there a possibility that I would run into memory troubles if I would e.g. load every page of a large pdf and not free the objects?20:13.30 
sebras Diemex: that depends on how much memory you have and the number of pages of course.20:16.54 
  Diemex: and probably other things too.20:17.07 
malc_ sebras: it depends on the pages too :)20:17.13 
sebras malc_: certainly.20:17.53 
  malc_: the more complicated the pages are, the more memory they use.20:18.07 
malc_ sebras: not only that, they can contain humongous embedded images for instance20:18.34 
  there goes your memory20:18.37 
Diemex Ah I see. So embedded images are loaded into memory. Can I get the memory usage from the Java API?20:19.35 
  To decide if I should release the page?20:19.50 
  Or I just keep it simple and only hold on to the last 5 pages or so20:20.11 
sebras Diemex: I don't think we have any API that gives you the memory usage even at the C level.20:27.38 
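Since no memory-usage API is on offer, the "only hold on to the last 5 pages" approach is the simple option. A minimal sketch of that policy, written in C rather than against the Java bindings: `toy_page_cache`, `toy_cache_get` and the demo load/release callbacks are all hypothetical, and the cache evicts the page that was loaded longest ago (first in, first out), not the least recently used one.

```c
#include <stddef.h>

enum { TOY_KEEP = 5 }; /* how many loaded pages to hold on to */

typedef void *(*toy_load_fn)(int page_no);
typedef void (*toy_release_fn)(void *page);

typedef struct {
    int page_no[TOY_KEEP];
    void *page[TOY_KEEP];
    int count; /* slots in use */
    int next;  /* slot to fill next; the oldest entry once the cache is full */
} toy_page_cache;

/* Return the cached page, or load it and evict the oldest entry if full. */
static void *toy_cache_get(toy_page_cache *c, int page_no,
                           toy_load_fn load, toy_release_fn release)
{
    for (int i = 0; i < c->count; i++)
        if (c->page_no[i] == page_no)
            return c->page[i];

    void *p = load(page_no);
    if (p == NULL)
        return NULL;

    if (c->count == TOY_KEEP)
        release(c->page[c->next]); /* drop the oldest page */
    else
        c->count++;

    c->page_no[c->next] = page_no;
    c->page[c->next] = p;
    c->next = (c->next + 1) % TOY_KEEP;
    return p;
}

/* Demo callbacks that only count calls; a real caller would load and free pages. */
static int toy_loaded, toy_released;
static char toy_pages[64];

static void *toy_demo_load(int page_no)
{
    if (page_no < 0 || page_no >= 64)
        return NULL;
    toy_loaded++;
    return &toy_pages[page_no];
}

static void toy_demo_release(void *page)
{
    (void)page;
    toy_released++;
}
```

Start from a zero-initialised `toy_page_cache`. Because pages can hold very different amounts of memory, a fixed count is only a rough bound on usage.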
malc_ sebras: custom allocator in fz_new_context might help some, then again no idea if libjpeg etc is tracked by that20:30.54 
sebras malc_: the intent is that they should be, but I quite recently discovered allocations in the libraries that are not.20:34.04 
  malc_: I seem to recall that e.g. harfbuzz uses malloc directly.20:34.15 
malc_ sebras: aye.. simple grep reveals naked calloc there in harfbuzz20:36.58 
sebras malc_: might be replaced by #define calloc() though.21:22.53 
  malc_: I think some libraries do it that way.21:23.00 
malc_ sebras: thus moving deeper into the realm of upstream deviation21:25.25 
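The custom-allocator idea can be sketched as a set of callbacks that count live bytes. This is an illustration only: `toy_alloc_context` is a hypothetical struct with a user pointer and malloc/realloc/free style callbacks, and the exact struct and field names that fz_new_context expects should be checked in the MuPDF headers. As noted above, any library code that calls malloc or calloc directly bypasses such a counter.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table: a user pointer plus three callbacks. */
typedef struct {
    void *user;
    void *(*alloc)(void *user, size_t size);
    void *(*resize)(void *user, void *old, size_t size);
    void (*release)(void *user, void *ptr);
} toy_alloc_context;

typedef struct {
    size_t live; /* bytes currently allocated through the callbacks */
    size_t peak; /* highest value 'live' has reached */
} toy_mem_stats;

/* Header in front of each block; the union keeps the payload aligned. */
typedef union {
    size_t size;
    max_align_t align;
} toy_header;

static void toy_account(toy_mem_stats *s, size_t old_size, size_t new_size)
{
    s->live = s->live - old_size + new_size;
    if (s->live > s->peak)
        s->peak = s->live;
}

static void *toy_malloc(void *user, size_t size)
{
    toy_header *h = (toy_header *)malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    toy_account((toy_mem_stats *)user, 0, size);
    return h + 1;
}

static void *toy_realloc(void *user, void *old, size_t size)
{
    if (old == NULL)
        return toy_malloc(user, size);
    toy_header *h = (toy_header *)old - 1;
    size_t old_size = h->size;
    h = (toy_header *)realloc(h, sizeof *h + size);
    if (h == NULL)
        return NULL; /* the old block is still valid and still counted */
    h->size = size;
    toy_account((toy_mem_stats *)user, old_size, size);
    return h + 1;
}

static void toy_free(void *user, void *ptr)
{
    if (ptr == NULL)
        return;
    toy_header *h = (toy_header *)ptr - 1;
    toy_account((toy_mem_stats *)user, h->size, 0);
    free(h);
}
```

An application could read `live` from the stats struct to decide when to release cached pages, keeping in mind that the figure is a lower bound.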
janzo i see mupdf can output xml. im trying to get the boxes/lines/words of a pdf. is this a possible way?22:24.12 
  i stumbled on the stext output22:37.34 