Log of #mupdf at irc.freenode.net.

Search:
 <<<Back 1 day (to 2017/08/14)20170815 
avih tor8: btw, i tested the threaded code with heap allocation instead of stack (with size of 1000), it's almost 10x slower :/ also, so you plan to leave the two implementations with that #if ?11:28.12 
  (and half off topic, why don't you leave the code you're working on at the top commits, and just add commits until it's stabilized, instead of updating the commit and force push it?)11:31.24 
  i get the need for clean history, but while you're working on something, at least for me, i find it useful to just keep adding commits as ugly as the history might look, and only squash once i'm happy with it all11:31.24 
tor8 avih: I'm working on a completely different implementation as well11:31.52 
avih i thought you are :) the "lockstep" one?11:32.12 
tor8 avih: on tor/wip are two commits you could give a try11:36.28 
  I expect the lockstep one to be much slower for simple regexes (more book-keeping per step) but scale better with 'nasty' regexes11:37.03 
avih tor8: will do. is the lockstep one still partial due to not handling I_REF, I_PLA, or I_NLA ?11:39.44 
tor8 I fall back to the recursive one if the program has either of those instructions11:40.19 
avih also, so the recursive one is unlimited and will just blow off the stack if it overflows it?11:40.33 
tor8 so it should be complete (modulo the use of system malloc and no error handling for malloc failures)11:40.36 
  it will blow up the C stack if it overflows, yes11:40.49 
avih up. right :)11:41.02 
tor8 but at least it doesn't always use 96k of stack :)11:41.25 
avih 280k !11:41.54 
  also, i think on windows the default stack is 1M, so it's actually not that far off it, in scale terms11:42.49 
  so the new recursive is slightly (~5-8%) slower than the previous recursive, but possibly because previously the flags were a mess. but it works. now trying the lockstep11:49.32 
  tor8: the lockstep one segfaults for me on the same test where the recursive doesn't - repeat("hello world ", 500 * 1000).match(/hello/g).length11:53.03 
  (i'm actually using "hello world ".repeat(500000) using my polyfill https://pastebin.mozilla.org/9029766 )11:55.13 
tor8 ugh, yes, there must still be some bugs left :(11:55.24 
  avih: top commit on tor/wip11:57.10 
avih tor8: huh.. 50x slower than recursive for that ^ match11:59.24 
  (specifically, with 500k repeats the recursive is ~900ms, the heap-allocated MAXTHREADS ~8000ms, and the lockstep with the fix - i thought it hangs and aborted, but then i reduced the repeats to 5000 and it was ~450ms12:00.52 
tor8 avih: yes, that's about the slowdown I expected.12:01.37 
  I shall not be putting the lockstep one into master (not before optimizing the hell out of the Resub state managment, which is what's causing the slowdown)12:01.59 
avih tor8: is there a real reason to not use the recursive by default? i don't see how the array size helps with the "threaded" implementation if you don't know what the stack size is to begin with. it can blow the stack either way, but the recursive is not limited to 1000, and only blows the stack once it actually needs to. i think overall it's better.12:03.41 
tor8 not really, it makes it somewhat harder to debug if you smash the stack, but that's pretty much the only reason12:04.10 
avih what about incremental heap allocations? start with stack array of pointers to Rethread, and allocate more items as you need them? this way you can actually catch exhaustion12:05.32 
tor8 I'm happy enough with the simple recursive backtracking implementation.12:06.02 
avih except blowing the stack ungracefully12:06.20 
  (which was the main reason against it, if i understand correctly)12:06.38 
tor8 that, and I was halfway towards the lockstep implementation when I figured it was good enough and moved on12:07.27 
avih k12:07.48 
tor8 but back-references and positive/negative lookahead mess up a lockstep implementation and pretty much mandate a backtracking implementation12:08.00 
  it's not a simple DFA/NFA machine anymore with back-references12:08.22 
avih in general though, this stuff is, IMO, classic for heap allocation. you can't know how much space you'll need in advance, and that's what the heap is for.. after all..12:09.23 
  not to mention it's order[s] of magnitude bugger than stack space, even with glibc12:09.57 
  i*12:10.31 
sebras tor8: ok, I see.12:36.26 
  tor8: did you see my weekend ramblings? https://ghostscript.com/mupdfirclogs/2017/08/12.html12:36.56 
tor8 sebras: I'm not too fond of the fz_indexed_lookup function name12:37.01 
  and we have a rather too many awkward 'is this colorspace X' type of constructs in the code12:37.21 
  I think it wuold be good to add some more fz_colorspace_is_rgb and _is_cmyk functions12:37.36 
  sebras: yes, I saw. with that said, the printf commits LGTM.12:37.56 
  sebras: like the if-else chains in 'Add support for writing pdfs usin Lab colorspaces.'12:38.51 
  we have a bunch of other similar if tests elsewhere too12:39.00 
  where we test n==3 and assume RGB12:39.04 
  I think it'd be good if we could change all those to colorspace_is_rgb calls12:39.26 
  and add such a function12:39.33 
sebras tor8: ok, so you'd rather we have explicit functions and test for each colorspace and not rely on interpreting the number of components?12:39.48 
tor8 and likewise for colorspace_is_gray12:39.49 
sebras ok.12:39.59 
tor8 a DeviceN and Separation colorspace could mistakenly be considered as rgb or gray12:40.26 
sebras that's a good point.12:40.40 
tor8 sebras: yeah. now that we support more than devicegray/rgb/cmyk/lab it's a bit trickier I think12:40.47 
  even with lab before, but that was such a rare colorspace12:41.01 
sebras ok, let me rearrange my commits a bit to have those that are acceptable be merged immediately.12:41.13 
tor8 sebras: so apart from those two I mentioned, the rest LGTM12:41.28 
  though 'Bug 698168: Add support for writing indexed images to pdfs.' depends on the fz_indexed_lookup commit12:41.49 
sebras tor8: I know.12:42.00 
tor8 fz_(get_)indexed_colorspace_palette?12:42.51 
sebras tor8: so if you look at sebras/master now, all commits are LGTM, right?12:42.55 
  tor8: that's indeed a better name!12:43.05 
tor8 I'd have said fz_lookup_indexed_colorspace(ctx, cs, index, char color[]) but that won't match with how you use it12:43.28 
sebras tor8: that way seems slower since we'd need to make one call per index..?12:44.14 
  tor8: do we care?12:44.20 
tor8 sebras: I don't like moving S_ISDIR to a public header file, but we've got too many messes like that already (which I intended to clean up before being side tracked)12:44.37 
sebras tor8: we're potentially writing a GIGANTIC pdf so 10ns more might not register.12:44.45 
tor8 so I'll just move that to a private header file when I get around to it12:44.52 
  we don't care :)12:45.02 
sebras tor8: ok... so nopt to a public header, but there are many things in that header that are already system-specific, why not this once too?12:45.30 
tor8 but change the name to something less confusing (lookup is a verb, hence the confusion)12:45.37 
  and it's fine by me12:45.43 
sebras tor8: right, ok.12:45.45 
tor8 sebras: all of the cruft we do to paste over missing posix stuff on windows should live in a private header12:46.10 
  so we don't pollute system headers and namespaces for users of our library12:46.32 
sebras tor8: oh, we never expose FILE * or something like that in external API:s?12:46.32 
  tor8: right, that argument resonates with me. :)12:46.48 
tor8 we unfortunately expose FILE* which makes a big mess of how to deal with LARGEFILE :(12:46.54 
  if we were to avoid FILE* we can get away with only exposing two system header files at all: stddef.h and stdint.h12:47.20 
  and we only need those for size_t and int64_t etc12:47.45 
sebras tor8: I see, well. I'll let you deal with that in the cleanup in that case and remove the S_ISDIR commit.12:49.00 
tor8 no, leave the S_ISDIR commit. there's more stuff in that header file that needs to move along with it. we're not making things any worse with that commit!12:49.24 
  and this moves them to one place where I'll remember to find them12:49.41 
  I dislike having #ifdef's in the C files (if we move to using private headers)12:49.53 
sebras tor8: then I misunderstood you. I thought you saw this one as unncessary churn. :)12:50.56 
tor8 nah, some churn is good12:51.15 
  especially as I don't know when I'll get around to the next bit of churning :)12:51.27 
sebras tor8: ok, so sebras/master LGTMed? it clusters well.12:52.16 
  tor8: oh and refetch because I might have updated.12:52.29 
sebras is worried he might push stuff which is not yet LGTMed.12:52.44 
tor8 we should make sure to have the same 'javah' tool for mupdf_native.h12:53.14 
  your javah and mine seem to output the symbols in a completely different order12:53.30 
sebras tor8: oh, is that why we have an issue? I was wondering what was happning.12:53.44 
tor8 which leads to a LOT of needless churn in mupdf_native.h12:53.46 
sebras tor8: do we need to check mupdf_native.h in?12:54.02 
  tor8: if we depend on javah regenerating it anyway...?12:54.13 
tor8 yes, we need to check it in. you know how android losers are...12:55.25 
  sebras: try changing platform/java/Makfile to use $(sort $(LIBRARY_JAVA_CLASSES)) for the javah command12:55.52 
sebras tor8: in that case we should just check in libmupdf.so into the java repos...12:55.58 
tor8 and see if you still get so many changes12:56.01 
sebras tor8: readlink -f $(which javah) give what output?12:56.44 
tor8 for me it's /opt/jdk1.8.0_121/bin/javah12:57.01 
  javah -version12:57.20 
  javah version "1.8.0_121"12:57.21 
sebras tor8: oracle? openjdk?12:57.41 
  javah version "1.8.0_141" here.12:58.07 
tor8 Oracle, I think13:00.02 
  sebras: have you tried feeding it a sorted list?13:00.23 
sebras tor8: nope.13:03.05 
  tor8: I'm testing now however.13:04.08 
  tor8: still a gigantic diff.13:05.45 
  tor8: the same if I switch to oracles: javah version "1.8.0_111"13:06.42 
  tor8: javah version "1.8.0_144" also generates a large diff.13:20.31 
  tor8: and the same one as is in df4f2031b13:21.25 
  tor8: this is with $(sort $(...)) so.. I'm not sure how to fix this.13:21.45 
tor8 I guess the only way is to do as you say: git rm platform/java/mupdf_native.h13:31.05 
  *but* that will cause problems with Android.mk13:31.37 
sebras tor8: wow.. I haven't rebuilt -mini for a while and now gradle decided to download a crapload of packages of different versions. I'm still stunned.13:33.22 
  tor8: I was doing make clean in -mini btw.. :-P13:33.38 
tor8 sebras: alternatively we make sure to run the same version of jdk13:42.31 
sebras tor8: I run 1.8.0u144 right now.13:46.25 
  tor8: downloaded it just now.13:46.49 
  tor8: http://kn-gloryo.github.io/Build_NDK_AndroidStudio_detail/13:48.07 
  tor8: https://medium.com/@siantlords/opencv-and-android-ndk-integration-in-android-studio-883a810189e213:48.09 
  seems like there is no real consensus on how to configure android studio to autogenerate the headers. :-/13:48.27 
Guest74051 Hi sebras, good morning!13:48.55 
sebras Guest74051: hi bruno.13:49.34 
Guest74051 I still don't know how to draw an image from png in the right dimensions and aspect ratio. Can you help me?13:49.36 
sebras Guest74051: did you read the logs over at https://ghostscript.com/mupdfirclogs/2017/08/14.html ?13:50.17 
Guest74051 Yes, I did. Now I have two methods: the first method draws the image in a page and the second method adds the image directly to resources branch. But, in both cases, the image is distorced and cropped.13:54.10 
  The problem is probably in the matrix.13:55.38 
tor8 Guest74051: the problem is likely as you said in the matrix14:10.49 
  how do you calculate the matrix?14:10.56 
Guest74051 fz_document_writer *out = fz_new_document_writer(ctx, "page0.pdf", "pdf", "compress");14:11.40 
  fz_matrix ctm = *fz_copy_matrix(&ctm, &fz_identity);14:11.45 
tor8 it should be [ image_width_in_points_on_the_page 0 0 image_height_in_points_on_the_page left_coordinate_in_points bottom_coordinate_in_points ]14:11.47 
Guest74051 float xs = image->w * 96.0 / image->xres;14:11.51 
  float ys = image->h * 96.0 / image->yres;14:11.57 
  fz_scale(&ctm, xs, ys);14:12.01 
  fz_rect mediabox = {0, 0, xs, ys};14:12.08 
  fz_device *dev = fz_begin_page(ctx, out, &mediabox);14:12.14 
  fz_fill_image(ctx, dev, image, &ctm, 1);14:12.19 
  fz_end_page(ctx, out);14:12.24 
tor8 Guest74051: are you sure the image->xres and image->yres values are good (some images have bogus values there)14:12.30 
  Guest74051: I see nothing obviously wrong with that code14:12.58 
Guest74051 It shows 502 for image->xres and image->yres14:12.59 
tor8 except the mediabox14:13.06 
  which may or may not be what you want14:13.15 
  502 dpi? that's pretty high res14:13.32 
Guest74051 fz_rect mediabox = {0, 0, image->w, image->h};14:14.32 
tor8 Guest74051: if you change the mediabox to be { 0, 0, 1000, 1000 } and ctm to be { 1000, 0, 0, 1000, 0, 0 } you should get a full page image which is square (so wrong aspect ratio, but would exclude any other problems)14:14.40 
Guest74051 I still get cropped image and wrong aspect ratio14:15.22 
tor8 Guest74051: can you post somewhere the input image and output pdf you get?14:15.55 
Guest74051 The image has width=4299 and height=608614:16.35 
  I just tried with other images and the code actually works. But the initial image that I tried, that is giving problems, is proprietary.14:31.02 
  Is it possible, that giving the dimensions of the image, it is split in tiles?14:31.44 
tor8 Guest74051: it is possibly a bug in the image loading code then14:31.46 
Guest74051 is it split in tiles?14:31.52 
tor8 what does it look like if you just open the image file with mupdf?14:32.00 
  you could also try "mutool convert -o out.pdf input.png"14:32.53 
  and see if that has the same errors14:32.59 
sebras tor8: I think I have hunted down the javah issue: 7f0a9b9f commit introduced a few changes to the graft api which required 8 lines of mupdf_native.h to change14:34.16 
  tor8: later dd58ea5 introduced lcms2 and updated mupdf_native.h but not in the same was as my javah.14:34.50 
Guest74051 tor8: Yes, same problem. The image is cropped exactly in half and the aspect ratio is wrong.14:37.16 
sebras tor8: those changes originate from ddf38c5 and 8174ab2 on robin/lcms. I'm not sure what version of javah he used to fix mupdf_native.h or if robin did the changes manually (presumably not?)14:38.11 
Guest74051 I did the following experiment: reduced the image to 1000x1416 pixels and the code worked. Then, yes I believe that fz_new_image_from_file is not handling big images properly.14:40.46 
tor8 Guest74051: if you could open a bug and attach the file, we can mark it as private14:47.23 
  or if you could find a non-private file that exhibits the same problems14:47.44 
sebras Guest74051: tor8: does https://upload.wikimedia.org/wikipedia/commons/3/39/Periodic_table_large.png exhibit the problem?14:48.16 
  it is supposedly 6000x3300 pixels.14:48.29 
  tor8: so I guess the javah changes stem from the lcms merge and no one noticed because it was in platform/java... ;)14:49.23 
tor8 sebras: some changes but it looks like the whole file is being rearranged with your commit14:50.00 
  it's moving the "Header for class ...RenderingIntent" section down a few lines, etc14:50.50 
  and changing constant suffixes:14:51.18 
  -#define com_artifex_mupdf_fitz_TryLaterException_serialVersionUID -7034897190745766939i6414:51.22 
  +#define com_artifex_mupdf_fitz_TryLaterException_serialVersionUID -7034897190745766939LL14:51.22 
sebras tor8: not my fault: git show ddf38c5 platform/java/mupdf_native.h | tail -2014:51.55 
  tor8: is this a javah/win vs javah/other issue?14:52.15 
tor8 quite likely14:52.33 
  I would have said if I didn't generate the same exact file locally14:52.50 
sebras tor8: https://bugs.openjdk.java.net/browse/JDK-5085132 aw crap, seems like they did this on purpose.14:53.12 
tor8 ah, hang on.14:53.33 
  I may be confused because on tor/master there is *another* fix for mupdf_native.h14:53.46 
sebras tor8: that one seems quite similar to mine!14:54.25 
  tor8: but perhaps not identical14:54.32 
tor8 so yeah, I think we're down to windows vs linux javah14:54.41 
sebras tor8: apparently there was time when the c compiler on windows didn't support the LL suffix, hence they generate i64 on that platform.14:55.20 
tor8 commit your change, and we'll poke robin and remind him not to do the mupdf_native.h regen from windows14:55.21 
sebras tor8: ok.14:55.34 
Guest74051 The command "mutool convert -o page1.pdf Periodic_table_large.png" converts it properly. My code almost does: it has the correct aspect ratio and it is not cropped, but the image dimensions are smaller than the dimensions of the paper.14:55.37 
tor8 sebras: it may be that the version of MSVC we use (2005) doesn't...14:55.57 
  but my windows VM is still broken...14:56.08 
sebras tor8: see the discussion about MSVC on the other channel btw.14:56.34 
  tor8: so, then we are back to thinking that sebras/master is LGTM..?14:57.37 
tor8 sebras: we are!14:58.24 
sebras tor8: sweet! I'll push it.14:58.32 
  tor8: btw, we should consider disabling -Wmisleading-indentation somehow.15:00.12 
  tor8: it triggers in lcms.15:00.18 
  tor8: or we could fix lcms perhaps.15:00.27 
tor8 sebras: we could disable it for lcms in Makethird15:05.01 
sebras tor8: in 3cbe36b63 I disabled freetypes stream support. this means that freetype no longer will be able to load files from the file system, but you have to supply it with a memory buffer.15:07.10 
  tor8: llpp apparently used this feature in freetype. mupdf doesn't platform/java/mupdf_native.c:load_not() e.g. uses fz_new_font_from_file() which loads to a fz_buffer before creating the font.15:08.04 
tor8 sebras: we do anyway, fz_new_font_from_file loads the font into a fz_buffer before handing to freetype15:08.38 
sebras tor8: I wasn't aware that I would be restricting how system font loading would work (and neither does the cluster exercise this bit)15:08.39 
tor8 it's so we can always get the backing file out for example for pdfwrite15:08.50 
sebras tor8: ok, now llpp did this: https://github.com/moosotc/llpp/commit/af256f5b4eefcb9c1aacd113906c54f90cc826a315:09.43 
tor8 it shouldn't affect system font loading...15:09.43 
  the fz_load_system_font_fn callback is supposed to return a fz_font... ideally loaded with fz_new_font_from_file15:10.06 
sebras tor8: ok, let me rephrase that: I wasn't aware that apps were accessing the file system directly to load fonts not related to mupdf using the same library.15:11.11 
tor8 right. using our linked freetype.15:11.37 
Guest74051 tor8, sebras: just found the steps to reproduce it15:11.42 
sebras tor8: yes.15:11.42 
tor8 for non-mupdf purposes.15:11.44 
  yeah, I see the problem now.15:11.49 
sebras tor8: quite.15:11.49 
tor8 I guess we should back that change out then15:11.58 
Guest74051 How do I upload the image?15:12.14 
sebras Guest74051: you can attach it to the bug you create at bugs.ghostscript.com15:12.37 
  Guest74051: thanks for taking the time to reproduce with an accessible file btw. most people wouldn't go through the trouble to do so, but it valuable to us. :)15:13.14 
Guest74051 No problem :)15:13.55 
sebras tor8: the llpp people also asked me if the fonts embedded in mupdf are accessible through some reasonable interface.15:21.10 
tor8 sebras: yes, the fz_lookup_xxx_font functions15:21.46 
  the ones in noto.c15:21.58 
  fz_lookup_base14_font, fz_lookup_builtin_font, fz_lookup_cjk_font, fz_lookup_noto_font, etc.15:22.45 
sebras tor8: alright, I'll tell them.15:23.35 
  tor8: sebras/master LGTMed?15:26.09 
Guest74051 tor8, sebras: I created the bug https://bugs.ghostscript.com/show_bug.cgi?id=698357#c015:28.12 
  I tested on mupdf 1.11, but bugzilla only showed 1.10.15:29.09 
sebras Guest74051: ok, I just confirmed that I see the bug. thanks.15:35.57 
mvrhel_laptop tor8: thanks for fixing the oc issue. rebased the branch and it looks good. I see one blend issue I have to fix though...16:15.27 
  but first I am going to fool with seeing how the colors are looking on the ipad16:16.06 
  oh the iOS app was moved to its own repo16:40.53 
  I was wondering what was going on16:41.00 
 Forward 1 day (to 2017/08/16)>>> 
ghostscript.com #ghostscript
Search: