Log of #mupdf at irc.freenode.net.

20180905
persmule Hi. Currently there is a workaround inside pdf-font.c for PDFs ill-produced by S22PDF, allowing mupdf to display them, but is there any way to write the usable font descriptor back to the PDF files, in order to fix them?08:27.42 
tor8 persmule: you mean so you can pass the file around to other PDF tools that don't share the same workaround?08:31.00 
persmule Yes.08:31.21 
tor8 it's easy enough to write a 'mutool run' script if it's just a handful of files that you need to fix08:31.29 
  well, 'easy' is stretching it a bit08:32.09 
  in PDF there are two main flavors of fonts -- simple fonts with 256 characters, and CID fonts with more characters08:33.01 
  the S22PDF generator writes the structures for a simple font with 256 characters and says that it uses ASCII encoding but then writes multi-byte character codes using windows codepage 936 08:34.19 
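To illustrate the mismatch tor8 describes, here is a minimal sketch in plain JavaScript (no MuPDF calls) that splits the same bytes into character codes in two ways: one byte per code, as a simple font expects, and with code page 936 style two-byte codes. The function names and the sample bytes are illustrative assumptions, and the lead-byte test is simplified (it does not validate the trail byte).

```javascript
// A simple font treats every byte as one character code.
function splitSingleByte(bytes) {
	return bytes.map(function (b) { return [b]; });
}

// A code page 936 (GBK) reader treats a byte in 0x81..0xFE as the lead
// byte of a two-byte code. Simplification: the trail byte is not validated.
function splitCp936(bytes) {
	var codes = [];
	var i = 0;
	while (i < bytes.length) {
		var b = bytes[i];
		var isLead = b >= 0x81 && b <= 0xFE && i + 1 < bytes.length;
		if (isLead) {
			codes.push([b, bytes[i + 1]]);
			i += 2;
		} else {
			codes.push([b]);
			i += 1;
		}
	}
	return codes;
}

// Hypothetical sample: ASCII 'A' followed by four high bytes.
var sample = [0x41, 0xCB, 0xCE, 0xCC, 0xE5];
var asSimple = splitSingleByte(sample); // five one-byte codes
var asCp936 = splitCp936(sample);       // one one-byte code, then two two-byte codes
```

A viewer that trusts the simple-font structures sees five glyphs here, while the producer meant three characters, which is why the font dictionary has to be rewritten rather than just the encoding name.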
persmule I once tried "mutool clean -s", but it is unable to write the fix to file.08:43.57 
  tor8: You mean I can do the fix by simply adding a CID font to the PDFDocument object created from an S22PDF-produced file?09:00.25 
tor8 persmule: you need to change the font descriptor object structure from being a simple font to being a cid font09:06.21 
  with an appropriate encoding09:06.29 
persmule tor8: By reproducing what is done in pdf-font.c via JS code?09:07.40 
sebras tor8: I have a few commits on sebras/master as well as an implementation of the StructuredTextWalker thingy at the top of sebras/wip09:50.13 
  tor8: I did notice that the colorspace name might be clipped to 24 characters. so that one might not be perfect, but at least it illustrates what I want to do.09:56.24 
tor8 persmule: something like this script: http://ix.io/1m0S 10:10.40 
persmule tor8: Thanks.10:14.20 
avih tor8: hmm... the getopt thingy also fails on osx. (in addition to two mingw setups and alpine linux). it basically only works on ubuntu (debian?)10:15.51 
persmule tor8: The JS API of mutool does not seem well documented, e.g. the property Font#Encoding cannot be found on https://mupdf.com/docs/manual-mutool-run.html .10:16.57 
tor8 persmule: it assumes an intimate knowledge of the PDF reference10:17.32 
  addCJKFont returns a PDF object representing the font10:17.42 
  font.Encoding is poking at the internal PDF object properties10:18.06 
persmule The function addCJKFont is not documented either.10:18.31 
tor8 persmule: that much is true!10:18.46 
  sebras: the colorspace thing will need to be rebased on top of "Use colorspace type enum instead of magic profile names."10:37.55 
sebras tor8: can do.10:39.17 
  tor8: apart from that, are you happy with it?10:39.26 
tor8 wow, color management in GIF ... that's rather strange10:41.34 
  sebras: but yes, apart from that I'm happy with the commits on sebras/master10:42.03 
sebras tor8: yes, it was in one of the application extensions that we used to ignore.10:42.09 
tor8 sebras: why not use the to_Rect and to_Matrix utility functions?10:45.50 
  in the text walker stuff10:45.56 
  and I'd remember the last font object so you only create a new font wrapper when the font changes10:47.00 
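The caching idea in the last remark can be sketched independently of JNI. This is a plain JavaScript illustration, not the actual walker code: `createWrapper` is a hypothetical stand-in for whatever builds the wrapper object, and only the most recent font is remembered.

```javascript
// Returns a function that builds a new wrapper only when the font differs
// from the one seen on the previous call. createWrapper is hypothetical.
function makeLastFontCache(createWrapper) {
	var lastFont = null;
	var lastWrapper = null;
	return function wrap(font) {
		if (font !== lastFont) {
			lastFont = font;
			lastWrapper = createWrapper(font);
		}
		return lastWrapper;
	};
}

var created = 0;
var wrap = makeLastFontCache(function (font) {
	created += 1;
	return { name: font.name };
});

var fontA = { name: "A" };
var fontB = { name: "B" };
// Fonts as they might appear for consecutive characters.
var runs = [fontA, fontA, fontA, fontB, fontB, fontA];
var wrappers = runs.map(function (font) { return wrap(font); });
```

Because text usually arrives in long runs of the same font, remembering one entry already avoids most of the allocations; a font that reappears after a different one is wrapped again.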
  sebras: if you're happy with the commits on tor/master (up to the missing fz_var declarations) I can push that10:48.42 
sebras tor8: I should probably be using to_Rect_safe() since I'm not handling fitz exceptions.10:48.52 
tor8 I added a few minor changes and documentation additions to murun10:48.57 
  sebras: one of them anyway :)10:49.30 
sebras tor8: should you be mentioning scriptPath in docs/manual-mutool-run.html?10:50.30 
  you do add scriptArgs there so add both to not get persmule chasing you in the future. :)10:50.54 
tor8 true, will add.10:51.18 
persmule sebras: most existing js scripts for mutool use argv.10:52.26 
tor8 persmule: sebras: I learned just the other day that Mozilla's SpiderMonkey js shell puts the script path and arguments in scriptPath and scriptArgs10:56.00 
  so I'm matching that behaviour10:56.05 
  as that's what the plain 'mujs' shell also does10:56.21 
sebras tor8: yes, I know. I can't see a problem with that.10:56.38 
tor8 other than breaking existing scripts (which I expect there to be not too many of in the wild)10:56.56 
sebras tor8: I was just worried that the docs and the software were being inconsistent. :)10:56.57 
persmule At least in the version I use, scriptArgs does not exist yet.10:58.16 
sebras persmule: tor8 has not pushed the change to the main repository yet. :)10:58.39 
tor8 persmule: in the plain 'mujs' shell or 'mutool run'?10:58.40 
  persmule: use argv[1] instead of scriptArgs[0] if you want to run the example script I pasted earlier on non-bleeding-edge mupdf10:59.19 
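A script that should run on both old and new builds could pick its arguments with a small helper like the sketch below. The helper name `resolveScriptArgs` and the `env` parameter are illustrative assumptions. It also assumes that `argv[0]` holds the script itself, which is what the `argv[1]` versus `scriptArgs[0]` advice above suggests.

```javascript
// env is an object carrying whichever globals the shell provides.
// Passing it in keeps the helper usable outside mutool.
function resolveScriptArgs(env) {
	if (env.scriptArgs !== undefined) {
		// Newer behaviour: scriptArgs holds only the script's arguments.
		return Array.prototype.slice.call(env.scriptArgs);
	}
	if (env.argv !== undefined) {
		// Older behaviour (assumed): argv[0] is the script, arguments follow.
		return Array.prototype.slice.call(env.argv, 1);
	}
	return [];
}

var newStyle = resolveScriptArgs({ scriptArgs: ["in.pdf", "out.pdf"] });
var oldStyle = resolveScriptArgs({ argv: ["fix.js", "in.pdf", "out.pdf"] });
```

Inside a real script the `env` object could be filled using `typeof scriptArgs !== "undefined"` style checks, so that a missing global does not raise an error.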
persmule tor8: I have done it that way.10:59.55 
  My version is 1.13.0. 11:00.39 
  tor8: This fix is quite useful, since mupdf cannot print. If some S22PDF-produced files are going to be printed, they should be fixed first before feeding them into programs capable of printing PDFs, e.g. poppler frontends.11:04.17 
sebras tor8: once you add scriptPath to the docs go ahead and merge. then I'll go next with the ICC-stuff given that you are happy with my rebase..?11:04.50 
tor8 sebras: done.11:07.33 
sebras tor8: is it only me or do you also get updates in platform/java/mupdf_native.h due to PDFWidget?12:11.44 
tor8 sebras: hm? the PDFWidget stuff hasn't gone in yet, has it?12:20.58 
sebras I can't find it in the git log. what on earth is going on?!12:22.26 
tor8 sebras: remnants from fred's forms2?12:24.40 
  IIRC the javah tool is finicky.12:24.50 
  maybe it tries to keep old stuff in the header and not regenerate everything?12:25.11 
sebras there were some uncommitted files in a directory, yes.12:25.31 
  tor8: why did you remove public from the interface members? https://docs.oracle.com/javase/tutorial/java/IandI/interfaceDef.html seems to indicate that without them the access rights are incorrect..?12:28.18 
  or rather, unreachable.12:28.31 
persmule tor8: Could you make the script you just provided for me a part of the future release of mupdf?12:29.18 
  tor8: Since S22PDF seems disastrously popular in China, a Free software solution to fix S22PDF-produced PDFs is eagerly needed.12:31.22 
  tor8: as stated in https://bugs.ghostscript.com/show_bug.cgi?id=691457 , in which S22PDF's problem was first detected.12:35.08 
  tor8: Earlier solutions all depend on proprietary software.12:36.42 
  tor8: Since it is your work, it had better be published by you, not me, and I believe to publish it as a part of future release of mupdf is the best way.12:38.28 
  tor8: Besides, I have fixed a minor bug in the script at http://ix.io/1m1i , since not all pages have fonts referenced.12:41.24 
  tor8: http://ix.io/1m1j is better, which fixes the pdf in place by default unless the second parameter is given.12:43.49 
tor8 persmule: I can put the script in the docs/examples directory12:49.20 
persmule tor8: That is just what I want. Thanks.12:49.48 
sebras tor8: StructuredTextWalker.beginLine() does not supply fz_stext_line_s->dir12:51.25 
tor8 persmule: you may know better which of the sun/hei/kai/fang/li fonts should be serif/sans-serif style12:51.41 
  sebras: true. do you think it should?12:51.52 
sebras tor8: I'm not sure. is it meant as an internal field?12:52.25 
persmule tor8: There is no correspondence, only similarity.12:52.40 
tor8 persmule: I know ... the PDF format has a flag for 'serif' style only though so we have to shoehorn it in12:53.17 
persmule tor8: I do know that hei and fang are similar to sans-serif.12:57.33 
sebras tor8: if you use StructuredText you are presumably mostly interested in the logical order of characters (for searching/marking). so perhaps the dir is not as useful there as it is when you need to figure out in what order to insert the characters into the stext in the first place. perhaps it is best left out as an internal field.12:58.27 
tor8 sebras: you can infer the direction for each character from the quad12:59.22 
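One way to infer the direction is to take the vector along the bottom edge of the quad and normalise it. The sketch below assumes a plain object with `ll` and `lr` corner points (named after the corners of `fz_quad`) and assumes the baseline runs from the lower-left to the lower-right corner. `directionFromQuad` is an illustrative name, not a MuPDF function.

```javascript
// Returns a unit vector pointing from the lower-left to the lower-right
// corner of the quad, or a zero vector for a degenerate quad.
function directionFromQuad(quad) {
	var dx = quad.lr.x - quad.ll.x;
	var dy = quad.lr.y - quad.ll.y;
	var len = Math.sqrt(dx * dx + dy * dy);
	if (len === 0) {
		return { x: 0, y: 0 };
	}
	return { x: dx / len, y: dy / len };
}

// A character advancing to the right, and one advancing down the page.
var horizontal = directionFromQuad({ ll: { x: 10, y: 20 }, lr: { x: 16, y: 20 } });
var vertical = directionFromQuad({ ll: { x: 10, y: 20 }, lr: { x: 10, y: 26 } });
```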
sebras sure, the question is: would a consumer need it. the more I think about this, the more I believe they wouldn't.13:00.19 
persmule tor8: and song is similar to serif fonts in the West.13:00.20 
  tor8: Kai and Li are ambiguous. Many systems count them as serif fonts, since their strokes have varying widths, unlike the sans-serif-like Hei and Fang, which have a constant stroke width.13:13.43 
tor8 sebras: I suspect not13:14.40 
  persmule: thanks. that's roughly what I thought as well.13:15.23 
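The classification agreed on above can be written down as a small lookup table. This is a sketch of the conversation's conclusion, not the contents of the actual script: the key names and the `styleFor` helper are illustrative, and "sun" is assumed to refer to a Song typeface.

```javascript
// Typeface family -> style label, following the discussion above.
var CJK_STYLE = {
	sun: "serif",       // assumed to be a Song typeface
	song: "serif",      // similar to Western serif fonts
	hei: "sans-serif",  // constant stroke width
	fang: "sans-serif", // constant stroke width
	kai: "serif",       // ambiguous; many systems count it as serif
	li: "serif"         // ambiguous; many systems count it as serif
};

// Returns the style label, or undefined for an unknown family.
function styleFor(family) {
	var key = String(family).toLowerCase();
	return Object.prototype.hasOwnProperty.call(CJK_STYLE, key) ? CJK_STYLE[key] : undefined;
}

var heiStyle = styleFor("Hei");
var kaiStyle = styleFor("kai");
```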
sebras tor8: in that case I feel "jni: Add StructurexTextWalker interface." seems usable. I added the BlockWalker to be able to retain the getBlocks() interface so as to avoid complaints about changing the API.13:16.01 
  I'd rather remove it, but hey...13:16.13 
tor8 persmule: http://git.ghostscript.com/?p=user/tor/mupdf.git;a=blob;f=docs/examples/fix-s22pdf.js;h=4a2789ec58bb496ff38066e20b9db50dd3945f6b;hb=2d512ffd1a9d19faabe38b3bbb419b5950a8105b 13:17.13 
persmule tor8: Thank you very much.13:19.15 
tor8 sebras: yeah, but I suspect fred may be using it13:19.18 
  possibly ask him first?13:19.27 
  sebras: we should probably add StructuredText.snapSelection(Point a, Point b, int mode)13:20.39 
sebras tor8: that's why I retained the interface.13:20.42 
tor8 which calls fz_snap_selection13:20.44 
sebras can do.13:20.56 
tor8 and then maybe he can use that instead of cooking his own13:21.13 
sebras tor8: seems like StructuredText.snapSelection() would need to return a struct of a quad and the a and b points seeing as fz_snap_selection() actually modifies the a and b points to snap to the correct positions.13:32.23 
tor8 sebras: you would modify the a and b input points13:34.09 
  i.e. the java Point objects would be in-out as well13:37.06 
sebras tor8: I'm daft. I forgot that functions may change java objects. :)14:04.24 
  tor8: do we want to add a link to where to get S22PDF in the script?14:44.59 
  tor8: I tried searching for it but came up empty14:45.11 
  persmule: where do I get S22PDF?14:46.28 
tor8 sebras: I don't know that it's still used... it was extremely popular a long time ago and there are a lot of broken files out there already.14:58.23 
sebras I see.15:02.21 
tor8 sebras: "StructurexTextWalker" typo otherwise sebras/master LGTM15:05.43 
sebras nice!15:06.05 
moolc ghostscript package for my distro was updated _3_ times in the last 24 hours... amazing15:15.43 
sebras moolc: there have been a number of security bugs fixed and a new release just the other day.15:16.34 
moolc sebras: yes, i know, but _three_ bloody times15:17.04 
persmule sebras: It may have been popular before pdf printers were introduced to M$WIN, just like a disastrous influenza.15:32.45 
sebras tor8 (for the logs): I tracked down a memory leak in the ICC loading code and noticed what I believe to be a typo taking sizeof(ptr). in any event it clusters fine.20:37.12 
mojca I'm looking for some hints about how to avoid "platform/gl/gl-main.c:1677:16: error: use of undeclared identifier 'GLUT_ACTION_ON_WINDOW_CLOSE'"20:52.49 
  I'm trying to compile 1.13.0 on macOS 10.13 20:53.04 