Log of #mupdf at irc.freenode.net.

20180416
tor8 morning Robin_Watts.08:51.35 
  you said you had a patch for premake?08:51.40 
Robin_Watts morning08:51.42 
  Yeah. Let me get you a URL.08:51.49 
tor8 there's another problem I think we'll want to patch as well08:52.07 
  it doesn't handle quotes in 'defines' properly08:52.19 
Robin_Watts https://github.com/robinwatts/premake-core/tree/artifex08:52.42 
tor8 so I needed to add \ escapes manually, but that breaks when generating 'gmake' files08:52.44 
Robin_Watts as in -DFOO="BAR" ?08:53.25 
tor8 yeah08:55.25 
Robin_Watts I'm going to be distracted this morning.08:56.12 
  My hosting company just changed, and they broke "catch all" email addresses, which kinda screws me.08:56.40 
tor8 Robin_Watts: okay.08:57.46 
paulgardiner Robin_Watts: gah, you too madly typing a list of forwards! I would have warned you, but that happened to me because of swapping to Gandi rather than waiting to see how Site5 worked out.09:05.59 
__monty__ Is there a way to specify the width of plaintext output from mutool draw? Neither -w nor -W seem to affect it.09:06.33 
Robin_Watts I can match on a regex, so I can do "^(foo|bar)@wss.co.uk"09:08.10 
  but what I really want to do is the NEGATION of that, and that's harder.09:08.22 
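One common way to express the negation Robin_Watts is after is a negative lookahead. The sketch below uses Python's `re` module; whether the hosting company's matcher supports lookaheads is an assumption, and `is_catch_all` is a hypothetical helper name, not part of any mail system.

```python
import re

# Negation via a negative lookahead: match any address at the domain
# whose local part is NOT exactly "foo" or "bar".
CATCH_ALL = re.compile(r"^(?!(?:foo|bar)@)[^@]+@wss\.co\.uk$")


def is_catch_all(address: str) -> bool:
    """Return True for addresses that the explicit foo/bar rule would not cover."""
    return CATCH_ALL.match(address) is not None
```

The lookahead includes the `@` so that an address such as `foobar@wss.co.uk` still counts as "not foo and not bar".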
tor8 __monty__: the lines are broken where the input breaks the lines. if you want to reformat, pipe the output through the "fmt" tool09:08.22 
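For a preview script that already runs Python, the reflow step tor8 suggests could be sketched with the standard `textwrap` module instead of `fmt`. This assumes the text from mutool has already been captured into a string; `reflow` is a hypothetical helper and makes no attempt to reproduce `fmt` exactly.

```python
import textwrap


def reflow(text: str, width: int = 72) -> str:
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = text.split("\n\n")
    wrapped = [
        # Collapse the original line breaks, then wrap to the new width.
        textwrap.fill(" ".join(p.split()), width=width)
        for p in paragraphs
        if p.strip()
    ]
    return "\n\n".join(wrapped)
```

Paragraph boundaries are kept, and only the line breaks inside each paragraph are recomputed.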
__monty__ tor8: Thanks.09:10.12 
  I have a couple documents where pdftotext (from poppler) gives a much better result than mutool draw, is this user error or a limitation of the tool? Example 1: http://ix.io/17UF, mutool http://ix.io/17UG Example 2: http://ix.io/17UH, mutool http://ix.io/17UI10:09.08 
kens It's not really possible to tell without seeing the original PDF file10:10.15 
__monty__ Maybe better is too strong a statement because in the second example mutool's output is more complete I guess, even though I don't really care about the vertical text that pdftotext doesn't show.10:10.38 
kens It 'looks like' the pdftotext code is trying to reassemble the document, whereas mutool draw isn't, but that's just a guess10:10.39 
__monty__ 1st pdf: http://repository.readscheme.org/ftp/papers/pe98-school/D-364.pdf10:12.31 
  2nd pdf: https://www.cl.cam.ac.uk/~sd601/papers/mlsub-preprint.pdf10:12.38 
kens The mutool output is dumping every text section as a separate line, even when it's on the same vertical position10:13.54 
  You'd have to get Tor8 to tell you if there are other options I think10:14.07 
Robin_Watts __monty__: Ogg Monty, or a different Monty ?10:16.59 
kens There's no ToUnicode information in the PDF file, which is why you get the binary output10:17.08 
__monty__ Robin_Watts: Different monty.10:17.47 
Robin_Watts mudraw can give you different formats of text output.10:18.05 
  txt is raw text.10:19.04 
  html = an attempt at an html page to replicate10:19.15 
  stext = XML format with positions etc.10:19.24 
  You may find that you get better information by working with one of the latter forms.10:19.39 
__monty__ I need plaintext though.10:19.41 
  I'm using it for ranger previews.10:19.52 
  That's also the reason the vertical "banner" on the second one's a little annoying.10:20.12 
  But I figure mutool'd be stricter than pdftotext so it's not really a problem. Was just wondering whether there was something simple I could do.10:20.44 
Robin_Watts The problem is that PDF doesn't contain a "flow" of text.10:21.24 
moolc Robin_Watts: Vorbis Monty... Ogg is just a container (and i have no idea whether Monty was involved with that in any significant way)10:21.28 
  __monty__: ranger as in file manager?10:21.48 
__monty__ Yes, moolc.10:21.54 
Robin_Watts It literally contains, "put this glyph at position x,y" "put this glyph at position x,y" etc.10:22.00 
moolc __monty__: why don't you just render to image then?10:22.04 
Robin_Watts so we decide when to split lines based on the gaps between chars. It's all very grotty.10:22.25 
  moolc: Yes, but he'd have known what I meant. :)10:22.41 
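The gap-based splitting Robin_Watts describes can be illustrated with a toy sketch. This is not MuPDF's code: the `Glyph` type, the `glyphs_to_text` function and the 0.5 and 0.25 thresholds are all hypothetical, chosen only to show how text is rebuilt from "put this glyph at x,y" instructions.

```python
from typing import List, NamedTuple


class Glyph(NamedTuple):
    char: str
    x: float        # left edge of the glyph on the page
    y: float        # baseline position
    advance: float  # horizontal advance of the glyph


def glyphs_to_text(glyphs: List[Glyph], font_size: float) -> str:
    """Rebuild text from positioned glyphs using gap thresholds (illustrative only)."""
    pieces: List[str] = []
    prev = None
    for g in glyphs:
        if prev is not None:
            gap = g.x - (prev.x + prev.advance)
            if abs(g.y - prev.y) > 0.5 * font_size:
                pieces.append("\n")  # baseline moved: start a new line
            elif gap > 0.25 * font_size:
                pieces.append(" ")   # wide gap on the same baseline: word space
        pieces.append(g.char)
        prev = g
    return "".join(pieces)
```

Both thresholds scale with `font_size`, so a file that reports a misleading font size will throw off every decision in a scheme like this.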
__monty__ moolc: Because I don't want image previews : )10:22.42 
Robin_Watts I was peripherally involved in ogg vorbis/theora back in the day, and knew Monty from that vaguely.10:23.17 
moolc Robin_Watts: ;)10:23.58 
  Robin_Watts: he had an excellent videos dedicated to the a/v coding basics10:24.25 
  -an10:24.33 
Robin_Watts Yes, he's a smart cookie.10:24.47 
moolc __monty__: shrug :)10:24.47 
  tectronics oscilator to explain nyquist was a nice touch10:25.24 
__monty__ I only remember nyquist diagrams vaguely. No clue what a tectronics oscilator is.10:26.24 
moolc __monty__: i messed spelling in that one up (in more than one way) https://www.tek.com/oscilloscope10:32.40 
tor8 __monty__: the D-364.pdf is going wonky because the file has some very weird ideas about font sizes and spacings10:32.41 
kens Yes I was going to mention that, I just tried running it through Ghostscript's txtwrite, and got some very peculiar answers10:33.23 
tor8 __monty__: it gives a font size of 0.12 points, and then the glyphs are thousands of units wide10:33.26 
  so it has a font that violates the principle of the 'em' size being the size of the font... it has an infinitesimal em-size and then characters that are a thousand times bigger10:34.20 
__monty__ tor8: I figured it was just a bad pdf. It's not a big deal. I just noticed that pdftotext handles it a little better.10:34.48 
  Subjectively of course.10:34.56 
kens Looks like the glyphs are all bitmapped type 3 fonts10:35.15 
  values for the actual glyphs are sane10:35.41 
tor8 __monty__: it's not entirely uncommon for TeX-produced PDF files to have "strange" fonts10:35.42 
kens Widths arrays look reasonable too10:35.57 
  Oh, and at least one *does* have a ToUnicode10:36.08 
tor8 kens: the 0.12 0 0 -0.12 Tm and text objects like [(tro)-4000(duction)]TJ10:37.05 
kens Yes I saw that10:37.14 
  But the actual font data looks reasonable (for at least some fonts)10:37.25 
  But it looks like the ToUnicode values are only partial10:38.12 
tor8 kens: yeah, I wonder how all these numbers add up...10:38.56 
kens It puzzles me I have to say, the Widths array looks OK10:39.13 
  The d1 values look sensible10:39.27 
tor8 the FontMatrix is unity, the FontBBox seems to be basically 100x100 and so do the type3 bitmap images and font widths10:39.29 
  so we've got a font that when em-size/font scaling is at 1 point, the bitmaps are 100 times bigger10:40.04 
  which adds up and explains why mupdf is confused10:40.18 
kens Probably explains why txtwrite is also confused10:40.29 
  Since it's trying to represent the page10:40.41 
tor8 mupdf uses the font scaling as the base value for its line breaking and adding space character heuristics10:40.46 
  and the font size is 0.12 points (but the actual rendered text is 100x bigger, so 12 points)10:41.20 
  we can't trust the FontBBox and use that instead (it's even *more* unreliable for a vast number of fonts and pdf files)10:41.57 
  it might be possible to look at the average advance width value and use that instead of the font size10:42.37 
  but really, just create a proper PDF file. I expect LaTeX's PDF output has improved since 2004 when this file was created.10:43.07 
  __monty__: the "vertical" text in mlsub-preprint.pdf is using the Identity-H encoding (so a horizontal writing mode) but spaced vertically10:45.58 
  we do exactly what we're supposed to there10:46.05 
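The font-size arithmetic from tor8's explanation of D-364.pdf can be written out as a small sketch. The 0.12 scale and the roughly 100-unit glyphs come from the discussion; the function names and the sample advance widths are hypothetical, and the advance-based estimate is only the idea floated above, not something MuPDF is known to implement.

```python
from statistics import mean
from typing import Sequence


def rendered_size(declared_size: float, glyph_units: float) -> float:
    """Size the text actually appears at: declared size times glyph extent."""
    return declared_size * glyph_units


def advance_based_size(declared_size: float, advances: Sequence[float]) -> float:
    """Alternative base value: the mean glyph advance scaled to page units."""
    return declared_size * mean(advances)


declared = 0.12                          # what the heuristics see
actual = rendered_size(declared, 100)    # roughly 12 points on the page
# Hypothetical advance widths for glyphs in a 100x100 box.
estimate = advance_based_size(declared, [45, 55, 50, 60, 40])
```

With these made-up advances the estimate comes out around half the rendered size, which is a far more usable base for gap thresholds than the declared 0.12.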
__monty__ Ok, no problem. Thanks for looking into it.10:49.49 
Robin_Watts tor8: So, did that branch work for you?11:56.39 
  I need to sort out a pull request to get those changes (or something like it) pulled back into the original source.11:57.02 
tor8 I'm having trouble building premake5 natively on my windows vbox12:02.18 
  missing stdint.h ...12:02.28 
moolc tor8: does vbox still have the credits section in about menu?12:03.36 
Robin_Watts tor8: I had to build it using vs201712:04.31 
  I started to fix it to work with 2005, but I hit the stdint.h and other stuff.12:04.59 
tor8 Robin_Watts: right. the world really has moved on since vs2005... stdint.h was introduced to the C standard in C9912:06.20 
moolc no version of microsoft c ever claimed to support C99 though12:06.57 
Robin_Watts I tried vs2010 too, and it still failed, I think.12:06.57 
tor8 moolc: I know. microsoft has been amazingly annoying in that respect.12:07.47 
  where C99 intersects C++ they bother to support it12:07.59 
  moolc: but hey, what do you expect from a company that implements strlen() using C++ and templates?12:08.22 
moolc tor8: :) they, i believe, even have an entry in some FAQ regarding C99 support12:12.50 
  it basically says - migrate to IOS C++12:13.01 
  i wonder if Herb Sutter authored it12:13.11 
  IOS=ISO12:13.19 
  then again, no compiler ever supported full C99.. (well maybe Comeau did at some point)12:14.24 
tor8 Robin_Watts: I'm not managing to get your modifications to work... the stuff it outputs are still using /Z7 and ProgramDataBaseFileName12:15.36 
  Robin_Watts: when building on linux and using that to generate vcproj files I want to copy over to windows12:16.53 
  okay, if I did a clean checkout/bootstrap rebuild it worked. weird.12:18.38 
  Robin_Watts: and that makes the vcproj files much more usable on vs200512:20.56 
  the debug/release configs have the proper options set, and it doesn't want to rebuild all the time12:21.22 
  I'm getting a LNK4098 warning still (defaultlib 'libcmt.lib' conflicts with use of other libs)12:21.42 
Robin_Watts tor8: If you change the lua, you need to "premake5 embed"12:46.33 
  then build premake12:46.42 
  tor8: For what target/configuration?12:47.09 
  (are you getting the conflicting lib?)12:47.24 
  tor8: I can't see that warning here.13:03.03 
tor8 Robin_Watts: ah, that step would be it13:03.34 
  Robin_Watts: mupdf-gl config=debug13:03.44 
Robin_Watts A rebuild of that just worked with no warnings for me.13:04.46 
  Was it possible that you weren't building from clean?13:05.08 
tor8 let me retry13:05.33 
  Robin_Watts: clean bill of health here too.13:08.58 
  sorry for the false alarm13:09.03 
Robin_Watts no worries.13:09.10 
  I'm just opening pull requests for these. I expect them to get kicked back, but it means we can start a discussion with them at least.13:09.39 
tor8 now all that's needed is fixing the quoting stuff in 'defines'13:09.42 
  and then upstream to take on the fixes in one form or another13:10.03 
  even if they don't use your code, they'll know what to fix13:10.09 
Robin_Watts tor8: Ok, so I can look at that now.13:21.03 
  Has your branch been updated with something that needs it?13:21.14 
tor8 Robin_Watts: there's a commit on tor/wip that removes the explicit extra quotes13:23.52 
  without that commit, 'premake gmake' creates bad makefiles13:24.08 
  and with it, 'premake gmake' creates correct makefiles but incorrect vs2005 project files13:24.20 
Robin_Watts ok, so I'll pull that in, and look at it.13:26.59 
  tor8: I am confused.13:39.26 
  I've pulled in your latest version, deleted all the build files, regenerated them, and they seem fine.13:39.52 
  " is being translated to &quot; which is right, aiui.13:40.04 
  certainly it looks right in the VS properties.13:40.16 
  tor8: OK, so I see the problem in the build.13:41.50 
  tor8: I have a fix. Will prepare a commit/pull request now.14:03.21 
HenryStiles tor8: did you tell this person there was a bounty? I assume not, https://bugs.ghostscript.com/show_bug.cgi?id=69921814:04.47 
tor8 HenryStiles: I have not. I stay well out of the whole bountiable issue.14:05.21 
Robin_Watts tor8: commit pushed.14:09.44 
tor8 Robin_Watts: I wonder if the same stuff needs to be done for the other vs20xx output modules too14:10.55 
Robin_Watts tor8: possibly. I was hoping to get a comment back from the authors about what we've done so far, but I can dig into that too if required.14:11.50 
tor8 let's wait a bit and see what they say14:12.03 
  gmake/gmake.lua has a 'make.esc' which adds a \ to " in its escaping function14:12.25 
  modules/vstudio/vs2010.lua's escape function doesn't even touch double quotes14:13.32 
  in the vstudio modules, the preprocessorDefinitions function takes an 'escapeQuotes' argument14:15.38 
  upstream is more likely to know what to do to fix this properly, hopefully14:17.16 
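The quoting problem in 'defines' comes down to each output format needing its own escaping, applied exactly once. The sketch below is in Python rather than premake's Lua, and `escape_for_make` and `escape_for_vcproj` are hypothetical names, not premake functions; it only mirrors the behaviour described above (a backslash before `"` for makefiles, `&quot;` in Visual Studio project XML).

```python
from xml.sax.saxutils import escape


def escape_for_make(define: str) -> str:
    """Backslash-escape double quotes for use on a makefile command line."""
    return define.replace('"', '\\"')


def escape_for_vcproj(define: str) -> str:
    """Encode double quotes as &quot; for an XML attribute value."""
    return escape(define, {'"': "&quot;"})


define = 'FOO="BAR"'
make_flag = "-D" + escape_for_make(define)   # -DFOO=\"BAR\"
vcproj_value = escape_for_vcproj(define)     # FOO=&quot;BAR&quot;
```

If the project script adds backslashes by hand first, a makefile escaper of this kind adds a second layer on top, which is one plausible reading of why the manual escapes broke the gmake output.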
Robin_Watts pops to collect new glasses. bbs.14:18.24 
  ok, varifocals. I am officially old.15:13.21 
kens Don't say that, you're younger than me15:13.53 
Robin_Watts I'm old. Not ancient :)15:14.11 
kens :-D15:14.18 
moolc Robin_Watts: varifocals the glasses that solve myopia and (whatever is the name of the opposite) at the same time?15:15.40 
Robin_Watts different prescriptions for looking straight ahead, and looking down, yes.15:16.13 
moolc Robin_Watts: condolences15:16.29 
Robin_Watts moolc: On balance, getting old is better than the alternative.15:16.55 
kens Depends on the alternative15:17.12 
moolc Robin_Watts: you watch too much Brad Pitt movies15:17.14 
  at least you toddlers don't have to wear an eye patch15:17.47 
kens Give it time....15:18.02 
HenryStiles Robin_Watts: we call them progressive lenses, had to google what you meant15:18.51 
sebras tor8 (for the logs): sebras/master has a commit or two.19:35.41 
  oh and I liked the semantic compression article. :)19:36.00 