IRC Logs

Log of #ghostscript at irc.freenode.net.

2013/02/06
henrys marcosw:ping02:56.55 
marcosw henrys: hey02:57.03 
henrys can you be careful to be on top of this customer using the old Aladdin version.02:58.20 
  I read he is using 6.5 and you said 5.5 maybe I didn't read everything carefully02:58.43 
marcosw I think he said 5.5.x originally. But in any case, 6.5 is pretty old :-)02:59.53 
henrys indeed anyway they are shaping up to be an important customer - the upper crust of the top ten so to speak so we do want to keep on top of it.03:02.46 
marcosw henrys: I found a patch for the customer using 6.50. We fixed the problem 10 years and 10 days ago, which must be a record. The customer should probably upgrade to a more recent version, e.g. something that was released in this century.05:11.24 
henrys great work the power of bisect!05:12.12 
mvrhel_laptop wow05:13.36 
  hmm. have the pages rendering and sliding across with the touch gestures as well as the slider control in the windoze app. need to do a little work for when the device is rotated about and also need to add scaling case and searching06:44.08 
  enough for tonight06:44.47 
ManDay Can gs be used to remove a page from a pdf?09:02.10 
kens Yes, and no.....09:02.29 
ManDay Can gs be used to remove a page from a pdf?09:02.45 
kens You can create a new PDF which doesn't contain the page, you can't remove the page from the original09:02.52 
ManDay Isn't that equivalent?09:03.06 
kens Well, it might be.09:03.23 
ManDay That sounds good enough, thanks09:03.33 
kens But the output PDF file will not be constructed in the same way as the original09:03.44 
ManDay I guess I'll see what that means09:04.14 
kens What you need to do is run the file through GS using the pdfwrite device and specifying FirstPage and LastPage, then do it again with a different FirstPage and LastPage (obviously leaving out the page you don't want)09:04.31 
  Then run GS a third time, giving the two input files and create a new one.09:04.48 
  But really, I think pdftk can do this, and it'll be faster and more reliable if it can09:05.06 
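The three runs kens describes can be sketched as command lines. This is a minimal sketch, not a tested recipe: the helper name `remove_page_commands`, the `.partN.pdf` temporary file names, and the need to know the page count in advance are all assumptions made for illustration. Only the Ghostscript switches themselves (`-sDEVICE=pdfwrite`, `-dFirstPage`, `-dLastPage`, `-sOutputFile`, `-dBATCH`, `-dNOPAUSE`) come from Ghostscript.

```python
def remove_page_commands(src, dst, page, page_count, gs="gs"):
    """Build the Ghostscript command lines that rebuild `src` without `page`.

    Pages are 1-based, matching -dFirstPage / -dLastPage.
    The function only builds argument lists; it does not run anything.
    """
    if page_count < 2 or not 1 <= page <= page_count:
        raise ValueError("need at least two pages and a page inside the document")

    base = [gs, "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite"]

    # Page ranges to keep: everything before and everything after `page`.
    ranges = []
    if page > 1:
        ranges.append((1, page - 1))
    if page < page_count:
        ranges.append((page + 1, page_count))

    # Dropping the first or last page needs a single run and no merge.
    if len(ranges) == 1:
        first, last = ranges[0]
        return [base + [f"-sOutputFile={dst}",
                        f"-dFirstPage={first}", f"-dLastPage={last}", src]]

    commands, parts = [], []
    for i, (first, last) in enumerate(ranges, start=1):
        part = f"{dst}.part{i}.pdf"  # hypothetical temporary file name
        parts.append(part)
        commands.append(base + [f"-sOutputFile={part}",
                                f"-dFirstPage={first}", f"-dLastPage={last}", src])

    # Third run: feed both partial files back in to produce the final PDF.
    commands.append(base + [f"-sOutputFile={dst}"] + parts)
    return commands
```

Each returned list could be handed to `subprocess.run(cmd, check=True)` in order. As kens notes, the result is a newly written PDF, not an edited copy of the original.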
ManDay pdftk also wants me to recompile my GCC with "gcj" (which I've never heard of).09:05.55 
kens Possibly something Java related, new to me too though09:06.13 
ManDay kens: Worked fine at a first look09:09.39 
kens pdftk you mean ?09:09.50 
ManDay no, gs09:09.56 
kens Oh, well if it works that's good, but bear in mind it's not the same file you started with, fonts might be subset and differently encoded, metadata will be different, some non-printing stuff might go missing etc.09:10.49 
ManDay I guess GS will only make it better :P09:11.20 
  More stable and compatible, that is09:11.34 
kens ROFL09:11.35 
  I hope so09:11.41 
ManDay I'm serious.09:11.44 
  I mean to read the PDF on my Ebook, which doesn't really cope well with strange stuff09:11.56 
kens In general it will be more compatible, and smaller, but just sometimes it might not be, it's a risk you take on every conversion, as I'm sure you are aware.09:12.16 
ManDay Sure, it was just for this time09:13.16 
  In general I can't stand PDF09:13.37 
kens PDF used to be OK, back around version 1.309:13.52 
ManDay I have a problem with it's nature itself, not with a particular design. The mangling of content and markup is a bad habit, imho.09:15.27 
  s/it's/its09:15.45 
kens Depends where you are sitting, it's not intended to be editable.09:15.52 
ManDay It's not intended to be machine parseable, either. Which kind of implies it's an unsuitable format for digital media09:16.23 
kens Damn, my mail client just folded up and died09:16.28 
  Don't know what you mean by machine parseable, it's easy enough09:16.57 
  insert an appropriate 'not' in there09:17.22 
  IMO it's no worse than say an XPS file09:17.39 
ManDay kens: You have no way to extract semantics and content from a PDF. It's practically a vector-image with a minimal amount of semantics layered on top (namely flow).09:17.41 
  Compare that to HTML, which is what I use09:17.49 
kens Flow is most definitely not embedded in the document, that's one of the problems.09:18.05 
  But no, it's not meant to be 'understood' by a machine.09:18.15 
ManDay kens: I mean Flow as in "which word comes after which"09:18.24 
kens It's intended to look pretty much the same on any device.09:18.28 
  ManDay so did I09:18.36 
ManDay kens: Well, you 09:18.47 
  kens: Well, you *can* select text in a reasonable fashion09:18.58 
kens In PDF I can scatter words all over the page, but the reading flow bears no relation to the order laid down09:19.07 
ManDay kens: Exactly. As I keep telling people: PDF is nothing but a fancy name for a picture09:19.11 
kens The text selection relies on 'proximity'09:19.17 
ManDay It's not a format for *Documents*09:19.19 
kens No, it's not a document format09:19.26 
ManDay kens: Ok, learned another thing. That makes it even worse09:19.34 
kens It's a viewing format which has been horribly abused09:19.36 
ManDay So PDF doesn't even have THAT kind of info embedded09:19.43 
kens Nope.09:19.48 
ManDay kens: I'm happy we agree09:19.57 
kens If you are using Latin languages, then it often works out the way you expect, but you can't rely on it09:20.09 
ManDay Interesting09:20.29 
  my network really sucks09:22.48 
kens Mine varies, depending on the weather09:23.00 
Robin_Watts paulgardiner, tor8: 2 commits on robin/reflow that follow on from Paul's.10:49.21 
  Greatly improve the paragraph finding and line breaking.10:49.48 
  The former is an n^2 process though at worst case, so I'm open to the idea of recoding. It seems to work well though.10:50.25 
tor8 Robin_Watts: how about our release? maybe we should cut that before we start making all these text extraction modifications.10:50.29 
  bump the version numbers to 1.2 and tag what we have on origin/master then make the binaries when we're less stressed out10:51.31 
Robin_Watts tor8: Possibly, yes.10:51.38 
  I've not yet pushed all Paul's latest changes.10:52.04 
  I am planning to go for a run, then to run through them all one last time before pushing.10:52.30 
  I think it'd be nice to have reflow in the release though, personally.10:52.43 
tor8 Robin_Watts: yeah, but we probably want it in good shape11:05.18 
  Robin_Watts: perhaps a double-linked list for spans/lines/blocks to make splitting easier?11:06.27 
Robin_Watts tor8: Yes. Reflow on android seems stable though. And we don't take a hit on text extraction unless we call fz_text_analysis.11:06.51 
  tor8: yes.11:06.58 
  I wonder if we should collect lines, and then collect into blocks later?11:07.08 
  But that kind of thing is a much bigger change, and shouldn't be done pre-release.11:07.22 
tor8 Robin_Watts: so just collect into a random jumble of lines, tack on the span to the closest existing line or open a new line11:07.55 
Robin_Watts yeah.11:08.03 
tor8 then later sort into columns and paragraph blocks in an analysis stage11:08.07 
Robin_Watts yup.11:08.15 
tor8 sounds good11:08.18 
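The two-stage idea discussed above (first collect spans loosely into lines, then sort things out in a later analysis stage) can be sketched as follows. This is an illustrative sketch only, not MuPDF's text extraction code: the `Span` and `Line` classes, the `tolerance` value, and the assumption that `y` is a baseline coordinate increasing down the page are all hypothetical. The analysis stage here only orders lines and spans; finding columns and paragraph blocks is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    x: float   # left edge of the span
    y: float   # baseline position (assumed to grow down the page)
    text: str


@dataclass
class Line:
    y: float
    spans: list = field(default_factory=list)


def collect_lines(spans, tolerance=2.0):
    """Stage 1: tack each span onto the closest existing line, or open a new one."""
    lines = []
    for span in spans:
        closest = min(lines, key=lambda ln: abs(ln.y - span.y), default=None)
        if closest is not None and abs(closest.y - span.y) <= tolerance:
            closest.spans.append(span)
        else:
            lines.append(Line(y=span.y, spans=[span]))
    return lines


def analyse(lines):
    """Stage 2 (simplified): order lines top to bottom and spans left to right."""
    ordered = sorted(lines, key=lambda ln: ln.y)
    for ln in ordered:
        ln.spans.sort(key=lambda s: s.x)
    return ordered
```

Keeping stage 1 deliberately dumb means the input order of spans does not matter; all ordering decisions live in the analysis stage, where they can be changed without touching collection.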
  and linked list structs for ease of manipulation11:08.28 
  I've found myself having to look at splitting and reordering spans for the RTL pass11:08.44 
  zeniko's patch fails to reorder RTL across spans11:08.54 
Robin_Watts I might be tempted to push all of Paul's reviews into the release, plus the second of my two (that tweaks char -> span collection).11:09.06 
  And leave my paragraph stuff out.11:09.16 
tor8 Robin_Watts: another simplification could be to let the lines hold the chars, and the spans just point out ranges in the line11:09.28 
  or bloat the char struct with one pointer to style11:09.48 
  both would simplify a lot of things11:10.00 
  and make text search etc faster11:10.06 
  if you just had blocks and lines to iterate through11:10.13 
Robin_Watts All sound reasonable ideas.11:10.15 
  Certainly the splitting of blocks is the hairiest part of my patch.11:11.14 
tor8 I'm okay with that (take the reflow branches up until we start making big changes) for the release11:12.45 
  Robin_Watts: want me to simplify the text extraction structures into linked list for blocks and no span struct (just bloat the char with a pointer?)11:13.33 
  or we could set a maximum number of styles and use a short for the style11:13.46 
Robin_Watts tor8: We should release before we start messing with that stuff.11:17.00 
  but then, yes please!11:17.10 
tor8 Robin_Watts: yes, certainly.11:17.16 
Robin_Watts I personally prefer pointers to styles I think.11:17.33 
tor8 will you prep the release branch with what should go in, and I'll bump the version numbers?11:17.49 
  pointers are easier, for sure11:17.55 
  and it's already got 4 floats and an int in the textchar11:18.04 
  so adding another pointer there won't hurt much11:18.15 
Robin_Watts tor8: We need to agree where we want to draw the line. I'd like to get reflow in, cos I don't believe it will adversely affect stability.11:18.45 
tor8 reflow is fine by me, it only affects the android app11:19.57 
  the pass by reference matrix/rect and simplified text structs should go after the release11:20.33 
  or maybe we should force in the matrix/rect stuff since we've already changed those apis a little bit11:20.59 
  what are your thoughts on that?11:21.06 
  I'll need to do a final pass of reviewing that patch, haven't studied it in its latest incarnation11:21.39 
Robin_Watts tor8: good point.11:21.55 
  If we are changing the API, it would be nice to just change it once, rather than changing it 2 releases in a row.11:22.13 
  I'd be tempted to put the 'by reference' stuff in, along with the bbox -> irect you talked about yesterday.11:22.39 
tor8 Robin_Watts: yeah. my only outstanding gripe is the pre_xxx naming.11:24.27 
  I would much prefer mul_xxx11:24.42 
  (I associate pre with temporal before, not which side of the expression)11:25.12 
Robin_Watts I could live with mul as long as we are clear in the comments that it's premul.11:25.18 
tor8 fab.11:25.28 
Robin_Watts OK, I will make that change and endeavour to get a new version up today.11:25.50 
tor8 Robin_Watts: so s/fz_pre_/fz_mul_/ and s/_bbox/_irect/ and we're good to go on that commit11:26.17 
  Robin_Watts: just to double check here, fz_pre_rotate(m, angle) is equivalent to m = m * fz_rotate(angle)?11:28.55 
Robin_Watts no.11:33.06 
  m = fz_rotate(angle) * m11:33.13 
  hence pre.11:33.18 
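The ordering question above is easy to get wrong, so here is a small self-contained illustration of why `rotate * m` and `m * rotate` differ. It is a sketch under stated assumptions, not MuPDF code: matrices are PDF-style six-number tuples `(a, b, c, d, e, f)`, points are treated as row vectors (so the left-hand matrix of a product is applied first), and the names `concat`, `pre_multiply` and `transform` are made up for this example.

```python
def concat(one, two):
    """Matrix product one * two for PDF-style (a, b, c, d, e, f) matrices.

    With row-vector points this means: apply `one` first, then `two`.
    """
    a1, b1, c1, d1, e1, f1 = one
    a2, b2, c2, d2, e2, f2 = two
    return (a1 * a2 + b1 * c2,
            a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2,
            c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2,
            e1 * b2 + f1 * d2 + f2)


def pre_multiply(m, left):
    """'pre' in the sense used above: the new matrix goes on the left, left * m."""
    return concat(left, m)


def transform(point, m):
    """Apply matrix m to an (x, y) point."""
    x, y = point
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


ROTATE_90 = (0, 1, -1, 0, 0, 0)      # quarter turn, exact integers
TRANSLATE_10 = (1, 0, 0, 1, 10, 0)   # shift x by 10

pre = pre_multiply(TRANSLATE_10, ROTATE_90)   # rotate * translate
post = concat(TRANSLATE_10, ROTATE_90)        # translate * rotate

print(transform((1, 0), pre))    # rotate first, then translate
print(transform((1, 0), post))   # translate first, then rotate
```

Under these conventions the pre-multiplied rotation acts on the point before the existing transform does, which is the opposite of what someone used to column-vector 3D graphics code would expect.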
tor8 right. that's why I don't like the pre name... one: it's not clear which of the two arguments is pre, and two: pre implies rotating "before" the other transforms, not after...11:44.32 
Robin_Watts It's pre multiply, pure and simple.12:03.42 
tor8 well, since I'm so easily confused anyway, I'll leave the choice of name up to you. :)12:05.13 
Robin_Watts The worry I have is that 'mul' says nothing at all about the order.12:06.06 
  Whereas 'pre' at least says something.12:06.24 
  but honestly, if we're going to have to look it up on every use, the name probably doesn't matter :)12:06.51 
tor8 having misunderstood the order (too much 3d graphics hacking, where you multiply stuff the other way...) I'm just going to silently go with whichever you prefer...12:07.25 
Robin_Watts tor8: OK. Everything rebased and ready to go on robin/master14:01.13 
  except for the bbox -> irect change.14:01.20 
tor8 Robin_Watts: want me to tackle that change?14:03.44 
Robin_Watts either you can, or I will in about an hour.14:04.32 
tor8 I'll do it now14:04.39 
  having done it in reverse before I know where to go looking :)14:04.55 
  fix on tor/master now14:15.50 
Robin_Watts tor8: So, I'll check that the android and windows build still work.14:52.51 
  Can you check linux and ios?14:52.56 
tor8 I'll check and fix ios. linux is fine.15:05.19 
Robin_Watts android isn't. fixing now.15:06.05 
henrys Robin_Watts: since you are talking with these testers, can you ask them what program they are using to fuzz PDF and PostScript?15:32.35 
Robin_Watts henrys: who? The guys from customer 395 ?15:40.48 
  Apparently they use "Address Sanitizer" which is akin to valgrind, but faster.15:41.35 
  but I don't know if that's fuzzing.15:41.52 
henrys thanks that's right I remember now15:42.10 
  they are doing postscript - lucky us ;-^15:42.37 
kens It will be interesting to see what they come up with15:43.25 
Robin_Watts I have replied.15:52.06 
henrys I do think fuzzing would be easy and enlightening also - have you guys read http://fuzzinginfo.files.wordpress.com/2012/05/cmiller-csw-2010.pdf 16:03.34 
kens I think I've seen it somewhere yes16:06.48 
Robin_Watts henrys: Page 12. That's what we're missing from our cluster set.16:09.34 
henrys the idea of reducing the set by coverage was discussed it scared me though. Line coverage is not program coverage.16:12.32 
  but maybe I'm wrong about that16:13.00 
Robin_Watts We have the data so that we can reduce our set so that historically we would still have found all the differences.16:13.48 
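One way to read the reduction Robin describes is as a set-cover problem: keep the smallest group of test files that would still have flagged every difference seen historically. The sketch below uses the standard greedy approximation. It is not the cluster's actual tooling: the function name `reduce_corpus`, the shape of the `detections` mapping, and the file names in the usage note are assumptions for illustration.

```python
def reduce_corpus(detections):
    """Greedy set cover over historical results.

    detections maps a test file name to the set of historical differences
    (for example commit ids) that the file exposed. Returns a list of files
    that together still expose every difference in the input.
    """
    remaining = set().union(*detections.values())
    chosen = []
    while remaining:
        # Pick the file that exposes the most not-yet-covered differences;
        # sorting the names first makes tie-breaking deterministic.
        best = max(sorted(detections),
                   key=lambda name: len(detections[name] & remaining))
        chosen.append(best)
        remaining -= detections[best]
    return chosen
```

For example, with `{"a.pdf": {1, 2, 3}, "b.pdf": {3, 4}, "c.pdf": {4}, "d.pdf": {2}}` the greedy pass keeps `a.pdf` and `b.pdf`. Greedy selection is not guaranteed to be minimal, and henrys's caveat still applies: covering past differences says nothing certain about future ones.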
henrys something for marcosw to mull over16:17.26 
  but that is just crashes; you have to figure defects are order(s) of magnitude higher - wow, software sucks.16:23.30 
Robin_Watts tor8, paulgardiner: Looks like I broke fz_buffer_printf somehow in the fz_output commit.16:36.05 
  Can't see how at the moment. If either of you can spare a mo to sanity check it, I'd be grateful.16:36.24 
paulgardiner ok16:36.39 
Robin_Watts It's causing crashes in the cluster.16:37.39 
paulgardiner Could there be a call with fmt == "", where it would loop growing the buffer indefinitely?16:48.44 
  I'm thinking maybe len > 0 should be len >= 016:49.20 
  Or more likely fz_buffer_printf(buf, "%s", "");16:54.09 
  That would do it16:54.16 
Robin_Watts would it?16:54.35 
  We'd get infinite loops growing the buffer.16:55.01 
  I'm testing a fix now for those cases.16:55.14 
paulgardiner And you could change len+1 <= slack to len < slack, maybe.16:58.25 
kens OK off to eat, goodnight all17:02.31 
Robin_Watts marcosw: cluster knackered?17:25.20 
henrys yes we have a long queue now17:31.01 
Robin_Watts henrys: The length of the queue is not an issue; normally the mupdf jobs run in 10 mins or so.17:53.07 
  The worry is the obscene length of time they now spend sitting in the 100% state. It's probably a stupid mistake I've made.17:53.48 
ray_laptop These Advanced TIFF Editor guys seem to be real pirates. They say they use Ghostscript and actually gs 9.06 is installed during their installation and it doesn't ever mention GPL18:38.57 
henrys Gar!18:40.24 
mvrhel_laptop They are in Vancouver BC so not too hard to reach18:42.00 
ray_laptop They also (apparently) include ImageMagick -- by looking at the directory where they install there is an imagemagick-license.txt18:42.06 
  and an imagemagick.dll18:42.17 
  have to run an errand. bbiab18:43.00 
Robin_Watts ray_laptop: So, they install gs 'silently' as part of their installation.18:46.49 
  but it's not hardwired in, right? I mean, you can uninstall it and install a new version?18:47.03 
marcosw Robin_Watts: The 100% state just means that the jobs have all been sent to the cluster nodes. Unfortunately some of the mupdf hubs take a long time to run. This is true of Ghostscript cluster runs as well, but we send those jobs at the beginning of the cluster run so they finish by the time all the jobs have been sent.18:52.44 
Robin_Watts marcosw: Right, but after the jobs have all finished, it was sitting there for about 10 minutes.18:53.07 
marcosw Robin_Watts: possibly it was waiting for one of the nodes to send the 'done' signal? There isn't anything the cluster should be doing after the nodes are done (other than reading and archiving the logs and sending an email, but that only takes 15 seconds).18:54.42 
Robin_Watts marcosw: If it had happened just once I'd have written it off as a temporary glitch like that.18:55.14 
  but it was (is?) taking a long time after EVERY job.18:55.26 
marcosw it didn't this time. I'll look at the logs...18:55.46 
Robin_Watts Aha!18:59.04 
  I hate var_args.18:59.21 
  tor8: hi19:07.51 
  I found the cock up in the "fz_output" commit that has caused cluster failures.19:08.14 
tor8 ah.19:08.29 
Robin_Watts Do I push a fix, or do I rewrite history so that the problem was never there.19:08.31 
marcosw Robin_Watts: All of the mupdf jobs took 6 or 7 minutes to run, except for 49b4eedf24212cac7ae0b94b855b219e4f1ae86f and the one right after it, which was your mupdf clusterpush.pl19:09.01 
  In the case of your clusterpush the problem was the node 'macpro'19:09.15 
Robin_Watts I guess we don't really care that the cluster history shows failures, given that those were all pretty much android only changes that shouldn't have made any differences.19:09.41 
  marcosw: OK, so it was just that the cluster went wrong at the exact time I was watching it.19:09.56 
tor8 considering how many commits have gone in and how long they've been up I'd be a bit reluctant to rewrite19:10.01 
Robin_Watts I will endeavour not to look again.19:10.07 
marcosw it took between 3 and 8 minutes between connections. Just often enough that timeouts didn't happen.19:10.20 
tor8 Robin_Watts: but if you feel strongly I shall not oppose19:10.49 
Robin_Watts tor8: testing a fix now. commits on robin/master for your perusal.19:10.59 
tor8 having bisectable history trumps most things19:11.02 
Robin_Watts I'm happy to leave it unrewritten I think.19:11.17 
marcosw also your job had to restart because of a problem with the 'henrysx6' node. Presumably there was an internet issue with henrys' machines.19:12.07 
Robin_Watts tor8: Better version online.19:25.50 
ray_work Robin_Watts: AIUI the GPL restriction is w.r.t. "distribution" (which they do with NON-GPL software that invokes it silently). Whether or not one can upgrade the GPL Ghostscript isn't entirely the issue, but it is the goal of GNU that open source software in apps be able to be replaced by users.20:09.28 
marcosw Robin_Watts, et al.: I've reduced the cluster timeouts for mupdf jobs from 20 minutes to 10 minutes (which is what it is for Ghostscript). There were cases where one node would timeout, after 20 minutes its jobs would be distributed to the remaining nodes, one of which would then timeout, rinse, repeat.20:11.16 
ray_work Robin_Watts: and also, AIUI, Artifex interprets the GPL w.r.t. linking to INCLUDE invisibly invoking a GPL program even if it is in a process and not just a DLL, so the viral nature of GPL would apply20:11.28 
henrys ray_work:miles tries that hoping the customer doesn't understand the GPL but it's very hard to read the GPL to include external processes - I don't think that would ever hold if things got legal20:14.55 
Robin_Watts ray_work: I agree with henrys.20:15.59 
  There is nothing in the GNU GPL that says that GPL software can't be silently installed.20:16.21 
  As long as they supply a copy of the LICENSE file, and people can install newer versions, they are fine.20:17.28 
  As long as they call into gs by invoking it from the command line (i.e. kicking off a new process), I believe they are safe. Miles is welcome to try all the FUD he wants, of course :)20:18.46 
ray_work Robin_Watts: they don't provide ANYTHING w.r.t. Ghostscript's GPL status or LICENSE file, or how to install/upgrade Ghostscript (OR imagemagick) Note that imagemagick _is_ a DLL, not an EXE20:21.34 
  Robin_Watts: they don't invoke gs from a command line. They _may_ be using the gsdll32 directly, or they may be invoking the gswin32c using a "system()" call (or exec or fork)20:22.58 
Robin_Watts ray_work: Is gs being installed as normal into a separate dir? Or is it just a bare binary in with their stuff?20:23.43 
ray_work and if gs encounters any problem during opening a file, no useful error messages are output -- it just shows the page as a big "?"20:23.47 
Robin_Watts ray_work: Crap integration doesn't preclude being OK with the GPL :)20:24.08 
ray_work Robin_Watts: gs is being installed in a non-standard place20:24.13 
Robin_Watts without a license file?20:24.22 
  Then it's against the GPL.20:24.26 
ray_work Robin_Watts: correct20:24.28 
  oh, and when you uninstall their software it doesn't invoke the gs "uninstall" so it leaves it laying around (not any kind of violation, just poor Windows practice)20:27.08 
Robin_Watts ray_work: So their sole GPL infraction that I can see is that they don't provide the LICENSE file.20:28.42 
  You can argue that it's polite to properly install/uninstall etc, and to have a clear way of invoking it at arms length, but the only REAL infraction is that they don't provide a license.20:29.29 
  So Miles/Scott can genuinely tell them they are in breach, but it wouldn't take much for them to comply.20:30.08 
ray_work Robin_Watts: and identify Ghostscript as a GPL component. But that's if you don't accept that invocation as a process that makes it appear to be an integral part of the app is equivalent to linking.20:34.54 
  They appear to violate Imagemagick's (not GPL) requirements as well in that it does not include a copy of the license (just a file with a link to it, which can break) and they don't provide the "clear attribution to ImageMagick Studio LLC"20:38.03 
Robin_Watts ray_work: Right.20:40.40 
ray_work One thing about ImageMagick's license file is that it claims "the license is compatible with the GPL V3", but GPL requires that source code (including any modifications) be available, and the ImageMagick license specifically EXCLUDES that20:46.06 
  The Graphic Region folks don't even mention Ghostscript or ImageMagick on their "About" screen (where some people bury the info)20:47.47 
  FWIW, I ran the x_-_renders_slowly.pdf using the Adv TIFF editor and it _is_ invoking gswin32c as a process. Still, unless you know where they hide the version of gs they use, you can't update it, and it looks like a "seamless" part of their app (same as if just the DLL was used).20:53.17 
sk8rjess hey guys I have a form inside pdf1. pdf1 is being combined with pdf2 but when this happens the signature form is erased. how do I keep the form?22:00.58 
  I'm starting to think it's not even possible. I tried using -dDOPDFMARKS as well but it didn't change anything22:06.08 
alexcher marcosw: cups device appears to be broken on the 'angstroms' node. Please check.23:58.16 
marcosw alexcher: thanks for the heads up. I'll take a look23:58.37 
alexcher marcosw: see my last commit.23:59.02 