IRC Logs

Log of #ghostscript at irc.freenode.net, 2016/04/07.

tor8 Robin_Watts: so... looking at adding in the svg parser11:40.00 
Robin_Watts tor8: cool.11:40.17 
tor8 I've got it running as a fz_document thing11:40.24 
  but it would be nice to be able to hook it into a fz_image11:40.33 
  so that epub can just use it as any other image11:40.46 
Robin_Watts tor8: Ah. Interesting.11:40.47 
  It would be nice to do that generically.11:41.01 
  fz_images are decoded synchronously.11:41.58 
tor8 question is ... do it for svg specifically or handle it via fz_document11:42.01 
Robin_Watts Just pondering what locks are taken while they are decoded...11:42.16 
  do it via fz_document if at all possible.11:42.34 
  Feel free to leave that for me, if you want.11:42.59 
tor8 I vaguely recall seeing some epubs where the xhtml has svg embedded, not as a separate file11:43.12 
  but it should be easy enough to instantiate an svg document from a fz_xml thing11:43.38 
Robin_Watts fz_document can be instantiated from an fz_stream, right?11:44.26 
tor8 yes.11:44.36 
Robin_Watts So that should work out OK.11:44.50 
tor8 but it should be trivial to add a special 'init an svg fz_document from this fz_xml tree' for use in epub11:44.53 
  when we run into that problem11:45.04 
Robin_Watts Oh, you mean, you want to run it post parsing rather than from the underlying stream.11:45.38 
  I was thinking that we'd open a stream on the subsection of the file in question.11:45.57 
tor8 it might be an 'inline' svg in the html document11:46.10 
Robin_Watts tor8: yeah, so we find the byte range of the html document and make an fz_stream from that.11:46.34 
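
A minimal sketch of the byte-range approach Robin describes, assuming the in-progress SVG handler plugs into the normal document machinery: wrap the located bytes in a memory stream and open them as a document. The helper name and the "image/svg+xml" magic string are assumptions, and the MuPDF calls are written from memory of the 1.x-era API, so treat the exact signatures as approximate.

    #include "mupdf/fitz.h"

    /* Hypothetical helper: open an SVG document from a byte range that has
     * already been located inside the epub. */
    static fz_document *
    open_svg_from_bytes(fz_context *ctx, unsigned char *data, size_t len)
    {
        fz_stream *stm = fz_open_memory(ctx, data, len);
        fz_document *doc = NULL;

        fz_try(ctx)
            doc = fz_open_document_with_stream(ctx, "image/svg+xml", stm);
        fz_always(ctx)
            fz_drop_stream(ctx, stm); /* we retain ownership of the stream */
        fz_catch(ctx)
            fz_rethrow(ctx);

        return doc;
    }

As tor8 points out next, by the time the epub layout code meets an inline SVG it only has the parsed fz_xml tree, not a byte range, which is what makes a "from fz_xml" constructor attractive.
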
tor8 Robin_Watts: wasteful re-parsing of xml; and we have long since lost the byte range by that time.11:47.03 
Robin_Watts tor8: yeah.11:47.13 
tor8 there's another gotcha we might need to look out for11:47.27 
Robin_Watts I dislike special cases though.11:47.32 
tor8 if an svg document refers to external resources11:47.35 
  like png images, etc11:47.43 
  those need to be looked for in the parent context (i.e. the epub zip file)11:47.57 
  so I fear there are going to be quite a few special cases here :(11:48.08 
  One drawback of handling svg images as fz_image ... the pdfwrite document will write them as rasters not as vectors.11:49.13 
Robin_Watts That one doesn't bother me so much, cos having an open_document_in_this_context (or something, where the 'context' gives access to resources) doesn't feel like a massive upheaval.11:49.16 
  pdfwrite is capable of asking "what format is the underlying image" and reacting appropriately.11:49.53 
  The fallback position is to write them as rasters.11:50.01 
tor8 Right. So an svg image could then have a case to write an XObject form thing.11:50.20 
Robin_Watts (actually to write them as flate compressed lossless things).11:50.21 
  We have code that spots JPEGs and writes them unchanged.11:50.40 
  So we could extend that to spot other types too.11:50.57 
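
The "spot the format" logic Robin mentions amounts to sniffing the leading bytes of the undecoded buffer. A minimal, self-contained sketch of such a check with a hypothetical vector case bolted on (this is not MuPDF's actual sniffing code, and the enum names are made up for illustration):

    #include <stddef.h>

    enum img_kind { IMG_UNKNOWN, IMG_JPEG, IMG_PNG, IMG_VECTOR };

    /* Guess an image's type from its first few bytes. */
    static enum img_kind
    sniff_image(const unsigned char *buf, size_t len)
    {
        if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xD8)
            return IMG_JPEG;    /* JPEG SOI marker */
        if (len >= 4 && buf[0] == 0x89 && buf[1] == 'P' && buf[2] == 'N' && buf[3] == 'G')
            return IMG_PNG;     /* PNG signature */
        if (len >= 1 && buf[0] == '<')
            return IMG_VECTOR;  /* XML-ish text: treat as SVG */
        return IMG_UNKNOWN;
    }

A writer that already copies JPEG data through unchanged could branch on the extra case and, as tor8 suggests above, emit the vector content as a Form XObject instead of rasterising it.
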
tor8 one alternative is to do FZ_IMAGE_VECTOR with a fz_display_list11:51.00 
Robin_Watts tor8: Yes. That sounds nice in fact.11:51.15 
tor8 rather than go via fz_document11:51.22 
Robin_Watts Well, going via document and having an fz_image_vector are not necessarily mutually exclusive.11:52.15 
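
A hedged sketch of the FZ_IMAGE_VECTOR idea floated here: an image that keeps a recorded display list and rasterises it lazily at whatever size the caller asks for, instead of storing decoded pixels. The types below are illustrative stand-ins rather than the real fz_* structures (the hook did not exist at the time of this conversation); in MuPDF terms replay_list would be fz_run_display_list through a draw device.

    #include <stdlib.h>
    #include <string.h>

    /* Stand-in for fz_display_list: an opaque recording of drawing commands. */
    typedef struct display_list display_list;

    /* Stand-in for fz_pixmap: a plain RGBA buffer. */
    typedef struct
    {
        int w, h;
        unsigned char *samples; /* w * h * 4 bytes */
    } pixmap;

    /* The vector image: an intrinsic size (e.g. the SVG viewport) plus the
     * recorded contents, with no decoded pixels held at all. */
    typedef struct
    {
        int w, h;
        display_list *list;
    } vector_image;

    /* Stub standing in for "run the list through a draw device scaled by
     * (sx, sy)"; a real implementation would walk the recorded commands.
     * Here it just clears to opaque white so the sketch compiles. */
    static void
    replay_list(display_list *list, pixmap *dest, float sx, float sy)
    {
        (void)list; (void)sx; (void)sy;
        memset(dest->samples, 0xff, (size_t)dest->w * dest->h * 4);
    }

    /* Equivalent of an fz_image "get pixmap" hook: rather than decoding
     * stored samples, replay the list at the requested output size. */
    static pixmap *
    vector_image_get_pixmap(vector_image *img, int out_w, int out_h)
    {
        pixmap *pix = malloc(sizeof *pix);
        if (!pix)
            return NULL;
        pix->w = out_w;
        pix->h = out_h;
        pix->samples = calloc((size_t)out_w * out_h, 4);
        if (!pix->samples)
        {
            free(pix);
            return NULL;
        }
        replay_list(img->list, pix, (float)out_w / img->w, (float)out_h / img->h);
        return pix;
    }

The attraction over decoding up front is exactly the point made above: a pdf writer could keep the list and emit vectors, while the draw path only rasterises at the resolution it actually needs.
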
sh4rm4^bnc i wanted to update jbig2dec to 0.12... and using my google-fu i was finally able to find the tarball13:24.00 
  however it's incomplete13:24.11 
  lacks install-sh, depcomp, config.sub, config.guess and probably others13:25.00 
  is there a chance a fixed tarball gets uploaded ?13:26.51 
kens Can't you just use the Git repository ? (I believe it's in a Git repository these days)13:27.30 
  http://git.ghostscript.com/?p=jbig2dec.git;a=summary13:29.05 
  The 0.12 release is tagged there 15 months back13:29.22 
sh4rm4^bnc nope, my distro works with release tarballs only13:29.40 
Robin_Watts Also missing those files though.13:29.43 
kens Hmm well Git and jbig2dec are not my areas of competence :)13:30.07 
Robin_Watts I think you're expected to use ./autogen.sh13:30.08 
kens That would seem likely13:30.16 
sh4rm4^bnc no, because that pulls in unwanted build deps13:30.29 
Robin_Watts sh4rm4^bnc: Such as ?13:30.42 
sh4rm4^bnc autoconf, automake, libtool, m4...13:30.43 
Robin_Watts Right.13:30.47 
kens Well yes....13:30.52 
Robin_Watts the build process for the code as *we* supply it is to use autogen.sh13:31.07 
kens If you don't want to use autogen I suspect you are on your own13:31.13 
sh4rm4^bnc i wrote about this here https://github.com/sabotage-linux/sabotage/wiki/Why-github-downloads-suck13:31.33 
  see "autoconf-dilemma"13:31.46 
Robin_Watts If you want to avoid those dependencies (which you may be able to do because you know about your distro), then it's up to you.13:31.50 
kens The fact that you don't like autogen doesn't really signify. That's the way we build it and the way we expect you to build it. Of course, if you don't like it you don't have to use it, but we aren't going to build it the way you want us to; it's not how we work.13:33.06 
Robin_Watts sh4rm4^bnc: I believe that as part of the release process, chrisl generates configure scripts from autogen.sh (for ghostscript at least).13:33.26 
  He may not do that for jbig2dec.13:33.37 
kens He might, but I sort of doubt it13:34.03 
  It might also fall out as part of doing Ghostscript, I don't know enough and he's not here13:34.27 
sh4rm4^bnc tl;dr: using autogen.sh is a PITA13:41.12 
  it depends on you having the right versions of everything13:41.30 
  including m4 macros for deps13:41.37 
kens You won't find any arguments here, our build maintainer (chrisl) doesn't like it either, but we agreed some time back to use autogen so at present complaining about it won't make any difference.13:42.21 
  I strongly suggest you come back when chrisl is online13:42.41 
sh4rm4^bnc ok13:42.48 
Robin_Watts kens: cluster seems unhappy.13:56.57 
kens I was thinking that13:57.05 
Robin_Watts I'm going to kill my job to give yours a chance to run.13:57.11 
kens Hmm, OK13:57.18 
  The last one I ran was OK (I screwed up my code and got thousands of errors, but that's my fault)13:57.51 
  I hope I haven't filled up a scratch volume or something13:58.06 
Robin_Watts The non bmpcmp version that I just ran was fine.13:58.08 
kens Oh, well I guess that's encouraging, but I'll need a bmpcmp too if this works OK :-(13:58.50 
sh4rm4^bnc is there any important security-related bugfix in 0.12 ?13:58.57 
kens I guess I can try it and see what happens13:58.59 
sh4rm4^bnc or is it safe to continue using 0.11 until the configure script is fixed ?13:59.17 
kens sh4rm4^bnc : It was 15 months ago, I can't recall what the changes were, but if you look at our Gitweb interface you can see them all13:59.50 
  However a quick scan would suggest the answer is yes14:00.41 
sh4rm4^bnc well CVEs are usually easily remembered14:00.47 
kens I don't recall ever seeing a CVE for jbig2dec, which doesn't mean there aren't important security fixes14:01.07 
sh4rm4^bnc i see14:01.21 
kens Eg http://git.ghostscript.com/?p=jbig2dec.git;a=commit;h=6e1f2259115efac14cd6c7ad9d119b43013a32a114:01.45 
  http://git.ghostscript.com/?p=jbig2dec.git;a=commit;h=4e682afbfcb79ea61b096af38f4fa703274c192d14:02.04 
  I also see numerous segv fixes, prevention of heap overflow (3 of them), etc14:02.41 
  And in fact back in 2013 there were 7 fixes for CERT reported issues14:03.25 
sh4rm4^bnc ouch14:03.26 
kens Err 2012, I can't read dates now it seems14:03.47 
  Robin_Watts : going to try a bmpcmp now.....14:10.03 
  Yeah that looks totally broken. Weird, it was working OK earlier14:15.34 
Robin_Watts I see 106 rsyncs running.14:17.39 
kens O.O14:17.52 
  Oh, it looks like it completed14:18.04 
  But I sent an abort, so I don't know what really happened14:18.15 
et^ Hi! Anyone got a few mins to help me a bit? Trying to print a pdf as an A6 pagesize, but it comes out as A4. :) (Ghostscript.Net)14:19.34 
kens Ghostscript.NET is not, I'm afraid, anything to do with us. Although it does use Ghostscript, it's developed, maintained and supported by someone else (j habjan)14:20.24 
et^ ah, ok :)14:20.40 
kens So it's likely we won't really be able to help you, but I'm willing to listen14:20.43 
chrisl sh4rm4^bnc: Hmm, I should probably have done a jbig2dec release last month - it slipped my mind.....14:23.20 
sh4rm4^bnc chrisl, would you be so kind as to add all the autogen-generated files ?14:25.15 
  it really makes life much nicer14:25.28 
chrisl sh4rm4^bnc: that's one of the fixes I did last year (hence should have done a release this time around)14:26.02 
  sh4rm4^bnc: for some reason I cannot fathom, automake defaults to creating symbolic links to several of its files (rather than copying them), hence the results of the default automake are specific to my system - which, AFAICT, is totally the opposite of the intent of the autotools14:29.28 
kens Interesting article from the MS Build:14:36.46 
  http://www.theregister.co.uk/2016/04/07/microsoft_rethinks_the_windows_application_platform_one_more_time/14:36.46 
  Robin_Watts : my bmpcmp looks like it's completing now14:39.52 
  You might like to retry yours14:40.11 
sh4rm4^bnc chrisl, weird. maybe it's a good idea then to untar the tarball to /tmp/foo or something and check if configure and make work as intended before publishing it14:48.35 
  (not wanting to sound like a smart-ass, but eh)14:48.53 
chrisl sh4rm4^bnc: That wouldn't be enough - I'd have to uninstall the autotools, or have a "fresh" machine - which I will do. I just didn't realise automake was being so stupid.....14:50.20 
sh4rm4^bnc cool thanks14:55.32 
chrisl sh4rm4^bnc: if you check tomorrow about this time, the release should be ready, all being well14:56.06 
sh4rm4^bnc great <314:56.31 
Robin_Watts 132 rsyncs. (Well, actually twice that number, cos rsync calls rsync it seems, but...) that can't be right.15:06.28 
kens It seems excessive15:08.11 
  it does seem to be running though15:09.43 
Robin_Watts marcosw loops around each rsync call 5 times to allow for retrying.15:10.05 
  I wonder if that's going wrong, and it's actually running all of them at once.15:10.23 
marcosw Robin_Watts: problem with the cluster?15:10.33 
Robin_Watts marcosw: When a bmpcmp is run, casper has a massive load of rsyncs run before the jobs get started properly.15:11.07 
marcosw before the cluster jobs are run on the nodes?15:11.43 
Robin_Watts It sits there at 30/1000 jobs, with ~145 ├─sshd─┬─145*[sshd───sshd───tcsh───authprogs───rsync───rsync]15:11.47 
  with those being: rsync --server -logDtpre.iLs . /home/regression/cluster/bmpcmp/.15:12.05 
  marcosw: Yes.15:12.11 
marcosw off hand I don’t know why that would be but i’ll look into it.15:12.39 
jogux robin: I /think/ rsync forks rather than calling itself (bicbw).15:12.48 
Robin_Watts jogux: The line above is from pstree. It shows (or seems to show) 145 instances of sshd calling sshd calling tcsh calling rsync calling rsync15:13.53 
  but I could be reading it wrong.15:13.58 
jogux I think sshd is forking too, I think ps tends to show forks as separate child processes because 'unix'.15:14.28 
chrisl forks *are* separate processes15:15.07 
Robin_Watts jogux: OK, but 145 of them seems excessive.15:15.27 
jogux that part I definitely agree with :-)15:15.37 
Robin_Watts I wonder that 33 nodes * 5 retries each is in the right kind of ballpark.15:15.52 
jogux chrisl: well, true, yes :-)15:17.31 
marcosw but the retries happen sequentially15:18.31 
  and there isn’t anything in the logs on the nodes suggesting that the retries are necessary15:20.42 
jogux could one node be running multiple jobs at the same time, all of which are calling the rsync?15:21.38 
marcosw jogux: that’s true15:23.54 
  but that shouldn’t just occur at the beginning of the run15:24.49 
Robin_Watts jogux: At the point at which I'm seeing the rsyncs, the dashboard reports 30 jobs have been sent.15:24.54 
  hence if that was the case, I'd (naively) expect 30 rsyncs max.15:25.11 
jogux marcosw: probably at the beginning of the run would be the only time they'd all happen at the same time; later ones would be staggered, I would guess, as jobs process at different rates.15:25.30 
  Robin_Watts: Your argument seems sound. Don't know enough about the cluster to counter :)15:26.03 
Robin_Watts jogux: Different nodes build at different speeds.15:26.09 
jogux nods.15:26.17 
Robin_Watts And (AIUI) nodes are triggered by polling the clustermaster, rather than vice versa, so there is an additional "at any time within a 30 second period" factor there too.15:26.55 
jogux Robin_Watts: Hm. Makes it harder, but 33 nodes, that's still going to be an average of over one a second starting an rsync and I would bet the rsync rarely completes in under a second.15:29.10 
kens No idea if it's relevant, but my last bmpcmp came back with "rsync retry 1" 5 times15:29.42 
Robin_Watts yes, but it lessens the difference between the start of the run and the middle of the run, I expect.15:30.10 
jogux Robin_Watts: true.15:30.55 
  it's happening again now15:32.03 
marcosw i’m seeing ~100 rsync jobs running, but that’s after 870 jobs have been sent and all the cluster nodes are running15:32.04 
Robin_Watts dashboard says "40/1000" sent.15:32.28 
marcosw Regression marcos bmpcmp started at 04/07/16 15:27:38 UTC - 1000/1000 sent - 100%15:32.44 
  that’s using the console dashboard15:32.52 
jogux makes it about 150 rsyncs now15:32.58 
marcosw presumably the http dashboard is delayed?15:33.12 
Robin_Watts marcosw: I guess it must be.15:33.22 
marcosw I don’t see any easy way of preventing this. The cluster nodes have to upload the completed bmpcmp output and if they don’t all do it at once it’s going to slow down cluster jobs. 15:34.03 
  I suppose they could gather the output into a .tar.gz file and upload it in chunks.15:34.23 
Robin_Watts marcosw: Can't we upload all the files from a node at once at the end of the run?15:37.59 
  That would keep the number of rsyncs going on casper to the number of nodes.15:38.29 
  Would be slower of course, as we wouldn't start transferring files until they were all done.15:39.02 
  Best of all worlds might be to queue rsyncs from a node, so that no node ever has more than one rsync going at a time.15:39.29 
  So we'd get maximum use of bandwidth still, and not kill casper each time.15:39.56 
marcosw the problem with queueing jobs is that it makes each job no longer independent.15:40.10 
Robin_Watts Possibly that might be as simple as taking/dropping a mutex around the rsync call ?15:40.16 
  In what way not independent ?15:40.29 
marcosw the rsync command is built into the job that is sent to the cluster node.15:41.14 
  the node software just runs a bunch of these jobs in parallel. it doesn’t know that the job contains an rsync.15:41.40 
Robin_Watts marcosw: Can we change it from rsync to my_rsync? And then have my_rsync be a script on each node that takes a lock, calls rsync, then drops a lock ?15:41.59 
marcosw that should work...15:42.36 
jogux marcosw: possibly /home/regression/bin/authprogs could be tweaked to only allow <x> rsyncs at once.15:43.01 
  though Robin's idea works just as well15:43.10 
Robin_Watts jogux: We don't want rsyncs to fail. We want them to block though.15:43.22 
jogux Robin_Watts: Yeah, I mean, sleep if there are more than <x>15:43.36 
  'allow' was the wrong word15:43.49 
Robin_Watts We could have a lockfile/rm pair in the script, and still use vanilla rsync?15:44.02 
  the target of lockfile could be set in an environment variable that could be set locally on each cluster node ?15:44.34 
marcosw Robin_Watts: I like the lockfile idea, but why does it need to be different on each cluster node? can't we just use ./rsync.lock?15:46.03 
Robin_Watts marcosw: We could, yes.15:46.16 
  I was worried that we'd want to use /tmp/blah or something, and /tmp might be different on windows nodes or something.15:46.44 
  or MacOSX nodes.15:46.50 
  but ./rsync.lock sounds fine.15:47.03 
marcosw luckily we don’ thave any windows nodes :-) and I’m pretty sure that /tmp works on mac os x15:47.17 
Robin_Watts as long as we don't go changing directory.15:47.21 
  marcosw: We *could* have windows nodes though. The cluster stuff runs under cygwin. Or did a couple of years ago at least.15:47.42 
marcosw (yeah, it’s just a symlink to /private/tmp)15:47.48 
  cygwin has /tmp15:47.55 
Robin_Watts And with the new windows 10 bash stuff, it might run better when that's out of beta.15:48.18 
marcosw pretty sure every unix program would break if /tmp didn't exist, but in any case we can use the directory that the bmpcmps are being generated in (./temp).15:48.48 
chrisl You could make it consult the environment for which temp directory to use15:49.53 
marcosw you mean TMPDIR?15:50.36 
chrisl Or we could have our own "CLUSTER_TEMPDIR"15:50.56 
Robin_Watts An environment variable? Crazy!15:50.56 
chrisl Better than the registry <sigh>15:51.29 
marcosw the cluster node already has a variable $temp that it uses for a temp directory ( currently set to “./temp”). I will just use that. 15:52.15 
chrisl So, that'll work, then.....15:52.33 
marcosw (at one point I was experimenting with using ramdisks, so $temp was /dev/shm/temp)15:52.54 
  lockfile doesn’t exist on mac os x, so that’s a problem...15:53.36 
Robin_Watts https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/lockfile.1.html15:54.11 
jogux marcosw: flock is a bit more portable15:54.40 
marcosw marcos@macbookpro:[37]% lockfile15:55.02 
jogux lockfile is technically part of procmail iirc. and doesn't exist on my Mac.15:55.03 
marcosw lockfile: Command not found.15:55.03 
Robin_Watts http://stackoverflow.com/questions/10526651/mac-os-x-equivalent-of-linux-flock1-command15:55.42 
marcosw jogux: flock doesn’t seem to exist on mac os x either15:55.50 
jogux marcosw: uh. so it doesn't. ignore me :(15:56.04 
  perl it is then :-S15:56.17 
marcosw perl -MFcntl=:flock -e '$|=1; $f=shift; print("starting\n"); open(FH,$f) || die($!); flock(FH,LOCK_EX); print("got lock\n"); system(join(" ",@ARGV)); print("unlocking\n"); flock(FH,LOCK_UN); ' /tmp/longrunning.sh /tmp/longrunning.sh15:56.21 
Robin_Watts https://github.com/discoteq/flock15:56.26 
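
A sketch in C of the "my_rsync" wrapper Robin proposes: take an exclusive lock, then exec the real rsync, so a node never has more than one upload in flight. The CLUSTER_RSYNC_LOCK variable name is an assumption (the cluster actually settled on its existing $temp directory and the perl flock one-liner above); ./rsync.lock as the default is the path suggested earlier.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/file.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        const char *lockpath = getenv("CLUSTER_RSYNC_LOCK");
        int fd;

        (void)argc;
        if (!lockpath)
            lockpath = "./rsync.lock"; /* per-node lock file */

        fd = open(lockpath, O_RDWR | O_CREAT, 0666);
        if (fd < 0)
        {
            perror("open lock file");
            return 1;
        }

        /* Block until no other rsync from this node is running. */
        if (flock(fd, LOCK_EX) < 0)
        {
            perror("flock");
            return 1;
        }

        /* Replace ourselves with the real rsync.  The descriptor (and with
         * it the flock) survives the exec and is released when rsync exits. */
        argv[0] = "rsync";
        execvp("rsync", argv);
        perror("execvp rsync");
        return 1;
    }

Substituting such a wrapper for rsync in the command sent to each node serialises the uploads per node without the clustermaster needing to know that the job contains an rsync at all.
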
marcosw flock seems to be working. I’ve disabled the macpro node temporarily.16:42.13 
  |-sshd-+-24*[sshd---sshd---tcsh---authprogs---rsync---rsync]16:42.26 
Robin_Watts marcosw: Nice one.16:44.27 
marcosw Robin_Watts: thx, but you found the problem and the solution.16:44.58 
jogux as marcosw suspected, the performance of aws general purpose SSD is pretty poor - I make the guaranteed speed for casper's root approx 8MBps, theoretically burstable up to 11.7MBps (contrast with a modern SSD that should achieve around 500MBps).16:55.32 
  so that would explain why casper feels like it has a spinny disc :-)16:55.44 
Robin_Watts jogux: We don't *generally* do much that's compute intensive on casper.17:30.41 
  Running the git server is probably the most intensive thing.17:31.00 
  The cluster master shouldn't really be compute intensive. The recent problems were due to join going crazy cos I'd given it filenames with spaces in.17:31.31 
  obviously the cluster nodes themselves are compute intensive :)17:32.09 
  But even then, probably not disc intensive.17:32.21 
sebras Robin_Watts: jogux: are you seeing performance issues with casper?17:34.25 
jogux we saw one where bmpcmp rsyncs were maxing out casper's I/O bandwidth several times over, but hopefully Marcos has fixed that.17:35.14 
sebras jogux: right.17:35.42 
  jogux: a simple read of a file gave me 37Mbyte/s which equates to almost 300Mbps. but that was without any processing at all.17:36.42 
jogux Hm. I wonder if I got my IOPS -> MBps calc wrong. afaict, amazon guarantees us 2,000ish IOPS.17:37.28 
  (and should let us burst to 3,0000 IOPS temporarily)17:40.31 
Robin_Watts jogux: Are those "Random 4k IOPS" ?18:34.52 
jogux Robin_Watts: Urm, pass. https://aws.amazon.com/ebs/details/ is the entire extent of my knowledge on the subject :)18:35.38 
Robin_Watts In which case 2000 IOPS is ~8MBPS.18:35.47 
  Do you mean 30,000 or 3,000 burst? :)18:36.04 
jogux Robin_Watts: that page says 'Max IOPS Burst Performance3,000 for volumes <= 1 TB'18:36.23 
  I don't know how sebras got 37Mbyte/s. That would probably imply casper had at least some of the file cached in ram then I guess.18:37.23 
Robin_Watts That page says Max throughput of 160MB/sec, which presumably matches the 10,000 IOPS/volume.18:38.42 
  If we take 30% of that (because of the <= 1TB thing) that's 16*3 = 48Mbyte/sec max.18:39.14 
  which is ~sebras figures.18:39.24 
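
Spelling out the arithmetic behind the two figures, assuming the 4 KiB random-I/O size Robin asks about above and the 160 MB/s per-volume ceiling quoted from the AWS page:

    \[
      2000\ \text{IOPS} \times 4\ \text{KiB} \approx 8\ \text{MB/s},
      \qquad
      0.30 \times 160\ \text{MB/s} = 48\ \text{MB/s}.
    \]
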
jogux Robin_Watts: I think when it says "max" it means "if you had a huge volume such that you had a burstable 10,000"18:43.39 
  I think our max is 3,00018:43.52 
  admittedly this may depend on our I/O size.18:45.09 
  http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html says a bit more18:45.17 
  interestingly running hdparm -t -T I get:18:46.49 
  Timing buffered disk reads: 248 MB in 3.00 seconds = 82.53 MB/sec18:46.50 
  but I never understand exactly what that means :-)18:47.06 
sebras Robin_Watts: jogux: the first times I ran dd on my file I got 37Mbyte/s, and the fastest iteration was 97.8Mbyte/s but by then it was definitely cached.21:56.09 