IRC Logs

Log of #ghostscript at irc.freenode.net.

20150129 
tor8 Robin_Watts: got a minute?11:43.09 
Robin_Watts sure.11:45.38 
  <FX: Runs to fetch caffeine>11:45.48 
tor8 I nuked the embedded contexts in fz_stream and fz_output. the results are nicer than I thought, but I'll need to run some benchmarks to see if it's made a performance impact.11:46.19 
Robin_Watts tor8: So everywhere you pass a stream or an output you now need to pass context too?11:47.00 
tor8 my instinct tells me it should be a wash, passing an extra argument vs doing a pointer dereference to get the context back out11:47.02 
  Robin_Watts: yeah. making the API symmetrical, always take a fz_context no brain activity required11:47.20 
Robin_Watts symmetry is a good argument.11:47.40 
tor8 and hopefully removing the need for the rebind magic that can go wrong so easily11:47.52 
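A minimal sketch of the two calling conventions being compared here, assuming made-up types (`ctx_t`, `stream_embedded`, `stream_plain`) rather than the real fz_context and fz_stream definitions. The first reader fetches the context back out of the stream with a pointer dereference; the second takes it as an extra argument, so the stream holds no context that could need rebinding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real MuPDF types. */
typedef struct { int error_depth; } ctx_t;

/* Old style: the stream remembers the context it was created with. */
typedef struct {
    ctx_t *ctx;              /* embedded context */
    const unsigned char *rp; /* read pointer */
    const unsigned char *wp; /* end of buffered data */
} stream_embedded;

/* New style: no context stored; every call receives one. */
typedef struct {
    const unsigned char *rp;
    const unsigned char *wp;
} stream_plain;

/* Returns the next byte, or -1 at end of data.
 * The context is recovered with a pointer dereference. */
int read_byte_embedded(stream_embedded *stm)
{
    ctx_t *ctx = stm->ctx;
    assert(ctx != NULL);
    (void)ctx;
    return stm->rp < stm->wp ? *stm->rp++ : -1;
}

/* Same operation, but the caller passes the context explicitly. */
int read_byte_explicit(ctx_t *ctx, stream_plain *stm)
{
    assert(ctx != NULL);
    return stm->rp < stm->wp ? *stm->rp++ : -1;
}
```

Either form does the same work per byte; the difference is only where the context pointer comes from, which is why the cost is expected to be roughly a wash.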
Robin_Watts I still *really* want to replace fz_stream with estream.11:48.03 
tor8 what is estream?11:48.14 
Robin_Watts estream is the stream abstraction we use in sot.11:48.26 
tor8 I'm not opposed to rewriting the fz_stream api and merging fz_stream and fz_output somehow11:48.30 
  the entry points in most places should be similar with read/write11:48.51 
Robin_Watts I had not considered merging stream and output.11:49.09 
tor8 should let us filter stuff both on input and output, so we can compress using the same filters11:49.32 
  though our current filters only decompress11:49.38 
  not something I consider hugely important, though11:50.03 
Robin_Watts no.11:50.09 
tor8 but we should be using fz_stream and fz_output for all of our i/o needs11:50.26 
  and also more consistently use fz_buffer for memory buffers11:50.47 
Robin_Watts I think estreams are easier to think about/code, but we have stuff working at the moment, so it's purely a refinement at the moment.11:50.59 
tor8 anyway, what I wanted to discuss is the fz_device interface11:51.12 
Robin_Watts got a patch on line?11:51.13 
tor8 yeah, tor/drop branch11:51.19 
  lots of search-and-replace style commits11:51.43 
  I was hoping to make the device callback functions of the form foo_fill_path(fz_context, void *user, ...other arguments...) rather than fz_device11:52.31 
Robin_Watts That might tie in with something else.11:53.40 
  You have a stylistic thing of doing structs of a fixed size (like fz_device) and then having a void * user pointer.11:54.17 
  My instinct is to have a struct like fz_device, and then have other structs based off that, like: fz_device_foo being { fz_device base; extra fields... }11:55.24 
tor8 you mean to embed the fz_device at the beginning of the user data instead?11:55.31 
  yeah. I've considered that11:55.35 
Robin_Watts It means less mallocing generally.11:55.51 
tor8 needs nasty type casting though, but less mallocing and means passing the same pointer semi-transparently11:56.04 
Robin_Watts I think it's important that in device callbacks we always pass the device.11:56.16 
tor8 if only C had the plan9 C extension where an anonymous struct at the beginning of a struct would type pun11:56.23 
Robin_Watts s/callbacks/functions/11:56.28 
  tor8: yeah.11:56.35 
  You don't need type casting.11:56.57 
  You can just pass &dev->base rather than dev11:57.21 
tor8 somewhere you need to cast back or to fz_device_foo11:57.58 
Robin_Watts and if you're consistent about using 'base', you can #define BASE(x) (&x->base)11:58.08 
tor8 or you mean to let the user see fz_device_foo structs?11:58.21 
Robin_Watts tor8: True.11:58.24 
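A minimal sketch of the embedded-struct pattern under discussion, assuming illustrative names (`device`, `counting_device`, `counting_fill_path`) that are not the real fz_device API. It shows both halves of the exchange: callers can pass `&dev->base` (or `BASE(dev)`) without a cast, while the callback still needs one cast to get back to the derived struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base device; not the real fz_device layout. */
typedef struct device device;
struct device {
    int hints;
    void (*fill_path)(device *dev, int path_id);
};

/* Derived device: the base struct is the first member. */
typedef struct {
    device base;     /* must be first so the cast back is valid */
    int paths_seen;  /* extra field owned by the derived device */
} counting_device;

#define BASE(x) (&(x)->base)

static void counting_fill_path(device *dev, int path_id)
{
    /* The one cast: valid because base is the first member. */
    counting_device *cdev = (counting_device *)dev;
    (void)path_id;
    cdev->paths_seen++;
}

void counting_device_init(counting_device *cdev)
{
    cdev->base.hints = 0;
    cdev->base.fill_path = counting_fill_path;
    cdev->paths_seen = 0;
}

/* Wrapper layer: always hands the device itself to the callback. */
void device_fill_path(device *dev, int path_id)
{
    assert(dev != NULL && dev->fill_path != NULL);
    dev->fill_path(dev, path_id);
}
```

Because the base is the first member, a pointer to the derived struct and a pointer to its base compare equal, which is what makes passing "the same pointer semi-transparently" possible.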
tor8 Robin_Watts: this is what we do for fz_document/pdf_document11:59.09 
Robin_Watts Right. Possibly cos I was the last one to touch that? :)11:59.33 
tor8 but annoyingly they have duplicated fields... both fz_document and pdf_document have a ctx, so there's both doc->ctx and doc->super.ctx :/11:59.33 
  it's newish code and I might have been influenced by you :)11:59.46 
Robin_Watts tor8: yeah that's not great.11:59.52 
tor8 no, but I think we might be able to get rid of the fz_document ctx12:00.09 
Robin_Watts But to return to the device functions...12:00.10 
tor8 but I'll deal with *_document later, devices are my current focus12:00.24 
Robin_Watts I think we *do* need to pass dev into every device function, not just dev->user12:00.47 
tor8 yeah. we need the 'hints' flag in the subdevices12:02.00 
  the error depth thing is done at the wrapper layer that calls the function pointers IIRC12:02.12 
Robin_Watts And if you want to do a device that (say) maps stroked text to text, you want the dev so you can pass on.12:02.42 
tor8 Robin_Watts: true enough12:03.45 
  okay, so we really do need to pass the fz_device along rather than just the user pointer12:04.07 
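A rough sketch of why the callbacks want the device rather than just the user pointer, assuming a made-up `device` struct with `hints` and `user` fields. The filter device below maps stroked text to filled text and forwards to a downstream device; it reads its own `hints` and its `user` pointer from the `dev` it is handed. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with a user pointer; not the real fz_device. */
typedef struct device device;
struct device {
    int hints;
    void *user;
    void (*fill_text)(device *dev, const char *text);
    void (*stroke_text)(device *dev, const char *text);
};

/* A sink device that only counts what it receives. */
typedef struct { int fills; int strokes; } tally;

static void tally_fill_text(device *dev, const char *text)
{
    (void)text;
    ((tally *)dev->user)->fills++;
}

static void tally_stroke_text(device *dev, const char *text)
{
    (void)text;
    ((tally *)dev->user)->strokes++;
}

/* Filter device: user points at the downstream device. */
static void filter_fill_text(device *dev, const char *text)
{
    device *next = dev->user;
    next->hints = dev->hints;     /* needs dev itself, not just dev->user */
    next->fill_text(next, text);
}

static void filter_stroke_text(device *dev, const char *text)
{
    device *next = dev->user;
    next->hints = dev->hints;
    next->fill_text(next, text);  /* stroked text is passed on as filled text */
}

void tally_device_init(device *dev, tally *t)
{
    dev->hints = 0;
    dev->user = t;
    dev->fill_text = tally_fill_text;
    dev->stroke_text = tally_stroke_text;
}

void filter_device_init(device *dev, device *next)
{
    assert(next != NULL);
    dev->hints = 0;
    dev->user = next;
    dev->fill_text = filter_fill_text;
    dev->stroke_text = filter_stroke_text;
}
```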
Robin_Watts I feel vaguely uneasy about passing context everywhere in speed critical places.12:04.14 
tor8 which makes the choice of user pointer or embedded struct moot, the subdevices can do whichever12:04.33 
  Robin_Watts: yeah, but I'm going to measure some benchmarks now and see if it actually has an impact12:04.48 
Robin_Watts tor8: perfect.12:04.56 
tor8 I wouldn't be surprised if it's actually faster for those functions that actually use the context12:05.00 
Robin_Watts the user pointer/embedded struct is an issue.12:05.05 
  cos we currently have fz_new_device (or something like that) that allocates a basic fz_device struct.12:05.36 
tor8 Robin_Watts: right. pass a size_of_extra argument as well?12:06.18 
Robin_Watts We would need an fz_init_device (or an fz_new_device that took sizeof(required struct)) to neatly allow the oo way of working.12:06.21 
  tor8: I think we should standardise on one or the other.12:06.37 
tor8 Robin_Watts: the construction of fz_document is a nasty mess12:06.43 
Robin_Watts but yes, that would be a good start.12:06.47 
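One way the "fz_new_device that took sizeof(required struct)" idea could look, as a sketch only. The function names `new_device_of_size` and `drop_device` and the struct layouts are assumptions for illustration, not existing MuPDF functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base and derived device structs. */
typedef struct { int hints; } device;

typedef struct {
    device base;  /* first member, so device* and big_device* coincide */
    int extra;    /* fields private to the derived device */
} big_device;

/* Allocates a zeroed block large enough for the derived struct.
 * The caller passes sizeof(its own struct). Returns NULL on failure. */
device *new_device_of_size(size_t size)
{
    assert(size >= sizeof(device));
    return calloc(1, size);
}

void drop_device(device *dev)
{
    free(dev);
}
```

A derived device would then be created with a single allocation, for example `(big_device *)new_device_of_size(sizeof(big_device))`, instead of one malloc for the base and another for the user data.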
tor8 so we could consider cleaning both of these up at the same time12:06.59 
Robin_Watts fz_document is slightly weird cos of pdf_no_run maybe.12:07.15 
tor8 also, the pdf_process struct uses another pattern, init/fin on a stack allocated struct12:07.26 
  Robin_Watts: possibly, but I think it just grew organically and mutated in odd ways12:07.41 
Robin_Watts tor8: init/fin may not be a bad thing to follow.12:07.57 
tor8 which is a pattern which we could standardise on, but I'm not sure the gain (one malloc/free pair) is worth the extra API cognitive load12:08.23 
Robin_Watts init/fin are construct and destruct without malloc/free12:08.26 
tor8 Robin_Watts: yeah. I'm not convinced of its benefit in non-performance related code, since we're heavy malloc users everywhere else12:08.55 
Robin_Watts yeah, but where it's needed for performance it's a win.12:10.15 
  init/fin can be thought of as an internal part of new/free.12:10.36 
  and we only expose it in circumstances where it's required.12:10.54 
  it's not a huge change to the way we work.12:11.14 
tor8 no, it's fine for our internal interfaces12:11.29 
  but not something I'd want to expose to the public (since it requires non-opaque data types)12:11.48 
Robin_Watts indeed.12:12.41 
tor8 agh, don't quit the editor when focused on xchat!12:14.32 
paulgardiner init and fin are a pain in some places within sot. Forces to have object plus flag saying whether the object is allocated or not, although I suppose it would be possible to find special values within the struct to mark something as not allocated12:21.33 
Robin_Watts paulgardiner: Yes, they can be a pain, but they can also be powerful.12:22.39 
  SOT has them for a reason.12:22.54 
paulgardiner Not in the case I'm referring to.12:23.09 
  I think the argument was "well pthreads uses it and that's Linux so it must be right"12:23.35 
  If the malloc is the only possible failure then that can be a good reason for init/fin12:24.03 
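A small sketch of the init/fin versus new/free split discussed above, using a hypothetical `buffer` type. Here init/fin construct and destruct in caller-provided storage, and new/free are thin wrappers that add the malloc/free pair. A NULL `data` pointer doubles as the "not allocated" marker, which is one possible answer to the extra-flag complaint.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical non-opaque type; callers can place it on the stack. */
typedef struct {
    char *data;   /* NULL means "not initialised / already finalised" */
    size_t len;
} buffer;

/* init: construct in caller-provided storage. Returns 0 on success, -1 on failure. */
int buffer_init(buffer *b, size_t len)
{
    assert(b != NULL);
    b->data = calloc(len ? len : 1, 1);
    b->len = b->data ? len : 0;
    return b->data ? 0 : -1;
}

/* fin: destruct without freeing the struct itself; safe to call twice. */
void buffer_fin(buffer *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = 0;
}

/* new: init plus one malloc for the struct. */
buffer *buffer_new(size_t len)
{
    buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    if (buffer_init(b, len) != 0) {
        free(b);
        return NULL;
    }
    return b;
}

/* free: fin plus the matching free. */
void buffer_free(buffer *b)
{
    if (b == NULL)
        return;
    buffer_fin(b);
    free(b);
}
```

The stack-allocated form needs the struct definition to be visible to the caller, which is the non-opaque data type cost mentioned in the discussion.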
kens heads for some lunch12:27.41 
tor8 Robin_Watts: inconclusive first benchmark results indicate that passing the ctx everywhere is actually faster13:09.35 
  than fetching it out of a struct13:09.51 
Robin_Watts ok...13:10.01 
tor8 and on other test files it's marginally slower13:11.29 
  pdfref17 is faster with context everywhere (by ~50 ms over the whole document)13:12.54 
  it's in the hard to measure category...13:13.53 
  and about the same diff slower on a more graphically intensive document13:15.54 
  so I'd say not to worry about it for now, the API benefits outweigh any performance differences13:16.13 
Robin_Watts I guess that if it's hard to measure, it's in the 'we don't care' area.13:16.22 
tor8 especially since it seems to be a wash13:16.24 
  some things marginally faster, others marginally slower13:16.38 
  I expect we might be slower now than in the previous case and only for functions that don't use the ctx themselves13:17.15 
  otherwise the time spent passing the extra argument is just spent fetching it back out of the struct13:17.28 
  and those functions are pretty rare13:17.36 
  and any inlined code should be faster, since there's no pointers that have to be fetched13:18.05 
Robin_Watts We use ctx a lot. Either to pass it to other functions, or in try/catch.13:18.31 
tor8 Robin_Watts: have you got an arm platform to test on?13:18.33 
Robin_Watts tor8: I have a beagleboard.13:18.43 
  and a pi.13:18.50 
tor8 could you do a "time mudraw -5 pdfref17.pdf" on those comparing master to tor/drop?13:19.26 
  if you're not busy?13:19.36 
Robin_Watts not immediately.13:19.43 
tor8 it would be good to know if arm makes more difference, intel cpus are so finicky and hard to predic13:20.05 
  t13:20.06 
Robin_Watts it would.13:20.15 
kens Hmm Office on Android now available:13:23.23 
  http://www.theregister.co.uk/2015/01/29/apple_office_android_tablets/13:23.23 
pedro_mac seems to be cloud only though, can’t open docs already on your device13:37.01 
  (or at least not directly from their app)13:37.31 
  and can only save edits to a cloud share13:39.03 
kens No idea, it's clearly limited since you need an Office 365 subscription to create/edit docs13:42.10 
  Makes it a viewer then I guess13:42.24 
pedro_mac it has dropbox & sharepoint support, so you don’t need a subscription but it doesn’t let you save/load to device 13:46.58 
  strange choice13:47.06 
kens The article says you need a subscription to Office 265 to edit or create documents, is that not correct ?13:47.53 
  Office 365*13:48.01 
pedro_mac I have it on my phone and just use dropbox13:48.24 
Robin_Watts Office 265 is the european public sector version where they take more holidays.13:48.32 
kens :-D13:48.39 
  Some of the comments on the web site seem to indicate that indeed you don't need a subscription13:50.25 
pedro_mac it's a massively cut-down editing experience too13:52.43 
  I get the choice of making my text red, yellow or green13:52.57 
kens Well, that's good news for SOT right ?13:53.11 
Robin_Watts The news is not as bad as it might have been.13:53.38 
pedro_mac no font selection either - just size, style and a choice of 3 colours13:54.18 
  probably enough to encourage people to buy the pro version if/when it arrives13:54.45 
  they have 1 million downloads so far though, and a 4 star rating13:55.48 
  for a 1 star feature set - hey, what do you want for nothing?13:56.17 
neves Hi! I'm developing an Android app, which should download a PDF file from a URL. My problem is, I must download not all pages at one time, but separately, every page, so the user shouldn't have to wait until the whole document is downloaded. Is it possible, using MuPDF?13:56.25 
Robin_Watts neves: MuPDF has code in its core to allow documents to be displayed as they are downloaded.13:57.32 
  With a suitable linearized file, MuPDF will therefore show pages as they appear.13:58.07 
  If you have an http fetcher that can do byte range requests, you can even jump ahead in the file, and pages will be preferentially loaded as you look at them.13:58.46 
  BUT... that code is not hooked up for the android version.13:59.02 
  You can probably hook it up yourself if you want if you are a competent C programmer.13:59.41 
  neves: Are you aware of the licensing situation with MuPDF?14:00.08 
neves No, I'm not. Yet14:00.27 
Robin_Watts MuPDF is developed by Artifex (us).14:01.39 
  We release it in 2 ways.14:01.47 
  Firstly, if you are happy to abide by the terms of the GNU GPL, then you can use MuPDF under that license.14:02.07 
  This means (among other things) that you must give away all the source to your application.14:02.28 
  (sorry, that should read GNU AGPL, but the difference is probably moot in this case).14:02.49 
  If the terms of the GNU AGPL are impossible for you to live with, then we can sell you a commercial license that lets you do what you want.14:03.47 
chrisl To clarify, with (A)GPL you don't have to "give away all the source....", you retain the copyright, but the source must be openly available/modifiable and re-distributable under the same license terms14:04.56 
neves Ok, thanks. But since I'm an Android developer it would be very hard to implement changes in the mupdf core, I think..14:09.47 
Robin_Watts neves: The changes required are not in the mupdf core.14:11.52 
  They are in the android specific wrappers around the core.14:12.02 
  but that will require some C/JNI/Java14:12.49 
  Or with a suitable commercial contract, we could possibly do the work for you.14:13.13 
Robin_Watts foods. bbs.14:15.04 
neves Ok, thanks for your help! Sorry, I can't offer you a commercial contract since I'm just a developer14:37.22 
Robin_Watts neves: No worries. If you decide to tackle it, let us know.15:01.01 
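For the byte range requests mentioned in this exchange, the HTTP side is a `Range: bytes=first-last` request header with inclusive offsets. A minimal sketch of building that header follows; `format_range_header` is a hypothetical helper, and how it would be wired into the Android wrappers or an actual HTTP client is not shown and would depend on the fetcher used.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "Range: bytes=first-last" (inclusive offsets)
 * into out. Returns the number of characters written, or -1 if the
 * buffer is too small. */
int format_range_header(char *out, size_t out_size,
                        unsigned long first, unsigned long last)
{
    assert(out != NULL && first <= last);
    int n = snprintf(out, out_size, "Range: bytes=%lu-%lu", first, last);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

A server that supports range requests answers with status 206 and only the requested bytes, which is what lets a viewer jump ahead to the data for the page being looked at.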
henrys kens: the problem I punted to you? Did you see it?15:41.26 
kens Yes, there are several parts to it15:44.24 
  I was going to send round an email for comment, to tech15:44.43 
  But I was hoping to fix an actual limitation exposed by the code first.15:44.58 
  Which I'm getting nowhere with at the moment15:45.06 
  I'll finish up the email and send it for comment15:45.23 
henrys kens: they contacted me again yesterday for a schedule so if it's a big "todo" let's make a bug with your analysis and point them to it.15:51.07 
kens It can be partially solved 'reasonably' quickly, partly on their end, partly on ours. A major portion is not trivial and would be weeks to months of work. I'll finish this email and you can read it.15:51.51 
henrys chrisl: we missed your gs font expertise yesterday.16:00.38 
chrisl henrys: I saw some of the discussion - but didn't read it all in details16:01.19 
henrys chrisl: probably don't need to.16:04.32 
chrisl henrys: what I can say is that we can't use the same trickery for TTFs that we do for UFST/Microtype - we could do something sort of similar, but it would be potentially *much* more complicated16:05.42 
henrys chrisl: don't worry about that... but otf cff is the direction.16:06.37 
chrisl henrys: Okay, I've been doing a little experimenting, although I've been using just CFF, not OTF......16:07.25 
henrys why does adobe acrobat ship with otf instead of cff?16:07.59 
chrisl henrys: I assume for greater compatibility - the Windows font engine can (sort of) handle OTF/CFF, but not bare CFF16:08.41 
henrys chrisl: well we can think about converting what urw is going to deliver but it doesn't look like a big savings over otf.16:10.03 
chrisl henrys: the base URW gs font set (the latest ones we just got) went from 2.4Mb in Type 1 pfb format, to 1.1Mb in "bare" CFF16:10.27 
henrys yes there is about 40 to 50 percent savings from type 1 to cff, but I was talking about the difference between cff and otf with cff outlines16:11.29 
chrisl Yeh, I'm just not sure right now how to poke fontforge's scripting interface to produce OTF/CFF - hence I tried CFF first16:12.19 
henrys tor posted a script to pastebin I've been using.16:12.46 
  well I just took out the cff extension and used otf in his script and it seems to work. But how good is fontforge? I feel like I'm depending on this thing for these numbers and have no clue if it's producing something reasonable. Have you done a cluster push with a cff substitute for a type 1?16:14.34 
chrisl I haven't clusterpushed yet, the CFF fonts don't *quite* work with Ghostscript because of some of the crazy sh*t we do when loading fonts16:15.38 
henrys chrisl: I think I can do a cluster push with the type 1 courier converted to otf with pcl. Be interesting to see the bmpcmp16:16.20 
chrisl henrys: I'm not sure that will work.16:16.56 
kens henrys mail sent to tech, it's a bit lengthy I'm afraid but it's hard to explain this problem quickly. I really would like you at least to read it and consider what (if anything) we should do about this particular issue. Other opinions welcomed by the way (hint hint; chrisl, ray, Robin etc)16:18.05 
Robin_Watts kens: 'form cache' sounds like something that would be done in MuPDF using a display list.16:20.28 
  The idea of having to use clist to do it in gs makes me go cold(er).16:20.46 
kens Robin_Watts : there's lots of ways to do it, GS doesn't do it at all16:20.52 
  It doesn't have to be done for low level devices at all. It's possible that we could store an /Implementation in the form dictionary and have the form code check it, if it finds that, it doesn't execute the form, just sends the Implementation to the high level device (for an example)16:21.42 
Robin_Watts "MD65"16:21.56 
kens Ooops16:22.01 
Robin_Watts 13 times as good as MD5.16:22.09 
kens 13 times slower too ?16:22.17 
Robin_Watts "WHich"16:22.52 
  "/R19as"16:23.21 
kens I was in a hurry writing a lot of this.....16:23.33 
Robin_Watts "WHen"16:23.38 
kens Really I'm more interested in comments about the facts and implications than spelling mistakes16:23.56 
henrys and Shapr is the one I noticed ;-)16:24.04 
Robin_Watts "R18 Do"16:24.19 
  kens: yes, just mentioning stuff as I go.16:24.35 
henrys how is this different than pdf/vt that I'm constantly badgered about at tradeshows is it completely separate?16:24.37 
kens This is PostScript input16:24.48 
  PDF/VT is a way of doing the same task with PDF input (sort of)16:25.06 
henrys right but presumably if we had PDF/VT machinery in the code.... then it would be useful to this problem.16:25.42 
Robin_Watts kens: Reads well to me.16:26.07 
kens He could rewrite his contents as PDF/VT, yes16:26.07 
chrisl kens: I'm assuming that the Implementation key could simply be an integer index, and that would be sufficient for high level devices?16:26.47 
kens He would have to convert the fixed portion to PDF separately (3 pages) then add the variable portion (which is all 'aaa' and similar in his test file) to the 'VT' definition, which I don't recall offhand16:26.47 
  chrisl yeah I was thinking the object number already in the PDF file would be easy16:27.05 
  It would work for ps2write and pdfwrite, the PS front-end doesn't need to know what it is, its very presence implies 'send the associated value direct to the device'16:27.38 
chrisl kens: doesn't that complicate things by pdfwrite having to communicate that back up to the interpreter?16:27.49 
kens chrisl, indeed it does, yes16:27.58 
  I didn't say it would be easy :-(16:28.04 
chrisl I was thinking of just adding it in at the interpreter end....16:28.24 
kens I'm also wondering if we should have a Forms cache for rendering, though its much harder to justify that16:28.25 
  We could certainly add an ID at the interpreter, pdfwrite could use that instead. It would be as complex though, possibly16:29.15 
chrisl In theory, it's practically the same as a Type 1 pattern cache, but.....16:29.17 
kens Much easier if the object number relates directly to the existing stored object16:29.28 
  Forms can be much bigger than (sensible) pattern tiles though16:29.55 
henrys from a marketing perspective if we can call whatever we do for this customer pdf/vt it makes a lot more sense to undertake it, if we can't I'd want to push back.16:30.24 
chrisl That doesn't matter, if the tile is too big, it uses a clist16:30.25 
Robin_Watts clist pattern tiles and a form cache would.... what chrisl said.16:30.34 
kens henrys we absolutely cannot call it PDF/VT since it doesn't involve that at all16:30.44 
henrys push back if we can't find a simple solution.16:30.50 
kens I can solve 'part' of the problem16:31.08 
chrisl kens: a spec_op for the interpreter to say to the device "I have a form, give me a 'something' for the implementation key"?16:31.44 
kens The [/Pattern] colour space should be fixed anyway, it's wrong16:31.49 
Robin_Watts kens: So... pdfmarks cause a problem cos they write Illustrator metadata into the file.16:31.57 
  Does the illustrator metadata differ for each instance?16:32.12 
  What format is illustrator metadata in?16:33.16 
kens2 D'oh bad time for the net to die. I was just saying that we don't know the ID for the form until after it's stored, so it would be best if the interpreter sent a spec_op after the endform saying 'can I put an implementation in here'16:34.02 
Robin_Watts kens: So... pdfmarks cause a problem cos they write Illustrator metadata into the file.16:34.16 
  Does the illustrator metadata differ for each instance?16:34.17 
  What format is illustrator metadata in?16:34.19 
henrys kens: I'm fine with the prose except the spelling stuff. lgtm16:34.38 
kens2 Robin_Watts : the illustrator metadata is the same for each instance of the form, therefore using a form cache would resolve that problem, as well as all the others, including performance16:34.51 
  henrys, spelling corrected already16:34.59 
  Robin_Watts : the Illustrator metadata is XML, but actually it *could* be anything, and there are other kinds of pdfmarks16:35.27 
Robin_Watts ok, let me rephrase the question a bit...16:35.27 
henrys kens: if we did pdf/vt would we be able to use that machinery to solve his problem was my question.16:35.39 
Robin_Watts but pdfmarks write what? arbitrary streams? or an arbitrary pdf object? or multiple objects?16:36.26 
kens2 henrys, yes, but the customer would have to alter their workflow away from PostScript to manufacture the files as PDF/VT. I don't know why they want these files as PDF, but I'm assuming they want them as *real* PDF files, they aren't intending to use the PDF for printing, otherwise they'd be better staying with PostScript16:36.35 
Robin_Watts (I am, as you can probably tell ignorant of what pdfmarks are, other than being "some magic that lets you set some pdf stuff from postscript")16:37.23 
kens2 Robin_Watts : pdfmarks can write pretty much anything. This particular one writes a Properties dictionary which references a stream. The Properties dictionary can contain anything which is valid for a dictionary. So this data could be anything which is valid as 'general' PDF16:37.29 
  It can't, for example, write an xref, or a Pages tree or anything like that16:37.58 
Robin_Watts kens2: pdfmarks are postscript code?16:38.18 
kens2 Yes they are16:38.23 
  But they create PDF objects16:38.29 
chrisl Hmm, disabling pdfmarks during an execform wouldn't be a general solution to the problem :-(16:38.45 
kens2 They are, as you said, a magic way to construct 'stuff' in a PDF file16:38.47 
Robin_Watts So there are specific operators that can be called by pdfmarks that generate pdf objects ?16:38.59 
kens2 pdfmark is the operator, the arguments define what type of object is written (and where)16:39.26 
  chrisl I agree, a form cache is a much better solution16:39.37 
Robin_Watts If I read your email correctly, a form cache would not solve the problem with a change to avoid pdfmarks too ?16:40.15 
kens2 But trying to identify if a random pdfmark matches some random object which we've already written to the file is too much for me to take on16:40.17 
Robin_Watts s/with/without/16:40.24 
chrisl But then implementing a full form cache would be quite a lot of work for really very little real world benefit........16:40.45 
kens2 Robin_Watts : yes it would, because we would not execute the form again, so we wouldn't execute the pdfmarks in the form, and so wouldn't end up with different form content streams16:40.52 
Robin_Watts Ah, I see.16:41.04 
kens2 chrisl a quick one for pdfwrite/ps2write would work well though16:41.10 
Robin_Watts a form cache does sound like a nice solution.16:41.15 
  And if we can leverage the pattern clist code to do it...16:41.28 
kens2 From my POV its the best solution, but I have no real clue how long it would take to write.16:41.40 
chrisl kens2: yes, I was thinking of a full blown cache16:41.44 
Robin_Watts (could we even reuse the pattern cache maybe?)16:41.53 
chrisl Robin_Watts: this is Ghostscript we're talking about......16:42.08 
  ;-)16:42.13 
kens2 : A full-blown cache would take longer than a quick and dirty pdfwrite solution. I guess we could do something with the pattern cache code16:42.15 
  I doubt we could reuse it, maybe take some hints16:42.30 
Robin_Watts What's the lifespan of the pattern cache? per page?16:42.47 
chrisl The lifetime of the color space object16:43.08 
kens2 Hmm, I assumed it was the lifetime of the job16:43.08 
  That makes more sense chrisl16:43.26 
  No point in keeping the pattern bitmap after the colour space goes away16:43.43 
henrys kens2: have you had technical conversations with these folks before, are they going to have any idea what you are saying?16:44.03 
Robin_Watts kens2: Ordinarily I'd be really scared to do anything that involved the clist, but given that michael/ray have already done the pattern clist stuff, I'm guessing that the really nasty decoupling of page/clist has been done already.16:44.09 
kens2 henrys, nope as far as I know I've never spoken to them16:44.19 
chrisl I'm wary about devoting a lot of time to a form cache because forms are rarely used, and almost never used "properly"16:44.34 
kens2 I've no clue if they will understand any of this, one reason I wanted to run it past you16:44.35 
  I just don't have a good idea how long a 'full' form cache would take to implement. I suspect I could do a quick and dirty implementation for the high level devices quite quickly16:45.24 
  Just add a spec_op after the endform to get a number to store in the Implementation. Check the form dict before beginform and if we have an Implementation, send a different spec_op to the device to say 'draw this form'. If it returns an error, go through the full execform for safety's sake16:46.39 
chrisl That would also improve the speed a lot, 'cause you could skip the checksumming16:47.47 
kens2 And also the execution of the form, which I think is where all the time is going16:48.08 
  I'm sure that's how Distiller is getting such performance on this file, if it was running the forms 5000 times it couldn't possibly (and yes, the customer example file is nearly 5,000 pages with 3 different form definitions.....)16:48.54 
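The quick-and-dirty flow kens2 outlines could be modelled roughly as below. This is a toy simulation under stated assumptions, not Ghostscript code: `form_dict`, `hl_device`, `dev_get_form_id`, `dev_draw_form` and `exec_form` are all invented names standing in for the form dictionary's Implementation entry, a high level device such as pdfwrite, and the two spec_ops.

```c
#include <assert.h>
#include <stddef.h>

enum { FORM_NO_IMPLEMENTATION = -1 };

/* Stand-in for the PostScript form dictionary. */
typedef struct {
    int implementation;  /* id stored after the first run, or FORM_NO_IMPLEMENTATION */
    int times_executed;  /* how often the full form (PaintProc) was run */
} form_dict;

/* Stand-in for a high level device. */
typedef struct {
    int next_id;
    int forms_stored;
    int forms_reused;
} hl_device;

/* Stand-in for the spec_op sent after endform: returns the id
 * the device assigned to the form it has just stored. */
int dev_get_form_id(hl_device *dev)
{
    dev->forms_stored++;
    return dev->next_id++;
}

/* Stand-in for the 'draw this form' spec_op.
 * Returns 0 on success, -1 if the device does not know the id. */
int dev_draw_form(hl_device *dev, int id)
{
    if (id < 0 || id >= dev->next_id)
        return -1;
    dev->forms_reused++;
    return 0;
}

void exec_form(hl_device *dev, form_dict *form)
{
    assert(dev != NULL && form != NULL);

    /* Check the form dict first: if an Implementation is present,
     * ask the device to draw the stored form instead of re-running it. */
    if (form->implementation != FORM_NO_IMPLEMENTATION &&
        dev_draw_form(dev, form->implementation) == 0)
        return;

    /* No Implementation, or the device returned an error:
     * fall back to the full execform path. */
    form->times_executed++;
    form->implementation = dev_get_form_id(dev);
}
```

In this model a form used on every page is executed once and then reused, so any pdfmarks inside it would also run only once.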
henrys kens2: what does your PaintProc get them? Is it an improvement?16:49.07 
kens2 Its smaller henrys16:49.19 
  About 63Mb instead of 81 Mb16:49.32 
  The problem form is the biggest one and that still ends up in the file 1200 times16:49.48 
rayjj kens2: that's not much of an improvement compared to Distiller16:49.53 
henrys kens2: but not anywhere near adobe16:49.57 
kens2 rayjj ^^16:49.58 
  Like I said, if I fix the [/Pattern] so that the shadings don't mess up the form stream, that will almost certainly improve dramatically.16:50.35 
  I obviously can't say for certain without getting the problem fixed, and it's turning out to be surprisingly difficult16:51.03 
rayjj kens2: and that's the problem with the Shading (Pattern) colorspace, right16:51.06 
kens2 I thought it would be quick to fix, half an hour or so, but its been all afternoon and I'm nowhere with it at the moment16:51.43 
  The problem is that the way the code works when it finds an uncolored pattern it doesn't write the [/Pattern] as a colour space at all16:52.22 
rayjj if we can pass a PDF_obj_id into some of the dicts (images, Patterns, etc.) it becomes a lot more straightforward to recognize that we already have it, right ?16:52.30 
henrys kens2: I'd be inclined to put everything you know in a bug, tell them we are still in a "research mode". If you want to just create the bug I'll talk to the customer.16:52.33 
kens2 So our code for finding duplicate colour spaces doesn't work16:52.34 
  henrys OK I can crib the bug content from the email16:52.48 
henrys okay when I see the bug I'll write the customer and contact support.16:53.18 
kens2 rayjj not really. We have to check at definition whether a defined object is the same as an existing one16:53.29 
  henrys no problem, I'll go do it now.16:53.46 
henrys kens2: when I talked to them privately I did say it didn't look like something you were going to fix quickly so I think they are "braced"16:54.01 
rayjj every PDF object is unique -- if we knew the PDF obj#, then we know it's the same isn't it?16:54.32 
kens2 henrys if you and Miles think it's worth it, a 'quick' solution would be as I outlined above, to have pdfwrite say 'this form has this ID' and have the form code tell pdfwrite each time it is about to rerun a form. That would get them everything they want, performance and small size (I believe). What I don't know is how long it will take to implement16:55.16 
rayjj kens2: you just need to know what object you've created for which source PDF object16:55.28 
kens2 rayjj the input is PostScript, so no object numbers16:55.28 
rayjj kens2: oh. That part I missed.16:55.49 
kens2 That is, unfortunately part of the problem :-(16:56.08 
  BTW we don't even attempt to spot duplicate forms in PDF files, so if they were to take the Distiller output of this file and run it back through pdfwrite they would still get a monster file.16:57.00 
henrys chrisl: put this in batch.ff:17:06.45 
  Open($argv[2])17:06.50 
  Generate($fontname + "." + $argv[1])17:06.51 
  then arg 1 is otf and arg 2 on is a font file.17:07.38 
chrisl henrys: I got it - I just wasn't sure if "otf" would result in CFF outlines, so I tried it17:08.05 
henrys chrisl: I looked at that and it didn't seem it converted to TT17:09.03 
chrisl It's rather poor use of the TLA since OTF doesn't mean it's definitely CFF outlines17:09.13 
henrys chrisl: I imagine if you started with a TT it wouldn't go to cff... but as I was saying I don't know how good fontforge is, if you start from a pfb and generate pfb you get something larger which is alarming but not completely unexpected.17:11.27 
kens2 henrys one (I hope) comprehensive description in bug #69580517:12.36 
  Also has the customer number and such17:12.45 
chrisl henrys: of course, there will be loads of cluster diffs, even just changing Type 1 to CFF.....17:12.50 
henrys kens2: I think we should "sit on it" and discuss it next meeting after I notify the customer17:13.18 
kens2 OK not a problem for me17:13.27 
  I will try and fix the definite bug though17:13.37 
  as a low priority17:13.41 
henrys kens2: right.17:13.45 
kens2 I'm feeling cr*p again, this bug seems to hit me as the day wears on, so I'm off for the night, see you all tomorrow.17:14.50 
henrys has anyone not been sick in January?17:15.30 
chrisl henrys: with those latest fonts from URW, if I have fontforge regenerate pfb's from the ones we got, I get smaller files out: 2.0Mb vs 2.4Mb17:15.45 
henrys what version of ff?17:16.30 
chrisl fontforge 2012073117:16.57 
henrys chrisl: same thing, likely I had it backwards.17:17.48 
  details ;-)17:18.17 
chrisl I could decrypt the fonts and work out why, but I don't think it's worth it17:18.37 
henrys chrisl: yeah, I'm sort of annoyed not to have the fonts from the vendor. He's created the fonts in a tool like fontforge where any format is a button push... geez.17:25.54 
  the otf with cff outlines that is.17:26.32 
chrisl henrys: I'd have thought/hoped they'd be amenable to the request17:28.44 
henrys I somehow missed this talk when it came out, I wish there was a short written summary of it somewhere, anyway worth a listen if you're into tech and civics: https://www.usenix.org/conference/usenixsecurity13/dr-felten-goes-washington-lessons-18-months-government17:31.10 
chrisl henrys: there's my bmpcmp (-t 16 -w 3) on the regression dashboard which is gs with the base fonts in CFF (all except symbol and dingbats)17:33.12 
Robin_Watts chrisl: bmpcmp -filter=.ppmraw :)17:34.26 
chrisl Robin_Watts: I just forgot.... and it didn't seem worth rerunning when the fuzzy got it down to such a manageable number17:35.13 
henrys chrisl: oh that's why my office is warm...17:35.27 
rayjj chrisl: is there a simple way to just get a few devices built into gs (other than autogen.sh and just edit Makefile) ? I want just bit, bitrgb, bitcmyk, bitrgbtags17:35.27 
Robin_Watts Are there any non halftoned ones there?17:36.10 
chrisl rayjj: with configure, do: --with-drivers=bit,bitrgb,bitcmyk,bitrgbtags17:36.35 
  rayjj: But you'll need pdfwrite, too, or gs won't work....17:36.54 
rayjj Robin_Watts: all of the bit devices can be any depth: 1, 2, 4 or 8 bits per component with -dGrayValues=2, 4, 16, 25617:36.57 
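The depth-to-GrayValues pairing rayjj lists follows from levels = 2 ** bits. A minimal sketch of that arithmetic follows; the helper name `gray_values` is hypothetical, and only the -dGrayValues switch itself comes from the log.

```python
def gray_values(bits_per_component):
    """Number of levels per component for a given bit depth (2 ** bits).

    Hypothetical helper for illustration; not part of Ghostscript.
    """
    return 1 << bits_per_component


# The four depths mentioned in the log.
for bits in (1, 2, 4, 8):
    print(f"{bits} bits per component -> -dGrayValues={gray_values(bits)}")
```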
henrys chrisl: I've gone through a page and a half and don't see anything that wouldn't pass "fuzzy"17:37.04 
Robin_Watts rayjj: Different conversation :)17:37.13 
henrys chrisl: are these the new fonts converted?17:37.21 
chrisl henrys: yes17:37.26 
Robin_Watts henrys: They don't pass fuzzy cos they are halftoned :)17:37.35 
chrisl Wot Robin_Watts just said.....17:38.00 
rayjj Robin_Watts: sorry -- that makes sense that fuzzy doesn't work with halftoned images17:38.14 
chrisl There's a few on page 9 that are more noticeable, but not "wrong"17:38.55 
henrys chrisl: can you do the filter so we can all look at them quickly?17:39.16 
chrisl henrys: running now17:41.43 
henrys thanks17:41.57 
chrisl Hmm, except it's not appeared in the queue......17:42.34 
henrys rayjj: did you send out the email to the potential customer?17:42.49 
cryptopsy how can i move with arrows around a large picture opened in mupdf?17:42.57 
henrys I didn't see it.17:42.57 
rayjj henrys: still collecting numbers on linux x86. It's easy enough to also provide the ARM ROM sizes for the builds so I have those, but collecting the clist RAM size is harder. And I am doing mono as well as color based on the printers you had in that link (all at 600 and 1200)17:45.10 
henrys okay great17:45.53 
rayjj I will send it to tech for comment BEFORE it goes to the customer, just in case anyone has comments or questions17:46.15 
  and I have the Font size broken out so if we have the 136 CFF we can plug those in (presumably compressed)17:47.08 
chrisl So, for the 136 fonts from URW, converting from Type 1 to CFF goes from 7Mb to 4Mb and OTF/CFF comes in at 4.5Mb (and TTFs from URW comes in at 12Mb).17:54.36 
rayjj chrisl: what about zipping each font -- what's the total then ? (that's what romfs would do if we enabled compression)17:55.24 
chrisl rayjj: which ones, the T1 or the CFF?17:55.53 
rayjj chrisl: the CFF or the OTF's17:56.13 
henrys chrisl: right but we want to know the numbers with the new glyphs and we don't have those. I hope to extrapolate from the 3 fonts they sent us but that looks precarious17:56.21 
  I had hoped to extrapolate ^^^17:56.54 
rayjj I am just curious how compressible the CFF's will be17:57.22 
chrisl rayjj: ah, give me a sec, I made a mistake there.....17:58.11 
rayjj based on what we did at CalComp (with Peter's wrfont stuff), zip gave us about 80% of the original; bzip2 got it to 70%17:58.35 
chrisl rayjj: ~2.9Mb gzipping the cffs individually17:59.33 
rayjj chrisl: great! so about 75% of the original size17:59.59 
chrisl rayjj: yeh, but I'd worry about the impact on performance......18:00.28 
rayjj and since current romfs doesn't compress, that's a reduction down from 7Mb to 2.918:00.52 
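The per-font gzip measurement discussed above can be approximated with a short script. This is only a sketch under assumptions: the function names, the directory argument and the "*.cff" pattern are hypothetical, and Python's gzip module stands in for whatever compression romfs would actually apply.

```python
import gzip
from pathlib import Path


def gzipped_total(font_dir, pattern="*.cff"):
    """Return (raw_bytes, gzipped_bytes) summed over every matching file.

    Each file is compressed on its own, not as one combined archive.
    The directory layout and pattern are assumptions for illustration.
    """
    raw = packed = 0
    for path in sorted(Path(font_dir).glob(pattern)):
        data = path.read_bytes()
        raw += len(data)
        packed += len(gzip.compress(data))
    return raw, packed


def percent_of(original, reduced):
    """Size of `reduced` expressed as a percentage of `original`."""
    return 100.0 * reduced / original


# Figures quoted in the log, in Mb.
print(round(percent_of(4.0, 2.9), 1))  # gzipped CFF relative to raw CFF
print(round(percent_of(7.0, 2.9), 1))  # gzipped CFF relative to raw Type 1
```

With the quoted figures, the gzipped CFF set is a little under three quarters of the raw CFF set and a little over 40% of the original Type 1 set.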
Robin_Watts Hey marcosw. Feeling better?18:00.55 
rayjj chrisl: fonts get loaded rarely18:01.23 
henrys chrisl: do you have current numbers for the ufst?18:01.40 
chrisl henrys: I don't think you want to know them.......18:01.55 
rayjj and gzip is pretty fast at decompression (unlike bzip2)18:01.56 
  The UFST 80 is about 800Kb iirc18:02.19 
  but it's been a while since I checked18:02.45 
chrisl The 135 PS3 fonts FCO is 1.2Mb18:03.19 
rayjj chrisl: that seems reasonable. Of course, we don't know what glyph set it has18:03.47 
chrisl rayjj: The glyph set is rather bonkers, frankly18:04.13 
henrys rayjj: I hope we do; we just did a big analysis of urw vs. ufst, didn't we?18:04.31 
rayjj chrisl: as is the UFST quality, IMHO18:04.33 
chrisl You also have to add another ~150Kb for the plugin and the other fco which I forget what it's for....18:05.14 
rayjj chrisl: I think that's symbols or dingbats or something18:05.38 
chrisl Yeh, something like that.....18:05.51 
henrys I do wonder how many duplicate glyphs we could find in the 136 or at least visually the same or don't care. 18:06.25 
chrisl henrys: We're still not getting anywhere near the glyph set of the MT fonts *if* you allow the multitudes of "unstyled" "non-standard" glyphs they include18:06.43 
Robin_Watts chrisl: You and Ken argued the other day that postscript can do 'things' with the fonts which means that we have to have CFF rather than TTF. I don't want to open that particular argument again, but I was wondering what things they could do? Other than 'get the outlines for a given glyph' ?18:06.50 
  (sorry, feel free to ignore that until after the existing conversation dies down)18:07.42 
rayjj it might be interesting to pick a fairly common font like "Arial/Helvetica" and compare the glyph quality between URW and UFST and find some particularly ugly UFST glyphs18:07.43 
henrys chrisl: i.e. cjk?18:07.44 
  Robin_Watts: release the kraken18:08.18 
chrisl henrys: no, those crazy geometric shapes and "symbols"18:08.47 
rayjj Robin_Watts: PS can (and often does) add glyphs to the CharStr dict and plug them in -- they add in Type 118:08.51 
henrys anyway, I'm going to do a run, be back in an hour or so. I'll write kens' customer when I return.18:09.24 
rayjj Robin_Watts: and PS sometimes tries to diddle with the matrices to do artificial slant typeface18:09.48 
henrys chrisl: I thought we put those in the order for urw - the box things?18:09.49 
Robin_Watts gs can handle truetype fonts - presumably if someone adds glyphs to those fonts it "works"?18:09.53 
chrisl henrys: some of them, we left out the crazier ones18:10.11 
Robin_Watts (i.e. I bet we don't actually ever add to the real font)18:10.16 
chrisl Robin_Watts: no that doesn't work.18:10.31 
henrys chrisl: okay I think that's reasonable.18:10.35 
Robin_Watts chrisl: Ah, so we really do manipulate the cff internals for that ?18:10.55 
chrisl Robin_Watts: yes, or Type 1 internals - the point is, it needs to be a Postscript font, not a Postscript layer on top of another font format18:11.36 
Robin_Watts chrisl: OK. Curiosity doused for now. Thanks :)18:12.00 
chrisl Robin_Watts: the problem is, if we try to make a glyph from a "real" charstring, and the dictionary isn't from a charstring based font, bad things could happen - like calling a subr, expecting another charstring, and getting an integer back18:13.13 
rayjj Robin_Watts: plus, the TTF's (at least before stripping out tables) are 12Mb compared to 4Mb for CFF. I'm not sure you'd get that back by stripping tables18:13.27 
henrys Robin_Watts: I've seen many postscript programs that do a condition if it is type 1 and assume it is type 2 if the condition fails on an internal font - I recall the position of the euro when adobe first released cff and moved the euro around.18:13.28 
Robin_Watts rayjj: I am not advocating the use of TTF at all.18:13.49 
rayjj Robin_Watts: good. otherwise you might get a midnight visit from some angry Scots ;-)18:14.35 
chrisl And the reason we can "hack" around all that for the UFST/MT fonts is because to render a glyph from those, we only use one standard, and two non-standard keys from the font dictionary, so the rest of the dictionary can be made to look just like a "real" type 1 font.18:15.48 
rayjj chrisl: I am curious about your statement that gs won't run without pdfwrite. I built it and it runs fine (at least tiger)18:17.13 
chrisl rayjj: really? Our startup code specifically loads pdfwrite initially or, at least, did not that long ago....18:18.06 
rayjj chrisl: Note that in order to get it to build I do need a patch I haven't uploaded yet.18:18.19 
  chrisl: hmm... it might be when doing PDFs: annots.pdf fails with: Error: /undefined in --run-- Operand stack: --nostringval-- OutputIntent --nostringval--18:21.32 
  I guess I'll fix that as well since we really don't want printers to require pdfwrite18:22.08 
chrisl No, during startup we (did?) load pdfwrite and do a getdeviceparams - presumably as pretty much every other device uses a subset of the params pdfwrite uses18:22.54 
rayjj chrisl: that may have been fixed in gs_pdfwr.ps that now uses "IsDistiller" spec_op18:24.55 
chrisl Ah, possibly. I did discuss it with kens a while ago18:25.21 
  I'm going to have to finish now, I'm starting to get a headache (late night, last night!).......18:28.46 
cryptopsy bye for now18:47.11 
henrys marcosw: are you back to work?19:27.00 
Robin_Watts mvrhel_laptop: For the logs... 1 of the top 5 commits on robin/master is a fix for SOT builds in your code. Trivial thing. Let me know if you're not happy with it.20:07.29 
henrys is the cluster broken? I get back all segv's that I can't reproduce locally.20:58.21 
  ah, never mind, it's perfectly correct: all pcl -> pdf jobs are failing with otf, which makes sense.21:05.40 