IRC Logs

Log of #ghostscript at

2013/07/18
marcosw is there anybody about who can answer a 'make so' question? specifically, why is gs.o included in soobj/? this results in a duplicate .main symbol under AIX. removing gs.o from soobj/ fixes the problem and creates a .so that can be called from gsc.01:27.47 
henrys marcosw:but this works okay on other platforms right?02:45.05 
  so there are two symbols with the same name in the app and the lib which works okay if the linker deals with it okay. With dynamic linking shouldn't be a duplicate right? static would be an issue.02:47.31 
marcosw henrys: thanks for response. under AIX the duplicate main symbol in the shared library gives a warning, but it does work.06:36.41 
vtorri chrisl: ping06:38.50 
chrisl vtorri: pong06:39.23 
vtorri hey06:39.28 
ghostbot niihau06:39.28 
vtorri question about std* redirection06:39.41 
chrisl vtorri: I'm probably the wrong person to answer that.....06:39.59 
vtorri if i don't use buf, the callback must return 0, right ?06:40.03 
  i mean the callbacks set with gsapi_set_stdio()06:40.44 
chrisl You mean if you just discard the contents of the buffer?06:42.11 
vtorri yes06:42.34 
  just "return 0;"06:42.57 
  is it correct ?06:43.06 
chrisl No, I think you want to return the number of bytes in the buffer - otherwise the calling code will think you haven't done anything, and may continue calling until the buffer is written out06:44.37 
vtorri so "return len;"06:44.57 
chrisl Yes (thanks, was just looking for the name of the parameter!)06:45.17 
vtorri on the contrary, if I do06:45.35 
  i return 0, right ?06:45.47 
  actually, i do something equivalent to printf (it's our logging stuff, but in the end, it calls printf(buf))06:46.49 
chrisl Sorry, I don't follow - printf returns the number of bytes written out06:48.25 
  So just returning zero could potentially be seen as an error condition06:48.48 
vtorri the problem is that our logging system does not return the number of bytes written06:49.24 
chrisl Okay, with printf() you're probably okay06:50.14 
vtorri i call06:50.47 
  which is our log macro06:51.02 
  there is no returned value06:51.18 
  internally, it calls printf(buf)06:51.39 
chrisl vtorri: okay, so this is purely in the stdout/stderr callback?06:51.46 
vtorri but i do not have an access to the value returned by printf06:51.59 
  i will set stdin callback to NULL06:52.33 
  btw, i do set it to NULL :)06:52.50 
chrisl Okay, you will *never* get printf style formatted strings through that callback, we expand those ourselves, so the length of the buffer will be the number of bytes written06:53.08 
vtorri in my case, i can assume that all the buffer is written06:57.34 
  so :06:57.36 
  right ?06:57.48 
chrisl vtorri: well, sort of. buf is not guaranteed to be null terminated07:00.49 
vtorri ouch07:00.56 
chrisl although in practice, I think it usually is.07:01.24 
vtorri ok, thanks07:01.25 
  so malloc + memcpy + [len]=0 is preferable ?07:01.46 
chrisl yes, I would say so07:02.18 
vtorri thanks07:02.26 
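
The advice in this exchange (return len, and copy the buffer before treating it as a C string) can be sketched as below. This is only an illustration: the callback has the same shape as the stdout/stderr callbacks given to gsapi_set_stdio(), but app_log() and last_logged are hypothetical stand-ins for vtorri's log macro, and returning len when malloc fails is an assumption, not something stated in the discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a logging routine that takes a C string and
 * returns nothing. It keeps the last message so the result can be checked. */
static char last_logged[256];

static void app_log(const char *msg)
{
    snprintf(last_logged, sizeof last_logged, "%s", msg);
}

/* Same shape as a gsapi_set_stdio() stdout/stderr callback:
 * (caller handle, buffer, length) -> number of bytes consumed. */
static int my_stdout(void *caller_handle, const char *buf, int len)
{
    char *copy;

    (void)caller_handle;
    if (len <= 0)
        return 0;
    assert(buf != NULL);

    /* buf is not guaranteed to be NUL-terminated: copy it and terminate. */
    copy = malloc((size_t)len + 1);
    if (copy == NULL)
        return len; /* assumption: drop the text rather than signal an error */
    memcpy(copy, buf, (size_t)len);
    copy[len] = '\0';

    app_log(copy);
    free(copy);

    /* Report the whole buffer as written, so the caller does not retry. */
    return len;
}
```

Such a callback would be passed as the stdout and stderr arguments of gsapi_set_stdio(), with NULL for stdin as vtorri mentions.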
marcosw morning chrisl07:02.28 
chrisl morning marcosw07:02.50 
vtorri chrisl: in the gsapi_set_stdio() doc, the returned value of the function is not described. Is it normal ?07:03.31 
  not the returned value of the callbacks07:03.47 
  the returned value of the function itself07:03.58 
chrisl vtorri: it's a normal Ghostscript integer error return value - < 0 is an error, >=0 is fine07:05.13 
vtorri ok07:05.25 
chrisl vtorri: I thought the integer return values were explained *somewhere*, not sure where right now, though07:06.08 
  marcosw: So, the linking thing - it sounds like you're okay on that?07:06.20 
vtorri i don't see it in iapi.h07:06.26 
  ha, it's in ierrors.h07:06.49 
chrisl vtorri: I'll some words to API.htm just to say "unless otherwise stated integer returns are those in ierrors.h" or something07:07.59 
  s/some/add some07:08.12 
vtorri ok07:08.35 
Robin_Watts tor7: Morning08:32.34 
tor7 morning08:32.55 
Robin_Watts I just pushed an updated robin/progressive. Now appears not to leak, and still passes all the cluster tests.08:32.55 
tor7 okay, will look08:33.06 
Robin_Watts I hope it's getting towards commitability. Thanks.08:33.09 
  chrisl: While tor7 looks over the progressive stuff, I'm going to look at pulling lcms2.5 into gs and extracting our changes into a separate plugin for it.08:46.48 
chrisl Robin_Watts: okay, that's fine. Is there anything new of note in 2.5?09:07.39 
Robin_Watts chrisl: All our fixes :)10:30.54 
  but not our speedups.10:30.59 
chrisl_r61 Robin_Watts: hmm, I wonder why Marti doesn't want the speedups.....10:31.24 
Robin_Watts chrisl_r61: He thinks that they would be better done as a plugin.10:32.00 
  which is what I am going to try to recast them as this afternoon.10:32.14 
  He doesn't like my chameleonic header file to generate optimised transform code.10:32.30 
chrisl_r61 Ah, I can understand that, but still.....10:32.54 
chrisl_r61 Seems like a good guide on how *not* to write C!10:39.47 
paulgardiner tor7, robin_watts: There's still something that feels awkward with the font stuff we discussed yesterday. Each font descriptor is created twice. Firstly a font descriptor has to be created so that we can call pdf_load_font to create (indirectly) the fz_font object to use in the fz_text objects. Secondly the pdf-write device creates a font descriptor. It'll certainly do for now, but I can't...10:52.13 
  feeling that either the pdf device should be able to find the initially created descriptor, or I should be seeding the resource (passed to the device) with the descriptors and the device should be able to do matching.10:52.15 
Robin_Watts Yes, that does sound sub-optimal.10:55.25 
  When I've come up against stuff like this before, I've often had to refactor bits of mupdf.10:56.10 
  like functions used to be a pdf level thing, and I promoted them to a fitz level thing.10:56.34 
  similarly with shadings.10:56.45 
  I wonder if we could promote font descriptors from being a pdf level thing to being a fitz level thing.10:57.10 
  Then we could have a fz_font->get_font_descriptor function.10:57.48 
  which would either get the original font descriptor, or would make us one.10:58.19 
  or maybe we need both an fz_font_descriptor and a pdf_font_descriptor or something.10:58.45 
tor7 paulgardiner: would a function to directly create a fz_font (for use with fz_text objects) for the base14 fonts be useful?11:04.05 
  calling pdf_load_font to get the fz_font of an existing font you want to reuse seems logical to me, though11:04.39 
Robin_Watts tor7: The problem, AIUI, is for the device to get a font descriptor from a font.11:04.50 
  though such a function might be useful for app writers.11:05.03 
tor7 the pdfwrite device should, from that fz_font, either rediscover the existing font descriptor if it matches our narrow requirements, or make a new one11:05.13 
  in the general case, we can't know from the pdfwrite device that we're both creating a stream in an existing file and expect to reuse an existing font descriptor11:06.05 
paulgardiner I'm not sure whether we necessarily need the descriptor. I know I need the internal name that refers to the descriptor11:06.18 
tor7 the interpreter and output device are separated by the device interface, and should not cross-contaminate IMO11:06.49 
Robin_Watts tor7: The pdfwrite device can be expected to know details of the document it is working within.11:08.08 
tor7 robin_watts: I disagree about pdf_font_descriptor moving to the fz layer. they're an interpreter only thing.11:08.15 
Robin_Watts it just can't expect to know those details via the device interface.11:08.25 
paulgardiner tor7: I did wonder about a function to create an fz_font for base fonts. I dismissed it because I thought I needed the internal name to be in the fz_font obj, but actually it should be the name of the base font, so I think it would work.11:09.02 
tor7 robin_watts: the resource machinery in the pdfwrite device has the responsibility to create (or rediscover) font descriptors11:09.07 
Robin_Watts tor7: right.11:09.15 
tor7 all it sees are fz_font and glyphs, without the icky encoding and metric crap carried over from the pdf interpreter11:09.53 
  given a fz_font it should be able to create a font descriptor for use; and I was thinking it could also look for an existing one that would look just like one it would create itself and use that if found11:10.44 
Robin_Watts Where would it look for such an existing one?11:11.41 
paulgardiner In the resource dict?11:11.58 
tor7 in order to get resource reuse, it'd have to look through existing resource dictionaries (and maybe some pdfdevice internal list)11:12.00 
Robin_Watts tor7: right. And those existing resource dictionaries come from the existing document.11:12.46 
tor7 basically iterate through the fonts in the resource dict, if the fz_font matches and it follows our criteria about encodings reuse it, else create a new font descriptor and insert11:12.48 
Robin_Watts or are at least seeded from it.11:13.03 
tor7 I don't think we should scan the entire document for font descriptors, just the current page's resource dict11:13.54 
paulgardiner tor7: we get to pass a resource dict to the device11:14.21 
  So we could restrict the search to that11:14.32 
tor7 paulgardiner: when creating the pdfwrite device you mean?11:14.43 
paulgardiner tor7: yes11:14.49 
tor7 paulgardiner: right. that'd be the page resource dict usually, I believe.11:15.05 
  or an Xobject resource dict11:15.15 
Robin_Watts The resource dict we pass in when creating the device is (If my crap memory serves) the current resource dict for the thing we are writing.11:15.20 
tor7 the one you'd want to use anyway :)11:15.35 
paulgardiner tor7: certainly the latter when I've used the device11:15.36 
tor7 and the one we want to insert newly created font descriptors in11:15.50 
Robin_Watts But arguably, when it comes to fonts, we want to pass in other resources from the same file, and say "feel free to pull stuff in from here too".11:15.57 
tor7 robin_watts: yes, we could make use of a list of previously spotted font descriptors and insert those in the current resource dict as well11:16.41 
Robin_Watts otherwise we could end up creating a new font descriptor for every annotation on a page, say.11:16.52 
paulgardiner That's what I've been referring to as "seeding"11:17.20 
Robin_Watts tor7: At the moment, I think we probably create a pdfwrite device, use it, then destroy it.11:17.31 
  hence the ability to "remember" font descriptors internally (and indeed images, shadings etc) is restricted to within a single stream.11:18.01 
tor7 I guess we should extend the pdfwrite device to have some out-of-band non-device calls for this use11:18.02 
Robin_Watts quite possibly.11:18.09 
tor7 or have a separate pdfwritestate object that the devices share, and is hooked up to the pdf_document that is being edited or created11:18.54 
paulgardiner I'm slightly lost now. Isn't the existing ability to pass a seeded resource sufficient?11:19.06 
tor7 paulgardiner: actually, I think it might. if you share the same resource dictionary for all annotations you create.11:19.29 
paulgardiner It seems nice to me to have the flexibility to share or not depending on use, and I'd have thought the passed-in resource allows for that.11:21.06 
Robin_Watts paulgardiner: AIUI, the resource we send in is just "the resource dictionary for this stream".11:21.33 
paulgardiner Hmmm. Yes it is.11:21.56 
Robin_Watts If you 'preseed' that, you end up with potentially having lots of stuff in resource dictionaries that isn't used.11:21.57 
paulgardiner I may be going around in circles, but:11:22.35 
Robin_Watts I think I prefer the idea of passing in both "the current resource dictionary" for the thing to add to as required, plus "some other resources you might want to make use of".11:22.47 
paulgardiner It would be handy if either I could put in just the font-descs I wanted into the resource dict, or if the device could add them as needed (using existing one somehow known from the fz_font obj)11:23.48 
  robin_watts: oh okay11:24.16 
Robin_Watts Ah fz_fonts tied to a doc ?11:24.30 
  s/Ah/Are/ ahem.11:24.39 
tor7 robin_watts: so, a global resource dictionary that has it all, and then create a stripped minimal resource dict that has just the stuff used in a given stream?11:24.57 
paulgardiner I think type 3s are tied11:25.00 
tor7 type3 should be independent once they end up in the fz_device interface11:25.34 
  some of the bits are dependent, but they're only used by the interpreter and converted to other device calls11:26.06 
  any type3 font you get in a fz_text object has a display list per glyph you can use11:26.30 
Robin_Watts tor7: I wouldn't like to mandate that all callers MUST create a global resource dictionary, but I'd like the device to be able to make use of it if it was there.11:26.45 
paulgardiner Just to make sure I understand: so we choose this approach because we think it would be messy to make descriptors derivable from fz_font objects, but we think it should be possible to match an fz_font object to a known descriptor11:27.02 
tor7 robin_watts: yeah, that's what I meant when I was babbling about a pdfwrite internal list of font descriptors11:27.22 
Robin_Watts paulgardiner: That agrees with my fuzzy understanding.11:27.34 
  tor7: right. My only qualm with that is that currently we create a device, use it, throw it away. hence any internal list that gets built up wouldn't be long lived.11:28.10 
  perhaps we can think of a way to persist such an internal list across pdf device creation/destruction though.11:28.38 
paulgardiner We should be sure that's true, because if matching turned out to be just as hard, we are losing for no gain, the simpler approach of not needing to pass a global resource11:28.50 
tor7 robin_watts: which is why I'm thinking I'd like a pdfwrite-state object that the devices use11:28.51 
  so you'd create a new pdfwrite device per stream you want to create11:29.12 
Robin_Watts tor7: Right. Was about to suggest a pdf_write_resource list or something.11:29.15 
tor7 and then use some calls on the pdfwrite state object to insert the stream11:29.32 
Robin_Watts and you'd pass the pdf_write_resource into each pdf device creation.11:29.40 
tor7 yeah, except I'd make it more abstract since we're likely to want to use it for more things later on11:30.02 
  like outlines and links, etc11:30.16 
Robin_Watts tor7: ok.11:30.31 
  I was thinking that images/fonts/shadings would be enough (hence resource), but you may be right about outlines/links etc.11:31.14 
tor7 paulgardiner: we choose this approach because it's going to be messy to reuse a generic font descriptor (all the encoding crap in pdf gets in the way, and substitute fonts make for another mess).11:32.19 
  if by deriving you mean locating a font descriptor object number from a fz_font, that bit is easy enough (just load all the fonts in the resource dict, and do pointer comparisons of the fz_font)11:33.10 
paulgardiner tor7: I'm not completely understanding why yet, other than the fact that it's difficult to fit with the interfaces.11:33.28 
tor7 if by deriving you mean creating a new font descriptor object from a fz_font, then yes, that can get messy but is something we must do anyway11:33.37 
  so basically, we only want to reuse a small subset of existing font descriptors.11:34.08 
paulgardiner If I have an fz_text object, I can assume that the fz_font that it refers to and the descriptor that came from is going to be appropriate for display of the text11:34.16 
tor7 and that subset is base14 + winansi encoding and no funny business11:34.19 
  paulgardiner: where did you get said fz_text object? if from the interpreter, there is no such guarantee.11:34.55 
paulgardiner By "deriving" I mean we get the fz_font to internally remember the descriptor it came from11:35.04 
tor7 paulgardiner: that could also be an xps_font...11:35.20 
  IIRC we're talking about this because you want to know what /Name to put in the fill_text call in pdfwrite, and that name must come from an entry in the resource dictionary, and correspond to the fz_font you have. right?11:36.38 
paulgardiner Yeah, in the case I'm dealing with.11:37.23 
tor7 so the simplest (and broken, not going to work for more than lucky cases) you can just look through the resource dict, load all the fonts, and compare pointers until you find one that matches11:38.08 
  but that will fail both when the fz_font doesn't exist in the resource dict yet, and the font descriptor does funny business (which is true in >90% of cases when dealing with non-annotation text at least)11:38.52 
  my suggestion is to do the same looking, but also check the resulting font descriptor so it doesn't do funny business.11:39.46 
paulgardiner I'm not looking for the simplest solution. I think we have that with the special function to create fz_fonts for base fonts.11:39.49 
tor7 if funny business, or no matching font, create and insert a new font descriptor for the fz_font11:40.05 
  creating a new font descriptor for base14 fonts is trivial, so I'd start there11:40.23 
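
The lookup tor7 describes (scan the resource dict, reuse an entry only if the font pointer matches and the descriptor does no "funny business", otherwise create and insert a new one) might look roughly like this. Everything here is a toy: struct toy_font, struct font_entry, struct resource_dict and find_or_create_font() are hypothetical and are not MuPDF types or functions, and the plain flag stands in for the "base14 + WinAnsi + no metric overrides" check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_FONTS 16

struct toy_font { const char *base_name; };  /* hypothetical stand-in for fz_font */

struct font_entry {
    char name[16];                /* resource name, e.g. "F0" */
    const struct toy_font *font;  /* font this descriptor loads to */
    int plain;                    /* 1 if the descriptor does no funny business */
};

struct resource_dict {
    struct font_entry fonts[MAX_FONTS];
    int count;
};

static int name_in_use(const struct resource_dict *res, const char *name)
{
    int i;
    for (i = 0; i < res->count; i++)
        if (strcmp(res->fonts[i].name, name) == 0)
            return 1;
    return 0;
}

/* Return the resource name to use for font: reuse a matching plain entry,
 * or insert a fresh one. Returns NULL if the dictionary is full. */
static const char *find_or_create_font(struct resource_dict *res,
                                       const struct toy_font *font)
{
    struct font_entry *e;
    int i, n = 0;

    assert(res != NULL && font != NULL);

    for (i = 0; i < res->count; i++) {
        e = &res->fonts[i];
        /* pointer comparison, plus the "no funny business" check */
        if (e->font == font && e->plain)
            return e->name;
    }
    if (res->count == MAX_FONTS)
        return NULL;

    e = &res->fonts[res->count];
    do {  /* pick the first "F<n>" name not already in the dictionary */
        snprintf(e->name, sizeof e->name, "F%d", n++);
    } while (name_in_use(res, e->name));
    e->font = font;
    e->plain = 1;  /* descriptors synthesized here are always plain */
    res->count++;
    return e->name;
}
```

The returned name is the piece paulgardiner says he needs: the entry in the resource dictionary that corresponds to the fz_font in hand.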
paulgardiner The thing I'm still banging on about is (I think) Robin's suggestion of having (in some cases) fz_fonts remember details of how they were created.11:40.55 
tor7 in the fz_text call you have absolute positions of all the glyphs, you'll need to both look at the /Widths array and the implicit widths from the freetype font data to figure out how much you need to compensate with text metric adjustment commands in the output stream11:41.33 
paulgardiner And then there is no matching because you can simply ask for the descriptor11:41.35 
Robin_Watts paulgardiner: The problem there, is what if an fz_font has come from an xps file?11:42.04 
tor7 paulgardiner: if the font is one you got from a font descriptor in the file, all the descriptors etc will be cached in the pdf_store11:42.04 
Robin_Watts It will have no pdf_font_descriptor.11:42.16 
tor7 so calling pdf_load_font in the pdfwrite device should be snappy11:42.20 
paulgardiner robin_watts: yes so fz_fonts would need to be virtual and some would allow request for the descriptor some not11:42.44 
Robin_Watts paulgardiner: And partly that was why I asked if fz_font's were tied to a doc.11:43.12 
tor7 pdf_font_desc *pdf_find_font_descriptor(doc, dict, fz_font *font)11:43.20 
  that can do the dict enumeration and pdf_load_font and pointer tests11:43.39 
  is that what you want?11:43.50 
Robin_Watts tor7: What type is doc ?11:44.06 
tor7 pdf_document*11:44.11 
Robin_Watts And that's the doc that the device is writing into ?11:44.24 
tor7 it'd have to return both the font descriptor, and dict name, I believe11:44.25 
Robin_Watts makes sense.11:44.30 
tor7 yes.11:44.32 
Robin_Watts (or at least it makes sense modulo my dodgy understanding of fonts)11:44.53 
tor7 rename the function, and it could also synthesize the font descriptor objects and insert a new entry into the resource dict.11:45.15 
  the pdf_font_desc has all the icky stuff you need to do with metrics as well11:45.42 
paulgardiner I think with robin_watts's suggestion you could have pdf_font_desc *pdf_find_font_descriptor(fz_font *font)11:45.54 
tor7 pdf_font_desc_from_font()11:46.01 
  paulgardiner: yeah, but then we'd still be in the situation of needing to detect whether the pdf_font_desc we get is actually suitable for our purposes11:46.32 
  and belongs to the output document11:46.40 
  consider someone doing pdf2pdf using the device11:46.51 
paulgardiner tor7: couldn't the pdf device just stuff it in the resources11:47.16 
tor7 then it'd have to copy files from the input document, bypassing the device interface. no thanks.11:47.46 
paulgardiner I was thinking that really we want the pdf_obj back not the pdf_font_desc11:47.52 
tor7 you'll need the pdf_font_desc to deal with the metrics (/Widths arrays) when writing text11:48.15 
  and the encoding tables11:48.22 
  to match the glyph layout and kerning stuff from the fz_text object11:48.54 
paulgardiner But isn't the pdf_font_desc derivable from the pdf_obj form of the descriptor? Isn't that where it came from?11:49.09 
tor7 by pdf_load_font yes11:49.22 
paulgardiner I think I may misunderstand more of this than I thought.11:49.32 
Robin_Watts paulgardiner: Perhaps we could put a pdf_obj * pointer into the pdf_font_descriptor ?11:50.08 
tor7 brb, getting dizzy. need lunch.11:51.22 
paulgardiner Possibly, though perhaps this whole approach doesn't lead anywhere because however neatly it deals with my case we still need to handle the XPS font case11:51.49 
tor7 robin_watts: I won't object to that (pdf_obj of the 10 0 R form in the pdf_font_desc)11:52.01 
  I'll write up some text explaining my thoughts this afternoon11:52.19 
Robin_Watts lunches (I think my "contribution" to this conversation is pretty much exhausted anyway!)11:52.31 
tor7 or ask miles to schedule a workshop of a few days where we can discuss this in person11:53.03 
paulgardiner When you're both back can we briefly discuss the base 14 only solution because that's what I'd like to do for now.11:54.05 
  I'm hoping it still makes sense to go with yesterday's plan plus the new fz_font creating function for base fonts.11:54.58 
  If that still makes sense, I can do that fairly quickly I think11:55.15 
Robin_Watts paulgardiner: That sounds fine to me, though I only understand it at the highest level.12:36.20 
tor7 robin_watts: skype voice chat maybe?12:37.45 
  paulgardiner: ^12:37.48 
Robin_Watts tor7: If you like, though I don't necessarily feel I have much to contribute.12:38.13 
tor7 I can rant about the stupidity of fonts in PDF :)12:38.37 
paulgardiner We could give it a go. I've just realised I may need to externally create a descriptor in any case because I have to add a DA (default appearance) string to the annotation and that can't sensibly be generated by the pdf device12:40.06 
tor7 paulgardiner: some incoherent ramblings12:44.46 
  paulgardiner: so to start, you want to just create base14 font descriptors and not reuse existing font descriptors?12:47.24 
paulgardiner Yes. That'll do fine for now12:47.47 
  So does yesterday's plan still fit with today's? If I carry on in that style, will I be going against what we've decided today?12:49.00 
Robin_Watts I think getting something (anything!) that works without too much trouble has to be a step forward.12:49.30 
tor7 remind me of yesterday's plan12:49.35 
paulgardiner tor7: I was hoping you wouldn't ask that. :-)12:49.56 
tor7 paulgardiner: okay, let me paste today's requirements as I understand them12:50.26 
  interpreter load: from pdf_obj to pdf_font_desc (and by extension fz_font)12:51.04 
  interpreter layout: from pdf_font_desc + content stream to fz_font + fz_text12:51.12 
paulgardiner I think it was "use the pdf-write device as is, but where it currently always creates Helvetica, have it sensitive to the fz_font passed in, using the data pointer to decide which base font"12:51.18 
tor7 (this is the data flow of our structs)12:51.21 
  in pdfwrite we need: from fz_font + fz_text to pdf_obj + content stream12:51.39 
  at the highest level12:51.42 
  which need to be decomposed to these transforms:12:51.58 
paulgardiner That's the trouble with IRC. I was joking when I said I didn't know what yesterday's plan was12:51.58 
tor7 from fz_font to pdf_font_desc12:52.17 
  here we can either reuse an existing font descriptor (if it matches our needs)12:52.39 
  or create a new one from scratch based on the fz_font12:52.50 
  from pdf_font_desc to pdf_obj12:52.57 
  so we have something to stick in the resource dictionary12:53.04 
  from fz_text and pdf_font_desc to content stream12:53.10 
  we need the pdf_font_desc to deal with encodings and metric layout12:53.25 
  we also need a way for you to create a fz_text object given a fz_font + unicode string IIRC12:53.54 
  (and I honestly forgot what yesterday's plan was, we've been back and forth so many times)12:54.54 
paulgardiner I've just paraphrased it above12:55.16 
tor7 right. so currently it always creates a new font descriptor for Helvetica?12:55.58 
paulgardiner The pdf device already creates descriptors but always Helvetica12:56.06 
tor7 right. so we could start by extending that to the full set of base 1412:56.21 
paulgardiner Yep. I think that was yesterday's plan12:56.37 
tor7 and error out if the fz_font doesn't match12:56.38 
paulgardiner Or create Helvetica :-)12:56.48 
  Probably best to error out12:57.01 
tor7 given a fz_font based on a freetype font it should be fairly easy to create a font descriptor embedding the data12:57.28 
  but it means reading up and understanding pdf font descriptors...12:57.47 
paulgardiner As an extension?12:57.49 
tor7 but one of the cases will be base14 so best to start there12:58.02 
paulgardiner Right12:58.08 
tor7 from a fz_font we have four kinds of font descriptors we need to worry about:12:58.39 
paulgardiner The one addition from today I was considering was to add the quick method to get an fz_font for a base font12:58.43 
tor7 1) base 14 non-embedded12:58.47 
  2) embedded12:58.51 
  3) substitute fonts (non-embedded non-base14)12:59.03 
  4) type312:59.05 
paulgardiner 5) not PDF?12:59.23 
tor7 I don't understand that question.12:59.40 
paulgardiner I thought we also needed to deal with fz_font objects from XPS files.... or do we not create fz_fonts for XPS13:00.36 
tor7 the 4 cases listed are what we may need to *create*13:00.44 
  regardless of the source13:00.52 
paulgardiner Oh ok sorry13:01.00 
tor7 xps fonts will always end up in case (2) though13:01.12 
  they're always embedded :)13:01.16 
paulgardiner Right13:01.17 
tor7 there are many more complications with font descriptors though, simple vs cid font for example.13:01.53 
  so preserving existing font descriptors is a much too difficult problem13:02.15 
paulgardiner I was reading the whole thing as the incoming fz_font. The one that fz_fill_text finds in the fz_font attached to the fz_text13:02.19 
tor7 the fz_font exists in two flavors -- freetype backed, or type 313:02.42 
  the freetype backed one has a flag whether it's a real font or whether it's an acting substitute font13:02.57 
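
One way to picture how the four descriptor kinds follow from the two fz_font flavors and the substitute flag is the classification below. It is a sketch under assumptions: struct toy_font is a hypothetical simplification, not the real fz_font layout, and the order of the checks (substitute before base14) is a guess rather than anything settled in the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* The four kinds of descriptor listed above. */
enum desc_kind {
    DESC_BASE14,      /* 1) base 14, non-embedded */
    DESC_EMBEDDED,    /* 2) embedded font data */
    DESC_SUBSTITUTE,  /* 3) substitute font: non-embedded, non-base14 */
    DESC_TYPE3        /* 4) type 3 */
};

/* Hypothetical, simplified view of a font; NOT the real fz_font. */
struct toy_font {
    int is_type3;       /* type 3 rather than FreeType-backed */
    int is_substitute;  /* FreeType-backed, but standing in for another font */
    int is_base14;      /* one of the 14 standard PDF fonts */
};

static enum desc_kind classify_font(const struct toy_font *font)
{
    assert(font != NULL);
    if (font->is_type3)
        return DESC_TYPE3;
    if (font->is_substitute)
        return DESC_SUBSTITUTE;
    if (font->is_base14)
        return DESC_BASE14;
    return DESC_EMBEDDED;  /* e.g. a font that came from an XPS file */
}
```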
paulgardiner That's one of the things I'm still not completely understanding: "preserving existing too difficult"13:03.19 
tor7 the font descriptor in pdf lets you go from funny-pdf-encoding-that-you-have-no-idea-what-the-bytes-actually-mean to a glyph id13:04.15 
  and in the lucky cases, also to unicode13:04.19 
  it also has a table mapping from character code (in this funny unknown encoding) to glyph advance width13:04.41 
paulgardiner But we are passing an fz_text object that presumably has a suitable string for the encoding13:04.56 
tor7 there is no way we can reliably go from a unicode string to a fz_text object from this information13:05.09 
  the fz_text object has raw glyph ids (encoding less) and maybe, just maybe, some unicode text that matches13:05.50 
  depending on how good the input data is13:05.56 
  a font in a pdf file does not have a way to go from glyph id to unicode. it *may* be possible by reverse lookups in the freetype backed font data. but not guaranteed.13:06.33 
paulgardiner Still confused. We call fz_fill_text. We give it an fz_text object. However confusing the encoding is, the string in the fz_text object must match the encoding in the fz_font.13:06.46 
tor7 lots of embedded subset fonts have stripped out the necessary tables.13:06.48 
  there is no encoding in fz_text13:07.01 
  there is no encoding in fz_font13:07.06 
  they work on raw glyph ids13:07.10 
paulgardiner Oh13:07.11 
  Of course13:07.19 
  There couldn't be. Silly me13:07.24 
tor7 the fz_text may have supplementary unicode text string, if we're lucky. but that's only for the text extraction device and searching.13:07.59 
paulgardiner No no no.13:08.22 
tor7 now, in a subset of fz_fonts we can go from unicode <-> glyph and back13:08.23 
paulgardiner What is in the fz_text string?13:08.29 
tor7 fz_text_item_s has x, y, glyph id, and unicode code point.13:09.02 
  the glyph id is used to find the glyph outline and render it13:10.01 
paulgardiner Where glyph id makes sense only if the correct encoding is used?13:10.13 
tor7 the unicode code point is used for searching and copy&paste13:10.14 
  the glyph id is the index into the glyph data table in the font. it has no correlation with any encoding.13:10.32 
  a truetype font has a "glyf" table, a "loca" table and a "cmap" table13:10.45 
  the "cmap" table maps from unicode to glyph id13:10.52 
  the "loca" table maps from glyph id to offset in the "glyf" table13:11.01 
  the data in the "glyf" table is the vector outline13:11.14 
paulgardiner Right13:11.21 
tor7 in PDF and XPS, the content streams may index the glyph id ("loca" table) directly, bypassing the "cmap"13:11.39 
  a /Identity-H encoded font in PDF for example13:11.55 
paulgardiner I'd forgotten all the terms13:11.56 
tor7 or worse, a PDF font can use /Identity-H to get a number which is looked up in the /CIDToGIDMap which is then used to index the "loca" table13:13.23 
  and the embedded font can be a subset that doesn't have a "cmap" table13:13.40 
  and the /ToUnicode entry is optional13:13.46 
  which means there's absolutely no way we can go to or from unicode using this font13:14.02 
paulgardiner For another product I implemented most of the font interpretation and encoding handling. I have known this stuff at one time, but it is a while ago.13:14.04 
tor7 but it prints correctly :)13:14.08 
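
The table chain described above can be illustrated with toy data. The arrays below are invented for the example and are not a real TrueType parser; the point is only that "cmap" is the one step that involves Unicode, while "loca" and "glyf" work purely on glyph ids, which is why a content stream that supplies glyph ids directly still renders when "cmap" is missing.

```c
#include <assert.h>
#include <stddef.h>

struct cmap_entry { unsigned long unicode; unsigned gid; };

/* Toy font data, invented for illustration. */
static const struct cmap_entry toy_cmap[] = {
    { 'A', 1 }, { 'B', 2 },
};
#define TOY_GLYPH_COUNT 3
/* One offset per glyph, plus a final entry marking the end of "glyf". */
static const size_t toy_loca[TOY_GLYPH_COUNT + 1] = { 0, 10, 42, 60 };

/* "cmap": Unicode -> glyph id; 0 (the missing glyph) if unmapped. */
static unsigned cmap_lookup(unsigned long unicode)
{
    size_t i;
    for (i = 0; i < sizeof toy_cmap / sizeof toy_cmap[0]; i++)
        if (toy_cmap[i].unicode == unicode)
            return toy_cmap[i].gid;
    return 0;
}

/* "loca": glyph id -> offset (returned) and length of the outline in "glyf". */
static size_t loca_lookup(unsigned gid, size_t *length)
{
    assert(gid < TOY_GLYPH_COUNT && length != NULL);
    *length = toy_loca[gid + 1] - toy_loca[gid];
    return toy_loca[gid];
}
```

Going from Unicode is cmap_lookup() followed by loca_lookup(); the Identity-H style path skips the first call and hands a glyph id straight to loca_lookup().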
  so we may need to extend the fz_font struct to allow it to have some metric and encoding capability13:15.24 
paulgardiner Type 1s? Don't the encodings map onto names rather than glyph ids? Or am I misremembering that too?13:15.35 
tor7 the encodings map to names which map to charstrings via a dictionary13:16.03 
  in freetype, the charstrings are loaded into an array, and the names are mapped to an index in this array13:16.26 
paulgardiner But what do you use for glyph id's in fz_text_item_s? ... oh an enumeration of the names maybe?13:16.48 
tor7 the freetype internal index13:17.00 
  in freetype, all glyphs are indexed by a numeric index13:17.23 
paulgardiner For type 1s?13:17.24 
tor7 for type1's freetype will synthesize a numeric index for each glyph13:17.37 
  and give you mapping tables to go from name -> index and back13:17.46 
  glyph_id = FT_Get_Name_Index(face, string)13:18.15 
paulgardiner Ah. You use the freetype thirdparty library in the interpretation of Type1s in PDF?13:19.46 
tor7 yes. we use freetype for all fonts, except type3.13:20.00 
paulgardiner Oh nice. I wish we'd thought of that in the other project I mentioned.13:20.29 
  Ok. Now at last I understand why we can't reuse font descriptors... or well at least why it would mean also hiding extra encoding specific info in the fz_text_item_s13:21.55 
  The worrying thing is that now I realise I have yet another problem I hadn't considered at all. I need to construct a fz_text object from a utf8 string and an fz_font.13:23.28 
tor7 yeah. so if we have an existing font descriptor, in simple cases we could reuse it. like it being a known font (i.e. base14) with a known encoding (i.e. WinAnsi)13:23.29 
  and no metric overrides13:23.34 
paulgardiner I hadn't realised that required layout13:23.38 
tor7 it needs both layout *and* encoding13:23.48 
paulgardiner Yes13:24.00 
tor7 but if it's a base14 font (or a well formed embedded font, with unicode cmap tables) that can be done13:24.14 
  some subset fonts I think can skip the metrics tables as well, to save space13:24.44 
  since the PDF needs the /Widths array for layout anyway13:24.53 
paulgardiner It is a huge amount of work for what can be done so simply with sprintf.13:24.55 
tor7 yeah.13:25.47 
  so anyway, with a proper fz_font we should be able to use freetype to both do the encoding and metrics13:26.14 
  but we probably want to add some functions to do it13:26.27 
paulgardiner I guess the pdf device is going to find it difficult to work out that it doesn't need to do a move for each char13:26.36 
tor7 it needs to check how much the implicit move is, and write the delta if the next char is not where expected13:27.08 
paulgardiner So we are going to end up with inefficient appearance streams13:27.12 
Robin_Watts what tor7 said.13:27.22 
  but inefficient streams will do for a start.13:27.36 
paulgardiner Yeah true13:27.43 
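As a rough illustration of the check tor7 describes (compare the implicit move with where the next char actually sits, and only write a delta when they differ), here is a minimal sketch in C. The `glyph_pos` struct and `compute_move_deltas` are hypothetical names for illustration, not part of MuPDF or the pdf device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a positioned glyph; not the real fz_text_item_s. */
typedef struct {
    float x;        /* pen x position where the glyph is drawn */
    float advance;  /* implicit advance the font would apply after it */
} glyph_pos;

/*
 * For every glyph after the first, work out where the previous glyph's
 * implicit advance leaves the pen and compare that with the glyph's real
 * position. deltas[i] receives the correction (0 when none is needed).
 * Returns how many explicit moves a writer would have to emit.
 */
size_t compute_move_deltas(const glyph_pos *g, size_t n, float *deltas, float eps)
{
    size_t moves = 0;
    assert(n == 0 || (g != NULL && deltas != NULL));
    for (size_t i = 0; i < n; i++) {
        deltas[i] = 0.0f;
        if (i == 0)
            continue;
        float expected = g[i - 1].x + g[i - 1].advance;
        float d = g[i].x - expected;
        float mag = d < 0.0f ? -d : d;
        if (mag > eps) {
            deltas[i] = d;
            moves++;
        }
    }
    return moves;
}
```

A run whose glyphs all sit where the advances predict yields zero moves, so the whole string could go out as a single show operation.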
  So roughly how do I create fz_text for a utf8 string... just the base 14 case for now.13:28.22 
tor7 two ways really13:28.37 
  either limit the charset to WinAnsi and use that13:28.50 
  or make a CIDFont font descriptor, and use Identity-H with a 16-bit encoding13:29.01 
  and write out the glyph ids13:29.05 
paulgardiner WinAnsi13:29.10 
tor7 (but that's not going to work for base14, that needs embedding the font data)13:29.19 
  I'd go with WinAnsi for now13:29.27 
  you *could* figure out a character set to use dynamically, or make a separate font descriptor for each "code page" that you need13:29.58 
paulgardiner I was more requesting a rough sequence of calls... although I guess I get that from inspecting the interpreter13:30.12 
  I'm happy for now if only ascii works13:30.39 
tor7 the calls don't exist :)13:30.47 
  when creating the fz_text I'd say use freetype to encode unicode values directly13:31.16 
  text_item->x = point.x13:31.28 
  text_item->y = point.y13:31.33 
  text_item->gid = FT_Get_Char_Index(font->ft_face, c)13:32.03 
  text_item->ucs = c13:32.08 
paulgardiner It's point.x and point.y I was most unsure of13:32.30 
tor7 FT_Get_Advance(face, gid, ...)13:32.38 
paulgardiner Oh okay13:32.44 
tor7 look in source/xps/xps-glyphs.c for an example13:32.49 
  then the pdfwrite device will have to worry about going from glyph id back to WinAnsi13:33.16 
  you also need to FT_Set_Charmap to pick the unicode cmap13:34.02 
  see xps-glyphs.c again13:34.07 
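The sequence tor7 outlines (encode each unicode value to a glyph id, record the pen position, then advance the pen by the glyph's advance) might be sketched roughly as below. To keep it self-contained this does not call FreeType: `encode_fn` and `advance_fn` are hypothetical callbacks standing in for FT_Get_Char_Index and FT_Get_Advance, `text_item` is a simplified stand-in for the real text item struct, and the `demo_` functions return made-up values.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical text item; the real struct lives in MuPDF. */
typedef struct { float x, y; int gid; int ucs; } text_item;

/* Hypothetical callbacks standing in for the FreeType calls. */
typedef int   (*encode_fn)(void *font, int ucs);   /* unicode -> glyph id */
typedef float (*advance_fn)(void *font, int gid);  /* glyph id -> advance */

/*
 * Lay out n unicode values on one horizontal line starting at (x, y).
 * Fills out[0..n-1] and returns the pen x position after the last glyph.
 */
float layout_run(void *font, encode_fn encode, advance_fn advance,
                 const int *ucs, size_t n, float x, float y, text_item *out)
{
    assert(encode != NULL && advance != NULL);
    assert(n == 0 || (ucs != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        int gid = encode(font, ucs[i]);
        out[i].x = x;
        out[i].y = y;
        out[i].gid = gid;
        out[i].ucs = ucs[i];
        x += advance(font, gid);
    }
    return x;
}

/* Demo stubs with invented values, only so the sketch can be exercised. */
int demo_encode(void *font, int ucs) { (void)font; return ucs - 29; }
float demo_advance(void *font, int gid) { (void)font; (void)gid; return 6.0f; }
```

In real code the callbacks would wrap the FreeType face after the unicode charmap has been selected, and the advance would need scaling to the font size.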
  the xps calls for this could possibly be moved into fz_ (if we make them handle type3 fonts as well)13:35.07 
  or at least not crash on type313:35.13 
  xps_encode_font_char, xps_measure_font_glyph, etc.13:35.33 
paulgardiner I still can't stop asking myself "is this the best way to go from <"Hello","Helv", 12> to "BT /Helv 12 Tf 0 g (Hello) Tj ET""13:36.15 
tor7 paulgardiner: good question, but I think it is if you want to go via the fz_device layer13:36.48 
paulgardiner Oh yes definitely. It's the use of the fz_device layer that I guess I'm questioning. It did show huge advantages for the other types of annotation.13:38.11 
  But I'm happy in the knowledge that similar alarm bells aren't ringing so loudly in your or robin_watts's ears.13:39.28 
Robin_Watts paulgardiner: There may be a shorter route as a hack to get basic annotations working.13:40.05 
tor7 taking care of using base14 fz_font and only the base14 template font descriptor on the pdfwrite back end should short-cut the difficulties13:40.46 
Robin_Watts but in the same way that reusing the existing work that had been done for pdf write helped the annotation stuff in the past, your annotation stuff is now in a position to help the more general pdf write.13:40.53 
tor7 and let us grow that into the full implementation that a generic pdfwrite device will need13:41.06 
paulgardiner To be honest, I don't think it's a huge amount of work to get basic base 14 working this way. I was more concerned about the amount of processing on each use13:41.27 
tor7 and not get us into the problem where we have two separate text layout and font descriptor handling13:41.29 
  paulgardiner: basically it'll map from UTF-8 to glyphs to winansi, rather than just from UTF-8 to winansi13:41.57 
  and let us handle non-winansi cases down the line13:42.03 
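For comparison, the direct UTF-8 to WinAnsi route that tor7 contrasts with the glyph-based one could look roughly like this in C. It is only a sketch under stated assumptions: the function names are hypothetical, the decoder skips overlong and surrogate checks, and the mapping table is deliberately partial (printable ASCII, the 0xA0 to 0xFF range shared with Latin-1, and the Euro sign at 0x80).

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence (up to 3 bytes) from a NUL-terminated string.
 * Returns bytes consumed, or 0 on malformed or unsupported input. */
size_t utf8_decode(const unsigned char *s, unsigned int *cp)
{
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
        *cp = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80) {
        *cp = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
        return 3;
    }
    return 0;
}

/* Partial unicode -> WinAnsi mapping; returns -1 for anything not covered. */
int winansi_from_unicode(unsigned int cp)
{
    if (cp >= 0x20 && cp <= 0x7E) return (int)cp;  /* printable ASCII */
    if (cp >= 0xA0 && cp <= 0xFF) return (int)cp;  /* same as Latin-1 */
    if (cp == 0x20AC) return 0x80;                 /* Euro sign */
    return -1;
}

/* Convert a UTF-8 string to WinAnsi bytes. Returns the number of bytes
 * written, or -1 if the input is malformed, unmappable, or does not fit. */
int utf8_to_winansi(const char *utf8, unsigned char *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)utf8;
    size_t n = 0;
    assert(utf8 != NULL && out != NULL);
    while (*s) {
        unsigned int cp;
        size_t used = utf8_decode(s, &cp);
        int b;
        if (used == 0) return -1;
        b = winansi_from_unicode(cp);
        if (b < 0 || n >= cap) return -1;
        out[n++] = (unsigned char)b;
        s += used;
    }
    return (int)n;
}
```

Going via glyph ids instead keeps the unmappable case open: a character missing from WinAnsi can later be handled with a different font descriptor rather than failing outright.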
paulgardiner I wonder if I can trick the interpreter into creating the fz_text objects for me.13:43.23 
tor7 don't :)13:44.08 
paulgardiner I think I'd need an appearance stream. Now where can I get one of those from? :-)13:45.21 
  It does (almost) make sense to use sprintf to create an appearance stream which then is passed to part of the interpreter to produce an fz_text object which then is passed to the pdf device. The advantage over just using sprintf is that we get to use the pdf device to mix text and line art.13:47.31 
tor7 the pdf interpreter can't do utf-8 though13:49.51 
paulgardiner True. And in any case it seems very kludgy13:50.47 
tor7 the more I think about it, the more I think that just using the freetype calls to do layout and encoding should "just work" on the annotation creation side of things13:52.20 
  maybe do a double check with freetype that the font actually has a unicode encoding table first, and if failed, load up helvetica13:53.00 
paulgardiner Yeah. Doesn't sound too bad.13:53.10 
tor7 (and also check that it isn't a substitute font)13:54.17 
  robin_watts: ping14:09.14 
Robin_Watts pong14:09.22 
tor7 I've looked through the progressive patch except the page and xref bits14:09.40 
  1) ok should be spelled okay14:10.00 
Robin_Watts okay14:10.26 
tor7 2) the linear and progressive flags in pdf_document... what if we can do both progressive and random access?14:10.52 
  like the byte range loading from an http stream14:11.07 
  (mind you, I haven't read the implementation, just a thought that occurred when seeing the enum)14:11.38 
Robin_Watts tor7: flags word, innit.14:11.39 
  That would have been more obvious with an earlier version of the patch that had SEEKABLE=4.14:12.15 
tor7 right.14:12.26 
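A flags word lets a document be both progressive and seekable at once, which a plain enum of states cannot express. A tiny sketch in C follows; only SEEKABLE=4 comes from the discussion, while the other names and values are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical flag bits; only the value 4 for "seekable" is from the log. */
enum {
    FILE_LINEAR      = 1,
    FILE_PROGRESSIVE = 2,
    FILE_SEEKABLE    = 4
};

/* Returns non-zero when every bit in 'wanted' is set in 'flags'. */
int has_flags(int flags, int wanted)
{
    assert(wanted != 0);
    return (flags & wanted) == wanted;
}
```

A byte-range HTTP loader, for instance, could then report `FILE_PROGRESSIVE | FILE_SEEKABLE`.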
  3) fz_stream_meta ... default return value -1 ? wouldn't 0 be better.14:12.42 
  I'm not sure I like the _meta function anyway14:13.16 
Robin_Watts Hmm. Actually, I don't think I need PDF_FILE_STATE_PROGRESSIVE after all.14:13.41 
  it's set and then never consulted.14:13.48 
tor7 4) progressive XPS? maybe not in this patch, but xps could be done progressively by reading the zip entries from the start with no random access14:14.14 
  we just won't know when we can discard stuff14:14.25 
  I'd have thought the progressive state is implicit in the fz_stream14:14.52 
Robin_Watts Having meta return -1 by default is so that for future expansion, all 'unknown' meta keys return -1.14:15.02 
  and so we can easily distinguish 'not supported' from boolean return values.14:15.21 
tor7 okay.14:15.37 
Robin_Watts I like the 'meta' scheme. It has served me well in the past.14:15.56 
tor7 I would prefer one function per meta state though, but I guess it depends on where you draw the line14:16.18 
Robin_Watts But we could just have an 'is_progressive' flag (or function) on every fz_stream.14:16.22 
tor7 I mean, all functions could be done by one big huge meta "switchboard" function if you go to extremes14:16.37 
Robin_Watts tor7: The nice thing about doing it like this is that when we have chains of streams, you can pass the meta functions through.14:17.15 
  but there are several ways we could code this.14:18.04 
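One possible shape for the 'meta' scheme Robin_Watts describes, with unknown keys returning -1 and queries passed down a chain of streams, is sketched below in C. The `stream` struct, the key names, and `stream_meta` are all hypothetical and are not the actual fz_stream API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical meta keys. */
enum { META_PROGRESSIVE = 1, META_LENGTH = 2 };

typedef struct stream stream;
struct stream {
    stream *chain;                       /* underlying stream, or NULL */
    int (*meta)(stream *stm, int key);   /* optional handler, may be NULL */
};

/*
 * Ask a stream about 'key'. Each stream in the chain gets a chance to
 * answer; a handler returns -1 for keys it does not understand, and the
 * query is then passed through to the stream underneath. If nobody
 * answers, the result is -1, meaning "not supported".
 */
int stream_meta(stream *stm, int key)
{
    assert(stm != NULL);
    for (; stm != NULL; stm = stm->chain) {
        if (stm->meta != NULL) {
            int r = stm->meta(stm, key);
            if (r != -1)
                return r;
        }
    }
    return -1;
}

/* Example handler for a source that is fed progressively. */
int progressive_source_meta(stream *stm, int key)
{
    (void)stm;
    return key == META_PROGRESSIVE ? 1 : -1;
}
```

Because -1 is reserved for "not supported", callers can still tell it apart from a boolean 0 or 1 answer, and a filter stream that knows nothing about a key needs no code at all to forward it.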
tor7 reminds me of 
Robin_Watts 4) Nothing I've done here precludes its use with XPS.14:19.21 
tor7 5) the biggest talking point is the font handling14:19.43 
Robin_Watts the general TRYLATER approach should work (though we may need to tweak the cleanup paths)14:19.52 
  tor7: I do font handling?14:20.16 
tor7 yeah, xps cleanup is very flaky so I'm sure progressive XPS will show up a lot of errors there14:20.21 
  the way you create a fallback font when font loading fails14:20.32 
  it's not going to give useful results on non-standard encodings14:20.52 
Robin_Watts tor7: is that a problem?14:21.04 
tor7 what could work though, if the file loading fails (trylater), is to treat the font as a substitute font temporarily14:21.18 
Robin_Watts Think of it as 'greeking' :)14:21.27 
tor7 right!14:21.43 
Robin_Watts tor7: OK, we're getting into areas where I'm hazy again...14:22.01 
tor7 so we have two possible fail states when loading fonts. not sure if it's worth bothering with.14:22.02 
  if the font descriptor (pdf_obj stuff) fails to load, we don't have any encoding tables or metrics or anything useful14:22.29 
  this is where greeking would be fine14:22.38 
  if the font descriptor loads, but the embedded font file fails...14:22.52 
  we have a way to deal with that already -- substitute fonts.14:23.03 
  I guess the problem is recovering cleanly once we can reload the font and have the complete thing.14:23.23 
Robin_Watts tor7: if the font descriptor fails to load, we don't go looking for the actual font.14:23.26 
tor7 the font file is loaded early on in the font descriptor loading14:23.46 
  if that fails with TRYLATER we could handle that the same way as we handle a missing font file14:24.07 
Robin_Watts ah, right. This is load_font_or_sub14:24.09 
  I'd entirely wiped the memory of that from my mind.14:24.17 
tor7 yes. I'm talking about inside load_font14:24.21 
Robin_Watts How about we go with what I have for now, and then someone with a clue about this area can tweak that function ? :)14:24.44 
tor7 pdf_load_font_descriptor calls pdf_load_embedded_font14:24.53 
  if that fails with TRYLATER you call rethrow_if14:25.39 
  but we could possibly set some magic flags and proceed by handling it as a substitute font14:25.54 
Robin_Watts I'm sure it could be handled better.14:26.10 
tor7 but that'd mean messing with the font store and "reloading" an existing font14:26.13 
Robin_Watts Yeah, everything so far avoids incomplete stuff going into the store, I believe.14:26.44 
tor7 your greeking approach is fine, but I'd rename the function pdf_load_font_or_greek14:27.20 
  load_font_or_sub is a lie14:27.27 
Robin_Watts load_font_or_hail_mary14:27.40 
tor7 that'd do :)14:27.45 
  also load_fallback_font should be pdf_load_hail_mary_font14:28.16 
Robin_Watts ok.14:28.40 
  actually, this patch is much smaller than my memory had said it was. I guess that's a good thing.14:29.04 
tor7 the implementation of load_fallback_font is fine. a NULL dict, and base14 font name, will get you sane defaults in the pdf_font_descriptor14:30.21 
  you might want to cache it though14:30.40 
  creating fonts is not cheap and is fairly heavy on memory14:30.55 
  (all the font descriptor encoding tables and metric tables that need to be allocated and filled out)14:31.19 
  and "Times" might be a better choice than Helvetica14:32.02 
Robin_Watts tor7: hmm. Where can I cache it ?14:32.05 
  Helvetica chars are narrower than times, typically, right?14:32.20 
tor7 in the store, using a "hail mary" key14:32.21 
Robin_Watts Ah, right.14:32.29 
tor7 robin_watts: shouldn't matter too much, the layout will use the widths from the font descriptor14:32.53 
Robin_Watts Picking a font with chars that are too narrow gives results you can still read.14:33.09 
tor7 it's only if a text object is reset to an absolute position that it'll collide14:33.09 
Robin_Watts picking a font with chars that are too wide results in overlaps and unreadability.14:33.28 
tor7 a hail mary font will not be using the correct metrics, so the text will also be too wide14:34.00 
  so the line widths will be wrong as well14:34.22 
  but yes, if there is absolute positioning, using a narrower font is much better14:34.41 
  and it's bound to be jarring when the real font is loaded anyway14:35.00 
Robin_Watts tor7: OK. I will make those changes, thanks.14:39.47 
tor7 my brain is too mushy to check the rest of the changes. I'll do that tomorrow if that's okay?14:40.19 
Robin_Watts sure. thanks for looking so far.14:42.03 
  vs: pd_okay15:32.12 
  I put it to you that you are outvoted :)15:33.04 
  "No bloat", right? :)15:34.25 
  mvrhel_laptop: ping17:00.44 
mvrhel_laptop robin_watts pong17:01.04 
Robin_Watts I've just pulled in lcms 2.5 and have got 9000 or so differences.17:01.05 
mvrhel_laptop oh my17:01.10 
Robin_Watts The bmpcmp is in my area now.17:01.13 
mvrhel_laptop sounds like you have some work to do ;)17:01.24 
  robin_watts these are from lcms? starting from tests/Ghent_V3.0/020_CMYKSpot_OP_x1a.pdf.pam.72.0?17:02.55 
Robin_Watts oh. I'm so dumb.17:03.20 
  Let me pull in the recent gs changes :)17:03.30 
mvrhel_laptop np17:04.12 
  have to leave in a bit to see kids show off their improved swimming skills as today is the end of swim lessons17:05.11 
Robin_Watts np. Hopefully this should go through with no real changes.17:05.29 
SpNg I'm reading the GS9_Color_management.pdf regarding the introduction of ICC profiles. Is it possible to convert an .eps file from RGB colorspace to CMYK colorspace using .icc profiles?17:44.10 
Robin_Watts mvrhel_laptop: ok, 4000 differences now.17:57.53 
  These seem much more plausible though. I will do a run where I exclude the halftoned ones.18:02.32 
tor7 robin_watts: none of those OK are by me!18:11.35 
  but fine, we have no occurrences of "okay" anywhere...18:11.57 
Robin_Watts :)18:26.46 
mvrhel_laptop robin_watts: I don't see any that I have an issue with in a quick view19:21.23 
  grabbing some lunch now19:22.14 
vtorri hmmm19:36.44 
  libgs is multithreaded ?19:36.52 
kens SpNg if you mean produce a CMYK EPS from an RGB EPS, then the answer is no, not yet.19:39.39 
  vtorri, it might be, it depends what you mean and how you configure Ghostscript19:40.35 
SpNg kens: that is what I was referring to.19:41.12 
kens SpNg then the answer is no currently. We don't have a device which produces EPS which uses high level colour management. The current EPS device emits everything as RGB.19:41.53 
  You can have PostScript output, but not EPS19:42.09 
SpNg kens: ok. is it in the works, or in the distant future?19:42.40 
kens SpNg at some point we want to extend ps2write to produce EPS, at which point we can finally kill pswrite. At that point it will be possible to produce an EPS in a specific colour space. No promises on dates.19:43.41 
SpNg kens: great! thanks for the response19:47.14 
kens No problem19:47.26 
Gigs- ouch, segfault when res=144 but 145 and 143 are OK20:43.31 
  that's a really specific bug20:43.37 
  if someone could mark the attachment private as usual... robin_watts?20:51.08 
SpNg can ghostscript do conversions from RGB to CMYK values using an ICC profile? I'm looking to convert input values, not raster images or vector files22:32.38 
Robin_Watts Gigs-: Someone beat me to it I think.22:47.05 
  SpNg: Yes. Ghostscript will use icc profiles to convert between different colorspaces.22:47.35 
  BUT... pdfwrite and such devices don't use that yet.22:48.02 
  or at least there are cases where they don't use it yet.22:48.13 
SpNg Robin_Watts: I'm trying to do it with values. I would like to use say 3 floats for RGB, and have them convert to 4 floats in CMYK22:48.17 
Robin_Watts If you use gs to render an input file containing CMYK values as an RGB bitmap you'll get the conversion done.22:48.54 
  likewise RGB to CMYK.22:49.01 
SpNg how would I extract the values so I could use them programmatically? 22:50.26 
  something similar to this post, but this is in java: 
Robin_Watts SpNg: You would use a color management system, like lcms2 to do it for you.22:53.12 
SpNg Robin_Watts: beautiful. Thank you for pointing me in the right direction. ;-)22:54.32 