Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

	2013/07/18
marcosw	is there anybody about who can answer a 'make so' question? specifically, why is gs.o included in soobj/ldt.tr? this results in a duplicate .main symbol under AIX. removing gs.o from soobj/ldt.tr fixes the problem and creates a .so that can be called from gsc.	01:27.47
henrys	marcosw:but this works okay on other platforms right?	02:45.05
	so there are two symbols with the same name in the app and the lib, which works okay if the linker deals with it. With dynamic linking there shouldn't be a duplicate, right? static would be an issue.	02:47.31
marcosw	henrys: thanks for response. under AIX the duplicate main symbol in the shared library gives a warning, but it does work.	06:36.41
vtorri	chrisl: ping	06:38.50
chrisl	vtorri: pong	06:39.23
vtorri	hey	06:39.28
ghostbot	niihau	06:39.28
vtorri	question about std* redirection	06:39.41
chrisl	vtorri: I'm probably the wrong person to answer that.....	06:39.59
vtorri	if i don't use buf, the callback must return 0, right ?	06:40.03
	i mean the callbacks set with gsapi_set_stdio()	06:40.44
chrisl	You mean if you just discard the contents of the buffer?	06:42.11
vtorri	yes	06:42.34
	just "return 0;"	06:42.57
	is it correct ?	06:43.06
chrisl	No, I think you want to return the number of bytes in the buffer - otherwise the calling code will think you haven't done anything, and may continue calling until the buffer is written out	06:44.37
vtorri	so "return len;"	06:44.57
chrisl	Yes (thanks, was just looking for the name of the parameter!)	06:45.17
vtorri	on the contrary, if I do	06:45.35
	printf(buf);	06:45.41
	i return 0, right ?	06:45.47
	actually, i do something equivalent to printf (it's our logging stuff, but in the end, it calls printf(buf))	06:46.49
chrisl	Sorry, I don't follow - printf returns the number of bytes written out	06:48.25
	So just returning zero could potentially be seen as an error condition	06:48.48
vtorri	the problem is that our logging system does not return the number of bytes written	06:49.24
chrisl	Okay, with printf() you're probably okay	06:50.14
vtorri	i call	06:50.47
	INF(buf);	06:50.51
	which is our log macro	06:51.02
	there is no returned value	06:51.18
	internally, it calls printf(buf)	06:51.39
chrisl	vtorri: okay, so this is purely in the stdout/stderr callback?	06:51.46
vtorri	but i do not have an access to the value returned by printf	06:51.59
	yes	06:52.04
	i will set stdin callback to NULL	06:52.33
	btw, i do set it to NULL :)	06:52.50
chrisl	Okay, you will never get printf style formatted strings through that callback, we expand those ourselves, so the length of the buffer will be the number of bytes written	06:53.08
vtorri	in my case, i can assume that all the buffer is written	06:57.34
	so :	06:57.36
	http://codepad.org/f0rxo3R3	06:57.46
	right ?	06:57.48
chrisl	vtorri: well, sort of. buf is not guaranteed to be null terminated	07:00.49
vtorri	ouch	07:00.56
chrisl	although in practice, I think it usually is.	07:01.24
vtorri	ok, thanks	07:01.25
	so malloc + memcpy + [len]=0 is preferable ?	07:01.46
chrisl	yes, I would say so	07:02.18
vtorri	thanks	07:02.26
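A minimal sketch of the callback discussed above: copy the buffer, terminate it, hand it to the logger, and return len. The (handle, buffer, length) shape follows the stdout/stderr callbacks registered with gsapi_set_stdio(); the names my_stdout_cb and log_line are hypothetical, log_line stands in for the INF() macro, and the calling-convention macro from the real headers is left out so the block compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the application's logging macro (INF() in the
 * chat). It expects a NUL-terminated string and returns nothing. */
static void log_line(const char *msg)
{
    /* fputs, not printf(msg): the text must not be treated as a format. */
    fputs(msg, stdout);
}

/* stdout/stderr-style callback. buf is not guaranteed to be NUL-terminated,
 * so it is copied into a terminated buffer before logging. */
static int my_stdout_cb(void *caller_handle, const char *buf, int len)
{
    char *copy;

    (void)caller_handle;            /* unused in this sketch */
    if (len <= 0)
        return 0;

    copy = malloc((size_t)len + 1);
    if (copy == NULL)
        return len;                 /* assumption: drop the text, report it consumed */

    memcpy(copy, buf, (size_t)len);
    copy[len] = '\0';
    log_line(copy);
    free(copy);

    return len;                     /* tell the caller every byte was handled */
}
```

Returning len even when the allocation fails is a design choice for this sketch: it keeps the caller from retrying the same buffer, at the cost of losing that piece of output.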
marcosw	morning chrisl	07:02.28
chrisl	morning marcosw	07:02.50
vtorri	chrisl: in the gsapi_set_stdio() doc, the returned value of the function is not described. Is it normal ?	07:03.31
	not the returned value of the callbacks	07:03.47
	the returned value of the function itself	07:03.58
chrisl	vtorri: it's a normal Ghostscript integer error return value - < 0 is an error, >=0 is fine	07:05.13
vtorri	ok	07:05.25
	thanks	07:05.28
chrisl	vtorri: I thought the integer return values were explained somewhere, not sure where right now, though	07:06.08
	marcosw: So, the linking thing - it sounds like you're okay on that?	07:06.20
vtorri	i don't see it in iapi.h	07:06.26
	ha, it's in ierrors.h	07:06.49
chrisl	vtorri: I'll some words to API.htm just to say "unless otherwise stated integer returns are those in ierrors.h" or something	07:07.59
	s/some/add some	07:08.12
vtorri	ok	07:08.35
Robin_Watts	tor7: Morning	08:32.34
tor7	morning	08:32.55
Robin_Watts	I just pushed an updated robin/progressive. Now appears not to leak, and still passes all the cluster tests.	08:32.55
tor7	okay, will look	08:33.06
Robin_Watts	I hope it's getting towards commitability. Thanks.	08:33.09
	chrisl: While tor7 looks over the progressive stuff, I'm going to look at pulling lcms2.5 into gs and extracting our changes into a separate plugin for it.	08:46.48
chrisl	Robin_Watts: okay, that's fine. Is there anything new of note in 2.5?	09:07.39
Robin_Watts	chrisl: All our fixes :)	10:30.54
	but not our speedups.	10:30.59
chrisl_r61	Robin_Watts: hmm, I wonder why Marti doesn't want the speedups.....	10:31.24
Robin_Watts	chrisl_r61: He thinks that they would be better done as a plugin.	10:32.00
	which is what I am going to try to recast them as this afternoon.	10:32.14
	He doesn't like my chameleonic header file to generate optimised transform code.	10:32.30
chrisl_r61	Ah, I can understand that, but still.....	10:32.54
Robin_Watts	http://www.cs.berkeley.edu/~necula/cil/cil016.html	10:37.50
chrisl_r61	Seems like a good guide on how not to write C!	10:39.47
paulgardiner	tor7, robin_watts: There's still something that feels awkward with the font stuff we discussed yesterday. Each font descriptor is created twice. Firstly a font descriptor has to be created so that we can call pdf_load_font to create (indirectly) the fz_font object to use in the fz_text objects. Secondly the pdf-write device creates a font descriptor. It'll certainly do for now, but I can't...	10:52.13
	...help feeling that either the pdf device should be able to find the initially created descriptor, or I should be seeding the resource (passed to the device) with the descriptors and the device should be able to do matching.	10:52.15
Robin_Watts	Yes, that does sound sub-optimal.	10:55.25
	When I've come up against stuff like this before, I've often had to refactor bits of mupdf.	10:56.10
	like functions used to be a pdf level thing, and I promoted them to a fitz level thing.	10:56.34
	similarly with shadings.	10:56.45
	I wonder if we could promote font descriptors from being a pdf level thing to being a fitz level thing.	10:57.10
	Then we could have a fz_font->get_font_descriptor function.	10:57.48
	which would either get the original font descriptor, or would make us one.	10:58.19
	or maybe we need both an fz_font_descriptor and a pdf_font_descriptor or something.	10:58.45
tor7	paulgardiner: would a function to directly create a fz_font (for use with fz_text objects) for the base14 fonts be useful?	11:04.05
	calling pdf_load_font to get the fz_font of an existing font you want to reuse seems logical to me, though	11:04.39
Robin_Watts	tor7: The problem, AIUI, is for the device to get a font descriptor from a font.	11:04.50
	though such a function might be useful for app writers.	11:05.03
tor7	the pdfwrite device should, from that fz_font, either rediscover the existing font descriptor if it matches our narrow requirements, or make a new one	11:05.13
	in the general case, we can't know from the pdfwrite device that we're both creating a stream in an existing file and expect to reuse an existing font descriptor	11:06.05
paulgardiner	I'm not sure whether we necessarily need the descriptor. I know I need the internal name that refers to the descriptor	11:06.18
tor7	the interpreter and output device are separated by the device interface, and should not cross-contaminate IMO	11:06.49
Robin_Watts	tor7: The pdfwrite device can be expected to know details of the document it is working within.	11:08.08
tor7	robin_watts: I disagree about pdf_font_descriptor moving to the fz layer. they're an interpreter only thing.	11:08.15
Robin_Watts	it just can't expect to know those details via the device interface.	11:08.25
paulgardiner	tor7: I did wonder about a function to create an fz_font for base fonts. I dismissed it because I thought I needed the internal name to be in the fz_font obj, but actually it should be the name of the base font, so I think it would work.	11:09.02
tor7	robin_watts: the resource machinery in the pdfwrite device has the responsibility to create (or rediscover) font descriptors	11:09.07
Robin_Watts	tor7: right.	11:09.15
tor7	all it sees are fz_font and glyphs, without the icky encoding and metric crap carried over from the pdf interpreter	11:09.53
	given a fz_font it should be able to create a font descriptor for use; and I was thinking it could also look for an existing one that would look just like one it would create itself and use that if found	11:10.44
Robin_Watts	Where would it look for such an existing one?	11:11.41
paulgardiner	In the resource dict?	11:11.58
tor7	in order to get resource reuse, it'd have to look through existing resource dictionaries (and maybe some pdfdevice internal list)	11:12.00
Robin_Watts	tor7: right. And those existing resource dictionaries come from the existing document.	11:12.46
tor7	basically iterate through the fonts in the resource dict, if the fz_font matches and it follows our criteria about encodings reuse it, else create a new font descriptor and insert	11:12.48
Robin_Watts	or are at least seeded from it.	11:13.03
tor7	I don't think we should scan the entire document for font descriptors, just the current page's resource dict	11:13.54
paulgardiner	tor7: we get to pass a resource dict to the device	11:14.21
	So we could restrict the search to that	11:14.32
tor7	paulgardiner: when creating the pdfwrite device you mean?	11:14.43
paulgardiner	tor7: yes	11:14.49
tor7	paulgardiner: right. that'd be the page resource dict usually, I believe.	11:15.05
	or an Xobject resource dict	11:15.15
Robin_Watts	The resource dict we pass in when creating the device is (If my crap memory serves) the current resource dict for the thing we are writing.	11:15.20
tor7	the one you'd want to use anyway :)	11:15.35
paulgardiner	tor7: certainly the latter when I've used the device	11:15.36
tor7	and the one we want to insert newly created font descriptors in	11:15.50
Robin_Watts	But arguably, when it comes to fonts, we want to pass in other resources from the same file, and say "feel free to pull stuff in from here too".	11:15.57
tor7	robin_watts: yes, we could make use of a list of previously spotted font descriptors and insert those in the current resource dict as well	11:16.41
Robin_Watts	otherwise we could end up creating a new font descriptor for every annotation on a page, say.	11:16.52
paulgardiner	That's what I've been referring to as "seeding"	11:17.20
Robin_Watts	tor7: At the moment, I think we probably create a pdfwrite device, use it, then destroy it.	11:17.31
	hence the ability to "remember" font descriptors internally (and indeed images, shadings etc) is restricted to within a single stream.	11:18.01
tor7	I guess we should extend the pdfwrite device to have some out-of-band non-device calls for this use	11:18.02
Robin_Watts	quite possibly.	11:18.09
tor7	or have a separate pdfwritestate object that the devices share, and is hooked up to the pdf_document that is being edited or created	11:18.54
paulgardiner	I'm slightly lost now. Isn't the existing ability to pass a seeded resource sufficient?	11:19.06
tor7	paulgardiner: actually, I think it might be. if you share the same resource dictionary for all annotations you create.	11:19.29
paulgardiner	It seems nice to me to have the flexibility to share or not depending on use, and I'd have thought the passed-in resource allows for that.	11:21.06
Robin_Watts	paulgardiner: AIUI, the resource we send in is just "the resource dictionary for this stream".	11:21.33
paulgardiner	Hmmm. Yes it is.	11:21.56
Robin_Watts	If you 'preseed' that, you end up with potentially having lots of stuff in resource dictionaries that isn't used.	11:21.57
paulgardiner	I may be going around in circles, but:	11:22.35
Robin_Watts	I think I prefer the idea of passing in both "the current resource dictionary" for the thing to add to as required, plus "some other resources you might want to make use of".	11:22.47
paulgardiner	It would be handy if either I could put in just the font-descs I wanted into the resource dict, or if the device could add them as needed (using existing one somehow known from the fz_font obj)	11:23.48
	robin_watts: oh okay	11:24.16
Robin_Watts	Ah fz_fonts tied to a doc ?	11:24.30
	s/Ah/Are/ ahem.	11:24.39
tor7	robin_watts: so, a global resource dictionary that has it all, and then create a stripped minimal resource dict that has just the stuff used in a given stream?	11:24.57
paulgardiner	I think type 3s are tied	11:25.00
tor7	type3 should be independent once they end up in the fz_device interface	11:25.34
	some of the bits are dependent, but they're only used by the interpreter and converted to other device calls	11:26.06
	any type3 font you get in a fz_text object has a display list per glyph you can use	11:26.30
Robin_Watts	tor7: I wouldn't like to mandate that all callers MUST create a global resource dictionary, but I'd like the device to be able to make use of it if it was there.	11:26.45
paulgardiner	Just to make sure I understand: so we choose this approach because we think it would be messy to make descriptors derivable from fz_font objects, but we think it should be possible to match an fz_font object to a known descriptor	11:27.02
tor7	robin_watts: yeah, that's what I meant when I was babbling about a pdfwrite internal list of font descriptors	11:27.22
Robin_Watts	paulgardiner: That agrees with my fuzzy understanding.	11:27.34
	tor7: right. My only qualm with that is that currently we create a device, use it, throw it away. hence any internal list that gets built up wouldn't be long lived.	11:28.10
	perhaps we can think of a way to persist such an internal list across pdf device creation/destruction though.	11:28.38
paulgardiner	We should be sure that's true, because if matching turned out to be just as hard, we are losing, for no gain, the simpler approach of not needing to pass a global resource	11:28.50
tor7	robin_watts: which is why I'm thinking I'd like a pdfwrite-state object that the devices use	11:28.51
	so you'd create a new pdfwrite device per stream you want to create	11:29.12
Robin_Watts	tor7: Right. Was about to suggest a pdf_write_resource list or something.	11:29.15
tor7	and then use some calls on the pdfwrite state object to insert the stream	11:29.32
Robin_Watts	and you'd pass the pdf_write_resource into each pdf device creation.	11:29.40
tor7	yeah, except I'd make it more abstract since we're likely to want to use it for more things later on	11:30.02
	like outlines and links, etc	11:30.16
Robin_Watts	tor7: ok.	11:30.31
	I was thinking that images/fonts/shadings would be enough (hence resource), but you may be right about outlines/links etc.	11:31.14
tor7	paulgardiner: we choose this approach because it's going to be messy to reuse a generic font descriptor (all the encoding crap in pdf gets in the way, and substitute fonts make for another mess).	11:32.19
	if by deriving you mean locating a font descriptor object number from a fz_font, that bit is easy enough (just load all the fonts in the resource dict, and do pointer comparisons of the fz_font)	11:33.10
paulgardiner	tor7: I'm not completely understanding why yet, other than the fact that it's difficult to fit with the interfaces.	11:33.28
tor7	if by deriving you mean creating a new font descriptor object from a fz_font, then yes, that can get messy but is something we must do anyway	11:33.37
	so basically, we only want to reuse a small subset of existing font descriptors.	11:34.08
paulgardiner	If I have an fz_text object, I can assume that the fz_font that it refers to and the descriptor that it came from are going to be appropriate for display of the text	11:34.16
tor7	and that subset is base14 + winansi encoding and no funny business	11:34.19
	paulgardiner: where did you get said fz_text object? if from the interpreter, there is no such guarantee.	11:34.55
paulgardiner	By "deriving" I mean we get the fz_font to internally remember the descriptor it came from	11:35.04
tor7	paulgardiner: that could also be an xps_font...	11:35.20
	IIRC we're talking about this because you want to know what /Name to put in the fill_text call in pdfwrite, and that name must come from an entry in the resource dictionary, and correspond to the fz_font you have. right?	11:36.38
paulgardiner	Yeah, in the case I'm dealing with.	11:37.23
tor7	so the simplest approach (broken, and only going to work in lucky cases) is to just look through the resource dict, load all the fonts, and compare pointers until you find one that matches	11:38.08
	but that will fail both when the fz_font doesn't exist in the resource dict yet, and when the font descriptor does funny business (which is true in >90% of cases when dealing with non-annotation text at least)	11:38.52
	my suggestion is to do the same looking, but also check the resulting font descriptor so it doesn't do funny business.	11:39.46
paulgardiner	I'm not looking for the simplest solution. I think we have that with the special function to create fz_fonts for base fonts.	11:39.49
tor7	if funny business, or no matching font, create and insert a new font descriptor for the fz_font	11:40.05
	creating a new font descriptor for base14 fonts is trivial, so I'd start there	11:40.23
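The lookup tor7 describes (scan the resource dict, compare font pointers, reject anything doing "funny business") could be sketched roughly as below. Every type and name here is a hypothetical stand-in, not a MuPDF structure, and the reuse criteria (base14, WinAnsi, no metric overrides) are one reading of the conversation rather than a settled rule.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are MuPDF types. */
struct font {
    int is_base14;                /* 1 for one of the 14 standard fonts */
};

struct font_desc {
    const struct font *font;      /* font this descriptor was loaded for */
    int winansi;                  /* 1 if the encoding is plain WinAnsi */
    int has_metric_overrides;     /* 1 if the widths deviate from the font */
};

struct resource_entry {
    const char *name;             /* resource name, e.g. "F1" */
    struct font_desc desc;
};

/* Return the resource entry that can be reused for font, or NULL when the
 * caller has to create and insert a new font descriptor instead. */
static const struct resource_entry *
find_reusable_font(const struct resource_entry *entries, size_t n,
                   const struct font *font)
{
    size_t i;

    for (i = 0; i < n; i++) {
        const struct font_desc *d = &entries[i].desc;

        if (d->font != font)
            continue;             /* pointer comparison, as in the chat */
        if (!font->is_base14 || !d->winansi || d->has_metric_overrides)
            continue;             /* "funny business": do not reuse */
        return &entries[i];
    }
    return NULL;
}
```

Returning the whole entry rather than just the descriptor gives the caller the resource name as well, which is the piece needed when writing the text operator.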
paulgardiner	The thing I'm still banging on about is (I think) Robin's suggestion of having (in some cases) fz_fonts remember details of how they were created.	11:40.55
tor7	in the fz_text call you have absolute positions of all the glyphs, you'll need to both look at the /Widths array and the implicit widths from the freetype font data to figure out how much you need to compensate with text metric adjustment commands in the output stream	11:41.33
paulgardiner	And then there is no matching because you can simply ask for the descriptor	11:41.35
Robin_Watts	paulgardiner: The problem there, is what if an fz_font has come from an xps file?	11:42.04
tor7	paulgardiner: if the font is one you got from a font descriptor in the file, all the descriptors etc will be cached in the pdf_store	11:42.04
Robin_Watts	It will have no pdf_font_descriptor.	11:42.16
tor7	so calling pdf_load_font in the pdfwrite device should be snappy	11:42.20
paulgardiner	robin_watts: yes so fz_fonts would need to be virtual and some would allow request for the descriptor some not	11:42.44
Robin_Watts	paulgardiner: And partly that was why I asked if fz_font's were tied to a doc.	11:43.12
tor7	pdf_font_desc pdf_find_font_descriptor(doc, dict, fz_font font)	11:43.20
	that can do the dict enumeration and pdf_load_font and pointer tests	11:43.39
	is that what you want?	11:43.50
Robin_Watts	tor7: What type is doc ?	11:44.06
tor7	pdf_document*	11:44.11
Robin_Watts	And that's the doc that the device is writing into ?	11:44.24
tor7	it'd have to return both the font descriptor, and dict name, I believe	11:44.25
Robin_Watts	makes sense.	11:44.30
tor7	yes.	11:44.32
Robin_Watts	(or at least it makes sense modulo my dodgy understanding of fonts)	11:44.53
tor7	rename the function, and it could also synthesize the font descriptor objects and insert a new entry into the resource dict.	11:45.15
	the pdf_font_desc has all the icky stuff you need to do with metrics as well	11:45.42
paulgardiner	I think with robin_watts's suggestion you could have pdf_font_desc pdf_find_font_descriptor(fz_font font)	11:45.54
tor7	pdf_font_desc_from_font()	11:46.01
	paulgardiner: yeah, but then we'd still be in the situation of needing to detect whether the pdf_font_desc we get is actually suitable for our purposes	11:46.32
	and belongs to the output document	11:46.40
	consider someone doing pdf2pdf using the device	11:46.51
paulgardiner	tor7: couldn't the pdf device just stuff it in the resources	11:47.16
tor7	then it'd have to copy files from the input document, bypassing the device interface. no thanks.	11:47.46
paulgardiner	I was thinking that really we want the pdf_obj back not the pdf_font_desc	11:47.52
tor7	you'll need the pdf_font_desc to deal with the metrics (/Widths arrays) when writing text	11:48.15
	and the encoding tables	11:48.22
	to match the glyph layout and kerning stuff from the fz_text object	11:48.54
paulgardiner	But isn't the pdf_font_desc derivable from the pdf_obj form of the descriptor. Isn't that where it came from?	11:49.09
tor7	by pdf_load_font yes	11:49.22
paulgardiner	I think I may misunderstand more of this than I thought.	11:49.32
Robin_Watts	paulgardiner: Perhaps we could put a pdf_obj * pointer into the pdf_font_descriptor ?	11:50.08
tor7	brb, getting dizzy. need lunch.	11:51.22
paulgardiner	Possibly, though perhaps this whole approach doesn't lead anywhere because however neatly it deals with my case we still need to handle the XPS font case	11:51.49
tor7	robin_watts: I won't object to that (pdf_obj of the 10 0 R form in the pdf_font_desc)	11:52.01
	I'll write up some text explaining my thoughts this afternoon	11:52.19
*Robin_Watts*	lunches (I think my "contribution" to this conversation is pretty much exhausted anyway!)	11:52.31
tor7	or ask miles to schedule a workshop of a few days where we can discuss this in person	11:53.03
paulgardiner	When you're both back can we briefly discuss the base 14 only solution because that's what I'd like to do for now.	11:54.05
	I'm hoping it still makes sense to go with yesterday's plan plus the new fz_font creating function for base fonts.	11:54.58
	If that still makes sense, I can do that fairly quickly I think	11:55.15
Robin_Watts	paulgardiner: That sounds fine to me, though I only understand it at the highest level.	12:36.20
tor7	robin_watts: skype voice chat maybe?	12:37.45
	paulgardiner: ^	12:37.48
Robin_Watts	tor7: If you like, though I don't necessarily feel I have much to contribute.	12:38.13
tor7	I can rant about the stupidity of fonts in PDF :)	12:38.37
paulgardiner	We could give it a go. I've just realised I may need to externally create a descriptor in any case because I have to add a DA (default appearance) string to the annotation and that can't sensibly be generated by the pdf device	12:40.06
tor7	paulgardiner: http://ghostscript.com/~tor/stuff/fonts.txt some incoherent ramblings	12:44.46
	paulgardiner: so to start, you want to just create base14 font descriptors and not reuse existing font descriptors?	12:47.24
paulgardiner	Yes. That'll do fine for now	12:47.47
	So does yesterday's plan still fit with today's? If I carry on in that style, will I be going against what we've decided today?	12:49.00
Robin_Watts	I think getting something (anything!) that works without too much trouble has to be a step forward.	12:49.30
tor7	remind me of yesterday's plan	12:49.35
paulgardiner	tor7: I was hoping you wouldn't ask that. :-)	12:49.56
tor7	paulgardiner: okay, let me paste today's requirements as I understand them	12:50.26
	interpreter load: from pdf_obj to pdf_font_desc (and by extension fz_font)	12:51.04
	interpreter layout: from pdf_font_desc + content stream to fz_font + fz_text	12:51.12
paulgardiner	I think it was "use the pdf-write device as is, but where it currently always creates Helvetica, have it be sensitive to the fz_font passed in, using the data pointer to decide which base font"	12:51.18
tor7	(this is the data flow of our structs)	12:51.21
	in pdfwrite we need: from fz_font + fz_text to pdf_obj + content stream	12:51.39
	at the highest level	12:51.42
	which need to be decomposed to these transforms:	12:51.58
paulgardiner	That's the trouble with IRC. I was joking when I said I didn't know what yesterday's plan was	12:51.58
tor7	from fz_font to pdf_font_desc	12:52.17
	here we can either reuse an existing font descriptor (if it matches our needs)	12:52.39
	or create a new one from scratch based on the fz_font	12:52.50
	from pdf_font_desc to pdf_obj	12:52.57
	so we have something to stick in the resource dictionary	12:53.04
	from fz_text and pdf_font_desc to content stream	12:53.10
	we need the pdf_font_desc to deal with encodings and metric layout	12:53.25
	we also need a way for you to create a fz_text object given a fz_font + unicode string IIRC	12:53.54
	(and I honestly forgot what yesterday's plan was, we've been back and forth so many times)	12:54.54
paulgardiner	I've just paraphrased it above	12:55.16
tor7	right. so currently it always creates a new font descriptor for Helvetica?	12:55.58
paulgardiner	The pdf device already creates descriptors but always Helvetica	12:56.06
	Yes	12:56.11
tor7	right. so we could start by extending that to the full set of base 14	12:56.21
paulgardiner	Yep. I think that was yesterday's plan	12:56.37
tor7	and error out if the fz_font doesn't match	12:56.38
paulgardiner	Or create Helvetica :-)	12:56.48
	Probably best to error out	12:57.01
tor7	given a fz_font based on a freetype font it should be fairly easy to create a font descriptor embedding the data	12:57.28
	but it means reading up and understanding pdf font descriptors...	12:57.47
paulgardiner	As an extension?	12:57.49
tor7	but one of the cases will be base14 so best to start there	12:58.02
paulgardiner	Right	12:58.08
tor7	from a fz_font we have four kinds of font descriptors we need to worry about:	12:58.39
paulgardiner	The one addition from today I was considering was to add the quick method to get an fz_font for a base font	12:58.43
tor7	1) base 14 non-embedded	12:58.47
	2) embedded	12:58.51
	3) substitute fonts (non-embedded non-base14)	12:59.03
	4) type3	12:59.05
paulgardiner	5) not PDF?	12:59.23
tor7	I don't understand that question.	12:59.40
paulgardiner	I thought we also needed to deal with fz_font objects from XPS files.... or do we not create fz_fonts for XPS	13:00.36
tor7	the 4 cases listed are what we may need to create	13:00.44
	regardless of the source	13:00.52
paulgardiner	Oh ok sorry	13:01.00
tor7	xps fonts will always end up in case (2) though	13:01.12
	they're always embedded :)	13:01.16
paulgardiner	Right	13:01.17
tor7	there are many more complications with font descriptors though, simple vs cid font for example.	13:01.53
	so preserving existing font descriptors is a much too difficult problem	13:02.15
paulgardiner	I was reading the whole thing as the incoming fz_font. The one that fz_fill_text finds in the fz_font attached to the fz_text	13:02.19
tor7	the fz_font exists in two flavors -- freetype backed, or type 3	13:02.42
	the freetype backed one has a flag whether it's a real font or whether it's an acting substitute font	13:02.57
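As a rough illustration of how the four descriptor kinds listed above map onto the two font flavors plus the substitute flag, here is a small classifier. The struct and its fields are hypothetical (this is not the fz_font layout), and the order of the checks is an assumption.

```c
/* The four kinds of font descriptor named in the discussion. */
enum desc_kind {
    DESC_BASE14,       /* 1) base 14, non-embedded */
    DESC_EMBEDDED,     /* 2) embedded font data (XPS fonts always land here) */
    DESC_SUBSTITUTE,   /* 3) substitute font, non-embedded non-base14 */
    DESC_TYPE3         /* 4) type 3 */
};

/* Hypothetical summary of a font; not the real fz_font struct. */
struct font_info {
    int is_type3;       /* type 3 font rather than freetype backed */
    int is_substitute;  /* freetype backed, but acting as a substitute */
    int is_base14;      /* one of the 14 standard fonts */
};

/* Decide which kind of descriptor would have to be written for a font. */
static enum desc_kind classify_font(const struct font_info *f)
{
    if (f->is_type3)
        return DESC_TYPE3;
    if (f->is_substitute)
        return DESC_SUBSTITUTE;
    if (f->is_base14)
        return DESC_BASE14;
    return DESC_EMBEDDED;   /* a real font that is not base14 must be embedded */
}
```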
paulgardiner	That's one of the things I'm still not completely understanding: "preserving existing too difficult"	13:03.19
tor7	the font descriptor in pdf lets you go from funny-pdf-encoding-that-you-have-no-idea-what-the-bytes-actually-mean to a glyph id	13:04.15
	and in the lucky cases, also to unicode	13:04.19
	it also has a table mapping from character code (in this funny unknown encoding) to glyph advance width	13:04.41
paulgardiner	But we are passing an fz_text object that presumably has a suitable string for the encoding	13:04.56
tor7	there is no way we can reliably go from a unicode string to a fz_text object from this information	13:05.09
	the fz_text object has raw glyph ids (encoding less) and maybe, just maybe, some unicode text that matches	13:05.50
	depending on how good the input data is	13:05.56
	a font in a pdf file does not have a way to go from glyph id to unicode. it may be possible by reverse lookups in the freetype backed font data. but not guaranteed.	13:06.33
paulgardiner	Still confused. We call fz_fill_text. We give it an fz_text object. However confusing the encoding is, the string in the fz_text object must match the encoding in the fz_font.	13:06.46
tor7	lots of embedded subset fonts have stripped out the necessary tables.	13:06.48
	there is no encoding in fz_text	13:07.01
	there is no encoding in fz_font	13:07.06
	they work on raw glyph ids	13:07.10
paulgardiner	Oh	13:07.11
	Right	13:07.16
	Of course	13:07.19
	There couldn't be. Silly me	13:07.24
tor7	the fz_text may have supplementary unicode text string, if we're lucky. but that's only for the text extraction device and searching.	13:07.59
paulgardiner	No no no.	13:08.22
tor7	now, in a subset of fz_fonts we can go from unicode <-> glyph and back	13:08.23
paulgardiner	What is in the fz_text string?	13:08.29
tor7	fz_text_item_s has x, y, glyph id, and unicode code point.	13:09.02
	the glyph id is used to find the glyph outline and render it	13:10.01
paulgardiner	Where glyph id makes sense only if the correct encoding is used?	13:10.13
tor7	the unicode code point is used for searching and copy&paste	13:10.14
	the glyph id is the index into the glyph data table in the font. it has no correlation with any encoding.	13:10.32
	a truetype font has a "glyf" table, a "loca" table and a "cmap" table	13:10.45
	the "cmap" table maps from unicode to glyph id	13:10.52
	the "loca" table maps from glyph id to offset in the "glyf" table	13:11.01
	the data in the "glyf" table is the vector outline	13:11.14
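A toy model of the three-table chain just described may help: cmap takes a unicode value to a glyph id, loca takes the glyph id to an offset, and the offset points into glyf. The structures are deliberately simplified and hypothetical; a real TrueType file uses binary subtable formats for cmap and has short and long variants of loca, none of which is modelled here.

```c
#include <stddef.h>

/* One unicode -> glyph id mapping; a toy stand-in for a cmap subtable. */
struct cmap_entry {
    unsigned long ucs;
    unsigned gid;
};

/* Toy font holding the three tables from the discussion. */
struct toy_font {
    const struct cmap_entry *cmap;   /* unicode -> glyph id */
    size_t cmap_len;
    const unsigned long *loca;       /* glyph id -> offset into glyf */
    size_t num_glyphs;
    const unsigned char *glyf;       /* outline data */
};

/* unicode -> glyph id. Returns 0 (conventionally the missing glyph)
 * when the font has no mapping for the character. */
static unsigned toy_cmap_lookup(const struct toy_font *f, unsigned long ucs)
{
    size_t i;

    for (i = 0; i < f->cmap_len; i++)
        if (f->cmap[i].ucs == ucs)
            return f->cmap[i].gid;
    return 0;
}

/* glyph id -> pointer to outline data, with no encoding involved at all. */
static const unsigned char *toy_glyph_data(const struct toy_font *f, unsigned gid)
{
    if (gid >= f->num_glyphs)
        return NULL;
    return f->glyf + f->loca[gid];
}
```

The second function is the part a content stream can reach directly: given a raw glyph id it never touches cmap, which is why a subset font can drop that table and still render.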
paulgardiner	Right	13:11.21
tor7	in PDF and XPS, the content streams may index the glyph id ("loca" table) directly, bypassing the "cmap"	13:11.39
	a /Identity-H encoded font in PDF for example	13:11.55
paulgardiner	I'd forgotten all the terms	13:11.56
tor7	or worse, a PDF font can use /Identity-H to get a number which is looked up in the /CIDToGIDMap which is then used to index the "loca" table	13:13.23
	and the embedded font can be a subset that doesn't have a "cmap" table	13:13.40
	and the /ToUnicode entry is optional	13:13.46
	which means there's absolutely no way we can go to or from unicode using this font	13:14.02
paulgardiner	For another product I implemented most of the font interpretation and encoding handling. I have known this stuff at one time, but it is a while ago.	13:14.04
tor7	but it prints correctly :)	13:14.08
	so we may need to extend the fz_font struct to allow it to have some metric and encoding capability	13:15.24
paulgardiner	Type 1s? Don't the encodings map onto names rather than glyph ids? Or am I misremembering that too?	13:15.35
tor7	the encodings map to names which map to charstrings via a dictionary	13:16.03
	in freetype, the charstrings are loaded into an array, and the names are mapped to an index in this array	13:16.26
paulgardiner	But what do you use for glyph id's in fz_text_item_s? ... oh an enumeration of the names maybe?	13:16.48
tor7	the freetype internal index	13:17.00
	in freetype, all glyphs are indexed by a numeric index	13:17.23
paulgardiner	For type 1s?	13:17.24
tor7	for type1's freetype will synthesize a numeric index for each glyph	13:17.37
	and give you mapping tables to go from name -> index and back	13:17.46
	glyph_id = FT_Get_Name_Index(face, string)	13:18.15
paulgardiner	Ah. You use the freetype thirdparty library in the interpretation of Type1s in PDF?	13:19.46
tor7	yes. we use freetype for all fonts, except type3.	13:20.00
paulgardiner	Oh nice. I wish we'd thought of that in the other project I mentioned.	13:20.29
	Ok. Now at last I understand why we can't reuse font descriptors... or well at least why it would mean also hiding extra encoding specific info in the fz_text_item_s	13:21.55
	The worrying thing is that now I realise I have yet another problem I hadn't considered at all. I need to construct a fz_text object from a utf8 string and an fz_font.	13:23.28
tor7	yeah. so if we have an existing font descriptor, in simple cases we could reuse it. like it being a known font (i.e. base14) with a known encoding (i.e. WinAnsi)	13:23.29
	and no metric overrides	13:23.34
paulgardiner	I hadn't realised that required layout	13:23.38
tor7	it needs both layout and encoding	13:23.48
paulgardiner	Yes	13:24.00
tor7	but if it's a base14 font (or a well formed embedded font, with unicode cmap tables) that can be done	13:24.14
	some subset fonts I think can skip the metrics tables as well, to save space	13:24.44
	since the PDF needs the /Widths array for layout anyway	13:24.53
paulgardiner	It is a huge amount of work for what can be done so simply with sprintf.	13:24.55
tor7	yeah.	13:25.47
	so anyway, with a proper fz_font we should be able to use freetype to both do the encoding and metrics	13:26.14
	but we probably want to add some functions to do it	13:26.27
paulgardiner	I guess the pdf device is going to find it difficult to work out that it doesn't need to do a move for each char	13:26.36
tor7	it needs to check how much the implicit move is, and write the delta if the next char is not where expected	13:27.08
paulgardiner	So we are going to end up with inefficient appearance streams	13:27.12
Robin_Watts	what tor7 said.	13:27.22
	but inefficient streams will do for a start.	13:27.36
paulgardiner	Yeah true	13:27.43
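The check tor7 describes (compare each glyph's position with where the previous glyph's implicit advance would have left the pen, and write a delta only on a mismatch) can be sketched roughly as below. This is a minimal illustration and not the pdf device's code: `struct glyph_pos`, `count_explicit_moves` and the tolerance `eps` are hypothetical names introduced for the example.

```c
/* Hypothetical stand-in for a positioned glyph; not the real fz_text_item_s. */
struct glyph_pos {
    float x, y;     /* where the glyph is placed */
    float advance;  /* implicit horizontal move after drawing it */
};

static float absf(float v) { return v < 0 ? -v : v; }

/* Count how many explicit moves a writer would need for n glyphs.
 * The first glyph always needs one; later glyphs need one only if they
 * are not where the previous glyph's advance left the pen. */
static int count_explicit_moves(const struct glyph_pos *g, int n, float eps)
{
    int moves = 0;
    float expect_x = 0, expect_y = 0;
    for (int i = 0; i < n; i++) {
        if (i == 0 || absf(g[i].x - expect_x) > eps || absf(g[i].y - expect_y) > eps)
            moves++; /* a real writer would emit the delta here */
        expect_x = g[i].x + g[i].advance;
        expect_y = g[i].y;
    }
    return moves;
}
```

With evenly advancing glyphs this needs a single move for the whole run; a stream that moves before every character is the "inefficient" case mentioned above.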
	So roughly how do I create fz_text for a utf8 string... just the base 14 case for now.	13:28.22
	?	13:28.25
tor7	two ways really	13:28.37
	either limit the charset to WinAnsi and use that	13:28.50
	or make a CIDFont font descriptor, and use Identity-H with a 16-bit encoding	13:29.01
	and write out the glyph ids	13:29.05
paulgardiner	WinAnsi	13:29.10
	fine	13:29.12
tor7	(but that's not going to work for base14, that needs embedding the font data)	13:29.19
	I'd go with WinAnsi for now	13:29.27
	you could figure out a character set to use dynamically, or make a separate font descriptor for each "code page" that you need	13:29.58
paulgardiner	I was more requesting a rough sequence of calls... although I guess I get that from inspecting the interpreter	13:30.12
	I'm happy for now if only ascii works	13:30.39
tor7	the calls don't exist :)	13:30.47
	when creating the fz_text I'd say use freetype to encode unicode values directly	13:31.16
	text_item->x = point.x	13:31.28
	text_item->y = point.y	13:31.33
	text_item->gid = FT_Get_Char_Index(font->ft_face, c)	13:32.03
	text_item->ucs = c	13:32.08
paulgardiner	It's point.x and point.y I was most unsure of	13:32.30
tor7	FT_Get_Advance(face, gid, ...)	13:32.38
paulgardiner	Oh okay	13:32.44
tor7	look in source/xps/xps-glyphs.c for an example	13:32.49
	then the pdfwrite device will have to worry about going from glyph id back to WinAnsi	13:33.16
	you also need to FT_Set_Charmap to pick the unicode cmap	13:34.02
	see xps-glyphs.c again	13:34.07
	the xps calls for this could possibly be moved into fz_ (if we make them handle type3 fonts as well)	13:35.07
	or at least not crash on type3	13:35.13
	xps_encode_font_char, xps_measure_font_glyph, et	13:35.33
	c	13:35.35
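As a rough sketch of the layout tor7 outlines (fill in x, y, gid and ucs per character, then step the pen by the glyph's advance), the loop might look like the following. Everything here is an assumption made for illustration: `struct text_item` only mirrors the field names used in the discussion, the `encode_fn` and `advance_fn` callbacks stand in for the FreeType lookups (`FT_Get_Char_Index`, `FT_Get_Advance`), and the toy font at the end exists only so the sketch can run without FreeType. It handles ASCII only.

```c
#include <stddef.h>

/* Hypothetical item type using the field names from the discussion. */
struct text_item { float x, y; int gid; int ucs; };

/* Stand-ins for the FreeType lookups; advance is in units of the font size. */
typedef int (*encode_fn)(void *font, int ucs);
typedef float (*advance_fn)(void *font, int gid);

/* Lay out an ASCII string left to right starting at (x, y).
 * Returns the number of items written (at most max). */
static size_t layout_ascii(void *font, encode_fn encode, advance_fn advance,
                           const char *s, float size, float x, float y,
                           struct text_item *out, size_t max)
{
    size_t n = 0;
    for (; *s && n < max; s++) {
        int c = (unsigned char)*s;
        int gid = encode(font, c);
        out[n].x = x;
        out[n].y = y;
        out[n].gid = gid;
        out[n].ucs = c;
        n++;
        x += advance(font, gid) * size; /* pen moves by the scaled advance */
    }
    return n;
}

/* Toy font for demonstration only: glyph id equals the code point and
 * every glyph is half an em wide. */
static int toy_encode(void *font, int ucs) { (void)font; return ucs; }
static float toy_advance(void *font, int gid) { (void)font; (void)gid; return 0.5f; }
```

For example, `layout_ascii(NULL, toy_encode, toy_advance, "Hi", 12.0f, 10.0f, 20.0f, items, 8)` places the two glyphs at x = 10 and x = 16 on the same baseline.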
paulgardiner	I still can't stop asking myself: is this the best way to go from <"Hello","Helv", 12> to "BT /Helv 12 Tf 0 g (Hello) Tj ET"?	13:36.15
tor7	paulgardiner: good question, but I think it is if you want to go via the fz_device layer	13:36.48
paulgardiner	Oh yes definitely. It's the use of the fz_device layer that I guess I'm questioning. It did show huge advantages for the other types of annotation.	13:38.11
	But I'm happy in the knowledge that similar alarm bells aren't ringing so loudly in your or robin_watts's ears.	13:39.28
Robin_Watts	paulgardiner: There may be a shorter route as a hack to get basic annotations working.	13:40.05
tor7	taking care of using base14 fz_font and only the base14 template font descriptor on the pdfwrite back end should short-cut the difficulties	13:40.46
Robin_Watts	but in the same way that reusing the existing work that had been done for pdf write helped the annotation stuff in the past, your annotation stuff is now in a position to help the more general pdf write.	13:40.53
tor7	and let us grow that into the full implementation that a generic pdfwrite device will need	13:41.06
paulgardiner	To be honest, I don't think it's a huge amount of work to get basic base 14 working this way. I was more concerned about the amount of processing on each use	13:41.27
tor7	and not get us into the problem where we have two separate text layout and font descriptor handling	13:41.29
	paulgardiner: basically it'll map from UTF-8 to glyphs to winansi, rather than just from UTF-8 to winansi	13:41.57
	and let us handle non-winansi cases down the line	13:42.03
paulgardiner	I wonder if I can trick the interpreter into creating the fz_text objects for me.	13:43.23
tor7	don't :)	13:44.08
paulgardiner	I think I'd need an appearance stream. Now where can I get one of those from? :-)	13:45.21
	It does (almost) make sense to use sprintf to create an appearance stream which then is passed to part of the interpreter to produce an fz_text object which then is passed to the pdf device. The advantage over just using sprintf is that we get to use the pdf device to mix text and line art.	13:47.31
tor7	the pdf interpreter can't do utf-8 though	13:49.51
paulgardiner	True. And in any case it seems very kludgy	13:50.47
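For comparison, the sprintf route paulgardiner keeps coming back to could look roughly like this. It is a sketch for plain ASCII text only, with hypothetical helper names (`escape_pdf_string`, `make_text_stream`); it assumes the font resource name needs no escaping and it does none of the encoding or layout work discussed above. The one detail it does handle is that `(`, `)` and `\` must be backslash-escaped inside a PDF literal string.

```c
#include <stdio.h>

/* Copy src into dst as the body of a PDF literal string, escaping
 * '(', ')' and '\'. Returns the length written, or -1 if it does not fit. */
static int escape_pdf_string(char *dst, size_t cap, const char *src)
{
    size_t n = 0;
    for (; *src; src++) {
        if (*src == '(' || *src == ')' || *src == '\\') {
            if (n + 1 >= cap) return -1;
            dst[n++] = '\\';
        }
        if (n + 1 >= cap) return -1;
        dst[n++] = *src;
    }
    if (n >= cap) return -1;
    dst[n] = '\0';
    return (int)n;
}

/* Build "BT /<font> <size> Tf 0 g (<text>) Tj ET" into dst.
 * Returns the length written, or -1 on overflow. */
static int make_text_stream(char *dst, size_t cap, const char *font,
                            float size, const char *text)
{
    char esc[256];
    int n;
    if (escape_pdf_string(esc, sizeof esc, text) < 0)
        return -1;
    n = snprintf(dst, cap, "BT /%s %g Tf 0 g (%s) Tj ET", font, size, esc);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

Calling `make_text_stream(buf, sizeof buf, "Helv", 12, "Hello")` yields the stream quoted earlier in the discussion.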
tor7	the more I think about it, the more I think that just using the freetype calls to do layout and encoding should "just work" on the annotation creation side of things	13:52.20
	maybe do a double check with freetype that the font actually has a unicode encoding table first, and if failed, load up helvetica	13:53.00
paulgardiner	Yeah. Doesn't sound too bad.	13:53.10
tor7	(and also check that it isn't a substitute font)	13:54.17
	robin_watts: ping	14:09.14
Robin_Watts	pong	14:09.22
tor7	I've looked through the progressive patch except the page and xref bits	14:09.40
	1) ok should be spelled okay	14:10.00
Robin_Watts	okay	14:10.26
tor7	2) the linear and progressive flags in pdf_document... what if we can do both progressive and random access?	14:10.52
	like the byte range loading from an http stream	14:11.07
	(mind you, I haven't read the implementation, just a thought that occurred when seeing the enum)	14:11.38
Robin_Watts	tor7: flags word, innit.	14:11.39
	That would have been more obvious with an earlier version of the patch that had SEEKABLE=4.	14:12.15
tor7	right.	14:12.26
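Robin's "flags word" answer means the states are independent bits that can be combined, so a document can be both progressive and seekable at once. A minimal sketch follows; the names and the values 1 and 2 are assumptions made for illustration, and only SEEKABLE=4 comes from the discussion.

```c
/* Hypothetical bit flags; only the value 4 for SEEKABLE is from the log. */
enum {
    FILE_STATE_LINEAR      = 1,
    FILE_STATE_PROGRESSIVE = 2,
    FILE_STATE_SEEKABLE    = 4
};

/* Bits are tested independently, so any combination can be expressed. */
static int is_progressive_and_seekable(int flags)
{
    return (flags & FILE_STATE_PROGRESSIVE) && (flags & FILE_STATE_SEEKABLE);
}
```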
	3) fz_stream_meta ... default return value -1 ? wouldn't 0 be better.	14:12.42
	I'm not sure I like the _meta function anyway	14:13.16
Robin_Watts	Hmm. Actually, I don't think I need PDF_FILE_STATE_PROGRESSIVE after all.	14:13.41
	it's set and then never consulted.	14:13.48
tor7	4) progressive XPS? maybe not in this patch, but xps could be done progressively by reading the zip entries from the start with no random access	14:14.14
	we just won't know when we can discard stuff	14:14.25
	I'd have thought the progressive state is implicit in the fz_stream	14:14.52
Robin_Watts	Having meta return -1 by default is so that for future expansion, all 'unknown' meta keys return -1.	14:15.02
	and so we can easily distinguish 'not supported' from boolean return values.	14:15.21
tor7	okay.	14:15.37
Robin_Watts	I like the 'meta' scheme. It has served me well in the past.	14:15.56
tor7	I would prefer one function per meta state though, but I guess it depends on where you draw the line	14:16.18
Robin_Watts	But we could just have an 'is_progressive' flag (or function) on every fz_stream.	14:16.22
tor7	I mean, all functions could be done by one big huge meta "switchboard" function if you go to extremes	14:16.37
Robin_Watts	tor7: The nice thing about doing it like this is that when we have chains of streams, you can pass the meta functions through.	14:17.15
	but there are several ways we could code this.	14:18.04
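One of the "several ways" to code the meta scheme is sketched below, assuming a hypothetical `struct stream` with an optional `meta` callback and a `chain` pointer to the underlying stream. None of these names are the real fz_stream API. Unknown keys fall through to -1, which stays distinguishable from a boolean 0 or 1, and a filter with no handler of its own passes the query down the chain.

```c
#include <stddef.h>

enum { META_NOT_SUPPORTED = -1 };   /* default answer for unknown keys */
enum { META_KEY_PROGRESSIVE = 1 };  /* hypothetical key */

struct stream {
    struct stream *chain;                     /* underlying stream, or NULL */
    int (*meta)(struct stream *stm, int key); /* optional handler */
};

/* Ask each stream in the chain until one recognises the key. */
static int stream_meta(struct stream *stm, int key)
{
    for (; stm; stm = stm->chain) {
        if (stm->meta) {
            int r = stm->meta(stm, key);
            if (r != META_NOT_SUPPORTED)
                return r;
        }
    }
    return META_NOT_SUPPORTED;
}

/* Example handler for a source stream that knows a single key. */
static int http_meta(struct stream *stm, int key)
{
    (void)stm;
    return key == META_KEY_PROGRESSIVE ? 1 : META_NOT_SUPPORTED;
}
```

A filter stream declared as `{ &source, NULL }` on top of a source declared as `{ NULL, http_meta }` answers 1 for `META_KEY_PROGRESSIVE` and -1 for any key that nobody in the chain knows.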
tor7	reminds me of https://en.wikipedia.org/wiki/Gestalt_%28Mac_OS%29	14:18.56
Robin_Watts	4) Nothing I've done here precludes its use with XPS.	14:19.21
tor7	5) the biggest talking point is the font handling	14:19.43
Robin_Watts	the general TRYLATER approach should work (though we may need to tweak the cleanup paths)	14:19.52
	tor7: I do font handling?	14:20.16
tor7	yeah, xps cleanup is very flaky so I'm sure progressive XPS will show up a lot of errors there	14:20.21
	the way you create a fallback font when font loading fails	14:20.32
	it's not going to give useful results on non-standard encodings	14:20.52
Robin_Watts	tor7: is that a problem?	14:21.04
tor7	what could work though, if the file loading fails (trylater), is to treat the font as a substitute font temporarily	14:21.18
Robin_Watts	Think of it as 'greeking' :)	14:21.27
tor7	right!	14:21.43
Robin_Watts	tor7: OK, we're getting into areas where I'm hazy again...	14:22.01
tor7	so we have two possible fail states when loading fonts. not sure if it's worth bothering with.	14:22.02
	if the font descriptor (pdf_obj stuff) fails to load, we don't have any encoding tables or metrics or anything useful	14:22.29
	this is where greeking would be fine	14:22.38
	if the font descriptor loads, but the embedded font file fails...	14:22.52
	we have a way to deal with that already -- substitute fonts.	14:23.03
	I guess the problem is recovering cleanly once we can reload the font and have the complete thing.	14:23.23
Robin_Watts	tor7: if the font descriptor fails to load, we don't go looking for the actual font.	14:23.26
tor7	the font file is loaded early on in the font descriptor loading	14:23.46
	if that fails with TRYLATER we could handle that the same way as we handle a missing font file	14:24.07
Robin_Watts	ah, right. This is load_font_or_sub	14:24.09
	I'd entirely wiped the memory of that from my mind.	14:24.17
tor7	yes. I'm talking about inside load_font	14:24.21
Robin_Watts	How about we go with what I have for now, and then someone with a clue about this area can tweak that function ? :)	14:24.44
tor7	pdf_load_font_descriptor calls pdf_load_embedded_font	14:24.53
	if that fails with TRYLATER you call rethrow_if	14:25.39
	but we could possibly set some magic flags and proceed by handling it as a substitute font	14:25.54
Robin_Watts	I'm sure it could be handled better.	14:26.10
tor7	but that'd mean messing with the font store and "reloading" an existing font	14:26.13
Robin_Watts	Yeah, everything so far avoids incomplete stuff going into the store, I believe.	14:26.44
tor7	your greeking approach is fine, but I'd rename the function pdf_load_font_or_greek	14:27.20
	load_font_or_sub is a lie	14:27.27
Robin_Watts	load_font_or_hail_mary	14:27.40
tor7	that'd do :)	14:27.45
	also load_fallback_font should be pdf_load_hail_mary_font	14:28.16
Robin_Watts	ok.	14:28.40
	actually, this patch is much smaller than my memory had said it was. I guess that's a good thing.	14:29.04
tor7	the implementation of load_fallback_font is fine. a NULL dict, and base14 font name, will get you sane defaults in the pdf_font_descriptor	14:30.21
	you might want to cache it though	14:30.40
	creating fonts is not cheap and is fairly heavy on memory	14:30.55
	(all the font descriptor encoding tables and metric tables that need to be allocated and filled out)	14:31.19
	and "Times" might be a better choice than Helvetica	14:32.02
Robin_Watts	tor7: hmm. Where can I cache it ?	14:32.05
	Helvetica chars are narrower than times, typically, right?	14:32.20
tor7	in the store, using a "hail mary" key	14:32.21
Robin_Watts	Ah, right.	14:32.29
tor7	robin_watts: shouldn't matter too much, the layout will use the widths from the font descriptor	14:32.53
Robin_Watts	Picking a font with chars that are too narrow gives results you can still read.	14:33.09
tor7	it's only if a text object is reset to an absolute position that it'll collide	14:33.09
Robin_Watts	picking a font with chars that are too wide results in overlaps and unreadability.	14:33.28
tor7	a hail mary font will not be using the correct metrics, so the text will also be too wide	14:34.00
	so the line widths will be wrong as well	14:34.22
	but yes, if there is absolute positioning, using a narrower font is much better	14:34.41
	and it's bound to be jarring when the real font is loaded anyway	14:35.00
Robin_Watts	tor7: OK. I will make those changes, thanks.	14:39.47
tor7	my brain is too mushy to check the rest of the changes. I'll do that tomorrow if that's okay?	14:40.19
Robin_Watts	sure. thanks for looking so far.	14:42.03
	tor7: MB_OK, PDF_ALERT_BUTTON_GROUP_OK, PDF_ALERT_BUTTON_GROUP_OK_CANCEL, PDF_ALERT_BUTTON_OK, FT_Err_Ok, JPEG_HEADER_OK, DSTATE_RAW_OK, smoothing_ok, Z_OK, UNALIGNED_OK, FZ_META_OK, X509_V_OK	15:30.45
	CURL_TELOPT_OK,CURL_TELCMD_OK, CURLE_OK, CHUNKE_OK, SSL_OK, SETCHARSET_OK, ... etc.	15:32.07
	vs: pd_okay	15:32.12
	I put it to you that you are outvoted :)	15:33.04
	"No bloat", right? :)	15:34.25
	mvrhel_laptop: ping	17:00.44
mvrhel_laptop	robin_watts pong	17:01.04
Robin_Watts	I've just pulled in lcms 2.5 and have got 9000 or so differences.	17:01.05
mvrhel_laptop	oh my	17:01.10
Robin_Watts	The bmpcmp is in my area now.	17:01.13
mvrhel_laptop	sounds like you have some work to do ;)	17:01.24
	robin_watts these are from lcms? starting from tests/Ghent_V3.0/020_CMYKSpot_OP_x1a.pdf.pam.72.0?	17:02.55
Robin_Watts	oh. I'm so dumb.	17:03.20
	Let me pull in the recent gs changes :)	17:03.30
	Sorry.	17:04.06
mvrhel_laptop	np	17:04.12
	have to leave in a bit to see kids show off their improved swimming skills as today is the end of swim lessons	17:05.11
Robin_Watts	np. Hopefully this should go through with no real changes.	17:05.29
SpNg	I'm reading the GS9_Color_management.pdf regarding the introduction of ICC profiles. Is it possible to convert an .eps file from RGB colorspace to CMYK colorspace using .icc profiles?	17:44.10
Robin_Watts	mvrhel_laptop: ok, 4000 differences now.	17:57.53
	These seem much more plausible though. I will do a run where I exclude the halftoned ones.	18:02.32
tor7	robin_watts: none of those OK are by me!	18:11.35
	but fine, we have no occurrences of "okay" anywhere...	18:11.57
Robin_Watts	:)	18:26.46
mvrhel_laptop	robin_watts: I don't see any that I have an issue with in a quick view	19:21.23
	grabbing some lunch now	19:22.14
vtorri	hmmm	19:36.44
	libgs is multithreaded ?	19:36.52
kens	SpNg if you mean produce a CMYK EPS from an RGB EPS, then the answer is no, not yet.	19:39.39
	vtorri, it might be, it depends what you mean and how you configure Ghostscript	19:40.35
SpNg	kens: that is what I was referring to.	19:41.12
kens	SpNg then the answer is no currently. We don't have a device which produces EPS which uses high level colour management. The current EPS device emits everything as RGB.	19:41.53
	You can have PostScript output, but not EPS	19:42.09
SpNg	kens: ok. is it in the works, or in the distant future?	19:42.40
kens	SpNg at some point we want to extend ps2write to produce EPS, at which point we can finally kill pswrite. At that point it will be possible to produce an EPS in a specific colour space. No promises on dates.	19:43.41
SpNg	kens: great! thanks for the response	19:47.14
kens	No problem	19:47.26
Gigs-	ouch, segfault when res=144 but 145 and 143 are OK	20:43.31
	that's a really specific bug	20:43.37
	http://bugs.ghostscript.com/show_bug.cgi?id=694423	20:50.51
	if someone could mark the attachment private as usual... robin_watts?	20:51.08
SpNg	can ghostscript do conversions from RGB to CMYK values using an ICC profile? I'm looking to convert input values, not raster images or vector files	22:32.38
Robin_Watts	Gigs-: Someone beat me to it I think.	22:47.05
	SpNg: Yes. Ghostscript will use icc profiles to convert between different colorspaces.	22:47.35
	BUT... pdfwrite and such devices don't use that yet.	22:48.02
	or at least there are cases where they don't use it yet.	22:48.13
SpNg	Robin_Watts: I'm trying to do it with values. I would like to use say 3 floats for RGB, and have them convert to 4 floats in CMYK	22:48.17
Robin_Watts	If you use gs to render an input file containing CMYK values as an RGB bitmap you'll get the conversion done.	22:48.54
	likewise RGB to CMYK.	22:49.01
SpNg	how would I extract the values so I could use them programmatically?	22:50.26
	something similar to this post, but this is in java: http://stackoverflow.com/questions/4858131/rgb-to-cmyk-and-back-algorithm	22:50.36
Robin_Watts	SpNg: You would use a color management system, like lcms2 to do it for you.	22:53.12
SpNg	Robin_Watts: beautiful. Thank you for pointing me in the right direction. ;-)	22:54.32