Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

2012/07/03
vtorri	hey	06:04.39
ghostbot	hi, vtorri	06:04.39
vtorri	with mupdf, is there a way to check if the file is a pdf one, without loading the whole file ?	06:05.13
	some kind of "preload" function	06:05.25
	asking again :)	07:56.31
	with mupdf, is there a way to check if the file is a pdf one, without loading the whole file ?	07:56.32
	some kind of "preload" function	07:56.35
kens	vtorri I'm not an expert, but I don't believe MuPDF will 'load the whole file' if it's not a PDF.	07:57.14
vtorri	and if it's a PDF one, it will load it entirely ?	07:57.48
kens	In order to be a valid PDF file it must contain %PDF within the first 1024 bytes of the file IIRC	07:57.48
vtorri	ha	07:58.04
	i ask because of a doc viewer i'm writing	07:58.32
kens	In order to find the xref it is required to go to the end of the file. If you know a way to get to the end of the file without having the whole file, I'd be interested to hear it ;-)	07:58.33
vtorri	it has several backends	07:58.42
	and i would like to load the module only if the file corresponds to the module	07:59.12
kens	Sadly not all PDF files follow the rules.	07:59.29
vtorri	so, for optimisation, i would like some kind of "preload" function	07:59.30
	arg	07:59.34
	i'm doomed	07:59.39
kens	Because Adobe Acrobat is 'flexible' in assuming files you load are PDF files, PDF producers are very lax about what they create	08:00.05
vtorri	too bad :)	08:00.56
	i'll just use a "preferred" module, based on the extension, if there is one	08:01.31
kens	Well I think you can legitimately search for %PDF in the first 1024 bytes of the file, and assume it's not a PDF if you don't find that.	08:01.32
vtorri	i'll ask tor what to search exactly	08:02.00
	maybe he will give some hints or advice about what to do exactly	08:02.20
	haaa, here he comes	08:10.03
	tor8: hey	08:10.07
	tor8: question:	08:10.17
	i would like to optimize a doc viewer that can render pdf with mupdf	08:10.42
	what i would like to do is some kind of "preload" function that would detect that a file is a PDF one, without loading the whole file	08:11.20
	kens told me that i should search for %PDF in the first 1024 bytes of the file	08:11.48
	is it the best way to achieve what i want to do ?	08:12.02
kens	vtorri see implementation notes 13 & 14 in the 1.7 PDF Reference Manual (p 1102)	08:21.26
tor8	vtorri: yeah, that sounds like the best approach	08:24.03
kens	In particular look at implementation note 14 which has an alternate form of the header, accepted by Acrobat, which I wasn't aware of.	08:25.02
	"Acrobat viewers also accept a header of the form	08:25.02
	%!PS-Adobe-N.n PDF-M.m"	08:25.02
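The header check discussed above can be sketched roughly as follows. This is only an illustration, not MuPDF code: the function names are hypothetical, and the regular expression assumes that N.n and M.m in the alternate header are plain decimal version numbers.

```python
import re

# Alternate header form quoted from implementation note 14.
# Assumption: N.n and M.m are decimal version numbers such as 3.0 and 1.3.
_PS_STYLE_HEADER = re.compile(rb"%!PS-Adobe-\d+\.\d+ PDF-\d+\.\d+")


def has_pdf_header(head: bytes, window: int = 1024) -> bool:
    """Return True if a PDF header marker appears within the first `window` bytes."""
    head = head[:window]
    return b"%PDF" in head or _PS_STYLE_HEADER.search(head) is not None


def looks_like_pdf(path: str, window: int = 1024) -> bool:
    """Read only the start of the file, never the whole thing."""
    with open(path, "rb") as f:
        return has_pdf_header(f.read(window), window)
```

A negative result is a reasonable hint to skip the PDF backend. A positive result does not guarantee the file is well formed.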
tor8	vtorri: if mupdf opens a valid pdf file, we load only select bits of it at launch	08:25.03
	if it's a broken pdf file, we may end up parsing the whole file in one go to patch up the broken index	08:25.40
	kens: odd, I've never seen that header before either	08:27.06
kens	:-)	08:27.13
chrisl	Hah, so much for "The text rendering mode has no effect on text displayed in a Type 3 font"......	08:31.09
kens	Huh ?	08:31.47
	I take it Acrobat does when it's not a bitmap font ?	08:32.00
chrisl	Not exactly - the colour used to draw the glyph is influenced by the tr mode. It looks like any stroking mode causes the glyph to be drawn in the stroke color	08:33.01
	But Acrobat also (tries to?) apply the clipping tr modes, too......	08:34.09
kens	That's just bizarre....	08:34.18
vtorri	tor8: so opening the pdf with mupdf is kind of light ?	08:34.18
tor8	vtorri: if it is a well formed PDF, it is a light operation	08:34.41
	if it is a badly formed PDF, then it's a very heavy operation	08:34.51
vtorri	hmm	08:34.57
	ok	08:35.02
tor8	and if it's not a PDF at all, also a heavy operation until we give up	08:35.05
vtorri	i guess that i can't have much better	08:35.14
chrisl	kens: ironically, the test file that shows this was created by "Jaws PDF Library" :-)	08:35.20
kens	ROFL	08:35.34
	Probably it just inherited it from a previous PDF file, but it's impossible to know	08:35.54
tor8	well, you could refuse to open files that are obviously not PDF, or obviously broken. but mupdf tries very hard to accept broken files.	08:36.18
chrisl	No, it looks like it's been hand hacked to roll through all the modes with a t3 font	08:36.26
kens	Hmm it sounds like a file I may have created, this all sounds terribly familiar	08:36.55
chrisl	kens: it's comparefiles/pdf-t3-simple.pdf	08:37.13
kens	That sounds like it's mine, let me quickly look	08:37.32
	Yes, I'm pretty sure I made that one	08:38.12
chrisl	Well, pretty much everybody seems to get different output for it	08:38.29
kens	I think different versions of Acrobat display it differently too	08:38.48
	Acrobat X looks 'correct', all the text is blue, no strokes, no clipping	08:39.08
chrisl	What do you get in AcroX?	08:39.09
kens	6 lines of blue square, blue triangle C one blank line in the middle	08:39.30
chrisl	Ah, Acro9 has blue, red, red, blank, blue/green, red/green, red/green, blank/green	08:40.19
kens	Network is having trouble today	08:41.29
	Acrobat X looks right, other versions look wrong	08:41.51
	And I'm pretty sure that's my test file. I think I was trying to investigate what Acrobat did.	08:42.22
chrisl	Could you send me the Acro X output? It should be fairly easy to get our output the same	08:42.30
kens	OK one second	08:42.38
chrisl	Shame I just spent half an hour getting our output to match Acro 9 :-(	08:42.50
kens	Oh that's bizarre	08:42.57
	I reopened the file and it's different.....	08:43.09
	Now it's applying the clipping, it didn't before	08:43.21
	ROFL	08:43.47
	If I open the file by double-clicking it displays one way, if I use the 'open' dialog, it displays differently....	08:44.13
	You can't make this stuff up ;-)	08:44.28
chrisl	Oh, that's just..... <sigh> Adobe all over, really.....	08:44.45
kens	chrisl tiff file on its way to you	08:52.13
chrisl	kens: thanks - very different from Acro 9, and inconsistent with itself - I guess I'll try to match the all blue one, since that actually matches the spec.....	08:59.43
kens	Yes, I think that we should match the spec, especially since the most recent version of Acrobat (mostly) does	09:09.27
	How they manage to get different results depending on how you open the file escapes me....	09:09.53
chrisl	Because they are Adobe, and defeating logical reasoning is their forte......	09:15.40
	kens: although, interestingly, even your "best" output from Acro X doesn't match the spec - it's still honouring tr mode 3	09:48.19
Robin_Watts	chrisl: I had to do some stuff in mupdf recently to match type3 fonts that used stroking etc.	09:54.39
	I don't remember it being to do with tr, but it may be related.	09:54.52
kens	chrisl, yes you're quite right, I didn't consider that	09:55.03
Robin_Watts	previously we always used to cache the bitmaps produced from a type 3 font, but you can't do that if they rely on the color set in the environment in which they are called.	09:55.25
	So, I now check for things like color being set before it is used; if it isn't set, we don't cache the bitmap.	09:55.54
	and we draw it fresh each time.	09:56.09
chrisl	Robin_Watts: such glyphs should begin with a d0 call - instead of d1	09:56.15
Robin_Watts	I can't remember the details offhand, but I suspect that these weren't that simple.	09:56.50
chrisl	kens: of course, there's no reason not to honour tr mode 3 (unlike the other modes)	09:56.59
kens	Yes, that's the only one that makes any sense	09:57.13
chrisl	Robin_Watts: "A glyph description that begins with the d1 operator should not execute any operators that set the color (or other color-related parameters) in the graphics state; any use of such operators is ignored."	09:58.32
kens	D'oh, forgot to write an else clause, no wonder I'm getting funny results.	09:59.43
Robin_Watts	Bug 692745	10:13.02
	http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commit;f=pdf/pdf_interpret.c;h=2c836b57d5295b47655988cf8deaffda731e1c3c	10:13.08
	It's just started to rain, so it's clearly time for me to run :(	10:14.31
chrisl	Robin_Watts: well, it seems to me that could result in wrong rendering - albeit in really stupid files......	10:47.12
Robin_Watts	paulgardiner: 69 Gig in fact :)	12:13.42
paulgardiner	Eeek	12:13.53
Robin_Watts	I've got the interesting stuff (for you) down onto a single layer blu-ray though, burning now.	12:14.03
paulgardiner	Great. Thanks	12:14.14
	I'll pop over as soon as it's done if that's ok	12:15.13
Robin_Watts	no problem.	12:15.25
	sebras: I'm going to look at that shading bug now (just checking in, in case you had looked)	12:25.29
sebras	Robin_Watts: go ahead.	12:34.35
jen_	I use GS for merging PDF files, is this process lossless?	12:43.45
kens	Depends what you mean	12:43.58
jen_	so, file1.pdf +file2.pdf is merged to file3.pdf. Then file 1.pdf + file3.pdf = file4.pdf, etc. What happens to quality of file1 after many runs.	12:44.20
kens	Probably nothing much	12:44.33
	But best answer is 'don't do that'	12:44.42
jen_	file1 image in the final merged file.	12:44.45
kens	Ghostscript fully interprets each PDF file to marking operations, and then regenerates a brand new PDF file from the marking operations.	12:45.12
	There is no correspondence between the contents of the input and output, except that the marking operations should have the same result	12:45.41
	Modulo specific command line options which may downsample images etc.	12:46.10
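Since every pdfwrite run regenerates the PDF from scratch, one way to follow the "don't do that" advice is to always merge from the original inputs rather than from an earlier merged output. The sketch below only builds a Ghostscript command line with the standard -dBATCH, -dNOPAUSE, -sDEVICE and -sOutputFile switches. The helper name is hypothetical and the executable name `gs` is an assumption about the local installation.

```python
def merge_command(sources, output):
    """Build a Ghostscript pdfwrite command that merges `sources` into `output`."""
    if output in sources:
        # Feeding an earlier merge result back in is what the advice warns against.
        raise ValueError("output file must not also be an input")
    return [
        "gs", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output}",
        *sources,
    ]
```

The returned list can be passed to `subprocess.run` unchanged.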
jen_	thanks kens , great answers.	12:47.17
kens	NP	12:48.16
marcosw_	kens: A customer asks: "is there support for converting PCL to PDF/A directly?"	13:49.49
kens	marcosw : you can do the conversion, but you can't add some of the field stuff	13:58.00
	Actually colour conversions would be a potential problem, but PCL is only RGB so as long as you used an RGB ICC profile it should be OK	13:59.56
	It might work but I'd have to think about it for a minute	14:01.08
chrisl	kens: what edition of Acro X do you have?	14:01.12
kens	pro I think	14:01.20
chrisl	ta	14:01.26
kens	yes, pro	14:02.26
marcosw_	kens: thx.	14:02.33
kens	marcosw No it won't work because we add some stuff via a pdfmark, which won't (obviously) work in PCL	14:04.53
	We could probably modify the PCL interpreter and pdfwrite to do it though	14:05.10
Robin_Watts	AH, I see the problem with this shading.	14:08.06
	tor8: ping	14:14.23
	tor8: Did you ever get a chance to look at my patch on master? Another tiny one there now.	14:14.45
tor8	Robin_Watts: I asked yesterday if memsetting the out buffer in fz_predict_tiff wouldn't be faster	14:49.27
	Robin_Watts: mubusy fix lgtm	14:50.03
Robin_Watts	tor8: it might be, but then I'd have to know how big the outbuffer was.	14:50.14
	(sorry, obviously missed that comment yesterday)	14:50.28
tor8	Robin_Watts: the 'len' argument, but hm I wonder if there aren't more fishy problems with the tiff predictor	14:51.38
	it doesn't use the len argument!	14:51.43
	Robin_Watts: I'm dithering about how to solve the Symbol font problems	14:52.13
	in a way, I don't want to add specific workarounds for this specific font. it will work if we look for and find system fonts instead of using only the built in ones, but really it's the file's fault for not embedding the odd fonts or providing proper file descriptors and encodings.	14:54.24
henrys	paulgardiner:found form test files here https://live.gnome.org/Evince/Forms/ ... toward the bottom of the page.	14:54.34
Robin_Watts	tor8: Right. I'm not feeling any huge pressure to try to fix any of these files if they are actually broken (and lots are).	14:55.26
tor8	I have a simple generic workaround which will make it pick the built in symbol font but not apply the synthetic bold/italic, by providing alias names like we do for the other built in fonts.	14:55.28
paulgardiner	henrys: Oh yes. That looks useful	14:55.53
tor8	and if you compare appendix H.3 in the pdf reference for the "standard type1 fonts" -- the list of valid aliases got halved in pdfref15 as compared to earlier specs	14:56.02
Robin_Watts	but, if there are cheap fixes we can do that make us look better compared to gs and acrobat, then it's worth considering.	14:56.21
tor8	Robin_Watts: right. well, the easy fix that I like will get the right glyph out, just in the "wrong" style as compared with finding the real font on disk.	14:56.50
Robin_Watts	but font substitution is an area I'm trying to stay out of, so I'll bow to your decision here.	14:56.52
henrys	I don't know about having these meetings during the tour de france ;-) but let's get started.	14:59.13
Robin_Watts	henrys: The bloke in the yellow jumper always wins.	14:59.48
henrys	I can't believe I work with the 4 europeans that don't watch the tour.	15:00.18
	paulgardiner:probably important for those irs forms to work correctly.	15:00.50
chrisl	henrys: I watch some of the Tour - but it makes me cringe when they fall.....	15:01.08
henrys	you'd cringe a lot this year	15:01.32
Robin_Watts	paulgardiner: Just pushed your calculation stuff.	15:02.43
paulgardiner	Robin_Watts: thanks	15:02.55
chrisl	henrys: Also, they're always picking on the British guys.....	15:03.02
Robin_Watts	paulgardiner: What item does the validation stuff fall under ?	15:03.57
	2?	15:04.11
paulgardiner	Yeah	15:04.21
Robin_Watts	All other things being equal, I'd like to vote for that being prioritised, as it affects the testing. I don't know how others feel.	15:05.13
paulgardiner	That includes reporting the constraints to the app and checking on input	15:05.26
	Robin_Watts: I have no objections, and I'd like to see the changes to the tests that you are planning.	15:06.04
henrys	paulgardiner:so was it difficult to the utility functions? Are those examples where we don't have a spec?	15:06.11
paulgardiner	henrys: no real problems so far	15:07.10
henrys	Robin_Watts:yes I agree with that.	15:07.10
Robin_Watts	(for the benefit of the others that haven't been privy to paul and I talking on the phone) it's the reporting of constraints bit that interests me - it would enable me to generate mjs test files smarter, so we try and put numbers into number fields and dates into dates etc. That in turn would show off the calculation stuff better - at the moment we get lots of 'NaN' in the tests.	15:07.12
paulgardiner	henrys: When I thought about it more after the meeting, I could see it would be strange to take the trouble to write all our own C only to try to then steal their javascript.	15:08.35
tor8	Robin_Watts: paulgardiner: is that possible (detecting constraint type) with a simple string search or do we need a mudraw-v8 for it?	15:08.43
paulgardiner	tor8: Some of the constraints are held as flags in the dictionaries	15:09.28
	I was thinking we'd restrict ourselves to those for now.	15:09.50
	Although we could notice cases where particular javascript functions are used: the formatting code is often a single call to a utility fn	15:10.36
tor8	paulgardiner: okay	15:11.42
paulgardiner	... hmmm, actually I may be misremembering that constraints like "digits only" are held as flags. I'll need to take another look at the spec.	15:11.48
	Yes, it may make sense to do pattern matching on the javascript formatting code sooner rather than later. It looks easy enough.	15:13.48
henrys	anything else to discuss it looks like we can make this a short meeting? It looks like you should have the complete spreadsheet example by next meeting. That will be great.	15:15.35
Robin_Watts	I have nothing else.	15:15.55
henrys	BTW US folks will probably be out tomorrow.	15:16.04
paulgardiner	henrys: Actually that looks to be working. I just had the sense of a test around the wrong way	15:16.12
henrys	oh super	15:16.56
Robin_Watts	paulgardiner: There is a new patch on your forms branch? Does that need testing/review ?	15:17.34
paulgardiner	henrys: there is a new problem just come up: the strings coming back from the js are utf8, and sometimes straying outside ascii. I think at some stage I'll need to make the appearance stream synthesis respect the utf8 chars correctly	15:17.49
	Robin_Watts: yeah, would be handy	15:18.08
	henrys: a case of it I've seen so far is \u20ac used in formatting an amount in euros	15:19.08
Robin_Watts	paulgardiner: The bytes that go into a pdf string are allowed to be any value from 0..255 (and indeed they have to be to allow for encryption).	15:19.39
henrys	paulgardiner:I wouldn't expect to see a lot of that on form input ...	15:20.05
Robin_Watts	If we are getting top bit chars back from javascript that when decoded give us values in the 0-255 range, that's fine.	15:20.41
paulgardiner	Robin_Watts: I'm imagining that I can put the utf8 in the value, but when I generate the appearance, I'll need to find the right gid or cid (whatever) for the utf8 chars	15:20.43
	We're getting 3-byte encodings back from js	15:21.12
Robin_Watts	If we're getting >=256, then we need to generate hexstrings or something?	15:21.28
paulgardiner	It's still coded in strings of 8-bit values	15:22.04
Robin_Watts	Hmm. All strings in pdf are a series of bytes.	15:22.06
paulgardiner	I don't think there's a problem handling the value, but generating the appearance requires going from utf8 to the font index.	15:22.59
Robin_Watts	ok, I'll bow to your knowledge here.	15:23.18
paulgardiner	... I think... I may be misunderstanding....	15:23.29
	Robin_Watts: Don't do that! I was hoping you'd say either "Yes that's right" or "No, you should do it like this" :-)	15:24.08
henrys	I thought there was a bit more to pdf strings than bytes going to the manual	15:24.28
Robin_Watts	Nah, if you remember, whenever I got near fonty stuff in the Picsel PDF stuff I left it all to you :)	15:24.33
	henrys: I just checked the manual :) but it's a bad manual, so another set of eyes would be good.	15:25.09
	So, if we close the meeting early, I can talk paul through driving the cluster?	15:25.44
tor8	paulgardiner: you could use UTF16 for the appearance stream strings and set up the font descriptor correctly for it	15:25.45
henrys	Robin_Watts:sounds good	15:25.55
tor8	paulgardiner: or if you want to reuse the existing ones, you get into encoding madness	15:26.00
paulgardiner	tor8: Ah right. That might be necessary if the font is large	15:26.33
henrys	meeting closed if tor8 is good.	15:27.08
tor8	paulgardiner: I think I tried to push you in that direction in London, to recreate new fonts and font descriptors from scratch	15:27.10
paulgardiner	tor8: Yes I remember that... even though I was singing "la la la I can't hear you" with my fingers in my ears.	15:28.22
tor8	paulgardiner: well, now you know why :) encodings in pdf are crazy stuff.	15:28.51
	paulgardiner: when loading the form value we should probably run it through the ToUnicode cmap to get a utf-8 string to start with	15:30.31
	and then replace the fonts with our own fonts if they aren't compatible (i.e. not using a standard unicode encoding)	15:31.06
	and recreate the appearance stream using our own fonts from the utf-8 string	15:31.21
paulgardiner	I wondered whether the value would already be in UTF8	15:31.22
tor8	paulgardiner: I don't think PDF can represent UTF-8 in the text objects	15:31.46
paulgardiner	I mean the value held under the V item of the field dict	15:31.55
tor8	or rather, the CMap machinery for decoding utf-8	15:32.03
	if it's in a dictionary, it can be either PDF Doc Encoding or UTF16. see pdf_to_utf8.	15:32.43
paulgardiner	The value is sort of held twice, once under V which is just the value as a string, and again under AP which is the graphics commands to draw the text.	15:33.18
tor8	yeah, so the V can be either pdfdocencoding or utf16, and the AP can be in whatever encoding the font descriptor uses	15:33.45
paulgardiner	The stuff under V would presumably have to be unicode because there is nowhere to look up the encoding	15:33.56
	Oh ok So it's pdfdocencoding or utf16	15:34.30
tor8	and pdf_to_utf8 takes a fz_obj string and returns a utf-8 char*	15:34.56
	by guessing either pdfdocenc or unicode	15:35.08
paulgardiner	Handy	15:35.32
tor8	going the other way, for the AP, is what my ranting about font descriptors is all about :)	15:35.52
paulgardiner	Is guessing necessary? Is there nothing that tells you which it is?	15:35.54
tor8	paulgardiner: unicode always has a BOM :)	15:36.08
paulgardiner	Right	15:36.40
tor8	so you'd need to turn a utf-8 char* back into unicode with a BOM for writing it out as a fz_obj in the V entry	15:37.17
	I don't think we have a function for that yet	15:37.31
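The round trip described above, guessing the encoding of a V entry by its BOM and writing a UTF-8 string back out as UTF-16 with a BOM, might look roughly like this. It is a sketch, not MuPDF's pdf_to_utf8: both function names are hypothetical, and Latin-1 is used as a stand-in for PDFDocEncoding, which is an assumption that is wrong for a number of code points.

```python
import codecs

BOM = codecs.BOM_UTF16_BE  # b"\xfe\xff", marks a UTF-16BE text string


def decode_text_string(raw: bytes) -> str:
    """Guess the encoding of a PDF text string by looking for the BOM."""
    if raw.startswith(BOM):
        return raw[len(BOM):].decode("utf-16-be")
    # Assumption: Latin-1 approximates PDFDocEncoding for this illustration.
    return raw.decode("latin-1")


def encode_text_string(text: str) -> bytes:
    """Encode text for a V entry: plain bytes if printable ASCII, else UTF-16BE with BOM."""
    if all(0x20 <= ord(ch) <= 0x7E for ch in text):
        return text.encode("ascii")
    return BOM + text.encode("utf-16-be")
```

For example, a value containing the euro sign (\u20ac) would be written with the BOM prefix, while a purely ASCII value would be left as single bytes.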
Robin_Watts	paulgardiner: ping me when you and tor8 finish.	15:40.21
paulgardiner	Robin_Watts: sure	15:40.38
	tor8: I guess we don't want to do all this unnecessarily, so would we scan first for top-bit-set chars?	15:41.39
tor8	paulgardiner: premature optimization and all that, for the V field. for the AP I think it may make sense to check the encoding and strings both before deciding to replace the fonts.	15:42.49
paulgardiner	Also makes it a pig to debug. I was for a while coding everything as hex with the same effect	15:44.33
	So what goes in the appearance strings here (xxxxxxx) Tj ?	15:46.19
	Would that depend on the font encoding, or are you saying that would also be either unicode or pdfdocenc?	15:46.48
tor8	paulgardiner: the (xxxxx) would depend on the font encoding	15:51.40
	paulgardiner: so I think the easiest way to get that done is to make new fonts with known encodings than trying to use the old font objects with potentially really broken stuff	15:52.26
paulgardiner	Right. Yeah, thought so. As Robin said, I implemented most of the font encoding stuff for Picsel's viewer, but it's a long time ago.	15:52.42
Robin_Watts	So, pdf_buffer_cat_pdf_string could take a font encoding argument, and convert if required as it catted ?	15:52.46
paulgardiner	Robin_Watts: Nice.	15:53.08
tor8	we do have a fair bit of encodings parsed up in the pdf_fontdesc struct	15:53.11
	making a reverse mapping from unicode back down to that should work, but then there's the issue of what to do if a character is missing :)	15:53.33
paulgardiner	But that would be a broken file	15:54.04
Robin_Watts	Omit it? That's all the renderer would do.	15:54.11
paulgardiner	I'd have thought any font used in a form would have well defined unicode mappings	15:54.22
tor8	paulgardiner: well, consider a form where the fonts are only ascii encoded and someone tries to enter a funny character like é.	15:54.44
Robin_Watts	paulgardiner: Bzzt. Expecting sanity from Adobe. Docked 5 points.	15:54.47
tor8	we could expect form generators to do the sane thing, but we ought to check what adobe does	15:55.22
paulgardiner	That's twice in half an hour. I was expecting them to allow utf8 in strings rather than a difficult to determine choice pdfdocenc or unicode wit	15:56.19
tor8	for the encoding we've got ways in the pdf_font_desc struct to map from font encoding to glyph id, and from font encoding to unicode. doing a reverse lookup could be potentially very slow and awkward.	15:56.23
	and then you have to get them out into the right multi-byte crap too	15:57.05
	all of which is possible, but icky	15:57.29
paulgardiner	tor8: But don't you need to do that when creating your new font. Presumably it has to be based on the old one (but with a different encoding) so as to look correctl	15:57.53
tor8	paulgardiner: no, you really need to do this in pdf_buffer_cat_pdf_string	15:58.29
	take a utf-8 character in, reverse look up the font encoding from the fontdesc struct, and figure out the multi-byte encoding to use to put in the buffer	15:59.02
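The reverse lookup described here can be sketched with a plain dictionary standing in for the font's code-to-unicode table. Everything below is hypothetical: MuPDF's fontdesc struct is not modelled, the fixed `code_width` is a simplification of real multi-byte encodings, and dropping unmapped characters follows the "omit it" suggestion made earlier in the discussion.

```python
def build_reverse_map(code_to_unicode):
    """Invert a code -> character table; the lowest code wins on duplicates."""
    reverse = {}
    for code in sorted(code_to_unicode):
        reverse.setdefault(code_to_unicode[code], code)
    return reverse


def encode_for_font(text, reverse, code_width=1):
    """Map each character to its font code; return (bytes, characters with no mapping)."""
    out = bytearray()
    missing = []
    for ch in text:
        code = reverse.get(ch)
        if code is None:
            missing.append(ch)  # omitted from the output, reported to the caller
            continue
        out += code.to_bytes(code_width, "big")
    return bytes(out), missing
```

Building the reverse map once per font avoids the slow per-character search that a direct lookup in the forward tables would need.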
paulgardiner	Sorry. I'm not explaining what I mean well.	15:59.48
henrys	on to the next meeting?	16:00.02
tor8	paulgardiner: ah, right. I misread. yes, you can avoid all that if you make your own font and fontdescriptor	16:00.20
paulgardiner	tor8, Robin_Watts catch you tomorrow	16:00.21
Robin_Watts	night.	16:00.40
henrys	Robin_Watts:so we gave up on regular clusterpush for windows and have something entirely different?	16:00.41
paulgardiner	tor8: but with your own font it may not look right, unless it's based on the old, and then you need to process its encoding	16:01.01
Robin_Watts	henrys: You can use clusterpush.pl if you have cygwin set up.	16:01.18
paulgardiner	really must go. cyl	16:01.25
tor8	paulgardiner: well, we could pick the nearest of the base 14 fonts and hope for the best :)	16:01.26
	paulgardiner: cya	16:01.29
Robin_Watts	(as you need cygwin for rsync, and even then it'll only work if you're lucky).	16:01.40
henrys	Robin_Watts:oh okay.	16:02.01
Robin_Watts	So I came up with a mechanism that uses git to transfer to casper, and then does a normal clusterpush from casper.	16:02.11
henrys	Robin_Watts:right.	16:02.53
Robin_Watts	So, meeting time ?	16:03.11
henrys	yes, giving mvrhel a few minutes.	16:03.30
Robin_Watts	oh, waiting for mvrhel right.	16:03.34
henrys	oh he's here.	16:03.42
	phone call that I have to take go ahead without me.	16:04.28
	i'm back	16:05.21
mvrhel	oh I am here	16:05.26
	sorry	16:05.28
henrys	ray_work?	16:05.37
	texted ray	16:06.59
	chrisl:what's the progress of the font integration now?	16:07.37
chrisl	Working on the UFST, and getting things going with MT fonts - freetype is working, but not well tested yet	16:08.13
	I haven't done the artificial boldening stuff yet, either	16:08.33
henrys	chrisl:I think that can be safely skipped the first round, just use the old stuff?	16:09.16
chrisl	I have to move the old stuff around, but yes, that's my plan	16:09.38
henrys	alexcher:you said you were going to make a public branch of the mupdf parser with gs?	16:10.31
	tor8:now that I have you trapped, any more thoughts about using your viewer for the other languages?	16:11.09
alexcher	henrys: yes, I remember that I need to do it. I still need to make the first version more usable.	16:12.14
tor8	henrys: I doubt it'll be worth the effort. it means having two back ends for all the gui stuff, and complicates all the code for it.	16:12.39
henrys	alexcher:it doesn't have to work.	16:12.49
alexcher	henrys: OK	16:13.06
Robin_Watts	tor8: does it? I believe henrys is suggesting that we do ANY_FORMAT -> pdf then view the pdf.	16:13.41
tor8	alexcher: henrys: you can push to a 'user' git repo, which is visible but not in the gold repo. like we do with mupdf.	16:13.44
henrys	tor8:what Robin_Watts said.	16:14.01
tor8	Robin_Watts: if it's a 'convert to pdf in a forked thread then open with mupdf' then yeah, sure, no biggie. but it means having to find a ghostscript installation :)	16:14.19
	:( I mean	16:14.38
henrys	tor8:I am assuming we'd use the api and not fork.	16:15.00
	exactly to avoid "finding"	16:15.43
tor8	henrys, alexcher: if you create a git clone in ~/repos/ghostpdl.git it'll show up on git.ghostscript.com (just look at tor, sebras, robin, paulg or chrisl's ~/repos/ directories for an example)	16:16.00
	henrys: right.	16:16.08
Robin_Watts	It's akin to how acrobat calls distiller on postscript input files.	16:16.16
norbertj	henrys: hello, did you get my mail on the optional truetype loading (just checking)?	16:16.28
tor8	Robin_Watts: like what apple's preview does on postscript files too	16:16.35
henrys	yes norbertj and I passed it on to chrisl who is actually doing our new font integration with freetype.	16:17.02
*chrisl*	is not sure we should be looking to Adobe and Apple for inspiration on best practices........	16:17.19
norbertj	perfect. Will see what you think of it.	16:17.51
	have to eat..	16:18.03
henrys	chrisl:I've sort of studied the alternative and that looks like the right way to go. Streamed languages like postscript and PCL are really not appropriate for viewing apps.	16:18.19
	certainly open to other suggestions.	16:18.37
	kens:did you have anything for the meeting, I know you like to leave on time?	16:18.57
chrisl	henrys: I do agree with the approach, I'm just not keen on using "it's how Adobe does it" as an argument in its favour!	16:19.16
Robin_Watts	Any sane alternative is going to involve us rendering to some intermediate displaylist format and then 'viewing' that. Using pdf as that alternative seems reasonable.	16:19.24
kens	henrys the only thing I have is to say that I'm on holiday Thurs Friday this week and Monday Tuesday next week	16:19.52
	I will have intermittent email but don't plan to be on irc	16:20.16
	SO if you want me, email me :-)	16:20.25
henrys	tor8:I'd like a definite okay from you because it affects business decisions so give it some thought and let me know.	16:20.27
rayjj_	Robin_Watts: right, PDF effectively becomes the high level display list	16:21.06
henrys	kens:okay and that reminds me -- the US folks will be celebrating our independence from you tomorrow ;-)	16:21.27
tor8	henrys: it's not a very nice experience (apple and adobe's running distill jobs before opening), compared with something like gv back in the 90's where you could view ps files with dsc comments instantly	16:21.51
chrisl	henrys: the only problem, from a business perspective, is that it only really partially showcases GS - it's more a showcase for mupdf (which we also want, but....)	16:22.04
Robin_Watts	And it means we get all our bugs in one easy to use package.	16:22.27
chrisl	Robin_Watts: not really, too many differences between high level devices, and rendering :-(	16:23.07
Robin_Watts	Presumably we need language switch to be in a reasonable state then ?	16:23.10
rayjj_	chrisl: the advantages of GS for some uses is _not_ (IMHO) as a viewer	16:23.11
henrys	tor8:with this we get support for all the language with text search.	16:23.12
tor8	so I think my biggest concern is, do we have an appetizing api to render a page at a time with postscript and pcl?	16:23.23
Robin_Watts	tor8: -dFirstPage -dLastPage ?	16:23.44
kens	tor8 pdfwrite can use %d now	16:23.46
tor8	henrys: yeah, going via pdfwrite does get us everything we want, except performance :)	16:24.01
Robin_Watts	oh, what kens said is much better.	16:24.02
rayjj_	having a single viewer that works for PCL and PS seems to be worth having	16:24.20
tor8	does -dFirstPage work on PS?	16:24.31
henrys	tor8:I don't know about that %d in the background should do pretty well kens?	16:24.34
kens	henrys, see above	16:24.42
	tor8 yes -dFirstPage also works with PS I believe	16:25.06
rayjj_	but an important thing is to be able to show the first page BEFORE 'distilling' the entire input file	16:25.13
Robin_Watts	tor8: Ignore -dFirstPage etc. Just generate the whole file as a series of pdf files.	16:25.28
tor8	kens: okay, that'd improve matters or at least give us more options	16:25.29
kens	If we use %d then the first file will appear when the first page is completed	16:25.33
rayjj_	tor8: -dFirstPage (and Lastpage) only works with PDF	16:25.36
Robin_Watts	pdfwrite using %d and then open the first page while the later ones are still processing.	16:25.50
tor8	right. so %d it'd have to be.	16:25.54
henrys	yes we definitely want %d	16:25.56
alexcher	-dFirstPage _doesn't_ work with PS.	16:26.20
rayjj_	I agree with ken -- that emitting each page as a separate PDF makes sense	16:26.20
henrys	we should be able to really outperform Adobe and Apple with that approach (on PS)	16:26.45
rayjj_	it's possible (using an EndPage proc with setpagedevice) to skip leading pages, but it still has to do all the work for all previous pages (then just throws it away). Doesn't save much time	16:28.04
tor8	henrys: it's not impossible, and it's an isolated task to hand off to robin or someone else to add if I'm too laz^H^H^Hbusy	16:28.07
henrys	I don't know it seems Robin_Watts has a pretty full plate, but maybe so.	16:29.07
	tor8:the build might be a hassle.	16:29.20
tor8	henrys: well, anyone really. I know we're keeping Robin busy :)	16:29.26
henrys	maybe a fork for the first go would be a lot easier.	16:29.42
Robin_Watts	henrys: At the moment I have an empty plate, but I'm standing in front of a large buffet of broken/slow files from customer 394.	16:30.00
tor8	henrys: we have to solve build issues for the gs bridge too, hopefully something can be learned from that. fork and exec is probably easiest (and reduces the download footprint)	16:30.01
Robin_Watts	How fast you want them eaten determines my business.	16:30.09
kens	the viewer will have to be smart: normally it has an N-page PDF file, in this case it will have N 1-page PDF files. It will have to know.	16:30.11
Robin_Watts	or busyness :)	16:30.21
rayjj_	henrys: why would the build be a hassle ? wouldn't we just fire off a process ?	16:30.34
Robin_Watts	rayjj_: I assumed henrys meant "build" as in "build of the viewer binary"	16:31.02
tor8	kens: we have a fz_document abstraction that already hides pdf and xps differences, making that one expose the same api but with multiple 1-page pdf files should be no more than a day's job.	16:31.12
kens	well that's good news anyway :-)	16:31.28
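The idea tor8 describes, one document interface backed by several 1-page PDF files, might look like the sketch below. The class and method names are hypothetical and are not MuPDF's fz_document API; `open_single` stands in for whatever actually opens a PDF file.

```python
class SplitPdfDocument:
    """Present N one-page PDF files as a single N-page document.

    Hypothetical wrapper, not MuPDF's fz_document API. `open_single`
    is any callable that opens one file and returns a page object.
    """

    def __init__(self, open_single):
        self._open_single = open_single
        self._paths = []   # index i holds the file for page i
        self._cache = {}   # pages opened so far

    def add_page_file(self, path):
        """Register the next page once its file is known to be complete."""
        self._paths.append(path)

    def count_pages(self):
        return len(self._paths)

    def load_page(self, index):
        """Open the file behind page `index` lazily and cache the result."""
        if index not in self._cache:
            self._cache[index] = self._open_single(self._paths[index])
        return self._cache[index]
```

Because pages are registered one at a time, the page count can grow while the conversion is still running.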
henrys	rayjj:it would be a hassle to build in gs and for now it is simpler to fork a process.	16:31.43
tor8	rayjj_: we were considering using gs as a library not external process.	16:31.53
kens	Forking a process means we don't have to have a language switch :-)	16:32.01
Robin_Watts	Yes, the lack of a functioning language switch library build is a problem.	16:32.42
rayjj_	tor8: but we'd (probably) want it to run asynchronously in a separate thread, so having it run as a process isn't any worse (and may be preferred)	16:32.49
tor8	rayjj_: indeed.	16:33.02
Robin_Watts	(Well, we do have a functioning language switch lib build, cos I've used it, but it's not ideal)	16:33.09
chrisl	Depends on your definition of "functional"......	16:33.39
henrys	Robin_Watts:I should look at 394 priority I need to talk to miles, do you have some sort of ballpark estimate for the work you know about?	16:34.06
rayjj_	the only thing about running as a process is knowing when the temp-%d.pdf is complete	16:34.08
chrisl	rayjj_: when the next one appears	16:34.31
Robin_Watts	henrys: Let me summarise the meeting...	16:34.32
henrys	when temp-%d+1 is complete.	16:34.35
kens	When the process exits or the next file turns up	16:34.49
henrys	s/complete/started	16:34.53
Robin_Watts	they are using mupdf v0.9, and broadly they seem happy with it, except in some cases where performance isn't great, probably largely because of floating point.	16:35.05
rayjj_	chrisl: currently the output pdf is created when the page starts, but the file is empty.	16:35.06
Robin_Watts	They gave us several hundred problem files (some where we give different results to "ImageMagick" (i.e. gs), some where we are slow)	16:35.33
chrisl	rayjj_: yes, so when the file for page two appears, page one is ready to use	16:35.33
rayjj_	chrisl: then pdfwrite reads its temp files and gradually builds the pdf when the page is closed	16:35.41
Robin_Watts	v1.0 solves about half of these problems (based on the sample I've looked at)	16:35.49
rayjj_	chrisl: yes, that would work.	16:35.58
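The readiness rule settled on here (a page file is usable once the next one appears, or once the process exits) can be sketched as below. The temp-N.pdf naming and the function name are assumptions for illustration only.

```python
import re

# Hypothetical naming, matching an output pattern of temp-%d.pdf.
PAGE_FILE = re.compile(r"^temp-(\d+)\.pdf$")

def completed_pages(filenames, process_exited):
    """Return the sorted page numbers whose PDF is safe to open."""
    pages = []
    for name in filenames:
        match = PAGE_FILE.match(name)
        if match:
            pages.append(int(match.group(1)))
    pages.sort()
    if process_exited:
        # Nothing is being written any more, so every file is complete.
        return pages
    # The highest-numbered file may still be empty or partly written.
    return pages[:-1]
```

The sketch does not account for the extra trailing blank-page file that rayjj_ reports later in this log.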
henrys	Robin_Watts:well bugs can fairly be split between you and tor8 right?	16:36.00
Robin_Watts	right. at the moment I'm running through the files looking for which ones have problems.	16:36.25
	They plan to try to move to 1.0, and in the process pass back to us performance optimisations they have made.	16:36.48
	I think a lot of those are going to be in the form of avoiding (or reducing) floating point.	16:37.12
	They have some thirdparty lib opts which they want to give us for us to pass back to lib maintainers.	16:37.31
rayjj_	kens: does pdfwrite in %d mode still emit one extra (empty) output file, or do you delete it ?	16:37.48
kens	ryajj that's a good question	16:38.04
	ray_work	16:38.09
	WHoever :-)	16:38.14
henrys	so it seems to make sense to hold off on any performance changes and just fix bugs ... I guess you're doing that.	16:38.20
Robin_Watts	So, at the moment, I'm not under any pressure, but when they start feeding us stuff, the workload may get heavier.	16:38.23
kens	I have to say I don't know the answer	16:38.23
Robin_Watts	henrys: indeed.	16:38.27
	Sadly, they are using an ARM9 (no FP unit), and they don't appear to have a profiler on it.	16:38.44
	I've passed them some code that I've used to do profiling on the ARM9 before, but I haven't heard anything back.	16:39.01
	It's extremely likely that any profiling I do on windows (or on the beagleboard) will be completely useless as it will be so massively skewed that it will be meaningless.	16:39.40
rayjj_	kens: pdfwrite does still create one extra PDF that is just a blank page (not an empty file) e.g. annots.pdf creates 7 pdf's	16:40.13
henrys	so work on the viewer and when they come back drop the viewer, there is really no schedule for the viewer, that is if you want to work on it.	16:40.13
Robin_Watts	so yes, I'm just fixing bugs at the moment.	16:40.14
	If tor8 needs me to do stuff on the viewer, just say.	16:40.38
henrys	mvrhel:are you okay with bugs, swamped as usual it appears?	16:41.04
	tor8, Robin_Watts:I assume optimizing for no fp would be a big task in mupdf, isn't it?	16:42.23
mvrhel	henrys: I am doing good. I am working through a few xps transparency things to optimize the group size. I will have a big speed up with a couple files from this	16:42.44
Robin_Watts	henrys: Yes, it would be a major upheaval.	16:42.47
tor8	henrys: yeah, we really do assume floating point is everywhere	16:42.55
	it's not the 90's anymore... but sadly not everyone agrees.	16:43.09
Robin_Watts	It might be feasible to introduce a level in the draw stuff below which everything goes to fixed point.	16:43.13
henrys	tor8, Robin_Watts:I'm wondering if we shouldn't be straight up with them about that.	16:43.25
mvrhel	henrys: then I have one minor issue related to icc profiles and then some features for the one customer that I want to get in before the release	16:43.40
	when are we doing the freeze?	16:43.46
tor8	much of the low level rasterization work is done in integer or fixed point math, and some more bits could be pushed down	16:43.56
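For readers unfamiliar with the fixed point math mentioned here, the sketch below shows the general technique with a 16.16 format. The format and the helper names are assumptions chosen for illustration; nothing in the log says which representation MuPDF uses.

```python
FRAC_BITS = 16          # assumed 16.16 format: 16 integer bits, 16 fraction bits
ONE = 1 << FRAC_BITS    # the value 1.0 in fixed point

def to_fixed(x):
    """Convert a float to a fixed-point integer."""
    return int(round(x * ONE))

def to_float(a):
    """Convert a fixed-point integer back to a float."""
    return a / ONE

def fixed_mul(a, b):
    """Multiply two fixed-point values using integer operations only."""
    # The raw product carries 2 * FRAC_BITS fraction bits, so shift back.
    return (a * b) >> FRAC_BITS
```

On a CPU without an FP unit, such as the ARM9 discussed above, the multiply and shift run as plain integer instructions instead of software-emulated floating point.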
mvrhel	or I guess we dont do a freeze anymore	16:43.57
Robin_Watts	henrys: I think they are aware of the fact that FP is a problem, and I don't think they expect us to rework to fixed.	16:44.12
mvrhel	but when is the candidate tagged?	16:44.13
tor8	in which case, raw clock speed may be good enough for the remaining floating point bits	16:44.16
henrys	mvrhel:late july or so.	16:44.21
chrisl	mvrhel: we haven't even talked about a target release date, yet.......	16:44.23
Robin_Watts	tor8: 200MHz.	16:44.35
mvrhel	early august sounds better to me....	16:44.41
henrys	chrisl: August ;-)	16:44.43
tor8	Robin_Watts: okay, not so much then...	16:44.47
	I remember when Quake 1 came out and required a FP unit to run...	16:45.09
chrisl	So, if we're aiming	16:45.12
Robin_Watts	I think we should wait to see some profiles (or the best approximation to that that we can get) before panicking about optimisations.	16:45.25
henrys	Robin_Watts:okay	16:45.38
chrisl	mvrhel, henrys: so if we're aiming for early August, then I'd want to do an rc around the 1st	16:45.54
henrys	chrisl:are we freezing a week before the rc?	16:46.34
	or at the rc?	16:46.40
mvrhel	at the rc I hope	16:46.45
	that only leaves 4 weeks	16:46.49
chrisl	I usually just ask people to be sensible with their commits in the run up to the rc	16:47.08
mvrhel	I will try to be	16:47.36
Robin_Watts	muhahah	16:47.44
henrys	mvrhel:it seems late for new features, we could do a snapshot release for the customer.	16:48.16
mvrhel	well let me try to get them in at least 2 weeks out.	16:48.48
henrys	mvrhel:okay	16:48.58
	way past meeting end time, back to the salt mine.	16:50.03
chrisl	henrys, mvrhel: I'll send a mail round tomorrow stating the plan - we should also ping tkamppeter and check if he has a driver for the release......	16:50.11
mvrhel	ok . brb	16:50.11
kens	Time for me to go then, night all	16:58.41
tkamppeter	chrisl, what do you mean with whether I have a driver for the release?	17:14.42
chrisl_away	tkamppeter: are there any Ubuntu related deadlines or freezes we need to worry about in the run-up to 9.06?	17:21.27
tkamppeter	chrisl_away, important is to have 9.06 ready some days before Feature Freeze, Aug 23. See https://wiki.ubuntu.com/QuantalQuetzal/ReleaseSchedule	17:35.57
chrisl_away	tkamppeter: Okay, our plans fit okay with that. It would probably be wise if you consider starting to take snapshots for early testing soonish.....	17:37.27
	Have to go now....	17:37.56
Robin_Watts	aargh. tor8, you about ?	18:13.10
	and sebras if interested.	18:13.17
tor8	Robin_Watts: briefly here	19:12.35
Robin_Watts	S'OK. Sorted it.	19:12.47
henrys	tor8:so are we going with muview for a monicker?	19:26.30
tor8	henrys: we could, I have no strong opinion either way	19:27.09
henrys	just curious if you had something planned I don't feel strongly either.	19:27.57
Robin_Watts	tor8: Another patch for you to look at on my master branch.	19:54.11
	no hurry.	19:54.24
dawagenaar	I have written a small patch for mupdf to partially implement comment #33 in bug #691330. I am new to mupdf development and I have never used IRC before. Can somebody introduce me to basic etiquette here and also tell me what the appropriate method is for submitting a patch for discussion? Thanks a lot!	20:42.03
bapt	hi	21:54.46
	the checksum of ghostscript-9.05.tar.bz2 seems to have changed, has it been rerolled?	21:55.21
Robin_Watts	dawagenaar: Hi.	22:57.51
	Let me just have a look to see what you're on about :)	22:58.01
	ok, so coming on here and talking to us is a great start.	22:59.41
	You can either attach the patch to the bug, or (probably better in this case as that bug is a large and sprawling thing) make a new enhancement bug and attach the patch to it. Give full details of what it does and why it's needed, and we can then look it over.	23:00.51
	If you add a new bug, then put a note on the existing bug pointing to it.	23:01.10
	The MuPDF developers are on European time (don't know where you are based), so be prepared for delays on IRC. We do check the logs though, so you should get an answer to any question when we get back.	23:02.21
dawagenaar	Robin_Watts: Thanks for your advice. I will create a new ticket for discussion of this patch.	23:50.05