Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

	<<<Back 1 day (to 2016/02/09)	20160210
sebras	tor8: I left two commits on sebras/master: 1. support for text-top/-bottom and 2. a patch to support cjk punctuation in iscjk(), not having this included confused me a bunch when I attempted to understand why my chinese input got breaks where I didn't expect them. ;)	00:19.22
halabund	I have asked this question a year or so ago, but I still don't have a solution and I am hoping that some new tools may have become available since then:	10:34.47
	Is there a way to scale a single-page PDF document by some factor?	10:35.04
	Nothing should visibly change when displayed on screen, however the print size should change. E.g. when scaled by 0.5, if the original page size was 10 by 10 centimetres, it should become 5 by 5.	10:35.45
	Both the page and the contents should scale proportionally.	10:35.57
	Can recent Ghostscript versions do this?	10:36.18
chrisl	You can't really have different scaling for viewing and printing, AFAIK	10:37.14
halabund	chrisl: No, I don't want different scaling for viewing and printing :) I was just trying to explain that everything should scale proportionally. This question is often misunderstood as wanting to place the same graphics on a different page size, say A4 -> Letter or something like that. That's not what I want, I just want a simple scaling of everything by the same factor.	10:40.36
chrisl	halabund: Ghostscript can't manipulate a PDF that way, but it can probably create a new PDF scaled as you want.	10:41.42
halabund	chrisl: What does that mean? What I need is that the old and the new one look identical (other than the scaling). I do not mind if the internal structure changes in some way.	10:42.50
chrisl	It should look the same, but internally (including a fair amount of metadata) will not be the same, or will be missing altogether	10:43.34
	But you won't be able to do "scale by 0.5" with Ghostscript - you'll have to get the page size, work out the new page size, and then "convert"	10:43.41
halabund	chrisl: Do you mean that I cannot use a scaling factor, only a target size in centimetres? That is okay, because I do know the starting page size.	10:44.45
chrisl	Yes, exactly	10:44.56
halabund	But only the width.	10:44.56
	I do not know the height.	10:45.00
chrisl	No, you need to know both	10:45.15
halabund	I have an approximation of the height, I guess that might do. If I know the precise target size, how would I get Ghostscript to do this scaling for me?	10:46.05
chrisl	If you add "-dDEVICEWIDTHPOINTS=w -dDEVICEHEIGHTPOINTS=h -dFIXEDMEDIA -dFitPage" to your command line, it should do what you want	10:47.47
halabund	where w and h are the target, right?	10:48.10
chrisl	Yes, in points	10:48.35
	So, something like: gs -sDEVICE=pdfwrite -o new.pdf -dDEVICEWIDTHPOINTS=w -dDEVICEHEIGHTPOINTS=h -dFIXEDMEDIA -dFitPage old.pdf	10:49.35
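The w and h in the command above are in points (72 per inch), so a size known in centimetres has to be converted first. A minimal sketch of that arithmetic, assuming the original size is known in centimetres and a uniform scale factor is wanted; the helper names are hypothetical, and only the flags quoted in the log are used. It also assumes fractional point values are acceptable on the command line; round them if that turns out not to be the case.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_points(cm):
    """Convert a length in centimetres to PostScript points."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


def scaled_size_points(width_cm, height_cm, factor):
    """Target page size in points after scaling both dimensions by `factor`."""
    return cm_to_points(width_cm) * factor, cm_to_points(height_cm) * factor


def gs_scale_command(src, dst, width_pt, height_pt):
    """Build the argument list for the command quoted in the log (hypothetical helper)."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-o", dst,
        f"-dDEVICEWIDTHPOINTS={width_pt:.2f}",
        f"-dDEVICEHEIGHTPOINTS={height_pt:.2f}",
        "-dFIXEDMEDIA",
        "-dFitPage",
        src,
    ]


if __name__ == "__main__":
    # 10 cm x 10 cm page scaled by 0.5, as in the question above.
    w, h = scaled_size_points(10, 10, 0.5)
    print(" ".join(gs_scale_command("old.pdf", "new.pdf", w, h)))
```

The list form can be passed to subprocess.run() directly, which avoids shell quoting issues with file names.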
halabund	OK, let me try. I'm trying to work around a Mathematica bug where it introduced inaccuracies when exporting at small sizes (due to some stupid internal rounding I guess). I want to export at 5x the size, then rescale to the target size.	10:50.05
chrisl	halabund: there is a Postscript utility we ship called pdf_info.ps which (amongst many other things) will give you the original page size of the PDF (or, more accurately, the dimensions of the various bounding boxes a PDF must/may contain)	10:54.05
halabund	chrisl: Just tried, it works very well with my figures! Thank you. Also, I see that if I give the wrong height, it won't change the aspect ratio. That is very good. It means that it is not a problem if my height is a bit inaccurate (as long as it is not so small as to crop the graphics).	10:55.32
chrisl	halabund: well, as I said above, you can use pdf_info.ps to get the exact original size	10:56.36
halabund	Well, there's one small problem: it recompresses some images as JPEG and the JPEG artefacts are visible now. Can I instruct it to use lossless compression?	10:57.19
	or very high quality	10:57.33
chrisl	Erm, you can - I'll have to look it up, though, hang on......	10:57.44
tor8	sebras: both lgtm, I'll push.	10:58.06
chrisl	halabund: try adding "-dAutoFilterColorImages=false -dAutoFilterGrayImages=false" to your command line	11:00.27
halabund	chrisl: Thanks! It didn't solve the quality degradation completely, but by googling for this option I found -dColorImageFilter=/FlateEncode which does fix the problem entirely. In my case fortunately it doesn't blow up the file size at all. http://comp.lang.postscript.narkive.com/vwkyi2e5/how-to-tell-ghostscript-to-leave-bitmap-images-alone	11:04.43
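Putting the image options from this exchange together, a hedged sketch of how the lossless settings might be collected. The colour flags are the ones quoted in the log; -dGrayImageFilter=/FlateEncode is added by analogy and is an assumption here, as are the helper names.

```python
def lossless_image_flags(include_gray=True):
    """Flags that ask pdfwrite to keep images Flate-compressed instead of JPEG."""
    flags = [
        "-dAutoFilterColorImages=false",
        "-dColorImageFilter=/FlateEncode",
    ]
    if include_gray:
        # Assumption: the gray-image filter mirrors the colour one.
        flags += [
            "-dAutoFilterGrayImages=false",
            "-dGrayImageFilter=/FlateEncode",
        ]
    return flags


def build_command(src, dst, extra_flags):
    """Hypothetical helper: a pdfwrite invocation with extra flags spliced in."""
    return ["gs", "-sDEVICE=pdfwrite", "-o", dst, *extra_flags, src]


if __name__ == "__main__":
    print(" ".join(build_command("old.pdf", "new.pdf", lossless_image_flags())))
```

Flate is lossless, so file size may grow for photographic images even though it did not in the case reported here.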
	chrisl: Thank you for all the help! :-)	11:05.48
chrisl	Hmm, strange, our docs suggest that disabling the autofilter should have us always using flate - possibly a bug (either docs or code) there	11:06.20
	halabund: NP	11:07.01
halabund	I am using 9.16, maybe I should upgrade to 9.18. I notice it's available for Mac now. http://pages.uoregon.edu/koch/	11:09.51
chrisl	halabund: whilst we always recommend using an up to date version, it's probably not critical unless you hit a problem - then you should definitely update and try it before reporting it, otherwise, you'll get your wrist slapped ;-)	11:17.10
tor8	Robin_Watts: rats, dirn_matches/fz_bidi_fragment_text/detect_flow_directionality gets stuck in an everlasting loop for some files	11:18.40
Robin_Watts	tor8: It does? Throw the file at me, and I'll see what I can find.	11:19.08
tor8	http://ghostscript.com/~tor/stuff/0.epub	11:19.35
	Robin_Watts: if you ever use linux, and have something that's taking longer than you expected: "sudo perf top" is your friend :)	11:20.48
	like top but it actually looks at which functions in the process are taking time :)	11:21.17
Robin_Watts	nice.	11:35.36
tor8	Robin_Watts: malc_ has found an even simpler test case for the hang	12:01.19
malc_	wtf?	12:01.30
	i haven't found it	12:01.33
	i MADE it	12:01.36
	by hand	12:01.38
	some courtesy please	12:01.43
Robin_Watts	malc_: Oh, fabulous. I'd love to see it.	12:02.14
	(Sorry, I'm buried in gs at the moment, hope to get to this in a short while)	12:02.29
malc_	<html>	12:02.35
	<pre>	12:02.35
	25EF ◯ LARGE CIRCLE	12:02.35
	→ 20DD ⃝ combining enclosing circle	12:02.35
	→ 25CB ○ white circle	12:02.35
	→ 2B24 ⬤ black large circle	12:02.37
	→ 2B55 ⭕ heavy large circle	12:02.40
	→ 3007 〇 ideographic number zero	12:02.42
	âÃ¡	12:02.45
	aaâa	12:02.47
	latinããããÙØ§ÙØ³Ø©ÙÙÙØ´Ø§ ÙÙ ÙØ´Ø§Ø©Ø´Ù ÙÙ ÙØ´Ø§ÙØ©à°¹à±à±à°à°à±à°µà±à°°à±à°¨à±à°°à±ÑÑÑÑÐºÐ¸Ð¹	12:02.50
	</pre>	12:02.53
	</html>	12:02.55
	i guess only the line with arabic is relevant though	12:02.58
	lemme test	12:03.04
	nope	12:03.25
	<html>	12:04.00
	<pre>	12:04.00
	latinããããÙØ§ÙØ³Ø©ÙÙÙØ´Ø§ ÙÙ ÙØ´Ø§Ø©Ø´Ù ÙÙ ÙØ´Ø§ÙØ©à°¹à±à±à°à°à±à°µà±à°°à±à°¨à±à°°à±ÑÑÑÑÐºÐ¸Ð¹	12:04.00
	</pre>	12:04.03
tor8	if (broken) break; drops it out of the loop but leaves 'end' unchanged so we enter an eternal loop	12:04.05
malc_	</html>	12:04.05
	drop <pre> and the hang disappears	12:04.08
tor8	malc_: the <pre> adds the equivalent of <br/> on all newlines to our internal data structure	12:05.11
	malc_: you only need a <pre> tag to trigger the bug. no need for actual bidi text.	12:06.56
malc_	tor8: don't you grow a new type of appreciation of mozilla/etc developers? ;)	12:08.34
Robin_Watts	tor8: How about... moving end = end->next; to be just before if (broken) break; ?	12:09.01
	That should remove the possibility of end being unchanged, so we'll always make progress.	12:09.31
	Nothing after that point uses end at all.	12:09.37
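The hang and its fix can be modelled outside MuPDF. The sketch below is a hypothetical Python model of the loop shape being discussed, not the actual fz_bidi code: only the names end and broken and the idea of moving end = end->next ahead of the break come from the log. A step limit stands in for the hang so the buggy ordering can be demonstrated safely.

```python
class Node:
    def __init__(self, text, broken=False):
        self.text = text
        self.broken = broken  # True if a forced line break follows this node
        self.next = None


def make_list(items):
    """Build a singly linked list from (text, broken) pairs and return its head."""
    head = prev = None
    for text, broken in items:
        node = Node(text, broken)
        if prev is None:
            head = node
        else:
            prev.next = node
        prev = node
    return head


def split_fragments(head, advance_before_break, max_steps=1000):
    """Split the list into fragments, restarting each fragment at `end`."""
    fragments, start, steps = [], head, 0
    while start is not None:
        end, fragment = start, []
        while end is not None:
            steps += 1
            if steps > max_steps:
                raise RuntimeError("no progress: 'end' never advances")
            fragment.append(end.text)
            broken = end.broken
            if advance_before_break:
                end = end.next      # fixed order: always advance first
                if broken:
                    break
            else:
                if broken:
                    break           # buggy order: 'end' is left unchanged
                end = end.next
        fragments.append(fragment)
        start = end
    return fragments


if __name__ == "__main__":
    items = [("a", False), ("b", True), ("c", False)]
    print(split_fragments(make_list(items), advance_before_break=True))
```

With the buggy ordering, the outer loop restarts on the same broken node every time, which is the "no progress" condition described above.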
tor8	Robin_Watts: ta, that looks like it fixes the problem.	12:09.56
Robin_Watts	Fab.	12:10.04
	tor8, malc_: Thanks.	12:10.11
	tor8: Can i let you commit that?	12:10.23
malc_	Robin_Watts: and thank you for not indulging yourself with 'ta' and 'fab'	12:10.31
tor8	Robin_Watts: yes, I can commit that.	12:11.01
	Robin_Watts: that and one more short commit on tor/master	12:12.07
	one to add a build=sanitize flag to use clang/gcc's address sanitizer	12:12.20
malc_	tor8: you should add sanitize=undefined too :)	12:12.47
	would be fun to experience the fallout of that	12:13.18
Robin_Watts	tor8: lgtm.	12:16.14
sebras	tor8: great, thanks!	12:52.54
Robin_Watts	tor8: I've got a commit on robin/master to sort out that common code. See what you think.	14:03.14
tor8	Robin_Watts: I don't see any new commits	14:11.18
Robin_Watts	tor8: oops.	14:53.26
	tor8: sorry, look now.	14:55.58
	That runs with no diffs.	14:56.10
	(as you might expect)	14:56.16
	Do we have any epub files in the cluster?	14:56.28
HenryStiles	z/OS ... seriously?	14:56.51
	oh I guess it is more recent than I thought I'm confusing it with their older mainframe OS's	14:59.25
tor8	Robin_Watts: looks pretty good. maybe call it string_shaper rather than walker?	15:00.30
Robin_Watts	tor8: Could do.	15:00.42
tor8	not sure which is clearer, walker is pretty obvious :)	15:00.53
Robin_Watts	I originally had the shaping separate to the walking, but then I twigged I could put it all together.	15:01.14
	I think I prefer walker.	15:01.45
kens	HenryStiles : The described condition does not occur for me running the file on Windows, but in the absence of a command line.....	15:01.49
HenryStiles	kens: I was going to leave it with marcosw for now	15:02.15
Robin_Watts	cos it makes more sense that a "walker" gets called multiple times whereas a "shaper" might be expected to be called only once.	15:02.22
kens	As I said to Chrisl I was waiting for a regression run so I gave it a quick try	15:02.34
tor8	Robin_Watts: yeah. probably best to just leave it as is.	15:03.18
kens	I find it hard to see how they get to that line with size->y being 0, since there's an explicit test and return against it higher up	15:03.28
tor8	Robin_Watts: LGTM.	15:03.46
Robin_Watts	ta.	15:03.53
	tor8: Do you have a set of epub files you use to test? We should put that in the cluster.	15:07.24
tor8	Robin_Watts: I do not, I just write simple html files for testing new features...	15:12.08
Robin_Watts	tor8: OK.	15:12.23
tor8	that, and my private ebook collection	15:12.38
	which is mostly simple fiction so doesn't exercise any fancy features	15:12.53
	Robin_Watts: I think sebras collected a bunch of epub files a while back	15:15.28
	they should be on casper somewhere	15:15.33
Robin_Watts	tor8: so, next thing to think about...	15:27.28
	for some lines, when we shape things, the combined shaped text has a taller bbox than any of the individual glyphs.	15:28.03
	hence we ought to up the line spacing on such lines.	15:28.27
tor8	Robin_Watts: yeah... that's a difficult problem. uneven line heights are really ugly.	15:29.27
	we could just bump our default line spacing by a fair bit	15:29.45
Robin_Watts	More generally than that, I wonder how baselines compare for different scripts.	15:29.50
	We calculate bboxes as strictly positive things.	15:30.16
	i.e. from (0,0) + (w,h)	15:30.25
tor8	the html specification and implementations all do terrible typographic mistakes, with extra line heights added for stuff like <sup> tags	15:30.56
Robin_Watts	if we have languages that 'hang' from the baseline, then we might need both an 'ascender' and 'descender' value maybe.	15:31.14
tor8	Robin_Watts: measure_line measures the ascenders and descenders and figures out the baseline and total line height	15:31.50
Robin_Watts	oh, right, cool.	15:32.16
tor8	it returns the line height, line width and baseline values	15:32.29
	so I guess the problem we have now is that the font ascender value doesn't always match the final height of shaped stuff?	15:33.04
Robin_Watts	AIUI, no.	15:33.12
	So currently it assumes 80% above the baseline, and 20% below ?	15:33.58
tor8	Robin_Watts: oh, yeah... that code should probably look at the node->font :)	15:34.40
	or we add two floats to the struct (to your horror)	15:35.22
	the node->font is not necessarily the font that will be used, and the metrics may not match the fallback font	15:36.18
Robin_Watts	So layout_flow calls measure_word	15:36.29
	and after a few of those calls flush_line which calls measure_line	15:36.44
	While we are measuring the word we could keep the max/min ascenders/descenders.	15:37.19
	(We have the correct font etc in measure_word)	15:37.27
	those max/mins could be fed into measure_line?	15:37.52
tor8	node->y is the final calculated baseline to use for the node	15:38.38
	right, so do the max_a, max_d and line height calculations on the fly in measure_word instead?	15:39.43
	I'm thinking we could probably simplify a bit of this line layout code by using a packer structure or something	15:40.34
Robin_Watts	'on the fly' ?	15:41.05
tor8	but we don't want to paint ourselves into a corner, so we can't do TeX-style line layout	15:41.07
	Robin_Watts: sorry, that was a bit unclear. I mean we keep track of the ascender, descender and max height values while we loop over measure_word and eliminate the measure_line call	15:41.52
Robin_Watts	certainly we'd keep track of those values over the calls to measure_word.	15:42.31
tor8	then we can get accurate ascenders and descenders from the actual fonts used	15:42.33
Robin_Watts	I haven't got far enough to actually get to the fact that measure_line could go yet.	15:42.50
tor8	yeah. I believe we would not need the measure_line function at all then.	15:42.54
Robin_Watts	but possibly, yes.	15:42.59
tor8	well, all measure_line does is figure out the final width but we already have that at the call site	15:43.49
	or we wouldn't know to call flush_line	15:43.53
	Robin_Watts: still, I think if we try to measure several different possible layouts (as for TeX layout) we want to be able to loop over the nodes and figure out the same	15:44.54
	BUT, here's my gripe, as soon as we hit a fallback character the line height for that line will differ from all the others	15:45.40
	which is going to look absolutely terrible :(	15:45.53
	if we have problems with our default fonts being too close together we can adjust this line instead:	15:47.17
	style->line_height = number_from_property(match, "line-height", 1.2f, N_SCALE);	15:47.23
	change the 1.2 to something bigger, like 1.3	15:47.29
Robin_Watts	tor8: I understand your objection to differing line heights, and I broadly agree.	15:50.25
	What is TeXs solution to that problem?	15:50.44
tor8	not a clue	15:51.09
Robin_Watts	I think we mostly want to lay out lines based upon 1.2 the max ascender-descender	15:51.41
	(of the font)	15:52.17
tor8	Robin_Watts: that's what we currently do, but we use the max of any images in the line and the ascender and descender based on the em-size	15:52.36
Robin_Watts	Most of the time our lines will fit comfortably within that (cos most glyphs don't use the full extent of the glyph bbox)	15:52.50
tor8	if we just pick the actual node->font used I'd be okay with it	15:52.52
	s/used //	15:53.02
	and ignore any ascender/descender values in the fallback fonts	15:53.18
Robin_Watts	When we use a fallback char, we might get a different font.	15:53.28
tor8	then a sudden missing character won't throw off the line spacing	15:53.31
	but if someone picks a font with a specific ascender/descender/line-height we'll use that	15:54.00
	and we can make sure our defaults (in the absence of any user fonts) are sane	15:54.18
Robin_Watts	Or when we shape we might get glyphs that are offset outside that range.	15:54.29
	I reckon we want to stick with the current 1.2 * max, and only increase that if we have glyph combinations that have an actual measured min/max larger than that.	15:55.10
	So a random inserted bit of (say) arabic that slopes upwards can change the line spacing, but only if it genuinely would have run into the stuff above us.	15:56.30
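The rule proposed here can be written down as a small function. This is a sketch under stated assumptions, not MuPDF's layout code: the 1.2 factor comes from the log, while the function name and the convention of measuring the top as positive and the bottom as negative relative to the baseline are hypothetical.

```python
def line_height(em_size, measured_top, measured_bottom, spacing=1.2):
    """Use the default spacing unless the shaped glyphs genuinely need more room.

    measured_top:    highest point of the shaped text above the baseline (>= 0)
    measured_bottom: lowest point below the baseline (<= 0)
    """
    default_height = spacing * em_size
    measured_height = measured_top - measured_bottom
    return max(default_height, measured_height)


if __name__ == "__main__":
    # Ordinary text fits inside the default 1.2 * em box.
    print(line_height(10.0, 7.0, -2.0))
    # A run that rises unusually high pushes this one line's height up.
    print(line_height(10.0, 11.0, -3.0))
```

Only lines whose measured extent exceeds the default are affected, so most lines keep an even spacing.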
tor8	Robin_Watts: fonts also have a line height property which we currently ignore	15:56.32
	well, we currently ignore everything except the CSS set em-size	15:56.57
sebras	tor8: did I? did you find the epub stash?	16:00.41
tor8	Robin_Watts: https://www.w3.org/TR/CSS2/visudet.html#line-height	16:00.53
Robin_Watts	sebras: I could not find it.	16:02.12
sebras	Robin_Watts: any particular type that you're looking for? r2l ones I imagine?	16:04.23
	Robin_Watts: I guess I have them on my desktop. I'll let you know if I find them a little later today.	16:05.09
Robin_Watts	tor8: sorry about that. PC bluescreened.	16:13.53
	tor8: I just tried to read that css line-height spec, and my brain blue screened.	16:24.31
tor8	Robin_Watts: that tends to happen when you try to read the css spec.	16:25.49
Robin_Watts	I'm going to park this for a few hours while I ponder on it some more.	16:26.57
	reboot.	16:38.55
tor8	Robin_Watts: commit on tor/master for tracking serif/bold/italic so we can use them when looking for fallback fonts	17:09.10
Robin_Watts	tor8: using a char to hold a bool ?	17:10.31
tor8	Robin_Watts: yes.	17:10.38
	as far as I am concerned, they could just be ints	17:11.03
Robin_Watts	flag word :)	17:11.15
tor8	masking with constants is annoying...	17:11.33
	and bitfields are overkill	17:11.41
Robin_Watts	static inline int fz_font_is_bold(fz_font *b) { return !!(b->flags & FZ_FONT_BOLD); }	17:12.45
	but sure, it looks fine.	17:12.54
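A flag word packs several booleans into one integer and tests each with a bitwise AND against a mask. A minimal sketch of the idea follows; the constant values and helper names are hypothetical and not taken from MuPDF. Note that a logical AND in place of the bitwise one would report any font with a nonzero flag word as bold.

```python
# Hypothetical bit assignments for illustration only.
FONT_SERIF = 1 << 0
FONT_BOLD = 1 << 1
FONT_ITALIC = 1 << 2


def font_is_bold(flags):
    """True only if the bold bit is set in the flag word."""
    return bool(flags & FONT_BOLD)


def font_is_bold_wrong(flags):
    """Logical AND: true whenever any flag is set, which is not what is wanted."""
    return bool(flags and FONT_BOLD)


if __name__ == "__main__":
    italic_serif = FONT_SERIF | FONT_ITALIC
    print(font_is_bold(italic_serif))        # bold bit is not set
    print(font_is_bold_wrong(italic_serif))  # misreports the font as bold
```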
tor8	Robin_Watts: thanks.	17:14.14
	now we get serif fallbacks when available :)	17:14.27
Robin_Watts	Nice.	17:15.10
tor8	and should we one day decide to bloat the binary with italic and bold versions, we can get those too	17:15.33
Robin_Watts	If you ask for a bold font, and we don't have one, does it use fake_bold?	17:17.29
tor8	it does not, but it would be easy to add	17:20.46
	doing the same for italic would be bad though, unless we restrict it to latin/cyrillic/greek scripts where we know italic as slanted works	17:21.27
	but I worry that artificial boldening may make things illegible in some scripts	17:21.48
	currently we only do it for XPS where you can explicitly ask for fake bold	17:22.15
Robin_Watts	We had a complaint about SOT recently that bold is very important for CJKV, and we weren't supplying a bold font.	17:34.57