Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

	<<<Back 1 day (to 2016/01/25)	20160126
keramis	Hi, on stackoverflow I found "Fontmap {exch ==only ( ) print ==} forall". It does exactly what I was looking for, but couldn't find any docs for '==only'. Can anybody give me a clue?	09:00.45
kens	There are no docs; it's a Ghostscript-specific extension to the PostScript language	09:01.14
keramis	I see. Do you know what this exactly does? Just curious...	09:03.04
kens	It's defined in gs_init.ps; if you are familiar with PostScript you can figure it out	09:03.23
keramis	Thanks, I'll check it out.	09:03.54
	Enlightening, if you know where to look. :) BTW that thread on stackoverflow helped me a lot (https://stackoverflow.com/questions/11137732/what-are-postscript-dictionaries-and-how-can-they-be-accessed-via-ghostscript/11144359#11144359) to understand what is going on "under the hood". Maybe it should be incorporated in the docs or at least this link.	09:17.30
kens	Why ? It all appears to be standard PostScript, and the command line switches are documented	09:18.18
	Also, if it's not documented, you're not supposed to meddle with it; we will change undocumented stuff without warning.	09:19.33
*kens*	coffees	09:22.39
keramis	I understand this. But for me as a beginner this thread helped me a lot to get things up and running quickly. Reading the 912-page PLRM also helps of course, but is not nearly as quick! ;)	09:23.53
kens	There are other resources, but our documentation is not intended as a beginner's guide to PostScript	09:26.40
keramis	Thanks for clarification anyway!	09:28.43
	Apropos command line switches: I couldn't find the -c switch documented in the manpage of gs(1). It's easy to find out what it does, but is this intended?	09:42.30
kens	Forget man pages, use our documentation	09:42.50
	chrisl finally got dictionaries working.......	09:57.18
	I may actually be starting to understand param lists :-(	09:57.38
chrisl	kens: cool	09:57.39
	I doubt that!	09:58.05
kens	Well, I'll never understand the rationale	09:58.21
	Found another little 'gotcha'	09:58.30
	C param lists (and as far as I can see only C param lists) can have a 'target'	09:58.51
chrisl	target??	09:59.20
kens	If you fail to find the key in the list, then you search the target, and of course if that's a C param list and you fail to find the key there, then you search its target and so on	09:59.23
	The target is another param list	09:59.35
	It only seems to exist in C param lists	09:59.50
	Crazy implementation details	10:00.11
chrisl	Why is it called "target" and what purpose does it serve?	10:00.31
kens	Now you are asking....	10:00.42
	For purpose, it allows you to effectively aggregate dictionaries (or other collections) by pointing the 'target' of the list to the new list	10:01.21
	In this case, it's 'wrapping' the image params dictionary with an outer dictionary which contains some extra values, like the Rows and Columns	10:02.05
chrisl	So, it's a hack, basically?	10:02.30
kens	As far as I can tell, yes	10:02.41
	It's not reliable: if you have a param list that already has a target, and that target is not a C param list, then you simply can't replace the existing target	10:03.15
	Basically as you say an ugly hack	10:03.53
	But then that describes param lists pretty accurately throughout	10:04.10
chrisl	It also doesn't actually replicate the behaviour of dictionaries, which is confusing	10:04.39
kens	Well it 'sort of' does, it stops searching when it finds the first (most recent) definition for a key	10:05.18
	But you can't 'undef' a key that way	10:05.36
	Frankly it's crap	10:05.55
	I'm guessing it was put in as a quick solution for a problem, just like usual	10:06.23
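A minimal C sketch of the 'target' lookup kens describes above: search the list, and on a miss move on to its target, and so on, stopping at the first definition found. The names toy_param_list and toy_lookup are hypothetical and are not Ghostscript's actual param list API; the flat key/value arrays are an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a C param list: a flat array of
 * key/value pairs plus an optional 'target' list that is searched next. */
typedef struct toy_param_list {
    const char **keys;
    const int *values;
    size_t count;
    const struct toy_param_list *target; /* NULL when there is no target */
} toy_param_list;

/* Search the list, then its target, then that list's target, and so on.
 * Returns 1 and stores the value on the first match, 0 if nothing matches. */
int toy_lookup(const toy_param_list *list, const char *key, int *out)
{
    assert(key != NULL && out != NULL);
    for (; list != NULL; list = list->target) {
        for (size_t i = 0; i < list->count; i++) {
            if (strcmp(list->keys[i], key) == 0) {
                *out = list->values[i];
                return 1;
            }
        }
    }
    return 0;
}
```

With an outer list holding Rows and Columns whose target is the image params list, a key defined in both resolves to the outer value, and there is no way for the outer list to hide ('undef') a key that only the target defines.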
Robin_Watts	tor8: Updated versions of commits online now.	11:22.29
	That sorts the fallback stuff.	11:22.35
	We're never going to have more than 256 fallback fonts, right? :)	11:22.43
tor8	I certainly hope not! :)	11:22.58
Robin_Watts	I think we need to address line spacing issues.	11:23.04
	This is my quickly hacked together test: http://ghostscript.com/~robin/ManyLang.epub	11:23.48
tor8	if I take all of the noto fonts in regular style, (excepting the CJK fonts), there are 101 of them	11:24.20
	and they are 7.3Mb	11:24.44
	if we're daring, we could actually embed the lot	11:24.51
Robin_Watts	tor8: Can we combine them into sane sets?	11:26.05
tor8	if we drop egyptian hieroglyphs and cuneiform we'll save 1M	11:26.32
	I think they might be uncombinable due to use of opentype features etc for harfbuzz	11:26.56
	but we can group them based on unicode scripts	11:27.16
Robin_Watts	I can't immediately think why combining them should be a problem (assuming the tools know about opentype tables).	11:27.50
	and we can't have more than 65535 glyphs in any given font.	11:28.02
tor8	even if we can, do we really gain anything by it?	11:28.24
Robin_Watts	tor8: I suspect that every font contains the common glyphs.	11:29.31
	You're presumably looking at the unhinted variants of the noto fonts ?	11:30.48
tor8	yes.	11:31.19
	we don't use hinting in our rendering, so that would be a waste of space	11:31.32
Robin_Watts	Indeed.	11:31.36
	I'm guessing that stuff probably groups into 'CJK' 'Other South East Asia' 'Indic' 'Arabic' 'European'	11:32.27
	Middle Eastern rather than Arabic.	11:32.46
AverageJoe	Hello everyone! I am putting together an Android app for educational uses. I'm no expert by any means: is it possible/legal to use MuPDF to display PDFs within the application?	11:33.02
Robin_Watts	AverageJoe: MuPDF is released under 2 licenses.	11:33.20
	The first is the GNU AGPL. If you can abide by the terms of the GNU AGPL then you can use MuPDF in your application for free.	11:33.45
tor8	Robin_Watts: ah, you mean so that customers can easily skip entire families of scripts to save space?	11:33.52
Robin_Watts	tor8: Yes.	11:33.59
tor8	thirdparty/harfbuzz/src/hb-alloc.h:14:17: error: unknown type name 'size_t'	11:34.04
	could be done by #ifdefs in the file that includes the embedded font	11:34.20
	like we do for the CJK CJKNOFULL etc	11:34.25
Robin_Watts	AverageJoe: Those terms include, but are not limited to, the fact that you will have to give away the source code to your entire app to anyone that asks for it that has got a copy of your app.	11:34.59
	AverageJoe: Most app developers looking to make a profit figure that that's a non-starter :)	11:35.26
tor8	Robin_Watts: needs a #include <stddef.h> in hb-alloc.h	11:35.49
Robin_Watts	So, MuPDF is available under another license that removes all the nasty terms and conditions. That's the Artifex Commercial License.	11:36.16
tor8	Lock ordering violation: Attempt to take lock 3 when 2 held already!	11:36.19
Robin_Watts	But that will cost you money.	11:36.21
AverageJoe	well no Intentions to make profits here :D i doubt someone would pay for sth like this anyways.	11:36.41
Robin_Watts	AverageJoe: Well, if you're happy to release your app under the AGPL, then yes, you can use MuPDF under that license for free.	11:37.19
AverageJoe	so if i put it on bitbucket or github or sth. and copypaste the gnu licensing hints and links/email to me i am on the safe side basically	11:37.49
Robin_Watts	AverageJoe: Yes, AIUI (but I Am Not A Lawyer)	11:38.12
AverageJoe	guess i have to read into it	11:38.17
	thank you very much so far!	11:38.34
Robin_Watts	no worries. let us know how you get on.	11:38.42
	AverageJoe: There is a new version of MuPDF coming out soon that includes revised Java/JNI code.	11:39.03
	It means you can call MuPDF directly rather than using our example app.	11:39.33
	tor8: crap. How are you getting that?	11:39.47
tor8	Robin_Watts: mupdf-gl on ManyLang.epub	11:40.31
	with your branch	11:40.46
Robin_Watts	I take and drop the freetype lock around the draw code.	11:41.15
tor8	the shaped text looks nothing like what firefox renders for ManyLang.epub (when you extract the html file)	11:41.20
Robin_Watts	Does your draw code use the glyphcache lock ?	11:41.24
tor8	nope, mupdf-gl doesn't use the mupdf font rendering	11:41.40
	it uses freetype directly on its own	11:41.44
Robin_Watts	I will look into it.	11:42.03
	tor8: Shaped text being different - crap. Same font?	11:42.27
tor8	Robin_Watts: which font are you testing with?	11:45.21
Robin_Watts	DroidSansFallback in mupdf	11:53.33
tor8	Robin_Watts: something else is fishy. the japanese, chinese and hindi text disappears	11:55.03
Robin_Watts	Hindi disappears, certainly. It's not in the fallback font.	11:55.33
tor8	it should turn into tofu or bullets, no?	11:55.48
Robin_Watts	tor8: Not currently.	11:56.01
	I only get 1 char of the japanese or chinese text. That's not right :(	11:56.18
tor8	still doesn't explain why only 1 char of japanese or chinese text appears...	11:56.30
Robin_Watts	tor8: indeed.	11:56.45
tor8	and the text in "English sentence with ...... in the middle of it." looks nothing like firefox renders it	11:56.51
Robin_Watts	tor8: I'll look into that, but I'd like to have a better fallback font in place first.	11:57.32
	It wouldn't surprise me to find that the shaping stuff in notosansfallback is knackered, so I might be searching in code for what are actually font problems.	11:58.12
tor8	Robin_Watts: I still get the same characters (not matching firefox) out when using NotoNaskhArabic-Regular.ttf instead of DroidSansFallbackFul.ttc	11:59.10
	and with NotoNaskhArabic-Regular.ttf I get tofu out for Hindi	11:59.53
	still only 1 char each for japanese and chinese	12:00.01
	at least it's the correct 1st char :)	12:00.23
Robin_Watts	Ok, so the locking stuff is happening here. It's down to the draw call taking the glyphcache lock.	12:00.56
	I can fix that.	12:00.59
	tor8: OK, so new commit on robin/harfbuzz that fixes the locking.	12:12.37
	Do you have a commit that adds the proper fonts I can snaffle?	12:13.00
	brb.	12:13.08
tor8	Robin_Watts: working on adding a commit where we can do html_lookup_noto_font with a UCDN script tag	12:21.51
Robin_Watts	tor8: Could that maybe be fitz_lookup_font_for_script ?	12:25.05
	a) Doesn't need to be html rather than fitz.	12:25.22
	b) would be nice if other people could slot in other fonts there without the 'noto' bit confusing names.	12:25.44
tor8	Robin_Watts: yeah, we could put in fitz instead	12:26.47
	might make sense to move the pdf builtin fonts into fitz as well	12:26.59
zoug	hello, any mupdf users here? Do you guys know if it's possible to display two pages side by side? I couldn't find anything on the web	12:49.54
tor8	zoug: it is not possible	12:55.56
zoug	tor8: :(	12:56.08
	thanks for your help	12:56.15
tor8	if you're handy with a c compiler, you can probably add it to mupdf-gl given an afternoon or two of hacking	12:57.24
Robin_Watts	zoug: To be clear. The MuPDF core is absolutely capable of that.	12:57.54
	But none of our released viewers include that functionality.	12:58.08
zoug	okey! hopefully it'll be added in the future if you guys think it could be good	12:58.39
Robin_Watts	zoug: Check out gsview.	12:58.56
zoug	I personally am not good at all at programming so couldn't help unfortunately	12:58.57
Robin_Watts	That uses mupdf as the view engine, and if it's added anywhere, it will probably be to there.	12:59.17
zoug	okey, will do	12:59.42
AverageJoe	Robin_Watts: any estimates on the java/JNI code? I got an example project that is using it but i am not sure if i understand it.	13:09.10
Robin_Watts	AverageJoe: what do you mean by estimates?	13:09.30
AverageJoe	date :D	13:09.42
Robin_Watts	Oh, right, well, it's in the public git now.	13:09.59
AverageJoe	whoops	13:10.09
Robin_Watts	next release is scheduled for march.	13:10.12
	actually, I lie.	13:11.23
	It's not there yet.	13:11.33
	It is here: http://git.ghostscript.com/?p=user/robin/mupdf.git;a=shortlog;h=refs/heads/jni	13:12.01
	The JNI bindings are in the penultimate commit on that branch.	13:12.21
	The final commit is a work in progress to move the app over to working on top of those new bindings.	13:12.49
	It's there enough to prove that those bindings work, but it's not fully featured yet.	13:13.09
tor8	Robin_Watts: commits on tor/master to add noto fonts, and load them into the html fallback chain	13:24.51
Robin_Watts	tor8: Ta.	13:24.59
	tor8: What do we need to do to get the JNI changes in?	13:25.11
tor8	it should build on windows as well, but that's untested	13:25.18
Robin_Watts	You were looking at platform/java ?	13:25.22
tor8	Robin_Watts: I was going to look at platform/java and make a very simple desktop java viewer on top	13:25.37
	but it completely slipped my mind...	13:25.45
	I have goldfish memory when it comes to remembering where I was after the weekend	13:26.08
	and I'd forgotten to write it down in my TODO file :)	13:26.14
Robin_Watts	I'd really like to get this stuff into the release.	13:26.21
tor8	Robin_Watts: Agreed!	13:26.28
	and I'd also love to have shaping with the noto fonts in the release as well	13:26.49
	the full noto set as compiled looks to add 5.8Mb to the binary	13:27.26
	we have both the sans and serif variants where available, but I dropped the 'bold'	13:28.07
Robin_Watts	tor8: How much would bold and italic (and bold italic) options add?	13:29.14
	If they are easily selectable at build time, then I can imagine that desktop users would just add the lot.	13:29.53
tor8	it'd double the size	13:30.45
	roughly	13:30.50
Robin_Watts	For desktop use that's nothing.	13:31.01
tor8	if we drop the serifs (and keep only the sans-regular faces for each script) we're down to 4.7m	13:33.29
	if we take the serif rather than the sans, but still only keep one of the two: 4.8m	13:34.16
Robin_Watts	tor8: In an ideal world, we'd like to be able to say at build time: "include serif", "include sans serif", "include bold", "include italic", "include cjk" etc.	13:35.02
tor8	Robin_Watts: yeah.	13:35.12
	I'm thinking this compiler must be smarter than usual... it's dropping the static arrays that aren't used	13:35.30
	yeah, if I compile with gcc it dumps in the whole lot	13:35.58
Robin_Watts	Possibly we should have a header file that just resolves a nest of INCLUDE_NOTO_SERIF etc stuff into INCLUDE_NOTO_SANS_BOLD etc.	13:36.07
	and that way people can EITHER use INCLUDE_NOTO_{SANS,SERIF,BOLD,ITALIC} etc, or they can use INCLUDE_NOTO_SANS_BOLD (i.e. explicit fonts)	13:36.45
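A rough sketch of what such a header could look like, assuming the coarse switches simply combine into explicit per-font switches. INCLUDE_NOTO_SANS, INCLUDE_NOTO_SERIF, INCLUDE_NOTO_BOLD and INCLUDE_NOTO_SANS_BOLD come from the discussion; the _REGULAR and _SERIF_BOLD names, the bitmask and noto_enabled_mask are hypothetical additions for illustration only.

```c
/* A real build would pass e.g. -DINCLUDE_NOTO_SANS -DINCLUDE_NOTO_BOLD;
 * they are defined inline here so the sketch compiles on its own. */
#define INCLUDE_NOTO_SANS
#define INCLUDE_NOTO_BOLD

/* Resolve the coarse switches into explicit per-font switches, without
 * overriding anything the builder already selected explicitly. */
#if defined(INCLUDE_NOTO_SANS) && !defined(INCLUDE_NOTO_SANS_REGULAR)
#define INCLUDE_NOTO_SANS_REGULAR
#endif
#if defined(INCLUDE_NOTO_SANS) && defined(INCLUDE_NOTO_BOLD) && !defined(INCLUDE_NOTO_SANS_BOLD)
#define INCLUDE_NOTO_SANS_BOLD
#endif
#if defined(INCLUDE_NOTO_SERIF) && !defined(INCLUDE_NOTO_SERIF_REGULAR)
#define INCLUDE_NOTO_SERIF_REGULAR
#endif
#if defined(INCLUDE_NOTO_SERIF) && defined(INCLUDE_NOTO_BOLD) && !defined(INCLUDE_NOTO_SERIF_BOLD)
#define INCLUDE_NOTO_SERIF_BOLD
#endif

enum {
    HAS_SANS_REGULAR = 1,
    HAS_SANS_BOLD = 2,
    HAS_SERIF_REGULAR = 4,
    HAS_SERIF_BOLD = 8
};

/* Report which explicit fonts ended up enabled, as a bitmask. */
int noto_enabled_mask(void)
{
    int mask = 0;
#ifdef INCLUDE_NOTO_SANS_REGULAR
    mask |= HAS_SANS_REGULAR;
#endif
#ifdef INCLUDE_NOTO_SANS_BOLD
    mask |= HAS_SANS_BOLD;
#endif
#ifdef INCLUDE_NOTO_SERIF_REGULAR
    mask |= HAS_SERIF_REGULAR;
#endif
#ifdef INCLUDE_NOTO_SERIF_BOLD
    mask |= HAS_SERIF_BOLD;
#endif
    return mask;
}
```

The file that embeds the font data would then test only the explicit switches, so a builder can set either the coarse or the explicit form.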
tor8	ah, no, gcc is also smart. it's just dumb if we use the .incbin inline asm directives to speed up compilation	13:37.04
Robin_Watts	tor8: That's strange.	13:46.00
	The ABCDEF japanese text is being converted to A B C D E F before it even gets to the bidi stuff.	13:46.22
tor8	Robin_Watts: that's because japanese can be line broken between any characters	13:46.52
Robin_Watts	oh, ok.	13:47.17
	That makes more sense then, ta.	13:47.27
	D'Oh. Stupid mistake.	14:08.15
	Japanese/Chinese fixed.	14:09.25
tor8	Robin_Watts: hm, there's one noto font that doesn't fit the script scheme -- NotoSansSymbols	14:10.24
	that's got all the weird punctuation in it	14:10.30
	should have that as the final fallback I guess	14:10.36
Robin_Watts	tor8: Ah, that's a pain.	15:06.53
	Cos I split fragments so they are common + any one script in each fragment.	15:07.23
	Are all the 'weird punctuation' things common?	15:07.47
	(I mean, are they all classed as "common" rather than appearing commonly)	15:08.10
kens	Still not getting email from the mailing lists :-)	15:09.04
Robin_Watts	kens: I was just thinking that'd be a topic for the meeting.	15:09.19
kens	Yeah I noticed because I got Marcos' bug report and was checking through it before meeting	15:09.42
Robin_Watts	tor8: So, looking at these commits:	15:23.01
	I don't like the fz_lookup_noto_font name. Having 'noto' in there bothers me. Having it as a fallback font would be better, I reckon.	15:23.55
	I'm also not hugely keen on the implementation. Could we consider a table of script/serif/font pointer ?	15:24.59
	When asked for a font for a given script we'd search the table for a matching entry? If we don't have a matching serif, we would live with a matching script.	15:25.55
	We could actually make the table be script/serif/italic/weight/font.	15:26.17
	and that way people can add new fonts just by adding to the table.	15:26.41
	Possibly we could have font pointers be both internal (memory pointers) or external (filenames).	15:27.18
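One way the table idea could be sketched in C, assuming a linear search is good enough for a table of this size. The fallback_entry struct and find_fallback function are hypothetical, not MuPDF code; only the script and serif columns are modelled, and the font column is a plain string standing in for either a memory pointer or a filename.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table row: one entry per available fallback font. */
typedef struct {
    int script;       /* script id, e.g. a value from ucdn's script call */
    int serif;        /* 1 = serif, 0 = sans */
    const char *font; /* stand-in for a memory pointer or a filename */
} fallback_entry;

/* Return the entry matching both script and serif. If the script is
 * present but not in the requested serif style, live with the first
 * entry for that script. Returns NULL when the script is not covered. */
const fallback_entry *find_fallback(const fallback_entry *table, size_t n,
                                    int script, int serif)
{
    const fallback_entry *script_only = NULL;
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].script != script)
            continue;
        if (table[i].serif == serif)
            return &table[i];
        if (script_only == NULL)
            script_only = &table[i];
    }
    return script_only;
}
```

Extending the row with italic and weight columns would follow the same pattern, and adding a font becomes a one-line change to the table.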
marcosw	morning all. are we meeting here or on the other side?	15:27.38
Robin_Watts	Other side, I think.	15:28.43
	tor8: Also, I don't like fz_load_html_fallback_font being html specific.	15:29.05
tor8	Robin_Watts: I anticipate adding a caching layer where loaded fz_font's get stored and that's where people would add new fonts	15:32.39
Robin_Watts	tor8: An fz_context->font thing ?	15:33.09
tor8	Robin_Watts: yes, all weird punctuation things are classed as common	15:33.12
	that or fz_html_font_set thing	15:33.21
Robin_Watts	tor8: Does the 'weird symbols' font contain ALL the punctuation?	15:33.54
tor8	Robin_Watts: no, it does not contain the usual punctuation	15:36.32
Robin_Watts	tor8: urgh. So we might need a wrapper around ucdn's get script call to define the 'weird' stuff as being a different script.	15:40.07
tor8	Robin_Watts: can we split the runs on final font as well as script	15:43.46
	find the script, use the cmap to lookup the encoding in the fallback chain to get the font	15:44.10
	and make runs of common direction+script+font to feed to harfbuzz?	15:44.24
Robin_Watts	Not trivially, the way it's currently structured.	15:44.44
	The bidi stuff (which is where I am doing the splitting at the moment) doesn't have a font.	15:45.01
tor8	the actual glyph index we find when encoding the unicode we'd toss, just use it to see if a code point exists in a given font	15:45.10
	instead of looking for a best_font for an entire script, split where you get different fonts?	15:45.51
Robin_Watts	tor8: nodes are already split on styles.	15:46.51
	I look for a best_font on a node.	15:47.19
tor8	I mean to further split the node by looking up the actual font for each character	16:01.57
Robin_Watts	tor8: Right. I can add code to do that to 'newFragCb' in the html stuff.	16:02.47
tor8	Robin_Watts: I think we should only load the fallback fonts on an as-needed basis though	16:03.29
Robin_Watts	tor8: Yes.	16:03.42
tor8	so we should move away from the font->fallback chain to something else that keeps the fallback fonts in a separate struct	16:03.53
Robin_Watts	tor8: So newFragCb gets called with fragments of code that are already split on directionality and script.	16:04.35
	s/code/text/	16:04.41
	(well, by "script" I mean no fragment contains more than a single script type + punctuation)	16:05.51
tor8	Robin_Watts: hang on, let me check out your branch	16:05.52
	Robin_Watts: single 'resolved' script type (where common and inherited are handled)	16:07.11
	Robin_Watts: so, I think if you look up each character in that text to find the actual font needed and split the node on those boundaries too?	16:07.43
Robin_Watts	inherited?	16:08.06
tor8	combining characters have inherited (diacritics, etc)	16:09.02
	COMBINING GRAVE ACCENT, etc	16:09.25
	just treat them the same as common	16:09.34
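A small C sketch of the splitting rule discussed here: each fragment holds at most one 'real' script, and common and inherited characters are treated as neutral and stay with the current fragment. The TOY_* script ids and split_runs are hypothetical; real code would take the per-character script from ucdn and would also split on directionality, which is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical script ids for illustration only. */
enum { TOY_COMMON, TOY_INHERITED, TOY_LATIN, TOY_ARABIC };

/* Write the start index of each run into starts[] (capacity n) and
 * return the number of runs. A new run begins only when a character has
 * a real script that differs from the run's resolved script. */
size_t split_runs(const int *scripts, size_t n, size_t *starts)
{
    size_t runs = 0;
    int current = TOY_COMMON; /* resolved script of the run so far */
    assert(n == 0 || (scripts != NULL && starts != NULL));
    for (size_t i = 0; i < n; i++) {
        int s = scripts[i];
        int neutral = (s == TOY_COMMON || s == TOY_INHERITED);
        if (i == 0) {
            starts[runs++] = 0;
            current = neutral ? TOY_COMMON : s;
        } else if (!neutral) {
            if (current == TOY_COMMON) {
                current = s;        /* leading neutrals adopt this script */
            } else if (s != current) {
                starts[runs++] = i; /* a different real script starts a run */
                current = s;
            }
        }
    }
    return runs;
}
```

Because the resolved script of each run is known at this point, it could be handed out alongside the fragment rather than recomputed later.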
Robin_Watts	OK. Will do.	16:09.45
	tor8: I wonder whether I should do any subsequent splitting in measure_word	16:10.06
tor8	I'd just do it here, keep it gathered	16:10.20
Robin_Watts	measure_word is where I attempt a shape on the node and gradually fallback.	16:10.36
tor8	or not split, just feed smaller spans of each node to shaping	16:11.00
Robin_Watts	I reckon there is a better way to do this.	16:14.16
	In the bidi stuff, where I split fragments before I call newFragCb, I call 'ucdn_get_script' for every char.	16:14.56
	I want to call fz_ucdn_get_script instead, and that function will call ucdn_get_script. If it gets 'common' back, it will further subdivide out the 'weird' stuff.	16:15.54
	Then the code will automatically split the nodes properly.	16:16.18
	I could even return the script type with each fragment.	16:17.10
	Which could be stored in the node in the bitfield for no extra cost.	16:17.24
	Then it's easy to load the right fallback font without searching.	16:17.45
marcosw	i'm going to reboot casper. it is overdue for updates.	16:18.06
	tor8 Robin_Watts sebras: ^^^	16:18.29
tor8	the boundary between common and weird-common scripts would depend on the actual font used	16:20.49
kens	chrisl ping	16:21.08
chrisl	kens: pong	16:23.15
kens	Looking into a bug I see a (minor) problem with error reporting	16:23.35
	in stream.h we define the check_file macro which returns gs_error_invalidaccess if the file is invalid, the PLRM says that should be an ioerror	16:24.11
	I propose to change to an ioerror, what do you think ?	16:24.20
	I should say I'm looking at the setfileposition operator which says it should return ioerror if the file is invalid	16:26.20
chrisl	kens: I don't have a problem with that - but I'm a little hazy on what that macro is checking.....	16:26.40
kens	First it checks to see if the type of the object is t_file, then it checks to see if it's invalid (no idea what the details are there)	16:27.21
	If its type is wrong then it returns a typecheck, otherwise it returns invalidaccess	16:27.40
	And I think that should be ioerror	16:27.49
chrisl	Yep, that makes sense	16:28.17
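The check order described above, with the proposed change applied, could be sketched like this. Everything here is a hypothetical stand-in: toy_ref, toy_check_file and the TOY_* values are not Ghostscript's real ref type, check_file macro or error codes, and how validity is actually determined is not covered by the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not Ghostscript's real types or error values. */
enum { TOY_OK = 0, TOY_TYPECHECK = -1, TOY_IOERROR = -2 };
enum { TOY_T_FILE, TOY_T_INTEGER };

typedef struct {
    int type;  /* TOY_T_FILE or something else */
    int valid; /* non-zero when the file object is still valid */
} toy_ref;

/* Wrong type gives a typecheck; an invalid file gives an ioerror, which
 * is the proposed replacement for the current invalidaccess. */
int toy_check_file(const toy_ref *op)
{
    assert(op != NULL);
    if (op->type != TOY_T_FILE)
        return TOY_TYPECHECK;
    if (!op->valid)
        return TOY_IOERROR;
    return TOY_OK;
}
```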
kens	I'll try a cluster push now	16:28.31
Robin_Watts	tor8: "the boundary between common and weird-common scripts would depend on the actual font used" howso?	16:28.42
kens	Of course I still have to find out why the file is invalid, but that's a different problem	16:28.44
tor8	Robin_Watts: what do you consider 'common' and 'weird-common' and why would you want to split at them?	16:32.27
	I thought this was to get around weird punctuation not existing in a given fallback font, so we can drop to another one	16:32.50
	without dropping for the entire word	16:33.03
Robin_Watts	tor8: Any char that is in 'common' should exist in every font, AIUI.	16:33.23
	Any char that is in 'weird-common' might well only live in the symbol font.	16:33.39
	Hence I would split every node to have either common/inherited/a single language.	16:34.10
	and any weird-common stuff goes into their own nodes.	16:34.29
tor8	Robin_Watts: I think that's a flawed assumption though	16:35.38
	it would be nice if such were the case	16:35.51
	but we're going to have 3 levels of fonts as I see it: the user specified font (regardless of script), which will fall back to a per-script fallback font, which will fall back to a catch-all symbol font	16:36.45
	the user-specified font is the one in the "style" struct	16:37.31
Robin_Watts	tor8: Ok. So, I'm still going to change the bidi stuff to also pass out the 'script' value for each fragment (which may be COMMON).	16:39.14
marcosw	I need to run to a doctor's appointment; will be back later today.	16:39.33
tor8	Robin_Watts: CJK for example uses different punctuation, which is script common but still sort-of specific to cjk	16:39.36
Robin_Watts	That way we can try the user specified font, and if that fails, we can drop back direct to the script font.	16:39.53
tor8	Robin_Watts: yeah	16:40.05
Robin_Watts	If the script font STILL misses some chars, I'll split those out, and look for them in the symbol font.	16:40.23
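The three levels tor8 lists (user-specified font, then per-script fallback, then catch-all symbol font) could be sketched per character as below. toy_font, toy_font_has and pick_level are hypothetical; a font is modelled as a bare list of code points it covers, whereas real code would consult the font's cmap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical font model: just the code points the font can encode. */
typedef struct {
    const unsigned *codepoints;
    size_t count;
} toy_font;

enum { LEVEL_USER, LEVEL_SCRIPT, LEVEL_SYMBOL, LEVEL_NONE };

/* Return 1 if the font covers the code point, 0 otherwise. */
int toy_font_has(const toy_font *font, unsigned cp)
{
    assert(font != NULL);
    for (size_t i = 0; i < font->count; i++)
        if (font->codepoints[i] == cp)
            return 1;
    return 0;
}

/* Try the user font first, then the script fallback, then the symbol
 * font. LEVEL_NONE means no font covers the character. */
int pick_level(const toy_font *user, const toy_font *script,
               const toy_font *symbol, unsigned cp)
{
    if (toy_font_has(user, cp))
        return LEVEL_USER;
    if (toy_font_has(script, cp))
        return LEVEL_SCRIPT;
    if (toy_font_has(symbol, cp))
        return LEVEL_SYMBOL;
    return LEVEL_NONE;
}
```

Splitting a node wherever the chosen level changes would let a single missing character drop to another font without dropping the entire word.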
tor8	so you ran into trouble getting the punctuation script for cjk punctuation due to it being one fragment per character?	16:40.39
Robin_Watts	tor8: No.	16:41.04
tor8	we could do the script analysis all the way up in generate_text and feed the script when creating the flow nodes	16:41.14
Robin_Watts	The "only 1 char of japanese" was a stupid mistake to do with zero advance chars.	16:41.38
tor8	in fact, that'd probably be best should we eventually start respecting the html tags that define language etc	16:41.39
Robin_Watts	tor8: Urm...	16:42.03
	html can have markup for l2r and r2l. And that information should be fed into the nodes.	16:42.26
tor8	yeah, and language as well	16:42.39
Robin_Watts	which in turn gets fed into the bidi stuff as a 'base' direction.	16:42.40
tor8	does the bidi stuff split on scripts?	16:42.54
	or just bidi runs?	16:42.57
	I'm suggesting we resolve, split and feed script into the nodes as well	16:43.31
Robin_Watts	The language stuff is mostly useful for resolving the glyphs for unicode things that can have different looks in different languages.	16:43.35
	The core bidi code (which I am not changing really) splits on runs.	16:44.03
	The code of mine that wraps it splits on script too.	16:44.12
tor8	to cover the case where a node has no text that's anything other than UCDN_SCRIPT_COMMON	16:44.24
	but should still copy it from surrounding text	16:44.39
	a case we don't currently handle, AIUI	16:44.46
Robin_Watts	Yeah, the bidi code takes care of all that.	16:44.52
	I gather all the text from a flow up into a single buffer (punctuation, and all, multiple scripts, different styles etc).	16:45.22
	That buffer gets fed to the bidi code which calls me back to say "chars 0 to 3 are one fragment, with directionality l2r" "chars 4 to 6 are another fragment with directionality r2l" etc.	16:46.04
tor8	oh, right! you do the bidi scanning and splitting on the whole paragraph. nevermind my ramblings then :)	16:46.06
Robin_Watts	I then split the nodes further so that no node is in more than one bidi fragment.	16:46.37
tor8	still, should probably save the script in the node from that step to help looking up the right set of fallback fonts for each node	16:46.41
Robin_Watts	Yes.	16:46.45
	And each node should have a tag in it to say what language was actually specified by the markup (to be used when resolving the unicode unified stuff)	16:47.19
tor8	because once in draw_word and measure_word we don't have the script tag from the bidi splitting pass	16:47.26
	Robin_Watts: yes.	16:47.33
Robin_Watts	tor8: It will, cos I want to add that :)	16:47.45
tor8	un-unify the unified CJK characters per language? :)	16:48.17
Robin_Watts	Also each node should contain a tag to say whether a specific l2r or r2l was specified by the markup.	16:48.20
tor8	then we'll need a better CJK font than droidsansfallback	16:48.27
Robin_Watts	tor8: Yes, that's basically what you're required to do.	16:48.27
	tor8: Right.	16:48.36
	Or at least we'll have the freedom to do that.	16:48.44
	That's the argument for separate C/J/K/V fonts.	16:48.54
tor8	the noto han sans font (which is huge) has language specific characters	16:48.56
	which we could manage to deal with by making a TTC with a subfont per language	16:49.11
	like we currently do to hack vertical/horizontal writing modes	16:49.25
Robin_Watts	tor8: Yes, you can have the different variants within a font and use an opentype specific mechanism to get the right ones.	16:49.29
	but that's hard, and I'm not sure freetype can do that.	16:49.38
tor8	or as the ones they ship do, use opentype	16:49.40
	I think harfbuzz can do that. freetype doesn't handle any opentype tables, at all.	16:49.54
	which is why I had to make the TTC hack for droidsansfallback	16:50.08
	the original droidsansfallback had alternate glyph lookup tables for vertical writing using opentype tables	16:50.26
Robin_Watts	tor8: If Harfbuzz sorts it, great.	16:51.13
	So, I'll push through the changes I need and put a commit up.	16:51.36
tor8	Robin_Watts: it should, not saying it'll be easy. and it won't work for PDF.	16:51.36
Robin_Watts	None of this stuff works for PDF.	16:51.45
tor8	Robin_Watts: fab. I've got a full chain of fallback fonts on tor/master you could rebase ontop of	16:51.54
	and then hack the ugly chain up and make the html_font_set cache things as needed, indexed by script	16:52.20
Robin_Watts	tor8: How many scripts are there ?	16:53.09
	132, I think.	16:53.28
tor8	88 in use	16:53.35
	132 or so max	16:53.50
Robin_Watts	I was going by the UCDN_SCRIPT thing.	16:53.57
	Right.	16:53.58
tor8	but the upper chunk 36 ones don't have a noto font	16:54.06
	Robin_Watts: yeah. I'd use the UCDN_SCRIPT thing as the limit.	16:54.26
Robin_Watts	I think I'd be in favour of a ctx->font->fallbacks.	16:54.26
tor8	Robin_Watts: yeah. that'd probably be better, so we don't reload them all for every html document	16:54.55
Robin_Watts	yeah.	16:54.59
	and can we define 132 to be 'SYMBOL' ?	16:55.18
tor8	the html_font_set is to collect all fonts for a given document	16:55.20
	Robin_Watts: abuse SCRIPT_UNKNOWN?	16:55.39
Robin_Watts	actually, ignore that.	16:55.53
tor8	I'd put the NotoSansSymbol font into font->fallbacks->final_fallback	16:56.22
Robin_Watts	tor8: Did you see my burbling above about script/italic/weight/font ?	16:57.46
tor8	Robin_Watts: yeah, I'm thinking that stuff would go into the ctx->fallback thing?	16:58.12
Robin_Watts	I'd like to be able to say "get me a fallback for this font" and get the closest fallback that we have from the fallback set, via the context thing.	16:58.33
	Yeah, sounds like we're on the same page.	16:58.38
tor8	I'd still like to keep the builtin data font lookups like they are; those functions shouldn't be used by users.	16:59.11
Robin_Watts	So I'll let you do the fonty stuff, and I'll do the html/language/bidi stuff.	16:59.11
tor8	Robin_Watts: okay, cool. I'll look at bashing together a fallback font context thing tomorrow then.	16:59.45
Robin_Watts	I'm thinking that customers may want to customise this stuff.	16:59.46
	If we were to sell this to a Korean e-reader manufacturer for instance, they might want to include bold/italic/sans/serif korean fonts but only rudimentary japanese ones, say.	17:00.50
tor8	Robin_Watts: yeah.	17:01.04
Robin_Watts	Hence having a simple table driven thing would be an advantage.	17:01.09
	Forward 1 day (to 2016/01/27)>>>
