| | 20160121 |
tor8 | Robin_Watts: ugh, the const change is already coming back to bite me in the ass :( | 11:08.45 |
| Robin_Watts: I don't think fz_font needs the const identifier (since it's basically an opaque immutable structure once created) | 11:11.38 |
| and I think I may change my mind about fz_colorspace too | 11:18.10 |
Robin_Watts | If it's an opaque immutable structure, then why would it being const hurt ? | 11:23.57 |
tor8 | then you didn't go far enough in consting the source... | 11:24.14 |
| fz_text takes a non-const fz_font argument | 11:24.26 |
| fz_add_text* | 11:24.33 |
Robin_Watts | Now *that's* possible. | 11:25.06 |
tor8 | if it's opaque, with no public mutating calls, const or non-const shouldn't matter | 11:25.07 |
| and then const just adds line noise | 11:25.20 |
| with no extra semantic gain, IMO | 11:25.25 |
Robin_Watts | If it stops people doing font->size += 2, it's a win :) | 11:25.54 |
tor8 | I'm still with you on the fz_text and fz_path structs being const because they can be mutated | 11:25.59 |
Robin_Watts | oh, it's really opaque to callers? | 11:26.42 |
tor8 | int fz_encode_character_with_fallback(fz_context *ctx, fz_font *font, int ucs, fz_font **out_font); that's where I started running into trouble | 11:26.55 |
Robin_Watts | Then maybe yes, we could back out the font constness. | 11:27.15 |
tor8 | with const and double-star pointers and then realizing I needed to add even more consts to bits that currently didn't | 11:27.18 |
| it should be opaque, it's not because of how we've not separated public and private headers. | 11:27.56 |
| same with fz_colorspace | 11:28.06 |
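A minimal sketch of the opacity point above, assuming the public/private header split that is being proposed. When callers only see the forward declaration, something like `font->size += 2` cannot compile in their code, whether or not the pointer is const. All `demo_` names are hypothetical and are not MuPDF API; both halves are shown in one file here only so the sketch compiles on its own.

```c
#include <stdlib.h>

/* --- what a public header would expose (hypothetical names) --- */
typedef struct demo_font demo_font;          /* incomplete type: no fields visible */
demo_font *demo_new_font(float size);
float demo_font_size(const demo_font *font); /* read-only accessor */
void demo_drop_font(demo_font *font);

/* --- what a private implementation header/source would contain --- */
struct demo_font {
    float size;
};

demo_font *demo_new_font(float size)
{
    demo_font *font = malloc(sizeof *font);
    if (font)
        font->size = size;
    return font; /* NULL if allocation failed */
}

float demo_font_size(const demo_font *font)
{
    return font->size;
}

void demo_drop_font(demo_font *font)
{
    free(font); /* free(NULL) is a no-op */
}
```

With this layout, const on `demo_font *` parameters only constrains the implementation, since external code has no way to touch the fields directly.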
| Robin_Watts: two commits on tor/master | 11:29.55 |
| the html text layout got quite simplified once I added a font->fallback pointer and chained encoding lookups | 11:30.22 |
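A rough sketch of what a chained lookup through a `fallback` pointer could look like. This is not the code from tor/master: the `demo_font` struct, the range-based coverage test, and returning 0 with a NULL `*out_font` on a miss are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a font: covers one contiguous code point range. */
typedef struct demo_font demo_font;
struct demo_font {
    const char *name;
    int first, last;     /* inclusive range of code points this font covers */
    demo_font *fallback; /* next font to try, or NULL at the end of the chain */
};

/* Return a glyph id (> 0) if this font alone covers ucs, else 0. */
static int demo_encode_character(const demo_font *font, int ucs)
{
    if (ucs >= font->first && ucs <= font->last)
        return ucs - font->first + 1;
    return 0;
}

/* Walk the chain; report the font that matched through *out_font. */
static int demo_encode_character_with_fallback(demo_font *font, int ucs, demo_font **out_font)
{
    assert(out_font != NULL);
    for (; font != NULL; font = font->fallback) {
        int gid = demo_encode_character(font, ucs);
        if (gid > 0) {
            *out_font = font;
            return gid;
        }
    }
    *out_font = NULL; /* no font in the chain has the character */
    return 0;
}
```

The layout code then only needs to call the chained variant once per character and draw with whichever font comes back, which is presumably where the simplification comes from.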
Robin_Watts | Can we move the fz_font definition into source/fitz/font-impl.h ? | 11:32.03 |
| fz_encode_character takes an int unicode. fz_encode_character_with_fallback takes int ucs. | 11:33.13 |
| Would be nice if they took the same just to avoid confusion. | 11:33.22 |
tor8 | Robin_Watts: oh, sure | 11:33.32 |
| (re the naming) | 11:33.37 |
Robin_Watts | Also, we should be documenting new functions in the headers as we add them. (I'm as guilty of not doing that as anyone) | 11:33.46 |
| otherwise, looks great. | 11:34.37 |
tor8 | Robin_Watts: I think we could have a 'header february' (or header april, after the release) and restructure our header files into private and public apis | 11:34.38 |
Robin_Watts | ooh, before the release, if we have time. | 11:34.59 |
tor8 | and do the source/pdf/ directory split into common/read/write functionality too? | 11:35.21 |
Robin_Watts | tor8: could do. | 11:35.31 |
tor8 | that's less urgent, but getting a good handle on public and private functions would be good | 11:35.46 |
Robin_Watts | In SO, we have separate include and api dirs. | 11:36.16 |
| Our include/mupdf dir is effectively the api. | 11:36.31 |
tor8 | yeah, and lots of stuff in there belongs in the source/foo/*.h files | 11:36.51 |
Robin_Watts | So either we can stick internal implementation headers (blah-imp.h) in with the source, or we could have a source/include tree too. | 11:37.04 |
tor8 | the problem is what to do about semi-private stuff that is used both in source/fitz/ and source/pdf/? | 11:37.07 |
Robin_Watts | source/include/fitz/ | 11:37.28 |
tor8 | less fond of that idea | 11:37.39 |
| maybe a public but not-included by the mega-header and have the -imp (or -priv) suffix? | 11:37.58 |
| or fix the pdf code to not need the internal guts of fitz structs | 11:38.19 |
Robin_Watts | The megaheader is something we should be moving away from. | 11:38.22 |
| Anything that relies on people only using the megaheader is bad, IMHO. | 11:38.40 |
tor8 | there I tend to disagree. we have too many header files. | 11:38.56 |
Robin_Watts | We could just leave the imp stuff in source/fitz, and make sure that source is on the include path? | 11:39.31 |
tor8 | and I *hate* header files that include other header files automagically | 11:39.40 |
Robin_Watts | Then we can include "fitz/blah-imp.h" | 11:39.44 |
| tor8: I'm 100% against you there. | 11:39.50 |
| header files should include the header files that they require. | 11:40.11 |
tor8 | if you're trying to maintain header hygiene, pulling in a header file that implicitly includes another header file that you then go on to use | 11:40.17 |
| you've failed | 11:40.20 |
Robin_Watts | You should be able to call the compiler to compile any single header file without it giving an error. | 11:40.45 |
tor8 | that's where I disagree, if you're going to have any hope of imposing on the user to include what he needs and keep the code clean you need errors | 11:41.46 |
| otherwise you end up with what's in the ghostpdl source | 11:41.59 |
| adding -Isource would work for me | 11:42.35 |
Robin_Watts | No, the ghostpdl source follows exactly the scheme you're advocating. | 11:42.56 |
tor8 | (mind you, I've conceded your point for the individual header files in mupdf) | 11:43.06 |
Robin_Watts | If foo.h needs a definition from bar.h, then foo.h does NOT include bar.h. It's up to the caller. And it's hell. | 11:43.35 |
tor8 | (but nothing tests that, so we're probably bitrotted) | 11:43.36 |
| it's hell because the headers are crap and don't document what's needed in which order and there's no sanity anywhere in the ghostpdl source | 11:44.03 |
Robin_Watts | I really want to revise the gs headers so that they pass the single header compilation rule. | 11:44.04 |
| Yes, I will absolutely agree with that. | 11:44.13 |
| if a header includes all it requires, then that *is* documentation of sorts. | 11:44.29 |
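To illustrate the "single header compilation rule" being argued for, here is a sketch of a header that includes what it needs itself. The file name `demo_span.h` and everything in it are hypothetical; the point is only that a source file containing nothing but an include of this header would compile.

```c
/* Contents of a hypothetical header, demo_span.h. */
#ifndef DEMO_SPAN_H
#define DEMO_SPAN_H

#include <stddef.h> /* size_t is used below, so this header includes it itself */

typedef struct {
    const char *data;
    size_t len;
} demo_span;

/* Number of bytes left after skipping n bytes (0 if n is at or past the end). */
static inline size_t demo_span_remaining(const demo_span *span, size_t n)
{
    return n < span->len ? span->len - n : 0;
}

#endif /* DEMO_SPAN_H */
```

Under the opposing scheme the `#include <stddef.h>` line would be left out and every caller would have to include it first, in the right order.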
tor8 | if foo.h needs bar.h, and in your source you include foo.h but only use bar; then your code sucks and that's what I fear we'll end up with | 11:44.47 |
Robin_Watts | tor8: Yes, that is bad, but less bad than the alternatives. | 11:45.13 |
tor8 | then somebody sees you don't use foo, so deletes the foo.h and then your compile fails because your code depends on bar but didn't list the dependency explicitly as we intended | 11:45.18 |
| and I just say to heck with it all and do the mega-header | 11:45.26 |
Robin_Watts | change one header, everything recompiles :( | 11:45.41 |
tor8 | our compiles are still in the <1m range :) | 11:45.55 |
| 11s on my machine | 11:46.19 |
| for a full clear rebuild of mupdf | 11:46.25 |
| clean | 11:46.30 |
| the alternative is often build skew, because the build system doesn't track header dependencies | 11:46.49 |
| bugs from build skew due to incomplete header dependencies have wasted more of my time than recompiling mupdf every time a header changes :) | 11:47.30 |
| Robin_Watts: I don't know if you remember when we used Jam for mupdf? | 11:48.03 |
Robin_Watts | ISTR having to struggle with jam for something. | 11:48.27 |
| I remember having to use darcs too. Was that mupdf? | 11:48.41 |
tor8 | its main advantage over make was that it automatically scanned the source for includes to check if it needed rebuilding a source file | 11:48.46 |
| darcs was mupdf back before we went git yes | 11:48.54 |
Robin_Watts | oh, god, I'd wiped that from my memory. | 11:48.55 |
| Did you see my burblings about the fallback font the other day? | 12:11.41 |
| I found droid sans fallback versions with indic and khmer fonts in too. | 12:12.04 |
tor8 | Robin_Watts: ah, no, I did not. | 12:15.01 |
Robin_Watts | Was wondering if it was worth pulling those into the uber fallback too. | 12:15.22 |
| (and I was wondering if you could explain how you're doing the merging. Sounds like a useful thing to know) | 12:15.48 |
tor8 | Robin_Watts: I use fontforge to do the merging. I open the TTC which has DroidSansFallback-H and -V fonts in it | 12:16.23 |
| I have to change the em-size font info field before merging though, DroidSansFallback is unusual in that it uses a design size of 256 rather than 2048 | 12:16.53 |
| then I just merge in the font, make sure the encoding is correct and save out the sfd's and then in Generate TTC from the -H variant I pick the -V as the additional font and save out the TTC | 12:17.31 |
| I'm not sure, indic fonts need shaping don't they? | 12:17.52 |
Robin_Watts | They don't NEED shaping any more than arabic ones do. | 12:18.08 |
| My understanding is that it's comprehensible, just not ideal. | 12:18.50 |
tor8 | Robin_Watts: where did you find the indic/khmer fallback version? | 12:21.49 |
Robin_Watts | This page: https://en.wikipedia.org/wiki/Droid_fonts mentions Droid Sans Fallback Indic [Regular] and Droid Sans Fallback Khmer [Regular] | 12:23.02 |
| https://code.google.com/p/galaxy-nexus-khmer/downloads/detail?name=DroidSansFallback.ttf | 12:23.18 |
| http://forum.xda-developers.com/showthread.php?t=798380 | 12:23.40 |
| (and those 2 links I found by googling) | 12:23.49 |
tor8 | Robin_Watts: well, now we have a generic fallback mechanism it would make more sense to back out the big merging I did >.< | 12:24.18 |
| and add more separate small fonts in a chain | 12:24.26 |
Robin_Watts | tor8: haha, possibly. :) | 12:24.31 |
| I fear that these aren't small fonts. | 12:24.50 |
tor8 | though, I think it'd be better to have good looking fallback fonts for these scripts | 12:25.03 |
| the droid sans ones are awfully block sans-serifs | 12:25.15 |
| something serif-ish would be better for epub | 12:25.25 |
Robin_Watts | I suspect these are as good as you'll get without relying on shaping. | 12:25.29 |
| Do you have a tool that will let you subset a font? | 12:25.38 |
tor8 | Robin_Watts: fontforge can probably do it easily enough | 12:26.11 |
Robin_Watts | I'd imagine that if we go the separate fonts route, we'd want to take the indic font, and remove all the chars from it that exist already in the normal fallback. | 12:26.11 |
| So the indic fonts is just the extra chars required for that script and no dupes. | 12:26.33 |
tor8 | my worry there is that the dupes fit nicely with the extra chars; I wonder if punctuation mis-matches are going to be bothersome | 12:27.02 |
| or if we should try to match punctuation with surrounding scripts | 12:27.15 |
Robin_Watts | That's why using the droid sans stuff seems sensible to me. | 12:27.32 |
| It might be blocky, but at least it's consistent across scripts. | 12:27.46 |
| If other people want to do better fallback sets, good on them. | 12:28.06 |
tor8 | the only reason we use droidsansfallback is because the CJK stuff is small | 12:28.28 |
Robin_Watts | partly because it's blocky, I bet. | 12:28.49 |
tor8 | what about the google NoTo fonts? they might be better suited. | 12:28.52 |
| it's small because it's blocky and lacking in detail, indeed | 12:29.02 |
Robin_Watts | The noto fonts all rely on shaping. | 12:29.03 |
| and because they rely on shaping there are LOTS more glyphs. | 12:29.16 |
| which means they are huge, comparatively. And will look awful without shaping. | 12:29.34 |
| If we move to a shaping capable engine, then yes, using the noto fonts is a possibility. | 12:29.55 |
| (at the cost of more memory) | 12:30.11 |
| but for a non shaping engine, the droid sans fallback variants are a better option, I reckon. | 12:30.35 |
tor8 | DroidArabicNaskh looks a lot nicer than DroidArabicKufi | 12:32.48 |
Robin_Watts | Naskh and Kufi are the 2 standard arabic styles, right? | 12:33.22 |
| Neither of which are Droid Sans Fallback variants though. | 12:33.41 |
tor8 | Naskh is the traditional style that you see everywhere | 12:33.43 |
| Droid Sans Fallback is only CJK | 12:34.02 |
kens | Also Nastaliq | 12:34.09 |
tor8 | mis-named | 12:34.09 |
kens | I thought Khufi was a language rather than a script | 12:34.36 |
Robin_Watts | tor8: Which font did you merge into droidsansfallback to get arabic? | 12:35.08 |
tor8 | DroidSansArabic.ttf | 12:35.17 |
Robin_Watts | right. | 12:35.38 |
kens | Ah yes, Kufi, without the 'h', my bad | 12:35.46 |
tor8 | which looks similar to DroidKufi-Regular.ttf | 12:35.59 |
Robin_Watts | tor8: So, you could redo the merging with DroidSansNaskh ? (Or maybe do a differencing to get a chainable fallback font with no dupes?) | 12:36.50 |
tor8 | I'm thinking to revert the merging back to CJK+latin (we merged in latin into CJK for pdf substitute fonts where sometimes latin glyphs were used) | 12:37.51 |
| and then add Naskh and a bunch of others as chained fonts; maybe take a look at minimizing duplicate chars | 12:38.19 |
| Robin_Watts: now the question is, do we rewrite history to purge the (somewhat large) merged droidsansfallback fonts? | 12:39.01 |
Robin_Watts | I wouldn't bother. | 12:39.18 |
| And further, if it was easy, I'd still be tempted to (temporarily at least) put in an uber fallback font with all merged stuff in we can find. | 12:39.58 |
tor8 | I was tempted to use the Noto Sans CJK fonts for a while | 12:40.45 |
| since they have language-specific glyphs | 12:40.51 |
| but then I saw some users complaining about how dreadful the character design is... | 12:41.12 |
| not to mention how huge they are | 12:41.23 |
Robin_Watts | The noto sans CJK stuff is all from adobe. You'd expect them to have gotten it right :( | 12:42.26 |
tor8 | "We renamed Droid fonts to Noto and only update the Noto family since renaming." | 12:46.29 |
Robin_Watts | I don't believe there is a NotoSansFallback though? | 12:48.27 |
tor8 | no, they dropped the DroidSansFallback CJK font and replaced it with Source Han from Adobe | 12:48.58 |
Robin_Watts | so for our purposes (no shaping, smallness) I think we still want to be working with DroidSansFallback for now. | 12:49.36 |
jogux | wonder if SOT should be doing the same. what a mess :( | 12:50.07 |
tor8 | I don't think most of the noto fonts differ from the old droid variants in terms of shaping requirements | 12:50.22 |
| there never was a "droidsansfallback" for anything but CJK; just to clear up any potential confusion | 12:50.51 |
| the DroidSansArabic etc are not part of "Fallback"; they were just other DroidSans fonts | 12:51.12 |
| I'm thinking to stick with DroidSansFallback for our CJK fallback font | 12:51.30 |
Robin_Watts | tor8: I would agree with that. | 12:51.41 |
tor8 | but investigate if DroidSansNaskh and NotoNaskh are much different | 12:51.55 |
Robin_Watts | jogux: SOT does shaping though. | 12:52.08 |
jogux | Robin_Watts: oh, mm... | 12:54.07 |
| er, daft question, surely mupdf needs to as well? we reckoned pretty much all the RTL languages relied on shaping? (I'll have to admit to not reading most of the conversation, so feel free to just ignore me if I'm being daft) | 12:55.19 |
tor8 | jogux: FWIW, I think SOT should be using the NoTo fonts. SOT doesn't have the excuse of "there's too many fonts, it'll bloat the software too much"... :P | 12:55.19 |
Robin_Watts | tor8: SOT *does* use the Noto fonts. | 12:55.44 |
tor8 | Robin_Watts: ah, in that case! | 12:56.02 |
Robin_Watts | jogux: Yes, ideally MuPDF should use shaping. We don't need to for pdf etc, cos shaping doesn't apply. | 12:56.20 |
jogux | tor8: we have a <2M code, multiple customers have had issues with the font sizes :-( | 12:56.23 |
tor8 | dafter question--arabic shaping isn't that mostly just changing to initial/medial/final shapes? | 12:56.35 |
Robin_Watts | We do need to for epub, which is the new thing. | 12:56.43 |
| (it's the first thing we've done which needs layout) | 12:56.54 |
tor8 | or is it placing the vowels too? | 12:57.04 |
Robin_Watts | but to do shaping we need to introduce more libs (like Harfbuzz) and we haven't gotten to that yet. It's on the list. | 12:57.22 |
jogux | nods at Robin_Watts | 12:57.32 |
Robin_Watts | tor8: pass. goldfish memory, remember? :) | 12:57.34 |
tor8 | Robin_Watts: ... :) | 12:57.43 |
Robin_Watts | runs | 12:57.47 |
jogux | tor8: my memory is as crap as Robin's. but clearly Picsel considered it essential to do and invested a good amount of time into doing it and testing it. | 12:58.14 |
tor8 | jogux: I would expect it to be required for text input | 12:58.46 |
| but once input, it'd be "precomposed" so to speak | 12:58.55 |
jogux | I'm pretty certain shaping affects viewing text in word documents. SOT does it at either layout or render time. | 12:59.38 |
tor8 | jogux: hm, it does look like we need to do some form of shaping for arabic... the arabic text in robin's document doesn't look quite right. | 13:01.05 |
| they're all isolated shapes :( | 13:02.10 |
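A simplified sketch of the initial/medial/final selection asked about above. It only picks a contextual form from the joining behaviour of the neighbouring letters; it ignores transparent marks, ligatures and vowel placement, so it is not a substitute for a real shaping engine. The enums and `demo_pick_form` are hypothetical names.

```c
#include <stddef.h>

/* Simplified joining classes (transparent marks and other classes are ignored). */
typedef enum {
    DEMO_NON_JOINING,   /* joins to neither neighbour, e.g. a space */
    DEMO_RIGHT_JOINING, /* joins only to the preceding letter, e.g. alef */
    DEMO_DUAL_JOINING   /* joins to both neighbours, e.g. beh */
} demo_joining;

typedef enum { DEMO_ISOLATED, DEMO_INITIAL, DEMO_MEDIAL, DEMO_FINAL } demo_form;

/* Pick the contextual form of letter i; letters are in logical (reading) order. */
static demo_form demo_pick_form(const demo_joining *word, size_t n, size_t i)
{
    /* The previous letter can connect forward only if it is dual-joining. */
    int joins_prev = i > 0 && word[i - 1] == DEMO_DUAL_JOINING
        && word[i] != DEMO_NON_JOINING;
    /* This letter connects forward only if it is dual-joining itself. */
    int joins_next = i + 1 < n && word[i] == DEMO_DUAL_JOINING
        && word[i + 1] != DEMO_NON_JOINING;

    if (joins_prev && joins_next)
        return DEMO_MEDIAL;
    if (joins_prev)
        return DEMO_FINAL;
    if (joins_next)
        return DEMO_INITIAL;
    return DEMO_ISOLATED;
}
```

For a four-letter word whose letters are dual, dual, right-joining, dual (kaf, teh, alef, beh), this yields initial, medial, final, isolated. A renderer that skips this step draws every letter in its isolated form, which matches the symptom described.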
jogux | :( | 13:08.06 |
tor8 | Robin_Watts: whew, harfbuzz can use ucdn ... we don't need yet another set of tables :) | 13:15.37 |
felicity | hello. when converting a PDF to text that includes the GBP character (£, U+00A3), ghostscript emits C0A3 in UTF-8 output. shouldn't that be C2A3? (gs 9.05) | 13:15.56 |
| (with txtwrite, that is) | 13:16.13 |
kens | felicity : It's impossible to say without seeing the original PDF file | 13:20.59 |
| You should also upgrade, the current version of Ghostscript is 9.18 | 13:21.16 |
| I see at least 15 fixes in the txtwrite device since 9.05 was released | 13:23.08 |
felicity | kens: https://people.torchbox.com/~felicity/pound-sign.pdf | 13:25.26 |
| 9.05 is unfortunately the current version in debian - if it's been fixed later (and isn't something i'm doing wrong) i'll probably just live with it | 13:25.44 |
kens | What's your command line ? | 13:26.17 |
| d: | 13:26.20 |
felicity | kens: https://dpaste.de/uGBS | 13:26.22 |
tor8 | kens: mupdf emits C2 A3 for that one in its text extraction | 13:26.39 |
kens | tor8 I need to check our current code.... | 13:26.54 |
| Comes out as C2 A3 in the current code | 13:28.31 |
| Might be this bug : | 13:29.41 |
| http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=72afba4af187b01fded403d6a986d5de0be741d2 | 13:29.41 |
| Only 2 and a bit years old.... | 13:30.12 |
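For reference, a small UTF-8 encoder sketch showing why C2 A3 is the expected output for U+00A3. It says nothing about what the 9.05 txtwrite code actually did wrong; `demo_utf8_encode` is a hypothetical helper, not a Ghostscript or MuPDF function.

```c
#include <stddef.h>

/* Encode one Unicode scalar value as UTF-8.
 * Returns the number of bytes written (1..4), or 0 if cp is not encodable. */
static size_t demo_utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* 110xxxxx */
        out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* 10xxxxxx */
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

U+00A3 is binary 00010 100011, so the lead byte is 0xC0 | 0x02 = 0xC2 and the trail byte is 0x80 | 0x23 = 0xA3. A lead byte of C0 can never appear in well-formed UTF-8: C0 A3 would be an overlong encoding of U+0023.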
felicity | kens: thanks | 13:31.47 |
kens | NP | 13:32.05 |
chrisl | felicity: you can get a minimal gs binary here: http://ghostscript.com/download/gsdnld.html | 13:32.20 |
| It wouldn't be good to replace the installed one, but could be used for things like this | 13:32.56 |
felicity | chrisl: ah that's helpful, thanks. | 13:33.36 |
kens | Hmm I misread the email from Guillaume, it's not a pdfwrite problem, it's a PCL problem | 13:35.15 |
kens | is relieved | 13:35.23 |
chrisl | So, the PDF is *actually* the raster output? | 13:35.53 |
kens | Not sure, it's not a GS PDF file though | 13:36.21 |
| Our output looks pretty clearly wrong though | 13:36.37 |
| Or at least, this being PCL, not as intended | 13:36.47 |
| There's a logo, looks like an image, and it's rotated 90 degrees from the box that is supposed to surround it | 13:37.44 |
| Looks like it's incorrectly scaled too | 13:38.03 |
| Ah, looks like they scanned some HP printed output and made a PDF out of it, just to exhibit the problem | 13:39.35 |
| Anyway, not my problem :-D | 13:40.07 |
HenryStiles | yes guillaume is not using pdf output | 14:51.52 |
| probably another plot the size of my living room to look at, yippee | 14:52.35 |
kens | It's big | 15:33.56 |
marcosw | it's 36 x 48 (or more likely the equiv in metric). | 15:40.44 |
HenryStiles | marcosw: I see 19x16 (feet) | 17:20.03 |
| marcosw: did you not add the plot to the bug because it's big? I'll add it if you just forgot | 17:20.59 |
Robin_Watts | I have a fix (I believe) for bug 696466. Anyone up for reviewing it? | 17:31.01 |
| (1 down from the top commit on robin/master) | 17:31.33 |
| (and the one before that enables accuratecurves by default) | 17:31.52 |
| Urgh. My fix is fine, I think, but enabling accurate curves causes problems :( | 17:58.31 |
HenryStiles | so hold off on review? | 18:08.31 |
Robin_Watts | You can review my fix, if you want. | 18:11.26 |
HenryStiles | the bmpcmp's I'm seeing appear completely unrelated to what I'm doing: has anyone seen anything like this sort of diff recently: http://www.ghostscript.com/~regression/henrys/compare1.html ? | 18:15.44 |
Robin_Watts | HenryStiles: No. | 18:16.44 |
| Are you entirely up to date? | 18:16.57 |
HenryStiles | Robin_Watts: yeah, I'll look more I suppose it's possible I've done something ... | 18:17.44 |
Robin_Watts | HenryStiles: My standard test (after I've pulled --rebase golden master) is to run what should be a clean tree (the commit before I made any changes) | 18:18.19 |
HenryStiles | Robin_Watts: I always keep a clean tree with a build next to devel. I don't see a difference locally. | 18:27.20 |
| Robin_Watts: but I'll try that too | 18:27.42 |
Robin_Watts | HenryStiles: cluster pushing a clean tree tests whether the cluster is doing something odd too :) | 18:28.01 |
HenryStiles | right | 18:29.34 |
Robin_Watts | This setaccuratecurves thing is a pain. | 19:28.07 |
| I have a genoa test that creates a path, gsaves, flattens the path, and strokes it. then grestores, and fills it. | 19:28.57 |
| And it repeats that for various different flatnesses. | 19:29.26 |
| accurate curves only applies for strokes, so with that turned on, we can get mismatches between the fills and the strokes. | 19:30.05 |
| really noticeable ones. | 19:30.10 |
| I could make path fills honour accurate strokes too, but the code as it exists now explicitly avoids the flattening step for monotonic curves, and we'd lose that. | 19:32.06 |
| marcosw: Why, when I run a clusterpush with -filter=pgmraw.300.0 does it add in a half dozen highdpi tests of other devices ? | 19:46.50 |
| HenryStiles: OK, commits on robin/master. Ignore the top one. The 4th one down solves the actual bug. | 20:09.04 |
| The 3rd one down fixes the code so that setaccuratecurves is consistent between fill and stroke. This will have a performance cost, but I don't know how great yet. | 20:10.02 |
| The 2nd one down enables setaccuratecurves by default. | 20:10.18 |
| Whether we push 2 and 3 is something we can talk about after I have some performance figures. | 20:10.38 |
| The whole foray into the setaccuratecurves thing comes from the fact that the change that caused the regression was setaccuratecurves being enabled for PDF. | 20:11.32 |
HenryStiles | Robin_Watts: why are the latest commit's first log words "Preliminary work" | 20:45.25 |
| ? | 20:45.26 |
| Robin_Watts: nvm on that I thought that was the latest commit, it's not sorry | 20:48.02 |
| Robin_Watts: extra "it" in the last paragraph here: http://git.ghostscript.com/?p=user/robin/ghostpdl.git;a=commit;h=bef3f85fdc33a233fc8283f14d5f0d3e54f758c0 ... | 21:20.59 |
Robin_Watts | Fixed. Thanks. | 21:22.13 |
HenryStiles | Robin_Watts: and why do we need to keep the setaccuratecurves command at all? | 21:22.44 |
| can't all the curves be accurate ? ;-) | 21:23.10 |
Robin_Watts | because to remove it might break code that relies on it? It's been deprecated for ages. | 21:23.30 |
| Ideally all curves would be accurate, but I don't know what that does to rendering times. | 21:23.44 |
| Cluster run goes from 36:10:46 -> 36:39:18 | 21:24.48 |
| So... 1% slower roughly? | 21:25.04 |
HenryStiles | repeatedly? could be noise. It's always nice to get rid of this stuff in gs. But if you want to be done with it: LGTM | 21:26.11 |
Robin_Watts | Am running again now. I'll see how the timings vary. | 21:26.46 |
mvrhel_laptop | tor8: I am going to have to talk with you about this ToUnicode CMap stuff when you have time | 23:23.45 |
| I thought I was going to wrap it up a couple days ago. But I apparently have misunderstood something. I have a simple case and have been pushing that through mudraw to see what is going wrong. I will send you an email actually to summarize | 23:24.43 |
| it will probably be easier | 23:24.48 |
| oh I think I finally see the issue | 23:39.25 |
Robin_Watts | second run was just 2 mins slower. | 23:45.57 |