| 2013/12/17 |
Robin_Watts_ | ray_laptop: Their file has 0xfd in, and they want that white. | 00:24.59 |
| But ignore their file. | 00:25.06 |
| Their file clearly has a non white background, so we should do 'the right thing' and if that solves their problem, great. If not, tough. | 00:25.59 |
| The fact is that we do not currently do 'the right thing'. | 00:26.08 |
| So "having them use a transfer function" is not something we even need to think about. They can't change all their files, and we don't need them to. | 00:26.46 |
| Our thresholding code works exactly the same way as you describe. If contone (x,y) < threshold(x,y) then we paint it black. | 00:28.24 |
| This means that for 32 pixels, we have 33 possible levels (from 0 to 32 pixels set). | 00:28.51 |
| At the moment, values map to output values as: 0-7 -> 0, 8-15 -> 1, ..., 248-254 -> 31, 255 -> 32 | 00:31.10 |
| Clearly that's not right, as we have a region 'close to black' where 8 values map to black, and a region 'close to white' where only a single value maps to white. | 00:31.48 |
| With my change in there, we map: 0-3 -> 0, 4-11 -> 1, ..., 244-251 -> 31, 252-255 -> 32 | 00:32.18 |
| So the 'close to black' and 'close to white' regions are 4 values each. | 00:32.37 |
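The two mappings quoted above can be checked with a short sketch. The function names are hypothetical and the arithmetic is inferred only from the ranges given in the chat (33 levels for a 32-pixel threshold array); it is not taken from the Ghostscript source.

```python
from collections import Counter


def level_current(value: int) -> int:
    """Current mapping as quoted: 0-7 -> 0, 8-15 -> 1, ..., 248-254 -> 31, 255 -> 32."""
    return 32 if value == 255 else value // 8


def level_offset(value: int) -> int:
    """Proposed mapping as quoted: 0-3 -> 0, 4-11 -> 1, ..., 244-251 -> 31, 252-255 -> 32."""
    return (value + 4) // 8


# Count how many 8-bit input values land on each output level.
current_counts = Counter(level_current(v) for v in range(256))
offset_counts = Counter(level_offset(v) for v in range(256))

print(current_counts[0], current_counts[31], current_counts[32])  # 8 7 1
print(offset_counts[0], offset_counts[31], offset_counts[32])     # 4 8 4
```

With the half-step offset the two end levels each collect 4 input values, instead of 8 at one end and 1 at the other.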
| Changing to a 'nicer' threshold matrix might be nice, but really there is no excuse for not fixing what is wrong with the default one. I'll do some more tests tomorrow. | 00:34.45 |
ray_laptop | Robin_Watts_: I agree, which is why I closed the bug. I don't really understand why the stochastic dither which has a LOT more levels didn't show ANY dots in the background, but I didn't close the bug -- just made the point that we may not make it a priority to sort out the correct solution. | 01:41.57 |
| with max_gray=3 and a 4x4 cell, we'd have nshades=51, right? so input values 0-84 would be dithered using the 16 cells in the threshold array; every increase in darkening would be 5 steps apart. If we 'offset' by 2 I think the number of input levels painted with a constant color would be made up of those painted with 1 (for example when dithering between 0 and 1) and some number of input... | 02:10.18 |
| ...levels painted with constant 1 (when dithering between 1 and 2). | 02:10.19 |
| Robin_Watts_: so I think the offset approach is fine even with max_gray > 1. | 02:10.46 |
mvrhel_laptop | well, I suspect both my P1 bugs are the same issue. It is just a nasty issue with clist, transparency and patterns all rolled into one | 06:04.41 |
| I know what I will be doing the rest of the week.... | 06:04.57 |
| nice. It has a soft mask thrown in for good measure | 06:53.02 |
| I have this thing cut down to a stroke with a pattern fill. | 06:53.34 |
| hmm I wonder if this has something to do with the optimizations that ray did | 06:57.23 |
| no. this looks like a major oversight. | 07:13.07 |
| stroking of patterns that include transparency is not going to work | 07:13.37 |
| how could that have not worked... | 07:13.53 |
chrisl | Stroking with a pattern is pretty rare..... | 07:14.20 |
mvrhel_laptop | yeah apparently | 07:24.32 |
| and a transparency pattern is even more rare | 07:24.48 |
| I don't see how this ever worked | 07:24.58 |
| getting too late for me now. off to bed | 07:25.32 |
chrisl | Good night | 07:25.38 |
mvrhel_laptop | yes. need to stop. had to dig further. this definitely never worked and is going to take a little time to get working correctly | 07:33.38 |
| good night. | 07:33.44 |
kens | chrisl thanks for trying the file in 694840, I had tried on Windows and a Linux VM | 08:01.43 |
| But it's nice to have some independent verification | 08:01.54 |
chrisl | kens: I thought it worth having a) a second voice, and b) stating explicitly I'd tried 9.10 | 08:02.23 |
kens | I did try 9.07, 9.10 and Master..... I wonder if he's using a shared library | 08:02.49 |
chrisl | It's possible, but I doubt it - he doesn't seem technically savvy enough to realise the difference | 08:03.44 |
kens | :-) | 08:03.50 |
chrisl | I notice he "understands" but hasn't actually attached one of the "many" PDFs that fail..... | 08:05.34 |
kens | Yes, and the one he did attach isn't the one he originally complained about | 08:06.10 |
tkamppeter | chrisl, hi | 09:34.00 |
| chrisl, I have a problem with the PS output of GS with ps2write. On my HP Color LaserJet CM3530 MFP it prints in duplex only when the page size is A4, not when the page size is Letter. Poppler output prints duplex in both Letter and A4 format for me. | 09:35.43 |
kens | I'm not sure how we could be affecting that. | 09:36.28 |
| Except that possibly we do a setpagedevice on every page, which would defeat duplexing | 09:36.52 |
tkamppeter | kens, chrisl, I can mail you the files. | 09:37.14 |
kens | Well we're going to need to see them. The simplest file possible would be best (empty pages) | 09:37.40 |
chrisl | I thought duplexing had been fixed a while back - maybe not. If we do a setpagedevice on each page, that would be less than great | 09:38.44 |
tkamppeter | kens, chrisl, you have the files in your mailbox now. | 09:38.47 |
kens | Looks to me like we (potentially) do a setpagedevice on every page, because the MediaBox is used and its 595x842 | 09:40.15 |
tkamppeter | kens, chrisl, I have first generated the letter file from 5003.PPD_Spec_v4.3.pdf (official PPD documentation) with Ghostscript 9.10, ps2write device and I have even set the paper size to A4, but GS uses the file's paper size. | 09:40.18 |
kens | Ghostscript always uses the MediaBox from the PDF file, unless you tell it to take specific action not to. | 09:40.51 |
| I'd really much rather have 2 blank pages than 4 pages filled with extraneous stuff | 09:41.16 |
tkamppeter | kens, chrisl, for the A4 version I simply edited the letter version, changing the numbers in all MediaBox lines to the A4 size. The letter version does not print duplex, but the A4 version does. The paper loaded in my printer is A4. | 09:41.39 |
kens | tkamppeter, I'm not sure why this is a bug then. | 09:42.07 |
| If you have A4 media, and the job requests A4, then we won't issue a request to switch media size, because we are already using the correct size. | 09:42.37 |
tkamppeter | kens, chrisl, if I convert this PDF file using Poppler, it prints duplex regardless of whether the page size is Letter or A4. | 09:42.49 |
kens | That sounds to me more like a Poppler bug ;-) | 09:43.03 |
tkamppeter | kens, Poppler bug? Poppler does it the correct way. | 09:43.27 |
kens | Depends how you define correctness. | 09:43.38 |
| You have a printer with A4 media. | 09:43.49 |
| You run a job which requests duplex and A4 | 09:44.03 |
| You get Duplex. | 09:44.07 |
| You have a job which requests Letter, so the request cannot be matched | 09:44.26 |
| You don't get Duplex | 09:44.32 |
tkamppeter | kens, but in Poppler I can even print some internet-downloaded US manual file in duplex, but in GS I cannot do it. | 09:45.12 |
kens | I'm not at all sure that printing Duplex on A4 when the job requests duplex on Letter is 'correct' | 09:45.26 |
tkamppeter | kens, but is it then correct to print a letter job with duplex requested one-sided on A4? | 09:46.05 |
kens | Ordinarily I would suggest that you manually set the media to the printer media, and select PDFFitPage | 09:46.07 |
| tkamppeter we cannot know that the printer does not support letter | 09:46.35 |
| So we request letter when we see that we need it and the current media is not letter | 09:46.48 |
| What if the PDF file had different sized pages ? | 09:47.04 |
tkamppeter | kens, so I have to supply -dPDFFitPage to GS so that it sets the size of all pages to the requested paper size? | 09:47.04 |
kens | tkamppeter no | 09:47.14 |
| You need to select a *fixed* media size, and then have GS scale all pages to that fixed size, by adding PDFFitPage | 09:47.41 |
| Otherwise ps2write will request the media for each page, if it is not the same as the current media | 09:48.15 |
tkamppeter | kens, this would be something like "gs -sDEVICE=ps2write -sPAPERSIZE=a4 -dPDFFitPage ..."? | 09:48.38 |
kens | More accurately, the output from ps2write always requests that the interpreter uses media which matches the MediaBox. If this is already the media in use, then no change takes place | 09:48.58 |
| tkamppeter I suggest you try it | 09:49.23 |
chrisl | kens: does the opdfread code interrogate the device before trying to set the page size to the MediaBox? | 09:49.42 |
kens | chrisl I don't believe so, it simply requests the media each time | 09:49.59 |
| If the requested media is the same as the current one then setpagedevice does nothing | 09:50.17 |
tkamppeter | kens, Poppler only compares the current page's size with the previous page's size and issues setpagedevice only if they differ. | 09:50.17 |
kens | tkamppeter, well we don't. | 09:50.27 |
| We leave it up to the interpreter, which is what you're supposed to do | 09:50.53 |
tkamppeter | kens, would be great if you could do like in Poppler, at least with an option. | 09:50.55 |
kens | tkamppeter then open an enhancement bug request | 09:51.07 |
chrisl | kens: Hmm, then both *should* behave the same, surely, based on your testing with setpagedevice | 09:51.10 |
kens | chrisl I never tested with Duplex, distiller doesn't support it | 09:51.37 |
chrisl | kens: true, but IIRC your testing showed that even an empty dict caused the device to "change" | 09:52.14 |
kens | Only as far as the count of pages was concerned, I have no evidence that it reset duplex | 09:52.38 |
chrisl | It's not resetting duplex, but it will "defeat" duplex because a device change should cause the page to eject | 09:53.32 |
kens | That's what I mean, I have no evidence that it does so | 09:53.45 |
| All I was looking at was the PageCount | 09:54.01 |
chrisl | Hmm, ick.... I'd hate to think it was resetting random parts of the device.... :-( | 09:54.50 |
kens | Modifying opdfread to not emit a setpagedevice is not going to be trivial, it means storing the previous MediaBox and comparing with the new one. | 09:55.12 |
| I'm too busy to look at it just now tkamppeter, if you want me to please raise an enhancement bug or I will forget it | 09:55.44 |
tkamppeter | kens, OK. | 10:01.44 |
chrisl | tkamppeter: I was wondering if there was (or could be added) a way to add arbitrary command line options for when CUPS calls Ghostscript? | 10:04.41 |
Robin_Watts_ | https://github.com/kogmbh/WebODF | 10:07.46 |
chrisl | AGPL though | 10:09.01 |
kens | OK that's my P1 sent off to Henry, tkamppeter if you open that report for me I can have a look at it now (it took me less time than I expected to look at the bug) | 10:09.28 |
chrisl | kens: it's going to be a bit of a pain because the pages are surrounded by save/restores.... | 10:10.18 |
kens | That's going to make life difficult | 10:10.58 |
| As we will restore back to the previous interpreter state, and therefore page device | 10:11.16 |
chrisl | It probably means adding an initial setpagedevice during the job setup. | 10:11.37 |
kens | Yes I was thinking that | 10:11.45 |
| If we use the first page media that should be sufficient | 10:11.56 |
| If the media changes after that, then we need to issue a new request anyway | 10:12.09 |
chrisl | Yes, which buggers duplex anyway | 10:12.26 |
kens | Yes true, but you can't reasonably duplex different sized pages | 10:12.43 |
| At least not without the kind of jiggering with media size and fitting that I described earlier | 10:13.01 |
chrisl | I doubt it will come up very often, and I think it is reasonable to ignore duplex under those circumstances | 10:14.12 |
kens | I don't plan to do any more than store the initial MediaBox and not execute setpagedevice if the requested one is the same. If we do execute setpagedevice then we will store the new MediaBox for further comparison. | 10:14.50 |
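The plan kens outlines (remember the last requested MediaBox, and only issue a new media request when the next page differs) can be sketched as below. This is a model of the comparison logic only, under the assumption that a MediaBox is represented as a tuple; `plan_media_requests` is a hypothetical name and nothing here is taken from the opdfread code.

```python
def plan_media_requests(media_boxes):
    """Return (page_number, emit_request) pairs.

    emit_request is True only when the page's MediaBox differs from the one
    most recently requested, so repeated pages of the same size do not
    trigger another media request.
    """
    plan = []
    stored = None  # MediaBox of the most recent request, if any
    for page_number, box in enumerate(media_boxes, start=1):
        emit = box != stored
        if emit:
            stored = box  # remember the new MediaBox for later comparisons
        plan.append((page_number, emit))
    return plan


A4 = (0, 0, 595, 842)      # the 595x842 size mentioned in the chat
LETTER = (0, 0, 612, 792)  # US Letter in PostScript points

print(plan_media_requests([A4, A4, LETTER, LETTER, A4]))
# [(1, True), (2, False), (3, True), (4, False), (5, True)]
```

The first page always triggers a request, which corresponds to the initial setpagedevice during job setup that chrisl suggests.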
tkamppeter | kens, if a duplexed document has different page sizes, at each change duplex has to be reset, if you have a document for example with 3x A4, then 2x A3, and then 3x A4, you get page 1 and 2 on the first sheet, page 3 on the 2nd sheet (with blank back), page 4 and 5 on the third sheet, page 6 and 7 on the 4th sheet and page 8 on the 5th sheet (with blank back). The first two sheets are A4, the 3rd is A3, and the last two are A4 again. | 10:15.19 |
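tkamppeter's example can be reproduced with a small sketch that starts a new sheet whenever the page size changes or the current sheet already has two sides printed. `duplex_sheets` is a hypothetical helper for illustration, not part of any printing stack.

```python
def duplex_sheets(page_sizes):
    """Group pages onto duplex sheets, starting a fresh sheet on every size change."""
    sheets = []
    for page_number, size in enumerate(page_sizes, start=1):
        last = sheets[-1] if sheets else None
        if last and last["size"] == size and len(last["pages"]) < 2:
            last["pages"].append(page_number)  # print on the back of the current sheet
        else:
            sheets.append({"size": size, "pages": [page_number]})  # new sheet
    return sheets


document = ["A4", "A4", "A4", "A3", "A3", "A4", "A4", "A4"]
for sheet in duplex_sheets(document):
    print(sheet["size"], sheet["pages"])
# A4 [1, 2]
# A4 [3]
# A3 [4, 5]
# A4 [6, 7]
# A4 [8]
```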
kens | tkamppeter yes I'm aware of that | 10:15.38 |
| However, I believe that duplexing will survive this, as the request dictionary is merged with the current dictionary. | 10:16.29 |
| So if /Duplex is true and we execute a change in media, /Duplex will remain true. | 10:16.46 |
| So if you change media size, the duplex request will remain. | 10:17.08 |
| Of course, if you change to a medium where the printer does not support duplexing, and then back again, it will be lost. | 10:17.39 |
| To be honest this seems to me to be 'tough, that's how it works'; expecting to maintain duplexing after changing media to one where duplexing is not supported is just plain daft | 10:18.32 |
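The merge behaviour kens describes (the request dictionary is merged into the current page device dictionary, so keys that are not mentioned in the request keep their values) can be modelled with plain Python dictionaries. This is only an analogy: it ignores policies and device capabilities, and `merge_page_device` is a hypothetical name.

```python
def merge_page_device(current: dict, request: dict) -> dict:
    """Return a new dictionary where keys in the request override the current ones."""
    return {**current, **request}


current = {"PageSize": (595, 842), "Duplex": True}
request = {"PageSize": (612, 792)}  # the media change does not mention Duplex

merged = merge_page_device(current, request)
print(merged)  # {'PageSize': (612, 792), 'Duplex': True}
```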
tkamppeter | kens, the printer is most probably set to print Letter pages on A4 without asking the user and this works perfectly with Poppler. It duplexes the Letter pages from Poppler. So the printer should be able to duplex Letter. | 10:19.26 |
kens | tkamppeter the interpreter will use the Policy in force, which is usually 'select nearest and crop' or 'select nearest and scale' | 10:20.06 |
| If the PostScript program does not request another media change, then the duplexing works. | 10:20.40 |
| But if it does (because for example the PDF page is actually a different size) then duplexing is aborted. | 10:21.07 |
| If the new media supports duplexing in the printer then the following pages will again be duplexed, if it does not then no further duplexing will take place, even if you switch back to the original media | 10:21.59 |
| chrisl actually it seems we do check the media against the current media by executing currentpagedevice and we only execute setpagedevice if it is not the same | 10:25.51 |
chrisl | kens: yeh, I thought that - wasn't sure, the code seems overly complex for such a simple check.... | 10:27.10 |
kens | It should be OK to extend that check against a stored media size and not bother if we have already requested that media. | 10:27.29 |
kens | coffees | 10:27.52 |
nl | hi | 11:35.48 |
ghostbot | hola, nl | 11:35.48 |
Guest25101 | anybody here that knows why ghostscript changes a string like this --> contract na reorganisatie 1€.pdf to this --> contract na reorganisatie 1\200.pdf | 11:37.54 |
kens | in what sense 'changes a string' ? | 11:38.21 |
Guest25101 | resulting in an error in my postscript file Error: /ioerror in --file-- Operand stack: --nostringval-- --nostringval-- (D:\\contract na reorganisatie 1\200.pdf) (r) | 11:38.22 |
| I generate a postscript file to process some pdf's | 11:38.42 |
| I use the run command to process some pdf files | 11:38.52 |
| one of the pdf files has a euro symbol in its name | 11:39.06 |
kens | OK so don't do that. | 11:39.14 |
Guest25101 | Well I can't always prevent it | 11:39.26 |
| I did read that ghostscript is unicode | 11:39.40 |
kens | Also, I very much doubt Ghostscript 'changes' the string | 11:39.49 |
tor8 | kens: shouldn't filename strings passed through ghostscript all be utf-8 these days? | 11:40.21 |
kens | Robin_Watts_ : is the expert, but as I understand it if you want the filename to be handled you will need to UTF-8 encode it | 11:40.33 |
| tor8 yes | 11:40.38 |
tor8 | Guest25101: so you should create the ps file with the filename encoded as utf-8 | 11:40.57 |
kens | Guest25101 : and you will probably need to escape any unusual data to prevent it being processed | 11:40.57 |
Guest25101 | how do I do that.. encode the filename as utf-8 | 11:41.40 |
kens | That sounds like a job for Mr Google | 11:41.50 |
tor8 | the \200 (decimal 128) is where the euro sign is in windows ansi codepage 1252 | 11:41.52 |
Guest25101 | the postscript file is already in utf-8 format | 11:41.53 |
kens | Ahem, PostScript has no encoding | 11:42.06 |
tor8 | so your postscript filename is *not* utf-8 encoded | 11:42.16 |
kens | The programming language uses ASCII | 11:42.16 |
tor8 | postscript files are, as ken says, binary. completely encoding agnostic. it just passes the bytes through. now what you want is to put utf-8 data in the (D:\\....) string | 11:43.08 |
Guest25101 | so I can't feed the run command a filename that has a euro symbol in it? I'm a little bit confused now | 11:43.33 |
tor8 | either by putting the raw utf-8 bytes, or by escaping them with backslashes and octal codes so it's plain ascii | 11:43.35 |
| Guest25101: you can, you just have to use utf-8 for the filename. you've used windows codepage 1252 in the example you pasted. | 11:44.24 |
Guest25101 | I saw an example somewhere where they did put the name as hex code between < and > .. is that the same? | 11:44.28 |
kens | As far as PostScript is concerned it is not 'a Euro' it is a byte of data | 11:44.35 |
| Hex encoding is not the same as UTF-8 but it does avoid having to escape characters | 11:45.05 |
tor8 | in your example you've written the euro as byte \200, which is windows codepage 1252. you need to write it as utf-8 instead. | 11:45.35 |
kens | http://en.wikipedia.org/wiki/UTF-8 | 11:46.02 |
Guest25101 | tor8 ... I didn't write the euro symbol as \200 ... it's how ghostscript dumps it in an error | 11:46.12 |
kens | Dumping in an error is not the same as 'changing' it | 11:46.26 |
tor8 | it dumps it in the error *exactly* as you gave it to ghostscript (but escaped so it doesn't print binary garbage to stdout) | 11:46.51 |
kens | \200 is the octal representation of the decimal value 128 | 11:46.52 |
| What tor8 said | 11:47.01 |
Guest25101 | I use the run command like this | 11:47.35 |
| ../ConvertItems [ [(D:\\contract na reorganisatie 1€.pdf) (contract na reorganisatie 1€.pdf) (5959BF6A-C960-44C8-98E9-2B61411838F9) () 0 0 0 0 0 0] ] def | 11:47.48 |
kens | Oh I think you also have to build Ghostscript with UTF-8 support, but I believe the Windows version has that already | 11:47.54 |
Guest25101 | I put the filename in an array | 11:47.54 |
tor8 | Guest25101: the utf-8 sequence for U+20ac (euro sign) in hex is <e282ac> | 11:47.58 |
Guest25101 | and loop that array like this --> ConvertItem 0 get run | 11:48.15 |
tor8 | \342\202\ | 11:48.34 |
| \342\202\254 in octal escapes | 11:48.41 |
| so if you put (D:\\contract na reorganisatie 1\342\202\254.pdf) it ought to work | 11:49.37 |
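What tor8 describes (UTF-8 encode the filename, then write the bytes either as octal escapes inside parentheses or as a hex string between < and >) might look like the sketch below when generating the PostScript from Python. The helper names are hypothetical; the escaping rules used are the standard PostScript string ones (backslash and parentheses escaped, other bytes written as `\ooo`).

```python
def ps_string_literal(text: str) -> str:
    """Return text as a PostScript (...) string holding its UTF-8 bytes in plain ASCII."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in b"()\\":
            out.append("\\" + chr(byte))   # escape parentheses and backslash
        elif 32 <= byte <= 126:
            out.append(chr(byte))          # printable ASCII is left readable
        else:
            out.append("\\%03o" % byte)    # everything else as a 3-digit octal escape
    return "(" + "".join(out) + ")"


def ps_hex_string(text: str) -> str:
    """Return text as a PostScript <...> hex string holding its UTF-8 bytes."""
    return "<" + text.encode("utf-8").hex() + ">"


print(ps_string_literal("D:\\contract na reorganisatie 1\u20ac.pdf"))
# (D:\\contract na reorganisatie 1\342\202\254.pdf)
print(ps_hex_string("\u20ac"))
# <e282ac>
```

Because the conversion starts from a Unicode string and always encodes to UTF-8 first, the same function covers other non-ASCII characters in filenames as well.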
Guest25101 | ok now I understand | 11:49.51 |
tor8 | or use utf-8 when you write the postscript file | 11:50.13 |
Guest25101 | So if I make a function that escapes all the unicode char it should solve my problem regarding other unicode characters | 11:50.28 |
kens | Not merely escape, but also UTF-8 encode | 11:50.41 |
Guest25101 | Uhm.. going to google for an example ;-) | 11:50.55 |
tor8 | yes. but do take care to note that windows ansi encoding is not the same as unicode :) | 11:51.08 |
Guest25101 | thanks for the help | 11:51.52 |
| it would also be ok when I leave the normal ascii chars 32 until 126 intact (to keep it readable) and convert anything outside that range | 11:56.15 |
tor8 | Guest25101: you really should just convert the whole string to utf-8, and escape all non-printable ascii characters | 11:56.46 |
Guest25101 | ok | 11:57.35 |
kens | The point of UTF-8 is that ASCII characters remain unchanged | 11:57.40 |
Robin_Watts_ | Guest25101: Before you say or do anything else, go and read up on UTF-8 | 11:58.52 |
Guest25101 | so like how this online tool does it --> http://www.rapidmonkey.com/unicodeconverter/ | 11:58.52 |
tor8 | morning Robin_Watts_ | 11:59.16 |
Robin_Watts_ | In PDF, you have strings of bytes. | 11:59.30 |
| If you interpret these as ascii (or as windows encoded) then every byte corresponds to one 'character'. | 11:59.47 |
| Ghostscript now treats these as UTF-8. | 12:00.04 |
| In UTF-8, a character can be represented by either a byte, or a series of bytes. | 12:00.35 |
| Thus while in ASCII (or windows codepage encoding) you are stuck with at most 256 representable characters, in UTF-8 you have many thousands of representable characters. | 12:01.48 |
| Encoding in UTF-8 is NOT the same as 'escaping' characters. | 12:02.04 |
| When you have a string with escaped characters there can be many possible representations of the same string. | 12:02.40 |
| With UTF-8 there is just one. | 12:02.49 |
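The difference between the two encodings discussed here can be seen directly in Python, using its built-in `cp1252` and `utf-8` codecs. `decodes_as_utf8` is a hypothetical helper added for the illustration.

```python
euro = "\u20ac"  # EURO SIGN

cp1252_bytes = euro.encode("cp1252")  # one byte in the Windows codepage
utf8_bytes = euro.encode("utf-8")     # a series of bytes in UTF-8


def decodes_as_utf8(data: bytes) -> bool:
    """Report whether data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(cp1252_bytes.hex(), utf8_bytes.hex())  # 80 e282ac
print(decodes_as_utf8(cp1252_bytes))         # False
print("A".encode("utf-8"))                   # b'A'
```

The lone byte 0x80 (octal \200) is not valid UTF-8 on its own, which is consistent with the filename in the error message not being found, while plain ASCII characters keep their single-byte form.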
| tor8: Morning | 12:02.58 |
paulgardiner | Robin_Watts_, tor8: did you get a chance to review my JavaScriptCore commits? | 12:03.23 |
tor8 | I saw you talked to zeniko yesterday | 12:03.28 |
| paulgardiner: sorry, no. will do that now. | 12:03.33 |
paulgardiner | ta | 12:03.39 |
Robin_Watts_ | tor8: I did. Did you see anything in the logs you disagreed with? | 12:04.14 |
tor8 | his comment about preferring explicit contexts "as it makes the API contract clearer and requires less lateral thinking" resonated strongly :) | 12:05.04 |
Robin_Watts_ | tor8: Do you really want us to have to change the entire codebase so we do: pdf_gets(ctx, doc, dict, "string")? | 12:05.38 |
tor8 | I still stand by my preference for explicit contexts for streams and outputs | 12:05.46 |
| but for convenience, I would not want to do what you just pasted, so I'm fine with having documents and devices act as context proxies with rebinding | 12:06.27 |
| but really, just those two, please | 12:06.35 |
Robin_Watts_ | And I still dislike it. It makes 99.9% of the code that ever uses streams or outputs harder. | 12:07.03 |
tor8 | and I wouldn't be completely opposed to that either (if it didn't entail changing 95% of the source code) | 12:07.28 |
Robin_Watts_ | I believe paulgardiner sided against that too (but I should let him speak for himself). | 12:08.01 |
| We should get sebras' thoughts too. | 12:08.11 |
tor8 | sebras hasn't weighed in yet. | 12:08.14 |
Robin_Watts_ | I got to speak to paulgardiner on the phone to explain the problem, and hence may have influenced his thinking. You should have the same chance with sebras :) | 12:08.50 |
tor8 | I had him here yesterday, but we were too busy with other things to bring it up | 12:09.12 |
| but knowing sebras, I think he tends to go for regularity and consistency over convenience every time | 12:09.37 |
| even more so than me | 12:09.44 |
| but I shall let him speak for himself as well | 12:09.59 |
Robin_Watts_ | So let's consistently rebind :) | 12:09.59 |
paulgardiner | Yes, I may have been influenced, but independently I definitely liked the idea of doing it all one way or all the other, so treat streams and outputs the same way as devices and docs | 12:10.16 |
tor8 | paulgardiner: why should streams and outputs be treated the same way as devices and docs? | 12:11.10 |
paulgardiner | I don't like the idea that we have to do this at all, but a set of rebind calls (that very few will need to use and that can be documented only under the multithreads section) seems the lightest touch | 12:11.14 |
tor8 | I can see two lines of reasoning -- the document is a sort of context, so it makes sense there | 12:11.27 |
| the devices, streams and outputs are all single-threaded linearly accessed stuff | 12:11.44 |
Robin_Watts_ | Part of the reason I like streams and outputs being a single 'thing' is that when you have to pass information into libraries, like zlib or jpeglib etc, you get to pass some functions, plus a void *. | 12:12.24 |
| Just one void *, not two. | 12:12.31 |
tor8 | our context consists of both the per-thread stuff (exception stack) and shared caches and settings | 12:12.45 |
Robin_Watts_ | Now, I know I can make a structure with 2 in and pass a pointer to that, but it's horrid. | 12:12.50 |
tor8 | associating the caches and settings part with streams and outputs, I'm not really sure I like conceptually | 12:13.06 |
paulgardiner | tor8: any object type we are hiding contexts in, we probably had good reason. A rebind call is the least change to the API to make those objects work in this new way. | 12:13.12 |
| ... was my thinking (rightly or wrongly) | 12:13.33 |
Robin_Watts_ | tor8: The context also contains the allocators, which are shared between threads and are needed within streams and outputs. | 12:13.46 |
tor8 | paulgardiner: our good reason was partly to not change the api as much when we introduced the context :) | 12:14.05 |
| everything needs the exception and allocation context | 12:14.20 |
Robin_Watts_ | The context is "global and thread local state" | 12:14.25 |
tor8 | and then we have a bunch of "global" stuff tacked on to the end | 12:14.33 |
Robin_Watts_ | Not everything that holds the context needs all that state, but it needs some of it. | 12:14.37 |
tor8 | (such as the locks and caches) | 12:14.41 |
| I also find having to do fz_context *ctx = doc->ctx; at the top of every function rather annoying :) | 12:16.27 |
| or worse, when we've been lazy and just use doc->ctx in places | 12:16.41 |
| that signals to me that the thing we've packaged the ctx into doesn't make a lot of sense. | 12:17.07 |
paulgardiner | Agreed, but I would have thought the only way to completely avoid that was every single function to take ctx as first argument. | 12:19.39 |
| The main thing rebind has going for it is mostly people can ignore it unless they are doing stuff that requires detailed knowledge of how we use contexts | 12:20.53 |
Robin_Watts_ | Yes. 99.9% of code (both ours and other people's) remains entirely unchanged. | 12:21.18 |
| And in terms of performance, rebind is probably a win. | 12:21.31 |
| (occasional low cost, rather than a general performance draining sap on passing extra params everywhere). | 12:22.01 |
| And code is MUCH more readable with rebind, IMHO. | 12:22.30 |
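The two API styles being debated can be contrasted in a language-neutral way. The sketch below is purely illustrative: the classes and functions are hypothetical stand-ins written in Python, not the MuPDF C API, and the "context" here only collects warnings.

```python
class Context:
    """Stand-in for a per-thread context (hypothetical, not fz_context)."""

    def __init__(self, name):
        self.name = name
        self.warnings = []

    def warn(self, message):
        self.warnings.append(message)


class BoundStream:
    """Style 1: the stream carries its context and can be rebound to another one."""

    def __init__(self, ctx, data):
        self._ctx = ctx
        self._data = data
        self._pos = 0

    def rebind(self, ctx):
        self._ctx = ctx  # occasional, explicit hand-over to another thread's context

    def read_byte(self):
        if self._pos >= len(self._data):
            self._ctx.warn("read past end of stream")  # context only needed on the rare path
            return -1
        byte = self._data[self._pos]
        self._pos += 1
        return byte


class PlainStream:
    """Style 2: the stream holds no context at all."""

    def __init__(self, data):
        self.data = data
        self.pos = 0


def read_byte(ctx, stream):
    """Style 2: every call names the context explicitly as its first argument."""
    if stream.pos >= len(stream.data):
        ctx.warn("read past end of stream")
        return -1
    byte = stream.data[stream.pos]
    stream.pos += 1
    return byte


main, worker = Context("main"), Context("worker")

bound = BoundStream(main, b"A")
bound.read_byte()     # callers never mention a context
bound.rebind(worker)  # needed only when the stream moves to another thread
bound.read_byte()     # the warning goes to the worker context

plain = PlainStream(b"A")
read_byte(main, plain)
read_byte(worker, plain)  # the caller decides the context on every call
```

In the first style most call sites stay unchanged and only multithreaded code needs `rebind`; in the second the context always goes first, at the cost of an extra argument on every call.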
tor8 | Robin_Watts_: pass an extra param, or dereference a pointer, not sure which is the more expensive performance wise | 12:25.12 |
| Robin_Watts_: my gut feeling when I read through your patch to make the context explicit for streams was: "this is so much more regular and nicer, why didn't we do this before? this is definitely the right thing to do." | 12:27.08 |
| then you come along with rebind and ruin it all! | 12:27.14 |
Robin_Watts_ | If we do passing params in registers (like happens on ARM), then there is a magic number of params that when exceeded, causes stack manipulations. On ARM that's 4. Using a context pointer everywhere pushes us towards that. | 12:32.04 |
| Also when you pass in registers, anything with a return value trashes the first param, so it needs to be reloaded every time. | 12:32.41 |
tor8 | Robin_Watts_: when inlining functions, I expect the compiler would be smarter about an extra argument, than eliminating an extra memory load | 12:32.54 |
Robin_Watts_ | I don't see why. | 12:33.23 |
tor8 | passing the argument it can see that it is just an alias | 12:33.45 |
| but a memory load, it could have changed via some side effect, so ought to reload just to be sure | 12:34.00 |
Robin_Watts_ | The extra arg just gets folded in as a sub expression, so the load happens as an extra memory load later. i.e. it works out identically, I reckon. | 12:34.10 |
tor8 | which makes extra argument during an inlined function a no-op but the memory load still has to happen to preserve the behaviour | 12:34.40 |
Robin_Watts_ | That's part of the reason we do: fz_context *ctx = doc->ctx; at the top of the functions rather than just using doc->ctx everywhere. | 12:34.54 |
tor8 | Robin_Watts_: yes. | 12:35.04 |
| in fact, I ought to go through and do that *everywhere* so that we never use xxx->ctx forms anywhere | 12:35.26 |
| just to keep things clean | 12:35.37 |
| regardless of whether we have baked and rebound contexts or explicit ones, the code should look like they're explicit apart from the prototype and initial dereference | 12:36.03 |
paulgardiner | But these aren't inlined functions, are they? We're talking about the API. I thought inlining was a freedom only for statics | 12:36.25 |
tor8 | paulgardiner: several of the stream functions are inlined | 12:36.44 |
| fz_read_byte etc | 12:36.48 |
| and anything where performance is required tends to be designed to be inlined, like the plotting/painting functions | 12:37.15 |
| (but still, considering that poppler will have to go through c++ vtables for everything, nothing we do will have quite that overhead....) | 12:38.14 |
Robin_Watts_ | tor8: Take fz_read_byte for example. | 12:38.44 |
| fz_read_byte is inlined, but we don't use the ctx in there. | 12:39.02 |
| The ctx is only used in the rare case we pass through. | 12:39.18 |
| Thus an explicit context can only ever hurt our performance there. | 12:39.46 |
paulgardiner | Anyway, I have to concede that "We should have used contexts explicitly for outputs and streams in the first place" - if true - undermines my reason for wishing to use rebind for all affected objects. | 12:40.59 |
Robin_Watts_ | I would have argued against using explicit contexts for outputs and streams | 12:41.45 |
paulgardiner | I have no feeling for that at the moment, so can't really join in. Any bit of code I should look at as example? | 12:42.31 |
Robin_Watts_ | At least part of the attraction of them to me is that they are a 'single entity' I can write to/pull from. | 12:42.36 |
| paulgardiner: pdf_lex | 12:42.42 |
tor8 | there is the convenience of typing (fewer arguments) versus convenience of remembering (ctx always goes first) | 12:43.48 |
Robin_Watts_ | That's a heavy user of streams. | 12:43.48 |
tor8 | Robin_Watts_: and I didn't even think the changes to pdf_lex were that bad... | 12:44.16 |
paulgardiner | I guess I'd tend to hide a context in most reasonably complex objects, but I can't say why. | 12:44.21 |
tor8 | but I think our sensibilities differ there | 12:44.23 |
paulgardiner | lex_string gets away with no explicit mention of ctx | 12:45.07 |
tor8 | I dislike hiding stuff, and I also dislike adding statefulness (which is what the rebind in streams is really doing as well, but harmlessly in practice since they're not threadsafe anyway) | 12:45.14 |
Robin_Watts_ | paulgardiner: and lex_number, and lex_name, and... | 12:45.29 |
paulgardiner | some use it just when issuing a warning | 12:45.57 |
tor8 | regarding read_byte, there have been places like that where we've had to change the api to introduce a context should the function or a child of the function, suddenly require some allocing or error throwing | 12:46.08 |
| not with the streams, since they come packaged with a context | 12:46.24 |
| but the pixmap functions were a bit problematic | 12:46.30 |
Robin_Watts_ | tor8: But we HAVE a context there (just packaged away), so changing API is not an issue. | 12:46.39 |
tor8 | some of the accessors were context-less because the users of the pixmap accessors didn't have a context available | 12:46.53 |
Robin_Watts_ | pixmaps were a problem because no context was bound. Different case. | 12:46.54 |
paulgardiner | Not sure why fz_buffer doesn't wrap a context | 12:47.13 |
tor8 | Robin_Watts_: yes. arguing for the general design principles. | 12:47.15 |
| not specific instances | 12:47.25 |
Robin_Watts_ | tor8: In Picsel's code we passed a context everywhere. | 12:47.41 |
| but we also used error returning rather than exception handling. | 12:48.01 |
| which was (IMHO) a mistake. | 12:48.09 |
tor8 | coming back to touch code I haven't looked at in a while, it annoys me how I always have to keep looking up where the context comes from :( | 12:48.37 |
| and whether it should take a ctx, or a document in some of the pdf_ functions | 12:48.58 |
| Robin_Watts_: what was the mistake: error returning or passing a context everywhere? | 12:49.21 |
Robin_Watts_ | pdf_ functions should always take a document, right? | 12:49.26 |
| Error returning was the mistake. | 12:49.37 |
tor8 | Robin_Watts_: 90% of them do, but not all | 12:49.37 |
Robin_Watts_ | You effectively throw away C's 'return a value from a function' abilities because you're always returning error codes. And no one likes checking error codes anyway. | 12:50.17 |
| I was saying that Picsel's code was really good in some ways, but not perfect. | 12:50.45 |
tor8 | in pdf: the cmap, font metrics, crypt, pdf_to_rect, resource store take fz_context rather than pdf_document | 12:50.55 |
paulgardiner | In picsel code, perhaps more than 10% was error tests and early returns, completely avoidable with exception handling | 12:51.07 |
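The trade-off described here, error codes consuming the return value and forcing a test after every call, can be illustrated with a small sketch. All names below are hypothetical; this is neither Picsel's code nor MuPDF's actual `fz_try` machinery, only a toy contrast between the two styles using standard `setjmp`/`longjmp`.

```c
#include <setjmp.h>

/* Style 1: error-code returns. The real result has to travel
 * through an out-parameter because the return value is taken. */
int parse_digit_rc(char c, int *out)
{
    if (c < '0' || c > '9')
        return -1;
    *out = c - '0';
    return 0;
}

int sum_digits_rc(const char *s, int *out)
{
    int total = 0;
    for (; *s; s++) {
        int d;
        int rc = parse_digit_rc(*s, &d);
        if (rc != 0)
            return rc;          /* test and early return at every call site */
        total += d;
    }
    *out = total;
    return 0;
}

/* Style 2: exception-like handling. A toy context holds the jump target. */
typedef struct { jmp_buf env; } toy_ctx;

int parse_digit_ex(toy_ctx *ctx, char c)
{
    if (c < '0' || c > '9')
        longjmp(ctx->env, 1);   /* "throw": unwinds to the setjmp below */
    return c - '0';             /* return value is free for the result */
}

int sum_digits_ex(toy_ctx *ctx, const char *s)
{
    int total = 0;
    for (; *s; s++)
        total += parse_digit_ex(ctx, *s);   /* no per-call error test */
    return total;
}

/* Returns the digit sum, or -1 if anything "threw". */
int sum_digits_guarded(const char *s)
{
    toy_ctx ctx;
    if (setjmp(ctx.env) != 0)
        return -1;              /* "catch" */
    return sum_digits_ex(&ctx, s);
}
```

In the second style the intermediate function contains no error handling at all, which is the saving paulgardiner is pointing at; the cost is that a context (here `toy_ctx`) has to reach every function that might throw.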
Robin_Watts_ | tor8: Right, cos they are functions that work independent of the document. | 12:51.21 |
| gs suffers in the same way in that regard. | 12:51.48 |
paulgardiner | In MuPDF, I wonder if there is a risk of almost 5% being the chars "ctx, " if we tend towards explicit contexts | 12:51.51 |
tor8 | Robin_Watts_: yeah. and then there's the set of functions that don't take any (like pdf_dict_gets) | 12:51.56 |
Robin_Watts_ | pdf_objects have pdf_documents bound to them. | 12:52.14 |
tor8 | which makes me wonder why pdf_to_rect takes a fz_context where it ought not to have to | 12:52.41 |
| since it has a pdf_object already | 12:52.45 |
Robin_Watts_ | (pdf_object_rebind(object, ctx) ? ) | 12:52.48 |
tor8 | but pulling the context out of the object is non-trivial | 12:52.59 |
Robin_Watts_ | tor8: Probably hysterical raisins. | 12:53.09 |
| tor8: It used to be the case that not all objects had documents. | 12:53.22 |
| only indirect objects had doc pointers. | 12:53.30 |
tor8 | Robin_Watts_: most likely, yes | 12:53.40 |
Robin_Watts_ | Now all objects have pointers we could probably simplify that API. | 12:53.48 |
tor8 | but then we added contexts | 12:53.50 |
Robin_Watts_ | No, IIRC, the addition of pdf_documents to all pdf_objects postdates contexts by quite a lot. | 12:54.22 |
tor8 | which was at the same time that all objects gained a pdf_document (or did we prune the fz_context after that? so many changes, not enough memory) | 12:54.37 |
Robin_Watts_ | It used to be that no pdf_object needed a pdf_document pointer except for the indirect ones. | 12:55.17 |
tor8 | paulgardiner: yes, "ctx, " would be a very big theme in mupdf (even more so than now) | 12:55.21 |
paulgardiner | I think possibly I had to replace contexts by documents in pdf_objects for the incremental update work | 12:55.33 |
tor8 | maybe ctx is a bad name, should have picked a single uppercase character or something simpler | 12:55.38 |
| I can't count the number of times I've spelled it cxt | 12:55.49 |
Robin_Watts_ | What we really need is a hacky set of scripts that do source transformation on our code, and add 'ctx' transparently to all the functions at compile time! | 12:56.15 |
Robin_Watts_ | slams head against desk repeatedly. | 12:56.24 |
tor8 | Robin_Watts_: I'm sure we can do that with a clang plugin! ;) | 12:56.40 |
paulgardiner | I think I'm still siding on hiding contexts in most reasonably complex objects and using rebind. Whenever I see f(fz_context *ctx, fz_buffer *fzbuf, ... I think "Hmmm, why do I need to pass a context when I'm already passing an fz_buffer?" | 12:58.26 |
Robin_Watts_ | decides not to commit the fix for 694842 yet. To give them a chance to ruminate on becoming a supported customer. | 12:58.44 |
| oh, but ray gave them a workaround already :( | 12:59.41 |
tor8 | paulgardiner: my gut response to that would be to pass ctx everywhere and groan regularly at having to type "ctx, " rather than trying to remember when and where I need to pass what | 13:00.08 |
| seems like the simple decision to add rebinding has spiraled away from that simple decision... | 13:00.50 |
paulgardiner | But we wouldn't need to remember if the few exceptions also carried a context | 13:00.55 |
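The two positions in this exchange can be put side by side in a minimal sketch. The types and functions below (`toy_context`, `plain_buffer`, `bound_buffer`, `bound_rebind`) are invented for illustration and are not MuPDF's `fz_context` or `fz_buffer` API; the sketch only assumes that a context is where warnings get recorded.

```c
#include <stddef.h>

typedef struct { int warnings; } toy_context;

/* Explicit style (tor8): the object carries no context,
 * so every call names the context it should report to. */
typedef struct { size_t len, cap; } plain_buffer;

void plain_append(toy_context *ctx, plain_buffer *buf, size_t n)
{
    if (buf->len + n > buf->cap) { ctx->warnings++; return; }
    buf->len += n;
}

/* Bound style (paulgardiner): the context is packaged inside the object. */
typedef struct { toy_context *ctx; size_t len, cap; } bound_buffer;

/* Rebinding mutates the object; this is the added statefulness
 * tor8 objects to, since it must happen before another thread's
 * context uses the buffer. */
void bound_rebind(bound_buffer *buf, toy_context *ctx)
{
    buf->ctx = ctx;
}

void bound_append(bound_buffer *buf, size_t n)
{
    if (buf->len + n > buf->cap) { buf->ctx->warnings++; return; }
    buf->len += n;
}
```

With the bound style the call sites are shorter, but which context receives a warning depends on the most recent `bound_rebind`, not on anything visible at the call.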
kens | has completely lost the plot of this thread........ | 13:01.19 |
tor8 | if we renamed ctx to _ everywhere, it would look really odd! | 13:01.59 |
| fz_foo_my_bar(_, foo, bar) | 13:02.21 |
Robin_Watts_ | tor8: Source transformations are bad, M'kay. | 13:03.08 |
| Unless localised to single files. | 13:03.38 |
| Or to add exception handling :) | 13:03.45 |
tor8 | Robin_Watts_: no argument there. _ is a valid identifier, was my point, and maybe less intrusive than ctx. | 13:03.49 |
| but it looks weird | 13:04.02 |
Robin_Watts_ | fz_try(_) { ... } | 13:04.23 |
tor8 | #define fz_try fz_try_imp(ctx) would have looked nicer IMO | 13:04.42 |
Robin_Watts_ | Then people would argue for "why not roll _ into the fz_try ?" and that way lies madness. | 13:04.47 |
| madness! | 13:04.53 |
| That means that everyone who calls a mupdf function from their own code needs to be sure to have a context called ctx. | 13:05.11 |
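The objection to a macro that bakes in `ctx` can be shown with a toy example. The macro and functions here are hypothetical (they are not MuPDF's `fz_try` or any real MuPDF macro); the point is only that a macro expanding to a fixed identifier forces that identifier on every caller.

```c
typedef struct { int warnings; } toy_context;

void toy_warn_imp(toy_context *ctx, const char *msg)
{
    (void)msg;          /* message ignored in this sketch */
    ctx->warnings++;
}

/* Implicit form: expands to a use of a variable literally named ctx. */
#define toy_warn(msg) toy_warn_imp(ctx, (msg))

int caller_with_ctx(toy_context *ctx)
{
    toy_warn("compiles: a variable named ctx is in scope");
    return ctx->warnings;
}

int caller_with_other_name(toy_context *context)
{
    /* toy_warn("...") would fail to compile here, because this
     * function has no identifier named ctx. */
    toy_warn_imp(context, "the explicit form accepts any name");
    return context->warnings;
}
```

Client code that happens to call its context something else, or has none in scope, would be locked out of the implicit form, which is the constraint Robin_Watts_ describes.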
tor8 | Robin_Watts_: and that's why we don't. | 13:05.25 |
Robin_Watts_ | I think we should park this and wait for sebras, otherwise we'll talk round and round in circles for even longer. | 13:06.02 |
tor8 | though sometimes I worry about exposing our exception handling macros to client code altogether | 13:06.03 |
| Robin_Watts_: okay. then I'll get on to look at paul's code | 13:06.28 |
| paulgardiner: the makefile changes and all that look good | 13:10.31 |
| your comments about private data failing for methods is worrying though | 13:10.42 |
| (but don't let that prevent you from pushing as is) | 13:11.14 |
paulgardiner | tor8: yeah, I need to find a forum to ask about that. | 13:11.49 |
sebras | pops in. | 13:44.39 |
| Robin_Watts_: hm... I think I need to read the logs carefully. maybe tonight. | 13:46.10 |
tor8 | Robin_Watts_: oh, I found another one. fz_color_converter also contains a baked in context. | 13:49.15 |
| and the fz_gel edgelist | 13:49.47 |
| and our png loader has an internal struct which embeds a context | 13:50.50 |
| and the tiff loader | 13:51.00 |
| and the structured text device "soup" struct | 13:51.20 |
| and line height and region mask structs | 13:51.42 |
| and the xml parser has an internal struct | 13:52.05 |
| all but the edge list and color converter are completely internal and not exposed to the public api, but we ought to fix those two | 14:02.36 |
Robin_Watts_ | Are the edge list and color converter public? | 14:20.56 |
tor8 | hm, actually, the edge list probably isn't | 14:21.21 |
Robin_Watts_ | But you are right, the color converter is. | 14:22.24 |
| personally, I'd favour (predictably enough :)) fz_rebind_color_converter. But we'll wait for sebras. | 14:22.48 |
sebras | tor8: did you get kobos today? | 14:40.45 |
Robin_Watts_ | kobo: n. pl. a monetary unit of nigeria? | 14:45.37 |
| equal to 1/100 of a naira, apparently. | 14:45.59 |
kens | e-book reader | 14:46.09 |
Robin_Watts_ | kens: Mine was funnier :) | 14:46.34 |
kens | Nigerian monetary unit sounds like a scam | 14:46.51 |
Robin_Watts_ | sebras and tor8 might have been offered the chance to get $3.5 million kobos out, but they need to open a local currency account first so the transfer can go through? | 14:48.02 |
sebras | Robin_Watts_: I hope I didn't genetically inherit the propensity to send money to Nigeria from my father... | 14:51.43 |
| kens: we (well, tor8) ordered kobos and yesterday they had reached germany. so I'm anticipating their delivery to be today. | 14:52.36 |
kens | Then I can look at Tor's at the next meeting | 14:53.12 |
Robin_Watts_ | sebras: Not coming from amazon.de I hope? They're all on strike. | 14:56.02 |
sebras | Robin_Watts_: nope, pixmania I think. | 14:58.56 |
Robin_Watts_ | So, tor8, while I wait for sebras to agree with me ( :) ) should I look at some of the newer mupdf bugs that have come in? | 15:02.51 |
| mvrhel_laptop: Morning. I just unleashed the dithering fix, and opened the enhancement bug as you requested. If I've missed anything, please say. | 15:03.40 |
| Morning henrys, marcosw | 15:05.42 |
marcosw | morning | 15:05.54 |
henrys | howdy | 15:05.58 |
Robin_Watts_ | marcosw: I hit problems with the cluster yesterday. | 15:06.03 |
| A few days ago, I committed one of shelly's patches, and you reported that it caused regressions (that the cluster showed correctly). | 15:06.43 |
| so he produced a better patch that (we believe) solves those regressions. When I cluster tested it, it told me that there were (basically) no changes. | 15:07.20 |
marcosw | presumably because the previous md5sums were cached. | 15:07.50 |
Robin_Watts_ | I believe that the cluster is failing to report the changes because it's changed back to the rendering it had just a few versions ago. | 15:08.12 |
| but it didn't say "these files failed to match, but matched a run within the last 50" or whatever it sometimes says. | 15:08.34 |
marcosw | really? i'll look into it. | 15:09.07 |
Robin_Watts_ | So I did 2 commits, one backing out the problem commit and another adding the new patch. | 15:09.13 |
| Doing a user run of those did not tell me "these files failed to match, but matched..." either. | 15:09.41 |
| marcosw: And, you know how you love it when I change 2800 files all at once? | 15:10.45 |
mvrhel_laptop | Robin_Watts_: ok thanks | 15:11.25 |
henrys | marcosw: I wonder if we should report these cluster bugs. It is nice when reading history to refer to the bug numbers but it may be overkill for the cluster. | 15:12.51 |
marcosw | normally there are only a dozen or so files | 15:15.25 |
| sorry, wifi is wonky today. I'm switching to my desktop | 15:16.55 |
marcosw1 | as I was saying, the email from the commit that backed out shelly's original patch, fa8b62e45f07564304d671b57cba7fede5d4729d, reported: | 15:20.42 |
| The following 174 regression file(s) had differences but matched at least once in the previous 25 runs: | 15:20.48 |
| tests_private/comparefiles/Bug687575a.pdf.cups.300.1 gs fathoms bohrs f42c717329ec13db00f2627bd89ad4c081379fd9 7 | 15:20.48 |
| tests_private/comparefiles/Bug687575a.pdf.pam.72.0 gs fermis furlongs f42c717329ec13db00f2627bd89ad4c081379fd9 7 | 15:20.48 |
| tests_private/comparefiles/Bug687575a.pdf.pbmraw.300.0 gs angstroms xeon f42c717329ec13db00f2627bd89ad4c081379fd9 7 | 15:20.48 |
| ... | 15:20.50 |
| which includes the affected files. | 15:20.55 |
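The report line quoted above ("had differences but matched at least once in the previous 25 runs") suggests a three-way classification of each result. The cluster code itself is not shown in this log, so the following is only a guess at the idea, with hypothetical names and with the assumption that each test file keeps a newest-first list of the md5sums from earlier runs.

```c
#include <string.h>

typedef enum {
    RESULT_UNCHANGED,        /* same md5 as the immediately preceding run */
    RESULT_MATCHED_EARLIER,  /* differs from the last run, matches an older cached one */
    RESULT_NEW               /* matches nothing within the window */
} result_kind;

/* history[0] is the most recent previous run; n is how many runs are
 * cached; window is how far back to look (25 in the quoted email). */
result_kind classify_md5(const char *current, const char *const *history,
                         size_t n, size_t window)
{
    size_t limit = n < window ? n : window;
    for (size_t i = 0; i < limit; i++) {
        if (strcmp(current, history[i]) == 0)
            return i == 0 ? RESULT_UNCHANGED : RESULT_MATCHED_EARLIER;
    }
    return RESULT_NEW;
}
```

Under this reading, Robin_Watts_'s case (a rendering that reverted to what it was a few commits earlier) falls in the middle category, which is the one the clusterpush emails were not listing.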
Robin_Watts_ | marcosw1: Yes, but the user clustertest emails did not include that. | 15:21.50 |
| The commit test cluster emails may have included it, but I was trying to check the effect before I pushed my commits to the repo. | 15:23.03 |
tor8 | sebras: no, I think tomorrow. they're still in germany. | 15:23.29 |
| Robin_Watts_: yeah, clearing off the bug list would be nice | 15:24.36 |
marcosw1 | Robin_Watts_: right, the clusterpush emails are cutting off the cache results. that might be a feature. | 15:25.21 |
Robin_Watts_ | Should I report it in featurezilla ? | 15:27.20 |
henrys | kens I left an answer in the logs about -C | 15:27.48 |
kens | I read it henrys and posted to SO | 15:28.00 |
henrys | kens: thanks | 15:28.17 |
kens | NP thanks for the info and suggestion :-) | 15:29.04 |
henrys | so time for the meeting - first recommendation is to make the meetings optional until 2014 - the next 2 are the 24th and 31st | 15:31.07 |
| and unless anyone has something we can dispense with today also with chris and ray out. | 15:32.16 |
chrisl | henrys: I'm here - next week I'm off | 15:32.32 |
mvrhel_laptop | henrys: this sounds like a great idea | 15:32.44 |
henrys | chrisl: oh I read that wrong but I'll stick with my recommendation | 15:32.54 |
paulgardiner | yeah, sounds good | 15:32.55 |
chrisl | henrys: but I also have nothing special for a meeting, so.... | 15:33.01 |
mvrhel_laptop | I am going to be beating on my p1 customer bug probably all week | 15:33.13 |
tor8 | henrys: just one question -- do I have the go-ahead to set up a bounty for fixing the final bits of the mupdf system font api for zeniko? | 15:33.18 |
henrys | tor8:yes | 15:33.36 |
tor8 | henrys: thanks. | 15:33.42 |
mvrhel_laptop | bug 694844 | 15:33.53 |
kens | I'm OK missing today, I will be on vacation next week also | 15:33.59 |
Robin_Watts_ | I have nothing for the meeting today (that I can think of), and making the meetings optional seems sensible, but I'll be around then anyway. | 15:34.03 |
kens | Oh and the 31st | 15:34.05 |
mvrhel_laptop | I will be around too | 15:34.22 |
chrisl | kens: Looks like our "friend" on bugzilla (Bug 694840) is now poking at SO - presumably didn't like our answers..... | 15:34.30 |
kens | :-) | 15:34.37 |
Robin_Watts_ | For the logs, if ray wants to pass anything along to me, I can try to lighten his load a bit. | 15:34.44 |
mvrhel_laptop | My inlaws come today and are here for 10 days so my hours might get a little odd though | 15:34.47 |
paulgardiner | I have one thing to report: MuPDF can now use JavaScriptCore in place of v8, and that seems to be working on iOS, although restricted to versions >= 7 | 15:34.56 |
henrys | mvrhel_laptop: so you'll be working more? | 15:35.07 |
Robin_Watts_ | paulgardiner: Excellent. | 15:35.16 |
mvrhel_laptop | ;) | 15:35.18 |
Robin_Watts_ | mvrhel_laptop: Did you fit a bolt on the inside of your door yet? | 15:35.30 |
mvrhel_laptop | unfortunately my office is also the guest room | 15:35.47 |
chrisl | Robin_Watts_: good point - if Ray needs to pass stuff around, I'll pitch in, too | 15:35.54 |
mvrhel_laptop | so I get booted out | 15:35.58 |
kens | chrisl now he claims he's getting a seg fault | 15:36.38 |
| Which duplicates what I saw IIRC | 15:36.47 |
chrisl | kens: yeh, I'm tempted to just post a link to the bugzilla thread.... | 15:37.40 |
henrys | chrisl: I did have a question about the fonts, jargon sent an update does that affect the shipping 35? Or is it part of the 136 not in the 35? | 15:37.53 |
| s/jargon/juergon | 15:38.03 |
kens | chrisl I think I would, go ahead:-) | 15:38.11 |
chrisl | henrys: only the 136 - if that problem was in the 35, we'd have heard about it a *lot*! | 15:38.44 |
mvrhel_laptop | henrys: I am going to be out part of the morning. have to go to my daughter's school to watch a presentation | 15:39.27 |
henrys | chrisl: good, have you verified the 35 are a subset of the 136? | 15:40.04 |
| mvrhel_laptop: np | 15:40.18 |
chrisl | henrys: no, I haven't - I shouldn't really have to.... | 15:40.31 |
henrys | chrisl: I just want to make sure a 136 configuration isn't going to regress. | 15:41.49 |
| chrisl: a customer is showing more interest in that. | 15:42.09 |
paulgardiner | Oh, and another thing I need to bring up, I think we may have a bug in the Android app that messes up screen update, but only for devices that support hardware acceleration and are running 3.2 | 15:42.11 |
| I'd try to confirm that and fix it if I had such a device. | 15:42.43 |
henrys | chrisl: if the 35 are identical then we can rule out many regression possibilities. | 15:42.53 |
chrisl | henrys: I'll look at it. But we don't even have a 136 font configuration that works at the moment | 15:43.20 |
henrys | chrisl: we've set one up before, I guess the font map got tossed. It's not in there? I thought kens did a 136 config but I might be mistaken. | 15:44.47 |
chrisl | kens: that SO thread has been "migrated" before I could post a reply - what an idiotic idea. | 15:44.49 |
kens | Yeah that happens sometimes | 15:45.00 |
chrisl | henrys: the font name<->file name<->industry name mapping that URW supplied (that kens looked at) was incomplete and had errors | 15:45.40 |
henrys | chrisl: I guess that should become the priority then we seem close to selling something like that. Sorry, I did think it worked. | 15:46.40 |
chrisl | henrys: all the file names (and many font names) change when we get an update from URW, so every update renders our fontmap invalid. | 15:48.38 |
marcosw1 | Robin_Watts_, et al.: I've modified the code so that clusterpush regression runs now report when md5sums matched cached results but it appears that not reporting these was intentional. I quickly checked the commit logs for the cluster code but couldn't find anything related to this (and git blame wasn't any help). I'm trying a clusterpush now to make sure there aren't unexpected side effects. | 15:48.59 |
Robin_Watts_ | marcosw1: ok, thanks. | 15:50.01 |
chrisl | henrys: it's not really feasible for me to visually inspect each of the 136 URW fonts, comparing to the Adobe ones, and work out which maps to which. We'll have to get an updated mapping from URW | 15:50.24 |
henrys | marcosw: I am sort of used to my nice cluster pcl testing times - now I'm doing gs cluster pushes, it's like going back to DSL or something. I guess if everyone is happy with it I won't complain but I'd vote for more horsepower. | 15:50.34 |
Robin_Watts_ | henrys: Judicious use of lowres and -filter can help a lot. | 15:50.57 |
chrisl | henrys: and kens's comments here: http://bugs.ghostscript.com/show_bug.cgi?id=691213#c4 are relevant | 15:51.08 |
Robin_Watts_ | or is it lores? I forget | 15:51.22 |
chrisl | lowres | 15:51.28 |
Robin_Watts_ | chrisl: thanks. | 15:51.39 |
chrisl | I used it a *lot* | 15:51.54 |
henrys | chrisl: well first we can at least make sure the 35 are the same as what we release. That's easy right. Just look at the md5 for the file I'm sure the size is enough. | 15:52.21 |
chrisl | henrys: I'm not sure it is easy - the way they come from URW, there's no surety that the file names and font names match | 15:53.27 |
marcosw1 | henrys: it's just a question of dollars. I can add a couple of new (used) 96 core 2U servers to the cluster for ~$2500, that should cut our run times down by 25% or so. There is the question of where to put them...I don't think my garage can take two more without running into power issues (I think all the outlets in my garage are on one 15 amp breaker). | 15:56.21 |
henrys | chrisl: well if that is the case we need to get urw in on this to make this right surely the 35 standalone must match the corresponding 35 in the 136 set. | 15:56.25 |
| marcosw1: did you say 96 core | 15:57.47 |
chrisl | henrys: okay, so the font file names match, but they don't MD5 the same, so they are not identical - that, of course, doesn't mean the fonts are different..... | 15:57.57 |
Robin_Watts_ | chrisl: Presumably we can assume that the new ones are 'better' though? | 15:58.32 |
marcosw1 | yes. They are xeon 6 core cpus, with hyperthreading, two per board, and four boards per chassis. So 6*2*2*4=96. | 15:58.35 |
chrisl | Robin_Watts_: <shrug> define "new" - I'm not sure how diligent URW are with the version numbers. | 15:59.33 |
Robin_Watts_ | We must have a PS program to dump out a font, right? so we could do print outs of the old vs new, and fairly easily spot dropped glyphs etc. | 15:59.51 |
marcosw1 | that's what angstroms/beards/bohrs/microns is. Each of the boards is an independent node (but with a shared power supply). | 15:59.55 |
henrys | marcosw1: miles' office? | 16:00.28 |
chrisl | Robin_Watts_: we can do that - I don't find it that easy, though. I quickly get bored and miss things | 16:00.46 |
marcosw1 | tried that, they were too loud. that's why they are in my garage :-) | 16:00.50 |
Robin_Watts_ | I thought we'd tried that and miles declined on the grounds IT WAS TOO NOISY. | 16:00.52 |
| chrisl: Or print to bitmaps, then eor the bitmaps? :) | 16:01.23 |
| harder to miss stuff :) | 16:01.37 |
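The "eor the bitmaps" suggestion can be sketched as follows. This assumes both renderings are 1-bit-per-pixel buffers of identical size (for example two `pbmraw` pages with headers already stripped); the function name is made up for this note and is not part of Ghostscript.

```c
#include <stddef.h>

/* XOR two equally sized 1-bit-per-pixel bitmaps into diff and
 * return the number of pixels that differ. Identical renderings
 * give an all-zero diff and a count of 0. */
size_t xor_bitmaps(const unsigned char *a, const unsigned char *b,
                   unsigned char *diff, size_t nbytes)
{
    size_t differing = 0;
    for (size_t i = 0; i < nbytes; i++) {
        unsigned char x = (unsigned char)(a[i] ^ b[i]);
        diff[i] = x;
        /* Count set bits: each step clears the lowest set bit. */
        for (; x; x = (unsigned char)(x & (x - 1)))
            differing++;
    }
    return differing;
}
```

A dropped or changed glyph shows up as a cluster of set bits in `diff`, so a nonzero count flags a page for inspection without anyone having to compare printouts by eye.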
marcosw1 | I'm checking the specs now, presumably there is good information on power requirements available (they are meant to go in data centers, so power is a major concern). | 16:02.02 |
chrisl | Robin_Watts_: There's lots of ways, yes. But I first need to get the fonts into state that GS can use them - or generate a new fontmap | 16:02.16 |
marcosw1 | The other choice is to rent a 19" rack at a colo. I started looking into it a while ago but stopped when my garage worked as a solution. | 16:02.50 |
henrys | marcosw1: we can't have a contractor come out and insulate the room a bit? | 16:03.20 |
marcosw1 | at miles' office? No, there is only one room (there is a divider between the office and the "back room", but it doesn't go all the way to the ceiling). | 16:03.59 |
henrys | I guess I can stick with Robin_Watts_ solution of lowres and filter for now. | 16:06.32 |
| marcosw1: should we be replacing servers with these new machines? | 16:07.53 |
Robin_Watts_ | I suspect that people would be unwilling to have a nice quiet server replaced with an uberserver with screaming fans. | 16:08.37 |
henrys | marcosw1: any liquid cooled options? | 16:09.23 |
marcosw1 | So the 2U servers come with a 1100W power supply; which means I can't really run two of those on one circuit. OTOH, I do have 240V available in my garage, for charging the electric car, and that is a 30 amp circuit, I wonder if the power supplies on the dell servers are dual voltage. | 16:09.31 |
Robin_Watts_ | Also, swapping 4 nodes for 1 node will hit the internet connectivity harder. | 16:09.54 |
marcosw1 | henrys: I liquid cool all of the cluster nodes in my office (i7, x6, and inches). but it's not an option for a 2U chassis. | 16:10.17 |
Robin_Watts_ | henrys: These servers are second hand and designed for data centres. | 16:10.27 |
| noise there is not considered a problem. | 16:10.38 |
marcosw1 | I'll look into the colo option again. | 16:11.02 |
henrys | Robin_Watts_: but there must be quiet high powered workstations out there. | 16:11.58 |
Robin_Watts_ | henrys: yes, but they can't compete for these things for bang for buck. | 16:12.19 |
| You can't buy the CPUs new for the prices this place is selling the servers with CPUs and RAM. | 16:13.21 |
henrys | Robin_Watts_: I think it is okay to consider higher priced options, within reason. | 16:13.22 |
Robin_Watts_ | If we've got to spend $2500 to get a 25% reduction in time, I'd say we were into diminishing returns. | 16:14.10 |
| Whatever happened to the idea of using transitive closure of implication to cull the test files? | 16:14.52 |
marcosw1 | henrys: it becomes expensive fast. my i7 was nearly $2000 and it's a single 6 core i7 (so 12 threads) and it's only slightly faster than one fourth of a dell C6100 2U box (which costs $1200). So $2000 vs. $300 for quieter. | 16:14.56 |
| Robin_Watts_: I did that. It doesn't remove enough of the test files to be interesting. | 16:15.13 |
henrys | marcosw1: wow | 16:15.16 |
Robin_Watts_ | henrys: These servers are an INCREDIBLE deal. | 16:15.28 |
marcosw1 | henrys: of course it's also new vs. used. | 16:15.53 |
Robin_Watts_ | marcosw1: bugger. | 16:16.18 |
henrys | marcosw1: what are typical colo prices, are we going to eat through the savings? | 16:16.28 |
marcosw1 | henrys: it's hard finding colo prices without talking to a human. I'm guessing $200/month is about what we'd need to pay. | 16:17.38 |
henrys | we have to get miles and joann ear protection. | 16:17.41 |
Robin_Watts_ | http://www.colounlimited.com/pricing-costs | 16:18.24 |
| Assuming that's typical, it's a lot. | 16:18.52 |
marcosw1 | we don't need much bandwidth, our biggest requirement is power, which means that california is going to be a bad option (oregon is apparently the cheapest power). | 16:19.07 |
Robin_Watts_ | http://www.serverpronto.com/colocation.php $549 for a half rack. | 16:20.06 |
henrys | well 200 a month over a year you lose your savings completely on one server. Assume we can get 3 years out of these things. the quiet machines look more attractive. | 16:22.13 |
| did I miss something? | 16:22.41 |
marcosw1 | henrys: we'd need more than one server to make a difference. | 16:23.02 |
| I suggested adding two 2U dell servers. That's the equiv. of 8 desktop machines at $2000/each. so $16 000. | 16:23.35 |
| (note my use of a space instead of ',' or '.'. don't want to confuse tor8). | 16:24.04 |
Robin_Watts_ | colocation is a pain in the ass though, as inevitably something goes wrong and someone has to book an appointment and go there. | 16:25.52 |
henrys | marcosw1: oh okay - but still over say a lifetime of 3 years we are going to pay 7200.00 colo - that's quite a bit of money | 16:25.58 |
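The figures traded in this part of the discussion can be checked with a few lines of arithmetic. The inputs are the ones quoted in the chat: 6 cores with hyperthreading, two CPUs per board, four boards per chassis; roughly $1200 per used C6100 chassis; about $2000 per comparable desktop; and marcosw1's guess of $200 per month for colocation. The function names are just labels for this sketch, and power costs are left out.

```c
/* Hardware threads in one chassis, from the factors quoted in the chat. */
int chassis_threads(int cores_per_cpu, int threads_per_core,
                    int cpus_per_board, int boards)
{
    return cores_per_cpu * threads_per_core * cpus_per_board * boards;
}

/* Total colocation fees over a period. */
long colo_total(long dollars_per_month, int months)
{
    return dollars_per_month * months;
}

/* Purchase price plus hosting fees over the machine's lifetime. */
long lifetime_cost(long purchase, long dollars_per_month, int months)
{
    return purchase + colo_total(dollars_per_month, months);
}
```

With those inputs, `chassis_threads(6, 2, 2, 4)` gives the 96 that marcosw1 calls cores, `colo_total(200, 36)` gives henrys's 7200, and two chassis hosted for three years come to `lifetime_cost(2 * 1200, 200, 36)`, that is 9600, against 16000 for eight desktops with no hosting fee.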
marcosw1 | I have to run. I'll call he.net and see what deals they have going on for colo and put together a summary of the costs. | 16:25.59 |
Robin_Watts_ | marcosw1: An alternative idea is for Artifex to pay to have a new ringmain put into your garage. | 16:26.21 |
marcosw1 | Robin_Watts_: historically the nodes in my garage have been rock solid. They can be rebooted remotely, which I have had to do once or twice, via a second ip address. You can even remotely install a new bios or reinstall an operating system onto bare metal, it's really cool. | 16:27.50 |
Robin_Watts_ | marcosw1: I believe that is Sods Law at work. | 16:28.44 |
| It's only machines in remote locations that go wrong. | 16:28.58 |
marcosw1 | besides he.net is just down the hill from me, I could walk there (can't walk back, since it's up hill :-) | 16:29.10 |
Robin_Watts_ | Seriously, is having a new ring main installed in your garage an option? | 16:29.29 |
| Having the machines local to you is also extremely useful in that overnight tests can only be run on your local nodes, right? | 16:29.58 |
marcosw | Robin_Watts_: yes, a second circuit in my garage would be an option, the mains panel is in the garage, so it would be very little work. | 16:31.12 |
henrys | however having fast internet on each node might make them more flexible if we decide we want to move bitmaps or something. | 16:31.41 |
Robin_Watts_ | That seems the sanest/cheapest option to me. | 16:31.52 |
| henrys: Moving bitmaps is precisely why the overnight tests have to run on local-to-marcos nodes. | 16:32.17 |
marcosw | but no, these machines aren't useful for overnight runs, since they don't have monitors attached. And moving the bitmaps off of them isn't fast. They are connected to my intranet via ethernet over power, which is slow and flakey. | 16:32.47 |
henrys | marcosw1: what do you Californian families do when both your cars run out of charge | 16:33.07 |
| ? | 16:33.12 |
Robin_Watts_ | marcosw: How much would it cost to get the electrician to also run you a cat5 cable? :) | 16:33.14 |
| Or a wifi link ? | 16:33.55 |
marcosw | Robin_Watts_: my wifi is already overloaded, with 4 laptops, phones, video games, tablets, etc. | 16:35.06 |
Robin_Watts_ | Powerline ethernet should be plenty fast, unless you have a very noisy supply. | 16:35.08 |
| You can get 1Gbps adapters now. | 16:35.31 |
marcosw | for whatever reason it's not. I don't know if it's the distance I'm trying to run or if we have noisy power. | 16:35.40 |
Robin_Watts_ | OK, so professor google suggests that 100Mb/s is typical. | 16:36.17 |
| :( | 16:36.20 |
marcosw | henrys: the people I know who have more than one electric car (a Leaf and a Tesla) have two chargers. | 16:36.26 |
Robin_Watts_ | marcosw: Would running a cat-5 cable be an option? | 16:36.40 |
| 1 off cost of new ring main + cat-5 cable still sounds low compared to ongoing colo costs. | 16:37.17 |
marcosw | running cat6 wouldn't be too difficult, but I really don't want to become the Artifex colo facility. among other reasons I do plan on finishing my PhD sometime. | 16:37.23 |
Robin_Watts_ | ah. that's a good point. | 16:37.49 |
| mvrhel has a garage, right? :) | 16:37.56 |
| Actually, henrys has an entirely spare house. | 16:38.25 |
| Or did that get sold the other day? | 16:38.35 |
henrys | Robin_Watts_: I sold it yesterday | 16:38.38 |
| sorry | 16:38.42 |
Robin_Watts_ | Damn. A day too late. | 16:38.44 |
marcosw | Robin_Watts_: you are assuming energy costs in my house are $0 (Jill keeps telling me I should charge artifex for the 10 cluster nodes that are drawing power in my house). I think we decided it was $20/node/month or something like that. | 16:39.07 |
| oops, now I really, really need to go. | 16:39.14 |
henrys | Robin_Watts_: I wonder if construction in Miles' office is possible - that is really where the servers should live. | 16:39.51 |
kens | sound proofing | 16:40.16 |
Robin_Watts_ | Picsel built a "server room" in their offices by partitioning off an area. | 16:40.25 |
henrys | kens: I think just a solid wall with insulation would do fine. | 16:40.39 |
Robin_Watts_ | It overloaded the air con, so they had to install a portable aircon unit in there too. | 16:40.48 |
| And then they had to stand it in a paddling pool to solve condensation problems. | 16:41.06 |
| A sound proofed room would need aircon, which itself produces noise and power problems. | 16:41.39 |
henrys | they just need to go work at home and we'll take over the office. | 16:42.23 |
mvrhel_laptop | Robin_Watts_: questions for you | 16:44.03 |
| are you available for sec? | 16:44.47 |
chrisl | henrys: we definitely have at least one regression in the 136 set compared to the 35.... and looking at the clusterpush, possibly more | 16:45.35 |
Robin_Watts_ | mvrhel_laptop: Sure. | 16:45.40 |
mvrhel_laptop | So I have a case, where we have a path that is stroked with a pattern that includes a transparency | 16:47.04 |
| I had added code some time back, around line 2211 in gdevp14.c, that handled the fill case for this | 16:48.05 |
Robin_Watts_ | ok. | 16:48.13 |
mvrhel_laptop | where it ends up calling pdf14_tile_pattern_fill and going through the rect list | 16:48.27 |
| unfortunately, it appears that this was not done for the stroking case :( | 16:48.49 |
Robin_Watts_ | ok. | 16:49.19 |
mvrhel_laptop | so pdf14_stroke_path calls gx_default_stroke_path | 16:49.26 |
Robin_Watts_ | All our stroking code has 2 paths through it. | 16:49.37 |
mvrhel_laptop | ok. this is where I need your help | 16:49.50 |
Robin_Watts_ | The first path through it is used for "simple" cases (non transparency, idempotent plotting) | 16:50.05 |
| The second path through it is used for more complex cases (non idempotent plotting, transparency). | 16:50.29 |
| The second path, IIRC, basically makes a new path that is equivalent to the stroked outline and then fills it. | 16:51.06 |
mvrhel_laptop | oh may that is what is not happening | 16:51.38 |
| s/may/maybe | 16:51.45 |
Robin_Watts_ | Maybe this is a red herring. If you step through the code, do you make it into the stroking stuff before it crashes? | 16:52.08 |
mvrhel_laptop | Robin_Watts_: yes I am in pdf14_stroke_path and beyond | 16:52.28 |
| it ends up trying to do a tiling from there | 16:52.41 |
| with the transparency tile | 16:52.52 |
| and explodes since no destination buffer has been set up (i.e. a new group) | 16:53.10 |
Robin_Watts_ | No, I meant, do you get into gx_stroke_path_only_aux ? | 16:53.17 |
mvrhel_laptop | yes | 16:53.20 |
| from there, it is trying to tile | 16:53.27 |
| with the transparency tile code | 16:53.33 |
| but nothing has been set up to do that | 16:53.37 |
henrys | chrisl: so it sounds like I should contact urw and figure out why the 35 in our release and the 35 in the 136 are different? | 16:53.53 |
chrisl | henrys: Yes, I would prefer that to me having to create a list of problems! | 16:54.25 |
henrys | chrisl: will do and then we'll go from there. | 16:54.44 |
Robin_Watts_ | mvrhel_laptop: Probably would make sense for me to run the same thing here to see the backtrace etc, to save me asking you lots of questions. | 16:55.01 |
mvrhel_laptop | Robin_Watts_: let me check what line_proc it is picking | 16:55.03 |
Robin_Watts_ | Can you point me at a bug number/file/command line etc? | 16:55.15 |
mvrhel_laptop | Robin_Watts_: ok the file is the simplified file on bug 694844 | 16:55.50 |
| and just use the command line on the bug | 16:56.06 |
| -sDEVICE=ppmraw -o test.ppm -r100 | 16:56.15 |
| if you put a break point in gx_stroke_path_only_aux it is the only stroke | 16:56.47 |
Robin_Watts_ | Does this happen in the non clist case too ? | 16:57.24 |
mvrhel_laptop | That one is odd. It does not seem to crash, but I can't tell if it is drawing anything | 16:57.53 |
chrisl | henrys: actually, it *looks* like most are just pixel differences - which there shouldn't be but aren't that serious | 16:58.03 |
mvrhel_laptop | Robin_Watts_: the tile is read out of the clist just fine | 16:58.31 |
| including the transparency tile | 16:58.43 |
| the issue where it is crashing is due to the fact that our destination buffer has not been created | 16:59.05 |
Robin_Watts_ | ok, my instinct is always to try to remove the clist wherever possible :) | 16:59.13 |
mvrhel_laptop | since we are not doing a "fill" | 16:59.17 |
| Yes, but this is a pretty simple case I believe | 16:59.28 |
kens | OK off to have a pizza, night all | 16:59.28 |
Robin_Watts_ | mvrhel_laptop: OK. I have it crashing here. Just need to refresh my build as the debugger is complaining I'm out of date. | 17:00.10 |
mvrhel_laptop | Robin_Watts_: I have to head to school now | 17:00.10 |
| sorry | 17:00.16 |
Robin_Watts_ | I'll be here for another 3 or so hours yet :) | 17:00.26 |
mvrhel_laptop | otherwise I miss my daughter's presentation | 17:00.28 |
| ok. just type in the logs if I miss you please. | 17:00.36 |
Robin_Watts_ | possibly longer. | 17:00.36 |
mvrhel_laptop | thanks for taking a look | 17:00.40 |
Robin_Watts_ | no worries. | 17:00.45 |
chrisl | henrys: so the definite regression is the germandbls issue we reported: http://bugs.ghostscript.com/show_bug.cgi?id=693827 | 17:00.46 |
henrys | chrisl: probably the thing to do is just delete the 35 in the 136 set and use our regular 35 | 17:02.44 |
chrisl | henrys: yes, that's a possibility. We still need the revised mapping from URW, though | 17:03.30 |
henrys | chrisl: and we should maintain a "101" set to add to our regular 35 | 17:03.36 |
chrisl | henrys: that's PITA given that the file names change with each release | 17:04.18 |
henrys | chrisl: for history's sake let's make a bug of the mapping problem and I'll reference it when I talk to URW | 17:04.43 |
| chrisl: does the internal font name change? | 17:05.05 |
Robin_Watts_ | mvrhel_laptop: (For the logs), so it does appear to be picking the wrong line_proc. | 17:05.21 |
chrisl | henrys: yes, it does, and URW's mapping doesn't include the internal name, either. | 17:05.36 |
henrys | simple enough to parse that out and rename the file - (a script to do that I mean) | 17:05.42 |
chrisl | Not when the font name changes, too | 17:05.54 |
| henrys: the fontmap issue already has a bug: http://bugs.ghostscript.com/show_bug.cgi?id=691213 | 17:06.19 |
Robin_Watts_ | stroke_fill is the line proc that is used to 'fill each section of the stroked line as it happens' rather than the others which are used to 'make me a path we can then fill at the end'. | 17:06.42 |
henrys | chrisl: got it thanks | 17:07.18 |
mvrhel_laptop | Robin_Watts_: ok so it is picking the wrong one | 17:08.04 |
Robin_Watts_ | yeah, just trying to remember how this ever works. | 17:08.24 |
mvrhel_laptop | Robin_Watts_: so we need to recognize that the pattern has a transparency | 17:08.27 |
| and use that information to pick the right one | 17:08.39 |
Robin_Watts_ | We specifically check for that. | 17:08.41 |
| stroke_line_proc_t line_proc = | 17:09.05 |
| ((to_path == 0 && !gx_dc_is_pattern1_color_clist_based(pdevc)) | 17:09.07 |
| ? stroke_fill : | 17:09.08 |
| (traditional ? stroke_add_compat : stroke_add_fast)); | 17:09.10 |
mvrhel_laptop | at line 428, I see that we check if it is clist based | 17:09.11 |
chrisl | henrys: based on the last releases we got, all the ways of identifying the font (file name, /FontName, /FamilyName and the name in the comment) all change between releases, enough to trip up automated solutions to creating a revised map. | 17:09.19 |
mvrhel_laptop | but this one is clist based and transparency | 17:09.19 |
| where does it check for transparency/ | 17:09.31 |
| ? | 17:09.34 |
Robin_Watts_ | has visions of the teacher saying "Do you have something more interesting on that laptop than this presentation Mr Vrhel? Perhaps you'd like to share it with the class?" | 17:09.47 |
| You're right, sorry. It was checking for clist, not transparency. | 17:10.29 |
mvrhel_laptop | sorry for the interruption sir ;) | 17:10.33 |
| shoot I have to go. bbiaw | 17:10.52 |
tkamppeter | chrisl, I have put up http://bugs.ghostscript.com/show_bug.cgi?id=694852 following our IRC discussion today. | 17:14.08 |
chrisl | tkamppeter: I see that, thanks. But I'll leave it to kens unless he asks me to look at it. I'll make sure he sees the bug, though. | 17:14.52 |
henrys | chrisl: you are saying the germandbls fix changed all the identifying features of the fonts? | 17:16.37 |
chrisl | henrys: yes, I had to manually "fix" all the names | 17:17.20 |
henrys | chrisl: so what is committed you fixed? | 17:19.45 |
chrisl | henrys: yes, I changed all the Postscript visible font names so they matched those in the previous release. | 17:20.43 |
| henrys: this is in the 35 fonts, not the 136 | 17:21.02 |
henrys | chrisl: this is why I don't like our rebase workflow in a normal merge environment you would have merged urw's stuff then made the change then merged that to trunk and both results would be preserved, but I digress. | 17:22.51 |
chrisl | henrys: that wouldn't work as we rename the font files to the Postscript name | 17:24.30 |
Robin_Watts_ | henrys: If we are getting releases from someone then those releases should go in unchanged on a branch. | 17:25.09 |
| Then we should merge those to the trunk. | 17:25.16 |
| We rebase only when we don't want to preserve history (which is most of the time), but can merge when required. We've done it for large development branches etc. | 17:26.19 |
| chrisl: git is capable of spotting renames across merges, I think. | 17:26.51 |
chrisl | Robin_Watts_: the problem with the 35 fonts is that we rename the files from their originals. | 17:26.51 |
henrys | chrisl: of course I forgot about that. | 17:27.21 |
chrisl | And we do have the original URW files in the font repo | 17:27.51 |
Robin_Watts_ | I'd be tempted to have a URWfonts branch with the original fonts checked in. | 17:28.24 |
henrys | chrisl: did you modify any names in the font? | 17:28.34 |
chrisl | henrys: yes, all of them!!! | 17:28.46 |
henrys | chrisl: or just the filename? | 17:28.52 |
Robin_Watts_ | Then from each point in that, I'd make a new temporary branch, and commit a rename of those files. | 17:28.59 |
| Then merge that to master. | 17:29.11 |
| Or something like that. | 17:29.17 |
chrisl | henrys: given that the file name we use in GS *has* to match the Postscript FontName, *obviously* I had to modify the "internal" font names | 17:29.44 |
henrys | chrisl: it doesn't need to match, the fontmap will allow another level of indirection for filenames. | 17:32.16 |
| at least it used to haven't looked at it in a while. | 17:32.49 |
| from fontmap.GS: /CharterBT-Roman(bchr.pfa); | 17:33.54 |
chrisl | henrys: for the default fonts: /Helvetica/NimbusSan-Reg; | 17:34.33 |
| henrys: I specifically asked you whether I should go through renaming all the fonts (and their contents) or modify the default fontmap, and you wanted font names changed | 17:35.17 |
henrys | chrisl: I did yes, I did the same thing with PCL and I'm starting to regret it because I now realize how hard it is to sort the URW history. | 17:36.10 |
chrisl | henrys: yes, I think it would have been easier to change the map - but as I hadn't done it before, I didn't know if there was a "policy", hence my asking | 17:37.06 |
henrys | chrisl: so now the question is, was my mistake stupid enough to be worth fixing after all this time. | 17:38.43 |
chrisl | henrys: I think we leave it until the next time we get an update from URW. With the 136 set, we change the fontmap file | 17:39.29 |
henrys | chrisl: sounds good. I'll get a note out to urw about the fontmap stuff and the rest of that bug today. | 17:40.32 |
| not fontmap - but you know what I mean. | 17:41.19 |
chrisl | henrys: the last comment about the missing /.notdef is (or should be) resolved, the rest I think it still valid | 17:41.53 |
henrys | okay | 17:42.07 |
| chrisl: yeah changing the names screws up the history but jeez why can't they use descriptive names for the font file names like everyone else. | 17:43.52 |
chrisl | henrys: beats me, I was just thinking the same thing. I'm wondering about converting the 136 fonts from the URW file names, to the PS style names (keeping the original FontNames). | 17:45.41 |
| henrys: currently the URW-136 fontmap does a double mapping for every font: "industry name" -> "URW name" then "URW name" -> file name | 17:47.00 |
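The double mapping described above can be pictured as following an alias chain until a file entry is reached. This is a minimal sketch in C, not Ghostscript's Fontmap code: `fontmap_entry`, `resolve_font` and `example_map` are hypothetical names, and `example.pfb` is a placeholder file name. Only the `/Helvetica` alias and the `/CharterBT-Roman` entry come from the log. It counts lookups only and does not model the extra font objects created in VM.

```c
#include <string.h>

/* One Fontmap-style entry: a name mapped either to another name (alias)
 * or to a file name. Hypothetical structure, for illustration only. */
typedef struct {
    const char *name;    /* font name being defined */
    const char *target;  /* alias target, or file name */
    int is_file;         /* 1 if target is a file name, 0 if it is an alias */
} fontmap_entry;

/* Example table: "industry name" -> "URW name" -> file, plus a direct entry. */
static const fontmap_entry example_map[] = {
    { "Helvetica",       "NimbusSan-Reg", 0 },
    { "NimbusSan-Reg",   "example.pfb",   1 },  /* placeholder file name */
    { "CharterBT-Roman", "bchr.pfa",      1 }
};

/* Follow aliases until a file entry is found. The number of table lookups
 * is stored in *hops. Returns NULL if the name is missing or the chain is
 * longer than 8 links (a guard against alias loops). */
static const char *resolve_font(const fontmap_entry *map, int n,
                                const char *name, int *hops)
{
    int depth, i;

    *hops = 0;
    for (depth = 0; depth < 8; depth++) {
        const fontmap_entry *hit = NULL;
        for (i = 0; i < n; i++) {
            if (strcmp(map[i].name, name) == 0) {
                hit = &map[i];
                break;
            }
        }
        if (hit == NULL)
            return NULL;
        (*hops)++;
        if (hit->is_file)
            return hit->target;
        name = hit->target;
    }
    return NULL;
}
```

With this table, resolving `Helvetica` takes two lookups and resolving `CharterBT-Roman` takes one.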
henrys | chrisl: yes that is what I was talking about above you aren't stuck with the postscript name | 17:47.50 |
chrisl | henrys: but every mapping means another font dictionary/object in VM - it has memory and performance implications | 17:49.02 |
| henrys: Oh, and it also can give confusing results when PS does resourceforall on the /Font resource | 17:51.01 |
henrys | chrisl: hmm I thought these were just aliases. I've never studied the implementation | 17:52.10 |
chrisl | henrys: with Postscript fonts we have to execute the font program, which creates the font object with the original FontName, and defines it in the FontDirectory, then we copy the dictionary, put the name of the font we're substituting for into the dictionary, and do another definefont, creating another font object and defining it in FontDirectory. | 17:54.26 |
| I'm not sure about Truetypes, those might be more direct | 17:54.46 |
henrys | chrisl: well pretty easy to quantify that overhead and it is fixed - only proportional to the number of resident fonts. | 17:56.02 |
chrisl | henrys: and only when they are used - we don't evaluate them all up front. | 17:56.38 |
henrys | I'm sure the double mapping was used before - the resourceforall issue should have come up, are you sure there isn't code somewhere that works around the problems. | 17:57.43 |
| ? | 17:57.51 |
chrisl | It probably depends where the files are stored - I put them in Resource/Fonts to save fighting with search paths | 17:59.08 |
Robin_Watts_ | tor8: ping | 18:22.40 |
| paulgardiner: ping | 18:51.29 |
paulgardiner | Robin_Watts_: pong | 18:53.50 |
sebras | Robin_Watts_: pong. | 18:53.53 |
Robin_Watts_ | sebras: ping :) | 18:54.03 |
| I have various commits on robin/master that should be good to go regardless of the result of the rebinding discussion. | 18:54.51 |
paulgardiner | Got to keep these things balanced | 18:54.52 |
Robin_Watts_ | Could I trouble one of you to look at them? | 18:55.04 |
| I will get links... | 18:55.07 |
sebras | Robin_Watts_: I'm too tired to really read and understand anything tonight. since my kobo will likely arrive at tor's tomorrow maybe I will be home and have a look at the discussion and see if I can contribute. :-/ | 18:55.13 |
Robin_Watts_ | http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=b362eb775e6af4c318ffa1985069b3757c284996 | 18:55.23 |
sebras | I guess this is the drawback with working until 4am in the morning for a week. | 18:55.31 |
Robin_Watts_ | sebras: Sure. I understand. | 18:55.40 |
| eek. | 18:55.41 |
| http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=425c0f2e84deba27f064d8dd0d5b96c44564c1ec | 18:55.52 |
| http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=2f264787e9dac0fdcbadfe431f549f18b4d3aafb | 18:56.03 |
sebras | Robin_Watts_: hello project delivery that is late by almost a full year. :-/ | 18:56.10 |
Robin_Watts_ | and *now* it's urgent? | 18:56.41 |
sebras | people have been speaking about a deadline since april and now they found out when the Real<tm> deadline was. | 18:57.08 |
| you would imagine that finding this out early on would be the task of a project/program manager, but anyway... | 18:57.42 |
chrisl | sebras: I resolved *never* to do those kinds of hours after first time...... and the second time..... and the.... oh, never mind! | 18:58.23 |
sebras | chrisl: exactly. | 18:58.37 |
chrisl | sebras: and then you'll waste man months afterwards doing some pointless post-mortem that will get completely ignored next time around :-( | 18:59.27 |
paulgardiner | Robin_Watts_: Can't see a problem with any of those. Perhaps if FZ_LOCK_FILE is unused it should be removed, but... | 18:59.31 |
Robin_Watts_ | paulgardiner: Perhaps. But then I thought we were trying to maintain public API's since 1.0 ish. | 19:00.01 |
paulgardiner | Yeah, makes sense | 19:00.27 |
Robin_Watts_ | thanks. | 19:03.26 |
| Hi tor8. | 19:06.41 |
| Just been looking at bug 694810. | 19:06.50 |
| I've fixed the actual reported bug, but the underlying problem with the files is still there. | 19:07.17 |
| The xref says: I have 7 objects, starting at 1. | 19:07.33 |
| And then starts to list the objects from object 0. | 19:07.44 |
| consequently we get the offsets read in wrong. | 19:07.54 |
| so later when we try to read object 3, we get object 2 and get confused and the page load fails. | 19:08.14 |
| Could we maybe try to call 'repair' on a file when we detect an object being wrong, even if it's after a page load ? | 19:09.46 |
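The off-by-one described above can be sketched as follows. A reader that trusts the subsection header assigns entry k to object `first + k`, so a header of "1 7" over a list that really begins at object 0 shifts every offset by one. This is an illustrative sketch in C, not MuPDF's code: `xref_entry`, `object_for_entry` and `corrected_first` are hypothetical names, and the correction heuristic is an assumption.

```c
#include <stddef.h>

/* One parsed xref entry: byte offset, generation, and 'n' (in use) or 'f' (free). */
typedef struct {
    long offset;
    int gen;
    char type;
} xref_entry;

/* Object number that entry k of a subsection describes, trusting the header. */
static int object_for_entry(int first, int k)
{
    return first + k;
}

/* Hypothetical heuristic: object 0 is the head of the free list and is
 * written as "0000000000 65535 f". A subsection that claims to start at a
 * different object but opens with that entry is probably mislabelled, so
 * treat it as starting at 0. */
static int corrected_first(int first, const xref_entry *entries, size_t count)
{
    if (count > 0 && first != 0 &&
        entries[0].type == 'f' &&
        entries[0].gen == 65535 &&
        entries[0].offset == 0)
        return 0;
    return first;
}
```

Trusting a header of "1 7", the entry at index 2 (really object 2) is filed as object 3, which matches the confusion described. With the corrected start it is filed as object 2.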
mvrhel_laptop | Robin_Watts_: back | 19:10.36 |
Robin_Watts_ | mvrhel_laptop: I've committed a fix. | 19:10.49 |
mvrhel_laptop | great! | 19:10.59 |
| did it fix both bugs? | 19:11.08 |
| I can check | 19:11.28 |
| 693365 was the other one | 19:11.38 |
Robin_Watts_ | The stroke code was generally careful to avoid doing segment by segment strokes in non idempotent cases, but the 'thin_lines' case was slipping through. | 19:11.49 |
mvrhel_laptop | oh I see | 19:12.06 |
Robin_Watts_ | I've fixed that now so it still fills in the thin lines case too. | 19:12.09 |
mvrhel_laptop | thanks for catching that | 19:12.14 |
Robin_Watts_ | But I haven't tested the other bug. | 19:12.25 |
mvrhel_laptop | I will investigate | 19:12.34 |
Robin_Watts_ | (and actually, I only tested the simplified file on the first bug) | 19:12.37 |
| I will leave it to you to close both. | 19:12.46 |
mvrhel_laptop | ok I will check the customer file on that too | 19:12.49 |
| Thanks for helping me Robin_Watts_ | 19:12.55 |
Robin_Watts_ | no worries. | 19:13.00 |
| I've hit this bug a few times, I think, but I've always run screaming away as it's looked to involve clist and patterns. | 19:13.36 |
| Shame on me for not realising the fault was in my bit of the code. | 19:14.03 |
mvrhel_laptop | Robin_Watts_: It was quickly found. Thank you | 19:14.33 |
| Saved me from what I thought was going to be a week's worth of work | 19:14.51 |
| Now I can work on my banded printing | 19:15.15 |
henrys | I can't get anyone to talk about http://bugs.ghostscript.com/show_bug.cgi?id=694528 ... clist and patterns - everyone heads for the hills. | 19:19.13 |
Robin_Watts_ | So, the issue is that a tiny change in the floating point result produces a huge memory difference? | 19:20.49 |
henrys | Robin_Watts_: yes it is | 19:21.32 |
Robin_Watts_ | How can such a small change in the fp result result in several inches on the page? Do we have a stupidly large ctm or something? | 19:21.53 |
henrys | I wonder if the pattern clamp code should be done in device space - fixed point | 19:21.58 |
Robin_Watts_ | Or is it that the imprecision is just enough to tip us into another repeat of the pattern ? | 19:22.38 |
henrys | the second one | 19:22.55 |
Robin_Watts_ | Hmm. Is there a clip path bbox we can apply that might shrink it down? | 19:23.45 |
henrys | Robin_Watts_: the code tries to make the pattern bbox smaller in case it exceeds the size of the page. | 19:23.54 |
| see the comment above clamp_pattern_bbox | 19:25.06 |
Robin_Watts_ | henrys: Is it possible to work in device space? | 19:26.08 |
chrisl | I don't see how 64 vs 32 bit influences that | 19:26.23 |
Robin_Watts_ | Pattern repeats are rectangular when defined, but not rectangular necessarily after translation. Consider a 45 degree rotation. | 19:26.51 |
| chrisl: The issue is that 64bit compilers and 32bit compilers just happen to do floating point rounding at different points. | 19:27.19 |
| So while this exhibits here as a 64 vs 32bit problem, it's actually just a compiler vs compiler problem. | 19:27.44 |
chrisl | Then it's a compiler bug.... | 19:27.55 |
Robin_Watts_ | No. | 19:27.58 |
henrys | chrisl: see the middle of the discussion there is a concrete example of a different result on 32 and 64 | 19:28.08 |
Robin_Watts_ | Both compilers are generating perfectly valid code. | 19:28.09 |
henrys | Robin_Watts_: the problem does go away without those stupid float casts though. I wish we could get rid of those. | 19:29.24 |
Robin_Watts_ | henrys: While I understand that it might be nice to be able to work in device space, I don't think we can in general. | 19:29.44 |
| henrys: What happens if you remove the float casts ? | 19:29.52 |
henrys | then everything works and we go look at 42000 regressions | 19:30.14 |
Robin_Watts_ | henrys: That's not a problem. We should definitely do that. Marcos loves it when I do that for him. | 19:30.40 |
henrys | not everything but this particular problem works | 19:30.41 |
Robin_Watts_ | Is there a particular example where stuff goes wrong? | 19:30.58 |
| my memory says there is a comment in there about a CET test ? | 19:31.09 |
chrisl | "The float casts are there to reproduce results in CET 10-01.ps page 4." - from the bug | 19:31.41 |
henrys | Robin_Watts_: I did see lost data in some test files also | 19:32.26 |
Robin_Watts_ | henrys: CET 10-01.ps page 4 can sod off. | 19:33.16 |
henrys | Robin_Watts_: and then I got to thinking how long it took me to find this damn problem, all of which put me in "LATER" mode | 19:33.21 |
Robin_Watts_ | LATER seems tempting. | 19:33.42 |
| I mean, we are doing the right calculations, and giving the right results. | 19:33.53 |
henrys | Robin_Watts_: yes we are. | 19:34.05 |
Robin_Watts_ | If the maths just happens to work out with a pattern that straddles the edge of a repeat, well, that's life. | 19:34.17 |
| The one thing I would check is whether there is a clipping path in force here. | 19:34.47 |
| If so, the clipping path might allow us to clip the rectangle repeat down into not repeating. | 19:35.12 |
chrisl | I wonder if switching to gs_point_transform2fixed() give us consistent results..... | 19:37.01 |
henrys | where I saw all the memory piling up was clip_copy_color calling clist_copy_color() ... | 19:37.06 |
| but I thought it was just clipping to the bbox set in the clamp pattern routine | 19:38.17 |
Robin_Watts_ | So, this is page 1 of that file? | 19:38.27 |
henrys | Robin_Watts_: it is | 19:38.45 |
Robin_Watts_ | the first instance of clamp_pattern_bbox being called? | 19:39.12 |
henrys | well there is a faux call by the postscript interpreter - the first real one yes. | 19:39.49 |
Robin_Watts_ | I get one call, then it prints some stuff about fonts, and goes away to hibernate for a bit. | 19:40.56 |
henrys | Robin_Watts_: okay then the first one. The ps interpreter does do a dummy pattern call first and I don't remember if it called the clamping code. | 19:41.43 |
Robin_Watts_ | So, we are entered with (0,-8000) (6000, 0) and after clamping I get (-0, -8000) (6000, -3500) | 19:45.48 |
| and you're saying that in the case that works we're getting: (0, -7999.9998) (6000, -3500) ? | 19:46.20 |
henrys | Robin_Watts_: you are on 32 bit right | 19:46.45 |
Robin_Watts_ | s/that works/that works fast/ | 19:46.48 |
| I am on windows. | 19:46.53 |
henrys | Robin_Watts_: 32 right? | 19:47.56 |
Robin_Watts_ | 64bit windows, but running a 32bit binary. | 19:48.40 |
| but from what you're saying in the bug, I believe it's the 32bit linux case that works fast, and the 64 bit linux case that works slowly. | 19:49.24 |
| The 32 vs 64 bit thing is a red herring. floats and doubles have exactly the same representation in a 32bit OS and a 64bit one. | 19:49.58 |
henrys | correct. I'm looking for my bbox's in my notes I'll run it again if need be. | 19:50.03 |
Robin_Watts_ | I think the difference comes because of the compiler taking longer to write the value back to ppt->x | 19:51.26 |
henrys | Robin_Watts_: a red herring - there is a floating point calculation in the bug that shows a different result same input one 8000 one 7999.99 | 19:51.32 |
Robin_Watts_ | Yes, but that is NOT down to the fact that it's a 32 or a 64 bit machine. | 19:51.52 |
| It's down to the fact they have different compilers on the 32 and 64 bit machines. | 19:52.04 |
| In gs_point_transform, the code does: | 19:52.26 |
| ppt->x = (float)(a*b) + c; | 19:52.38 |
| if (blah) | 19:52.47 |
henrys | That was my theory as well until ray said he found differences on 32 vs 64 windows | 19:52.53 |
Robin_Watts_ | ppt->x += (float)(d*e) | 19:52.58 |
henrys | the 64 bit bbox ends up with p.y == .00017... so 3500 units larger | 19:54.19 |
Robin_Watts_ | http://www.viva64.com/en/b/0074/ | 19:54.59 |
henrys | Robin_Watts_: and I disassembled each and found the same instructions | 19:55.06 |
Robin_Watts_ | So the reason is that in 64bit mode we are using SSE2 for fp, in 32bit mode we aren't. | 19:55.39 |
henrys | I see no SSE2 but disassembled a debug build | 19:56.05 |
| s/but/but I/ | 19:56.19 |
Robin_Watts_ | urgh. | 19:56.27 |
| well, it's floating point. different implementations can vary and still be IEEE conformant. | 19:57.02 |
| FLT_EPSILON = 0.0000001 | 20:00.11 |
| 8000 differs from 7999.9998 by less than FLT_EPSILON | 20:00.31 |
| thus both answers are 'correct' in terms of IEEE fp. | 20:00.44 |
| Hmm. I expressed that badly. | 20:01.09 |
henrys | yes no I see what you mean. | 20:01.26 |
Robin_Watts_ | The representation power of floats (as shown by FLT_EPSILON) is roughly 7 sig fig. | 20:01.33 |
| and 8000 differs from 7999.9998 in less than its 7th sig fig. | 20:02.02 |
| or more than. bah. I give up. | 20:02.13 |
| So, the code is correct, we shouldn't look for a magic fix in there. | 20:02.52 |
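The precision argument above can be checked directly. Single-precision floats between 4096 and 8192 are spaced 2^-11 (about 0.00049) apart, so 7999.9998 has no exact float representation and rounds to 8000, and the relative difference between the two values is below FLT_EPSILON. This is an illustrative sketch in C, not Ghostscript code: `to_float` and `relative_difference` are hypothetical helper names.

```c
#include <float.h>

/* Round a double to the nearest float. The volatile store forces the
 * rounding even on compilers that keep extra precision in registers. */
static float to_float(double x)
{
    volatile float f = (float)x;
    return f;
}

/* Relative difference |a - b| / |a| between two values, a nonzero. */
static double relative_difference(double a, double b)
{
    double d = a - b;
    double m = a;

    if (d < 0)
        d = -d;
    if (m < 0)
        m = -m;
    return d / m;
}
```

Here `to_float(7999.9998)` compares equal to `8000.0f`, and `relative_difference(8000.0, 7999.9998)` is roughly 2.5e-8, which is smaller than `FLT_EPSILON`.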
henrys | but the float casts do suck - I just sense that is going to bite us again with a difficult to reproduce problem. | 20:04.26 |
Robin_Watts_ | The float casts aren't bad. | 20:04.40 |
| doing float f = (float)(a * b) + c makes sense. | 20:05.03 |
| a*b will promote to a double to avoid losing data. | 20:05.20 |
henrys | well it's double f = (float)(a * b) + c | 20:05.37 |
| floatp is a double of course ;-) | 20:06.22 |
Robin_Watts_ | Oh! That does suck. | 20:06.22 |
henrys | this is ghostscript after all | 20:06.39 |
chrisl | Maybe we should revisit that CET test - it may be that the PS test is stupid, invalid, device dependent or whatever | 20:07.13 |
Robin_Watts_ | So we hold matrices as floats and points as doubles. Jeez. | 20:08.39 |
| and rects as doubles too, consequently. | 20:08.51 |
henrys | yes mat coefficients are single | 20:09.10 |
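The effect of the pattern `double f = (float)(a * b) + c` with a double coordinate and a float matrix coefficient can be seen in a small case. This is a sketch in C under stated assumptions, not the body of gs_point_transform: `transform_with_cast` and `transform_without_cast` are hypothetical names, and the values are chosen only so the arithmetic is easy to follow.

```c
/* Product rounded to float before the translation is added, as in
 * double f = (float)(x * m) + t. The volatile store forces the rounding. */
static double transform_with_cast(double x, float m, double t)
{
    volatile float p = (float)(x * m);
    return (double)p + t;
}

/* Same expression carried out entirely in double. */
static double transform_without_cast(double x, float m, double t)
{
    return x * m + t;
}
```

With `m = (float)(1.0 / 3.0)`, `x = 3.0` and `t = -1.0`, the double product is exactly 1 + 2^-25. The cast rounds that to 1.0f, so the first function returns 0, while the second returns 2^-25 (about 3e-8). Differences of this size are what a later comparison against a boundary can amplify.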
Robin_Watts_ | The more I hear about this, the more I think it's just an unfortunate file. We are doing the right thing. | 20:09.59 |
chrisl | henrys: did you say you'd try removing the float casts and it caused problems? | 20:10.50 |
Robin_Watts_ | One thing that might be smarter is to look at where the rectangle is actually used. | 20:11.14 |
| Where we calculate how many repeats we need. | 20:11.30 |
henrys | chrisl: yes missing data and a quick guess was the clamp algorithm went the other way with a too small region | 20:11.42 |
Robin_Watts_ | henrys: That's worrying. | 20:12.13 |
henrys | I didn't confirm it but it was indeed a pattern that was missing | 20:12.37 |
Robin_Watts_ | It's worrying that our code is fragile enough for a tiny difference to make a noticeable rendering change. | 20:12.45 |
chrisl | henrys: okay, that's a shame. The CET tests referenced in the comment are explicitly "Device Dependent" so it's daft to slavishly try to reproduce Adobe results | 20:13.14 |
henrys | Robin_Watts_: yes I was hoping to put the casts see that the cet had the documented change and move on. No such luck | 20:13.28 |
| s/put/pull out/ | 20:13.42 |
| chrisl:what do you think about replacing floatp with double globally see the comment in stdpre.h I think that creates a lot of confusion for the uninitiated. | 20:16.02 |
| ? | 20:16.07 |
chrisl | henrys: I would be extremely happy with that - it still confuses me at times | 20:17.23 |
henrys | chrisl: it seems like it would fall under config stuff so one for your list | 20:17.54 |
chrisl | henrys: Okay, in the new year - can't face it this week! | 20:18.21 |
henrys | and rightfully not ;-) | 20:18.34 |
Robin_Watts_ | I am confused as to why I am getting the 64bit results when running a 32bit binary :( | 20:19.12 |
henrys | you are getting the 32 bit result | 20:19.31 |
| your bbox is 3500 units taller (y) | 20:20.38 |
| sorry shorter | 20:20.50 |
Robin_Watts_ | eh? | 20:21.08 |
| Sorry, what are your 32 and 64bit bboxes respectively ? | 20:21.22 |
henrys | 64 is 0,-8000, 6000,0.00017... | 20:22.43 |
| your y is -3500 so that 3500 units shorter than what I have on 64 | 20:23.20 |
Robin_Watts_ | My 'y' ? | 20:23.33 |
henrys | what is your q.y? | 20:23.46 |
Robin_Watts_ | My q.y, yes. | 20:23.56 |
henrys | be careful with the signs | 20:24.12 |
Robin_Watts_ | Sorry, that's a massive difference that I don't understand. | 20:24.40 |
| I understood that the 2 different bboxes were: (0, -8000) (6000, -3500) and (0, -7999.998) (6000, -3500) | 20:25.09 |
henrys | no the second one is what I wrote above. | 20:25.54 |
| the tiny floating point difference results in a huge increment in the clamping algorithm | 20:26.23 |
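How a difference of 0.00017 can become 3500 units is easiest to see when an edge is snapped outwards to a whole number of repeats. This is a minimal sketch in C of that general effect, not the algorithm in clamp_pattern_bbox: `last_tile_index` and `snapped_upper_edge` are hypothetical names, and a repeat size of 3500 is assumed purely for illustration.

```c
/* Index of the first tile boundary at or above hi, for tiles of size step
 * (step > 0) laid out from the origin. Equivalent to ceil(hi / step). */
static long last_tile_index(double hi, double step)
{
    long n = (long)(hi / step);   /* truncates towards zero */

    if ((double)n * step < hi)
        n++;
    return n;
}

/* Upper edge after snapping hi outwards to a tile boundary. */
static double snapped_upper_edge(double hi, double step)
{
    return (double)last_tile_index(hi, step) * step;
}
```

An upper edge of exactly 0 stays at 0, but an upper edge of 0.00017 snaps to 3500, so one extra repeat is covered.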
Robin_Watts_ | So why did you document the calculation of -8000 in the bug? | 20:26.35 |
henrys | because indirectly it is the cause of the bug - if you do that calculation correctly later the bug doesn't happen there is some swapping and such that goes on in between. | 20:27.47 |
Robin_Watts_ | I need to walk through this myself. | 20:28.12 |
| I will take a run at this tomorrow. | 20:29.25 |
| If only to the stage of understanding what is going on. | 20:29.40 |
henrys | okay | 20:30.00 |
| Robin_Watts_: but in any event your bounding box for windows is the same as my bounding box for linux 32 | 20:33.21 |
sebras | Robin_Watts_: http://www.eit.lth.se/sprapport.php?uid=498 what have we done with type3 fonts? | 21:28.46 |
| Robin_Watts_: it seems to me that all these characters are really uneven in thickness. | 21:29.53 |
| haven't we been able to render this better before? | 21:30.04 |
tor8 | Robin_Watts_: (about to go to bed) yes, that's something I've thought about a few times, how to run the repair once we discover errors when we try to load an object | 22:57.40 |
| so go ahead and try to tackle that if you have any ideas | 22:57.54 |