| 2013/02/19 |
wordToDaBird | I used you guys' project mupdf in a project that I was working on for work, and I would like to donate some time to work on your project. Where should I go to begin? | 07:50.54 |
chrisl | wordToDaBird: the mupdf devs probably won't be around for a couple of hours yet | 07:52.16 |
| tor8: you're earlier than I expected - there was a guy on here earlier (wordToDaBird) asking about contribution to mupdf, but he's gone now..... | 08:30.34 |
tor8 | chrisl: right. | 08:32.46 |
Robin_Watts | If anyone here has a Windows 7 Home Premium installation that they'd like to be able to remote desktop into, please say. | 10:53.06 |
| Normally you need Pro, but it can be enabled in Premium with a hack. | 10:53.35 |
tor8 | Robin_Watts: I do, actually. And I cursed quite loudly when I figured out the reason why it silently failed to connect... | 10:57.32 |
| what's the hack? | 10:57.43 |
Robin_Watts | Let me mail you the zipfile. | 10:57.54 |
paulgardiner | Me too please. Although I have professional, might be handy to know for other machines | 10:58.30 |
Robin_Watts | http://www.missingremote.com/guide/how-enable-concurrent-sessions-windows-7-service-pack-1-rtm <- That's where I got it from. | 11:00.08 |
| But you need to register to download etc, hence me mailing it. | 11:00.18 |
| tor8: Bah. Can't mail you a zipfile. | 11:09.09 |
| It's in my home dir as W7-SP1-RTM-RDP-v4.zip | 11:09.25 |
tor8 | Robin_Watts: thanks. got it. | 11:10.13 |
Robin_Watts | I also have a Remote Desktop client that works on MacOSX Lion | 11:20.10 |
| wordToDaBird: Hi | 13:47.45 |
wordToDaBird | hi Robin_Watts how you dping today? | 13:48.12 |
Robin_Watts | You were after mupdf devs earlier... tor8, myself, paulgardiner and sebras are all mupdf devs. | 13:48.20 |
wordToDaBird | doing* | 13:48.25 |
Robin_Watts | fine thanks. Just banging my head against the reflow wall. | 13:48.45 |
wordToDaBird | you guys need any bug tracking done, where would I go to begin helping? | 13:48.52 |
Robin_Watts | MuPDF bugs are listed in bugs.ghostscript.com | 13:49.08 |
| under the "MuPDF" component. | 13:49.19 |
| By default they get allocated to tor, I think. | 13:49.31 |
| But if you see something you're interested in, then great. | 13:49.44 |
wordToDaBird | you guys saved my ass on a project for work. I will look through and dive in. | 13:50.09 |
| going to jump in shower | 13:50.16 |
Robin_Watts | It's worth checking on here before spending too much time on something, because we may have stuff underway in the background etc. | 13:50.21 |
| Yeah, I saw you mention that earlier. Where do you work? | 13:50.33 |
tor8 | aaargh! now I can't get "bird is the word" out of my mind... :) | 13:53.41 |
wordToDaBird | iphone=android dev at a no name company | 13:59.58 |
Robin_Watts | wordToDaBird: So, how did mupdf help you out? Are you using it in an app? | 14:08.01 |
tor8 | Robin_Watts: well, I found out what's wrong with zenikos patches! svn must be creating broken diffs, it has off by one errors in the line ranges in the "@@ -86,10 +68,+13 @@ blablabla" lines... causing patch to eat the header of the next hunk. | 15:24.22 |
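A rough way to catch the kind of breakage tor8 describes is to compare the line counts declared in each hunk header with the lines that actually follow it. The sketch below assumes standard unified-diff headers of the form `@@ -start,count +start,count @@`; the function name `hunk_mismatches` is made up for illustration and is not part of svn, patch or MuPDF.

```python
import re

# Unified-diff hunk header; a missing ",count" means a count of 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def hunk_mismatches(diff_text):
    """Return the headers of hunks whose declared counts disagree with their body."""
    hunks = []      # each entry: [header, declared_old, declared_new, seen_old, seen_new]
    current = None
    for line in diff_text.splitlines():
        m = HUNK_RE.match(line)
        if m:
            old_n = int(m.group(2)) if m.group(2) is not None else 1
            new_n = int(m.group(4)) if m.group(4) is not None else 1
            current = [line, old_n, new_n, 0, 0]
            hunks.append(current)
        elif current is None or line.startswith(("--- ", "+++ ")):
            # File header or text outside any hunk. Limitation: a removed
            # line whose content starts with "-- " is misread as a file header.
            current = None
        elif line.startswith(" "):
            current[3] += 1     # context line counts on both sides
            current[4] += 1
        elif line.startswith("-"):
            current[3] += 1
        elif line.startswith("+"):
            current[4] += 1
        elif line.startswith("\\"):
            pass                # "\ No newline at end of file" marker
        else:
            current = None
    return [h[0] for h in hunks if (h[1], h[2]) != (h[3], h[4])]
```

Running this over a suspect diff lists the hunks a strict `patch` would over-read or under-read; it does not attempt to repair them.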
Robin_Watts | gawd. | 15:25.35 |
| Well, I think I'm ready to expose my text mode fiddlings to outside criticism... just uploading a new apk now. | 15:28.45 |
| http://ghostscript.com/~robin/MuPDF-9.apk | 15:37.31 |
malc__ | hmmm. xmls in this archive are rather unxmlish to an untrained eye.. (probably shows how much i know about android) | 15:43.10 |
Robin_Watts | malc_: hmm ? | 15:43.27 |
malc__ | well just look at AndroidManifest.xml in the archive with a text editor | 15:43.52 |
Robin_Watts | I am looking. | 15:44.24 |
| My XML knowledge is minimal, so please bear with me. What's wrong with it. | 15:44.41 |
malc__ | 03 00 08 00 is not even a BO marker.. | 15:44.57 |
Robin_Watts | oh. | 15:45.14 |
malc__ | Robin_Watts: in a nut shell what i see is a blob, not an xml | 15:45.23 |
Robin_Watts | The AndroidManifest.xml in the .apk is a compiled thing. | 15:45.35 |
| Android does that as part of the build process. | 15:45.42 |
| see the AndroidManifest.xml in the source for how it started life. | 15:46.02 |
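The bytes malc__ quotes (03 00 08 00) are consistent with the chunk header of Android's compiled XML format: a little-endian 16-bit chunk type followed by a 16-bit header size. A minimal check, assuming the type value 0x0003 and header size 8 used by that format; the function name is hypothetical.

```python
import struct

RES_XML_TYPE = 0x0003   # chunk type assumed for Android compiled XML

def looks_like_android_binary_xml(data: bytes) -> bool:
    """Guess whether data starts with a compiled-XML chunk header rather than text."""
    if len(data) < 8:
        return False
    # Layout assumed: uint16 type, uint16 header size, uint32 total chunk size.
    chunk_type, header_size, _chunk_size = struct.unpack_from("<HHI", data, 0)
    return chunk_type == RES_XML_TYPE and header_size == 8
```

A text manifest from the source tree starts with `<?xml` or `<manifest`, so it fails this check.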
malc__ | Robin_Watts: hence my caveat emptor in the parentheses in the original sentence | 15:46.03 |
Robin_Watts | Gotcha. | 15:46.13 |
malc__ | Robin_Watts: thanks, but i'd rather spend myself the pain | 15:46.23 |
| s;spend;spare | 15:46.32 |
| probably, one can add english to the android to the list of my shortcommings | 15:46.52 |
Robin_Watts | paulgardiner: It's not perfect, but it does a better job of grahames thesis, than previous versions I think. | 15:48.10 |
malc__ | Robin_Watts: working on improving text extraction device? | 15:49.33 |
Robin_Watts | malc_: Install the apk referenced above. | 15:49.46 |
| Load a PDF. | 15:49.51 |
| Hit the 'reflow' button (the one with bendy arrows on it) | 15:50.03 |
malc__ | Robin_Watts: hard to do when one doesn't have anything to install it on | 15:50.05 |
Robin_Watts | malc__: oh, for some reason I thought you had an android port of your viewer. | 15:50.26 |
| but, yes. | 15:50.46 |
malc__ | Robin_Watts: the reason i ask is that the past has taught me that every time you guys fiddle with text, the pain of syncing with the new interfaces ensues | 15:51.37 |
Robin_Watts | malc__: Do you currently use the text extraction device? | 15:51.58 |
malc__ | Robin_Watts: aye | 15:52.32 |
Robin_Watts | We are at pains to say that the text extraction device is experimental and subject to change in the headers. | 15:52.35 |
| Well, there are changes, yes, sorry. | 15:52.45 |
malc__ | never fear, i survived pass by reference, hopefuly i'll survive this too | 15:53.16 |
| s;fuly;fully | 15:53.36 |
Robin_Watts | I'd hope that the pass by reference stuff wasn't too painful. | 15:53.56 |
paulgardiner | Robin_Watts: yeah, I see what you mean about Grahame's thesis | 15:53.58 |
Robin_Watts | It's picking up some things as right aligned that shouldn't be. | 15:54.17 |
malc__ | Robin_Watts: btw. one complaint about text extraction i got in the past was wrt to ARM documentation (opcode tables at the start of insn description for instance) | 15:54.39 |
| -to | 15:54.46 |
paulgardiner | and the h264 spec | 15:54.50 |
Robin_Watts | malc_: Let me try the ARM ARM. | 15:55.05 |
malc__ | holds his breath | 15:55.17 |
paulgardiner | It seems to be making a valiant attempt at tables. I guess the html produced is more complicated because it has slowed zoom down | 15:56.57 |
| bullet points and indentations too | 15:57.31 |
Robin_Watts | paulgardiner: It now outputs tables in the html to try to cope with things like bulleted lists and indentations etc. | 15:58.03 |
paulgardiner | but stuff is getting truncated off screen when I zoom | 15:58.23 |
henrys | almost meeting time but probably not a great deal to discuss this time around. | 15:59.11 |
Robin_Watts | paulgardiner: Yeah, that's the browser not coping well with tables :( | 15:59.28 |
| Possibly there is a smarter way to output the html other than in tables. | 16:00.40 |
| Can you do tableish things with divs and spans ? | 16:01.06 |
paulgardiner | That's the only problem I've seen so far, and not a biggie. Otherwise, quite amazing | 16:01.09 |
malc__ | Robin_Watts: yes | 16:01.41 |
paulgardiner | There's a style "display:table", which makes things behave like table cells, but that may just run into the same problem | 16:03.34 |
| and display:table-cell | 16:04.13 |
Robin_Watts | right. | 16:04.23 |
| Well, in the absence of anything better to discuss, I could run through the current algorithm, in case anyone can see any improvements? | 16:04.46 |
paulgardiner | I've used that before to get equal length columns, which otherwise requires horrid trickery | 16:04.59 |
henrys | tor8:mupdf release notes? I want to start the newsletter. | 16:05.04 |
paulgardiner | Robin_Watts: I'd be interested | 16:05.29 |
Robin_Watts | henrys: http://ghostscript.com/~robin/news.txt | 16:05.35 |
tor8 | http://ghostscript.com/~robin/news.txt | 16:05.52 |
Robin_Watts | or: http://ghostscript.com/~robin/release.txt | 16:05.55 |
tor8 | Robin_Watts: one thing missing -- move to AGPL | 16:06.12 |
Robin_Watts | Ah. Will add that now. | 16:06.23 |
henrys | Robin_Watts: you posted that before, I didn't know if it was complete. Nothing else? | 16:06.59 |
| not to say that is not a lot ... | 16:07.19 |
Robin_Watts | Updated version there now which mentions the AGPL change. | 16:07.36 |
| henrys: It's complete until someone spots something I missed :) | 16:08.26 |
paulgardiner | Robin_Watts: worth mentioning reflow? - even though the capability was sort of already in the lib | 16:08.47 |
Robin_Watts | paulgardiner: The reflow changes aren't in the 1.2 rc. | 16:10.33 |
paulgardiner | ah I see | 16:10.46 |
Robin_Watts | Hmm. http://www.theregister.co.uk/2011/03/31/google_on_open_source_licenses/ | 16:11.18 |
| Google specifically ban the AGPL. | 16:11.27 |
henrys | but ban is not defined ... we can't put mupdf on google play would be a serious problem for us, otherwise it doesn't matter. | 16:15.55 |
| s/we/if we | 16:16.29 |
Robin_Watts | henrys: They won't use it in in house software. | 16:18.00 |
| but I guess that's fine, as we are free to license it in other ways if they ever wanted it. | 16:18.21 |
| So... anything else, or should I run through the algorithm quickly? | 16:19.02 |
henrys | actually right now AGPL is much more meaningful to ghostscript - I don't think anyone is doing server side stuff with mupdf are they? | 16:19.28 |
Robin_Watts | henrys: I suspect that mupdf (mudraw) would be very interesting to anyone wanting to produce thumbnails of PDF files server side. | 16:20.06 |
| Plus page extraction/cleaning etc. | 16:20.18 |
| So algorithm: We have a standard device which we run the interpreter (or display list) to. This captures all the text operations. We collate chars placed with this into 'spans'. | 16:22.31 |
| spans are 'groups of characters that follow immediately on from one another sharing the same baseline'. | 16:23.04 |
| There is obviously a bit of magic going on in there to detect when characters "follow on immediately" and to add missing spaces etc. | 16:23.36 |
| These spans go into a 'span soup'. | 16:23.51 |
| We then 'strain' the soup to collect spans that share the same baseline (at least approximately). | 16:24.36 |
| So the spans in a line can be split by column breaks, or table cell breaks, or even by big spaces. | 16:25.58 |
| Some spans can be immediately adjacent to other ones, and just differ by the distance they are from the baseline (so we can cope with super or subscripts). | 16:26.33 |
| At the moment the 'strain_soup' implementation is very dumb and we don't exhaustively search for spans to collate. We can put more effort in there if we need to. | 16:27.11 |
| Lines are then collected into blocks (again, in a very simplistic way). | 16:27.35 |
| So our page is made up of blocks of lines of spans of chars. | 16:28.12 |
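The 'span soup' and the straining step described so far could be sketched roughly as below. This is an illustration only, not the MuPDF code: it assumes horizontal text (a single baseline y per span, where the real code works along a baseline vector), and the `Span`, `Line` and `tolerance` names are invented here.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Span:
    text: str
    x0: float        # where the span starts along the baseline
    x1: float        # where it stops
    baseline: float  # y position of the baseline (horizontal text assumed)

@dataclass
class Line:
    spans: List[Span] = field(default_factory=list)

def strain_soup(soup, tolerance=2.0):
    """Group spans whose baselines agree to within `tolerance` into lines."""
    lines = []
    for span in sorted(soup, key=lambda s: (s.baseline, s.x0)):
        # Compare against the first span of the most recent line, so small
        # offsets (super/subscripts) still land on the same line.
        if lines and abs(lines[-1].spans[0].baseline - span.baseline) <= tolerance:
            lines[-1].spans.append(span)
        else:
            lines.append(Line([span]))
    for line in lines:
        line.spans.sort(key=lambda s: s.x0)
    return lines
```

Collecting the resulting lines into blocks would be a further pass over this list.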
| Then we can (optionally) run analysis on this. | 16:28.29 |
| The first analysis step is that we run through the page, and collect the line distances (the distance from one baseline to the next). We store these into a table of (distance x style). | 16:29.19 |
| We then pick the most common distance for each style, and assume that this is the standard line distance. | 16:29.41 |
| We then run through the lines again; any line that doesn't differ from the previous one by the 'standard' distance for a style used in that line is held to be a paragraph break. So we break the block there. | 16:30.38 |
| I also spot lines that start with bullet points here. Those are always held to be the start of a new paragraph. | 16:31.18 |
| I did have some code that spotted numbers/roman numerals etc as the start of paragraphs too, but the need for that has disappeared given the next phase of analysis, so it's disabled for now. | 16:32.06 |
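The first analysis phase (find the standard line distance per style, then break paragraphs where the distance differs or a bullet starts the line) might look something like this. It is a simplified sketch: each line is assumed to carry exactly one style, and the rounding, `tolerance` and bullet characters are assumptions rather than values from the real code.

```python
from collections import Counter, defaultdict

def standard_line_distances(lines):
    """lines: list of (baseline_y, style, text) in reading order.
    Returns {style: most common distance from the previous baseline}."""
    table = defaultdict(Counter)          # the (distance x style) table
    for prev, cur in zip(lines, lines[1:]):
        distance = round(cur[0] - prev[0], 1)
        table[cur[1]][distance] += 1
    return {style: counts.most_common(1)[0][0] for style, counts in table.items()}

def paragraph_starts(lines, tolerance=0.5, bullets=("*", "\u2022")):
    """Return the indices of lines held to start a new paragraph."""
    standard = standard_line_distances(lines)
    starts = [0] if lines else []
    for i in range(1, len(lines)):
        y, style, text = lines[i]
        gap = y - lines[i - 1][0]
        off_standard = abs(gap - standard[style]) > tolerance
        if off_standard or text.lstrip().startswith(bullets):
            starts.append(i)
    return starts
```

For a block of body text at a 12 unit pitch, a line that sits 18 units below its predecessor is reported as a paragraph start, as is any line that opens with a bullet.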
| So, phase 2 of the analysis is to look at patterns of regions within the lines of a page. | 16:33.01 |
| For each line, I form a 'region mask', basically a set of 'start/stop' areas that the spans cover. | 16:33.38 |
| So if I have: "2.1 Transparency in a PDF file 255" | 16:34.27 |
| That'll probably end up as 3 spans ("2.1", "Transparency in a PDF file", "255") | 16:34.50 |
| and hence will end up with a region of 3 start/stop areas: Maybe (90,110) (140,300) (350,375) | 16:35.25 |
| I sort these region masks into order of decreasing area covered. (i.e. for each one I sum the widths of the covered areas). | 16:36.00 |
| Then I attempt to merge them. | 16:36.29 |
tor8 | start/stop areas ... what do you mean more precisely? | 16:36.54 |
Robin_Watts | For my ("2.1", "Transparency in a PDF file", "255") example. | 16:37.14 |
| The first start/stop area starts at the position on the page that "2.1" starts, and stops where it stops. | 16:37.39 |
tor8 | rectangular or just "x" ? | 16:37.55 |
Robin_Watts | Just "x". | 16:38.03 |
| (Actually, all of this works using the baseline vector, so it works for rotated text, vertical text etc too. The start/stop values are positions along the normalised baseline vector for each line). | 16:38.52 |
tor8 | so after sorting, you'd have 3 region masks in this order: ("Transparency in a PDF File", "255", "2.1")? | 16:38.58 |
paulgardiner | real intervals represented by pairs of numbers | 16:39.02 |
Robin_Watts | tor8: No, sorry. | 16:39.22 |
| The 3 intervals generated from "2.1", "Transparency in a PDF file", "255" form a single region mask. | 16:39.48 |
| Every line generates such a mask. | 16:40.10 |
tor8 | right, so one such mask per line of spans? | 16:40.24 |
Robin_Watts | On a simple page with justified text, most lines would generate more or less the same region mask. | 16:40.37 |
| tor8: yes. | 16:40.41 |
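To make the region mask idea concrete: one mask per line, each mask a sorted list of (start, stop) intervals along the baseline, with the masks then ordered by the total width they cover. The sketch below uses the numbers from the example above; the `join_gap` threshold for collating spans that follow on from one another is an assumption.

```python
def region_mask(spans, join_gap=1.0):
    """spans: (start, stop) positions along the baseline for one line.
    Returns sorted intervals, with near-adjacent spans collated."""
    mask = []
    for start, stop in sorted(spans):
        if mask and start - mask[-1][1] <= join_gap:
            # Follows on (almost) immediately: extend the previous interval.
            mask[-1] = (mask[-1][0], max(mask[-1][1], stop))
        else:
            mask.append((start, stop))
    return mask

def covered_width(mask):
    """Sum of the widths of the covered areas, used as the sort key."""
    return sum(stop - start for start, stop in mask)

def sorted_masks(page_lines):
    """One mask per line, in order of decreasing area covered."""
    return sorted((region_mask(line) for line in page_lines),
                  key=covered_width, reverse=True)
```

With these definitions the contents-page line yields `[(90, 110), (140, 300), (350, 375)]`, while a full-width line of body text yields a single wide interval that sorts ahead of it.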
| imagine you've got a secret CIA document, and they've blocked out every line of text. | 16:41.33 |
tor8 | does this differ from what you can already get by just looking at the widths in the spans of a line? | 16:41.38 |
Robin_Watts | That's how I think of these region masks in my head :) | 16:41.49 |
tor8 | or just a representation easier to work with | 16:41.53 |
| Robin_Watts: I've got the general idea I think. do go on. | 16:42.24 |
| you've just sorted these by total width covered per line | 16:42.50 |
Robin_Watts | tor8: It differs from just looking at the widths in the spans of a line, in that my representation copes with rotated text etc, and spans that follow immediately from one another are collated etc. | 16:42.55 |
| tor8: Right. So I sort them, then I merge them. | 16:43.15 |
tor8 | right, so identical (or near identical) masks get eliminated | 16:43.34 |
Robin_Watts | The merge function is a bit hairy, but essentially it will merge 2 region masks if to do so doesn't lose any 'holes' between spans. | 16:44.00 |
| The regions might get bigger (because we might have 2 lines that have differing amounts of text in them, and when merged the mask has to cover the larger one). | 16:44.44 |
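One plausible reading of "merge if doing so doesn't lose any holes" is: two masks may merge only if no region of one bridges a gap between two regions of the other, and the merged mask is then the union of both. The real merge function is described as hairier than this; the criterion and names below are assumptions made for the sake of a runnable sketch.

```python
def overlaps(a, b):
    """True if intervals a and b overlap or touch."""
    return a[0] <= b[1] and b[0] <= a[1]

def can_merge(m1, m2):
    """A merge loses a hole if some region spans two regions of the other mask."""
    for mine, theirs in ((m1, m2), (m2, m1)):
        for region in mine:
            if sum(overlaps(region, other) for other in theirs) > 1:
                return False
    return True

def union(m1, m2):
    """Union of two masks; regions can only grow, never shrink."""
    merged = []
    for start, stop in sorted(m1 + m2):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged

def merge_masks(masks):
    """masks: lists of intervals, sorted by decreasing covered width.
    Returns the reduced set of masks."""
    reduced = []
    for mask in masks:
        for i, kept in enumerate(reduced):
            if can_merge(kept, mask):
                reduced[i] = union(kept, mask)
                break
        else:
            reduced.append(mask)
    return reduced
```

Two contents-page lines with slightly different text lengths collapse into one three-region mask, while a full-width body mask stays separate because merging it would swallow both gaps.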
paulgardiner | So each line refers to a region mask, and after the merge, some lines share a common region mask? | 16:45.01 |
Robin_Watts | No reference is kept from line to region mask. | 16:45.16 |
paulgardiner | I don't see how my statement contradicts yours, so I guess I'm still confused | 16:46.06 |
Robin_Watts | paulgardiner: You haven't said anything contradictory. | 16:46.27 |
| And indeed, I could have kept a pointer from line to region mask, and indeed had I done so, some would share a common mask. | 16:46.53 |
paulgardiner | After merge, can there be l1 != l2, s.t. reg(l1) = reg(l2) ? | 16:47.00 |
Robin_Watts | But I don't keep such a pointer. | 16:47.01 |
paulgardiner | Ah | 16:47.12 |
Robin_Watts | yes. | 16:47.18 |
| So, now armed with my reduced set of region masks, I run through the lines again. | 16:47.37 |
paulgardiner | Is that what the merge process does? | 16:47.43 |
| Yep | 16:47.49 |
Robin_Watts | For each line, I look to see what the 'best fit' mask is. | 16:48.10 |
paulgardiner | Hmmm, how is that not its own? | 16:48.41 |
Robin_Watts | Some masks are incompatible and so can be ignored immediately. | 16:49.07 |
| but we might find several masks that fit it. | 16:49.34 |
| and I score by how well it matches. | 16:49.51 |
| with bonus points given for the left or right edges of the spans closely matching the left or right edges of the masks. | 16:50.18 |
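The best-fit step could be sketched like this: discard masks a line cannot fit, score the rest by how well the spans fill their regions, and add a bonus when a span edge lines up with a region edge. The scoring weights and the `edge_tolerance` value are invented for illustration; only the overall shape comes from the description above.

```python
def score(line_spans, mask, edge_tolerance=2.0):
    """Return None if the mask is incompatible with the line, else a score
    where higher means a better fit."""
    total = 0.0
    for start, stop in line_spans:
        homes = [r for r in mask if r[0] <= stop and start <= r[1]]
        if len(homes) != 1:
            return None       # span sits in a hole or straddles two regions
        r0, r1 = homes[0]
        if start < r0 - edge_tolerance or stop > r1 + edge_tolerance:
            return None       # span sticks out of its region
        total += (stop - start) / max(r1 - r0, 1e-9)   # fraction of region filled
        if abs(start - r0) <= edge_tolerance:
            total += 1.0      # bonus: left edges match
        if abs(stop - r1) <= edge_tolerance:
            total += 1.0      # bonus: right edges match
    return total

def best_fit(line_spans, masks):
    """Index of the best-scoring compatible mask, or None if none fits."""
    scored = [(score(line_spans, m), i) for i, m in enumerate(masks)]
    scored = [(s, i) for s, i in scored if s is not None]
    return max(scored, key=lambda pair: pair[0])[1] if scored else None
```

This also suggests why a line's best fit need not be the mask it originally generated: after merging, the surviving masks have stretched, and a different one may score higher.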
| paulgardiner: That's a good question, and I'm not ignoring it. | 16:50.53 |
paulgardiner | it may just become clear | 16:51.21 |
Robin_Watts | Then I annotate the lines/spans structures with which column the data should go in. | 16:51.39 |
| And the html output makes use of this information to produce tables. | 16:51.59 |
| If several lines of text use the same region mask, then I consider pulling those lines together. | 16:52.56 |
| but only if the latter lines only actually touch one of the regions in the mask. | 16:53.09 |
| For instance, consider a paragraph like: | 16:53.25 |
mvrhel_laptop | good morning/afternoon | 16:53.52 |
Robin_Watts | * This is a bulleted list | 16:54.00 |
| text where the bullet | 16:54.01 |
| only appears at the | 16:54.03 |
| start. | 16:54.04 |
henrys | hi mvrhel_laptop | 16:54.16 |
Robin_Watts | In that case I want to pull that into being "*" and "This is a bulleted list text where the..." | 16:54.33 |
| If instead I had: | 16:54.43 |
henrys | mvrhel_laptop: how was skiing? | 16:55.14 |
Robin_Watts | 1.1 Some section 23 | 16:55.41 |
| 1.2 Some section with 25 | 16:55.43 |
| some more text | 16:55.44 |
| 1.3 Another section 30 | 16:55.46 |
mvrhel_laptop | henrys: it was quite good. we had weather that ranged from white out snow to blue sky | 16:55.47 |
| depending upon where you were on the mountains | 16:56.00 |
Robin_Watts | I wouldn't want to pull that to be "1.1 1.2 1.3" "Some section Some section with some more..." etc | 16:56.01 |
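The two examples above (the bulleted paragraph and the contents entries) can be reproduced with a small rule: among consecutive lines sharing a mask, a line that touches only one region is folded into the previous row, and any line touching several regions starts a new row. The dict-per-line representation is an assumption made to keep the sketch short.

```python
def pull_together(lines):
    """lines: consecutive lines sharing one region mask, each a dict mapping
    region (column) index to that line's text in the region.
    Returns the rows after folding single-region continuation lines upward."""
    rows = []
    for cells in lines:
        if rows and len(cells) == 1:
            # Touches only one region: treat as a continuation of the row above.
            (col, text), = cells.items()
            rows[-1][col] = (rows[-1].get(col, "") + " " + text).strip()
        else:
            rows.append(dict(cells))
    return rows
```

For the bullet example this gives a single row of "*" and the joined paragraph text; for the contents example it keeps three rows, with "some more text" folded into the 1.2 entry only.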
henrys | we are finally getting some good snow here. | 16:56.15 |
| in the mountains that is | 16:56.25 |
mvrhel_laptop | right. thats good | 16:56.32 |
Robin_Watts | I think that's most of the algorithm really. | 16:57.09 |
| Oh, and I attempt to guess alignment within the region masks by looking at where spans start/stop. | 16:57.51 |
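Guessing alignment from where spans start and stop within a region might be as simple as the check below. The labels, the `tolerance` and the order of the tests are assumptions; the source only says that alignment is guessed from span start/stop positions.

```python
def guess_alignment(region, spans, tolerance=2.0):
    """region: (start, stop) of one mask region.
    spans: (start, stop) of each line's span inside that region."""
    r0, r1 = region
    left = all(abs(s - r0) <= tolerance for s, _ in spans)
    right = all(abs(e - r1) <= tolerance for _, e in spans)
    centred = all(abs((s + e) / 2 - (r0 + r1) / 2) <= tolerance for s, e in spans)
    if left and right:
        return "justified"
    if left:
        return "left"
    if right:
        return "right"
    if centred:
        return "centre"
    return "unknown"
```

A column of page numbers whose right edges all meet the region's right edge comes out as "right"; a single short line can satisfy more than one test, which is one way a region could be wrongly reported as right aligned.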
| So, why don't I just keep a pointer to the region mask for each line? | 16:58.30 |
paulgardiner | So during the merge, you form a reduced set of regions, not attached to particular lines? | 16:58.39 |
Robin_Watts | 1) I didn't think of that, and that's not how the code fell out easily. | 16:58.44 |
| 2) When we merge region masks, they tend to 'bloat' a bit (each region within the mask can stretch outwards) | 16:59.47 |
paulgardiner | sort of a union? | 17:00.10 |
Robin_Watts | at the end of the region merge process, a line may better match a different region. | 17:00.15 |
| paulgardiner: exactly a union, yes. | 17:00.22 |
henrys | ghostscript meeting time! woo hoo! | 17:00.23 |
Robin_Watts | paulgardiner: I'll shut up now. We can discuss this more later if you have thoughts. | 17:00.49 |
paulgardiner | That's amazing. It's highly effective from what I've seen | 17:01.23 |
henrys | like the last meeting I didn't really have anything for this one. | 17:02.03 |
| Is everyone happy with the release? | 17:02.28 |
paulgardiner | henrys: and I'm happy, provided I still seem to be on the right track. | 17:02.32 |
tor8 | Robin_Watts: HTML tables won't be good for epub output | 17:02.52 |
chrisl | henrys: bit bloody late for misgivings! | 17:02.59 |
Robin_Watts | tor8: Well, as I say, I'm open to other ideas. | 17:03.19 |
henrys | chrisl:I meant does anyone have post mortem issues with the release | 17:03.34 |
tor8 | Robin_Watts: if it's just columns, we should just do them as sequential divs | 17:03.47 |
| and for real tables, as html tables | 17:04.03 |
| the oddness I guess is for lists | 17:04.08 |
henrys | for example marcosw noted he would change a test procedure in light of the release | 17:04.12 |
tor8 | and block quotes and the like | 17:04.17 |
| what about formatted source code, in say a <pre> tag with lots of indentation | 17:04.30 |
chrisl | henrys: well, just that we need to be a bit more diligent passing patches "upstream" | 17:04.30 |
marcosw | henrys: and I've done that. | 17:04.30 |
henrys | marcosw:great | 17:04.53 |
Robin_Watts | tor8: Yes. We could analyse the data we get to spot lists etc. | 17:04.57 |
| chrisl: I had passed them upstream. I just hadn't kicked marti enough :) | 17:05.11 |
tor8 | Robin_Watts: lines could have their own div tag with css margins set to get the indentation deviation from the main column they reside in | 17:05.13 |
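tor8's suggestion of one div per line, with a CSS margin carrying the indentation relative to the column, could be emitted along these lines. The function name, the use of `pt` units and the inline `style` attribute are all assumptions for illustration; nothing here reflects the HTML MuPDF actually writes.

```python
from html import escape

def lines_to_divs(lines, column_left):
    """lines: (left_edge, text) pairs for lines in one column.
    Emits one div per line, indented by its offset from the column's left edge."""
    out = []
    for left, text in lines:
        indent = max(left - column_left, 0)
        out.append(f'<div style="margin-left:{indent:g}pt">{escape(text)}</div>')
    return "\n".join(out)
```

Because the indentation is a margin rather than a table cell, the output stays a plain sequence of block elements, which should be friendlier to zooming and to epub readers than nested tables.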
chrisl | Robin_Watts: that wasn't particularly aimed at you...... | 17:05.53 |
tor8 | Robin_Watts: otherwise, the algorithm sounds like it could work well! it's sort of what I had in mind. I'll have to play with the code and have a look see soon as I get zenikos patches off my back, and that meeting on thursday needs preparing. | 17:06.47 |
chrisl | henrys: also, just check that everyone is happy with the way I take the release branch, instead of freezing master - I seem to remember some discussion about that a couple of weeks ago. | 17:06.52 |
Robin_Watts | tor8: I am unaware of any meeting on thursday. | 17:07.17 |
henrys | chrisl:I like it much better. | 17:07.36 |
tor8 | Robin_Watts: ah. there's a local company I'm going to visit to answer questions about mupdf for | 17:07.49 |
| or demo, or whatever | 17:07.57 |
Robin_Watts | oh, right. cool. | 17:07.59 |
marcosw | chrisl: taking a release branch results in doubling of some of the commit messages in the regression emails, search for "weekly regression report - 64-bit Build - 2013-02-18-16:10:59" | 17:08.44 |
henrys | with in person meeting time coming up everyone should have a look at projects they signed up for last meeting. | 17:08.45 |
| and get started on them ;-) | 17:08.57 |
chrisl | marcosw: is that an issue? | 17:09.12 |
marcosw | probably not but it's kind of misleading. | 17:09.34 |
chrisl | It shouldn't be hard to filter duplicates | 17:10.10 |
henrys | mvrhel_laptop: I was able to find a good ms tool to check xps (isxps.exe) found my syntax problem instantly. | 17:10.25 |
mvrhel_laptop | oh nice. | 17:10.45 |
| henrys: can you send me a link to that if you get the time | 17:10.59 |
chrisl | marcosw: actually, shouldn't the regular weekly regression only be checking master? | 17:11.08 |
marcosw | yup | 17:11.14 |
henrys | it is buried in the windows device development kit | 17:11.22 |
Robin_Watts | Oh, while I remember, for the benefit of those who weren't here earlier... if anyone has a Windows 7 Home Premium installation that they'd like to be able to Remote Desktop into, let me know. | 17:11.43 |
chrisl | So, there shouldn't be duplicate commit messages, surely? | 17:11.54 |
Robin_Watts | probably just ray_laptop, as mvrhel_laptop has moved onto 8. | 17:11.59 |
marcosw | but for the past week I set it to the gs907 release branch. when setting it back to master the duplicate commits appeared. | 17:12.23 |
henrys | alexcher:how is smash coming along? | 17:13.37 |
| smask | 17:13.41 |
alexcher | I didn't work on smask recently. I was busy with the release bugs and 32-bit issues. | 17:14.44 |
henrys | okay I think there should be a group put together to study the issues:alexcher, chrisl and ray_laptop what do you think about that? It just seems like a project that is going to linger forever without more of a push. | 17:16.10 |
chrisl | henrys: I've never really looked at the problem, except for some messing when Robin_Watts first uncovered it. Perhaps if alexcher could summarise where he's at in the bug thread. | 17:17.27 |
henrys | alexcher:would you do that? | 17:18.00 |
alexcher | chrisl: ye' I'll do this. | 17:18.02 |
ray_laptop | henrys: re-opening discussion on the SMask with mvrhel_laptop and I and alexcher would probably help get things "off the dime" | 17:18.23 |
henrys | ray_laptop:okay | 17:18.56 |
ray_laptop | if chrisl wants to participate, fine with me (as another PS expert) | 17:18.59 |
Robin_Watts | ray_laptop: I have no massive desire to get dragged back into it, but if the description of the problem in the bug isn't clear enough, let me know and I can try again. | 17:19.11 |
mvrhel_laptop | If I recall there was some PS stuff that needed to be done, but if I can be of help let me know | 17:19.15 |
| not with PS stuff obviously.... | 17:19.25 |
henrys | ray_laptop:yeah I thought chrisl would be helpful PS wise. | 17:19.31 |
ray_laptop | it's been so long, I've forgotten what problem we are trying to solve | 17:19.32 |
mvrhel_laptop | what was the bug number? | 17:19.47 |
henrys | 693115 | 17:19.58 |
ray_laptop | mvrhel_laptop: I was just going to ask that, too :-) | 17:20.04 |
Robin_Watts | http://bugs.ghostscript.com/show_bug.cgi?id=693115 | 17:20.05 |
henrys | that was all I had anybody else? | 17:20.34 |
Robin_Watts | I pushed the Unicode/UTF-8 changes to master after the release. | 17:20.53 |
chrisl | ray_laptop: I'm willing to pitch in as required, but like you, my memory of the problem is extremely hazy..... | 17:21.08 |
Robin_Watts | So if anyone sees something wrong, please let me know. | 17:21.14 |
| It should only affect windows users. | 17:21.24 |
chrisl | Robin_Watts: I think I'll just stick to linux for a few weeks.... ;-) | 17:21.37 |
Robin_Watts | henrys: We should talk about that PCL utf8 message that came in. | 17:21.39 |
tor8 | paulgardiner: in pdf_js_none.c, is there any reason to allocate a js object and handle js_get_event or can they all just be zeroed out and return NULL? | 17:21.45 |
Robin_Watts | chrisl: It *should* only affect windows users. *should* :) | 17:21.58 |
ray_laptop | Robin_Watts: well, the description is a little confusing, but when I look at the gs output of test_smask2 vs. Adobe, they are VERY similar. Where is the difference ? | 17:22.09 |
henrys | Robin_Watts:I saw that and tried to focus on something else, but yes I suppose you are right. | 17:22.37 |
Robin_Watts | ray_laptop: The alpha level on the green, according to the bug. | 17:23.04 |
| Remember this is me and my crap memory, right? :) | 17:23.15 |
| Comment 2 has details. | 17:23.51 |
chrisl | Robin_Watts: I thought the primary driver on that bug was a performance issue | 17:23.53 |
Robin_Watts | bug 692870, is a performance issue. | 17:24.26 |
chrisl | Yeh, which I thought you worked around, rather than "fixed"? | 17:24.50 |
Robin_Watts | and I fixed that by using a group. | 17:24.50 |
| It's a perfectly valid fix. | 17:25.03 |
ray_laptop | Robin_Watts: oh, nm. I see the difference in the blue square, Adobe has an 'inset' area of darker blue-ish that gs doesn't have | 17:25.13 |
Robin_Watts | (actually, maybe it was Michael that fixed it) | 17:25.28 |
| (yes, credit to Michael, blame to me etc) | 17:25.52 |
mvrhel_laptop | hehe | 17:25.57 |
| I still have some knockout isolated group stuff to fix in the transparency code | 17:26.35 |
chrisl | I never really saw the point in the just-in-time SMask group thing | 17:27.13 |
ray_laptop | chrisl: it's just that adding the extra knockout group, that then has to be composited in _might_ be slower | 17:27.16 |
Robin_Watts | chrisl: The driver on bug 693115 is that we get stuff wrong. | 17:27.50 |
mvrhel_laptop | it is possible the soft mask is not even used | 17:27.51 |
ray_laptop | but back to the SMask issue... | 17:27.54 |
mvrhel_laptop | anyway back to my DeviceN email reply to customer.... | 17:28.10 |
chrisl | ray_laptop: my vague memory was that we'd found a "workaround" for the specific customer problem, but a more general (performance) issue remained - clearly, I'm wrong! | 17:28.24 |
ray_laptop | mvrhel_laptop: how common is it that an SMask is defined, but not used ? | 17:28.24 |
mvrhel_laptop | ray_laptop: probably not very common, but I don't see any other reason to do the just in time | 17:28.43 |
ray_laptop | chrisl: the fix for bug 692870 is fine, and probably doesn't need further optimization | 17:29.04 |
henrys | Robin_Watts:did you see Math last comment about pcl6 unicode stuff | 17:29.34 |
| Math's | 17:29.40 |
ray_laptop | mvrhel_laptop: so if we get rid of the 'just in time' SMask setup, are we good ? | 17:29.42 |
Robin_Watts | henrys: bug number? | 17:29.50 |
henrys | 692722 | 17:30.02 |
mvrhel_laptop | except for those cases where the soft mask is not used ;) | 17:30.04 |
chrisl | ray_laptop: IIRC, we'll only instantiate the SMask group if it's referenced in a resource dictionary, so it is very likely to be actually used | 17:30.15 |
henrys | is this the problem you (Robin_Watts ) were talking about? | 17:30.24 |
Robin_Watts | ray_laptop: The problem is not directly with the just in time stuff. | 17:30.31 |
| The problem is that we don't nest smasks properly. | 17:30.45 |
| Now, it's possible that by removing the just in time, we'll fix the nesting, but... | 17:31.01 |
chrisl | ray_laptop: and doing it just-in-time also adds a large chunk of complexity for cases where the SMask group inherits the current graphics state. | 17:31.19 |
mvrhel_laptop | if the just in time complicates things, it would make sense to simplify | 17:31.22 |
ray_laptop | mvrhel_laptop had added a 'stack' of masks, | 17:31.27 |
Robin_Watts | henrys: That is the bug I was talking about yet. | 17:31.38 |
| s/yet/yes/ | 17:31.41 |
ray_laptop | mvrhel_laptop: that I thought was to deal with the nesting issue | 17:31.42 |
mvrhel_laptop | ray_laptop: that keeps the masks in sync with the graphic state pops | 17:32.12 |
henrys | seems odd the error message gets the letter right but the open failed anyway | 17:32.12 |
Robin_Watts | No, I thought we genuinely needed a stack of masks. | 17:32.15 |
mvrhel_laptop | Robin_Watts: we do have a stack of masks | 17:32.34 |
Robin_Watts | henrys: I think he was making up filenames :) | 17:32.36 |
mvrhel_laptop | hold on let me read through the bug | 17:33.23 |
henrys | oh well I'd be a little more interested to know if it actually opened a file with those codes but I guess I'm a stickler for details ;-) | 17:33.48 |
Robin_Watts | henrys: Look at line 4 of the original description: "invoke pcl6.exe with a missing file as argument, filename in cyrillic" | 17:34.02 |
| Does this affect using pipes? | 17:34.50 |
| i.e. if we pipe a file in on stdin, will the codepage setting matter? | 17:35.06 |
henrys | Robin_Watts:okay I see | 17:35.21 |
ray_laptop | Robin_Watts: if the file is present, is all OK (i.e., is it just the error message that is confused ?) | 17:35.26 |
Robin_Watts | ray_laptop: Such is my belief. | 17:35.36 |
ray_laptop | well that sounds easy (WONTFIX) ;-) | 17:36.00 |
mvrhel_laptop | Robin_Watts: I am confused by your first comment in the bug | 17:36.43 |
| Are blue and green mixed up in some of your discussion | 17:36.54 |
Robin_Watts | 693115 ? | 17:37.09 |
mvrhel_laptop | Robin_Watts: yes | 17:37.14 |
henrys | Robin_Watts: do you mind if we push back that question - since he's set up to test this why not take advantage of it. Get a test of a real file, missing file etc. | 17:37.24 |
Robin_Watts | henrys: I have no problem with that at all. | 17:37.39 |
mvrhel_laptop | Robin_Watts: the stream you wrote at the top has /smask1 gs /red Do /blue Do /smask2 gs /green Do | 17:37.52 |
Robin_Watts | Get him to check piping in etc ? | 17:37.53 |
ray_laptop | mvrhel_laptop: I think comment 2 is easier to follow, but it would help if there was the sequence of that example (test_smask2) as in comment 1 | 17:37.56 |
| mvrhel_laptop: that sequence is for test_smask | 17:38.19 |
mvrhel_laptop | But the discussion has you drawing red, green, then blue | 17:38.21 |
ray_laptop | mvrhel_laptop: yeah, that confused me too, because the initial description had different color sequencing to the example file | 17:39.02 |
Robin_Watts | mvrhel_laptop: Oh, urm... It's possible that I typed that wrong :( | 17:39.05 |
henrys | Robin_Watts:should I respond or can you handle it? | 17:39.14 |
Robin_Watts | it's pcl, so I'll leave it to you :) | 17:39.32 |
mvrhel_laptop | ok. just wanted to check that there was not an even bigger issue.... | 17:39.39 |
henrys | okay | 17:39.47 |
mvrhel_laptop | I will mentally swap blue and green... | 17:39.53 |
Robin_Watts | mvrhel_laptop: Just like I did :) | 17:40.00 |
mvrhel_laptop | oh I see | 17:40.51 |
| the second mask is getting corrupted by the first mask | 17:41.02 |
| was that the issue Robin_Watts ? | 17:41.06 |
| since it was in effect | 17:41.14 |
| corrupted meaning rendered through | 17:41.39 |
Robin_Watts | mvrhel_laptop: To be honest, my memory of this has all but completely atrophied. | 17:41.42 |
mvrhel_laptop | Robin_Watts: me too | 17:41.48 |
ray_laptop | mvrhel_laptop: you and I discussed possibly adding a compositor action to avoid the multiple SMask rendering | 17:41.50 |
Robin_Watts | I tried to be as clear as possible in the bug report for this very reason. | 17:42.03 |
mvrhel_laptop | This comment was pretty clear Robin_Watts | 17:42.22 |
| Secondly, due to Ghostscript's implementation of transparency masks, the first SMask (M1) does not stop being current until the end_transparency_mask of the second one is reached (line 8 above). | 17:42.27 |
| This doesn't actually cause a problem, unless smask1 itself uses a | 17:42.46 |
| transparency group, in which case the rendering of M2 ends up with | 17:42.47 |
| its contents being masked by M1. | 17:42.49 |
| Now, I have seen cases where we have a soft mask inside a softmask, and that works correctly | 17:43.33 |
ray_laptop | I don't understand how the second SMask is getting corrupted by the first. The transparency mask is rendered in its own buffer that doesn't composite with a previous mask, right ? | 17:43.42 |
mvrhel_laptop | well the first mask is still in the "graphic state" | 17:43.56 |
| or so the graphics code thinks | 17:44.06 |
| so when it draws the next mask which is setting a new graphic state it uses the current mask | 17:44.33 |
| that is my rough understanding from reading this | 17:44.53 |
| I would need to spend a bit of time running the file to verify | 17:45.08 |
ray_laptop | mvrhel_laptop: we only 'use' a mask when we end a group (as we composite to the underlying buffer) right ? | 17:45.38 |
mvrhel_laptop | ray_laptop: that is correct | 17:45.53 |
| so if the next mask is in a group it will get hit with the current mask | 17:46.14 |
| I think.... | 17:46.31 |
ray_laptop | mvrhel_laptop: what do you mean "get hit" | 17:46.37 |
mvrhel_laptop | get rendered with | 17:46.42 |
| sorry for my imprecision in speech ray_laptop | 17:47.00 |
| again, I would need to run through and see what is going on | 17:47.20 |
| but that is my guess | 17:47.26 |
chrisl | mvrhel_laptop: I did a major change quite a while back so we save the graphics state when we encounter an SMask group, and then set that state again before we execute the content stream for the SMask - could it be that going wrong, do you think? | 17:47.27 |
mvrhel_laptop | chrisl: I really have no idea. I am just guessing at this point. I will dig into this today and see what I can find | 17:47.59 |
| first need to finish this support email on DeviceN to customer | 17:48.26 |
| chrisl: but the softmask stuff is so convoluted the odds of it being your change are slim | 17:48.44 |
ray_laptop | mvrhel_laptop: are we talking about the numbered sequence in the description (the initial comment). Then at what step are you saying smask2 is being composited through smask1 ? | 17:49.13 |
chrisl | mvrhel_laptop: okay, the PS that saves and resets the graphics state is pretty darned hairy, so if it looks like there might be mileage there, let me know, and I'll look into it - I wouldn't subject you to that PostScript hell! | 17:49.19 |
mvrhel_laptop | ray_laptop: I am only reading what Robin_Watts wrote. Nothing more. | 17:49.53 |
| I need to run the file myself and see what is going on | 17:50.04 |
chrisl | Got to go - like I said, ping me if there's anything you want me to look into | 17:55.25 |
ray_laptop | mvrhel_laptop: it seems that the problem of the SMask being used while rendering an SMask isn't that complicated -- an existing 'maskbuf' is not to be used during a begin...end mask rendering | 17:56.05 |
mvrhel_laptop | ray_laptop: sometimes it is | 17:56.33 |
| there are cases with soft masks inside softmasks | 17:56.43 |
| we need to make sure we handle that correctly too | 17:57.15 |
| some of the files in the regression suite have this | 17:57.39 |
ray_laptop | mvrhel_laptop: OK, so we need to tell the compositor to 'drop' an SMask sometimes (when the new SMask is defined in the same group) | 17:58.36 |
mvrhel_laptop | ray_laptop: not sure what you mean by 'drop' | 17:59.11 |
ray_laptop | mvrhel_laptop: pop the existing mask, to prepare for the begin_transparency_mask to set up the new one (smask2 in the example) | 18:00.11 |
mvrhel_laptop | we have a pop compositor command already | 18:00.45 |
ray_laptop | similar to what we do when we end the group | 18:00.46 |
mvrhel_laptop | so that decision can be made in the interpreter | 18:00.58 |
ray_laptop | but without ending the group | 18:01.05 |
mvrhel_laptop | and the appropriate command sent | 18:01.06 |
| that is a command to pop the current soft mask | 18:01.28 |
| similar to what we do with the Q/q commands | 18:01.56 |
| ray_laptop: do you follow what I am saying. I may not have been clear | 18:02.26 |
ray_laptop | mvrhel_laptop: we don't have a POP_COMPOSITOR (we have POP_DEVICE, POP_SMASK_COLOR, POP_TRANS_STATE) | 18:02.33 |
mvrhel_laptop | hold on a sec | 18:03.07 |
ray_laptop | mvrhel_laptop: are you suggesting that we add a POP_SMASK ? | 18:03.21 |
| mvrhel_laptop: (because that's what I was trying to say we need) | 18:04.11 |
mvrhel_laptop | ray_laptop: hold on one sec | 18:04.21 |
| or one minute | 18:04.28 |
ray_laptop | mvrhel_laptop: np | 18:04.34 |
mvrhel_laptop | PDF14_POP_TRANS_STATE | 18:04.54 |
| will end up doing pdf14_pop_transparency_state | 18:05.16 |
Flx_ | Hi | 18:05.27 |
ghostbot | bonjour | 18:05.27 |
ray_laptop | mvrhel_laptop: but we don't want to pop the entire trans state, so we ? | 18:05.30 |
mvrhel_laptop | which will remove the current softmask | 18:05.36 |
ray_laptop | mvrhel_laptop: since we are still in the same group. | 18:05.45 |
mvrhel_laptop | this pops the current softmask | 18:06.10 |
| ray_laptop: I thought that is what you wanted to do | 18:06.46 |
ray_laptop | mvrhel_laptop: I am looking at the code. It looks like that's all POP_TRANS_STATE does is pop the SMask | 18:07.05 |
mvrhel_laptop | yes | 18:07.11 |
ray_laptop | I didn't realize that from the name | 18:07.24 |
mvrhel_laptop | well it is (or was) only used to keep the masks in sync with the graphic state | 18:07.40 |
Flx_ | I switched from gs 9.05 to 8.71 (production server), and my command to convert a pdf into multiple jpgs makes colors oversaturated. I can't update. Does anyone know if there is a solution ? | 18:07.49 |
mvrhel_laptop | with q/Q changes ray_laptop | 18:07.55 |
ray_laptop | mvrhel_laptop: so if we get a 'gs' operator that sets a SMask, we need to know that we need to pop the trans_state (smask) | 18:08.38 |
mvrhel_laptop | ray_laptop: so if the interpreter decides a mask needs to go, it can send the compositor action | 18:08.42 |
| ray_laptop: yes | 18:08.57 |
tor8 | Robin_Watts: lots of fixes on tor/master | 18:08.59 |
mvrhel_laptop | ray_laptop: unless some of the commands are wrapped up in groups | 18:09.37 |
| there may be some issues to worry about | 18:09.59 |
| ray_laptop: I am a little worried about breaking the case where we do have a softmask inside a softmask | 18:10.33 |
ray_laptop | mvrhel_laptop: so how do we determine when we need to do that extra pop ? (or when NOT to do it) | 18:10.54 |
mvrhel_laptop | ray_laptop: that is the question that I don't have an answer for now. I will get this email out to the customer now and then take a closer look at this | 18:11.34 |
ray_laptop | mvrhel_laptop: well, keeping track of the smask nesting level is pretty easy (just bumping up and down in begin/end trans_mask) | 18:11.58 |
| mvrhel_laptop: OK. I'll stop bothering you for a bit. Sorry | 18:12.15 |
mvrhel_laptop | ray_laptop: np. let's talk more about it in a bit | 18:12.41 |
ray_laptop | mvrhel_laptop: please call if I don't seem to be on IRC. | 18:13.00 |
mvrhel_laptop | ray_laptop: ok thanks | 18:13.07 |
ray_laptop | I'll go run an errand in the meantime. | 18:13.16 |
henrys | anybody have opinions on cloud backup services? | 18:16.46 |
mvrhel_laptop | ok email sent. I was planning to work on the windows viewer more today. henrys, do you want me to switch to figuring out what needs to be done on this softmask stuff? | 18:25.31 |
henrys | I'd like to get it done, what do you think? | 18:26.52 |
mvrhel_laptop | I think it has festered long enough that a little push might help it along | 18:27.23 |
| let me find the soft mask in a soft mask file from years ago and work with that and Robin_Watts' custom file | 18:28.25 |
| ok looks like 691803 was the soft mask in a soft mask case | 19:35.32 |
| I need to run a couple errands. bbiab | 19:36.31 |
ray_laptop | mvrhel_laptop: give me a call when you dive back into this SMask issue. I had some thoughts on how to handle the issue in the interpreter that would allow us to cover both cases of the SMask replacement (that causes the rendering diff) AND the multiple rendering of the SMask, but not interfere with the nested SMask case | 21:21.33 |
| mvrhel_laptop: I saw that you found a nested SMask case, but I have a couple of cairo files that also do this (from other bugs) | 21:22.05 |
mvrhel_laptop | ray_laptop: I am back if you wanted to tell me your thoughts about the soft mask | 23:46.18 |