IRC Logs

Log of #ghostscript at irc.freenode.net.

2016/01/15
tor8 chrisl: thanks for helping mvrhel with his CMap confusions!00:15.49 
  Robin_Watts: <br> is tricky, but it should work...00:16.30 
Robin_Watts <p>foo<br>bar</p><p>baz</p> is rendered as foo\nbaz00:23.51 
  <p>foo<br></br>bar</p><p>baz</p> is rendered as foo\nbar\nbaz00:24.12 
  tor8: Let's talk about this stuff tomorrow. Too late now.00:24.31 
bofh_ ugh so I decided to try to implement something like xpdf's pdftotext -layout option to the span extraction and this has revealed to me just how complicated of a mess pdf structured text extraction is.02:31.57 
  also, unrelated: is there any benefit to caching decoded images/pixmaps? on several test systems I had there was none (in fact it was often slower), but these were all x86_64 desktops or fast armv7 boards.02:32.49 
  no idea if it differs on mobile, but that should always be memory-bound, but re-decoding means less data copied than pulling a decoded copy out of the store, at least for anything non-tiny02:33.32 
  (this is all mupdf btw)02:33.46 
kub hi, I want to obtain cmyk raster with 3 drop sizes. 09:19.49 
  What works: -sProcessColorModel=DeviceRGB -sDEVICE=ppmraw -dGrayValues=309:20.06 
kens So you are talking about Ghostscript09:20.18 
  I have no idea what you mean when you say '3 drop sizes'09:20.31 
kub yes09:20.34 
kens And if you use ProcessColorModel=DeviceRGB then you are not producing CMYK09:20.53 
chrisl Also, ppm is an RGB raster format09:21.16 
kub with DeviceCMYK I obtain a Unrecoverable error: rangecheck in .putdeviceprops09:21.23 
kens That's because (as Chris just said) ppmraw is an RGB format09:21.42 
  So you can't use CMYK with it09:21.48 
kub the tiff* devices produce all the rangecheck error09:21.49 
kens You will need to supply an example file and command line for us to look at it. Probably best if you just open a bug report.09:22.17 
  And I still don't know what you mean by 'drop sizes'09:22.28 
kub :-) 3 drop sizes can our InkJet head print09:23.25 
kens I doubt you can have 3 shades of gray09:23.58 
  So I presume there's a 'nothing' drop size for a total of 4 values09:24.16 
  So you need 2 bpp09:24.22 
  By the way, are you a commercial customer ?09:25.03 
  Setting GrayValues to 3 looks like an invalid number, it should be 1, 2, 4, 809:25.46 
chrisl FWIW, doing a simple "showpage", the above command line (with the addition of an output file) works without an error for me, with the current version09:25.57 
kens Looks like GrayValues should be 4 in this case09:26.39 
  Goodness knows what that looks like with halftoning09:26.54 
kub gs -q -dPARANOIDSAFER -dNOPAUSE -dBATCH -r600x600 -sProcessColorModel=DeviceRGB -sDEVICE=ppmraw -dGrayValues=3 -sOutputFile=temp.ppm ~/.local/share/ghostscript/9.18/examples/text_graph_image_cmyk_rgb.pdf09:27.43 
  that works -^09:27.54 
kens But it doesn't do what you think it does.09:28.13 
kub maybe09:28.21 
kens It's not CMYK output and I doubt that it is 2 bits per pixel either09:28.34 
kub is in GS some documentation about halftoning, diffusion, levels of gray parameters, I did not find so far09:29.03 
kens kub please answer the earlier question, are you a commercial customer, and if not are you representing a printer manufacturer (You say 'our print head' for example)09:29.14 
  Ghostscript implements the PostScript halftoning method and there are specific Ghostscript techniques09:29.54 
  Please answer the previous question09:30.02 
kub I am R&D and contacted sales@ but obtained no reply so far. ATM I evaluate different RIP's including GS09:31.50 
kens OK that's fine. When did you contact sales ?09:32.13 
chrisl I think the documentation about GrayValues is misleading: "-sDEVICE=ppmraw -dGrayValues=16 will make this the default device and set the number of bits per component to 4" - PPM only works in 1 or 2 byte samples.09:32.21 
kens Sometimes they lose emails, if you haven't heard we can poke them for you09:32.24 
kub 11.1.09:32.35 
  kens thanks, would be glad to read back09:33.11 
kens 4 days is too long, can you forward the email to support (@artifex.com) and I will ensure they contact you09:33.14 
  What version of GS are you using, and on what platform ? (Linux, Windows, something else)09:33.41 
kub kens - forward done; I develop mainly on Linux and will deploy Windows09:34.51 
kens OK but are you using Linux right now ? We need to reproduce your problem before we can help you09:35.22 
kub ys09:35.29 
  yes09:35.32 
kens OK which Linux, and where did you get the version of GS you are using ? Did you get a package or build it yourself ? What version is it ? and are you using the 64-bit version ?09:36.13 
  Better yet, post the whole back channel output when you run the failing setup09:36.42 
  And please let us know the command line that *doesn't* work09:36.58 
kub oS-Leap41 with gs-9.18 tar ball09:37.11 
kens Because I don't think ppmraw is going to be a good format for you09:37.12 
chrisl And preferably *stop* using "-q"09:37.22 
kens I'm assuming that your printer is CMYK, so you need to use either a separating device (4 rasters, one each of C, M, Y and K) or a composite CMYK device. TIFF seems like a good choice for an evaluation. So let's get that working.09:38.45 
kub http://www.behrmann.name/temp/gsgrayvalues3.output.txt first version09:39.51 
kens Can you do that without -q please09:40.12 
kub bbl09:40.16 
chrisl tiffsep does not support GrayValues, I think09:41.44 
kens Possibly true09:41.57 
  Which might explain the rangecheck error09:42.05 
  We must have some way to produce CMYK halftoned output though09:42.41 
  bitcmyk maybe ?09:43.00 
chrisl I think bitcmyk is the only way to get *2 bit* CMYK. tiffsep1 will produce 1bpp halftoned output09:43.55 
kub chrisl http://www.behrmann.name/temp/gsgrayvalues3withoutPoption.output.txt09:44.32 
kens Well he'll need 2 bit if he wants 3 drop sizes (plus none of course)09:44.33 
  kub the first thing is that is 9.15, not 9.1809:44.54 
kub I was ;-)09:45.02 
  I saw09:45.09 
kens So you aren't using the version you think you are :-)09:45.23 
  I'd guess that is a version of GS bundled into the operating system09:45.36 
  From what Chris and I can see you need to use the bitcmyk device and GrayValues=4 in order to get CMYK rasters, halftoned, with 4 values (0, 1, 2, 3)09:46.16 
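The arithmetic behind GrayValues=4 can be sketched in Python: three drop sizes plus "no drop" give four levels per ink, and four levels need 2 bits per component. The function name is illustrative only and is not part of Ghostscript.

```python
def bits_per_component(levels):
    """Smallest number of bits that can encode `levels` distinct values."""
    if levels < 1:
        raise ValueError("need at least one level")
    # (levels - 1).bit_length() is an exact integer ceil(log2(levels)).
    return max(1, (levels - 1).bit_length())


drop_sizes = 3
levels = drop_sizes + 1          # the extra level is "no drop at all"
print(levels, bits_per_component(levels))   # prints: 4 2
```

Note that three levels would also need 2 bits, which is why the "nothing" drop comes for free.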
kub checked with 9.18 its the same rangecheck message with different version info09:46.38 
  I'll try,09:47.00 
kens kub it looks like our mail server marked your mail to sales as spam, I'm just forwarding it on now. If you don't hear back in a day or so, please do let me know and I'll poke them again.09:47.12 
  If you give me a minute, I'll try and come up with a command line for you (unless Chris beats me to it)09:47.31 
chrisl Note that bitcmyk will output *raw* data - it won't be wrapped in any image file format09:47.43 
kub -sDEVICE=bitcmyk -dGrayValues=4 gives no error09:48.15 
kens OK well that should be producing what you need, but as raw pixels, as Chris says, it's not an image format09:49.50 
  I have no idea what you could use to read that09:50.15 
  There are other screening possibilities, in addition to the standard PostScript methods, but I'm not really up to date with them.09:51.19 
  I guess the first thing is to experiment with this output format, and come back when you have more questions. We'll do our best to answer them, or get answers for you from the developers who know more about the screening (they are in the US so won't be here for a few hours)09:52.43 
chrisl Photoshop can read raw files, but I don't know about 2bpp09:52.48 
kens chrisl is the output separate files or one composite ?09:53.07 
kub I can wrap the bits, np09:53.29 
kens Oh, OK09:53.35 
chrisl IIRC, the bit* devices are composite09:53.41 
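Wrapping the raw bits means unpacking them first. The sketch below is built on an unconfirmed assumption: that each composite pixel occupies one byte holding four 2-bit values in C, M, Y, K order, most significant bits first. Nothing in the discussion states this layout, so check it against the device documentation or a known test page before relying on it. The function names are hypothetical.

```python
def unpack_pixel(byte):
    """Split one byte into four 2-bit levels (ASSUMED order: C, M, Y, K)."""
    return tuple((byte >> shift) & 0b11 for shift in (6, 4, 2, 0))


def unpack_raster(data):
    """Unpack a bytes object, one assumed 8-bit pixel per byte."""
    return [unpack_pixel(b) for b in data]


# 0x1B is binary 00 01 10 11, i.e. levels 0, 1, 2, 3 under the assumed layout.
print(unpack_pixel(0x1B))
```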
kens can't decide if composite or separated is easier :-)09:54.06 
  kub is that enough for you to start with ?09:54.18 
kub It's a start to estimate how beautiful the raster will be printed09:56.28 
kens OK then I suggest you start that way, and we'll ask our colour expert about screening this afternoon when he comes online.09:57.06 
kub thanks for your help09:57.32 
kens If you can come back in about 6 or 7 hours then we can give you some more help09:57.34 
  Or drop in on Monday and we'll tell you what we've found out:-)09:57.49 
  kub your web site appears to be non functional this morning. All I get is a blank page....10:15.18 
kub kens: http://www.behrmann.name looks fine from at least two locations10:17.51 
chrisl kub: I think kens was meaning dropjet.com10:22.11 
kens Yes, http://www.dropjet.com doesn't do anything for me10:26.06 
tor8 Robin_Watts: a lone BR tag in xhtml must be closed <br/>10:26.20 
Robin_Watts tor8: Ah, so we need to use <br></br> then.10:26.56 
  That does work.10:27.01 
tor8 so the "<p>foo<br>bar</p><p>baz</p>" is parsed as "<p>foo<br>bar</br><p>baz</p>" [and implicit </p>] since I cheat and don't actually check that a closing tag matches10:27.45 
  Robin_Watts: or just <br/>10:28.04 
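The effect tor8 describes can be reproduced with a toy parser. This is a sketch of the general idea (a closing tag pops whatever element is open, without comparing names), not MuPDF's actual parser; the regex and the tuple-based tree are illustrative choices.

```python
import re

# One token is an open tag, a close tag, a self-closing tag, or a run of text.
TOKEN = re.compile(r"<(/?)(\w+)\s*(/?)>|([^<]+)")


def parse_lenient(markup):
    """Build a nested (tag, children) tree; close tags are not name-checked."""
    root = ("#root", [])
    stack = [root]
    for closing, name, selfclose, text in TOKEN.findall(markup):
        if text:
            stack[-1][1].append(text)
        elif closing:
            # No check that `name` matches the open element: just pop it.
            if len(stack) > 1:
                stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            if not selfclose:
                stack.append(node)
    # Elements still open at end of input are closed implicitly.
    return root[1]


print(parse_lenient("<p>foo<br>bar</p><p>baz</p>"))
print(parse_lenient("<p>foo<br/>bar</p><p>baz</p>"))
```

With the unclosed `<br>`, "bar" ends up inside the br element and the second paragraph is nested in the first; with `<br/>` the two paragraphs are siblings.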
Robin_Watts tor8: right.10:28.34 
chrisl tor8: So, I'm hoping I helped clarify the CIDFont/CMap/cmap/ToUnicode things for mvrhel_laptop, and not made things worse!10:29.17 
Robin_Watts So, experimenting with the code, I see that: <p>A<b>B</b>C</p> results in 3 calls to generate text with "A" "B" and "C" respectively.10:29.35 
tor8 chrisl: everything you said was perfectly clear and accurate! (to me, at least... I guess I also have font related job security...)10:29.53 
kens But do you want it.....10:30.33 
Robin_Watts The bidirectional algorithm needs to be passed whole lines (or paragraphs) at a time because the directionality of certain chars depends on context.10:30.45 
  So I reckon it needs to be done at a higher level than generate_text.10:31.12 
tor8 Robin_Watts: can we keep the current paragraph directionality as an in-out parameter that gets passed to generate_text?10:32.09 
Robin_Watts tor8: That won't help.10:32.26 
tor8 or do you need to look both before and after the current char to determine directionality?10:33.15 
Robin_Watts Some chars have different directionality according to the stuff around them, not just the 'current paragraph directionality'.10:33.15 
  "A" "(" "B" for example.10:33.37 
tor8 you could put those dependent bits in a fragment of its own and set the directionality bit to 'depends' in generate_text, and resolve it as a post process?10:33.55 
Robin_Watts The "(" is only L2R if B is.10:33.58 
tor8 it'll mean splitting unneccessarily (gah, I can't spell that word) but I think that'll be easier than splitting afterwards10:35.01 
kub kens chrisl - indeed that link is down, thanks for letting me know10:37.15 
kens NP10:37.21 
tor8 Robin_Watts: so, the lex_number diff is on tor's bmpcmp now10:37.35 
kens I just thought I'd look at dropJet's products to get some familiarity, I was able to get a good overview from the ESMA site though10:37.56 
tor8 Robin_Watts: most of them actually look like progressions10:39.54 
Robin_Watts tor8: the rules for exactly how to recognise fragments are hard. That's why everyone uses the same piece of example code from unicode to do it.10:40.48 
  My preference, I think would be to do a pass over the text after we've parsed it.10:41.11 
tor8 okay, then we'll need to either split flow nodes during that pass or keep directionality per character rather than per flow node10:42.08 
Robin_Watts Run through the flow a paragraph at a time, gathering the text up. Feed that text into the unicode algorithm to get directions out, and split the flow accordingly.10:42.31 
  So at the end of that pass we have the boxes tagged with appropriate directions.10:42.58 
  Then layout just needs to be updated to cope with using those directions.10:43.13 
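The post-pass Robin_Watts outlines (gather a paragraph, resolve directions, split into runs) might look roughly like this. It is a heavily simplified stand-in for the Unicode bidirectional algorithm: only strong left/right classes are recognised, and a neutral run takes the direction of its neighbours when they agree and the base direction otherwise. Numbers, embeddings and mirroring are ignored, and the function names are hypothetical.

```python
import unicodedata


def char_dirs(text, base="L"):
    """Return 'L' or 'R' for every character of one paragraph."""
    strong = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            strong.append("L")
        elif cls in ("R", "AL"):
            strong.append("R")
        else:
            strong.append(None)          # neutral or weak: resolve below
    resolved = list(strong)
    n = len(text)
    i = 0
    while i < n:
        if strong[i] is not None:
            i += 1
            continue
        j = i
        while j < n and strong[j] is None:
            j += 1
        before = strong[i - 1] if i > 0 else base
        after = strong[j] if j < n else base
        fill = before if before == after else base
        for k in range(i, j):
            resolved[k] = fill
        i = j
    return resolved


def split_runs(text, base="L"):
    """Split a paragraph into (direction, substring) runs."""
    runs = []
    for ch, d in zip(text, char_dirs(text, base)):
        if runs and runs[-1][0] == d:
            runs[-1] = (d, runs[-1][1] + ch)
        else:
            runs.append((d, ch))
    return runs


words = ["A", "(", "B"]                  # text gathered from one flow
print(split_runs("".join(words)))
```

The "(" has no direction of its own here, which is why the whole paragraph has to be seen at once rather than one generate_text call at a time.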
tor8 Robin_Watts: right. so same result as I was thinking of, but done as a separate post-process by splitting nodes10:44.27 
Robin_Watts yes.10:44.46 
tor8 okay. good.10:47.47 
Robin_Watts How do you explain the difference in 9 ?10:47.51 
  28 looks like a clear progression.10:47.51 
  34 too.10:47.51 
kens I see 2 clear progressions, a bunch of pixely 'who cares' diffs and some oddities10:48.32 
  KenEg catx5720.pdf10:48.52 
  Oh damn10:48.52 
  I mean catx5720.pdf10:49.00 
  #80 looks like a progression too, not surprising given the bug title10:50.03 
  Also 8610:50.20 
  92 looks worse10:50.56 
chrisl We should probably edit that out of the logs ^^10:51.01 
kens Yeah please, if someone could do that, sorry10:51.16 
chrisl Robin_Watts: can you do the honours please?10:51.38 
Robin_Watts Ok, so if you're happy with 9, 80, 88 and 92, then I'm happy.10:52.09 
tor8 Robin_Watts: syntax error in the font descriptor object10:52.14 
kens For number 92 MuPDF matches GS but not Acrobat, this may be a first example of Acrobat doing something other than setting to 010:52.19 
tor8 it has "/ItalicAngle -17.-21823" which got turned into 3 tokens10:52.29 
  and then parsing failed because -21823 is not a valid dictionary key10:52.42 
kens It 'looks like' Acrobat turns '--' into '-'10:52.57 
Robin_Watts so, -17.-21823 goes to -17.21823 ?10:53.26 
kens No,10:53.37 
  --17.2 goes to -17.210:53.47 
  ~92 has lots of values with double negatives10:54.00 
tor8 no, it goes to "-17." and the rest is discarded10:54.03 
Robin_Watts edits logs.10:54.12 
kens defers to tor10:54.13 
  GS interprets the '--' as invalid and sets the numbers to 010:54.35 
  Which gives the same result as MuPDF10:54.45 
  However Acrobat differs10:54.54 
  It looks to me like Acrobat is turning the '--' into a '-' and using the rest of the number10:55.12 
tor8 kens: I'm still talking about number 910:55.34 
kens Oh sorry, I was talking about #9210:55.46 
  Which is actually a slight regression10:56.17 
tor8 kens: so for 92, if I change the '-' detection to a while loop it looks like the pdf creator intended10:57.47 
  so instead of if (c == '-') I just while (c == '-') to eat them all10:58.13 
kens So you truncate the '--' back to a '-' ?10:58.15 
  Right10:58.19 
  Its the first instance I've seen where a malformed number is corrected instead of set to 010:58.36 
tor8 but I don't actually negate the sign twice10:58.38 
kens Yeah that's what I think is 'correct' or at least 'same as Acrobat'10:58.57 
  I guess I'll have to try and do that in GS as well :-(10:59.20 
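The change tor8 describes, turning the single '-' test into a loop while negating only once, can be sketched as follows. This is an illustrative Python rendering under stated assumptions, not the MuPDF lex_number code: the function name and the (value, rest) return shape are invented for the example, and what the caller does with the unparsed remainder is left open.

```python
DIGITS = "0123456789"


def lex_number(s):
    """Parse a leading number from s, tolerating repeated '-' signs.

    Returns (value, unparsed_rest).
    """
    i = 0
    negative = False
    while i < len(s) and s[i] == "-":    # 'while', not 'if': eat every '-'
        negative = True                  # set once, so '--' does not cancel out
        i += 1
    start = i
    while i < len(s) and s[i] in DIGITS:
        i += 1
    if i < len(s) and s[i] == ".":
        i += 1
        while i < len(s) and s[i] in DIGITS:
            i += 1
    digits = s[start:i]
    value = float(digits) if digits not in ("", ".") else 0.0
    if negative:
        value = -value
    return value, s[i:]


print(lex_number("--17.2"))        # (-17.2, '')
print(lex_number("-17.-21823"))    # (-17.0, '-21823')
```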
  For #9 I don't see a great difference even with the GS output which sets the -18 to 010:59.39 
  err -17 that is10:59.55 
tor8 kens: the font is embedded so I don't expect the -17 to actually show up in the render11:00.18 
  we dropped the embedded font because we got confused while parsing the dictionary and errored out11:00.42 
kens Hmm, OK but I thought it might affect GS, seems it doesn't11:00.42 
  Oh OK11:00.52 
Robin_Watts tor8: Are you up for the parliament trip in June?11:01.18 
kens overall I'd say its a distinct improvement, and if you treat '--' as '-' its even better11:01.20 
tor8 85 is probably the best test to see if adobe sets to 0 or parses the initial bit11:02.10 
kens Hmm, let me look at that one again11:02.38 
  Acrobat actually throws a warning11:03.24 
  But it has only the largest square11:03.33 
  So it is different to MuPDF and GS11:03.48 
tor8 kens: 40.-40 40+60 160 160-1 re s 80e0 80.abc 80 80 re s11:04.00 
kens bangs head on table11:04.19 
  Obviously a hand-broken file11:04.33 
tor8 yes, it is. but it would show what acrobat does. what kind of warning does it toss out?11:04.54 
kens the usual 'something is wrong and the page may not display as expected'11:05.14 
tor8 right, I forgot how useful adobes errors are :)11:05.32 
kens From prior experience, Acrobat stops processing when it throws that error11:05.33 
  So some part of that broken rect is causing Acrobat to give up11:06.04 
  I could repair bits of it to see where it stops I guess11:06.17 
tor8 Robin_Watts: not sure; how soon do I have to make up my mind?11:06.20 
kens It looks like its OK with the first rect, but the second it throws an error11:06.52 
  I'm guessing its the .abc11:07.01 
tor8 kens: I suspect the 80e0 since it looks hexadecimal11:07.30 
kens Give me a second, just changing it11:07.40 
tor8 but if not, then both 80e0 and 80.abc should be the same, number followed by a word11:08.00 
kens If I take out the e0 then it just displays the first rectangle11:08.07 
tor8 80 then e0 and 80. then abc11:08.09 
Robin_Watts tor8: Just trying to get an idea of numbers.11:08.23 
kens So it looks like the e0 actually makes Acrobat error out11:08.29 
Robin_Watts I'll put you down as a maybe.11:08.41 
kens Interesting, the 80.abc is not treated as 011:09.34 
tor8 Robin_Watts: Thanks. My disdain for politicians and everything they do might be overcome by the group's enthusiasm.11:09.48 
kens Nor is it treated as 80, wtf ?11:10.03 
tor8 kens: huh, that's .... odd11:10.16 
kens repeats the tests11:10.27 
tor8 does it try to parse it as 0x80.abc ?11:10.28 
kens Hard to say11:11.04 
  If I change it to 0 80.abc 80 80 re s then it shows nothing11:11.21 
  If I change it to 0 80 80 80 re s then it strokes a rectangle at 0,8011:11.40 
  If I change it to 0 0 80 80 re s then it strokes a rectangle at 0 011:12.03 
  So what's it doing with the .abc ?11:12.16 
tor8 very inconsistent handling of numbers then!11:12.18 
kens and the .abc doesn't throw an error either.....11:12.46 
tor8 0 80 0 80 [ignored 80] re maybe?11:13.00 
kens Hmm, that could be11:13.11 
  That'd be a 0 width rect11:13.19 
  let me put the .abc in the first number11:13.34 
tor8 it'd still show up, it's stroked not filled11:13.51 
kens Its not showing up at all with .abc no matter what I do11:14.20 
  And not giving an error11:14.26 
  Taking away one of the operands doesn't do anything either11:14.55 
  It looks like Acrobat is silently ignoring the error11:15.06 
  OK so if I *deliberately* create a rect with too few operands, Acrobat silently ignores it11:15.43 
  Hmm11:16.18 
  Oh boy11:16.48 
  It looks like Acrobat throws away the malformed number, then because there are too few operands for the 're' it doesn't draw it. But it doesn't throw an error either.11:17.18 
  What a pile of poo11:17.25 
  Obviously it 'fixes' at least one of the numbers in the larger rectangle11:18.04 
tor8 kens: indeed, I think the conclusion is, acrobat does arbitrary stuff to broken numbers.11:18.49 
kens So this: "0.abc 0 0 80 80 re s" produces an 80 rectangle at 0,011:19.12 
  Whereas this : "0.abc 0 80 80 re s" silently produces no rectangle11:19.32 
  I don't think it's really worth trying to duplicate this insane behaviour11:19.56 
  At a guess, Acrobat ignores numerals and signs in the middle of a number, truncating the number from that point. So "40.-40 40+60 160 160-1 re s" becomes " 40 40 160 160 re s"11:21.32 
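kens's guess about signs in the middle of a number can be written down as a one-rule sketch: cut each token at the first '+' or '-' that is not its first character. This models only the guess stated above, using a hypothetical helper name; it says nothing certain about what Acrobat really does, and it does not cover the alpha or 80e0 cases.

```python
def truncate_at_inner_sign(token):
    """Drop everything from the first sign that is not the leading character."""
    for i, ch in enumerate(token):
        if i > 0 and ch in "+-":
            return token[:i]
    return token


operands = "40.-40 40+60 160 160-1".split()
print([truncate_at_inner_sign(t) for t in operands])
```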
  But alphas in a number it throws an error on11:21.56 
  The 80e0 I'm not so sure what its doing11:22.31 
  Except that it throws an actual error on that one11:23.18 
  Yeah 80e0 throws an error, 80.e0 does not. Madness11:24.47 
tor8 so the integer and real parsers differ in how they handle errors. madness indeed!11:25.11 
kens Well I wouldn't like to guess what's going on behind the screen11:25.29 
  It might be that they are saying that a .x is a missing trailing 0, whereas an alpha in a number is not a missing whitespace11:26.09 
  In any event, I think your current approach is more than good enough11:26.29 
  I'll have a poke at GS and see if I can get it to treat '--' as '-' as well :-(11:26.48 
  Hah, GS already treats 40.-40 as 40.0, I didn't know that11:27.57 
  But 160-1 gets turned into a 011:28.10 
Robin_Watts ok, so, tor8: I need to code up that second pass now.11:33.08 
  Am I right in thinking that to find all the text in a paragraph I do a depth first search breaking the text at each 'break' node ?11:34.09 
  Or is this the time we should be looking at http://www.unicode.org/reports/tr14/ ?11:36.45 
tor8 I've sort-of hacked a partial implementation of tr14 already -- it's what creates the 'break' nodes in the first place11:47.30 
  sorry, 'glue' nodes11:47.43 
  the fz_html boxes that get spit out from generate_box come in four flavours11:48.43 
  BLOCK, BREAK, FLOW and INLINE11:48.54 
  for bidi you only care about the FLOW boxes11:49.13 
  ugh, I can barely remember how these things are strung together11:50.27 
  anyway, each BOX_FLOW has a paragraph or possibly more, depending on the presence of <br/> tags or being a <pre> tag, which will show up as FLOW_BREAK nodes11:51.44 
kens lunches12:29.53 
NTQ Is there an example on how to create PDF/A-2a? At the moment I always get PDF/A-2b. If there is no example, can I upload you my test scenario?12:37.16 
chrisl NTQ: I don't know for sure, but I suspect that PDF/A-2a has many of the same requirements as PDF/A-1a. In which case, the answer is covered here: http://ghostscript.com/FAQ.html13:01.34 
NTQ chrisl: Thank you. So because PDF/A-1a is not implemented, you also did not implement PDF/A-2a I guess.13:10.01 
  Because both of them have nearly the same restrictions.13:10.24 
Robin_Watts tor8: Gotcha, ta, I'll give that a whirl.13:11.31 
chrisl NTQ: The information to produce A-1a (and I assume A-2a) is not available by the time we (Ghostscript) see the input.13:13.28 
kens Chris is correct, we cannot make PDF/A-xa files13:16.10 
  The spec (PDF/A-1a) specifically says you are not supposed to guess at the document structure and without that, you cannot make an 'a' file.13:16.34 
NTQ kens: Thank you. Then I will ask our customer if he also would accept PDF/A-2b.13:17.33 
  The main reason why I want to use PDF/A-2 is transparency. We sometimes received PDF documents with transparent images. After creating a PDF/A-1b from such a document the whole page gets rendered as an image, so text too. And a few weeks ago I heard from you that it is not possible to only render the image again, excluding the text.13:20.19 
kens You cannot easily tell whether any portion of the text is partially or fully transparent, so you have to render it all.13:21.07 
tor8 Robin_Watts: I expect we'll need to add arabic/hebrew fonts to mupdf now then?13:22.09 
Robin_Watts tor8: some kind of fallback mechanism, yes.13:22.36 
tor8 Robin_Watts: it'll be easy enough to merge in DroidSansArabic and DroidSansHebrew into DroidSansFallback13:23.24 
Robin_Watts but that doesn't solve for other languages.13:24.00 
  would be nicer to have a generic fallback system that could cascade through a set of script fallbacks.13:24.24 
tor8 Robin_Watts: yeah, agreed13:24.32 
  we only have a two-level fallback now13:24.37 
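A cascading fallback of the kind Robin_Watts asks for can be sketched as an ordered list of fonts that is searched until one covers the character. The Font class, its coverage sets and the font names used below are placeholders for illustration; they do not reflect MuPDF's font API.

```python
class Font:
    """Hypothetical font stand-in: a name plus the code points it covers."""

    def __init__(self, name, codepoints):
        self.name = name
        self.codepoints = set(codepoints)

    def has_glyph(self, codepoint):
        return codepoint in self.codepoints


def pick_font(ch, primary, fallbacks):
    """Return the first font in the cascade that covers ch, else None."""
    for font in [primary, *fallbacks]:
        if font.has_glyph(ord(ch)):
            return font
    return None


latin = Font("Latin", range(0x20, 0x7F))
hebrew = Font("Hebrew", range(0x05D0, 0x05EB))
arabic = Font("Arabic", range(0x0600, 0x0700))

for ch in ("A", "\u05d0", "\u0628"):
    font = pick_font(ch, latin, [hebrew, arabic])
    print(hex(ord(ch)), font.name if font else "no glyph")
```

Adding another script is then a matter of appending to the list rather than merging glyphs into one fallback font.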
NTQ kens: Sorry, I am not a PDF expert. But if I create a PDF/A-1b with Adobe Acrobat it recognizes exactly which parts of a page have to be rendered new and which not. What makes it hard to identify these parts of a page where a transparent image has any effect?13:28.16 
tor8 Robin_Watts: I'll take a stab at making a cascading fallback font system13:30.05 
kens NTQ I didn't say it was impossible,I said 'easily'13:32.35 
  Say I draw some text, then paint some more stuff, then create a transparency group and draw through it. If part of that group intersects the text, then the text must be rendered to an image13:33.30 
  But by the time we get to the transparency operation, we've already stored the text in the output PDF file.13:33.50 
  Its not impossible to preparse the entire PDF file, but it would mean totally rewriting our PDF output device, and frankly that's not going to happen13:34.31 
  The benefit is small, the cost is huge13:34.47 
NTQ kens: Alright. Thank you. I fully understand now.13:35.11 
chrisl We can produce PDF/A-2b IIRC13:35.55 
kens We can, yes13:36.02 
  Possibly even PDF/A-3 now I think13:36.16 
tor8 Robin_Watts: on tor/master there's a quick fix that merges DroidSansArabic and Hebrew into the CJK fallback font13:59.21 
Robin_Watts Ta.13:59.41 
tor8 Robin_Watts: there's also a "direction" property in CSS that I don't currently pass on14:00.35 
  Robin_Watts: cocked up something with the encoding in that one, there's a new version of the commit up now14:13.11 
Robin_Watts is just boggling at these html structures.14:51.31 
  surely they take a HUGE amount of memory ?14:52.26 
  44 bytes for every flow entry. And there is a flow entry for every word, plus another for every space.14:53.14 
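As a rough worked example of that cost, assume the 44-byte figure quoted above and exactly one space entry between consecutive words (both are simplifications; images, breaks and allocator overhead are ignored):

```python
FLOW_ENTRY_BYTES = 44   # per-entry size quoted in the discussion


def flow_bytes(word_count):
    """Estimated bytes of flow entries for a run of space-separated words."""
    if word_count <= 0:
        return 0
    entries = word_count + (word_count - 1)   # one per word, one per gap
    return entries * FLOW_ENTRY_BYTES


print(flow_bytes(100_000))   # prints: 8799956
```

So a 100,000-word book would spend close to 9 MB on flow entries alone, before any text or style data.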
  The type and expand can be combined into a flags word.14:55.11 
tor8 Robin_Watts: not to mention just how damned many of them there are! the fz_html_flow struct is overdue for a diet14:55.34 
Robin_Watts I reckon we should be reference counting styles and sharing them where possible.14:55.59 
tor8 Robin_Watts: the *style is a pointer to the box's embedded struct14:56.13 
Robin_Watts Ok, so can't we just omit that and always pass both a box pointer and a flow pointer ?14:56.45 
tor8 a flow box is always a child of a block box14:57.23 
Robin_Watts fz_html_flow is always a child of an fz_html, you mean ?14:57.56 
tor8 but the inline boxes are also children of the block box, but the text content of the inline box lives in their sibling flow box14:57.59 
  and fz_html_flow is a child of a fz_html with the FLOW_BOX type14:58.21 
  but the flow->style does not necessarily point to the parent box's style14:58.35 
  it may point to it's uncle or cousin box's style14:58.48 
Robin_Watts So... if I have <p>Mary had a <b>little</b>lamb</p>14:59.42 
  we'd have an inline box for the <b> section, then a flow box with "Mary" " " "had" " " "a" " " "little" " " "lamb"15:00.56 
tor8 you get a box tree: { block[p] { inline(b) {}, flow {"Mary had a ", "little", "lamb" } }15:01.03 
  yeah15:01.05 
Robin_Watts and the style for "little" would point to the inline box.15:01.09 
tor8 yeah. I figured I'd save a *little* bit of memory (considering how much I already waste) by not making every flow node have its own style15:01.56 
Robin_Watts Gotcha.15:02.08 
tor8 the inline boxes I don't use for anything other than creating the flow nodes, but I have to keep them around just because they hold the styles15:02.20 
Robin_Watts tor8: I'd consider having a global style dictionary.15:02.28 
tor8 the inline boxes are needed for the css matching15:02.29 
  Yeah. that'd probably save a fair bit of memory.15:02.47 
Robin_Watts and then instead of having pointers to the style, have indexes into the dictionary.15:02.52 
tor8 considering that each html node may have unique style attributes, but the vast majority of them will be shared15:03.14 
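The global style dictionary with indexes could be sketched as a simple interning table: identical styles are stored once and every node keeps only a small integer. The class name and the dict-based style representation are assumptions made for the example, not MuPDF's fz_css_style.

```python
class StyleTable:
    """Store each distinct style once and hand out integer indexes."""

    def __init__(self):
        self._index = {}     # canonical key -> position in self.styles
        self.styles = []

    def intern(self, style):
        key = tuple(sorted(style.items()))
        idx = self._index.get(key)
        if idx is None:
            idx = len(self.styles)
            self._index[key] = idx
            self.styles.append(dict(style))
        return idx


table = StyleTable()
normal = table.intern({"font-weight": "normal"})
bold = table.intern({"font-weight": "bold"})
again = table.intern({"font-weight": "normal"})
print(normal, bold, again, len(table.styles))   # prints: 0 1 0 2
```

Since the flow nodes would then refer to the table rather than to the inline boxes, the inline boxes could be freed once matching is done, as discussed next.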
Robin_Watts Would we still need the inline boxes then?15:03.30 
tor8 no, then we could free them once we're done15:03.41 
  but everything is allocated using the pool allocator now15:03.56 
Robin_Watts We could allocate inlines using a different pool allocator.15:04.11 
  and then free that pool.15:04.21 
tor8 yeah.15:04.23 
  the fz_css_style could use bitfields for a lot of its fields15:04.47 
Robin_Watts Paragraphs never extend outside a flow block, right?15:05.15 
tor8 and a lot of the flow properties are computable with a bit of care, so don't need to be stored15:05.31 
  define "extend"15:05.52 
Robin_Watts block { flow { "This is a different paragraph" } block { flow "to this" } }15:06.14 
  block { flow { "This is a different paragraph" } block { flow { "to this" } } }15:06.30 
  When computing the directions of the text, I need to pass whole paragraphs to the code at once.15:07.09 
tor8 a single paragraph is never split into multiple flow boxes15:07.24 
Robin_Watts That means passing the contents of a whole 'flow' at once, never having to combine multiple flows together.15:07.33 
  Cool.15:07.35 
tor8 a line break is always at the end of a flow box15:07.47 
  A smarter/dumber way is to not have the flow nodes at all and just have an array of where the spaces and breaks are in the text15:09.23 
  it'll mean more work during rendering, but would save huge amounts of memory15:09.34 
  and assign styles to spans of text15:10.07 
  so the flow box would look something like struct { char *text; char **spaces; char **breaks; style *styles; char **style_starts; }15:11.05 
  and then another array of breaks actually taken15:11.49 
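tor8's node-free layout (one text buffer plus side arrays for spaces, breaks and style changes) might be prototyped like this. Offsets stand in for the C pointers in his struct, and the field and function names are illustrative; the extra "breaks actually taken" array is left out.

```python
from dataclasses import dataclass, field


@dataclass
class CompactFlow:
    text: str = ""
    spaces: list = field(default_factory=list)        # offsets of space characters
    breaks: list = field(default_factory=list)        # offsets of forced line breaks
    style_starts: list = field(default_factory=list)  # offsets where the style changes
    styles: list = field(default_factory=list)        # style id in force from each start


def build_flow(spans):
    """Build a CompactFlow from (text, style_id) pairs in document order."""
    flow = CompactFlow()
    for chunk, style in spans:
        if not flow.styles or flow.styles[-1] != style:
            flow.style_starts.append(len(flow.text))
            flow.styles.append(style)
        for ch in chunk:
            if ch == " ":
                flow.spaces.append(len(flow.text))
            elif ch == "\n":
                flow.breaks.append(len(flow.text))
            flow.text += ch
    return flow


flow = build_flow([("Mary had a ", 0), ("little", 1), (" lamb", 0)])
print(flow.text)
print(flow.spaces, flow.style_starts, flow.styles)
```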
Robin_Watts Or, make use of some of the utf8 unused codes.15:11.52 
tor8 or just plain old escape codes15:12.25 
Robin_Watts so char *text becomes a list of either valid utf8 codes, or invalid ones that act as escapes for 'break', 'change style' etc.15:12.40 
tor8 though I think we should hold off optimizing this too much until we've implemented a bit more15:13.52 
  bidi, floating around images, tables, hyphenation and tex-style global line breaking optimization15:14.18 
Robin_Watts tor8: Yeah.15:14.25 
tor8 this structure is wasteful, but it's designed for rapid prototyping15:14.40 
Robin_Watts I need to add a direction flag to fz_html_flow_s.15:14.46 
  so to do that I'll move expand and type and direction into a single bitfield.15:15.02 
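Packing type, expand and direction into one word could look like the sketch below. The bit widths and the FLOW_* values are assumptions chosen for illustration; the real enum values and layout in fz_html_flow are not given in the discussion.

```python
# Hypothetical layout: bits 0-1 type, bit 2 expand, bit 3 direction (1 = RTL).
FLOW_WORD, FLOW_GLUE, FLOW_IMAGE = 0, 1, 2   # illustrative values only

TYPE_MASK = 0b11
EXPAND_BIT = 1 << 2
RTL_BIT = 1 << 3


def pack_flags(flow_type, expand, rtl):
    """Combine the three fields into a single small integer."""
    flags = flow_type & TYPE_MASK
    if expand:
        flags |= EXPAND_BIT
    if rtl:
        flags |= RTL_BIT
    return flags


def unpack_flags(flags):
    """Recover (flow_type, expand, rtl) from a packed value."""
    return flags & TYPE_MASK, bool(flags & EXPAND_BIT), bool(flags & RTL_BIT)


packed = pack_flags(FLOW_GLUE, expand=True, rtl=False)
print(packed, unpack_flags(packed))   # prints: 5 (1, True, False)
```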
tor8 Robin_Watts: sounds good.15:15.12 
  you could put text and image in a union15:15.36 
Robin_Watts will do.15:16.04 
tor8 the x,y,w,h stuff is used for line layout so needs to stay15:17.51 
  the 'em' is calculated from the style, but depends on the tree context and the current font size set during layout so needs to be stored as well15:18.59 
Robin_Watts tor8: I understand the need for w and h (to avoid repeated measuring). I don't get the need for x and y to be in the structure.15:19.42 
tor8 it's where the layout puts them so the drawing code can draw the node without redoing the layout15:20.11 
  Robin_Watts: one way to skip the x,y,w,h,em fields would be to create the fz_text node during layout instead of during drawing15:27.50 
HenryStiles Robin_Watts, mvrhel_laptop: have you guys tried to login to RSA? I went through the entire process and now it says it doesn't know me. Pretty sure I did everything right.16:33.38 
Robin_Watts HenryStiles: Yeah, worked for me.16:36.11 
  Well, i've registered etc, if that's what you meant.16:36.36 
HenryStiles huh, it worked the second time around.16:40.50 
kub mvrhel_laptop: hello17:27.04 
  mvrhel_laptop: how is GS Even Tone Screening invoked. We need it with 8/16bit CMYK for producing contone colors in 4 levels of gray.17:41.27 
Robin_Watts kub: Hi. Are you a commercial customer of Artifex?17:48.35 
jogux Robin_Watts: jub is the person kens was talking to this morning who hasn't yet had a reply from sales@, iirc.17:50.04 
  kub, even, sorry.17:50.17 
kens scott has replied, I've seen the email17:50.30 
jogux ah, I'd also not noticed kens had reconnected :)17:50.44 
kens yeah network is flaky17:50.57 
Robin_Watts Ok. I was interested to know if we (Artifex) had supplied the separate ETS code to kub, or whether he was trying to use the version of it that's pickled into the rinkj device in gs.17:51.40 
kens We won't have supplied any new code, at least as yet17:52.26 
Robin_Watts Ok, so I would expect it to be quite hard for kub to do any serious evaluation until he gets the latest version from us.17:53.02 
kens kub did you get an email from Scott Sackett ?17:54.49 
  OK I'm off for the night, have a good weekend everyone18:05.19 
kub bbl18:45.27 
  Robin_Watts: not yet, but interested in becoming a commercial one19:06.42 
  kens: Scott Sackett did send me an email, and I replied.19:07.12 
Robin_Watts kub: OK. It sounds like we need to get you a copy of the latest code for evaluation. I'm not sure I'm authorised to just send it out.19:08.11 
kub Robin_Watts: is the ETS (EvenTone Screening) not inside AGPL GS?19:08.12 
Robin_Watts HenryStiles: What's the process?19:08.20 
  kub: There is an old version of the code in gs, as part of the rinkj device.19:08.40 
  If you're doing customisation and tuning, then we have a standalone version that is probably easier to work with.19:09.21 
kub aha19:09.21 
  ok19:09.28 
  -sDEVICE=rinkj -dGrayValues=4 gives a rangecheck error and without -dGrayValues I get a crash19:15.06 
  Robin_Watts: yes, the ETS code is appreciated for evaluating19:16.04 
Robin_Watts kub: I need to get the OK from HenryStiles to send it out. His OK may or may not be conditional on getting a signed evaluation agreement between you and Scott.19:17.10 
kub ah, git cloned, but that appears not sufficient from your wording19:21.09 
  Robin_Watts: will wait for your ping or/and email19:22.03 
HenryStiles Robin_Watts: sorry at lunch, it's fine to send it.19:47.26 
Robin_Watts kub: Email address?20:14.02 
HenryStiles it's fine not having the latest stuff out but is ets the reason for rinkj not working? I guess we don't know that.20:27.21 
  rinkj should work.20:27.35 
Robin_Watts I have never used rinkj in my life.20:29.32 
kub rinkj appears to need some setup, which I omitted20:30.31 
  http://ghostscript.com/doc/current/Devices.htm#Rinkj20:31.14 
HenryStiles kub: breaks for me with setup too.20:36.30 
  kub: so the first sentence of the documentation is spot on ;-)20:40.01 
  Robin_Watts: the device uses a color manager directly, it crashes lcms. What is truly bizarre is we have we make this call if lcms_deshandle is NULL, des_color_space = cmsGetPCS(lcms_deshandle) but the first thing that function does is dereference lcms_deshandle so something is awry in the code generally (rinkj aside)20:54.14 
  mvrhel_laptop: ^^^20:54.19 
  gsicc_lcms2.c:523 des_color_space = cmsGetPCS(lcms_deshandle);20:55.53 
  sorry I'll rewrite that gibberish if needed20:57.57 
mvrhel_laptop kub are you still there22:28.25 
  HenryStiles: I was able to login to the RSA, but it appears that I already had an account with my artifex email22:29.10 
  HenryStiles: It looks like rinkj is really screwed up. I will see if I can get it working after I finish up this font stuff in mupdf22:32.15 
HenryStiles mvrhel_laptop: a little more worried about gsicc_lcms2.c, maybe that case can never happen?22:58.44 
mvrhel_laptop oh let me look hold on22:59.04 
  that makes no sense.. hold on23:01.02 
  HenryStiles: I am going to have to take a closer look into this to understand when or how this case could occur. My comment /* We must have a device link profile. */ is a clue.23:05.09 
HenryStiles mvrhel_laptop: don't interrupt your mupdf stuff, but I thought you'd want to know about it.23:07.30 
mvrhel_laptop HenryStiles: thanks. I suspect it is supposed to be lcms_srchandle in line 526 but I will take a closer look at it later.23:08.01 
  I will open a bug to remind myself23:08.07 