Log of #mupdf at irc.freenode.net.

20200813
pedr0 I've a question regarding the fz_quad struct. Are 'ul, ur, ll, lr' upper-left, upper-right, lower-left, lower-right?11:34.12 
Robin_Watts_ they are the images of those points after transformation, yes.11:37.20 
  (at least, that's what I believe)11:37.34 
malc_ a confidence inspiring comment from one of the authors..11:40.42 
Robin_Watts_ I didn't code any quad stuff :)11:41.55 
  (and Acrobat disagrees with the spec in at least one case in terms of the ordering of points used for highlight regions)11:42.34 
malc_ excuses excuses :)11:42.59 
  ouch11:43.07 
Robin_Watts_ (so it's just possible that the quads might actually be in an unexpected order, hence my caution)11:43.07 
pedr0 :-)11:44.02 
  I see, but the coordinate space is (0,0) bottom left of the plane/page11:45.04 
malc_ pedr0: innocent questions only appear to be that on a very shallow inspection, look deeper and the con of worms presents itself11:45.12 
pedr0 :-))11:45.53 
malc_ con... sigh... can...11:46.38 
malc_ crawls back under his rock11:46.57 
pedr0 eheh11:47.10 
Robin_Watts_ pedr0: For PDF, yes. For mupdf, no :)11:47.45 
  basically, you shouldn't assume anything about the position of the quads.11:48.22 
  If the page is rotated, then obviously they'll be completely messed up.11:48.39 
  All you need to know is that they are 4 positions on the page that are the corners of the highlights.11:49.07 
kens IIRC rectangles have to specify two opposite corners, they need not be lower left and upper right11:49.25 
  In PDF11:49.35 
Robin_Watts_ kens: Right. But the image of those rectangles after transformation may not be axis aligned. Hence why corners of such rectangles are given as quads.11:50.40 
kens Which is why Ghostscript's PDF interpreter has code specifically for 'normalising' rectangles11:50.47 
  OK I wasn't certain if the question was PDF or MuPDF11:51.24 
Robin_Watts_ See page 634 of pdf_reference17.pdf11:51.25 
  It's both a PDF and a MuPDF thing.11:51.39 
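The 'normalising' kens mentions can be sketched roughly as follows. This is only an illustration in Python, not Ghostscript's actual code; the function name `normalise_rect` is made up for the example. It takes any two opposite corners and reorders them as (lower-left, upper-right) in PDF's y-up coordinate space.

```python
def normalise_rect(x0, y0, x1, y1):
    """Return (llx, lly, urx, ury) for a rectangle given by any two opposite corners.

    Hypothetical helper for illustration only; assumes PDF-style coordinates
    where y increases upwards.
    """
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


# Both calls describe the same rectangle using different corner pairs.
print(normalise_rect(10, 20, 110, 70))
print(normalise_rect(110, 20, 10, 70))
```

Note that this only makes sense while the rectangle is still axis aligned, which is the point Robin_Watts_ makes about quads.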
pedr0 I see, but given the stext 'view' is there an easy way to know the coordinates *on the screen* of a given <char> ?12:08.42 
Robin_Watts_ Let's talk about a concrete example.12:09.22 
  platform/win32/debug/mutool.exe draw -o - -F stext ../MyTests/pdf_reference17.pdf 1 12:10.43 
  That gives me an stext dump for page 1 of pdf_reference17.pdf - I believe you have the same file, yes?12:11.10 
pedr0 one jiff12:11.20 
  I've the book, hard paper of a previous version ..12:11.52 
Robin_Watts_ ok, so fetch http://ghostscript.com/~robin/pdf_reference17.pdf and you have the file too :)12:12.17 
pedr0 thanks, doing it12:12.43 
  ok, here I am12:14.39 
  got it, right in front of me12:14.52 
Robin_Watts_ So, for the first character on the page, a 'P', I get:12:14.55 
  <char quad="119.94 61.226316 146.772 61.226316 119.94 115.898319 146.772 115.898319" x="119.94" y="103.898319" color="#000000" c="P"/>12:15.04 
pedr0 Yap12:15.09 
Robin_Watts_ So, the 'origin' for that character is at the x/y position: 119.94, 103.898319. 12:15.38 
  The origin being (for latin fonts at least), the left hand side on the baseline of the char.12:16.01 
pedr0 relative to the page ?12:16.15 
kens Fonts have the origin at bottom left, even for non-Latin glyphs12:16.24 
pedr0 x/y of the box describing the page or the glyph ?12:16.37 
Robin_Watts_ I haven't mentioned any box yet.12:17.10 
pedr0 sorry12:17.15 
Robin_Watts_ The x,y of the origin of the glyph is at the point I said.12:17.39 
  This is in mupdf coordinates, with (0,0) being the top left of the page, and y increasing downwards.12:18.28 
  the page being 531 wide by 666 down.12:19.16 
  OK so far?12:19.18 
pedr0 yes12:19.21 
Robin_Watts_ So, now let's look at the quad.12:19.31 
  The quad gives 4 points.12:19.43 
  If you draw the convex hull of those 4 points, you'll get a box that encloses the glyph.12:20.04 
  If you're happy to ignore non-axis aligned glyphs (or page rotations), then you can find the bbox for the glyph by taking the union of those 4 points (i.e. the min and max of their x and y values).12:21.33 
  Does that make sense?12:21.52 
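The bbox step described above might look something like this in Python, using the `<char>` element quoted earlier in the log. It is a sketch under assumptions: the eight numbers in `quad` are taken to be in the order ul, ur, ll, lr (which matches the sample values but is not confirmed in this conversation), and the helper names `parse_quad` and `quad_bbox` are invented for the example.

```python
import xml.etree.ElementTree as ET

# The <char> element exactly as quoted in the log, split over lines for width.
CHAR_XML = ('<char quad="119.94 61.226316 146.772 61.226316 '
            '119.94 115.898319 146.772 115.898319" '
            'x="119.94" y="103.898319" color="#000000" c="P"/>')


def parse_quad(char_elem):
    """Split the quad attribute into four named (x, y) points."""
    values = [float(v) for v in char_elem.get("quad").split()]
    names = ("ul", "ur", "ll", "lr")  # assumed attribute order
    return {name: (values[2 * i], values[2 * i + 1])
            for i, name in enumerate(names)}


def quad_bbox(quad):
    """Axis-aligned bounding box (x0, y0, x1, y1) of the four points."""
    xs = [p[0] for p in quad.values()]
    ys = [p[1] for p in quad.values()]
    return (min(xs), min(ys), max(xs), max(ys))


char = ET.fromstring(CHAR_XML)
quad = parse_quad(char)
bbox = quad_bbox(quad)
print(bbox)
```

Because it takes the min and max over all four points, the result does not depend on which corner is which, so it stays valid even if the ordering assumption is wrong.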
pedr0 Let me try to digest this a little bit12:24.25 
  the coordinate of the quad, are they MUPDF or 'PDF' ?12:25.01 
  I think they are MUPDF, but I am double checking12:25.14 
  No, I am confused, but it isn't your explanation; it's that I am not sufficiently acquainted with the matter.12:30.31 
  The reason why I am banging my head is that I am drawing a PDF using PDF.js on a canvas, fair enough, I set the scale to be 1, hence not scaled. Then I draw the quads on top of it and I can tell that the positions don't match at all12:45.28 
  Although I am starting to be skeptical about the rendering as well.12:46.32 
Robin_Watts Sorry, powercut.13:17.33 
  all the coords are mupdf, obviously. Mixing them within a file would be mad.13:17.50 
pedr0 :-)13:22.04 
  I found the problem, it all makes sense. I still don't get why a rotation messes the quads up as I thought they would 'include' the rotation. But I definitely ignorant on the matter and I don't need to understand everything in a single shot13:23.34 
  *I am*13:23.48 
Robin_Watts "messes the quads up" ?13:25.04 
  I suspect that 'ul' 'll' etc are in terms of the pre-transformed text objects.13:26.07 
  after the transformation, the upper left corner may not be the upper left corner any more.13:26.28 
  (in terms of the position on the page)13:26.41 
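A small Python illustration of that point, assuming (as suspected above, not confirmed) that the names refer to the corners before transformation. It applies a quarter turn to the sample quad for the 'P' and then asks which named corner ends up top-left on the page. The helpers `rotate_quarter_turn` and `top_left_name` are made up for the example, and the rotation is about the origin only; a real page rotation would also translate the result, which does not change the outcome here.

```python
def rotate_quarter_turn(point):
    """Rotate a point 90 degrees about the origin.

    In a y-down coordinate space this is a clockwise turn as seen on screen.
    """
    x, y = point
    return (-y, x)


def top_left_name(quad):
    """Name of the corner that is visually top-left: smallest y, then smallest x."""
    return min(quad, key=lambda name: (quad[name][1], quad[name][0]))


# Sample quad for the 'P', in MuPDF coordinates (y increases downwards).
quad = {
    "ul": (119.94, 61.226316),
    "ur": (146.772, 61.226316),
    "ll": (119.94, 115.898319),
    "lr": (146.772, 115.898319),
}

rotated = {name: rotate_quarter_turn(p) for name, p in quad.items()}

print(top_left_name(quad))     # before rotation
print(top_left_name(rotated))  # after rotation
```

Before the rotation the visually top-left corner is `ul`; after it, the corner still named `ll` sits at the top-left instead.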
ator Robin_Watts: that's why I named them ul ll ur lr because adobe can't keep the numeric ordering of them straight13:36.05 
  mupdf uses its own coordinate system that it shares with all the possible input formats13:36.22 
  pdf.js uses the pdf coordinate system13:36.28 
  usually the translation between them is trivial, but rotation and UserUnit can easily mess you up13:36.44 
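For the trivial case, the translation might be sketched like this in Python. It assumes an unrotated page, a UserUnit of 1 and a page box whose lower-left corner is at (0,0); none of that is guaranteed for a real file, and the function names are invented for the example. The page height of 666 is the one given earlier in the log.

```python
def mupdf_to_pdf(x, y, page_height):
    """MuPDF (origin top-left, y down) to PDF (origin bottom-left, y up).

    Assumes no rotation, UserUnit 1 and a page box starting at (0,0).
    """
    return (x, page_height - y)


def pdf_to_mupdf(x, y, page_height):
    """Inverse of mupdf_to_pdf under the same assumptions (the flip is symmetric)."""
    return (x, page_height - y)


PAGE_HEIGHT = 666  # "the page being 531 wide by 666 down"

# Origin of the 'P' from the stext dump, converted to PDF coordinates.
origin_pdf = mupdf_to_pdf(119.94, 103.898319, PAGE_HEIGHT)
print(origin_pdf)

# Converting back should give (approximately) the original point.
print(pdf_to_mupdf(*origin_pdf, PAGE_HEIGHT))
```

Only y changes; once rotation or a UserUnit other than 1 is involved, a full matrix is needed instead of this flip.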
pedr0 Robin_Watts: does the same reasoning apply to x,y ?14:00.47 
  They are coordinates before the text-object transformation?14:01.06 
Robin_Watts pedr0: All coords given are post transformation.14:01.19 
pedr0 Ah, I see. You meant the 'names' are in terms of the pre-transformed object14:02.11 
  ul, ll ..14:02.15 
Robin_Watts yes.14:02.15 
pedr0 Thanks :-)14:02.20 