[fitz-dev] display tree models
Raph Levien
raph.levien at artifex.com
Fri Oct 10 22:58:29 PDT 2003
On Thu, Oct 09, 2003 at 04:06:40PM +0200, tor wrote:
> Hi Raph et al.
>
> I've given some thought to various ways to represent the page tree.
> There are two wildly different representations that I think are fairly
> good.
>
> The first is my original proposition from way back, where the nodes are
> graphical objects and all attributes are inherited from parent. This one
> would have nodes such as Path, Text, Image as graphical objects. Attribute
> setting nodes such as Transform, Color, LineAttrs would override the
> graphics state used for their children. Clipping would be a special case
> node, as would PDF 1.4 transparency...
>
> The special cases talk against it, but there are some advantages because
> it matches the PDF graphics model better in the case of XObject forms and
> Type3 fonts. Extra care is needed in the caching system, because the
> way a node is rendered depends on the attributes inherited from above,
> so the gstate needs to be part of the cache key.
I agree that it matches the PDF graphics model better. If we're to go
down this road, one question is whether to match PDF exactly, or only
partway. An obvious example is whether to split the graphics state
along stroke/non-stroke lines.
If we match PDF _exactly_, then XObject forms can always be rendered
with sharing. Otherwise, there may be cases where the shared node
would need to be copied, with different in-tree instantiations of the
appropriate PDF graphics state.
Also note that if line width and miter limit are part of the inherited
graphics state, then the computation of bounding boxes is
state-dependent. I find this more than a bit unappealing - I'd really
like to see each node have one bbox, and have that stored in the
node data structure, for fast traversal.
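
To make the bbox concern concrete, here is a minimal sketch, not Fitz code. The names (BBox, path_bbox, stroke_bbox) are hypothetical, and the margin is a conservative bound under the assumption that a miter tip reaches at most miter_limit * line_width / 2 from a vertex and a square cap corner at most sqrt(2) * line_width / 2.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    x0: float
    y0: float
    x1: float
    y1: float

    def expand(self, margin):
        # Grow the box by the same margin on every side.
        return BBox(self.x0 - margin, self.y0 - margin,
                    self.x1 + margin, self.y1 + margin)


def path_bbox(points):
    # Bounds of the bare geometry: depends only on the path itself.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return BBox(min(xs), min(ys), max(xs), max(ys))


def stroke_bbox(points, line_width, miter_limit):
    # Conservative bounds of the stroked path. The margin depends on
    # graphics state, so it cannot be stored once per node if that
    # state is inherited from the parent.
    margin = 0.5 * line_width * max(math.sqrt(2.0), miter_limit)
    return path_bbox(points).expand(margin)


pts = [(0, 0), (10, 5)]
print(stroke_bbox(pts, line_width=2, miter_limit=10))
print(stroke_bbox(pts, line_width=4, miter_limit=10))
```

The same shared path node yields two different boxes depending on the inherited line width, which is exactly what rules out a single stored bbox per node.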
> The second is a distillation of our talk about thinking of the tree
> as a functional program from yesterday. I'm not sure if it holds up,
> but I thought I'd bring the idea up for discussion, because I find it
> strangely appealing in its simplicity.
>
> At the core of the model are abstract images, not the bitmap kind. The
> nodes of the tree are functions that take zero, one, or more arguments
> (which are images) and produce a new image.
The cool thing about thinking functionally is that you get to play with
what the signatures are of the functions. Would it still fit your
model if we considered an "abstract image" to be a function from
affine transformation to raster image? (skimming over issues such as
antialiasing)
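
One way to read that signature, as a rough sketch only: an abstract image is a callable taking an affine matrix (plus a raster size, which is an added assumption here) and returning a raster. All names are hypothetical, matrices use the PDF-style six-number form, and sampling is one point per pixel with no antialiasing.

```python
IDENTITY = (1, 0, 0, 1, 0, 0)  # x' = a*x + c*y + e, y' = b*x + d*y + f


def apply(m, x, y):
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def compose(first, second):
    # Matrix for "apply first, then second".
    a1, b1, c1, d1, e1, f1 = first
    a2, b2, c2, d2, e2, f2 = second
    return (a2 * a1 + c2 * b1, b2 * a1 + d2 * b1,
            a2 * c1 + c2 * d1, b2 * c1 + d2 * d1,
            a2 * e1 + c2 * f1 + e2, b2 * e1 + d2 * f1 + f2)


def invert(m):
    a, b, c, d, e, f = m
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return (ia, ib, ic, id_, -(ia * e + ic * f), -(ib * e + id_ * f))


def unit_square(transform, width, height):
    # Source image: coverage mask of the unit square, sampled at pixel
    # centers by mapping device coordinates back to user space.
    inv = invert(transform)
    rows = []
    for py in range(height):
        row = []
        for px in range(width):
            x, y = apply(inv, px + 0.5, py + 0.5)
            row.append(1 if 0 <= x < 1 and 0 <= y < 1 else 0)
        rows.append(row)
    return rows


def transformed(image, m):
    # Transform node: no pixels are touched, the matrix is just
    # concatenated onto whatever the caller passes down.
    return lambda t, w, h: image(compose(m, t), w, h)


doubled = transformed(unit_square, (2, 0, 0, 2, 0, 0))
for row in doubled(IDENTITY, 4, 4):
    print(row)
```

With this reading, a Transform node never resamples anything; it only rewrites the argument handed to its child.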
> Node types Path and Text produce an image which masks the area covered,
> and take no arguments. Color produces an image of a solid color.
> Transform takes one image, applies an affine transform on that image
> yielding a new image. In and Over are compositing nodes that take two
> (or more) images, combine them a la Porter-Duff.
Right. Images have either color or shape, or both, for each pixel, yes?
> The images have, as an optimisation, a bounding box. All source nodes
> (the ones with no arguments, that only produce data) have natural bounds,
> except Color which is unbounded. In intersects, and Over unions. Transform
> transforms :)
>
> [see attached illustration for an example of a display tree]
>
> A renderer would of course do various optimisations when rendering, such
> as keeping the transform as a graphics state passed down when rendering
> child nodes, and getting rid of intermediate images by combining various
> operations and compositing, as when an In node has two children, a Color
> and a Path.
Right. This is fairly consistent with what I have in mind, except that
in my data model, Path and Text nodes have implicit In nodes built-in.
In the PDF imaging model, Path and Text always either paint (in which case
the "color" is a child node) or clip. I'm not sure what the benefit is of
having a separate In node, aside from beauty.
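
For reference, the bounding-box rules quoted above (sources have natural bounds, Color is unbounded, In intersects, Over unions, Transform transforms) can be sketched as below. The tuple-based node encoding is purely illustrative and is not a proposed data structure; an empty intersection is left as an inverted box.

```python
import math

INF = math.inf
UNBOUNDED = (-INF, -INF, INF, INF)  # (x0, y0, x1, y1)


def intersect(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def transform_bbox(m, box):
    if box == UNBOUNDED:
        return UNBOUNDED
    a, b, c, d, e, f = m
    x0, y0, x1, y1 = box
    # Transform all four corners and take their axis-aligned bounds.
    corners = [(a * x + c * y + e, b * x + d * y + f)
               for x in (x0, x1) for y in (y0, y1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox(node):
    kind = node[0]
    if kind in ("path", "text"):
        return node[1]                      # natural bounds of a source
    if kind == "color":
        return UNBOUNDED                    # solid color covers everything
    if kind == "in":
        return intersect(bbox(node[1]), bbox(node[2]))
    if kind == "over":
        return union(bbox(node[1]), bbox(node[2]))
    if kind == "transform":
        return transform_bbox(node[1], bbox(node[2]))
    raise ValueError(kind)


red_square = ("in", ("color",), ("path", (0, 0, 10, 10)))
other = ("path", (5, 5, 20, 20))
print(bbox(red_square))
print(bbox(("over", red_square, other)))
print(bbox(("transform", (1, 0, 0, 1, 3, 4), red_square)))
```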
> How well would a scheme like this work with PDF 1.4 transparency?
In the general case, not all that well. They made PDF 1.4 too darned
powerful.
Let's look at that in a bit more detail. The PDF 1.4 compositing
operations can include a "blend mode", which looks something like
this:

    result pixel = Blend(source pixel, backdrop pixel)

There are a bunch of Blend operations, some of which are very regular
(such as Multiply), others of which are not. For example, SoftLight
can really only be understood as a 2-D lookup table.
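
As a small illustration of the "regular" end of that spectrum, here is a sketch of two separable blend functions applied per color component. It assumes fully opaque pixels with components in [0, 1] and ignores alpha entirely; the function names are made up for this example, and SoftLight is deliberately left out.

```python
def multiply(source, backdrop):
    # Multiply: the product of the two components.
    return source * backdrop


def screen(source, backdrop):
    # Screen: the complement of the product of the complements.
    return source + backdrop - source * backdrop


def blend_pixel(blend, source_pixel, backdrop_pixel):
    # result pixel = Blend(source pixel, backdrop pixel), one component
    # at a time.
    return tuple(blend(s, b) for s, b in zip(source_pixel, backdrop_pixel))


src = (1.0, 0.5, 0.0)
bg = (0.5, 0.5, 0.5)
print(blend_pixel(multiply, src, bg))
print(blend_pixel(screen, src, bg))
```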
PDF 1.4 arranges these compositing operations into a group structure,
which is a tree. At each such "transparency group" node in the tree,
you get to specify various graphics state parameters such as an
overall constant alpha.
However, the network of dependencies does not necessarily follow this
tree structure. If you have a group that contains two subgroups, the
rendering of the bottom subgroup is used as the "backdrop image" for the
purposes of rendering the top subgroup. Therefore, in the worst case,
there can be a serial chain of dependency threaded through all the
primitive objects in the tree, from bottom to top.
Note that classical Porter-Duff also fits into this model. If you look
at each painting step as a function from image to image, then a
painting step for an RGBA[1] image looks something like:

    lambda(backdrop): over(rgba, backdrop)

However, if all you're doing is Porter-Duff, you can take advantage of
the fact that these functions are associative and have an additive and
multiplicative unit (alpha = 0 and 1, respectively). Thus, it makes
sense to talk about images _not_ depending on a backdrop in any way.
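
A quick sketch of those properties, assuming premultiplied RGBA tuples with components in [0, 1] (the premultiplied representation and the names are assumptions of this example). The sample values are chosen to be exact in binary floating point so the comparisons hold exactly.

```python
def over(src, dst):
    # Porter-Duff "over" on premultiplied RGBA: src + (1 - src_alpha) * dst.
    k = 1.0 - src[3]
    return tuple(s + k * d for s, d in zip(src, dst))


def paint(rgba):
    # A painting step viewed as a function from backdrop to result.
    return lambda backdrop: over(rgba, backdrop)


CLEAR = (0.0, 0.0, 0.0, 0.0)   # alpha = 0: identity for over

a = (0.5, 0.0, 0.0, 0.5)
b = (0.0, 0.25, 0.0, 0.25)
c = (0.0, 0.0, 1.0, 1.0)       # alpha = 1: hides whatever is below it

print(over(a, over(b, c)) == over(over(a, b), c))  # associativity
print(over(CLEAR, a) == a and over(a, CLEAR) == a)  # alpha = 0 is a unit
print(over(c, a) == c)                              # opaque source wins
print(paint(a)(c))
```

Because of associativity, over(a, b) is a perfectly good image in its own right, computed without knowing what backdrop it will eventually land on.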
However, in full PDF 1.4, that won't fly in general. I do expect
standard Porter-Duff to be a lot more common in practice than all
these fancy blend modes. Probably the main place we'll see the latter
is when rendering Photoshop files in their new native-PDF format.
So to fit PDF 1.4 into a purely functional model, each node is a
function from backdrop image to result image. A group with children A
and B is implemented as lambda(backdrop): B(A(backdrop)). However, the
case where the function can be factored into a curried Over[2] with an
RGBA image is a rather important optimization.
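
A sketch of both halves of that paragraph, under the same assumption of premultiplied RGBA tuples; the names group and paint_over are hypothetical. Children are listed bottom to top, each child is a function from backdrop to result, and when every child is a curried Over the whole group collapses to a single RGBA image that can be composited once.

```python
def over(src, dst):
    # Porter-Duff "over" on premultiplied RGBA.
    k = 1.0 - src[3]
    return tuple(s + k * d for s, d in zip(src, dst))


def paint_over(rgba):
    # Curried Over: the RGBA image is fixed, the backdrop comes later.
    return lambda backdrop: over(rgba, backdrop)


def group(*children):
    # General case: thread the backdrop through the children, bottom
    # to top, so group(A, B) behaves as lambda(backdrop): B(A(backdrop)).
    def render(backdrop):
        for child in children:
            backdrop = child(backdrop)
        return backdrop
    return render


a = (0.5, 0.0, 0.0, 0.5)
b = (0.0, 0.25, 0.0, 0.25)
backdrop = (0.0, 0.0, 1.0, 1.0)

general = group(paint_over(a), paint_over(b))(backdrop)

# Optimized case: flatten the group to one image first (b over a),
# independent of any backdrop, then composite it in a single step.
flattened = over(b, a)
optimized = paint_over(flattened)(backdrop)

print(general)
print(optimized)
```

A child using a non-trivial blend mode would still fit the group() signature, but could not be folded into flattened ahead of time, since its result depends on the backdrop it receives.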
Raph
[1] I say RGBA for conciseness, but I do mean any concrete color space
with an added alpha component.
[2] Mmm, curried Over. I prefer curried trout, however.