[jbig2-dev] Multi-page sample document
Hartmut Henkel
hartmut_henkel@gmx.de
Mon, 9 Dec 2002 22:53:07 +0100 (CET)
Dear jbig2-dev team,
thanks again for the multi-page JBIG2 test stream that you sent me.
It is now included in a PDF file, using pdfTeX with a crude
experimental driver (on my homepage). Acroread could show all three
`capac' images.
Now some unclear points come up, and I am writing to you, perhaps
off-topic, in the hope that some JBIG2 specialist out there can help me
understand the page 0 issue, particularly its relationship to the PDF
standard.
The JBIG2 test stream mentioned above is roughly structured as follows,
where each digit is a segment's page association and the vertical bars
are end-of-page flags.
01111111|2222222|30333|
Am I right that decoding pages 1 and 2 requires the page 0 info at the
beginning, and that decoding page 3 requires the same page 0 info,
augmented by the additional page 0 info from the segment in between the
page 3 segments?
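To make this reading of the layout concrete, here is a minimal Python
sketch. It assumes (this is exactly the point being asked about, not a
confirmed rule) that a page can only use page 0 segments that occur
before its end-of-page flag. The function name and the use of stream
positions as stand-ins for segment numbers are illustrative only.

```python
def page0_positions_per_page(layout: str) -> dict[int, list[int]]:
    """Map each page to the positions of page 0 segments seen before it ends.

    `layout` uses one digit per segment (its page association) and '|'
    for an end-of-page flag, as in the diagram above. Single-digit page
    numbers only; positions count segments from 0 and ignore the bars.
    """
    seen_page0: list[int] = []          # positions of page 0 segments so far
    result: dict[int, list[int]] = {}
    current_page = None
    position = 0
    for ch in layout:
        if ch == "|":
            # End of page: record the page 0 segments available up to here.
            if current_page is not None:
                result[current_page] = list(seen_page0)
            current_page = None
            continue
        page = int(ch)
        if page == 0:
            seen_page0.append(position)
        else:
            current_page = page
        position += 1
    return result


print(page0_positions_per_page("01111111|2222222|30333|"))
# prints {1: [0], 2: [0], 3: [0, 16]}
```

Under that assumption, pages 1 and 2 see only the leading page 0
segment, while page 3 additionally sees the one at position 16.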
Is it OK to make a page 0 stream in PDF that simply collects ALL
available page 0 info? The additional page 0 info from in between the
page 3 segments would not hurt pages 1 and 2, as its segment numbers
are higher; is this the principle?
Do you know whether the order of segments within page 0 matters?
Can one build a mapping of which page 0 segment is required by which
image page, to reduce the page 0 info to the minimum required for a
subset of pages?
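One possible way to build such a mapping is sketched below in Python. It
assumes that every dependency between segments is expressed through the
referred-to segment numbers in the segment headers, so that following
those references transitively from a page's segments yields the page 0
segments it needs. Whether that assumption is sufficient is part of the
question. The table of segments is invented for illustration and is not
taken from the test stream.

```python
def required_page0(segments, wanted_pages):
    """Return the sorted numbers of page 0 segments reachable from wanted pages.

    `segments` maps segment number -> (page association, list of
    referred-to segment numbers). This layout is hypothetical.
    """
    # Start from every segment that belongs to one of the wanted pages.
    stack = [n for n, (page, _) in segments.items() if page in wanted_pages]
    visited = set()
    needed = set()
    while stack:
        number = stack.pop()
        if number in visited:
            continue
        visited.add(number)
        page, referred = segments[number]
        if page == 0:
            needed.add(number)
        # Follow references transitively (page 0 segments may refer to others).
        stack.extend(referred)
    return sorted(needed)


# Invented example: segment 4 is a later page 0 segment used only by page 3.
example = {
    0: (0, []),
    1: (1, [0]),
    2: (2, [0]),
    3: (3, []),
    4: (0, [0]),
    5: (3, [4]),
}
print(required_page0(example, {1, 2}))  # prints [0]
print(required_page0(example, {3}))     # prints [0, 4]
```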
The PDF standard requires that there be only ONE page 0 stream. To
benefit from the optimum JBIG2 multi-page compression, it seems that one
needs to know beforehand what the final amount of page 0 info for a set
of pages will be, because once it is inside a PDF object, the page 0
stream cannot be augmented by an additional stream. Or do you know of a
PDF feature for augmenting one stream with another?
These questions are roughly about the timing and organization of JBIG2
embedding into PDF. For example, if page 2 is to be embedded, and LATER
also page 3, what should be done with the page 0 stream, as it cannot be
extended once shipped out as PDF? One would either have to include all
page 0 info for pages 2 and 3 from the beginning, or include the page 0
info for page 2 and later, separately, the extended page 0 info for
page 3 (which spoils compression by duplicating part of the already
existing info). In the latter case, pages 2 and 3 would be completely
independent of each other.
Another, more general question: is it OK just to collect the segments
and pipe them into the PDF streams for page 0 and for the page?
Currently I do only this, with no decoding at all, as it seems that no
decoding is required for assembling the PDF. I only reset all page
numbers > 1 to 1, as required by the PDF standard. It is just a matter
of sorting the segments into the page 0 stream and the XObject streams.
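The sorting step described above might look roughly like the following
Python sketch. The `Segment` record and `split_for_pdf` function are
hypothetical stand-ins; real code would parse and rewrite the binary
segment headers, which is not shown here. The sketch collects ALL page 0
segments into one stream, as proposed earlier, and resets the page
association of the selected page's segments to 1.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """Hypothetical, already-parsed view of one JBIG2 segment."""
    number: int        # segment number
    page: int          # page association (0 = not tied to a single page)
    data: bytes = b""  # raw segment bytes, passed through without decoding


def split_for_pdf(segments, wanted_page):
    """Sort segments into a page 0 stream and one page's XObject stream."""
    # All page 0 segments, in their original order.
    page0_stream = [s for s in segments if s.page == 0]
    # The wanted page's segments, with the page association reset to 1.
    page_stream = [replace(s, page=1) for s in segments if s.page == wanted_page]
    return page0_stream, page_stream


# Rebuild the layout of the test stream; positions stand in for segment numbers.
pages = [int(c) for c in "01111111|2222222|30333|" if c != "|"]
stream = [Segment(number=i, page=p) for i, p in enumerate(pages)]

page0, page3 = split_for_pdf(stream, 3)
print([s.number for s in page0])           # prints [0, 16]
print([(s.number, s.page) for s in page3])
# prints [(15, 1), (17, 1), (18, 1), (19, 1)]
```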
I would be glad if you could check whether anything above looks wrong.
Any hint from you is very welcome!
BTW, if you have more multi-page JBIG2 test files, or know of pointers
to them, this would be a really great help for my testing. There are no
statistics if you have only 1 test file :-)
Greetings Hartmut
On Wed, 13 Nov 2002, William Rucklidge wrote:
> The JBIG2 specification includes such a sample file, which attempts to
> explore as many coding options as possible. Here it is, in hex (yeah,
> sorry, but this is extracted straight from the source of the
> specification):
>
> 0000: 97 4A 42 32 0D 0A 1A 0A 01 00 00 00 03 00 00 00
> ...
> -wjr
------------------------------------------------------------------------
Dr.-Ing. Hartmut Henkel
In den Auwiesen 6, D-68723 Oftersheim, Germany
E-Mail: hartmut_henkel@gmx.de
http://www.circuitwizard.de
------------------------------------------------------------------------