[jbig2-dev] Re: UCS-2 interpretation
William Rucklidge
wjr@imarkets.com
Mon, 22 Jul 2002 16:21:35 -0700
> I have a simpler spec question for you. :)
Happy to answer. Sorry it's taken me a while to reply - I was on vacation,
but am back now.
> In section 7.4.15.2 'Multi-byte coded comment' the character encoding is
> given as UCS-2. I understand UCS-2 doesn't actually specify the byte
> order, though I haven't verified this in the referenced ISO document. I
> would expect it to be big-endian, as with the rest of the spec, but
> wanted to confirm that.
If the byte order is unspecified by the UCS-2 spec, then per the general
rule for encoding of multi-byte quantities, it should be big-endian.
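As a rough sketch of what that rule would mean for a decoder (Python here; the function name decode_ucs2_be is hypothetical and does not come from the spec), each character would be read as two bytes, high-order byte first:

```python
import struct


def decode_ucs2_be(data: bytes) -> str:
    """Decode bytes as big-endian UCS-2: one 16-bit code unit per character.

    Assumption: the comment text is a plain run of 16-bit units with no
    byte-order mark. Units are mapped one-to-one to code points; nothing
    here combines pairs of units into a single character.
    """
    if len(data) % 2 != 0:
        raise ValueError("UCS-2 data must contain an even number of bytes")
    count = len(data) // 2
    # ">" selects big-endian order; "H" is an unsigned 16-bit integer.
    units = struct.unpack(f">{count}H", data)
    return "".join(chr(unit) for unit in units)
```

For example, the four bytes 00 41 00 E9 would decode to the two characters U+0041 and U+00E9 under this reading.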
> I'm also puzzled that UCS-2 was specified instead of UTF-16. Do you have
> any insight into current and expected practice there?
Basically, chalk this up to ignorance - my ignorance in particular. I
wanted to make sure that there was some way of putting non-ASCII text into
comment segments, but I hadn't had any experience with actual i18n encodings.
I'd heard that ISO 10646 defined a two-byte encoding called UCS-2, so I
specified the use of that - and that exhausted my knowledge (I have no idea
what UTF-16 is or how it might differ from UCS-2). If UCS-2 was a poor
choice, I apologise. If I'd known that the byte order wasn't specified,
I'd at least have added a note clarifying it.
-wjr