2005-06-19

Unicode: Chillu: Confusing code points as base characters

Rachana has a genuine concern that by encoding chillus we are giving them character status. That is not true. Code points are not base characters. This is explained, to some extent, in the Character Encoding Model (UTR #17).
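As a rough illustration of the point that a code point need not be a base character, the sketch below uses Python's standard `unicodedata` module to look up the general category of a few code points. The choice of these three code points is mine, made only for illustration.

```python
import unicodedata

# A few code points that are encoded but are not base letters.
SAMPLES = ["\u0D03", "\u0D4D", "\u200D"]

def describe(ch):
    """Return (code point label, Unicode name, general category) for one code point."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

for ch in SAMPLES:
    print(describe(ch))
```

The categories reported here are a spacing mark, a non-spacing mark and a format control, none of which is a letter, yet each has its own code point.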

Regarding the collation, Rachana states:
"Only when two characters or sequences differ in [collation] value (or weight) at the primary level, is there a need to differentiate them at the encoding level."
This is also not true. Consider:
  • Even though English lowercase and uppercase characters are encoded separately, they differ only at the tertiary level in the Default Unicode Collation Element Table (DUCET). See UTS #10.
  • There are a lot of code points without any primary weight. An Indic example would be Visarga.
Philosophically speaking, characters are for humans and code points are for computers. The Collation Element Table is one way to connect the two.
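To make the multi-level idea concrete, here is a minimal sketch of a toy collation element table in Python. The table, the function name `sort_key` and all numeric weights are invented for illustration and are not actual DUCET values; only the pattern is meant to mirror the two points above (case differing at the tertiary level, and a code point with no primary weight). It also skips details of the real algorithm in UTS #10, such as normalization and contractions.

```python
# Toy collation element table: code point -> (primary, secondary, tertiary).
# All weights are hypothetical; a weight of 0 means "ignorable at this level".
TOY_TABLE = {
    "a": (100, 1, 1),
    "A": (100, 1, 2),       # same primary and secondary as "a", different tertiary
    "b": (101, 1, 1),
    "\u0301": (0, 2, 1),    # COMBINING ACUTE ACCENT: no primary weight
}

def sort_key(text, levels=3):
    """Build a level-by-level sort key, dropping zero (ignorable) weights."""
    elements = [TOY_TABLE[ch] for ch in text]
    return tuple(
        tuple(e[level] for e in elements if e[level] != 0)
        for level in range(levels)
    )

# Separately encoded, yet equal at the primary level.
print(ord("a") != ord("A"), sort_key("a", levels=1) == sort_key("A", levels=1))

# A code point without a primary weight does not change the primary key.
print(sort_key("a\u0301", levels=1) == sort_key("a", levels=1))
```

In this toy model, "a" and "A" are distinct code points that compare equal until the tertiary level, and the accent only makes a difference from the secondary level on. Neither fact stopped them from being encoded separately.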
