Do we need Unicode Sinhala?

What is Unicode? (http://www.unicode.org/standard/WhatIsUnicode.html)
Quote
Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.
Unquote
History of Unicode Consortium (http://unicode.org/history/)
Quote
The Unicode Consortium was incorporated in January, 1991 in the state of California, four years after the concept of a new character encoding, to be called "Unicode", was broached in discussions started by engineers from Xerox (Joe Becker) and Apple (Lee Collins and Mark Davis).
Unquote
We have not heard of “English Unicode” or “Japanese Unicode”, yet we always hear of “Sinhala Unicode”. Why do we always talk about a “Sinhala Unicode” in Sri Lanka? Unicode works well with most languages, except for the Indic languages, whose scripts descend from the Brahmi script. The Brahmi script is the father of the scripts of the Indian subcontinent.
Sinhalese is written from left to right and has no capital letters. The writing system is called syllabic: the vowels and consonants are not represented as separate units, as in the Roman script, but as syllabic units in which the vowel is inherent in the consonant (Paranavithana, 1972). Historical sources suggest there were written communications in Sinhala amongst the royalty as far back as 4 BC (former Director General of the Department of Archeology Dr. Shiran Deraniyagala, excavations in Anuradhapura). The first Sinhala characters found in Sri Lanka were inscribed on the tops of caves that Buddhist monks used for meditation during 3 BC. These consisted of only a few very simple characters that were later modernised and diversified. The oldest script available, Siyabaslakara, is a work of the latter part of the Anuradhapura era (8 to 9 AD) (Indrasena, 2001).
The invasion of the Portuguese in 1505 once again hampered the development of the language. However, the Dutch (1656-1796), who succeeded the Portuguese, wished to propagate Protestant Christianity and therefore translated the Protestant Bible into Sinhala. The Dutch priest Jacome Gonsalves, who conducted the translation, added several new characters to the then existing alphabet. It was also the Dutch who first introduced the printing press to Sri Lanka. Nevertheless, the semantics of the Sinhala language was impeded for almost one and a half centuries until it was resurrected by the high priest, Venerable Walivita Saranankara, in the Kandyan era. It was during the Kandyan era that the template of the current Sinhala alphabet was established (Indrasena, 2001).
A Sinhala - English Dictionary was written by Rev B Clough and first published in 1830. A second revised edition was published in 1936 (reprints are available at Asian Educational Services, New Delhi). The second Sinhala - English Dictionary was published by Rev Charles Carter of the Baptist Missionary Society in 1924 (reprints are available at Asian Educational Services, New Delhi). I found that there was a very old Sinhala - English Dictionary by Mudliyer A Mendis Gunersekera (I was unable to find a copy). The other Sinhala - English Dictionary, published in 1948 (Dharma Samaya Printers), is by A P de
There are no publications, or even a simple chart, giving the complete Sinhala alphabet in Sri Lanka, except one published by me (ISBN 955-98975-0-0). In Sinhala, there are three versions of the alphabet. The basic pure Sinhala alphabet (Elu hodiya) consists of 12 vowels and 25 consonants. A mixed Sinhala alphabet (mishra sinhala akshara malawa) consists of 18 vowels and 41 consonants. However, the accepted Sinhala alphabet (sammatha sinhala akshara malawa) consists of 20 vowels and 41 consonants. In Sinhala, all consonants are expanded for each vowel combination. The present day Sinhala alphabet contains a total of over 1660 individual characters. There are many joint characters and special characters like “Repaya” and “Yansaya”.
Although there are close to two thousand Sinhala characters, if you ask anyone who uses the Sinhala language how many characters there are in the Sinhala alphabet, he or she will give a number less than 65. They consider only the vowels and consonants to be Sinhala characters (or Akuru). This is where the dilemma in the Sinhala standard arises. But if you ask a person who uses the Tamil language how many characters there are in the Tamil alphabet, the answer will always be 247, which is the total number of Tamil characters in their alphabet. When the Unicode Consortium was formed in 1991, Sri Lanka had no Sinhala standard registered at the Sri Lanka Standard Institute.
http://www.unicode.org/reports/tr2.html
The original Sinhala proposal was written by a person called Andy Daniels, not by any Sinhala person or Sri Lankan. In his proposal he writes as follows.
Quote
There is a standard extant for Sinhala described in A Standard Code for Information Interchange in Sinhalese by V.K. Samaranayake and S.T. Nandasara
(ISO-IEC JTC1/SCL/WG2 N 673, Oct. 1990). The coding proposed in it was found to be an inadequate basis for a modern, computer-based interchange code, though it is adequate to handle the capabilities of a Sinhala typewriter for representing contemporary colloquial Sinhala.
Unquote
The Unicode Consortium gave us time.
http://www.unicode.org/Public/TEXT/UTR-2.TXT
They gave us until 1993 to respond. We were sleeping. We were asked to respond, and still we did not. Do you know that Everson is proposing new numbers for Sinhala? Whose language is Sinhala?
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3195.pdf
In the year 2002, SLSI 1134 was proposed. I was the only person who objected to this proposal. My comments were overruled, and the Andy Daniels proposal was admitted with minor changes as the Sri Lanka Standard (SLSI 1134) for information technology. Today we face many problems as a result.
A number of errors have been detected in the textbooks for Sinhala Language and Literature printed for the academic year 2007. According to the Educational Publication Commissioner, Mr. N. Dharmasena, such mistakes had been detected in other textbooks as well. Apart from these errors, there is a strong communication gap between the rural and urban sectors.
The reasons are as follows:
• There is no compatibility of Sinhala text data across platforms. Text data created by one application is not readable in another application.
• There are no e-dictionaries, OCR (Optical Character Recognition), e-thesaurus or e-encyclopedia for the Sinhala language that work across all platforms.
• There is no standard for error correction among typesetters, proofreaders, layout artists and anybody else involved in the production process.
See
http://www.youtube.com/watch?v=_3dmkKLMXoE
http://www.youtube.com/watch?v=sqELl3kqXwY
All these problems are caused by not identifying all Sinhala characters and giving them absolute UTF values. The properties of the Latin capital letter “A”, as registered with the Unicode Consortium, are as follows.
Some properties of LATIN CAPITAL LETTER A
DECIMAL VALUE: 65
UTF-8 HEX VALUE: 0x41
UTF-16 HEX VALUE: 0x0041
UTF-32 HEX VALUE: 0x00000041
XHTML: A
BLOCK: Basic Latin
Likewise, the few Sinhala characters registered in Unicode have such properties. Consider the values for “Ka”:
Some properties of SINHALA LETTER ALPAPRAANA KAYANNA
DECIMAL VALUE: 3482
UTF-8 HEX VALUE: 0xE0B69A
UTF-16 HEX VALUE: 0x0D9A
UTF-32 HEX VALUE: 0x00000D9A
XHTML: ක
BLOCK: Sinhala
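The values listed above follow directly from the code point, so they can be checked on any machine. The following is a minimal Python sketch using only the standard library; the function name `describe` is a hypothetical helper introduced here for illustration, and the XHTML entry is assumed to mean the decimal numeric character reference.

```python
import unicodedata


def describe(ch):
    """Return a few encoding properties of a single Unicode character."""
    cp = ord(ch)
    return {
        "NAME": unicodedata.name(ch),
        "DECIMAL VALUE": cp,
        # Big-endian encodings are used so the hex digits read like the code point.
        "UTF-8 HEX VALUE": "0x" + ch.encode("utf-8").hex().upper(),
        "UTF-16 HEX VALUE": "0x" + ch.encode("utf-16-be").hex().upper(),
        "UTF-32 HEX VALUE": "0x" + ch.encode("utf-32-be").hex().upper(),
        # Assumption: "XHTML" refers to the numeric character reference.
        "XHTML": "&#%d;" % cp,
    }


# "A" and the Sinhala letter "Ka" (U+0D9A)
for ch in ("A", "\u0D9A"):
    for key, value in describe(ch).items():
        print(key + ":", value)
    print()
```

Both characters are handled by the same code path, because each one is a single registered code point.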
Unfortunately, there are no registrations for “Ki” or the other derivatives of “Ka” with the Unicode Consortium or the SLSI. Those characters do not have any UTF value of their own. Their values are hidden inside a “shaper”, or inside proprietary fonts or proprietary systems, and are not subject to any standard. Therefore Sinhala text is not compatible across all platforms, and software engineers are unable to use Sinhala in the way they use the Latin script to develop Sinhala software.
There are many more drawbacks. I have given the solution by publishing all Sinhala characters. For the betterment of this country, we should correct SLSI 1134.
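To make the point about “Ki” concrete: in the encoding as it stands, “Ki” is stored as two code points, the consonant “Ka” followed by a vowel sign, and not as one value of its own. The following minimal Python sketch, using the standard `unicodedata` module, shows this; the variable name `ki` is illustrative only.

```python
import unicodedata

# "Ki" as stored today: the letter Ka (U+0D9A) followed by a vowel sign (U+0DD2).
ki = "\u0D9A\u0DD2"

# The string holds two code points, even though a reader sees one character.
print(len(ki))
for ch in ki:
    print("U+%04X" % ord(ch), unicodedata.name(ch))

# Normalizing to NFC does not merge the pair into a single code point.
print(unicodedata.normalize("NFC", ki) == ki)
```

How the pair is drawn as one shape on screen is left to the font and the rendering software, which is the “shaper” referred to above.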


