Re: Word-breaking in complex scripts (was Re: supporting XIM)
Quoting Pablo Saratxaga <pablo@xxxxxxxxxxxxxxxx>:
> Kaixo!
>
> On Mon, Mar 31, 2003 at 09:08:23PM -0800, Edward Cherlin wrote:
>
> > Word breaking is not required in Indic writing systems. When a
> > word with a final consonant is followed by a word with an
> > initial vowel, the consonant and vowel are combined. A final
> > consonant and a following initial consonant form a conjunct. But
> > it is permitted to write the words separately on successive
> > lines.
>
> Indic writing systems don't use spaces between words?
Traditionally, no. AFAIK, this was the case with every Brahmi-derived script
until very recent times. Westerners have successfully introduced visible spaces
into most scripts used in modern-day India, but older texts, and even now many
of the more distant Brahmi derivatives such as Thai, Lao and Khmer, use no
word-separating spaces at all (though there are sometimes spaces between
sentences).
If you look at texts written in Sanskrit, the authentic ones use no spaces,
while their Western editions insert spaces for the convenience of European
readers, but only in a limited environment, i.e. where the kind of conjunct
Edward is describing does _not_ occur. That environment is quite rare, which is
why many Sanskrit "words" look very long. See, for instance, the images at
http://www.devyani.com/mt/vaidya1.htm
In reality, most of the long ones are sequences of several words, pronounced so
that the final sound of one word falls into the same orthographic syllable as
the initial sound of the next. That is why they are written without an
intervening space even here, outside India.
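
For anyone who wants to see the orthographic joining in terms of Unicode code
points, here is a rough Python sketch for Devanagari. It is only an
illustration under simplifying assumptions: the function name join_words and
the table SIGN_FOR_VOWEL are made up for this example, and it deliberately
ignores the sound changes of real Sanskrit sandhi (voicing, vowel merging and
so on), so it shows only how the written syllables fuse, not how an editor
would actually produce correct Sanskrit.

```python
# Sketch only: purely orthographic joining of Devanagari words.
# Real sandhi also changes sounds; that is NOT modelled here.

VIRAMA = "\u094D"  # marks a consonant with no following vowel

# Independent vowel letter -> dependent vowel sign (hypothetical helper table).
# U+0905 (a) maps to "" because "a" is inherent in a bare consonant.
SIGN_FOR_VOWEL = {
    "\u0905": "",        # a
    "\u0906": "\u093E",  # aa
    "\u0907": "\u093F",  # i
    "\u0908": "\u0940",  # ii
    "\u0909": "\u0941",  # u
    "\u090A": "\u0942",  # uu
    "\u090B": "\u0943",  # vocalic r
    "\u090F": "\u0947",  # e
    "\u0910": "\u0948",  # ai
    "\u0913": "\u094B",  # o
    "\u0914": "\u094C",  # au
}


def is_consonant(ch):
    """True for the basic Devanagari consonants KA..HA."""
    return "\u0915" <= ch <= "\u0939"


def join_words(words):
    """Join words the way a spaced edition would write them.

    A final consonant (ending in virama) fuses with a following initial
    vowel or consonant; otherwise a space is kept between the words.
    """
    out = ""
    for word in words:
        if not word:
            continue
        first = word[0]
        if out.endswith(VIRAMA) and first in SIGN_FOR_VOWEL:
            # Drop the virama; the vowel becomes a sign on the consonant.
            out = out[:-1] + SIGN_FOR_VOWEL[first] + word[1:]
        elif out.endswith(VIRAMA) and is_consonant(first):
            # Virama followed by a consonant is rendered as a conjunct.
            out += word
        else:
            out += (" " if out else "") + word
    return out
```

With this, the two words "aham" and "asmi" come out as one written unit, as do
"tat" and "tvam", while "na" and "ca" (no final consonant involved) stay
separated by a space.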
Best regards,
Miikka-Markus Alhonen
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/