Year of publication: 2014
Author: Timothy Anondo
Publisher: LAP Lambert Academic Publishing
Pages: 96
ISBN: 9783659593093
Description
So-called ‘resource-poor’ languages are largely ignored in the development of information and communication technologies. Kimiiru, a Kenyan Bantu language, can be classified as a resource-scarce language (RSL) with respect to language technology resources, tools and applications. In addition, Kimiiru orthography contains two diacritically marked characters (variants of i and u) that require extra keystrokes to type, so users often opt for the unmarked equivalents, which results in non-standard Kimiiru texts. These characters also pose a challenge for automated corpus collection methods, such as those using optical character recognition (OCR). This book explores the development of an open source spelling checker for the Kimiiru language using the Hunspell language tools, with a view to understanding the fundamental principles and morphological composition of Kimiiru nouns and verbs. An insight into the development of a suggestion component used to generate...