Language technology

Language technology support is nowadays essential for the maintenance of a language. In everyday life Estonians can conduct their business in Estonian using mobile telephones, computers and cash machines. Internal business information systems, financial accounting software for small businesses, the OpenOffice office suite, the Opera browser, and to some extent Linux, Microsoft Windows XP and Microsoft Office are all available in Estonian.

As for the Estonian language itself, information is available from electronic glossaries and corpora, and there are simple text processing aids such as spell-checkers and hyphenation tools. Language software and some electronic glossaries are available on KeeleWeb, and further glossaries on the Keelevara web-site. There one can, for example, search for antonyms, synonyms and phrases, and consult foreign-language and legal dictionaries. The Estonian Language Institute's search engines help people to find place names from around the world in a database, dialect words in a dictionary, and etymologies in a glossary or morphological dictionary. Linguists can take pleasure in the electronic text corpora on the Internet, the largest of which are the Estonian Language Institute's corpus (thirteen million words), the University of Tartu corpus of the Estonian literary language (four million words), the corpus of the old literary language and the dialect corpus. By agreement it is possible to do research using the University of Tartu spoken corpus and the dialogue corpus.

Thanks to the work that has been done, it is possible to use a computer to determine the form of a word or an element in a sentence, read out a prepared text, speed up the compilation of a dictionary and add grammatical information to a corpus. Step by step there is movement towards increasingly complex goals, such as automatic speech recognition and machine translation programmes. At the same time the aim is to get the computer to comprehend Estonian speech and take part in a simple exchange of information.

