Types of adjectives and pronouns of the Tajik language and their use to generate word-forms

Navruz Madibragimov, Alexander Prutzkow


Despite the informatization of all spheres of life, computational linguistics for the Tajik language remains underdeveloped, largely because little research has been done on the topic. As part of a project to formalize the inflection of natural languages for automatic processing of Tajik texts, a classification of Tajik adjectives and pronouns by type of form formation is proposed. The classification is based on a universal model of form formation, which assumes that inflection can be represented as a finite chain of transformations. A single type comprises the words whose forms are produced by the same chains of transformations. For 694 Tajik adjectives, 5 types and 2 subtypes of form formation are distinguished. For 32 words classed as Tajik pronouns, 5 types of form formation are identified. For each type and subtype, the distinctive features are described and the kinds of transformations used in the chains are indicated. This classification continues earlier research that began with the classification of Tajik nouns. It was used to populate the linguistic knowledge base of an Internet application that is available to other researchers and to learners of Tajik around the world. Using this knowledge base, the application generates Tajik word forms. Classification of the remaining parts of speech of the Tajik language is ongoing.
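
The idea that inflection is a finite chain of transformations can be illustrated with a minimal sketch. It is not the authors' implementation: the names `append`, `generate`, `word_forms` and `HYPOTHETICAL_TYPE` are assumptions made for illustration, and the single type shown does not correspond to any of the types or subtypes identified in the classification. It relies only on the well-known Tajik comparative suffix -тар and superlative suffix -тарин, here applied to the adjective калон ("big").

```python
from typing import Callable, Dict, List

# A transformation maps one intermediate word string to the next.
Transformation = Callable[[str], str]


def append(suffix: str) -> Transformation:
    """Return a transformation that appends `suffix` to a word (hypothetical helper)."""
    def apply(word: str) -> str:
        return word + suffix
    return apply


def generate(word: str, chain: List[Transformation]) -> str:
    """Apply a finite chain of transformations to `word`, in order."""
    for step in chain:
        word = step(word)
    return word


# Hypothetical type: every word assigned to it shares these chains.
# An empty chain leaves the base form unchanged.
HYPOTHETICAL_TYPE: Dict[str, List[Transformation]] = {
    "positive": [],
    "comparative": [append("тар")],
    "superlative": [append("тар"), append("ин")],
}


def word_forms(word: str) -> Dict[str, str]:
    """Generate all forms of `word` using the chains of the hypothetical type."""
    return {name: generate(word, chain) for name, chain in HYPOTHETICAL_TYPE.items()}


forms = word_forms("калон")
for name, form in forms.items():
    print(name, form)
```

Under this reading, a knowledge base would store for each word only its type, and all of its forms would follow from the chains attached to that type.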

