Publisher: The Japanese Society for Artificial Intelligence
Abstract: Onomatopoeia is a useful linguistic device for describing sounds, conditions, degrees and so on. Japanese is said to be rich in onomatopoeic expressions, which are frequently used in daily conversation. The meaning and surface structure of an onomatopoeia vary diachronically, and there also seem to be regional variations in usage. To apply onomatopoeias in artificial intelligence, it is necessary to investigate their actual usage quantitatively. This paper studies the practical usage of onomatopoeias in modern spoken Japanese. To explore present-day Japanese onomatopoeias, we investigate regional assembly minutes collected from all areas of Japan. The target of the investigation is a corpus of regional assembly minutes containing about 300 million words. The minutes of Japanese regional assemblies contain full transcriptions of the utterances made in the assemblies. This corpus is suitable for our research because the attributes of the speakers are clear and the speakers are distributed nationwide. The first study concerns the total frequency and regional distribution of onomatopoeias. Onomatopoeias that express a request to promote a policy, e.g., "shikkari" and "dondon", are used at high frequency in regional assemblies. There are no remarkable regional differences in the frequencies of these onomatopoeias, although western Japan shows a slightly higher frequency. The second study concerns the meaning of onomatopoeias. Most onomatopoeias are polysemous, and their meaning differs by context. The authors manually checked 10,827 sentences containing 153 kinds of onomatopoeia and classified the meaning of each onomatopoeic expression. 
We analyzed the following subjects: i) ambiguity of onomatopoeic expressions, ii) regional differences in meaning, iii) new meanings in modern spoken language, iv) special usage in assemblies, and v) onomatopoeias in named entities. The third study concerns false extraction of onomatopoeias in morphological analysis. The extraction errors are analyzed from the viewpoints of surface structure and position of appearance. In terms of surface structure, onomatopoeic expressions with a high rate of false extraction clearly tend to be shorter in word length. Onomatopoeic expressions that end with special morae, namely the moraic obstruent, the moraic nasal and the long vowel, also have a higher rate of false extraction. In terms of position of appearance, dialectal grammar is the main factor causing false extraction: about 25% of false extractions occur in sentence-final particles of dialectal grammar. The results of this quantitative analysis of onomatopoeia in modern spoken Japanese serve as basic data for engineering applications. The results of the analysis are published on the Web. It is hoped that they will contribute broadly to the practical use of onomatopoeia in the engineering field.
Keywords: onomatopoeia; spoken language; large-scale corpus; regional assembly minutes; word-sense analysis