Language Resources for Slovene Learners of the Japanese Language (Jezikovni viri za slovenske učence japonskega jezika)

Project leader: Assistant prof. Irena Srdanović, University of Ljubljana

Duration of the project: from 08/2013 to 08/2015 (total 2 years)

Project proposal no.: ARRSRPROJJR Prijava/2011II/814

Type of project: Basic Postdoc Project

Project details: Public call for (co)financing of research projects in 2012 (call issued in 2011); Phase I results: 10.4.2012; Phase II results: 26.11.2012; average grade: 23,63 (out of 25)

Project summary

Language resources for learning the Japanese language have mainly been built relying on the intuitions and experiences of language teachers and textbook writers. Only recently have the advantages of the empirical approach been recognized with its use of large electronic collections of texts, called corpora, and a rapidly growing range of tools that draw on developments in computer science, linguistics, and lexicography. While such resources share the same basic goals as traditional linguistic and lexicographic works, there are fundamental differences in that various tasks can be accomplished more efficiently with the new technologies, which often provide even richer insights into language use.

An area often neglected in traditional language learning practice is collocations, that is, words that typically co-occur in a text. The natural combinations of words in a given language are often unpredictable from one’s knowledge of other languages, so foreign language learners are highly prone to mistakes. For example, ocha wo ireru, meaning ‘pripraviti čaj’ (Eng. ‘to make tea’), is unpredictable for Slovene learners of Japanese because its literal meaning is ‘dati čaj noter’ (Eng. ‘to put tea in’), so they often use the unnatural expression ocha wo tsukuru, where tsukuru is a literal translation of Slovene ‘pripraviti’ (Eng. ‘to make’). Advances in corpus-based research have clearly highlighted the importance of such language phenomena, although they are not yet fully covered in language learning materials, including those for the Japanese language. Since correct or incorrect usage of collocations strongly depends on the native language of the learner, this study focuses on Slovene learners of the Japanese language. Moreover, typical collocation relations have been shown to be specific to various academic fields. Accordingly, this research aims to develop language resources that address these issues, in the form of two kinds of modules:

1) Corpus-based resources for learners of Japanese for academic purposes (JAP), based on general language skills and consisting of the following components:

• Collocation query system with information on language proficiency levels

• Collocation syllabus covering theoretical aspects of the treatment of collocations, emphasizing points of similarity and divergence between the two languages

• Model of a Japanese – Slovene – Japanese dictionary of collocations

2) Corpus-based resources for learners of Japanese for specific purposes (JSP) that focus on domain-specific skills and consist of the following components:

• Domain-specific corpus and word list creation

• Domain-specific collocation syllabi and dictionary models

This module will concentrate on one or two domains, such as tourism and technical Japanese, which are frequently combined with Japanese studies at the institution.

The research will employ various Japanese and Slovene language corpora, tools that summarize grammatical and collocational relations, technology for building specialized corpora and word lists from web sources, and other resources. The major advantage of such an empirical approach is that valuable linguistic information can be obtained from large-scale resources in a very short time, which is not possible through traditional introspection alone. The systematic treatment of Japanese collocations will be a fundamental resource for the creation of other language resources and learning materials and will be of use to both students and teachers. The innovative nature of the research lies in the complementary use of state-of-the-art language technologies and categories related to language learning practice, such as proficiency levels. Furthermore, the comparison of collocations in the two languages is expected to provide new theoretical insights into the nature of ‘unpredictable’ and ‘predictable’ collocations, which is of wide relevance to the research community.
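As a rough illustration of how a corpus tool can rank collocations, the sketch below scores noun-verb pairs with logDice, one commonly used association measure. The project text does not say which measure its tools use, so the choice of logDice is an assumption, and the pair counts are invented toy data rather than corpus results.

```python
import math
from collections import Counter

# Toy (noun, verb) pairs standing in for object-verb pairs extracted
# from a parsed corpus. The counts are invented for illustration only.
pairs = (
    [("ocha", "ireru")] * 8
    + [("ocha", "nomu")] * 6
    + [("ocha", "tsukuru")] * 1
    + [("ryouri", "tsukuru")] * 9
    + [("koohii", "ireru")] * 4
    + [("koohii", "nomu")] * 5
)


def log_dice(pairs):
    """Return a logDice score for every distinct (noun, verb) pair.

    logDice = 14 + log2(2 * f(noun, verb) / (f(noun) + f(verb)))
    """
    pair_freq = Counter(pairs)
    noun_freq = Counter(noun for noun, _ in pairs)
    verb_freq = Counter(verb for _, verb in pairs)
    return {
        (noun, verb): 14 + math.log2(2 * freq / (noun_freq[noun] + verb_freq[verb]))
        for (noun, verb), freq in pair_freq.items()
    }


scores = log_dice(pairs)

# List the pairs from the strongest to the weakest association.
for (noun, verb), score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{noun} wo {verb}: {score:.2f}")
```

With these toy counts, ocha wo ireru scores higher than ocha wo tsukuru, which mirrors the kind of contrast a collocation query system could show to learners.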

Selected comments from the project reviewers

“To begin with, I have seen many project proposals in my life and very few, if any, have impressed me to the extent this proposal does. The research plan has all the properties that outstanding research requires: The topic and its actuality, well-defined methods and material, all based upon a firm theoretical background. A very detailed planning implies that this endeavour leads to important results. I would be extremely disappointed if this project would not be funded.”

B3 – Research quality of the proposed project – 2nd reviewer, Grade 5,0

“This is a state-of-the-art project. The topic is of high significance for the global world development. The proposal is well-written and the planned study is well presented. The background literature has been well researched and overviewed with enough details. The objectives are clearly defined and the methods are novel in that they combine advances from linguistics, pedagogy and computer science. The high quality of the proposal predicts a high quality of the project results.”

B3 – Research quality of the proposed project – 3rd reviewer, Grade 5,0

“The candidate has a very impressive curriculum vitae. She is certainly a very powerful researcher who has achieved much so far. She has earned various scholarships and studied a language which is not only seldom chosen but for cultural reasons also hard to master. Moreover she has a lot of experience in the computational elaboration of language data which is also a rather rare competence. These two assets alone make the candidate special, but she has also proven excellence.”

B1 – Research excellence of the project leader and the research team – 1st reviewer, Grade 4,8

“The project leader has received excellent training in several fields: languages, research and industry, and seems to be exceptionally well suited for the proposed project. Her research achievements to date are quite impressive for someone at an early stage of career development. It is unfortunate/surprising, though, that her scores for most prestigious / high quality publications are set at 0 points. Maybe, this is because her publications in Japanese have not been considered there?”

B1 – Research excellence of the project leader and the research team – 3rd reviewer, Grade 4,0

“The applicant has demonstrated that she is able to disseminate her knowledge outside academic circles. In view of the fact that she is a postdoctoral researcher it is rather unusual to be engaged with activities that normally are typical of senior scholars. Having these positions in mind, one can anticipate socioeconomically and culturally relevant activities in the future, too.”

B2 – Socioeconomic or cultural relevance of research results of the project leader and the research team – 2nd reviewer, Grade 4,0

“This project has a direct significance for businesses and society in general as its results will contribute to foreign language learning, e-learning, computational linguistics and other areas where Japanese-Slovenian resources might be useful. In addition, the methods developed in this project can be applied to other languages or language pairs.”

B4 – Relevance and potential impact of the results of the proposed project – 3rd reviewer, Grade 5,0

“The knowledge of the applicant, the applied funding, and very detailed planning imply a high degree of feasibility. With other words, my rather great expectations do not mean that the project has too high ambitions.”

B5 – Feasibility of the proposed project – 2nd reviewer, Grade 5,0

Related research results (2012/2013)

1 Book chapters

1.1 スルダノヴィッチ・イレーナ(2012) 語の共起関係とシラバス-コーパスに準拠した共起表作りの試み-『日本語学習支援の構築』凡人社 [Srdanović, Irena (2012) Collocational relations and lexical syllabus: corpus-informed syllabus creation, Corpus-Assisted Language Learning System Building, Bonjinsha]

1.2 Srdanović, Irena (2012) Dvojezična korpusna leksikografija in japonski jezik: model za izdelavo japonsko-slovenskega slovarja kolokacij, Dvojezična korpusna leksikografija, ur. Mojca Šorli, Trojina [Bilingual corpus lexicography and the Japanese language: a model for a Japanese-Slovene dictionary of collocations, Bilingual corpus lexicography, ed. Mojca Šorli, Trojina]

2 Journal papers  

2.1 スルダノヴィッチ・イレーナ(2013)「大規模コーパスを用いた形容詞と名詞のコロケーションの記述的研究―日本語教育用の辞書作成に向けて―」『国立国語研究所論集』第6号 [Srdanović, Irena (2013) Description of Adjective-Noun Collocations Based on Large-Scale Corpora: Towards a Dictionary for Japanese Language Learners, NINJAL Research Papers, No. 6]

3 Publications at conferences

3.1 スルダノヴィッチ・イレーナ、李 在鎬(2013)「日本語教育用の形容詞の語彙リストと難易度レベル」『「第3回コーパス日本語学ワークショップ」予稿集』国立国語研究所言語資源研究系・コーパス開発センター、p.281-290 [Srdanović, Irena, Lee, Jae-Ho (2013) Vocabulary List of Adjectives and Levels of Difficulty for Japanese Language Education, Proceedings of the 3rd Japanese Corpus Linguistics Workshop, Department of Corpus Studies/Center for Corpus Development, NINJAL, 281-290]

3.2 スルダノヴィッチ・イレーナ、スホメル・ヴィット、小木曽智信、キルガリフ・アダム (2013)「百億語のコーパスを用いた日本語の語彙・文法情報のプロファイリング」『「第3回コーパス日本語学ワークショップ」予稿集』国立国語研究所言語資源研究系・コーパス開発センター、p.229-238 [Srdanović, Irena, Suchomel, Vit, Ogiso, Toshinobu, Kilgarriff, Adam (2013) Japanese Language Lexical and Grammatical Profiling Using the Web Corpus JpTenTen, Proceedings of the 3rd Japanese Corpus Linguistics Workshop, Department of Corpus Studies/Center for Corpus Development, NINJAL, 229-238]

3.3 Srdanović, Irena (2013) Japanese i-adjectives as short and long-word units: implications for language learning, Conference of the Pacific Association for Computational Linguistics (PACLING), Tokyo, 2-4 September 2013, 8 pp.

3.4 スルダノヴィッチ・イレーナ(2013)「コロケーションとシンタクス―形容詞と名詞のコロケーションを対象に―」『「第4回コーパス日本語学ワークショップ」予稿集』国立国語研究所言語資源研究系・コーパス開発センター、8 pp. [Srdanović, Irena (2013) Collocation and Syntax: Adjective and Noun Collocations, Proceedings of the 4th Japanese Corpus Linguistics Workshop, Department of Corpus Studies/Center for Corpus Development, NINJAL, 8 pp.]