Three workshops for the British Academy IPM scheme project "The Corpus-based Approach to the Acquisition of Mandarin Chinese as a Foreign Language" were held at our center on March 11 and 12, 2014. Dr. Vaclav Brezina and Dr. Dana Gablasova from Lancaster University were invited to lead the workshops. The team members, along with many faculty members and Ph.D. students, took part in all three modules.
In Module 1, Drs. Vaclav Brezina and Dana Gablasova presented how corpus data are collected. Dr. Brezina first introduced LOB, one of the first generation of large-scale electronic corpora, built by Lancaster University. He then outlined the three stages of creating a corpus: preparation, data collection and transcription, and data processing and corpus annotation. He also demonstrated Express Scribe, a professional transcription tool for audio recordings, and illustrated how to transcribe and label spoken data. In the rest of the session, Dr. Dana Gablasova discussed three challenges in transcribing learner speech: learners' differing L1 backgrounds, their English proficiency, and recording quality.
Module 2 concerned corpus mark-up and annotation. Dr. Vaclav Brezina, assisted by Dr. Dana Gablasova, first demonstrated some functions of Microsoft Word and Notepad++. He then discussed the principles and advantages of eXtensible Markup Language (XML) and demonstrated, step by step, how to produce a structured, tagged text in XML using Notepad++.
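As a rough illustration of what a structured, tagged transcript can look like, the following minimal Python sketch builds a small XML document from a list of speaker turns. It is not the mark-up scheme taught in the workshop: the element and attribute names (`transcript`, `u`, `who`, `task`), the helper `build_transcript`, and the sample turns are all hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET


def build_transcript(speaker_turns, metadata):
    """Wrap (speaker, text) pairs in a simple XML structure.

    The tag and attribute names are illustrative assumptions,
    not a scheme prescribed by the workshop.
    """
    root = ET.Element("transcript", attrib=metadata)
    for i, (speaker, text) in enumerate(speaker_turns, start=1):
        # One <u> (utterance) element per turn, with an id and a speaker label.
        utterance = ET.SubElement(root, "u", attrib={"id": f"u{i}", "who": speaker})
        utterance.text = text
    return root


# Invented sample data, for demonstration only.
turns = [
    ("examiner", "What did you do last weekend?"),
    ("learner", "I go to the park with my friend."),
]
root = build_transcript(turns, {"task": "interview"})

ET.indent(root)  # pretty-printing; requires Python 3.9 or later
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because every utterance sits in its own element with explicit attributes, the file can be checked for well-formedness and searched by speaker, which is one of the practical advantages of XML over untagged plain text.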
In Module 3, Dr. Dana Gablasova discussed quality control for spoken data. She emphasized the importance of consistent transcription and explained methods for estimating transcription errors in spoken data.
Throughout the workshops, Dr. Vaclav Brezina and Dr. Dana Gablasova showed profound knowledge and great patience. The team members and the audience benefited greatly from the workshops.