The Potential Contribution of General and Specialized Corpora to Research on Malay and Malaysian English


  • Zuraidah Mohd Don, Language Academy, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
  • Gerry Knowles, Independent Scholar



Malay, Malaysian English, Corpora, Specialised Corpora, Empirical Methodology, Frequency Word Lists


Today’s linguists are increasingly concerned with high-level properties of texts and tend to work top-down in some branch of discourse analysis, whereas corpus linguists are concerned with low-level properties such as grammatical class, syntactic constructions and different kinds of text annotation, and tend to work bottom-up. This paper seeks to close the gap, using a general corpus and a specialised corpus. The point of departure is the assumption that a corpus is compiled in order to study texts in some language for some special purpose beyond the existence of the corpus itself. The particular languages in mind are Malay and Malaysian English. The introduction deals with matters that have to be considered when a corpus project is planned, and with the problems that can arise, some of which have been reported. The methodology section concentrates on the groundwork that has to be done for just about any corpus-based project; it starts with a project undertaken long before computers were invented and goes on to describe the role of computational expertise in modern corpus-based projects. The results section reports some preliminary work on a specialised corpus containing the speeches of Tun Mahathir Mohamad, which attempts to go beyond the groundwork to ascertain objectively what the speeches are about. The paper ends with a combined discussion and conclusion that summarises the content of the paper.




How to Cite

Mohd Don, Z., & Knowles, G. (2022). The Potential Contribution of General and Specialized Corpora to Research on Malay and Malaysian English. LSP International Journal, 9(2), 85-96.