A Learner Corpus Investigation of Formulaic Sequence Development in EFL Learners with a Focus on Native and Non-Native Corpora



Formulaic sequences (FSs); learner corpus; EFL learners; multi-word units


In recent years, advances in computer technology and software tools have made more sophisticated and fully operational facilities available for corpus linguistics. Thanks to these developments, large collections of naturally occurring texts can now be compiled more accurately. In line with these developments, the current study investigated the usage patterns of three- and four-word sequences in a learner corpus composed of two semesters of written data from 85 English as a Foreign Language (EFL) learners. The data were analysed by examining collective trends in the use of formulaic sequences across different time intervals. Data collection followed the frequency approach: the most frequent three- and four-word recurrent sequences were extracted from each sub-corpus of the learner corpora in two groups and then classified structurally and functionally. The use of these sequences was then compared across native (LOCNESS) and non-native data using the Sketch Engine corpus tool. The findings suggested that although both learner groups used formulaic sequences frequently, the sequences they used were fewer in number and less diverse in type than those in the native data.
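The frequency approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure, which relied on the Sketch Engine corpus tool; it is a simplified Python illustration of extracting recurrent three- and four-word sequences. The function names, the tokenization rule, and the thresholds (`min_freq`, `min_texts`) are hypothetical choices made for this example and are not reported in the study.

```python
from collections import Counter
import re


def tokenize(text):
    # Simplifying assumption: lowercase the text and keep only
    # alphabetic strings and apostrophes as tokens.
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    # All contiguous n-word sequences in a token list.
    # Simplification: sequences may cross sentence boundaries.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_sequences(texts, sizes=(3, 4), min_freq=2, min_texts=2):
    # min_freq and min_texts are illustrative cut-offs, not the study's.
    freq = Counter()    # total occurrences of each sequence
    spread = Counter()  # number of texts in which each sequence occurs
    for text in texts:
        tokens = tokenize(text)
        seen = set()
        for n in sizes:
            for gram in ngrams(tokens, n):
                freq[gram] += 1
                seen.add(gram)
        spread.update(seen)
    kept = [(g, c) for g, c in freq.items()
            if c >= min_freq and spread[g] >= min_texts]
    # Most frequent first; ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


sample = [
    "On the other hand, it is clear.",
    "On the other hand we disagree.",
    "It is clear that on the other hand",
]
for sequence, count in frequent_sequences(sample):
    print(count, sequence)
```

Requiring a sequence to occur in more than one text (the `min_texts` cut-off) is a common way to keep one writer's idiosyncratic repetitions from being counted as recurrent formulas. The extracted list would still need the structural and functional classification described in the abstract, which this sketch does not attempt.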






How to Cite

Özbay, A. Şükrü, & Öztürk, H. (2021). A Learner Corpus Investigation of Formulaic Sequence Development in EFL Learners with a Focus on Native and Non-Native Corpora. Journal of Narrative and Language Studies, 9(18), 412–437. Retrieved from https://nalans.com/index.php/nalans/article/view/448