Abstract
This work concerns the lexical richness of Beijing Mandarin speakers, measured by entropy. The data come from the Beijing Mandarin Spoken Corpora, a corpus of spontaneous conversational speech by contemporary Beijing Mandarin speakers. Based on sociovariational linguistic hypotheses and data analysis, the study attempts to identify and explain the key demographic and socioeconomic parameters that affect the entropy of each subject's spoken texts. Both one-dimensional and multi-dimensional statistical models are proposed to quantify the relationships between this measure of lexical richness and the most indicative variables, including age, level of education, and profession premium. A multi-dimensional nonlinear model encompassing these findings is designed and calibrated with statistical estimation methods. Possible future directions and applications in relevant fields of applied linguistics are outlined.
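The abstract does not state which entropy formula is used. As a minimal sketch, assuming the measure is Shannon entropy over the relative frequencies of word types in one speaker's transcript, and assuming the transcript has already been segmented into word tokens, the per-speaker value could be computed as follows. The function name `lexical_entropy` is hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def lexical_entropy(tokens):
    """Shannon entropy, in bits, of the word-type distribution of `tokens`.

    Assumption: `tokens` is a list of already-segmented words from a single
    speaker. The paper's exact definition may differ from this sketch.
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)  # frequency of each word type
    total = len(tokens)       # number of word tokens
    # H = -sum over word types of p * log2(p), with p = count / total
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Four distinct words used once each: a uniform distribution over 4 types.
print(lexical_entropy(["我", "们", "去", "吧"]))
# One word repeated: no lexical variety at all.
print(lexical_entropy(["好", "好", "好"]))
```

Under this assumption, a speaker who spreads usage evenly over many word types scores higher than one who repeats a few words, which is what makes entropy usable as a lexical richness measure. Raw entropy also grows with sample size, so comparisons across speakers would likely need transcripts of comparable length or some normalization.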
Original language | English |
---|---|
Pages (from-to) | 60-69 |
Number of pages | 10 |
Journal | Language Sciences |
Volume | 44 |
Publication status | Published - Jul 2014 |
Externally published | Yes |
Keywords
- Beijing Mandarin
- Corpus linguistics
- Entropy
- Lexical richness
- Sociovariational analysis
- Statistical modeling
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language