Entropic evolution of lexical richness of homogeneous texts over time: a dynamic complexity perspective

Yanhui Zhang

Research output: Journal Publication › Article › peer-review

Abstract

This work concerns the evolving pattern of the lexical richness of the corpus text of China Government Work Report measured by entropy, based on a fundamental assumption that these texts are linguistically homogeneous. The corpus is interpreted and studied as a dynamic system, the components of which maintain spontaneous variations, adjustment, self-organizations, and adaptations to fit into the semantic, discourse and sociolinguistic functions that the text is set to perform. Both the macroscopic structural trend and the microscopic fluctuations of the time series of the interested entropic process are meticulously investigated from the dynamic complexity theoretical perspective. Rigorous nonlinear regression analysis is provided throughout the study for empirical justifications to the theoretical postulations. An overall concave model with modulated fluctuations incorporated is proposed and statistically tested to represent the key quantitative findings. Possible extensions of the current study are discussed.
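As a rough illustration of measuring lexical richness by entropy, the sketch below computes the Shannon entropy of a word-frequency distribution. The abstract does not state which entropy estimator or tokenization the study uses, so the Shannon form, the whitespace tokenization, and the function name `shannon_entropy` are all assumptions made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the word-type frequency distribution.

    Assumption: this is the plain plug-in estimator; the paper may use
    a different definition or normalization.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy inputs, not data from the study.
uniform = "a b c d".split()  # four word types, equally frequent
skewed = "a a a b".split()   # one word type dominates

print(shannon_entropy(uniform))  # 2.0
print(shannon_entropy(skewed))   # lower than the uniform case
```

Under this definition, a text that spreads its tokens evenly over many word types scores higher than one dominated by a few types. Applying the function to each yearly report would give an entropy time series of the kind the abstract describes.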

Original language	English
Pages (from-to)	569-599
Number of pages	31
Journal	Journal of Language Modelling
Volume	3
Issue number	2
Publication status	Published - 12 Feb 2016
Externally published	Yes

Keywords

dynamic complexity
lexical richness
entropy
homogenous texts
language modeling

Access to Document

https://doi.org/10.15398/jlm.v3i2.111

Licence: CC BY

Cite this

@article{ba9c415d900e450a8af590337c42bcb0,

title = "Entropic evolution of lexical richness of homogeneous texts over time: a dynamic complexity perspective",

abstract = "This work concerns the evolving pattern of the lexical richness of the corpus text of China Government Work Report measured by entropy, based on a fundamental assumption that these texts are linguistically homogeneous. The corpus is interpreted and studied as a dynamic system, the components of which maintain spontaneous variations, adjustment, self-organizations, and adaptations to fit into the semantic, discourse and sociolinguistic functions that the text is set to perform. Both the macroscopic structural trend and the microscopic fluctuations of the time series of the interested entropic process are meticulously investigated from the dynamic complexity theoretical perspective. Rigorous nonlinear regression analysis is provided throughout the study for empirical justifications to the theoretical postulations. An overall concave model with modulated fluctuations incorporated is proposed and statistically tested to represent the key quantitative findings. Possible extensions of the current study are discussed.",

keywords = "dynamic complexity, lexical richness, entropy, homogenous texts, language modeling",

author = "Yanhui Zhang",

year = "2016",

month = feb,

day = "12",

language = "English",

volume = "3",

pages = "569",

journal = "Journal of Language Modelling",

issn = "2299-856X",

publisher = "Institute of Computer Science, Polish Academy of Sciences",

number = "2",

}

TY - JOUR

T1 - Entropic evolution of lexical richness of homogeneous texts over time: a dynamic complexity perspective

AU - Zhang, Yanhui

PY - 2016/2/12

Y1 - 2016/2/12

N2 - This work concerns the evolving pattern of the lexical richness of the corpus text of China Government Work Report measured by entropy, based on a fundamental assumption that these texts are linguistically homogeneous. The corpus is interpreted and studied as a dynamic system, the components of which maintain spontaneous variations, adjustment, self-organizations, and adaptations to fit into the semantic, discourse and sociolinguistic functions that the text is set to perform. Both the macroscopic structural trend and the microscopic fluctuations of the time series of the interested entropic process are meticulously investigated from the dynamic complexity theoretical perspective. Rigorous nonlinear regression analysis is provided throughout the study for empirical justifications to the theoretical postulations. An overall concave model with modulated fluctuations incorporated is proposed and statistically tested to represent the key quantitative findings. Possible extensions of the current study are discussed.

AB - This work concerns the evolving pattern of the lexical richness of the corpus text of China Government Work Report measured by entropy, based on a fundamental assumption that these texts are linguistically homogeneous. The corpus is interpreted and studied as a dynamic system, the components of which maintain spontaneous variations, adjustment, self-organizations, and adaptations to fit into the semantic, discourse and sociolinguistic functions that the text is set to perform. Both the macroscopic structural trend and the microscopic fluctuations of the time series of the interested entropic process are meticulously investigated from the dynamic complexity theoretical perspective. Rigorous nonlinear regression analysis is provided throughout the study for empirical justifications to the theoretical postulations. An overall concave model with modulated fluctuations incorporated is proposed and statistically tested to represent the key quantitative findings. Possible extensions of the current study are discussed.

KW - dynamic complexity

KW - lexical richness

KW - entropy

KW - homogenous texts

KW - language modeling

M3 - Article

SN - 2299-856X

VL - 3

SP - 569

JO - Journal of Language Modelling

JF - Journal of Language Modelling

IS - 2

ER -
