



Volume 3, Issue 2, 2013

Stochastic model for the vocabulary growth in natural languages

Author keywords

Complex systems; Interdisciplinary physics; Statistical physics

Indexed keywords

DESCRIPTIVE MODEL; HIGHER FREQUENCIES; HISTORICAL CHANGES; INFINITE NUMBERS; INTERDISCIPLINARY PHYSICS; LOWER FREQUENCIES; NATURAL LANGUAGES; STATISTICAL PHYSICS

EID: 84883519135     PISSN: None     EISSN: 21603308     Source Type: Journal    
DOI: 10.1103/PhysRevX.3.021006     Document Type: Article
Times cited : (141)

References (58)
  • 2
    • 84865069972
    • Culturomics Meets Random Fractal Theory: Insights into Long-Range Correlations of Social and Natural Phenomena over the Past Two Centuries
    • J. Gao, J. Hu, X. Mao, and M. Perc, Culturomics Meets Random Fractal Theory: Insights into Long-Range Correlations of Social and Natural Phenomena over the Past Two Centuries, J. R. Soc. Interface 9, 1956 (2012).
    • (2012) J. R. Soc. Interface , vol.9 , pp. 1956
    • Gao, J.1    Hu, J.2    Mao, X.3    Perc, M.4
  • 3
    • 84859736942
    • Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death
    • A. M. Petersen, J. Tenenbaum, S. Havlin, and H. E. Stanley, Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death, Sci. Rep. 2, 313 (2012).
    • (2012) Sci. Rep. , vol.2 , pp. 313
    • Petersen, A.M.1    Tenenbaum, J.2    Havlin, S.3    Stanley, H.E.4
  • 4
    • 35148885376
    • Frequency of Word-Use Predicts Rates of Lexical Evolution throughout Indo-European History
    • M. Pagel, Q. D. Atkinson, and A. Meade, Frequency of Word-Use Predicts Rates of Lexical Evolution throughout Indo-European History, Nature (London) 449, 717 (2007).
    • (2007) Nature , vol.449 , pp. 717
    • Pagel, M.1    Atkinson, Q.D.2    Meade, A.3
  • 5
    • 35148879777
    • Quantifying the Evolutionary Dynamics of Language
    • E. Lieberman, J.-P. Michel, J. Jackson, T. Tang, and M. A. Nowak, Quantifying the Evolutionary Dynamics of Language, Nature (London) 449, 713 (2007).
    • (2007) Nature , vol.449 , pp. 713
    • Lieberman, E.1    Michel, J.-P.2    Jackson, J.3    Tang, T.4    Nowak, M.A.5
  • 6
    • 84872400995
    • Loops and Self-Reference in the Construction of Dictionaries
    • D. Levary, J.-P. Eckmann, E. Moses, and T. Tlusty, Loops and Self-Reference in the Construction of Dictionaries, Phys. Rev. X 2, 031018 (2012).
    • (2012) Phys. Rev. X , vol.2 , pp. 031018
    • Levary, D.1    Eckmann, J.-P.2    Moses, E.3    Tlusty, T.4
  • 7
    • 34248750412
    • Review Article: On Vocabulary Richness
    • G. Wimmer and G. Altmann, Review Article: On Vocabulary Richness, J. Quant. Linguist. 6, 1 (1999).
    • (1999) J. Quant. Linguist. , vol.6 , pp. 1
    • Wimmer, G.1    Altmann, G.2
  • 8
    • 0004005755
    • (Kluwer Academic Publishers, Dordrecht, Netherlands, 2001)
    • R. H. Baayen, Word Frequency Distributions (Kluwer Academic Publishers, Dordrecht, Netherlands, 2001).
    • Word Frequency Distributions
    • Baayen, R.H.1
  • 9
    • 0033908395
    • Block Addressing Indices for Approximate Text Retrieval
    • R. Baeza-Yates and G. Navarro, Block Addressing Indices for Approximate Text Retrieval, J. Am. Soc. Inf. Sci. 51, 69 (2000).
    • (2000) J. Am. Soc. Inf. Sci. , vol.51 , pp. 69
    • Baeza-Yates, R.1    Navarro, G.2
  • 13
    • 65549163086
    • Modeling Statistical Properties of Written Text
    • M. A. Serrano, A. Flammini, and F. Menczer, Modeling Statistical Properties of Written Text, PLoS ONE 4, e5372 (2009).
    • (2009) PLoS ONE , vol.4 , pp. e5372
    • Serrano, M.A.1    Flammini, A.2    Menczer, F.3
  • 14
    • 72049122956
    • The Meta Book and Size-Dependent Properties of Written Language
    • S. Bernhardsson, L. E. Correa da Rocha, and P. Minnhagen, The Meta Book and Size-Dependent Properties of Written Language, New J. Phys. 11, 123015 (2009).
    • (2009) New J. Phys. , vol.11 , pp. 123015
    • Bernhardsson, S.1    Correa da Rocha, L.E.2    Minnhagen, P.3
  • 15
    • 84862560561
    • Zipf's Law and Heaps' Law Can Predict the Size of Potential Words
    • Y. Sano, H. Takayasu, and M. Takayasu, Zipf's Law and Heaps' Law Can Predict the Size of Potential Words, Prog. Theor. Phys. Suppl. 194, 202 (2012).
    • (2012) Prog. Theor. Phys. Suppl. , vol.194 , pp. 202
    • Sano, Y.1    Takayasu, H.2    Takayasu, M.3
  • 20
    • 33747456470
    • Dynamics of Text Generation with Realistic Zipf's Distribution
    • D. H. Zanette and M. A. Montemurro, Dynamics of Text Generation with Realistic Zipf's Distribution, J. Quant. Linguist. 12, 29 (2005).
    • (2005) J. Quant. Linguist. , vol.12 , pp. 29
    • Zanette, D.H.1    Montemurro, M.A.2
  • 21
    • 79961031117
    • The Growth Statistics of Zipfian Ensembles: Beyond Heaps' Law
    • I. Eliazar, The Growth Statistics of Zipfian Ensembles: Beyond Heaps' Law, Physica (Amsterdam) 390, 3189 (2011).
    • (2011) Physica , vol.390 , pp. 3189
    • Eliazar, I.1
  • 22
    • 0035889615
    • Beyond the Zipf-Mandelbrot Law in Quantitative Linguistics
    • M. A. Montemurro, Beyond the Zipf-Mandelbrot Law in Quantitative Linguistics, Physica (Amsterdam) 300, 567 (2001).
    • (2001) Physica , vol.300 , pp. 567
    • Montemurro, M.A.1
  • 23
    • 77956411919
    • Fitting Ranked Linguistic Data with Two-Parameter Functions
    • W. Li, P. Miramontes, and G. Cocho, Fitting Ranked Linguistic Data with Two-Parameter Functions, Entropy 12, 1743 (2010).
    • (2010) Entropy , vol.12 , pp. 1743
    • Li, W.1    Miramontes, P.2    Cocho, G.3
  • 24
    • 84861910362
    • Power Laws and Other Heavy-Tailed Distributions in Linguistic Typology
    • G. Jäger, Power Laws and Other Heavy-Tailed Distributions in Linguistic Typology, Adv. Compl. Syst. 15, 1150019 (2012).
    • (2012) Adv. Compl. Syst. , vol.15 , pp. 1150019
    • Jäger, G.1
  • 25
    • 84856991721
    • Critical Truths About Power Laws
    • M. P. H. Stumpf and M. A. Porter, Critical Truths About Power Laws, Science 335, 665 (2012).
    • (2012) Science , vol.335 , pp. 665
    • Stumpf, M.P.H.1    Porter, M.A.2
  • 26
    • 24744469980
    • Power Laws, Pareto Distributions and Zipf's law
    • M. E. J. Newman, Power Laws, Pareto Distributions and Zipf's law, Contemp. Phys. 46, 323 (2005).
    • (2005) Contemp. Phys. , vol.46 , pp. 323
    • Newman, M.E.J.1
  • 27
    • 0001778604
    • A Mathematical Theory of Evolution, Based on the Conclusions of Dr. J.C. Willis, F.R.S.
    • G. U. Yule, A Mathematical Theory of Evolution, Based on the Conclusions of Dr. J.C. Willis, F.R.S., Phil. Trans. R. Soc. B 213, 21 (1925).
    • (1925) Phil. Trans. R. Soc. B , vol.213 , pp. 21
    • Yule, G.U.1
  • 28
    • 79960601554
    • A Brief History of Generative Models for Power Law and Log-normal Distributions
    • M. Mitzenmacher, A Brief History of Generative Models for Power Law and Log-normal Distributions, Internet Math. 1, 226 (2004).
    • (2004) Internet Math , vol.1 , pp. 226
    • Mitzenmacher, M.1
  • 32
    • 84883508323
    • See Supplemental Material for further details
    • See Supplemental Material at http://link.aps.org/supplemental/10.1103/PhysRevX.3.021006 for further details.
  • 33
    • 26144462276
    • An Informational Theory of the Statistical Structure of Language
    • Communication Theory (Butterworth, Woburn, MA, 1953)
    • B. Mandelbrot, An Informational Theory of the Statistical Structure of Language, Communication Theory (Butterworth, Woburn, MA, 1953), p. 486.
    • Mandelbrot, B.1
  • 34
    • 17744388922
    • The Frequency Spectrum of Text and Vocabulary
    • J. Tuldava, The Frequency Spectrum of Text and Vocabulary, J. Quant. Linguist. 3, 38 (1996).
    • (1996) J. Quant. Linguist. , vol.3 , pp. 38
    • Tuldava, J.1
  • 36
    • 0002999358
    • Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts
    • A. Cohen, R. N. Mantegna, and S. Havlin, Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts, Fractals 05, 95 (1997).
    • (1997) Fractals , vol.05 , pp. 95
    • Cohen, A.1    Mantegna, R.N.2    Havlin, S.3
  • 37
    • 17744388988
    • The Variation of Zipf's Law in Human Language
    • R. Ferrer i Cancho, The Variation of Zipf's Law in Human Language, Eur. Phys. J. B 44, 249 (2005).
    • (2005) Eur. Phys. J. B , vol.44 , pp. 249
    • Ferrer I Cancho, R.1
  • 41
    • 0016355478
    • A New Look at the Statistical Model Identification
    • H. Akaike, A New Look at the Statistical Model Identification, IEEE Trans. Autom. Control 19, 716 (1974).
    • (1974) IEEE Trans. Autom. Control , vol.19 , pp. 716
    • Akaike, H.1
  • 42
    • 84883533568
    • Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (Springer, New York, 2002), 2nd ed
    • K. P. Burnham and D. R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (Springer, New York, 2002), 2nd ed.
    • Burnham, K.P.1    Anderson, D.R.2
  • 43
    • 84883527535
    • mpmath: A Python Library for Arbitrary-Precision Floating-Point Arithmetic (Version 0.14), 2010
    • F. Johansson et al. mpmath: A Python Library for Arbitrary-Precision Floating-Point Arithmetic (Version 0.14), 2010, http://code.google.com/p/mpmath/
    • Johansson, F.1
  • 44
    • 84883534930
    • SciPy: Open Source Scientific Tools for Python
    • E. Jones et al. SciPy: Open Source Scientific Tools for Python, 2001, http://www.scipy.org/
    • (2001)
    • Jones, E.1
  • 45
    • 84883512023
    • Wikimedia: Dump of the English Wikipedia on 02/06/2012
    • Wikimedia: Dump of the English Wikipedia on 02/06/2012, http://dumps.wikimedia.org/enwiki/20120601/
  • 46
    • 84883502175
    • University of Pisa Multimedia Lab: Wikipedia Extractor. See Ref. [32] for access dates
    • University of Pisa Multimedia Lab: Wikipedia Extractor, http://medialab.di.unipi.it/wiki/Wikipedia_Extractor. See Ref. [32] for access dates.
  • 47
    • 84883526373
    • The sharp transition between the two regimes in Eq. (1) might seem artificial. We believe that alternative distributions which interpolate between the two scalings could provide a similarly good account of the data. The advantage of the distribution Eq. (1) is that the transition point r = b appears explicitly as a free parameter and can be independently estimated from data
    • The sharp transition between the two regimes in Eq. (1) might seem artificial. We believe that alternative distributions which interpolate between the two scalings could provide a similarly good account of the data. The advantage of the distribution Eq. (1) is that the transition point r = b appears explicitly as a free parameter and can be independently estimated from data.
  • 48
    • 0001835919
    • Models for Power Law Relations in Linguistics and Information Science
    • S. Naranan and V. Balasubrahmanyan, Models for Power Law Relations in Linguistics and Information Science, J. Quant. Linguist. 5, 35 (1998).
    • (1998) J. Quant. Linguist. , vol.5 , pp. 35
    • Naranan, S.1    Balasubrahmanyan, V.2
  • 49
    • 0013122906
    • Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf's Law Revisited
    • R. Ferrer i Cancho and R.V. Solé, Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf's Law Revisited, J. Quant. Linguist. 8, 165 (2001).
    • (2001) J. Quant. Linguist. , vol.8 , pp. 165
    • Ferrer I Cancho, R.1    Solé, R.V.2
  • 50
    • 84871728624
    • Languages Cool as They Expand: Allometric Scaling and the Decreasing Need for New Words
    • A. M. Petersen, J. Tenenbaum, S. Havlin, H. E. Stanley, and M. Perc, Languages Cool as They Expand: Allometric Scaling and the Decreasing Need for New Words, Sci. Rep. 2, 943 (2012).
    • (2012) Sci. Rep. , vol.2 , pp. 943
    • Petersen, A.M.1    Tenenbaum, J.2    Havlin, S.3    Stanley, H.E.4    Perc, M.5
  • 51
    • 0000718929
    • On the Theory of Word Frequencies and on Related Markovian Models of Discourse
    • Proceedings of Symposia in Applied Mathematics Vol. XII (American Mathematical Society, Providence, RI, 1961)
    • B. Mandelbrot, On the Theory of Word Frequencies and on Related Markovian Models of Discourse, Structure of Language and Its Mathematical Aspects: Proceedings of Symposia in Applied Mathematics Vol. XII (American Mathematical Society, Providence, RI, 1961).
    • Structure of Language and Its Mathematical Aspects
    • Mandelbrot, B.1
  • 52
    • 0000570212
    • On a Class of Skew Distribution Functions
    • H. A. Simon, On a Class of Skew Distribution Functions, Biometrika 42, 425 (1955).
    • (1955) Biometrika , vol.42 , pp. 425
    • Simon, H.A.1
  • 53
    • 79952221890
    • Wikipedia Information Flow Analysis Reveals the Scale-Free Architecture of the Semantic Space
    • A. P. Masucci, A. Kalampokis, V. M. Eguíluz, and E. Hernández-García, Wikipedia Information Flow Analysis Reveals the Scale-Free Architecture of the Semantic Space, PLoS ONE 6, e17333 (2011).
    • (2011) PLoS ONE , vol.6 , pp. e17333
    • Masucci, A.P.1    Kalampokis, A.2    Eguíluz, V.M.3    Hernández-García, E.4
  • 54
    • 79961075445
    • Emergence of Zipf's Law in the Evolution of Communication
    • B. Corominas-Murtra, J. Fortuny, and R.V. Solé, Emergence of Zipf's Law in the Evolution of Communication, Phys. Rev. E 83, 036115 (2011).
    • (2011) Phys. Rev. E , vol.83 , pp. 036115
    • Corominas-Murtra, B.1    Fortuny, J.2    Solé, R.V.3
  • 56
    • 84883508103
    • *= 7873) in every single year in one decade and in none of the years in the other decade (ordered by the average frequency in the decade in which they belonged to the core vocabulary)
    • *= 7873) in every single year in one decade and in none of the years in the other decade (ordered by the average frequency in the decade in which they belonged to the core vocabulary).


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.