Volume 81, Issue 6, 2010, Pages

Statistical mechanics of letters in words

Author keywords

[No Author keywords available]

Indexed keywords

ENERGY LANDSCAPE; LOCAL MINIMUMS; MAXIMUM ENTROPY MODELS; PAIRWISE CORRELATION;

EID: 77953980096     PISSN: 15393755     EISSN: 15502376     Source Type: Journal    
DOI: 10.1103/PhysRevE.81.066119     Document Type: Article
Times cited: 51

References (27)
  • 2
    • 11944266539
    • 10.1103/PhysRev.106.620
    • E. T. Jaynes, Phys. Rev. 106, 620 (1957). 10.1103/PhysRev.106.620
    • (1957) Phys. Rev. , vol.106 , pp. 620
    • Jaynes, E.T.1
  • 4
    • 33748175149
    • 10.1523/JNEUROSCI.1282-06.2006
    • J. Shlens, J. Neurosci. 26, 8254 (2006). 10.1523/JNEUROSCI.1282-06.2006
    • (2006) J. Neurosci. , vol.26 , pp. 8254
    • Shlens, J.1
  • 5
    • 77953985114
    • See the presentations at the Society for Neuroscience (http://www.sfn.org/am2007/)
  • 6
    • 77953980618
    • Annual Meeting of the Society for Neuroscience (unpublished), 615.8/O01;
    • I. E. Ohiorhenuan and J. D. Victor, Annual Meeting of the Society for Neuroscience (unpublished), 615.8/O01
    • Ohiorhenuan, I.E.1    Victor, J.D.2
  • 7
    • 77953999923
    • Annual Meeting of the Society for Neuroscience (unpublished), 615.14/O07;
    • S. Yu, D. Huang, W. Singer, and D. Nikolić, Annual Meeting of the Society for Neuroscience (unpublished), 615.14/O07
    • Yu, S.1    Huang, D.2    Singer, W.3    Nikolić, D.4
  • 8
    • 77953990667
    • Annual Meeting of the Society for Neuroscience (unpublished), 790.1/J12;
    • M. A. Sacek, Annual Meeting of the Society for Neuroscience (unpublished), 790.1/J12
    • Sacek, M.A.1
  • 9
    • 77954004565
    • Annual Meeting of the Society for Neuroscience (unpublished), 792.4/K27.
    • A. Tang, Annual Meeting of the Society for Neuroscience (unpublished), 792.4/K27.
    • Tang, A.1
  • 11
    • 77953982900
    • Ph.D. thesis, Princeton University
    • G. Tkačik, Ph.D. thesis, Princeton University, 2007.
    • (2007)
    • Tkačik, G.1
  • 13
    • 77954011771
    • W. Bialek and R. Ranganathan, e-print arXiv:0712.4397.
  • 14
    • 77954000201
    • G. Tkačik, E. Schneidman, M. J. Berry II, and W. Bialek, e-print arXiv:q-bio.NC/0611072.
  • 18
  • 19
    • 0002640874
    • "In one's introductory linguistics course, one learns that Chomsky disabused the field once and for all of the notion that there was anything of interest to statistical models of language. But one usually comes away a little fuzzy on the question of what, precisely, he proved;" S. Abney, in The Balancing Act: Combining Statistical and Symbolic Approaches to Language, edited by J. L. Klavans and P. Resnik (MIT Press, Cambridge, MA, 1996), pp. 1-26.
    • (1996) The Balancing Act: Combining Statistical and Symbolic Approaches to Language , pp. 1-26
    • Abney, S.1
  • 21
    • 77953982565
    • note
    • The Austen word corpus was created via Project Gutenberg (www.gutenberg.org), combining text from Emma, Lady Susan, Love and Friendship, Mansfield Park, Northanger Abbey, Persuasion, Pride and Prejudice, and Sense and Sensibility. Out of 676302 total words in our Austen corpus there were 7114 unique words, 763 of which were four-letter words; the four-letter words occurred in the corpus a total of 135441 times. We used the second release of the ANC (www.americannationalcorpus.org) with ∼2 × 10^7 words and restricted ourselves to words used more than 100 times, providing 798 unique four-letter words occurring 2179108 times. These numbers indicate that we can sample the distribution of four-letter words with reasonable confidence. To control for potential typographic errors, words were also checked against a large dictionary database (http://wordlist.sourceforge.net/12dicts-readme.html).
  • 22
    • 77953987873
    • For technical points about finite sample sizes and error bars, see N. Slonim, G. S. Atwal, G. Tkačik, and W. Bialek, e-print arXiv:cs/0502017v1.
  • 23
    • 77953999612
    • T. Broderick, M. Dudik, G. Tkačik, R. Schapire, and W. Bialek, e-print arXiv:0712.2437.
  • 26
    • 0026953429
    • 10.1109/18.165464
    • W. Li, IEEE Trans. Inf. Theory 38, 1842 (1992). 10.1109/18.165464
    • (1992) IEEE Trans. Inf. Theory , vol.38 , pp. 1842
    • Li, W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.