Volume 10, Issue 4, 2005, Pages 517-541

Creating and using Web corpora

Author keywords

Academic language; Web; Web corpus

EID: 34248700528     PISSN: 13846655     EISSN: 15699811     Source Type: Journal    
DOI: 10.1075/ijcl.10.4.07the     Document Type: Article
Times cited : (12)

References (65)
  • 2
    • 65849487843
    • Detecting student copying in a corpus of science laboratory reports: Simple and smart approaches
    • D. Archer, P. Rayson, A. Wilson & T. McEnery Eds, Lancaster: University of Lancaster
    • Atwell, E., Gent, P. Medori, J. & Souter, C. (2003). Detecting student copying in a corpus of science laboratory reports: Simple and smart approaches. In D. Archer, P. Rayson, A. Wilson & T. McEnery (Eds.), Proceedings of the Corpus Linguistics 2003 conference (pp. 48-53). Lancaster: University of Lancaster
    • (2003) Proceedings of the Corpus Linguistics 2003 conference , pp. 48-53
    • Atwell, E.1    Gent, P.2    Medori, J.3    Souter, C.4
  • 5
    • 0038483826
    • Emergence of scaling in random networks
    • Barabási, A. & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509-512
    • (1999) Science , vol.286 , pp. 509-512
    • Barabási, A.1    Albert, R.2
  • 6
    • 65849164674
    • Progress with a corpus of New Zealand English and some early results
    • C. Souter & E. Atwell Eds, Amsterdam: Rodopi
    • Bauer, L. (1993). Progress with a corpus of New Zealand English and some early results. In C. Souter & E. Atwell (Eds.), Corpus-based computational linguistics (pp. 1-10). Amsterdam: Rodopi
    • (1993) Corpus-based computational linguistics , pp. 1-10
    • Bauer, L.1
  • 7
    • 33746152949
    • Berners-Lee, T. (1992). W3 servers. Available: http://www.w3.org/History/19921103-hypertext/hypertext/DataSources/WWW/Servers.html
    • (1992) W3 servers
    • Berners-Lee, T.1
  • 9
    • 20444408862
    • Variation among University spoken and written registers: A new multi-dimensional analysis
    • P. Leistyna & C. F. Meyer Eds, Amsterdam: Rodopi
    • Biber, D. (2003). Variation among University spoken and written registers: A new multi-dimensional analysis. In P. Leistyna & C. F. Meyer (Eds.), Corpus Analysis: Language Structure and Language Use (pp. 47-70). Amsterdam: Rodopi
    • (2003) Corpus Analysis: Language Structure and Language Use , pp. 47-70
    • Biber, D.1
  • 13
    • 0038589165
    • The anatomy of a large scale hypertextual Web search engine
    • Brin, S. & Page, L. (1998). The anatomy of a large scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30 (1-7), 107-117
    • (1998) Computer Networks and ISDN Systems , vol.30 , Issue.1 , pp. 107-117
    • Brin, S.1    Page, L.2
  • 15
    • 62449293436
    • Sebastopol, CA: O'Reilly
    • Burke, S. (2002). Perl and LWP. Sebastopol, CA: O'Reilly
    • (2002) Perl and LWP
    • Burke, S.1
  • 17
  • 19
    • 33746933109
    • Creating and using multi-million word corpora from web-based newspapers
    • R.C. Simpson & J. M. Swales Eds, Ann Arbor: University of Michigan
    • Davies, M. (2001). Creating and using multi-million word corpora from web-based newspapers. In R.C. Simpson & J. M. Swales (Eds.), Corpus Linguistics in North America (pp. 58-75). Ann Arbor: University of Michigan
    • (2001) Corpus Linguistics in North America , pp. 58-75
    • Davies, M.1
  • 21
    • 85149140333
    • Utilizing the World Wide Web as an encyclopedia: Extracting term descriptions from semi-structured texts
    • Fujii, A. & Ishikawa, T. (2000a). Utilizing the World Wide Web as an encyclopedia: Extracting term descriptions from semi-structured texts. In Association for Computational Linguistics 2000, 488-495
    • (2000) Association for Computational Linguistics 2000 , pp. 488-495
    • Fujii, A.1    Ishikawa, T.2
  • 22
    • 65849409567
    • Cross-Language Information Retrieval Based on Query Keyword Translation: An Internet Search Application
    • Fujii, A. & Ishikawa, T. (2000b). Cross-Language Information Retrieval Based on Query Keyword Translation: An Internet Search Application. International Journal of Computer Processing of Oriental Languages, 13 (1), 1-13
    • (2000) International Journal of Computer Processing of Oriental Languages , vol.13 , Issue.1 , pp. 1-13
    • Fujii, A.1    Ishikawa, T.2
  • 23
    • 85069863121
    • Mapping networks of support for the Zapatista Movement: Applying Social Network Analysis to study contemporary social movements
    • M. McCaughey & M. Ayers Eds, New York: Routledge
    • Garrido, M. & Halavais, A. (2003). Mapping networks of support for the Zapatista Movement: Applying Social Network Analysis to study contemporary social movements. In M. McCaughey & M. Ayers (Eds.), Cyberactivism: online activism in theory and practice (pp. 165-184). New York: Routledge
    • (2003) Cyberactivism: Online activism in theory and practice , pp. 165-184
    • Garrido, M.1    Halavais, A.2
  • 24
    • 25144475200
    • Greenspan, R. (2002). Senior surfing surges. Available: http://cyberatlas.internet.com/big-picture/demographics/article/0,,5901-3111871,00.html
    • (2002) Senior surfing surges
    • Greenspan, R.1
  • 26
    • 79951675059
    • Mercator: A scalable, extensible Web crawler
    • Heydon, A. & Najork, M. (1999). Mercator: A scalable, extensible Web crawler. World Wide Web, 2, 219-229
    • (1999) World Wide Web , vol.2 , pp. 219-229
    • Heydon, A.1    Najork, M.2
  • 28
    • 84930566053
    • Some syntactic and lexico-semantic features of an Indian variant of English
    • Hosali, P. (1991). Some syntactic and lexico-semantic features of an Indian variant of English. Central Institute of English and Foreign Languages Bulletin, 3 (1-2), 65-83
    • (1991) Central Institute of English and Foreign Languages Bulletin , vol.3 , Issue.1 , pp. 65-83
    • Hosali, P.1
  • 30
    • 0344154400
    • Using the Web to obtain frequencies for unseen bigrams
    • Keller, F. & Lapata, M. (2003). Using the Web to obtain frequencies for unseen bigrams. Computational Linguistics, 29 (3), 459-484
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 459-484
    • Keller, F.1    Lapata, M.2
  • 31
    • 84868789046
    • WebCorp: Applying the Web to Linguistics and Linguistics to the Web
    • Honolulu, Hawaii
    • Kehoe, A. & Renouf, A. (2002). WebCorp: Applying the Web to Linguistics and Linguistics to the Web. In Proceedings of the WWW2002 Conference, Honolulu, Hawaii. http://www2002.org/CDROM/poster/67/
    • (2002) Proceedings of the WWW2002 Conference
    • Kehoe, A.1    Renouf, A.2
  • 32
    • 0344154403
    • Introduction to the special issue on the Web as corpus
    • Kilgarriff, A. & Grefenstette, G. (2003). Introduction to the special issue on the Web as corpus. Computational Linguistics, 29 (3), 333-347
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 333-347
    • Kilgarriff, A.1    Grefenstette, G.2
  • 33
    • 0012945808
    • Kilgarriff, A. (2003). BNC database and word frequency lists. Retrieved December 2, 2003, from http://www.itri.brighton.ac.uk/Adam.Kilgarriff/bnc-readme.html
    • (2003) BNC database and word frequency lists
    • Kilgarriff, A.1
  • 34
    • 4243148480
    • Authoritative sources in a hyperlinked environment
    • Kleinberg, J. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46 (5), 604-632
    • (1999) Journal of the ACM , vol.46 , Issue.5 , pp. 604-632
    • Kleinberg, J.1
  • 35
    • 0344154402
    • Embedding Web-based statistical translation models in cross-language information retrieval
    • Kraaij, W., Nie, J. & Simard, M. (2003). Embedding Web-based statistical translation models in cross-language information retrieval. Computational Linguistics, 29 (3), 381-417
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 381-417
    • Kraaij, W.1    Nie, J.2    Simard, M.3
  • 36
    • 0033536218
    • Accessibility of information on the Web
    • Lawrence, S. & Giles, C. (1999). Accessibility of information on the Web. Nature, 400, 107-109
    • (1999) Nature , vol.400 , pp. 107-109
    • Lawrence, S.1    Giles, C.2
  • 37
    • 84868719069
    • Swedish TEFL meets reality
    • S. Johansson & A. Stenström Eds, Berlin: Mouton de Gruyter
    • Ljung, M. (1991). Swedish TEFL meets reality. In S. Johansson & A. Stenström (Eds.), English Computer Corpora: Selected Papers and Research Guide (pp. 245-256). Berlin: Mouton de Gruyter
    • (1991) English Computer Corpora , pp. 245-256
    • Ljung, M.1
  • 38
    • 33746918899
    • Tracking ongoing grammatical change and recent diversification in present-day standard English: The complementary role of small and large corpora
    • Mair, C. (2003). Tracking ongoing grammatical change and recent diversification in present-day standard English: the complementary role of small and large corpora. Paper presented at the annual ICAME conference, Guernsey. Available: http://fips.igl.unifreiburg.de/lsf/docs/Mair.pdf
    • (2003) annual ICAME conference
    • Mair, C.1
  • 39
    • 34249852033
    • Building a large annotated corpus of English: The Penn Treebank
    • Marcus, M., Santorini, B. & Marcinkiewicz, M. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19 (2), 313-330
    • (1993) Computational Linguistics , vol.19 , Issue.2 , pp. 313-330
    • Marcus, M.1    Santorini, B.2    Marcinkiewicz, M.3
  • 40
    • 61049241267
    • Swearing in modern British English
    • B. Lewandowska-Tomaszczyk & J. Melia Eds, Europaischer Verlag der Wissenschaften: Peter Lang
    • McEnery, T., Baker, P. & Hardie, A. (2000). Swearing in modern British English. In B. Lewandowska-Tomaszczyk & J. Melia (Eds.), Practical Applications in Language Corpora (pp. 37-48). Europaischer Verlag der Wissenschaften: Peter Lang
    • (2000) Practical Applications in Language Corpora , pp. 37-48
    • McEnery, T.1    Baker, P.2    Hardie, A.3
  • 42
    • 65849390036
    • Corpus linguistics
    • R. Mitkov Ed, Oxford: Oxford University Press
    • McEnery, T. (2003). Corpus linguistics. In R. Mitkov (Ed.), The Oxford handbook of computational linguistics (pp. 448-463). Oxford: Oxford University Press
    • (2003) The Oxford handbook of computational linguistics , pp. 448-463
    • McEnery, T.1
  • 43
    • 0037195143
    • Growing and navigating the small world Web by local content
    • Menczer, F. (2002). Growing and navigating the small world Web by local content. Proceedings of the National Academy of Sciences, 99 (22), 14014-14019
    • (2002) Proceedings of the National Academy of Sciences , vol.99 , Issue.22 , pp. 14014-14019
    • Menczer, F.1
  • 44
    • 0035612855
    • Internet search engines-fluctuations in document accessibility
    • Mettrop, W. & Nieuwenhuysen, P. (2001). Internet search engines-fluctuations in document accessibility. Journal of Documentation, 57 (5), 623-651
    • (2001) Journal of Documentation , vol.57 , Issue.5 , pp. 623-651
    • Mettrop, W.1    Nieuwenhuysen, P.2
  • 46
    • 0344718508
    • Presenting a model for the structure and content of a university World Wide Web site
    • Middleton, I., McConnell, M. & Davidson, G. (1999). Presenting a model for the structure and content of a university World Wide Web site. Journal of Information Science, 25 (3), 219-227
    • (1999) Journal of Information Science , vol.25 , Issue.3 , pp. 219-227
    • Middleton, I.1    McConnell, M.2    Davidson, G.3
  • 48
    • 33846964050
    • Text segmentation
    • R. Mitkov Ed, Oxford: Oxford University Press
    • Mikheev, A. (2003). Text segmentation. In R. Mitkov (Ed.), The Oxford handbook of computational linguistics (pp. 201-218). Oxford: Oxford University Press
    • (2003) The Oxford handbook of computational linguistics , pp. 201-218
    • Mikheev, A.1
  • 50
    • 84898588482
    • Linguistic research with the XML/RDF aware WebCorp tool
    • Budapest
    • Morley, B., Renouf, A. & Kehoe, A. (2003). Linguistic research with the XML/RDF aware WebCorp tool. In Proceedings of the WWW2003 Conference, Budapest. http://www2003.org/cdrom/papers/poster/p005/p5-morley.html
    • (2003) Proceedings of the WWW2003 Conference
    • Morley, B.1    Renouf, A.2    Kehoe, A.3
  • 51
    • 84967550311
    • American and British influence on Australian verb morphology
    • U. Fries, G. Tottie & P. Schneider Eds, Amsterdam: Rodopi
    • Peters, P. (1993). American and British influence on Australian verb morphology. In U. Fries, G. Tottie & P. Schneider (Eds.), Creating and using language corpora (pp. 149-158). Amsterdam: Rodopi
    • (1993) Creating and using language corpora , pp. 149-158
    • Peters, P.1
  • 53
    • 84948691857
    • Towards automatic Web genre identification-A corpus-based approach in the domain of academia by example of the academic's Personal Homepage
    • Big Island, Hawaii, January
    • Rehm, G. (2002). Towards automatic Web genre identification-A corpus-based approach in the domain of academia by example of the academic's Personal Homepage. In Proceedings of the Hawaii International Conference on System Sciences, Big Island, Hawaii, January. Available: http://www.uni-giessen.de/g91063/pdf/HICSS35-rehm.pdf
    • (2002) Proceedings of the Hawaii International Conference on System Sciences
    • Rehm, G.1
  • 54
    • 0345376175
    • The Web as a parallel corpus
    • Resnik, P. & Smith, N. (2003). The Web as a parallel corpus. Computational Linguistics, 29 (3), 349-380
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 349-380
    • Resnik, P.1    Smith, N.2
  • 56
    • 0344154302
    • Automatic association of Web directories with word senses
    • Santamaria, C., Gonzalo, J. & Verdejo, F. (2003). Automatic association of Web directories with word senses. Computational Linguistics, 29 (3), 485-503
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 485-503
    • Santamaria, C.1    Gonzalo, J.2    Verdejo, F.3
  • 60
    • 65849499658
    • Do we talk (or write?) differently over the Net? A lexical enquiry into 'a' Net-EN
    • D. Archer, P. Rayson, A. Wilson & T. McEnery Eds, Lancaster: University of Lancaster
    • Takahashi, J. (2003). Do we talk (or write?) differently over the Net? A lexical enquiry into 'a' Net-EN. In D. Archer, P. Rayson, A. Wilson & T. McEnery (Eds.), Proceedings of the Corpus Linguistics 2003 conference (pp. 764-772). Lancaster: University of Lancaster
    • (2003) Proceedings of the Corpus Linguistics 2003 conference , pp. 764-772
    • Takahashi, J.1
  • 61
    • 0041724754
    • Linguistic patterns of academic Web use in Western Europe
    • Thelwall, M., Tang, R. & Price, E. (2003). Linguistic patterns of academic Web use in Western Europe. Scientometrics, 56 (3), 417-432
    • (2003) Scientometrics , vol.56 , Issue.3 , pp. 417-432
    • Thelwall, M.1    Tang, R.2    Price, E.3
  • 62
    • 0035204851
    • A Web crawler design for data mining
    • Thelwall, M. (2001). A Web crawler design for data mining. Journal of Information Science, 27 (5), 319-325
    • (2001) Journal of Information Science , vol.27 , Issue.5 , pp. 319-325
    • Thelwall, M.1
  • 64
    • 2942610876
    • Search engine coverage bias: Evidence and possible causes
    • Vaughan, L. & Thelwall, M. (2004). Search engine coverage bias: Evidence and possible causes. Information Processing & Management, 40 (4), 693-707
    • (2004) Information Processing & Management , vol.40 , Issue.4 , pp. 693-707
    • Vaughan, L.1    Thelwall, M.2
  • 65
    • 0345016957
    • WebMT: Developing and validating an example-based machine translation system using the world wide Web
    • Way, A. & Gough, N. (2003). WebMT: Developing and validating an example-based machine translation system using the world wide Web. Computational Linguistics, 29 (3), 421-457
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 421-457
    • Way, A.1    Gough, N.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.