2014, Pages 1-452

Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining

Author keywords

[No Author keywords available]

Indexed keywords


EID: 85128786445     PISSN: None     EISSN: None     Source Type: Book    
DOI: None     Document Type: Book
Times cited : (98)

References (159)
  • 1
    • 79952157680
    • R package version 0.2
    • Adler D. 2005. vioplot: Violin Plot. R package version 0.2. http://wsopuppenkiste.wiso.uni-goettingen.de/~dadler
    • (2005) vioplot: Violin Plot
    • Adler, D.1
  • 2
    • 84873628465
    • O’Reilly, Sebastopol, CA
    • Adler J. 2006. Baseball Hacks. O’Reilly, Sebastopol, CA.
    • (2006) Baseball Hacks
    • Adler, J.1
  • 10
    • 77957952692
    • O’Reilly, Sebastopol, CA
    • Beaulieu A. 2009. Learning SQL. O’Reilly, Sebastopol, CA.
    • (2009) Learning SQL
    • Beaulieu, A.1
  • 14
    • 33644890961
    • RFC 114, (Last accessed December 16, 2013)
    • Bhushan A. 1971. A file transfer protocol. RFC 114. http://tools.ietf.org/html/rfc114 (Last accessed December 16, 2013).
    • (1971) A file transfer protocol
    • Bhushan, A.1
  • 24
    • 79960909239
    • Networks in the legislative arena: How group dynamics affect cosponsorship
    • Bratton K and Rouse SM. 2011. Networks in the legislative arena: How group dynamics affect cosponsorship. Legislative Studies Quarterly 36(3), 423-460.
    • (2011) Legislative Studies Quarterly , vol.36 , Issue.3 , pp. 423-460
    • Bratton, K.1    Rouse, S.M.2
  • 25
    • 84891941337
    • National and local influenza surveillance through twitter: An analysis of the 2012-2013 influenza epidemic
    • Broniatowski DA, Paul MJ, and Dredze M. 2013. National and local influenza surveillance through twitter: An analysis of the 2012-2013 influenza epidemic. Plos One 8(12), doi:10.1371/journal.pone.0083672.
    • (2013) Plos One , vol.8 , Issue.12
    • Broniatowski, D.A.1    Paul, M.J.2    Dredze, M.3
  • 29
    • 85186008514
    • (Last accessed March 1, 2014)
    • Center for Responsive Politics. 2014. http://www.opensecrets.org/ (Last accessed March 1, 2014).
    • (2014)
  • 32
    • 85034850329
    • Proceedings of the 1974 ACM SIGFIDET Workshop on Data Description, Access and Control, May 1974, Ann Arbor, MI
    • Chamberlin DD and Boyce RF. 1974. Sequel: A Structured English Query Language. Proceedings of the 1974 ACM SIGFIDET Workshop on Data Description, Access and Control, May 1974, Ann Arbor, MI, pp. 249-264.
    • (1974) Sequel: A Structured English Query Language. , pp. 249-264
    • Chamberlin, D.D.1    Boyce, R.F.2
  • 33
    • 33845202283
    • An empirical examination of Wikipedia’s credibility
    • Chesney T. 2006. An empirical examination of Wikipedia’s credibility. First Monday 11(11), doi:10.5210/fm.v11i11.1413.
    • (2006) First Monday , vol.11 , Issue.11
    • Chesney, T.1
  • 34
    • 77951212244
    • Legislative success in a small world: Social network analysis and the dynamics of congressional legislation
    • Cho WKT and Fowler JH. 2010. Legislative success in a small world: Social network analysis and the dynamics of congressional legislation. Journal of Politics 72(1), 124-135.
    • (2010) Journal of Politics , vol.72 , Issue.1 , pp. 124-135
    • Cho, W.K.T.1    Fowler, J.H.2
  • 37
    • 0014797273
    • A relational model of data for large shared data banks
    • Codd EF. 1970. A relational model of data for large shared data banks. Communications of the ACM 13(6), 377-387.
    • (1970) Communications of the ACM , vol.13 , Issue.6 , pp. 377-387
    • Codd, E.F.1
  • 39
    • 85013183953
    • R package version 0.2.13
    • Couture-Beil A. 2013. Rjson: JSON for R. R package version 0.2.13. http://CRAN.R-project.org/package=rjson
    • (2013) Rjson: JSON for R
    • Couture-Beil, A.1
  • 40
    • 34548524804
    • 2nd edn. John Wiley & Sons, Hoboken, NJ
    • Crawley MJ. 2012. The R Book. 2nd edn. John Wiley & Sons, Hoboken, NJ.
    • (2012) The R Book
    • Crawley, M.J.1
  • 43
    • 84892617949
    • (Last accessed October 15, 2013)
    • Dailey D. 2010. An SVG primer for today’s browsers. http://www.w3.org/Graphics/SVG/IG/resources/svgprimer.html (Last accessed October 15, 2013).
    • (2010) An SVG primer for today’s browsers
    • Dailey, D.1
  • 46
    • 0004035649
    • RFC 2246, (Last accessed December 12, 2013)
    • Dierks T and Allen C. 1999. The tls protocol version 1.0. RFC 2246. http://tools.ietf.org/html/rfc2246 (Last accessed December 12, 2013).
    • (1999) The tls protocol version 1.0
    • Dierks, T.1    Allen, C.2
  • 48
    • 84873943261
    • The collective action of data collection: A data infrastructure on parties, elections and cabinets
    • Döring H. 2013. The collective action of data collection: A data infrastructure on parties, elections and cabinets. European Union Politics 14(1), 161-178.
    • (2013) European Union Politics , vol.14 , Issue.1 , pp. 161-178
    • Döring, H.1
  • 49
    • 84962233126
    • Internet ‘data scraping’: A primer for counseling clients
    • July
    • Dreyer AJ and Stockton J. 2013. Internet ‘data scraping’: A primer for counseling clients. New York Law Journal July, 1-3.
    • (2013) New York Law Journal , pp. 1-3
    • Dreyer, A.J.1    Stockton, J.2
  • 52
    • 85185995434
    • (Last accessed January 26, 2014)
    • Essaid R. 2013b. Scraping just got a lot more dangerous. http://www.distilnetworks.com/scraping-just-got-a-lot-more-dangerous/ (Last accessed January 26, 2014).
    • (2013) Scraping just got a lot more dangerous
    • Essaid, R.1
  • 61
    • 33748766601
    • Connecting the congress: A study of cosponsorship networks
    • Fowler JH. 2006a. Connecting the congress: A study of cosponsorship networks. Political Analysis 14(4), 456-487.
    • (2006) Political Analysis , vol.14 , Issue.4 , pp. 456-487
    • Fowler, J.H.1
  • 62
    • 33746211706
    • Legislative cosponsorship networks in the us house and senate
    • Fowler JH. 2006b. Legislative cosponsorship networks in the us house and senate. Social Networks 28(4), 454-465.
    • (2006) Social Networks , vol.28 , Issue.4 , pp. 454-465
    • Fowler, J.H.1
  • 67
    • 80052005583
    • 3rd edn. O’Reilly, Sebastopol, CA
    • Gennick J. 2011. SQL Pocket Guide. 3rd edn. O’Reilly, Sebastopol, CA.
    • (2011) SQL Pocket Guide.
    • Gennick, J.1
  • 71
    • 30744439551
    • Internet encyclopaedias go head to head
    • Giles J. 2005. Internet encyclopaedias go head to head. Nature 438, 900-901.
    • (2005) Nature , vol.438 , pp. 900-901
    • Giles, J.1
  • 72
    • 85185990105
    • (Last accessed August 11, 2014)
    • Gillespie P. 2012. Pronouncing sql: S-q-l or sequel? http://patorjk.com/blog/2012/01/26/pronouncing-sql-s-q-l-or-sequel/ (Last accessed August 11, 2014).
    • (2012) Pronouncing sql: S-q-l or sequel?
    • Gillespie, P.1
  • 74
    • 84910665817
    • (Last accessed August 3, 2014)
    • Google. 2014. Chrome devtools. https://developers.google.com/chrome-developer-tools/ (Last accessed August 3, 2014).
    • (2014) Chrome devtools
  • 76
    • 84880655688
    • Text as data: The promise and pitfalls of automatic content analysis methods for political texts
    • Grimmer J and Stewart BM. 2013. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21(3), 267-297.
    • (2013) Political Analysis , vol.21 , Issue.3 , pp. 267-297
    • Grimmer, J.1    Stewart, B.M.2
  • 78
    • 77954595043
    • RFC 5849, (Last accessed February 25, 2014)
    • Hammer-Lahav E. 2010. The Oauth 1.0 protocol. RFC 5849. http://tools.ietf.org/html/rfc5849 (Last accessed February 25, 2014).
    • (2010) The Oauth 1.0 protocol
    • Hammer-Lahav, E.1
  • 79
    • 80052123982
    • RFC 6749, (Last accessed February 25, 2014)
    • Hardt D. 2012. The Oauth 2.0 authorization framework. RFC 6749. http://tools.ietf.org/html/rfc6749 (Last accessed February 25, 2014).
    • (2012) The Oauth 2.0 authorization framework
    • Hardt, D.1
  • 83
    • 0032376126
    • Violin plots: A box plot-density trace synergism
    • Hintze JL and Nelson RD. 1998. Violin plots: A box plot-density trace synergism. The American Statistician 52(2), 181-184.
    • (1998) The American Statistician , vol.52 , Issue.2 , pp. 181-184
    • Hintze, J.L.1    Nelson, R.D.2
  • 87
    • 73649109957
    • A method of automated nonparametric content analysis for social science
    • Hopkins D and King G. 2010. A method of automated nonparametric content analysis for social science. American Journal of Political Science 54(1), 229-247.
    • (2010) American Journal of Political Science , vol.54 , Issue.1 , pp. 229-247
    • Hopkins, D.1    King, G.2
  • 88
    • 67349192921
    • Open-source machine learning: R meets weka
    • Hornik K, Buchta C, and Zeileis A. 2009. Open-source machine learning: R meets weka. Computational Statistics 24(2), 225-232.
    • (2009) Computational Statistics , vol.24 , Issue.2 , pp. 225-232
    • Hornik, K.1    Buchta, C.2    Zeileis, A.3
  • 89
    • 85186011313
    • Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), August 2004, Seattle, WA
    • Hu M and Liu B. 2004. Mining and Summarizing Customer Reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), August 2004, Seattle, WA.
    • (2004) Mining and Summarizing Customer Reviews.
    • Hu, M.1    Liu, B.2
  • 92
    • 84883239661
    • ggmap: Spatial visualization with ggplot2
    • Kahle D and Wickham H. 2013. ggmap: Spatial visualization with ggplot2. The R Journal 5(1), 144-161.
    • (2013) The R Journal , vol.5 , Issue.1 , pp. 144-161
    • Kahle, D.1    Wickham, H.2
  • 95
    • 84861946113
    • 2nd edn. John Wiley & Sons, Hoboken, NJ
    • Kriegel A and Trukhnov BM. 2008. SQL Bible. 2nd edn. John Wiley & Sons, Hoboken, NJ.
    • (2008) SQL Bible.
    • Kriegel, A.1    Trukhnov, B.M.2
  • 106
    • 85186030462
    • (Last accessed March 10, 2014)
    • Mozilla Developer Network. 2013. XPath. https://developer.mozilla.org/en-US/docs/Web/XPath (Last accessed March 10, 2014).
    • (2013) XPath
  • 114
    • 35748944689
    • Classes and methods for spatial data in R
    • Pebesma EJ and Bivand RS. 2005. Classes and methods for spatial data in R. R News 5(2), 9-13.
    • (2005) R News , vol.5 , Issue.2 , pp. 9-13
    • Pebesma, E.J.1    Bivand, R.S.2
  • 116
  • 117
    • 80052400297
    • (Last accessed February 27, 2014)
    • Princeton University. 2010a. About WordNet. http://wordnet.princeton.edu (Last accessed February 27, 2014).
    • (2010) About WordNet.
  • 118
    • 85186058213
    • (Last accessed February 27, 2014)
    • Princeton University. 2010b. License and Commercial Use of WordNet. http://wordnet.princeton.edu/wordnet/license/ (Last accessed February 27, 2014).
    • (2010) License and Commercial Use of WordNet.
  • 119
    • 84904410315
    • R package version 0.2-7, (Last accessed August 13, 2014)
    • R Special Interest Group on Databases. 2013. DBI: R Database Interface. R package version 0.2-7. http://cran.r-project.org/web/packages/DBI/index.html (Last accessed August 13, 2014).
    • (2013) DBI: R Database Interface
  • 120
    • 84881576435
    • (Last accessed March 22, 2014)
    • Ramisch C. 2008. N-gram models for language detection. http://www.inf.ufrgs.br/ceramisch/download_files/courses/Master_FRANCE/ENSIMAG_2008_2/Ingenierie_des_Langues_et_de_la_Parole/Rapport.pdf (Last accessed March 22, 2014).
    • (2008) N-gram models for language detection
    • Ramisch, C.1
  • 122
    • 0142154760
    • 2nd edn. O’Reilly, Sebastopol, CA
    • Ray ET. 2003. Learning XML. 2nd edn. O’Reilly, Sebastopol, CA.
    • (2003) Learning XML
    • Ray, E.T.1
  • 124
    • 39449134841
    • Comparison of Wikipedia and other encyclopedias for accuracy, breadth, and depth in historical articles
    • Rector LH. 2008. Comparison of Wikipedia and other encyclopedias for accuracy, breadth, and depth in historical articles. Reference Services Review 36(1), 7-22.
    • (2008) Reference Services Review , vol.36 , Issue.1 , pp. 7-22
    • Rector, L.H.1
  • 127
    • 85185992272
    • Selenium Documentation, (Last accessed March 24, 2014)
    • Selenium Project. 2014a. Selenium Commands - Selenese. Selenium Documentation. http://docs.seleniumhq.org (Last accessed March 24, 2014).
    • (2014) Selenium Commands - Selenese.
  • 128
    • 84937486698
    • (Last accessed March 24, 2014)
    • Selenium Project. 2014b. Selenium Documentation. http://docs.seleniumhq.org/docs/ (Last accessed March 24, 2014).
    • (2014) Selenium Documentation.
  • 134
    • 81855225412
    • O’Reilly, Sebastopol, CA
    • Teetor P. 2011. R Cookbook. O’Reilly, Sebastopol, CA.
    • (2011) R Cookbook
    • Teetor, P.1
  • 135
    • 85186033234
    • (Last accessed December 13, 2013)
    • Temple Lang D. 2012a. RCurl philosophy. http://www.omegahat.org/RCurl/philosophy.html (Last accessed December 13, 2013).
    • (2012) RCurl philosophy
    • Temple Lang, D.1
  • 136
    • 85186030082
    • R package version 0.9-1
    • Temple Lang D. 2012b. SSOAP: Client-Side SOAP Access for S. R package version 0.9-1. http://www.omegahat.org/SSOAP, http://www.omegahat.org, http://www.omegahat.org/bugs
    • (2012) SSOAP: Client-Side SOAP Access for S
    • Temple Lang, D.1
  • 143
    • 81755187390
    • Election forecasts with twitter. How 140 characters reflect the political landscape
    • Tumasjan A, Sprenger TO, Sandner PG, and Welpe IM. 2011. Election forecasts with twitter. How 140 characters reflect the political landscape. Social Science Computer Review 29(4), 402-418.
    • (2011) Social Science Computer Review , vol.29 , Issue.4 , pp. 402-418
    • Tumasjan, A.1    Sprenger, T.O.2    Sandner, P.G.3    Welpe, I.M.4
  • 144
    • 85171506583
    • USA v. Swartz, (Last accessed January 26, 2014)
    • United States District Court District of Massachusetts. 2013. USA v. Swartz. http://pacer.mad.uscourts.gov/dc/cgi-bin/recentops.pl?filename=gorton/pdf/swartz%20protective%20order%20mo.pdf (Last accessed January 26, 2014).
    • (2013) United States District Court District of Massachusetts
  • 145
    • 85186059901
    • W3C, (Last accessed December 2, 2013)
    • W3C. 1999. W3C. http://www.w3.org/TR/xpath/ (Last accessed December 2, 2013).
    • (1999)
  • 146
    • 85076234242
    • (Last accessed January 26, 2014)
    • Warden P. 2010. How i got sued by Facebook. http://petewarden.com/2010/04/05/how-i-got-sued-by-facebook/ (Last accessed January 26, 2014).
    • (2010) How i got sued by Facebook
    • Warden, P.1
  • 147
    • 84883349095
    • Stringr: Modern, consistent string processing
    • Wickham H. 2010. Stringr: Modern, consistent string processing. The R Journal 2(2), 38-40.
    • (2010) The R Journal , vol.2 , Issue.2 , pp. 38-40
    • Wickham, H.1
  • 148
    • 79961232797
    • The split-apply-combine strategy for data analysis
    • Wickham H. 2011. The split-apply-combine strategy for data analysis. Journal of Statistical Software 40(1), 1-29.
    • (2011) Journal of Statistical Software , vol.40 , Issue.1 , pp. 1-29
    • Wickham, H.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.