Volume , Issue , 2014, Pages 117-126

Towards building a scholarly big data platform: Challenges, lessons and opportunities

Author keywords

Big Data; Information Extraction; Scholarly Big Data

Indexed keywords

BIG DATA; COMPUTER ARCHITECTURE; DATA ANALYTICS; DIGITAL LIBRARIES; DISTRIBUTED DATABASE SYSTEMS; INFORMATION RETRIEVAL;

EID: 84919397810     PISSN: 15525996     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/JCDL.2014.6970157     Document Type: Conference Paper
Times cited : (48)

References (39)
  • 2
    • 33749012764
    • Layout and content extraction for PDF documents
    • Springer
    • H. Chao and J. Fan. "Layout and content extraction for PDF documents," in Document Analysis Systems VI. Springer, 2004, pp. 213-224.
    • (2004) Document Analysis Systems VI , pp. 213-224
    • Chao, H.1    Fan, J.2
  • 3
    • 84893307393
    • ASCOS: An asymmetric network structure context similarity measure
    • H.-H. Chen and C. L. Giles. "ASCOS: An asymmetric network structure context similarity measure," in Proceedings of ASONAM, 2013, pp. 442-449.
    • (2013) Proceedings of ASONAM , pp. 442-449
    • Chen, H.-H.1    Giles, C.L.2
  • 4
    • 79960548564
    • CollabSeer: A search engine for collaboration discovery
    • H.-H. Chen, L. Gou, X. Zhang, and C. L. Giles. "CollabSeer: A search engine for collaboration discovery," in Proceedings of JCDL, 2011, pp. 231-240.
    • (2011) Proceedings of JCDL , pp. 231-240
    • Chen, H.-H.1    Gou, L.2    Zhang, X.3    Giles, C.L.4
  • 8
    • 84919328754
    • Automatic identification of research articles from crawled documents
    • C. Caragea, J. Wu, K. Williams, S. Das G., M. Khabsa, P. Teregowda, and C. L. Giles. "Automatic identification of research articles from crawled documents," in Proceedings of WSDM-WSCBD, 2014.
    • (2014) Proceedings of WSDM-WSCBD
    • Caragea, C.1    Wu, J.2    Giles, C.L.3
  • 9
    • 85029602093
    • ParsCit: An open-source CRF reference string parsing package
    • I. G. Councill, C. L. Giles, and M.-Y. Kan. "ParsCit: An open-source CRF reference string parsing package," in Proceedings of the LREC, 2008, pp. 661-667.
    • (2008) Proceedings of the LREC , pp. 661-667
    • Councill, I.G.1    Giles, C.L.2    Kan, M.-Y.3
  • 10
    • 42549140738
    • An experimental comparison of click position-bias models
    • N. Craswell, O. Zoeter, M. Taylor, and B. Ramsey. "An experimental comparison of click position-bias models," in WSDM, 2008, pp. 87-94.
    • (2008) WSDM , pp. 87-94
    • Craswell, N.1    Zoeter, O.2    Taylor, M.3    Ramsey, B.4
  • 12
    • 57349140734
    • A user browsing model to predict search engine click data from past observations
    • G. Dupret and B. Piwowarski. "A user browsing model to predict search engine click data from past observations," in SIGIR, 2008, pp. 331-338.
    • (2008) SIGIR , pp. 331-338
    • Dupret, G.1    Piwowarski, B.2
  • 15
    • 84941274546
    • Automatic document metadata extraction using support vector machines
    • H. Han, C. L. Giles, E. Manavoglu, H. Zha, Z. Zhang, and E. A. Fox. "Automatic document metadata extraction using support vector machines," in JCDL, 2003, pp. 37-48.
    • (2003) JCDL , pp. 37-48
    • Han, H.1    Giles, C.L.2    Manavoglu, E.3    Zha, H.4    Zhang, Z.5    Fox, E.A.6
  • 16
    • 42549170654
    • Collaboration over time: Characterizing and modeling network evolution
    • J. Huang, Z. Zhuang, J. Li, and C. L. Giles. "Collaboration over time: characterizing and modeling network evolution," in Proceedings of WSDM, 2008, pp. 107-116.
    • (2008) Proceedings of WSDM , pp. 107-116
    • Huang, J.1    Zhuang, Z.2    Li, J.3    Giles, C.L.4
  • 18
    • 0001685668
    • Real life information retrieval: A study of user queries on the web
    • B. J. Jansen, A. Spink, J. Bateman, and T. Saracevic. "Real life information retrieval: A study of user queries on the web," SIGIR Forum, vol. 32, no. 1, pp. 5-17, 1998.
    • (1998) SIGIR Forum , vol.32 , Issue.1 , pp. 5-17
    • Jansen, B.J.1    Saracevic, T.2
  • 19
    • 77958607310
    • Utilizing context in generative bayesian models for linked corpus
    • S. Kataria, P. Mitra, and S. Bhatia. "Utilizing context in generative bayesian models for linked corpus," in Proceedings of AAAI, 2010, pp. 1340-1345.
    • (2010) Proceedings of AAAI , pp. 1340-1345
    • Kataria, S.1    Mitra, P.2    Bhatia, S.3
  • 20
    • 0036383247
    • Exploring behavior of e-journal users in science and technology: Transaction log analysis of Elsevier's ScienceDirect OnSite in Taiwan
    • H.-R. Ke, R. Kwakkelaar, Y.-M. Tai, and L.-C. Chen. "Exploring behavior of e-journal users in science and technology: Transaction log analysis of Elsevier's ScienceDirect OnSite in Taiwan," Library and Information Science Research, vol. 24, no. 3, pp. 265-291, 2002.
    • (2002) Library and Information Science Research , vol.24 , Issue.3 , pp. 265-291
    • Ke, H.-R.1    Kwakkelaar, R.2    Tai, Y.-M.3    Chen, L.-C.4
  • 21
    • 84901218198
    • The number of scholarly documents on the public web
    • M. Khabsa and C. L. Giles. "The number of scholarly documents on the public web," PLOS ONE, 2014.
    • (2014) PLOS ONE
    • Khabsa, M.1    Giles, C.L.2
  • 22
    • 84863541932
    • AckSeer: A repository and search engine for automatically extracted acknowledgments from digital libraries
    • M. Khabsa, P. Treeratpituk, and C. L. Giles. "AckSeer: A repository and search engine for automatically extracted acknowledgments from digital libraries," in Proceedings of JCDL, 2012, pp. 185-194.
    • (2012) Proceedings of JCDL , pp. 185-194
    • Khabsa, M.1    Treeratpituk, P.2    Giles, C.L.3
  • 23
    • 36348992621
    • TableSeer: Automatic table metadata extraction and searching in digital libraries
    • Y. Liu, K. Bai, P. Mitra, and C. L. Giles. "TableSeer: Automatic table metadata extraction and searching in digital libraries," in JCDL, 2007, pp. 91-100.
    • (2007) JCDL , pp. 91-100
    • Liu, Y.1    Bai, K.2    Mitra, P.3    Giles, C.L.4
  • 24
    • 67650417928
    • Automated analysis of images in documents for intelligent document search
    • X. Lu, S. Kataria, W. J. Brouwer, J. Z. Wang, P. Mitra, and C. L. Giles. "Automated analysis of images in documents for intelligent document search," IJDAR, vol. 12, no. 2, pp. 65-81, 2009.
    • (2009) IJDAR , vol.12 , Issue.2 , pp. 65-81
    • Lu, X.1    Kataria, S.2    Brouwer, W.J.3    Wang, J.Z.4    Mitra, P.5    Giles, C.L.6
  • 27
    • 81855221691
    • An analysis of web proxy logs with query distribution pattern approach for search engines
    • M. Taghavi, A. Patel, N. Schmidt, C. Wills, and Y. Tew. "An analysis of web proxy logs with query distribution pattern approach for search engines," Computer Standards and Interfaces, vol. 34, no. 1, pp. 162-170, 2012.
    • (2012) Computer Standards and Interfaces , vol.34 , Issue.1 , pp. 162-170
    • Taghavi, M.1    Patel, A.2    Schmidt, N.3    Wills, C.4    Tew, Y.5
  • 28
    • 70450273106
    • Disambiguating authors in academic publications using random forests
    • P. Treeratpituk and C. L. Giles. "Disambiguating authors in academic publications using random forests," in Proceedings of JCDL, 2009, pp. 39-48.
    • (2009) Proceedings of JCDL , pp. 39-48
    • Treeratpituk, P.1    Giles, C.L.2
  • 29
    • 84889575897
    • Automatic detection of pseudocodes in scholarly documents using machine learning
    • S. Tuarob, S. Bhatia, P. Mitra, and C. Giles. "Automatic detection of pseudocodes in scholarly documents using machine learning," in Proceedings of ICDAR, 2013, pp. 738-742.
    • (2013) Proceedings of ICDAR , pp. 738-742
    • Tuarob, S.1    Bhatia, S.2    Mitra, P.3    Giles, C.4
  • 30
    • 84882283563
    • A classification scheme for algorithm citation function in scholarly works
    • S. Tuarob, P. Mitra, and C. Giles. "A classification scheme for algorithm citation function in scholarly works," in Proceedings of JCDL, 2013, pp. 367-368.
    • (2013) Proceedings of JCDL , pp. 367-368
    • Tuarob, S.1    Mitra, P.2    Giles, C.3
  • 31
    • 84863539696
    • Improving algorithm search using the algorithm co-citation network
    • S. Tuarob, P. Mitra, and C. L. Giles. "Improving algorithm search using the algorithm co-citation network," in Proceedings of JCDL, 2012, pp. 277-280.
    • (2012) Proceedings of JCDL , pp. 277-280
    • Tuarob, S.1    Mitra, P.2    Giles, C.L.3
  • 32
    • 84869071720
    • The evolution of a crawling strategy for an academic document search engine: Whitelists and blacklists
    • J. Wu, P. Teregowda, J. P. F. Ramírez, P. Mitra, S. Zheng, and C. L. Giles. "The evolution of a crawling strategy for an academic document search engine: whitelists and blacklists," in Proceedings of WebSci, 2012, pp. 340-343.
    • (2012) Proceedings of WebSci , pp. 340-343
    • Wu, J.1    Teregowda, P.2    Ramírez, J.P.F.3    Mitra, P.4    Zheng, S.5    Giles, C.L.6
  • 34
    • 84887328647
    • Searching online book documents and analyzing book citations
    • Z. Wu, S. Das, Z. Li, P. Mitra, and C. L. Giles. "Searching online book documents and analyzing book citations," in Proceedings of DocEng, 2013, pp. 81-90.
    • (2013) Proceedings of DocEng , pp. 81-90
    • Wu, Z.1    Das, S.2    Li, Z.3    Mitra, P.4    Giles, C.L.5
  • 35
    • 84889563700
    • Measuring term informativeness in context
    • Z. Wu and C. L. Giles. "Measuring term informativeness in context," in Proceedings of NAACL-HLT 2013, 2013, pp. 259-269.
    • (2013) Proceedings of NAACL-HLT 2013 , pp. 259-269
    • Wu, Z.1    Giles, C.L.2
  • 37
    • 84889592982
    • Can back-of-the-book indexes be automatically created?
    • Z. Wu, Z. Li, P. Mitra, and C. L. Giles. "Can back-of-the-book indexes be automatically created?" in Proceedings of CIKM, 2013, pp. 1745-1750.
    • (2013) Proceedings of CIKM , pp. 1745-1750
    • Wu, Z.1    Li, Z.2    Mitra, P.3    Giles, C.L.4
  • 38
    • 84889577640
    • Table of contents recognition and extraction for heterogeneous book documents
    • Z. Wu, P. Mitra, and C. L. Giles. "Table of contents recognition and extraction for heterogeneous book documents," in Proceedings of ICDAR, 2013, pp. 1205-1209.
    • (2013) Proceedings of ICDAR , pp. 1205-1209
    • Wu, Z.1    Mitra, P.2    Giles, C.L.3
  • 39
    • 60549085552
    • Time series analysis of a web search engine transaction log
    • Y. Zhang, B. J. Jansen, and A. Spink. "Time series analysis of a web search engine transaction log," Inf. Process. Manage., vol. 45, no. 2, pp. 230-245, 2009.
    • (2009) Inf. Process. Manage. , vol.45 , Issue.2 , pp. 230-245
    • Zhang, Y.1    Jansen, B.J.2    Spink, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.