Volume 73, 2018, Pages 50-60

Survey and evaluation of web search engine hit counts as research tools in computational linguistics

Author keywords

Computational linguistics; Hit counts; Information distribution; Semantic similarity; Web search engines

Indexed keywords

COMPUTATIONAL LINGUISTICS; DATA MINING; DIGITAL STORAGE; INFORMATION RETRIEVAL; LINGUISTICS; NATURAL LANGUAGE PROCESSING SYSTEMS; QUERY LANGUAGES; SEMANTICS; SURVEYS; WEBSITES

EID: 85040055325     PISSN: 03064379     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.is.2017.12.007     Document Type: Article
Times cited: 22

References (62)
  • 1
    • 0345376175
    • The web as a parallel corpus
    • Resnik, P., Smith, N., The web as a parallel corpus. Comput. Ling. 29 (2003), 349–380.
    • (2003) Comput. Ling. , vol.29 , pp. 349-380
    • Resnik, P.1    Smith, N.2
  • 2
    • 84944081118
    • A tool for producing structured interoperable data from product features on the web
    • Özacar, T., A tool for producing structured interoperable data from product features on the web. Inf. Syst. 56 (2016), 36–54.
    • (2016) Inf. Syst. , vol.56 , pp. 36-54
    • Özacar, T.1
  • 3
    • 84855651895
    • Quality-aware similarity assessment for entity matching in web data
    • Yerva, S.R., Miklós, Z., Aberer, K., Quality-aware similarity assessment for entity matching in web data. Inf. Syst. 37 (2012), 336–351.
    • (2012) Inf. Syst. , vol.37 , pp. 336-351
    • Yerva, S.R.1    Miklós, Z.2    Aberer, K.3
  • 7
    • 85010217018
    • Web-based models for natural language processing
    • Lapata, M., Keller, F., Web-based models for natural language processing. ACM Trans. Speech Lang. Process. 2 (2005), 1–31.
    • (2005) ACM Trans. Speech Lang. Process. , vol.2 , pp. 1-31
    • Lapata, M.1    Keller, F.2
  • 8
    • 38349143926
    • Extracting accurate and complete results from search engines: case study Windows Live
    • Thelwall, M., Extracting accurate and complete results from search engines: case study Windows Live. J. Am. Soc. Inf. Sci. Technol. 59 (2007), 38–50.
    • (2007) J. Am. Soc. Inf. Sci. Technol. , vol.59 , pp. 38-50
    • Thelwall, M.1
  • 11
    • 0344154400
    • Using the web to obtain frequencies for unseen bigrams
    • Keller, F., Lapata, M., Using the web to obtain frequencies for unseen bigrams. Comput. Ling. 29 (2003), 459–484.
    • (2003) Comput. Ling. , vol.29 , pp. 459-484
    • Keller, F.1    Lapata, M.2
  • 14
    • 79956128897
    • Automatic extraction of acronym definitions from the web
    • Sánchez, D., Isern, D., Automatic extraction of acronym definitions from the web. Appl. Intell. 34 (2011), 311–327.
    • (2011) Appl. Intell. , vol.34 , pp. 311-327
    • Sánchez, D.1    Isern, D.2
  • 15
    • 38649107088
    • Learning non-taxonomic relationships from web documents for domain ontology construction
    • Sánchez, D., Moreno, A., Learning non-taxonomic relationships from web documents for domain ontology construction. Data Knowl. Eng. 63 (2008), 600–623.
    • (2008) Data Knowl. Eng. , vol.63 , pp. 600-623
    • Sánchez, D.1    Moreno, A.2
  • 16
    • 40549141816
    • Pattern-based automatic taxonomy learning from the web
    • Sánchez, D., Moreno, A., Pattern-based automatic taxonomy learning from the web. AI Commun. 21 (2008), 27–48.
    • (2008) AI Commun. , vol.21 , pp. 27-48
    • Sánchez, D.1    Moreno, A.2
  • 17
    • 77951142171
    • A methodology to learn ontological attributes from the web
    • Sánchez, D., A methodology to learn ontological attributes from the web. Data Knowl. Eng. 69 (2010), 573–597.
    • (2010) Data Knowl. Eng. , vol.69 , pp. 573-597
    • Sánchez, D.1
  • 18
    • 84855907139
    • Learning relation axioms from text: an automatic web-based approach
    • Sánchez, D., Moreno, A., Vasto-Terrientes, L.D., Learning relation axioms from text: an automatic web-based approach. Expert Syst. Appl. 39 (2012), 5792–5805.
    • (2012) Expert Syst. Appl. , vol.39 , pp. 5792-5805
    • Sánchez, D.1    Moreno, A.2    Vasto-Terrientes, L.D.3
  • 21
    • 79955999805
    • Content annotation for the semantic web: an automatic web-based approach
    • Sánchez, D., Isern, D., Millán, M., Content annotation for the semantic web: an automatic web-based approach. Knowl. Inf. Syst. 27 (2011), 393–418.
    • (2011) Knowl. Inf. Syst. , vol.27 , pp. 393-418
    • Sánchez, D.1    Isern, D.2    Millán, M.3
  • 22
    • 84867842712
    • Preventing automatic user profiling in web 2.0 applications
    • Viejo, A., Sánchez, D., Castellà-Roca, J., Preventing automatic user profiling in web 2.0 applications. Knowl.-Based Syst. 36 (2012), 191–205.
    • (2012) Knowl.-Based Syst. , vol.36 , pp. 191-205
    • Viejo, A.1    Sánchez, D.2    Castellà-Roca, J.3
  • 24
    • 84975078571
    • C-sanitized: a privacy model for document redaction and sanitization
    • Sánchez, D., Batet, M., C-sanitized: a privacy model for document redaction and sanitization. J. Assoc. Inf. Sci. Technol. 67 (2016), 148–163.
    • (2016) J. Assoc. Inf. Sci. Technol. , vol.67 , pp. 148-163
    • Sánchez, D.1    Batet, M.2
  • 27
    • 84946074841
    • Evaluating the retrieval effectiveness of web search engines using a representative query sample
    • Lewandowski, D., Evaluating the retrieval effectiveness of web search engines using a representative query sample. J. Assoc. Inf. Sci. Technol. 66 (2015), 1763–1775.
    • (2015) J. Assoc. Inf. Sci. Technol. , vol.66 , pp. 1763-1775
    • Lewandowski, D.1
  • 28
    • 34548474030
    • Evaluation of web search for the information practitioner
    • Macfarlane, A., Evaluation of web search for the information practitioner. Aslib Proc. 59 (2007), 352–366.
    • (2007) Aslib Proc. , vol.59 , pp. 352-366
    • Macfarlane, A.1
  • 29
    • 79551499272
    • Performance evaluation and comparison of the five most used search engines in retrieving web resources
    • Deka, S.K., Lahkar, N., Performance evaluation and comparison of the five most used search engines in retrieving web resources. Online Inf. Rev. 34 (2010), 757–771.
    • (2010) Online Inf. Rev. , vol.34 , pp. 757-771
    • Deka, S.K.1    Lahkar, N.2
  • 30
    • 84865209921
    • Ranking, relevance judgment, and precision of information retrieval on children's queries: evaluation of Google, Yahoo!, Bing, Yahoo! Kids, and Ask Kids
    • Bilal, D., Ranking, relevance judgment, and precision of information retrieval on children's queries: evaluation of Google, Yahoo!, Bing, Yahoo! Kids, and Ask Kids. J. Am. Soc. Inf. Sci. Technol. 63 (2012), 1879–1896.
    • (2012) J. Am. Soc. Inf. Sci. Technol. , vol.63 , pp. 1879-1896
    • Bilal, D.1
  • 31
    • 77956190113
    • Search engines’ responses to several search feature selections
    • Zhang, J., Fei, W., Search engines’ responses to several search feature selections. Int. Inf. Library Rev. 42 (2010), 212–225.
    • (2010) Int. Inf. Library Rev. , vol.42 , pp. 212-225
    • Zhang, J.1    Fei, W.2
  • 32
    • 82255166998
    • A method to assess search engine results
    • Bar-Ilan, J., Levene, M., A method to assess search engine results. Online Inf. Rev. 35 (2011), 854–868.
    • (2011) Online Inf. Rev. , vol.35 , pp. 854-868
    • Bar-Ilan, J.1    Levene, M.2
  • 33
    • 0142030258
    • A taxonomy of web search
    • Broder, A., A taxonomy of web search. ACM SIGIR Forum 36 (2002), 3–10.
    • (2002) ACM SIGIR Forum , vol.36 , pp. 3-10
    • Broder, A.1
  • 34
    • 51049099404
    • Quantitative comparisons of search engine results
    • Thelwall, M., Quantitative comparisons of search engine results. J. Am. Soc. Inf. Sci. Technol. 59 (2008), 1702–1710.
    • (2008) J. Am. Soc. Inf. Sci. Technol. , vol.59 , pp. 1702-1710
    • Thelwall, M.1
  • 35
    • 68249153264
    • Investigation of the accuracy of search engine hit counts
    • Uyar, A., Investigation of the accuracy of search engine hit counts. J. Inf. Sci. 35 (2009), 469–480.
    • (2009) J. Inf. Sci. , vol.35 , pp. 469-480
    • Uyar, A.1
  • 37
    • 78649832936
    • Reliability verification of search engines’ hit counts: how to select a reliable hit count for a query
    • Springer
    • Funahashi, T., Yamana, H., Reliability verification of search engines’ hit counts: how to select a reliable hit count for a query. Current Trends in Web Engineering, 2010, Springer, 114–125.
    • (2010) Current Trends in Web Engineering , pp. 114-125
    • Funahashi, T.1    Yamana, H.2
  • 39
    • 80054080083
    • A prediction model for web search hit counts using word frequencies
    • Tian, T., Chun, S.A., Geller, J., A prediction model for web search hit counts using word frequencies. J. Inf. Sci. 37 (2011), 462–475.
    • (2011) J. Inf. Sci. , vol.37 , pp. 462-475
    • Tian, T.1    Chun, S.A.2    Geller, J.3
  • 41
    • 85040046501
    • Netmarketshare. Desktop Search Engine Market Share. March 2017. Available at https://www.netmarketshare.com/search-engine-market-share.aspx?qprid=4&qpcustomd=0.
  • 42
    • 34047135006
    • Googleology is bad science
    • Kilgarriff, A., Googleology is bad science. Comput. Ling. 33 (2007), 147–151.
    • (2007) Comput. Ling. , vol.33 , pp. 147-151
    • Kilgarriff, A.1
  • 43
    • 84863873032
    • Contextual correlates of synonymy
    • Rubenstein, H., Goodenough, J., Contextual correlates of synonymy. Commun. ACM 8 (1965), 627–633.
    • (1965) Commun. ACM , vol.8 , pp. 627-633
    • Rubenstein, H.1    Goodenough, J.2
  • 44
    • 0033408730
    • Can search engines be used as tools for web-link analysis? A critical view
    • Snyder, H., Rosenbaum, H., Can search engines be used as tools for web-link analysis? A critical view. J. Doc. 55 (1999), 375–384.
    • (1999) J. Doc. , vol.55 , pp. 375-384
    • Snyder, H.1    Rosenbaum, H.2
  • 45
    • 0035612855
    • Internet search engines - fluctuations in document accessibility
    • Mettrop, W., Nieuwenhuysen, P., Internet search engines - fluctuations in document accessibility. J. Doc. 57 (2001), 623–651.
    • (2001) J. Doc. , vol.57 , pp. 623-651
    • Mettrop, W.1    Nieuwenhuysen, P.2
  • 47
    • 85040045472
    • A difference of a factor of 70,000 between hit counts and results returned in Google
    • Unpublished technical note
    • Davis, E., A difference of a factor of 70,000 between hit counts and results returned in Google. Unpublished technical note, 2015.
    • (2015)
    • Davis, E.1
  • 50
    • 33750701866
    • Methods for evaluating dynamic changes in search engine rankings: a case study
    • Bar-Ilan, J., Levene, M., Mat-Hassan, M., Methods for evaluating dynamic changes in search engine rankings: a case study. J. Doc. 62 (2006), 708–729.
    • (2006) J. Doc. , vol.62 , pp. 708-729
    • Bar-Ilan, J.1    Levene, M.2    Mat-Hassan, M.3
  • 52
    • 85015963191
    • HESML: a scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset
    • Lastra-Díaz, J.J., García-Serrano, A., Batet, M., Fernández, M., Chirigati, F., HESML: a scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. Inf. Syst. 66 (2017), 97–118.
    • (2017) Inf. Syst. , vol.66 , pp. 97-118
    • Lastra-Díaz, J.1    García-Serrano, A.2    Batet, M.3    Fernández, M.4    Chirigati, F.5
  • 53
    • 80955140428
    • Ontology based semantic clustering
    • Batet, M., Ontology based semantic clustering. AI Commun. 24 (2011), 291–292.
    • (2011) AI Commun. , vol.24 , pp. 291-292
    • Batet, M.1
  • 54
    • 84888198960
    • Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text
    • McInnes, B.T., Pedersen, T., Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text. J. Biomed. Inf. 46 (2013), 1116–1124.
    • (2013) J. Biomed. Inf. , vol.46 , pp. 1116-1124
    • McInnes, B.T.1    Pedersen, T.2
  • 55
    • 84875634268
    • A semantic framework to protect the privacy of electronic health records with non-numerical attributes
    • Martínez, S., Sánchez, D., Valls, A., A semantic framework to protect the privacy of electronic health records with non-numerical attributes. J. Biomed. Inf. 46 (2013), 294–303.
    • (2013) J. Biomed. Inf. , vol.46 , pp. 294-303
    • Martínez, S.1    Sánchez, D.2    Valls, A.3
  • 56
    • 84897535843
    • The distributional hypothesis
    • Sahlgren, M., The distributional hypothesis. Rivista di Linguistica 20 (2008), 33–53.
    • (2008) Rivista di Linguistica , vol.20 , pp. 33-53
    • Sahlgren, M.1
  • 58
    • 32344447157
    • Distributional measures of semantic distance: a survey
    • Mohammad, S., Hirst, G., Distributional measures of semantic distance: a survey. http://arxiv.org/abs/1203.1858, 2006.
    • (2006)
    • Mohammad, S.1    Hirst, G.2
  • 60
    • 84957638868
    • Estimating search engine index size variability: a 9-year longitudinal study
    • van den Bosch, A., Bogers, T., de Kunder, M., Estimating search engine index size variability: a 9-year longitudinal study. Scientometrics 107 (2016), 839–856.
    • (2016) Scientometrics , vol.107 , pp. 839-856
    • van den Bosch, A.1    Bogers, T.2    de Kunder, M.3
  • 61
    • 34248172904
    • Measures of semantic similarity and relatedness in the biomedical domain
    • Pedersen, T., Pakhomov, S., Patwardhan, S., Chute, C., Measures of semantic similarity and relatedness in the biomedical domain. J. Biomed. Inf. 40 (2007), 288–299.
    • (2007) J. Biomed. Inf. , vol.40 , pp. 288-299
    • Pedersen, T.1    Pakhomov, S.2    Patwardhan, S.3    Chute, C.4
  • 62
    • 85017479160
    • The pitfalls of using Google ngram to study language
    • Zhang, S., The pitfalls of using Google ngram to study language. in: Wired, 2015.
    • (2015) in: Wired
    • Zhang, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.