Volume 29, Issue 6, 2005, Pages 867-918

How should a speech recognizer work?

Author keywords

Automatic speech recognition; Computational modeling; Human speech recognition; Spoken word recognition


EID: 33645107709     PISSN: 03640213     EISSN: None     Source Type: Journal    
DOI: 10.1207/s15516709cog0000_37     Document Type: Article
Times cited: 63

References (88)
  • 1
    • 0001183904
    • Tracking the time course of spoken-word recognition using eye movements: Evidence for continuous mapping models
    • Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken-word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38, 419-439.
    • (1998) Journal of Memory and Language , vol.38 , pp. 419-439
    • Allopenna, P.D.1    Magnuson, J.S.2    Tanenhaus, M.K.3
  • 3
    • 0028502529
    • The effect of subphonetic differences on lexical access
    • Andruski, J. E., Blumstein, S. E., & Burton, M. (1994). The effect of subphonetic differences on lexical access. Cognition, 52, 163-187.
    • (1994) Cognition , vol.52 , pp. 163-187
    • Andruski, J.E.1    Blumstein, S.E.2    Burton, M.3
  • 4
    • 0023676437
    • The recognition of words after their acoustic offsets in spontaneous speech: Effects of subsequent context
    • Bard, E. G., Shillcock, R. C., & Altmann, G. T. M. (1988). The recognition of words after their acoustic offsets in spontaneous speech: Effects of subsequent context. Perception and Psychophysics, 44, 395-408.
    • (1988) Perception and Psychophysics , vol.44 , pp. 395-408
    • Bard, E.G.1    Shillcock, R.C.2    Altmann, G.T.M.3
  • 7
    • 0023297787
    • Phonological parsing and lexical retrieval
    • Church, K. (1987). Phonological parsing and lexical retrieval. Cognition, 25, 53-69.
    • (1987) Cognition , vol.25 , pp. 53-69
    • Church, K.1
  • 9
    • 0347861619
    • Phonetic transcriptions in the Spoken Dutch Corpus: How to combine efficiency and good transcription quality
    • Aalborg, Denmark: Kommunik Grafiske Løsninger A/S
    • Cucchiarini, C., Binnenpoorte, D., & Goddijn, S. M. A. (2001). Phonetic transcriptions in the Spoken Dutch Corpus: How to combine efficiency and good transcription quality. In Proceedings of Eurospeech (pp. 1679-1682). Aalborg, Denmark: Kommunik Grafiske Løsninger A/S.
    • (2001) Proceedings of Eurospeech , pp. 1679-1682
    • Cucchiarini, C.1    Binnenpoorte, D.2    Goddijn, S.M.A.3
  • 10
    • 0036581319
    • Universality versus language-specificity in listening to running speech
    • Cutler, A., Demuth, K., & McQueen, J. M. (2002). Universality versus language-specificity in listening to running speech. Psychological Science, 13, 258-262.
    • (2002) Psychological Science , vol.13 , pp. 258-262
    • Cutler, A.1    Demuth, K.2    McQueen, J.M.3
  • 13
    • 84941157887
    • FlaVoR: A flexible architecture for LVCSR
    • Rundle Mall, Australia: Casual Productions
    • Demuynck, K., Laureys, T., Van Compernolle, D., & Van hamme, H. (2003). FlaVoR: A flexible architecture for LVCSR. In Proceedings of Eurospeech (pp. 1973-1976). Rundle Mall, Australia: Casual Productions.
    • (2003) Proceedings of Eurospeech , pp. 1973-1976
    • Demuynck, K.1    Laureys, T.2    Van Compernolle, D.3    Van hamme, H.4
  • 14
    • 0001727387
    • Exploiting lawful variability in the speech wave
    • J. S. Perkell & D. H. Klatt (Eds.), Hillsdale, NJ: Lawrence Erlbaum Associates, Inc
    • Elman, J. L., & McClelland, J. L. (1986). Exploiting lawful variability in the speech wave. In J. S. Perkell & D. H. Klatt (Eds.), Invariance and variability of speech processes (pp. 360-380). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
    • (1986) Invariance and Variability of Speech Processes , pp. 360-380
    • Elman, J.L.1    McClelland, J.L.2
  • 15
    • 0004061086
    • An overview of speaker recognition technology
    • C.-H. Lee, F. K. Soong, & K. K. Paliwal (Eds.), Boston: Kluwer Academic
    • Furui, S. (1996). An overview of speaker recognition technology. In C.-H. Lee, F. K. Soong, & K. K. Paliwal (Eds.), Automatic speech and speaker recognition (pp. 31-56). Boston: Kluwer Academic.
    • (1996) Automatic Speech and Speaker Recognition , pp. 31-56
    • Furui, S.1
  • 16
    • 0042159903
    • Lexical competition and the acquisition of novel words
    • Gaskell, M. G., & Dumay, N. (2003). Lexical competition and the acquisition of novel words. Cognition, 89, 105-132.
    • (2003) Cognition , vol.89 , pp. 105-132
    • Gaskell, M.G.1    Dumay, N.2
  • 17
    • 0039058002
    • Integrating form and meaning: A distributed model of speech perception
    • Gaskell, M. G., & Marslen-Wilson, W. D. (1997). Integrating form and meaning: A distributed model of speech perception. Language and Cognitive Processes, 12, 613-656.
    • (1997) Language and Cognitive Processes , vol.12 , pp. 613-656
    • Gaskell, M.G.1    Marslen-Wilson, W.D.2
  • 18
    • 0032041690
    • Echoes of echoes? An episodic theory of lexical access
    • Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105, 251-279.
    • (1998) Psychological Review , vol.105 , pp. 251-279
    • Goldinger, S.D.1
  • 21
    • 0009625231
    • A comparison of novel techniques for rapid speaker adaptation
    • Hazen, T. J. (2000). A comparison of novel techniques for rapid speaker adaptation. Speech Communication, 31, 15-33.
    • (2000) Speech Communication , vol.31 , pp. 15-33
    • Hazen, T.J.1
  • 22
    • 0343408338
    • "Schema abstraction" in a multiple-trace memory model
    • Hintzman, D. L. (1986). "Schema abstraction" in a multiple-trace memory model. Psychological Review, 93, 411-428.
    • (1986) Psychological Review , vol.93 , pp. 411-428
    • Hintzman, D.L.1
  • 25
    • 0024880831
    • Multilayer feedforward networks are universal approximators
    • Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359-366.
    • (1989) Neural Networks , vol.2 , pp. 359-366
    • Hornik, K.1    Stinchcombe, M.2    White, H.3
  • 27
    • 0000763574
    • Spoken language processing
    • (Eds.)
    • Juang, B. H., & Furui, S. (Eds.). (2000). Spoken language processing [Special issue]. Proceedings of the IEEE, 88(8).
    • (2000) Proceedings of the IEEE , vol.88 , Issue.8 SPEC. ISSUE
    • Juang, B.H.1    Furui, S.2
  • 28
    • 0030117215
    • A probabilistic model of lexical and syntactic access and disambiguation
    • Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science, 20, 137-194.
    • (1996) Cognitive Science , vol.20 , pp. 137-194
    • Jurafsky, D.1
  • 29
    • 0001756175
    • Speech perception: A model of acoustic-phonetic analysis and lexical access
    • Klatt, D. H. (1979). Speech perception: A model of acoustic-phonetic analysis and lexical access. Journal of Phonetics, 7, 279-312.
    • (1979) Journal of Phonetics , vol.7 , pp. 279-312
    • Klatt, D.H.1
  • 30
    • 0002492158
    • Review of selected models of speech perception
    • W. D. Marslen-Wilson (Ed.), Cambridge, MA: MIT Press
    • Klatt, D. H. (1989). Review of selected models of speech perception. In W. D. Marslen-Wilson (Ed.), Lexical representation and process (pp. 169-226). Cambridge, MA: MIT Press.
    • (1989) Lexical Representation and Process , pp. 169-226
    • Klatt, D.H.1
  • 32
    • 0022678563
    • A computational analysis of uniqueness points in auditory word recognition
    • Luce, P. A. (1986). A computational analysis of uniqueness points in auditory word recognition. Perception and Psychophysics, 39, 155-158.
    • (1986) Perception and Psychophysics , vol.39 , pp. 155-158
    • Luce, P.A.1
  • 34
    • 0031914738
    • Recognizing spoken words: The neighborhood activation model
    • Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19, 1-36.
    • (1998) Ear and Hearing , vol.19 , pp. 1-36
    • Luce, P.A.1    Pisoni, D.B.2
  • 35
    • 84936526625
    • Cambridge, England: Cambridge University Press
    • Maddieson, I. (1984). Patterns of sounds. Cambridge, England: Cambridge University Press.
    • (1984) Patterns of Sounds
    • Maddieson, I.1
  • 36
    • 0037346973
    • Lexical effects on compensation for coarticulation: The ghost of Christmash past
    • Magnuson, J. S., McMurray, B., Tanenhaus, M. K., & Aslin, R. N. (2003). Lexical effects on compensation for coarticulation: The ghost of Christmash past. Cognitive Science, 27, 285-298.
    • (2003) Cognitive Science , vol.27 , pp. 285-298
    • Magnuson, J.S.1    McMurray, B.2    Tanenhaus, M.K.3    Aslin, R.N.4
  • 37
  • 39
    • 0028527631
    • Levels of perceptual representation and process in lexical access: Words, phonemes, and features
    • Marslen-Wilson, W., & Warren, P. (1994). Levels of perceptual representation and process in lexical access: Words, phonemes, and features. Psychological Review, 101, 653-675.
    • (1994) Psychological Review , vol.101 , pp. 653-675
    • Marslen-Wilson, W.1    Warren, P.2
  • 40
    • 0023296876
    • Functional parallelism in spoken word recognition
    • Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word recognition. Cognition, 25, 71-102.
    • (1987) Cognition , vol.25 , pp. 71-102
    • Marslen-Wilson, W.D.1
  • 42
    • 0001968167
    • Processing interactions and lexical access during word-recognition in continuous speech
    • Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access during word-recognition in continuous speech. Cognitive Psychology, 10, 29-63.
    • (1978) Cognitive Psychology , vol.10 , pp. 29-63
    • Marslen-Wilson, W.D.1    Welsh, A.2
  • 44
    • 0003025330
    • Segmentation of continuous speech using phonotactics
    • McQueen, J. M. (1998). Segmentation of continuous speech using phonotactics. Journal of Memory and Language, 39, 21-46.
    • (1998) Journal of Memory and Language , vol.39 , pp. 21-46
    • McQueen, J.M.1
  • 45
    • 0141756068
    • The ghost of Christmas future: Didn't Scrooge learn to be good? Commentary on Magnuson, McMurray, Tanenhaus and Aslin (2003)
    • McQueen, J. M. (2003). The ghost of Christmas future: Didn't Scrooge learn to be good? Commentary on Magnuson, McMurray, Tanenhaus and Aslin (2003). Cognitive Science, 27, 795-799.
    • (2003) Cognitive Science , vol.27 , pp. 795-799
    • McQueen, J.M.1
  • 46
    • 84950998145
    • Speech perception
    • K. Lamberts & R. Goldstone (Eds.), London: Sage
    • McQueen, J. M. (2005). Speech perception. In K. Lamberts & R. Goldstone (Eds.), The handbook of cognition (pp. 255-275). London: Sage.
    • (2005) The Handbook of Cognition , pp. 255-275
    • McQueen, J.M.1
  • 47
  • 52
    • 0025740746
    • Virtual pitch and phase sensitivity of a computer model of the auditory periphery: I. Pitch identification
    • Meddis, R., & Hewitt, M. J. (1991). Virtual pitch and phase sensitivity of a computer model of the auditory periphery: I. Pitch identification. Journal of the Acoustical Society of America, 89, 2866-2882.
    • (1991) Journal of the Acoustical Society of America , vol.89 , pp. 2866-2882
    • Meddis, R.1    Hewitt, M.J.2
  • 53
    • 33645102438
    • Constraints on theories of human vs. machine recognition of speech
    • R. Smits, J. Kingston, T. M. Nearey, & R. Zondervan (Eds.), Nijmegen, The Netherlands: MPI for Psycholinguistics
    • Moore, R. K., & Cutler, A. (2001). Constraints on theories of human vs. machine recognition of speech. In R. Smits, J. Kingston, T. M. Nearey, & R. Zondervan (Eds.), Proceedings of the Workshop on Speech Recognition as Pattern Classification (pp. 145-150). Nijmegen, The Netherlands: MPI for Psycholinguistics.
    • (2001) Proceedings of the Workshop on Speech Recognition As Pattern Classification , pp. 145-150
    • Moore, R.K.1    Cutler, A.2
  • 54
    • 0004698012
    • Dynamic programming search: From digit strings to large vocabulary word graphs
    • C.-H. Lee, F. K. Soong, & K. K. Paliwal (Eds.), Boston: Kluwer Academic
    • Ney, H., & Aubert, X. (1996). Dynamic programming search: From digit strings to large vocabulary word graphs. In C.-H. Lee, F. K. Soong, & K. K. Paliwal (Eds.), Automatic speech and speaker recognition (pp. 385-413). Boston: Kluwer Academic.
    • (1996) Automatic Speech and Speaker Recognition , pp. 385-413
    • Ney, H.1    Aubert, X.2
  • 55
    • 0020014757
    • Autonomous processes in comprehension: A reply to Marslen-Wilson and Tyler
    • Norris, D. (1982). Autonomous processes in comprehension: A reply to Marslen-Wilson and Tyler. Cognition, 11, 97-101.
    • (1982) Cognition , vol.11 , pp. 97-101
    • Norris, D.1
  • 56
    • 0022687226
    • Word recognition: Context effects without priming
    • Norris, D. (1986). Word recognition: Context effects without priming. Cognition, 22, 93-136.
    • (1986) Cognition , vol.22 , pp. 93-136
    • Norris, D.1
  • 57
    • 33745205144
    • Shortlist: A connectionist model of continuous speech recognition
    • Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189-234.
    • (1994) Cognition , vol.52 , pp. 189-234
    • Norris, D.1
  • 58
    • 85082898812
    • How do computational models help us develop better theories?
    • A. Cutler (Ed.), Hillsdale, NJ: Lawrence Erlbaum Associates, Inc
    • Norris, D. (2005). How do computational models help us develop better theories? In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 331-346). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
    • (2005) Twenty-first Century Psycholinguistics: Four Cornerstones , pp. 331-346
    • Norris, D.1
  • 60
    • 0033823717
    • Merging information in speech recognition: Feedback is never necessary
    • Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-325.
    • (2000) Behavioral and Brain Sciences , vol.23 , pp. 299-325
    • Norris, D.1    McQueen, J.M.2    Cutler, A.3
  • 62
    • 0031302986
    • The possible-word constraint in the segmentation of continuous speech
    • Norris, D., McQueen, J. M., Cutler, A., & Butterfield, S. (1997). The possible-word constraint in the segmentation of continuous speech. Cognitive Psychology, 34, 191-243.
    • (1997) Cognitive Psychology , vol.34 , pp. 191-243
    • Norris, D.1    McQueen, J.M.2    Cutler, A.3    Butterfield, S.4
  • 64
    • 0029132067
    • Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform
    • Patterson, R. D., Allerhand, M., & Giguere, C. (1995). Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform. Journal of the Acoustical Society of America, 98, 1890-1894.
    • (1995) Journal of the Acoustical Society of America , vol.98 , pp. 1890-1894
    • Patterson, R.D.1    Allerhand, M.2    Giguere, C.3
  • 67
    • 0000441221
    • Is compensation for coarticulation mediated by the lexicon?
    • Pitt, M. A., & McQueen, J. M. (1998). Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39, 347-370.
    • (1998) Journal of Memory and Language , vol.39 , pp. 347-370
    • Pitt, M.A.1    McQueen, J.M.2
  • 69
    • 0242333915
    • The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension
    • Salverda, A. P., Dahan, D., & McQueen, J. M. (2003). The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension. Cognition, 90, 51-89.
    • (2003) Cognition , vol.90 , pp. 51-89
    • Salverda, A.P.1    Dahan, D.2    McQueen, J.M.3
  • 70
    • 0035407588
    • Knowing a word affects the fundamental perception of the sounds within it
    • Samuel, A. G. (2001). Knowing a word affects the fundamental perception of the sounds within it. Psychological Science, 12, 348-351.
    • (2001) Psychological Science , vol.12 , pp. 348-351
    • Samuel, A.G.1
  • 71
    • 0037295147
    • Lexical activation (and other factors) can mediate compensation for coarticulation
    • Samuel, A. G., & Pitt, M. A. (2003). Lexical activation (and other factors) can mediate compensation for coarticulation. Journal of Memory and Language, 48, 416-434.
    • (2003) Journal of Memory and Language , vol.48 , pp. 416-434
    • Samuel, A.G.1    Pitt, M.A.2
  • 73
    • 85009209106
    • Modelling human speech recognition using automatic speech recognition paradigms in SpeM
    • Rundle Mall, Australia: Casual Productions
    • Scharenborg, O., McQueen, J. M., ten Bosch, L., & Norris, D. (2003). Modelling human speech recognition using automatic speech recognition paradigms in SpeM. In Proceedings of Eurospeech (pp. 2097-2100). Rundle Mall, Australia: Casual Productions.
    • (2003) Proceedings of Eurospeech , pp. 2097-2100
    • Scharenborg, O.1    McQueen, J.M.2    ten Bosch, L.3    Norris, D.4
  • 75
    • 33645099366
    • Recognising "real-life" speech with SpeM: A speech-based computational model of human speech recognition
    • Rundle Mall, Australia: Casual Productions
    • Scharenborg, O., ten Bosch, L., & Boves, L. (2003b). Recognising "real-life" speech with SpeM: A speech-based computational model of human speech recognition. In Proceedings of Eurospeech (pp. 2285-2288). Rundle Mall, Australia: Casual Productions.
    • (2003) Proceedings of Eurospeech , pp. 2285-2288
    • Scharenborg, O.1    ten Bosch, L.2    Boves, L.3
  • 76
    • 0346217031
    • Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition
    • Scharenborg, O., ten Bosch, L., Boves, L., & Norris, D. (2003). Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition. Journal of the Acoustical Society of America, 114(6), 3032-3035.
    • (2003) Journal of the Acoustical Society of America , vol.114 , Issue.6 , pp. 3032-3035
    • Scharenborg, O.1    ten Bosch, L.2    Boves, L.3    Norris, D.4
  • 78
    • 0036219864
    • Toward a model for lexical access based on acoustic landmarks and distinctive features
    • Stevens, K. N. (2002). Toward a model for lexical access based on acoustic landmarks and distinctive features. Journal of the Acoustical Society of America, 111, 1872-1891.
    • (2002) Journal of the Acoustical Society of America , vol.111 , pp. 1872-1891
    • Stevens, K.N.1
  • 79
    • 85009084222
    • Impact of speaking style and speaking task on acoustic models
    • Beijing, China: China Military Friendship Publish
    • Sturm, J., Kamperman, H., Boves, L., & den Os, E. (2000). Impact of speaking style and speaking task on acoustic models. In Proceedings of ICSLP (pp. 361-364). Beijing, China: China Military Friendship Publish.
    • (2000) Proceedings of ICSLP , pp. 361-364
    • Sturm, J.1    Kamperman, H.2    Boves, L.3    den Os, E.4
  • 82
    • 0000362570
    • When words compete: Levels of processing in spoken word recognition
    • Vitevitch, M. S., & Luce, P. A. (1998). When words compete: Levels of processing in spoken word recognition. Psychological Science, 9, 325-329.
    • (1998) Psychological Science , vol.9 , pp. 325-329
    • Vitevitch, M.S.1    Luce, P.A.2
  • 83
    • 0001074661
    • Probabilistic phonotactics and neighborhood activation in spoken word recognition
    • Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40, 374-408.
    • (1999) Journal of Memory and Language , vol.40 , pp. 374-408
    • Vitevitch, M.S.1    Luce, P.A.2
  • 88
    • 0024689279
    • The locus of the effects of sentential-semantic context in spoken-word processing
    • Zwitserlood, P. (1989). The locus of the effects of sentential-semantic context in spoken-word processing. Cognition, 32, 25-64.
    • (1989) Cognition , vol.32 , pp. 25-64
    • Zwitserlood, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.