Volume , Issue , 2007, Pages 69-76

Computational auditory scene analysis and its application to robot audition: Five years experience

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; HUMAN COMPUTER INTERACTION; MICROPHONES; ROBOTICS; SPEECH ANALYSIS; SPEECH RECOGNITION

EID: 34548809160     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICKS.2007.7     Document Type: Conference Paper
Times cited : (17)

References (42)
  • 1
    • 80052339383
    • Some experiments on the recognition of speech, with one and with two ears
    • E. C. Cherry, "Some experiments on the recognition of speech, with one and with two ears," J. of ASA, vol. 25, 975-979, 1953.
    • (1953) J. of ASA , vol.25 , pp. 975-979
    • Cherry, E.C.1
  • 2
    • 0003444613
    • Computational Auditory Scene Analysis
    • D. Rosenthal and H. G. Okuno, eds., Computational Auditory Scene Analysis. Mahwah, New Jersey: Lawrence Erlbaum Associates, 1998.
    • (1998) Computational Auditory Scene Analysis
  • 3
    • 33846170539
    • Enhanced robot speech recognition based on microphone array source separation and missing feature theory
    • S. Yamamoto, J.-M. Valin, K. Nakadai, T. Ogata, and H. G. Okuno, "Enhanced robot speech recognition based on microphone array source separation and missing feature theory," Proc. of IEEE ICRA 2005, 1489-1494.
    • (2005) Proc. of IEEE ICRA , pp. 1489-1494
    • Yamamoto, S.1    Valin, J.-M.2    Nakadai, K.3    Ogata, T.4    Okuno, H.G.5
  • 4
    • 33749539191
    • Recognition of simultaneous speech by estimating reliability of separated signals for robot audition
    • S. Yamamoto, R. Takeda, K. Nakadai, M. Nakano, H. Tsujino, J.-M. Valin, K. Komatani, T. Ogata, and H. G. Okuno, "Recognition of simultaneous speech by estimating reliability of separated signals for robot audition," PRICAI 2006: Trends in Artificial Intelligence, LNCS 4099, 484-494, 2006.
    • (2006) LNCS , vol.4099 , pp. 484-494
    • Yamamoto, S.1    Takeda, R.2    Nakadai, K.3    Nakano, M.4    Tsujino, H.5    Valin, J.-M.6    Komatani, K.7    Ogata, T.8    Okuno, H.G.9
  • 5
    • 34250689497
    • Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears
    • R. Takeda, S. Yamamoto, K. Komatani, T. Ogata, and H. G. Okuno, "Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears," Proc. of IEEE/RSJ IROS 2006, 878-885.
    • (2006) Proc. of IEEE/RSJ IROS , pp. 878-885
    • Takeda, R.1    Yamamoto, S.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 6
    • 33746191291
    • Genetic algorithm-based improvement of robot hearing capabilities in separating and recognizing simultaneous speech signals
    • S. Yamamoto, K. Nakadai, M. Nakano, H. Tsujino, J.-M. Valin, R. Takeda, K. Komatani, T. Ogata, and H. G. Okuno, "Genetic algorithm-based improvement of robot hearing capabilities in separating and recognizing simultaneous speech signals," Advances in Applied Artificial Intelligence, LNAI 4031, 207-217, 2006.
    • (2006) LNAI , vol.4031 , pp. 207-217
    • Yamamoto, S.1    Nakadai, K.2    Nakano, M.3    Tsujino, H.4    Valin, J.-M.5    Takeda, R.6    Komatani, K.7    Ogata, T.8    Okuno, H.G.9
  • 10
    • 34547541093
    • Drum sound recognition for polyphonic audio signals by adaptation and matching of spectral templates with harmonic structure suppression
    • K. Yoshii, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Drum sound recognition for polyphonic audio signals by adaptation and matching of spectral templates with harmonic structure suppression," IEEE Trans. on Audio, Speech and Language Processing, vol. 15, in print, 2007.
  • 11
    • 34548710579
    • Drumix: An audio player with functions of real-time drum-part rearrangement for active music listening
    • K. Yoshii, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Drumix: An audio player with functions of real-time drum-part rearrangement for active music listening," J. of IPSJ, vol. 48, no. 3, accepted, 2007.
    • (2007)
    • Yoshii, K.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 12
    • 34547549019
    • Integration and adaptation of harmonic and inharmonic models for separating polyphonic musical signals
    • K. Itoyama, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Integration and adaptation of harmonic and inharmonic models for separating polyphonic musical signals," Proc. of IEEE ICASSP'2007, accepted.
    • (2007) Proc. of IEEE ICASSP
    • Itoyama, K.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 13
    • 33846220762
    • Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps
    • T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps," EURASIP J on Advances in Signal Processing, vol. 2007, Article ID 51979, p. 15, 2007.
    • (2007) EURASIP J on Advances in Signal Processing , vol.2007 , pp. 15
    • Kitahara, T.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 14
    • 34548737949
    • Instrogram: Probabilistic representation of instrument existence for polyphonic music
    • T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Instrogram: Probabilistic representation of instrument existence for polyphonic music," J of IPSJ, vol. 48, 2007, accepted.
    • (2007) J of IPSJ , vol.48
    • Kitahara, T.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 15
    • 33947678880
    • F0 estimation method for singing voice in polyphonic audio signal based on statistical vocal model and Viterbi search
    • K. Fujihara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "F0 estimation method for singing voice in polyphonic audio signal based on statistical vocal model and Viterbi search," Proc. of IEEE ICASSP'2006, vol. V, 253-256.
    • Proc. of IEEE ICASSP'2006 , vol.5 , pp. 253-256
    • Fujihara, K.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 16
    • 34547508425
    • Automatic synchronization between lyrics and music CD recordings based on Viterbi alignment of segregated vocal signals
    • K. Fujihara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Automatic synchronization between lyrics and music CD recordings based on Viterbi alignment of segregated vocal signals," Proc. of IEEE ISM 2006, 257-264.
    • (2006) Proc. of IEEE ISM , pp. 257-264
    • Fujihara, K.1    Goto, M.2    Komatani, K.3    Ogata, T.4    Okuno, H.G.5
  • 18
    • 0141631749
    • Musical instrument identification based on F0-dependent multivariate normal distribution
    • T. Kitahara, M. Goto, and H. G. Okuno, "Musical instrument identification based on F0-dependent multivariate normal distribution," ICASSP-2003, 421-424.
    • (2003) ICASSP , pp. 421-424
    • Kitahara, T.1    Goto, M.2    Okuno, H.G.3
  • 19
    • 4544297184
    • Category-level identification of non-registered musical instrument sounds
    • T. Kitahara, M. Goto, and H. G. Okuno, "Category-level identification of non-registered musical instrument sounds," Proc. of ICASSP-2004, pp. 253-256.
    • Proc. of ICASSP-2004 , pp. 253-256
    • Kitahara, T.1    Goto, M.2    Okuno, H.G.3
  • 20
    • 4544229825
    • Comparing features for forming music streams in automatic music transcription
    • Y. Sakuraba, T. Kitahara, and H. G. Okuno, "Comparing features for forming music streams in automatic music transcription," Proc. of ICASSP-2004, 273-276.
    • (2004) Proc. of ICASSP , pp. 273-276
    • Sakuraba, Y.1    Kitahara, T.2    Okuno, H.G.3
  • 21
    • 44949088691
    • Dynamic help generation by estimating user's mental model in spoken dialogue systems
    • Y. Fukubayashi, K. Komatani, T. Ogata, and H. G. Okuno, "Dynamic help generation by estimating user's mental model in spoken dialogue systems," Proc. of Interspeech-2006, 1946-1949.
    • Proc. of Interspeech-2006 , pp. 1946-1949
    • Fukubayashi, Y.1    Komatani, K.2    Ogata, T.3    Okuno, H.G.4
  • 23
    • 21444440253
    • User modeling in spoken dialogue systems to generate flexible guidance
    • K. Komatani, S. Ueno, T. Kawahara, and H. G. Okuno, "User modeling in spoken dialogue systems to generate flexible guidance," User Model. User-Adapt. Interact, vol. 15, no. 1-2, 169-183, 2005.
    • (2005) User Model. User-Adapt. Interact , vol.15 , Issue.1-2 , pp. 169-183
    • Komatani, K.1    Ueno, S.2    Kawahara, T.3    Okuno, H.G.4
  • 24
    • 22944489210
    • Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes
    • K. Ishihara, T. Nakatani, T. Ogata, and H. G. Okuno, "Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes," PRICAI2004: Trends in Artificial Intelligence, LNCS 3157, 909-918, 2004.
    • (2004) LNCS , vol.3157 , pp. 909-918
    • Ishihara, K.1    Nakatani, T.2    Ogata, T.3    Okuno, H.G.4
  • 26
    • 50149091391
    • Generation of robot motions from environmental sounds using inter-modality mapping by RNNPB
    • T. Ogata, Y. Hattori, H. Kozima, K. Komatani, and H. G. Okuno, "Generation of robot motions from environmental sounds using inter-modality mapping by RNNPB," Proc. of EpiRobo-2006, 95-102.
    • Proc. of EpiRobo-2006 , pp. 95-102
    • Ogata, T.1    Hattori, Y.2    Kozima, H.3    Komatani, K.4    Okuno, H.G.5
  • 27
    • 26944446929
    • Distance based dynamic interaction of humanoid robot with multiple people
    • T. Tasaki, S. Matsumoto, H. Ohba, M. Toda, K. Komatani, T. Ogata, and H. Okuno, "Distance based dynamic interaction of humanoid robot with multiple people," Innovations in Applied Artificial Intelligence, LNAI 3533, 111-120, 2005.
    • (2005) LNAI , vol.3533 , pp. 111-120
    • Tasaki, T.1    Matsumoto, S.2    Ohba, H.3    Toda, M.4    Komatani, K.5    Ogata, T.6    Okuno, H.7
  • 28
    • 10444249505
    • Computational auditory scene analysis and its application to robot audition
    • H. G. Okuno, T. Ogata, K. Komatani, and K. Nakadai, "Computational auditory scene analysis and its application to robot audition," Proc. of ICKS 2004, 73-80, IEEE.
    • Proc. of ICKS 2004 , pp. 73-80
    • Okuno, H.G.1    Ogata, T.2    Komatani, K.3    Nakadai, K.4
  • 29
    • 14044260635
    • Enhanced robot audition based on microphone array source separation with postfilter
    • J.-M. Valin, J. Rouat, and F. Michaud, "Enhanced robot audition based on microphone array source separation with postfilter," IROS 2004, 2123-2128.
    • IROS 2004 , pp. 2123-2128
    • Valin, J.-M.1    Rouat, J.2    Michaud, F.3
  • 30
    • 85032752225
    • Missing-feature approaches in speech recognition
    • B. Raj and R. M. Stern, "Missing-feature approaches in speech recognition," Signal Processing Magazine, vol. 22, no. 5, 101-116, IEEE, 2005.
    • (2005) Signal Processing Magazine , vol.22 , Issue.5 , pp. 101-116
    • Raj, B.1    Stern, R.M.2
  • 31
    • 85009106519
    • Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
    • J. Barker, M. Cooke, and P. Green, "Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise," Proc. of Eurospeech-2001, 213-216, ESCA, 2001.
    • (2001) Proc. of Eurospeech-2001 , pp. 213-216
    • Barker, J.1    Cooke, M.2    Green, P.3
  • 32
    • 0035342414
    • Robust automatic speech recognition with missing and unreliable acoustic data
    • M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Communication, vol. 34, 267-285, May 2001.
    • (2001) Speech Communication , vol.34 , pp. 267-285
    • Cooke, M.1    Green, P.2    Josifovski, L.3    Vizinho, A.4
  • 35
    • 3042525207
    • Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach
    • J.-M. Valin, F. Michaud, B. Hadjou, and J. Rouat, "Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach," ICRA 2004, 1033-1038.
    • ICRA 2004 , pp. 1033-1038
    • Valin, J.-M.1    Michaud, F.2    Hadjou, B.3    Rouat, J.4
  • 36
    • 0036753896
    • Geometric source separation: Merging convolutive source separation with geometric beamforming
    • L. C. Parra and C. V. Alvino, "Geometric source separation: Merging convolutive source separation with geometric beamforming," IEEE Trans. on Speech and Audio Processing, vol. 10, no. 6, 352-362, 2002.
    • (2002) IEEE Trans. on Speech and Audio Processing , vol.10 , Issue.6 , pp. 352-362
    • Parra, L.C.1    Alvino, C.V.2
  • 37
    • 0141847846
    • Microphone array post-filtering for non-stationary noise suppression
    • I. Cohen and B. Berdugo, "Microphone array post-filtering for non-stationary noise suppression," Proc. of ICASSP-2002, 901-904.
    • (2002) Proc. of ICASSP , pp. 901-904
    • Cohen, I.1    Berdugo, B.2
  • 38
    • 85009144958
    • Free software toolkit for Japanese large vocabulary continuous speech recognition
    • T. Kawahara and A. Lee, "Free software toolkit for Japanese large vocabulary continuous speech recognition," Proc. of ICSLP-2000, 476-479.
    • Proc. of ICSLP-2000 , pp. 476-479
    • Kawahara, T.1    Lee, A.2
  • 39
    • 33947625206
    • Automatic drum sound description for real-world music using template adaptation and matching methods
    • K. Yoshii, M. Goto, and H. G. Okuno, "Automatic drum sound description for real-world music using template adaptation and matching methods," Proc. of ISMIR-2004, 184-191.
    • Proc. of ISMIR-2004 , pp. 184-191
    • Yoshii, K.1    Goto, M.2    Okuno, H.G.3
  • 40
    • 4444224374
    • An audio-based real-time beat tracking system for music with or without drum-sounds
    • M. Goto, "An audio-based real-time beat tracking system for music with or without drum-sounds," J. of New Music Research, vol. 30, no. 2, 159-171, 2001.
    • (2001) J. of New Music Research , vol.30 , Issue.2 , pp. 159-171
    • Goto, M.1
  • 41
    • 23944456976
    • Exploring music collections by browsing different views
    • E. Pampalk, S. Dixon, and G. Widmer, "Exploring music collections by browsing different views," Proc. of ISMIR-2003, 201-208.
    • Proc. of ISMIR-2003 , pp. 201-208
    • Pampalk, E.1    Dixon, S.2    Widmer, G.3
  • 42
    • 0141623871
    • RWC music database: Popular, classical, and jazz music databases
    • M. Goto, T. Hashiguchi, and R. Oka, "RWC music database: Popular, classical, and jazz music databases," Proc. of ISMIR-2002, 287-288.
    • Proc. of ISMIR-2002 , pp. 287-288
    • Goto, M.1    Hashiguchi, T.2    Oka, R.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.