SCOPUS Information Search Platform

Proceedings - Second International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2007

Volume , Issue , 2007, Pages 69-76

Computational auditory scene analysis and its application to robot audition: Five years experience

(3) Okuno, Hiroshi G.ᵃ; Ogata, Tetsuyaᵃ; Komatani, Kazunoriᵃ

a KYOTO UNIVERSITY (Japan)

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; HUMAN COMPUTER INTERACTION; MICROPHONES; ROBOTICS; SPEECH ANALYSIS; SPEECH RECOGNITION;

AUDITORY SCENE ANALYSIS; POLYPHONIC MUSIC SIGNALS; SOUND SIGNALS; SOUND SOURCE LOCALIZATION;

SPEECH COMMUNICATION;

EID: 34548809160 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICKS.2007.7 Document Type: Conference Paper

Times cited : (17)

References (42)

1
- 80052339383
- Some experiments on the recognition of speech, with one and with two ears
- E. C. Cherry, "Some experiments on the recognition of speech, with one and with two ears," J. of ASA, vol. 25, 975-979, 1953.
- (1953) J. of ASA , vol.25 , pp. 975-979
- Cherry, E.C.¹

2
- 0003444613
- D. Rosenthal and H. G. Okuno, eds, Mahwah, New Jersey: Lawrence Erlbaum Associates
- D. Rosenthal and H. G. Okuno, eds., Computational Auditory Scene Analysis. Mahwah, New Jersey: Lawrence Erlbaum Associates, 1998.
- (1998) Computational Auditory Scene Analysis

3
- 33846170539
- Enhanced robot speech recognition based on microphone array source separation and missing feature theory
- S. Yamamoto, J.-M. Valin, K. Nakadai, T. Ogata, and H. G. Okuno, "Enhanced robot speech recognition based on microphone array source separation and missing feature theory," Proc. of IEEE ICRA 2005, 1489-1494.
- (2005) Proc. of IEEE ICRA , pp. 1489-1494
- Yamamoto, S.¹ Valin, J.-M.² Nakadai, K.³ Ogata, T.⁴ Okuno, H.G.⁵

4
- 33749539191
- Recognition of simultaneous speech by estimating reliability of separated signals for robot audition, PRICAI 2006: Trends in Artificial Intelligence
- S. Yamamoto, R. Takeda, K. Nakadai, M. Nakano, H. Tsujino, J.-M. Valin, K. Komatani, T. Ogata, and H. G. Okuno, "Recognition of simultaneous speech by estimating reliability of separated signals for robot audition," PRICAI 2006: Trends in Artificial Intelligence, LNCS 4099, 484-494, 2006.
- (2006) LNCS , vol.4099 , pp. 484-494
- Yamamoto, S.¹ Takeda, R.² Nakadai, K.³ Nakano, M.⁴ Tsujino, H.⁵ Valin, J.-M.⁶ Komatani, K.⁷ Ogata, T.⁸ Okuno, H.G.⁹

5
- 34250689497
- Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears
- R. Takeda, S. Yamamoto, K. Komatani, T. Ogata, and H. G. Okuno, "Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears," Proc. of IEEE/RSJ IROS 2006, 878-885.
- (2006) Proc. of IEEE/RSJ IROS , pp. 878-885
- Takeda, R.¹ Yamamoto, S.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

6
- 33746191291
- Genetic algorithm-based improvement of robot hearing capabilities in separating and recognizing simultaneous speech signals, Advances in Applied Artificial Intelligence
- S. Yamamoto, K. Nakadai, M. Nakano, H. Tsujino, J.-M. Valin, R. Takeda, K. Komatani, T. Ogata, and H. G. Okuno, "Genetic algorithm-based improvement of robot hearing capabilities in separating and recognizing simultaneous speech signals," Advances in Applied Artificial Intelligence, LNAI 4031, 207-217, 2006.
- (2006) LNAI , vol.4031 , pp. 207-217
- Yamamoto, S.¹ Nakadai, K.² Nakano, M.³ Tsujino, H.⁴ Valin, J.-M.⁵ Takeda, R.⁶ Komatani, K.⁷ Ogata, T.⁸ Okuno, H.G.⁹

7
- 34250652551
- Real-time robot audition system that recognizes simultaneous speech in the real world
- S. Yamamoto, R. Takeda, K. Nakadai, M. Nakano, H. Tsujino, J.-M. Valin, K. Komatani, T. Ogata, and H. G. Okuno, "Real-time robot audition system that recognizes simultaneous speech in the real world," Proc. of IEEE/RSJ IROS 2006, 5333-5338.
- (2006) Proc. of IEEE/RSJ IROS , pp. 5333-5338
- Yamamoto, S.¹ Takeda, R.² Nakadai, K.³ Nakano, M.⁴ Tsujino, H.⁵ Valin, J.-M.⁶ Komatani, K.⁷ Ogata, T.⁸ Okuno, H.G.⁹

8
- 33745188273
- Multiple moving speaker tracking by microphone array on mobile robot
- M. Murase, S. Yamamoto, J.-M. Valin, K. Nakadai, K. Yamada, K. Komatani, T. Ogata, and H. G. Okuno, "Multiple moving speaker tracking by microphone array on mobile robot," Eurospeech-2005, 249-252.
- (2005) Eurospeech , pp. 249-252
- Murase, M.¹ Yamamoto, S.² Valin, J.-M.³ Nakadai, K.⁴ Yamada, K.⁵ Komatani, K.⁶ Ogata, T.⁷ Okuno, H.G.⁸

9
- 34250684156
- Real-time tracking of multiple sound sources by integration of in-room and robot-embedded microphone arrays
- K. Nakadai, H. Nakajima, M. Murase, S. Kaijiri, K. Yamada, Y. Hasegawa, H. G. Okuno, and H. Tsujino, "Real-time tracking of multiple sound sources by integration of in-room and robot-embedded microphone arrays," Proc. of IROS 2006, 852-859.
- Proc. of IROS 2006 , pp. 852-859
- Nakadai, K.¹ Nakajima, H.² Murase, M.³ Kaijiri, S.⁴ Yamada, K.⁵ Hasegawa, Y.⁶ Okuno, H.G.⁷ Tsujino, H.⁸

10
- 34547541093
- Drum sound recognition for polyphonic audio signals by adaptation and matching of spectral templates with harmonic structure suppression
- K. Yoshii, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Drum sound recognition for polyphonic audio signals by adaptation and matching of spectral templates with harmonic structure suppression," IEEE Trans. on Audio, Speech and Language Processing, vol. 15, in print, 2007.

11
- 34548710579
- Drumix: An audio player with functions of real-time drum-part rearrangement for active music listening, J. of IPSJ, vol. 48, no. 3
- accepted
- K. Yoshii, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Drumix: An audio player with functions of real-time drum-part rearrangement for active music listening," J. of IPSJ, vol. 48, no. 3, accepted, 2007.
- (2007)
- Yoshii, K.¹ Goto, M.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

12
- 34547549019
- Integration and adaptation of harmonic and inharmonic models for separating polyphonic musical signals
- accepted
- K. Itoyama, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Integration and adaptation of harmonic and inharmonic models for separating polyphonic musical signals," Proc. of IEEE ICASSP'2007, accepted.
- (2007) Proc. of IEEE ICASSP
- Itoyama, K.¹ Goto, M.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

13
- 33846220762
- Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps
- Article ID 51979, p
- T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps," EURASIP J on Advances in Signal Processing, vol. 2007, Article ID 51979, p. 15, 2007.
- (2007) EURASIP J on Advances in Signal Processing , vol.2007 , pp. 15
- Kitahara, T.¹ Goto, M.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

14
- 34548737949
- Instrogram: Probabilistic representation of instrument existence for polyphonic music
- accepted
- T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Instrogram: Probabilistic representation of instrument existence for polyphonic music," J of IPSJ, vol. 48, 2007, accepted.
- (2007) J of IPSJ , vol.48
- Kitahara, T.¹ Goto, M.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

15
- 33947678880
- F0 estimation method for singing voice in polyphonic audio signal based on statistical vocal model and Viterbi search
- K. Fujihara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "F0 estimation method for singing voice in polyphonic audio signal based on statistical vocal model and Viterbi search," Proc. of IEEE ICASSP'2006, vol. V, 253-256.
- Proc. of IEEE ICASSP'2006 , vol.5 , pp. 253-256
- Fujihara, K.¹ Goto, M.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

16
- 34547508425
- Automatic synchronization between lyrics and music CD recordings based on Viterbi alignment of segregated vocal signals
- K. Fujihara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Automatic synchronization between lyrics and music CD recordings based on Viterbi alignment of segregated vocal signals," Proc. of IEEE ISM 2006, 257-264.
- (2006) Proc. of IEEE ISM , pp. 257-264
- Fujihara, K.¹ Goto, M.² Komatani, K.³ Ogata, T.⁴ Okuno, H.G.⁵

17
- 32244446324
- Common acoustical pole estimation from multi-channel musical audio signals
- T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, "Common acoustical pole estimation from multi-channel musical audio signals," IEICE Trans. on Fundamentals of Electronics, Communications, and Computer Sciences, vol. E89-A, no. 1, 240-247, 2006.
- (2006) IEICE Trans. on Fundamentals of Electronics, Communications, and Computer Sciences , vol.E89-A , Issue.1 , pp. 240-247
- Yoshioka, T.¹ Hikichi, T.² Miyoshi, M.³ Okuno, H.G.⁴

18
- 0141631749
- Musical instrument identification based on F0-dependent multivariate normal distribution
- T. Kitahara, M. Goto, and H. G. Okuno, "Musical instrument identification based on F0-dependent multivariate normal distribution," ICASSP-2003, 421-424.
- (2003) ICASSP , pp. 421-424
- Kitahara, T.¹ Goto, M.² Okuno, H.G.³

19
- 4544297184
- Category-level identification of non-registered musical instrument sounds
- T. Kitahara, M. Goto, and H. G. Okuno, "Category-level identification of non-registered musical instrument sounds," Proc. of ICASSP-2004, pp. 253-256.
- Proc. of ICASSP-2004 , pp. 253-256
- Kitahara, T.¹ Goto, M.² Okuno, H.G.³

20
- 4544229825
- Comparing features for forming music streams in automatic music transcription
- Y. Sakuraba, T. Kitahara, and H. G. Okuno, "Comparing features for forming music streams in automatic music transcription," Proc. of ICASSP-2004, 273-276.
- (2004) Proc. of ICASSP , pp. 273-276
- Sakuraba, Y.¹ Kitahara, T.² Okuno, H.G.³

21
- 44949088691
- Dynamic help generation by estimating user's mental model in spoken dialogue systems
- Y. Fukubayashi, K. Komatani, T. Ogata, and H. G. Okuno, "Dynamic help generation by estimating user's mental model in spoken dialogue systems," Proc. of Interspeech-2006, 1946-1949.
- Proc. of Interspeech-2006 , pp. 1946-1949
- Fukubayashi, Y.¹ Komatani, K.² Ogata, T.³ Okuno, H.G.⁴

22
- 84857778881
- Multi-domain spoken dialogue system with extensibility and robustness against speech recognition errors
- K. Komatani, N. Kanda, M. Nakano, K. Nakadai, H. Tsujino, T. Ogata, and H. G. Okuno, "Multi-domain spoken dialogue system with extensibility and robustness against speech recognition errors," Proc. of SIGdial Workshop on Discourse and Dialogue, 9-17, 2006.
- (2006) Proc. of SIGdial Workshop on Discourse and Dialogue , pp. 9-17
- Komatani, K.¹ Kanda, N.² Nakano, M.³ Nakadai, K.⁴ Tsujino, H.⁵ Ogata, T.⁶ Okuno, H.G.⁷

23
- 21444440253
- User modeling in spoken dialogue systems to generate flexible guidance
- K. Komatani, S. Ueno, T. Kawahara, and H. G. Okuno, "User modeling in spoken dialogue systems to generate flexible guidance," User Model. User-Adapt. Interact, vol. 15, no. 1-2, 169-183, 2005.
- (2005) User Model. User-Adapt. Interact , vol.15 , Issue.1-2 , pp. 169-183
- Komatani, K.¹ Ueno, S.² Kawahara, T.³ Okuno, H.G.⁴

24
- 22944489210
- Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes, PRICAI2004: Trends in Artificial Intelligence
- K. Ishihara, T. Nakatani, T. Ogata, and H. G. Okuno, "Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes," PRICAI2004: Trends in Artificial Intelligence, LNCS 3157, 909-918, 2004.
- (2004) LNCS , vol.3157 , pp. 909-918
- Ishihara, K.¹ Nakatani, T.² Ogata, T.³ Okuno, H.G.⁴

25
- 34548807783
- Extracting multi-modal dynamics of objects using RNNPB
- T. Ogata, H. Ohba, J. Tani, K. Komatani, and H. G. Okuno, "Extracting multi-modal dynamics of objects using RNNPB," J of Robotics and Mechatronics, vol. 17, no. 6, 681-688, 2007.
- (2007) J of Robotics and Mechatronics , vol.17 , Issue.6 , pp. 681-688
- Ogata, T.¹ Ohba, H.² Tani, J.³ Komatani, K.⁴ Okuno, H.G.⁵

26
- 50149091391
- Generation of robot motions from environmental sounds using inter-modality mapping by RNNPB
- T. Ogata, Y. Hattori, H. Kozima, K. Komatani, and H. G. Okuno, "Generation of robot motions from environmental sounds using inter-modality mapping by RNNPB," Proc. of EpiRobo-2006, 95-102.
- Proc. of EpiRobo-2006 , pp. 95-102
- Ogata, T.¹ Hattori, Y.² Kozima, H.³ Komatani, K.⁴ Okuno, H.G.⁵

27
- 26944446929
- Distance based dynamic interaction of humanoid robot with multiple people, Innovations in Applied Artificial Intelligence
- T. Tasaki, S. Matsumoto, H. Ohba, M. Toda, K. Komatani, T. Ogata, and H. Okuno, "Distance based dynamic interaction of humanoid robot with multiple people," Innovations in Applied Artificial Intelligence, LNAI 3533, 111-120, 2005.
- (2005) LNAI , vol.3533 , pp. 111-120
- Tasaki, T.¹ Matsumoto, S.² Ohba, H.³ Toda, M.⁴ Komatani, K.⁵ Ogata, T.⁶ Okuno, H.⁷

28
- 10444249505
- Computational auditory scene analysis and its application to robot audition
- IEEE
- H. G. Okuno, T. Ogata, K. Komatani, and K. Nakadai, "Computational auditory scene analysis and its application to robot audition," Proc. of ICKS 2004, 73-80, IEEE.
- Proc. of ICKS 2004 , pp. 73-80
- Okuno, H.G.¹ Ogata, T.² Komatani, K.³ Nakadai, K.⁴

29
- 14044260635
- Enhanced robot audition based on microphone array source separation with postfilter
- J.-M. Valin, J. Rouat, and F. Michaud, "Enhanced robot audition based on microphone array source separation with postfilter," IROS 2004, 2123-2128.
- IROS 2004 , pp. 2123-2128
- Valin, J.-M.¹ Rouat, J.² Michaud, F.³

30
- 85032752225
- Missing-feature approaches in speech recognition
- IEEE
- B. Raj and R. M. Stern, "Missing-feature approaches in speech recognition," Signal Processing Magazine, vol. 22, no. 5, 101-116, IEEE, 2005.
- (2005) Signal Processing Magazine , vol.22 , Issue.5 , pp. 101-116
- Raj, B.¹ Stern, R.M.²

31
- 85009106519
- Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
- ESCA
- J. Barker, M. Cooke, and P. Green, "Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise," Proc. of Eurospeech-2001, 213-216, ESCA, 2001.
- (2001) Proc. of Eurospeech-2001 , pp. 213-216
- Barker, J.¹ Cooke, M.² Green, P.³

32
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- May
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Communication, vol. 34, 267-285, May 2001.
- (2001) Speech Communication , vol.34 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

33
- 14044275759
- Code reusability tools for programming mobile robots
- C. Côté, D. Létourneau, F. Michaud, J.-M. Valin, Y. Brosseau, C. Raievsky, M. Lemay, and V. Tran, "Code reusability tools for programming mobile robots," Proc. of IEEE/RSJ IROS 2004, 1820-1825.
- (2004) Proc. of IEEE/RSJ IROS , pp. 1820-1825
- Côté, C.¹ Létourneau, D.² Michaud, F.³ Valin, J.-M.⁴ Brosseau, Y.⁵ Raievsky, C.⁶ Lemay, M.⁷ Tran, V.⁸

34
- 33746259880
- Y. Nishimura and S. Furui, "Multiband Julius." http://www.furui.cs.titech.ac.jp/mband.julius/, 2005.
- (2005) Multiband Julius
- Nishimura, Y.¹ Furui, S.²

35
- 3042525207
- Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach
- J.-M. Valin, F. Michaud, B. Hadjou, and J. Rouat, "Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach," ICRA 2004, 1033-1038.
- ICRA 2004 , pp. 1033-1038
- Valin, J.-M.¹ Michaud, F.² Hadjou, B.³ Rouat, J.⁴

36
- 0036753896
- Geometric source separation: Merging convolutive source separation with geometric beamforming
- L. C. Parra and C. V. Alvino, "Geometric source separation: Merging convolutive source separation with geometric beamforming," IEEE Trans. on Speech and Audio Processing, vol. 10, no. 6, 352-362, 2002.
- (2002) IEEE Trans. on Speech and Audio Processing , vol.10 , Issue.6 , pp. 352-362
- Parra, L.C.¹ Alvino, C.V.²

37
- 0141847846
- Microphone array post-filtering for non-stationary noise suppression
- I. Cohen and B. Berdugo, "Microphone array post-filtering for non-stationary noise suppression," Proc. of ICASSP-2002, 901-904.
- (2002) Proc. of ICASSP , pp. 901-904
- Cohen, I.¹ Berdugo, B.²

38
- 85009144958
- Free software toolkit for Japanese large vocabulary continuous speech recognition
- T. Kawahara and A. Lee, "Free software toolkit for Japanese large vocabulary continuous speech recognition," Proc. of ICSLP-2000, 476-479.
- Proc. of ICSLP-2000 , pp. 476-479
- Kawahara, T.¹ Lee, A.²

39
- 33947625206
- Automatic drum sound description for real-world music using template adaptation and matching methods
- K. Yoshii, M. Goto, and H. G. Okuno, "Automatic drum sound description for real-world music using template adaptation and matching methods," Proc. of ISMIR-2004, 184-191.
- Proc. of ISMIR-2004 , pp. 184-191
- Yoshii, K.¹ Goto, M.² Okuno, H.G.³

40
- 4444224374
- An audio-based real-time beat tracking system for music with or without drum-sounds
- M. Goto, "An audio-based real-time beat tracking system for music with or without drum-sounds," J. of New Music Research, vol. 30, no. 2, 159-171, 2001.
- (2001) J. of New Music Research , vol.30 , Issue.2 , pp. 159-171
- Goto, M.¹

41
- 23944456976
- Exploring music collections by browsing different views
- E. Pampalk, S. Dixon, and G. Widmer, "Exploring music collections by browsing different views," Proc. of ISMIR-2003, 201-208.
- Proc. of ISMIR-2003 , pp. 201-208
- Pampalk, E.¹ Dixon, S.² Widmer, G.³

42
- 0141623871
- RWC music database: Popular, classical, and jazz music databases
- M. Goto, T. Hashiguchi, and R. Oka, "RWC music database: Popular, classical, and jazz music databases," Proc. of ISMIR-2002, 287-288.
- Proc. of ISMIR-2002 , pp. 287-288
- Goto, M.¹ Hashiguchi, T.² Oka, R.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.