SCOPUS Information Retrieval Platform

IEEE International Conference on Intelligent Robots and Systems

Volume , Issue , 2007, Pages 1751-1756

Coarse speech recognition by audio-visual integration based on missing feature theory

(3) Koiwa, Tomoaki a,b; Nakadai, Kazuhiro a,b; Imura, Jun Ichi a

a TOKYO INSTITUTE OF TECHNOLOGY (Japan)

b HONDA RESEARCH INSTITUTE JAPAN CO LTD (Japan)

Author keywords

[No Author keywords available]

Indexed keywords

AUDIO-VISUAL; AUDIO-VISUAL SPEECH RECOGNITION; INTERNATIONAL CONFERENCES; MISSING FEATURE THEORY; REAL-WORLD;

ACOUSTIC NOISE; INTELLIGENT ROBOTS; INTELLIGENT SYSTEMS; ROBOTICS; SPEECH; SPEECH ANALYSIS;

SPEECH RECOGNITION;

EID: 51349110555 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IROS.2007.4399300 Document Type: Conference Paper

Times cited : (10)

References (21)

1
- 10444237268
- Improvement of recognition of simultaneous speech signals using AV integration and scattering theory for humanoid robots
- K. Nakadai et al., "Improvement of recognition of simultaneous speech signals using AV integration and scattering theory for humanoid robots", Speech Communication, vol.44, 2004, pp.97-112.
- (2004) Speech Communication , vol.44 , pp. 97-112
- Nakadai, K.¹

2
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C.J. Leggetter et al., "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models", Computer Speech and Language, vol.9, 1995, pp. 171-185.
- (1995) Computer Speech and Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹

3
- 85122848536
- Active audition for humanoid
- K. Nakadai et al., "Active audition for humanoid", in Proc. of 17th National Conference on Artificial Intelligence, 2000, pp.832-839.
- (2000) Proc. of 17th National Conference on Artificial Intelligence , pp. 832-839
- Nakadai, K.¹

4
- 34250652551
- Real-time robot audition system that recognizes simultaneous speech in the real world
- IEEE
- S. Yamamoto et al., "Real-time robot audition system that recognizes simultaneous speech in the real world", IROS-2006, pp.5333-5338. IEEE.
- (2006) IROS-2006 , pp. 5333-5338
- Yamamoto, S.¹

5
- 14044262966
- Robust speech interface based on audio and video information fusion for humanoid HRP-2
- IEEE
- I. Hara et al., "Robust speech interface based on audio and video information fusion for humanoid HRP-2", IROS-2004, pp.2404-2410. IEEE.
- (2004) IROS-2004 , pp. 2404-2410
- Hara, I.¹

6
- 51349162990
- Real-time auditory and visual multiple-object tracking for robots
- K. Nakadai et al., "Real-time auditory and visual multiple-object tracking for robots", IJCAI-2001, MIT Press, pp.1424-1432.

7
- 0035386489
- A cascade visual front end for speaker independent automatic speechreading
- G. Potamianos et al., "A cascade visual front end for speaker independent automatic speechreading", Speech Technology, Special Issue on Multimedia, vol.4, 2001, pp.193-208.
- (2001) Speech Technology , vol.4 , pp. 193-208
- Potamianos, G.¹

8
- 33646814706
- A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization
- S. Tamura et al., "A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization", Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing, 2005, SP-P5.2.
- (2005) Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing
- Tamura, S.¹

9
- 0030638031
- A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
- IEEE
- J. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)", in Proc. of the Workshop on Automatic Speech Recognition and Understanding, IEEE, 1997, pp.347-354.
- (1997) Proc. of the Workshop on Automatic Speech Recognition and Understanding , pp. 347-354
- Fiscus, J.¹

10
- 85009106519
- Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
- ESCA
- J. Barker et al., "Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise", Proc. of 7th European Conference on Speech Communication Technology, 2001, pp.213-216. ESCA.
- (2001) Proc. of 7th European Conference on Speech Communication Technology , pp. 213-216
- Barker, J.¹

11
- 85009143830
- Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR
- A. Hagen et al., "Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR", Proc. of Int'l Conf. on Spoken Language Processing, 2000, pp.345-348.
- (2000) Proc. of Int'l Conf. on Spoken Language Processing , pp. 345-348
- Hagen, A.¹

12
- 0030355935
- A new ASR approach based on independent processing and recombination of partial frequency bands
- H. Bourlard et al., "A new ASR approach based on independent processing and recombination of partial frequency bands", Proc. of Int'l Conf. on Spoken Language Processing, 1996, pp.426-429.
- (1996) Proc. of Int'l Conf. on Spoken Language Processing , pp. 426-429
- Bourlard, H.¹

13
- 48149111531
- Speech recognition for a humanoid with motor noise utilizing missing feature theory
- IEEE
- Y. Nishimura et al., "Speech recognition for a humanoid with motor noise utilizing missing feature theory", Proc. of Int'l Conf. on Humanoid Robots, 2006, pp.26-33. IEEE.
- (2006) Proc. of Int'l Conf. on Humanoid Robots , pp. 26-33
- Nishimura, Y.¹

14
- 84955023511
- An analysis of perceptual confusions among some English consonants
- G. Miller et al., "An analysis of perceptual confusions among some English consonants", JASA, vol.27, 1955, pp.338-352.
- (1955) JASA , vol.27 , pp. 338-352
- Miller, G.¹

15
- 0017357502
- Effect of training on the visual recognition of consonants
- B. Walden et al., "Effect of training on the visual recognition of consonants", J. of Speech and Hearing Research, vol.20, 1977, pp. 130-145.
- (1977) J. of Speech and Hearing Research , vol.20 , pp. 130-145
- Walden, B.¹

16
- 0037221164
- Look at the big picture (details will follow)
- D. Ringach, "Look at the big picture (details will follow)", Nature Neuroscience, vol.6, 2003, no.1, pp.7-8.
- (2003) Nature Neuroscience , vol.6 , Issue.1 , pp. 7-8
- Ringach, D.¹

17
- 0020498257
- Isolated word recognition using phoneme-like templates
- N. Sugamura et al., "Isolated word recognition using phoneme-like templates", Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing, 1983, pp.723-726.
- (1983) Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing , pp. 723-726
- Sugamura, N.¹

18
- 0029229987
- Markov model based phoneme class partitioning for improved constrained iterative speech enhancement
- J.H.L. Hansen et al., "Markov model based phoneme class partitioning for improved constrained iterative speech enhancement", IEEE Trans. on Speech and Audio Processing, vol.3, 1995, no.1, pp.98-104.
- (1995) IEEE Trans. on Speech and Audio Processing , vol.3 , Issue.1 , pp. 98-104
- Hansen, J.H.L.¹

19
- 85032689322
- Minimum cost based phoneme class detection for improved iterative speech enhancement
- L. Arslan et al., "Minimum cost based phoneme class detection for improved iterative speech enhancement", Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing, vol.II, 1994, pp.45-48.
- (1994) Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing , vol.2 , pp. 45-48
- Arslan, L.¹

20
- 33947659967
- Liptracking and MPEG4 animation with feed-back control
- B. Beaumesnil et al., "Liptracking and MPEG4 animation with feed-back control", Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing, vol.II, 2006, pp.677-680.
- (2006) Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing , vol.2 , pp. 677-680
- Beaumesnil, B.¹

21
- 0036298106
- Missing data speech recognition in reverberant conditions
- K. Palomaki et al., "Missing data speech recognition in reverberant conditions", Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing, vol.I, 2002, pp.65-68.
- (2002) Proc. of Int'l Conf. on Acoustics, Speech and Signal Processing , vol.1 , pp. 65-68
- Palomaki, K.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.