2013, Pages 8081-8085

Unsupervised discovery of linguistic structure including two-level acoustic patterns using three cascaded stages of iterative optimization

Author keywords

hidden Markov models; iterative optimization; spoken term detection; unsupervised learning; zero resource speech recognition

Indexed keywords

INITIALIZATION STEP; ITERATIVE OPTIMIZATION; LARGE VOCABULARY; LINGUISTIC STRUCTURE; MANUAL ANNOTATION; MODEL PARAMETERS; N-GRAM LANGUAGE MODELS; SPOKEN TERM DETECTIONS

EID: 84890479779     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6639239     Document Type: Conference Paper
Times cited: 28

References (35)
  • 1
    • 84865770260
    • Towards unsupervised training of speaker independent acoustic models
    • A. Jansen and K. Church "Towards Unsupervised Training of Speaker Independent Acoustic Models" in InterSpeech, 2011, pp. 1693-1696.
    • (2011) InterSpeech , pp. 1693-1696
    • Jansen, A.1    Church, K.2
  • 2
    • 84867809023
    • A nonparametric Bayesian approach to acoustic model discovery
    • C. Lee and J. Glass, "A Nonparametric Bayesian Approach to Acoustic Model Discovery" in Proc. The Association for Computational Linguistics, 2012, vol. 1, pp. 40-49.
    • (2012) Proc. The Association for Computational Linguistics , vol.1 , pp. 40-49
    • Lee, C.1    Glass, J.2
  • 3
    • 70450158585
    • Unsupervised training of an HMM-based speech recognizer for topic classification
    • H. Gish, M. Siu, A. Chan, and B. Belfield, "Unsupervised training of an HMM-based Speech Recognizer for Topic Classification" in InterSpeech, 2009, pp. 1935-1938.
    • (2009) InterSpeech , pp. 1935-1938
    • Gish, H.1    Siu, M.2    Chan, A.3    Belfield, B.4
  • 4
    • 79959819374
    • Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision
    • M. Siu, H. Gish, A. Chan, and W. Belfield, "Improved Topic Classification and Keyword Discovery using an HMM-based Speech Recognizer Trained without Supervision" in InterSpeech, 2010, pp. 2838-2841.
    • (2010) InterSpeech , pp. 2838-2841
    • Siu, M.1    Gish, H.2    Chan, A.3    Belfield, W.4
  • 5
    • 80051626575
    • Unsupervised acoustic sub-word unit detection for query-by-example spoken term detection
    • M. Huijbregts, M. McLaren, and D. van Leeuwen, "Unsupervised acoustic sub-word unit detection for query-by-example spoken term detection," in ICASSP, 2011, pp. 4436-4439.
    • (2011) ICASSP , pp. 4436-4439
    • Huijbregts, M.1    McLaren, M.2    Van Leeuwen, D.3
  • 6
    • 70349210894
    • Unsupervised acoustic and language model training with small amounts of labelled data
    • S. Novotney, R. Schwartz, and J. Ma, "Unsupervised acoustic and language model training with small amounts of labelled data," in ICASSP, 2009, pp. 4297-4300.
    • (2009) ICASSP , pp. 4297-4300
    • Novotney, S.1    Schwartz, R.2    Ma, J.3
  • 7
    • 84865757470
    • Unsupervised hidden Markov modeling of spoken queries for spoken term detection without speech recognition
    • C. Chan and L. Lee, "Unsupervised Hidden Markov Modeling of Spoken Queries for Spoken Term Detection without Speech Recognition" in InterSpeech, 2011, pp. 2141-2144.
    • (2011) InterSpeech , pp. 2141-2144
    • Chan, C.1    Lee, L.2
  • 8
    • 84867209590
    • Computational language acquisition by statistical bottom-up processing
    • O. J. Rasanen, U. K. Laine, and T. Altosaar, "Computational language acquisition by statistical bottom-up processing," in InterSpeech, 2008, pp. 1980-1983.
    • (2008) InterSpeech , pp. 1980-1983
    • Rasanen, O.J.1    Laine, U.K.2    Altosaar, T.3
  • 9
    • 70450212196
    • A noise robust method for pattern discovery in quantized time series: The concept matrix approach
    • O. J. Rasanen, U. K. Laine, and T. Altosaar, "A noise robust method for pattern discovery in quantized time series: the concept matrix approach," in InterSpeech, 2009, pp. 3035-3038.
    • (2009) InterSpeech , pp. 3035-3038
    • Rasanen, O.J.1    Laine, U.K.2    Altosaar, T.3
  • 10
    • 70450191104
    • Self-learning vector quantization for pattern discovery from speech
    • O. J. Rasanen, U. K. Laine, T. Altosaar, "Self-learning vector quantization for pattern discovery from speech," in InterSpeech, 2009, pp. 852-855.
    • (2009) InterSpeech , pp. 852-855
    • Rasanen, O.J.1    Laine, U.K.2    Altosaar, T.3
  • 13
    • 51449096712
    • Unsupervised optimal phoneme segmentation: Objectives, algorithm and comparisons
    • Y. Qiao, N. Shimomura, and N. Minematsu, "Unsupervised optimal phoneme segmentation: objectives, algorithm and comparisons," in ICASSP, 2008, pp. 3989-3992.
    • (2008) ICASSP , pp. 3989-3992
    • Qiao, Y.1    Shimomura, N.2    Minematsu, N.3
  • 14
    • 80051622244
    • Integrating frame-based and segment-based dynamic time warping for unsupervised spoken term detection with spoken queries
    • C. Chan and L. Lee, "Integrating Frame-based and Segment-based Dynamic Time Warping for Unsupervised Spoken Term Detection with Spoken Queries" in ICASSP, 2011, pp. 5652-5655.
    • (2011) ICASSP , pp. 5652-5655
    • Chan, C.1    Lee, L.2
  • 15
    • 84890526441
    • Toward unsupervised model-based spoken term detection with spoken queries without annotated data
    • C. Chan, C. Chung, Y. Kuo and L. Lee, "Toward Unsupervised Model-based Spoken Term Detection with Spoken Queries without Annotated Data" in ICASSP, 2013
    • (2013) ICASSP
    • Chan, C.1    Chung, C.2    Kuo, Y.3    Lee, L.4
  • 16
    • 85013744934
    • A successive state splitting algorithm for efficient allophone modeling
    • J. Takami and S. Sagayama, "A successive state splitting algorithm for efficient allophone modeling," in ICASSP, 1992, vol. 1, pp. 573-576.
    • (1992) ICASSP , vol.1 , pp. 573-576
    • Takami, J.1    Sagayama, S.2
  • 17
    • 0029745231
    • Maximum likelihood successive state splitting
    • H. Singer and M. Ostendorf, "Maximum likelihood successive state splitting," in ICASSP, 1997, vol. 2, pp. 601-604.
    • (1997) ICASSP , vol.2 , pp. 601-604
    • Singer, H.1    Ostendorf, M.2
  • 18
    • 84867221475
    • Automatically learning speaker-independent acoustic subword units
    • B. Varadarajan and S. Khudanpur, "Automatically Learning Speaker-independent Acoustic Subword Units," in InterSpeech, 2008.
    • (2008) InterSpeech
    • Varadarajan, B.1    Khudanpur, S.2
  • 19
    • 77955759248
    • Performance analysis for lattice-based speech indexing approaches using word and subword units
    • Y.-c. Pan and L.-s. Lee, "Performance analysis for lattice-based speech indexing approaches using word and subword units," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, August 2010, pp. 1562-1574.
    • (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.6 , pp. 1562-1574
    • Pan, Y.-C.1    Lee, L.-S.2
  • 20
    • 79959851706
    • Towards spoken term discovery at scale with zero resources
    • A. Jansen, K. Church, and H. Hermansky, "Towards Spoken Term Discovery At Scale With Zero Resources" in InterSpeech, 2010, pp. 1676-1679.
    • (2010) InterSpeech , pp. 1676-1679
    • Jansen, A.1    Church, K.2    Hermansky, H.3
  • 21
    • 0007493259
    • Topological gray-scale watershed transform
    • M. Couprie, G. Bertrand, "Topological gray-scale watershed transform," in Proc. of SPIE Vision Geometry V, 1997, vol. 3168, pp. 136-146.
    • (1997) Proc. of SPIE Vision Geometry V , vol.3168 , pp. 136-146
    • Couprie, M.1    Bertrand, G.2
  • 22
    • 0013281072
    • Updateable PAT-tree approach to Chinese key phrase extraction using mutual information: A linguistic foundation for knowledge management
    • T. Ong and H. Chen, "Updateable PAT-Tree Approach to Chinese Key Phrase Extraction using Mutual Information: A Linguistic Foundation for Knowledge Management," in Proc. The Second Asian Digital Library Conference, 1999, pp. 63-84.
    • (1999) Proc. The Second Asian Digital Library Conference , pp. 63-84
    • Ong, T.1    Chen, H.2
  • 23
    • 84890511750
    • Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns
    • H. Lee, Y. Li, C. Chung, and L. Lee, "Enhancing Query Expansion for Semantic Retrieval of Spoken Content with Automatically Discovered Acoustic Patterns," in ICASSP, 2013
    • (2013) ICASSP
    • Lee, H.1    Li, Y.2    Chung, C.3    Lee, L.4
  • 24
    • 34547516258
    • Approximating the Kullback Leibler divergence between Gaussian mixture models
    • J. Hershey and P. Olsen, "Approximating the Kullback Leibler Divergence between Gaussian Mixture Models" in ICASSP, 2007, vol. 4, pp. 317-320.
    • (2007) ICASSP , vol.4 , pp. 317-320
    • Hershey, J.1    Olsen, P.2
  • 25
    • 84867316017
    • The spoken web search task at MediaEval 2011
    • F. Metze, N. Rajput et al., "The spoken web search task at MediaEval 2011," in ICASSP, 2012, pp. 5165-5168.
    • (2012) ICASSP , pp. 5165-5168
    • Metze, F.1    Rajput, N.2
  • 26
    • 84867600320
    • An acoustic segment modeling approach to query-by-example spoken term detection
    • H. Wang, C.-C. Leung, T. Lee, B. Ma, and H. Li, "An acoustic segment modeling approach to query-by-example spoken term detection," in ICASSP, 2012, pp. 5157-5160.
    • (2012) ICASSP , pp. 5157-5160
    • Wang, H.1    Leung, C.-C.2    Lee, T.3    Ma, B.4    Li, H.5
  • 27
    • 33947644326
    • Keyword spotting of arbitrary words using minimal speech resources
    • A. Garcia and H. Gish, "Keyword spotting of arbitrary words using minimal speech resources," in ICASSP, 2006.
    • (2006) ICASSP
    • Garcia, A.1    Gish, H.2
  • 28
    • 58049100564
    • A phonetic search approach to the 2006 NIST spoken term detection evaluation
    • R. Wallace, R. Vogt, and S. Sridharan, "A phonetic search approach to the 2006 NIST spoken term detection evaluation," in InterSpeech, 2007, pp. 2385-2388.
    • (2007) InterSpeech , pp. 2385-2388
    • Wallace, R.1    Vogt, R.2    Sridharan, S.3
  • 29
    • 84865709671
    • Open vocabulary spoken-document retrieval based on query expansion using related web documents
    • M. Terao, T. Koshinaka, S. Ando, R. Isotani, and A. Okumura, "Open vocabulary spoken-document retrieval based on query expansion using related web documents," in InterSpeech, 2008, pp. 2171-2174.
    • (2008) InterSpeech , pp. 2171-2174
    • Terao, M.1    Koshinaka, T.2    Ando, S.3    Isotani, R.4    Okumura, A.5
  • 30
    • 70450160623
    • A comparison of query-by-example methods for spoken term detection
    • W. Shen, C. M. White, and T. J. Hazen, "A comparison of query-by-example methods for spoken term detection," in InterSpeech, 2009, pp. 2143-2146.
    • (2009) InterSpeech , pp. 2143-2146
    • Shen, W.1    White, C.M.2    Hazen, T.J.3
  • 32
    • 84865770619
    • A piecewise aggregate approximation lower-bound estimate for posteriorgram-based dynamic time warping
    • Y. Zhang and J. R. Glass, "A piecewise aggregate approximation lower-bound estimate for posteriorgram-based dynamic time warping," in InterSpeech, 2011, pp. 1909-1912.
    • (2011) InterSpeech , pp. 1909-1912
    • Zhang, Y.1    Glass, J.R.2
  • 33
    • 0035509488
    • Speech recognition and utterance verification based on a generalized confidence score
    • M.-W. Koo, C.-H. Lee, and B.-H. Juang, "Speech recognition and utterance verification based on a generalized confidence score," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 8, pp. 821-832, 2001.
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.8
    • Koo, M.-W.1    Lee, C.-H.2    Juang, B.-H.3
  • 34
    • 78049411640
    • An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition
    • Y. Tsao, H. Sun, H. Li, and C.-H. Lee, "An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition," in ICASSP, 2010, pp. 4422-4425.
    • (2010) ICASSP , pp. 4422-4425
    • Tsao, Y.1    Sun, H.2    Li, H.3    Lee, C.-H.4
  • 35
    • 33750319706
    • Overview of TREC 2006
    • E. Voorhees, "Overview of TREC 2006," in TREC, 2006.
    • (2006) TREC
    • Voorhees, E.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.