-
1
-
-
85085788280
-
TRECVID 2013-An introduction to the goals, tasks, data, evaluation mechanisms, and metrics
-
Gaithersburg, MD; U.S.A., Nov., National Institute of Standards and Technology
-
Paul Over, Jon Fiscus, and Greg Sanders, "TRECVID 2013-An introduction to the goals, tasks, data, evaluation mechanisms, and metrics," in Proc. TRECVID, Gaithersburg, MD; U.S.A., Nov. 2013, National Institute of Standards and Technology, http://www-nlpir.nist.gov/projects/tv2013/.
-
(2013)
Proc. TRECVID
-
-
Over, P.1
Fiscus, J.2
Sanders, G.3
-
2
-
-
84937454179
-
Creating HAVIC: Heterogeneous audio visual internet collection
-
Istanbul, Turkey, May 2012, European Language Resources Association (ELRA)
-
Stephanie Strassel, Amanda Morris, Jonathan Fiscus, Christopher Caruso, Haejoong Lee, Paul Over, James Fiumara, Barbara Shaw, Brian Antonishek, and Martial Michel, "Creating HAVIC: Heterogeneous audio visual internet collection," in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey, May 2012, European Language Resources Association (ELRA).
-
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
-
-
Strassel, S.1
Morris, A.2
Fiscus, J.3
Caruso, C.4
Lee, H.5
Over, P.6
Fiumara, J.7
Shaw, B.8
Antonishek, B.9
Michel, M.10
-
3
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
Geoffrey E. Hinton, Li Deng, Dong Yu, George E. Dahl, Abdelrahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, and Brian Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
IEEE Signal Process. Mag
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.E.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
4
-
-
84905262359
-
-
Tech. Rep. CMU-LTI-12-07, Carnegie Mellon University, Pittsburgh, PA; U.S.A
-
Susanne Burger, Qin Jin, Peter F. Schulam, and Florian Metze, "Noisemes: Manual annotation of environmental noise in audio streams," Tech. Rep. CMU-LTI-12-07, Carnegie Mellon University, Pittsburgh, PA; U.S.A., 2012.
-
(2012)
Noisemes: Manual Annotation of Environmental Noise in Audio Streams
-
-
Burger, S.1
Jin, Q.2
Schulam, P.F.3
Metze, F.4
-
5
-
-
84878551166
-
Event-based video retrieval using audio
-
Qin Jin, Peter F. Schulam, Shourabh Rawat, Susanne Burger, Duo Ding, and Florian Metze, "Event-based video retrieval using audio," In Proc. INTERSPEECH
-
Proc. INTERSPEECH
-
-
Jin, Q.1
Schulam, P.F.2
Rawat, S.3
Burger, S.4
Ding, D.5
Metze, F.6
-
6
-
-
85041458689
-
Audio concept ranking for video event detection on user-generated content
-
Marseille, France, Aug., ISCA
-
Benjamin Elizalde, Mirco Ravanelli, and Gerald Friedland, "Audio concept ranking for video event detection on user-generated content," in Proceedings of the First Workshop on Speech, Language and Audio in Multimedia (SLAM), Marseille, France, Aug. 2013, ISCA.
-
(2013)
Proceedings of the First Workshop on Speech, Language and Audio in Multimedia (SLAM)
-
-
Elizalde, B.1
Ravanelli, M.2
Friedland, G.3
-
7
-
-
84870415507
-
Supervised acoustic concept extraction for multimedia event detection
-
Nara; Japan, Oct., ACM
-
Stephanie Pancoast, Murat Akbacak, and Michelle Sanchez, "Supervised acoustic concept extraction for multimedia event detection," in ACM Multimedia Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA), Nara; Japan, Oct. 2012, ACM.
-
(2012)
ACM Multimedia Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA)
-
-
Pancoast, S.1
Akbacak, M.2
Sanchez, M.3
-
8
-
-
84905270442
-
IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System
-
Gaithersburg, MD; U.S.A., Nov., National Institute of Standards and Technology
-
Liangliang Cao, Shih-Fu Chang, Noel Codella, Courtenay Cotton, Dan Ellis, Leiguang Gong, Matthew Hill, Gang Hua, John Kender, Michele Merler, Yadong Mu, Apostol Natseve, and John R. Smith, "IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System," in Proc. TRECVID, Gaithersburg, MD; U.S.A., Nov. 2011, National Institute of Standards and Technology, http://www-nlpir.nist.gov/projects/tv2011/.
-
(2011)
Proc. TRECVID
-
-
Cao, L.1
Chang, S.-F.2
Codella, N.3
Cotton, C.4
Ellis, D.5
Gong, L.6
Hill, M.7
Hua, G.8
Kender, J.9
Merler, M.10
Mu, Y.11
Natseve, A.12
Smith, J.R.13
-
9
-
-
84906214187
-
Robust audio codebooks for large scale event detection in consumer videos
-
Lyon; France, Aug., ISCA
-
Shourabh Rawat, Peter Schulam, Susanne Burger, Duo Ding, Yipei Wang, and Florian Metze, "Robust audio codebooks for large scale event detection in consumer videos," in Proc. INTERSPEECH, Lyon; France, Aug. 2013, ISCA.
-
(2013)
Proc. INTERSPEECH
-
-
Rawat, S.1
Schulam, P.2
Burger, S.3
Ding, D.4
Wang, Y.5
Metze, F.6
-
10
-
-
84878606595
-
Bag-of-Audiowords approach for multimedia event classification
-
Stephanie Pancoast and Murat Akbacak, "Bag-of-Audiowords approach for multimedia event classification," In Proc. INTERSPEECH [27].
-
Proc. INTERSPEECH [27]
-
-
Pancoast, S.1
Akbacak, M.2
-
11
-
-
84878580398
-
Compact audio representation for event detection in consumer media
-
Xiaodan Zhuang, Stavros Tsakalidis, Shuang Wu, Pradeep Natarajan, Rohit Prasad, and Prem Natarajan, "Compact audio representation for event detection in consumer media," In Proc. INTERSPEECH [27].
-
Proc. INTERSPEECH [27]
-
-
Zhuang, X.1
Tsakalidis, S.2
Wu, S.3
Natarajan, P.4
Prasad, R.5
Natarajan, P.6
-
12
-
-
84878587807
-
Robust event detection from spoken content in consumer domain videos
-
Stavros Tsakalidis, Xiaodan Zhuang, Roger Hsiao, Shuang Wu, Pradeep Natarajan, Rohit Prasad, and Prem Natarajan, "Robust event detection from spoken content in consumer domain videos," In Proc. INTERSPEECH [27].
-
Proc. INTERSPEECH [27]
-
-
Tsakalidis, S.1
Zhuang, X.2
Hsiao, R.3
Wu, S.4
Natarajan, P.5
Prasad, R.6
Natarajan, P.7
-
13
-
-
84455207538
-
Audio-visual fusion using Bayesian model combination for web video retrieval
-
New York, NY, USA, MM '11, ACM
-
Vasant Manohar, Stavros Tsakalidis, Pradeep Natarajan, Rohit Prasad, and Prem Natarajan, "Audio-visual fusion using Bayesian model combination for web video retrieval," in Proceedings of the 19th ACM International Conference on Multimedia, New York, NY, USA, 2011, MM '11, pp. 1537-1540, ACM.
-
(2011)
Proceedings of the 19th ACM International Conference on Multimedia
, pp. 1537-1540
-
-
Manohar, V.1
Tsakalidis, S.2
Natarajan, P.3
Prasad, R.4
Natarajan, P.5
-
14
-
-
84937415065
-
-
National Institute of Standards and Technology, Aug. 2013, Last accessed: April 15
-
National Institute of Standards and Technology, "2013 TRECVID Multimedia Event Detection Track," http://www.nist.gov/itl/iad/mig/med13.cfm, Aug. 2013, Last accessed: April 15, 2014.
-
(2014)
2013 TRECVID Multimedia Event Detection Track
-
-
-
16
-
-
84962868641
-
A one-pass decoder based on polymorphic linguistic context assignment
-
Madonna di Campiglio, Italy, Dec., IEEE
-
Hagen Soltau, Florian Metze, Christian Fügen, and Alex Waibel, "A One-pass Decoder based on Polymorphic Linguistic Context Assignment," in Proc. Automatic Speech Recognition and Understanding (ASRU), Madonna di Campiglio, Italy, Dec. 2001, IEEE.
-
(2001)
Proc. Automatic Speech Recognition and Understanding (ASRU)
-
-
Soltau, H.1
Metze, F.2
Fügen, C.3
Waibel, A.4
-
17
-
-
84953744816
-
A statistical interpretation of term specificity and its application in retrieval
-
Karen Sparck Jones, "A statistical interpretation of term specificity and its application in retrieval," Journal of Documentation, 1972.
-
(1972)
Journal of Documentation
-
-
Jones, K.S.1
-
18
-
-
84937454189
-
Extracting deep bottleneck features using stacked auto-encoders
-
Jonas Gehring, Yajie Miao, Florian Metze, and Alex Waibel, "Extracting deep bottleneck features using stacked auto-encoders," In Proc. ICASSP [28].
-
Proc. ICASSP [28]
-
-
Gehring, J.1
Miao, Y.2
Metze, F.3
Waibel, A.4
-
19
-
-
84890499569
-
Unsupervised hierarchical structure induction for deeper semantic analysis of audio
-
Sourish Chaudhuri and Bhiksha Raj, "Unsupervised hierarchical structure induction for deeper semantic analysis of audio," In Proc. ICASSP [28], pp. 833-837.
-
Proc. ICASSP [28]
, pp. 833-837
-
-
Chaudhuri, S.1
Raj, B.2
-
20
-
-
51449103447
-
Optimizing bottleneck features for LVCSR
-
Las Vegas, NV; U.S.A., Apr., IEEE
-
Frantisek Grézl and Petr Fousek, "Optimizing bottleneck features for LVCSR," in Proc. ICASSP, Las Vegas, NV; U.S.A., Apr. 2008, IEEE.
-
(2008)
Proc. ICASSP
-
-
Grézl, F.1
Fousek, P.2
-
21
-
-
84955035459
-
A scale for the measurement of the psychological magnitude pitch
-
Stanley S. Stevens, John Volkman, and Edwin B. Newman, "A scale for the measurement of the psychological magnitude pitch," The Journal of the Acoustical Society of America, vol. 8, no. 3, pp. 185-190, 1937.
-
(1937)
The Journal of the Acoustical Society of America
, vol.8
, Issue.3
, pp. 185-190
-
-
Stevens, S.S.1
Volkman, J.2
Newman, E.B.3
-
22
-
-
84937454189
-
Extracting deep bottleneck features using stacked auto-encoders
-
Jonas Gehring, Yajie Miao, Florian Metze, and Alex Waibel, "Extracting Deep Bottleneck Features Using Stacked Auto-Encoders," In Proc. ICASSP [28].
-
Proc. ICASSP [28]
-
-
Gehring, J.1
Miao, Y.2
Metze, F.3
Waibel, A.4
-
23
-
-
84857819132
-
Theano: A CPU and GPU math expression compiler
-
Oral Presentation
-
James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio, "Theano: a CPU and GPU math expression compiler," in Proceedings of the Python for Scientific Computing Conference (SciPy), June 2010, Oral Presentation.
-
(2010)
Proceedings of the Python for Scientific Computing Conference (SciPy), June
-
-
Bergstra, J.1
Breuleux, O.2
Bastien, F.3
Lamblin, P.4
Pascanu, R.5
Desjardins, G.6
Turian, J.7
Warde-Farley, D.8
Bengio, Y.9
-
25
-
-
78650977476
-
Opensmile: The Munich versatile and fast open-source audio feature extractor
-
New York, NY; USA, MM '10, ACM
-
Florian Eyben, Martin Wöllmer, and Björn Schuller, "Opensmile: the Munich versatile and fast open-source audio feature extractor," in Proceedings of the International Conference on Multimedia, New York, NY; USA, 2010, MM '10, pp. 1459-1462, ACM.
-
(2010)
Proceedings of the International Conference on Multimedia
, pp. 1459-1462
-
-
Eyben, F.1
Wöllmer, M.2
Schuller, B.3
-
26
-
-
84890530296
-
Subband autocorrelation features for video soundtrack classification
-
Courtenay V. Cotton and Dan P.W. Ellis, "Subband autocorrelation features for video soundtrack classification," In Proc. ICASSP [28], pp. 8663-8666.
-
Proc. ICASSP [28]
, pp. 8663-8666
-
-
Cotton, C.V.1
Ellis, D.P.W.2