Volume 34, Issue 6, 2017, Pages 96-108

Deep multimodal learning: A survey on recent advances and trends

Author keywords

[No Author keywords available]

Indexed keywords

DEEP LEARNING;

EID: 85040306665     PISSN: 10535888     EISSN: None     Source Type: Journal    
DOI: 10.1109/MSP.2017.2738401     Document Type: Article
Times cited: 788

References (103)
  • 1
    • 84930630277
    • Deep learning
    • Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
    • (2015) Nature , vol.521 , Issue.7553 , pp. 436-444
    • LeCun, Y.1    Bengio, Y.2    Hinton, G.3
  • 2
    • 84940426185
    • Multimodal data fusion: An overview of methods, challenges, and prospects
    • D. Lahat, T. Adali, and C. Jutten, "Multimodal data fusion: An overview of methods, challenges, and prospects," Proc. IEEE, vol. 103, no. 9, pp. 1449-1477, 2015.
    • (2015) Proc. IEEE , vol.103 , Issue.9 , pp. 1449-1477
    • Lahat, D.1    Adali, T.2    Jutten, C.3
  • 4
    • 84867336190
    • Multisensor data fusion: A review of the state-of-the-art
    • B. Khaleghi, A. Khamis, F. O. Karray, and S. N. Razavi, "Multisensor data fusion: A review of the state-of-the-art," Inform. Fusion, vol. 14, no. 1, pp. 28-44, 2013.
    • (2013) Inform. Fusion , vol.14 , Issue.1 , pp. 28-44
    • Khaleghi, B.1    Khamis, A.2    Karray, F.O.3    Razavi, S.N.4
  • 9
    • 84946574616
    • Deep head pose: Gaze-direction estimation in multimodal video
    • S. S. Mukherjee and N. M. Robertson, "Deep head pose: Gaze-direction estimation in multimodal video," IEEE Trans. Multimedia, vol. 17, no. 11, pp. 2094-2107, 2015.
    • (2015) IEEE Trans. Multimedia , vol.17 , Issue.11 , pp. 2094-2107
    • Mukherjee, S.S.1    Robertson, N.M.2
  • 10
    • 84946716984
    • Robust face recognition via multimodal deep face representation
    • C. Ding and D. Tao, "Robust face recognition via multimodal deep face representation," IEEE Trans. Multimedia, vol. 17, no. 11, pp. 2049-2058, 2015.
    • (2015) IEEE Trans. Multimedia , vol.17 , Issue.11 , pp. 2049-2058
    • Ding, C.1    Tao, D.2
  • 14
    • 84945972584
    • Modeep: A deep learning framework using motion features for human pose estimation
    • A. Jain, J. Tompson, Y. LeCun, and C. Bregler, "Modeep: A deep learning framework using motion features for human pose estimation," in Proc. Asian Conf. Computer Vision, 2014, pp. 302-315.
    • (2014) Proc. Asian Conf. Computer Vision , pp. 302-315
    • Jain, A.1    Tompson, J.2    LeCun, Y.3    Bregler, C.4
  • 16
    • 85021145223
    • Deep learning in medical image analysis
    • D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annu. Review Biomedical Eng., vol. 19, pp. 221-248, 2017.
    • (2017) Annu. Review Biomedical Eng. , vol.19 , pp. 221-248
    • Shen, D.1    Wu, G.2    Suk, H.-I.3
  • 20
    • 84925851214
    • Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer's disease
    • S. Liu, S. Liu, W. Cai, H. Che, S. Pujol, R. Kikinis, D. Feng, M. J. Fulham, et al., "Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer's disease," IEEE Trans. Biomed. Eng., vol. 62, no. 4, pp. 1132-1140, 2015.
    • (2015) IEEE Trans. Biomed. Eng. , vol.62 , Issue.4 , pp. 1132-1140
    • Liu, S.1    Liu, S.2    Cai, W.3    Che, H.4    Pujol, S.5    Kikinis, R.6    Feng, D.7    Fulham, M.J.8
  • 24
    • 84993965279
    • Interpretable deep neural networks for single-trial EEG classification
    • Dec.
    • I. Sturm, S. Lapuschkin, W. Samek, and K.-R. Müller, "Interpretable deep neural networks for single-trial EEG classification," J. Neuroscience Methods, vol. 274, pp. 141-145, Dec. 2016.
    • (2016) J. Neuroscience Methods , vol.274 , pp. 141-145
    • Sturm, I.1    Lapuschkin, S.2    Samek, W.3    Müller, K.-R.4
  • 25
    • 84979675872
    • Opening up the blackbox: An interpretable deep neural network-based classifier for cell-type specific enhancer predictions
    • S. G. Kim, N. Theera-Ampornpunt, C.-H. Fang, M. Harwani, A. Grama, and S. Chaterji, "Opening up the blackbox: An interpretable deep neural network-based classifier for cell-type specific enhancer predictions," BMC Syst. Biology, vol. 10, no. 2, p. 54, 2016.
    • (2016) BMC Syst. Biology , vol.10 , Issue.2 , pp. 54
    • Kim, S.G.1    Theera-Ampornpunt, N.2    Fang, C.-H.3    Harwani, M.4    Grama, A.5    Chaterji, S.6
  • 26
    • 84973888858
    • Deepdriving: Learning affordance for direct perception in autonomous driving
    • C. Chen, A. Seff, A. Kornhauser, and J. Xiao, "Deepdriving: Learning affordance for direct perception in autonomous driving," in Proc. IEEE Int. Conf. Computer Vision, 2015, pp. 2722-2730.
    • (2015) Proc. IEEE Int. Conf. Computer Vision , pp. 2722-2730
    • Chen, C.1    Seff, A.2    Kornhauser, A.3    Xiao, J.4
  • 28
    • 84928013181
    • Deep learning for detecting robotic grasps
    • I. Lenz, H. Lee, and A. Saxena, "Deep learning for detecting robotic grasps," Int. J. Robotics Res., vol. 34, no. 4-5, pp. 705-724, 2015.
    • (2015) Int. J. Robotics Res. , vol.34 , Issue.4-5 , pp. 705-724
    • Lenz, I.1    Lee, H.2    Saxena, A.3
  • 32
    • 84939178155
    • Integrative data analysis of multi-platform cancer data with a multimodal deep learning approach
    • M. Liang, Z. Li, T. Chen, and J. Zeng, "Integrative data analysis of multi-platform cancer data with a multimodal deep learning approach," IEEE/ACM Trans. Comput. Biol. Bioinf., vol. 12, no. 4, pp. 928-937, 2015.
    • (2015) IEEE/ACM Trans. Comput. Biol. Bioinf. , vol.12 , Issue.4 , pp. 928-937
    • Liang, M.1    Li, Z.2    Chen, T.3    Zeng, J.4
  • 34
    • 84943617823
    • Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis
    • S. Poria, E. Cambria, and A. Gelbukh, "Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis," in Proc. Conf. Empirical Methods on Natural Language Processing, 2015, pp. 2539-2544.
    • (2015) Proc. Conf. Empirical Methods on Natural Language Processing , pp. 2539-2544
    • Poria, S.1    Cambria, E.2    Gelbukh, A.3
  • 42
    • 84955143444
    • Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition
    • F. J. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors, vol. 16, no. 1, p. 115, 2016.
    • (2016) Sensors , vol.16 , Issue.1 , pp. 115
    • Ordóñez, F.J.1    Roggen, D.2
  • 44
    • 56449110012
    • Classification using discriminative restricted Boltzmann machines
    • H. Larochelle and Y. Bengio, "Classification using discriminative restricted Boltzmann machines," in Proc. 25th Int. Conf. Machine Learning, 2008, pp. 536-543.
    • (2008) Proc. 25th Int. Conf. Machine Learning , pp. 536-543
    • Larochelle, H.1    Bengio, Y.2
  • 45
    • 84946734767
    • Unconstrained multimodal multi-label learning
    • Y. Huang, W. Wang, and L. Wang, "Unconstrained multimodal multi-label learning," IEEE Trans. Multimedia, vol. 17, no. 11, pp. 1923-1935, 2015.
    • (2015) IEEE Trans. Multimedia , vol.17 , Issue.11 , pp. 1923-1935
    • Huang, Y.1    Wang, W.2    Wang, L.3
  • 51
    • 84956802323
    • A tutorial survey of architectures, algorithms, and applications for deep learning
    • L. Deng, "A tutorial survey of architectures, algorithms, and applications for deep learning," APSIPA Trans. Signal and Inform. Processing, vol. 3, pp. 1-29, 2014.
    • (2014) APSIPA Trans. Signal and Inform. Processing , vol.3 , pp. 1-29
    • Deng, L.1
  • 55
    • 84923050897
    • Multimodal video classification with stacked contractive autoencoders
    • Mar.
    • Y. Liu, X. Feng, and Z. Zhou, "Multimodal video classification with stacked contractive autoencoders," Signal Processing, vol. 120, pp. 761-766, Mar. 2016.
    • (2016) Signal Processing , vol.120 , pp. 761-766
    • Liu, Y.1    Feng, X.2    Zhou, Z.3
  • 58
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.E.1    Salakhutdinov, R.R.2
  • 59
    • 84939232131
    • Learning compact hash codes for multimodal representations using orthogonal deep structure
    • D. Wang, P. Cui, M. Ou, and W. Zhu, "Learning compact hash codes for multimodal representations using orthogonal deep structure," IEEE Trans. Multimedia, vol. 17, no. 9, pp. 1404-1416, 2015.
    • (2015) IEEE Trans. Multimedia , vol.17 , Issue.9 , pp. 1404-1416
    • Wang, D.1    Cui, P.2    Ou, M.3    Zhu, W.4
  • 65
    • 84906567304
    • Structured regularizers for high-dimensional problems: Statistical and computational issues
    • Apr.
    • M. J. Wainwright, "Structured regularizers for high-dimensional problems: Statistical and computational issues," Annu. Rev. Statistics Application, vol. 1, pp. 233-253, Apr. 2014.
    • (2014) Annu. Rev. Statistics Application , vol.1 , pp. 233-253
    • Wainwright, M.J.1
  • 68
    • 84930703517
    • Multimodal deep network learning-based image annotation
    • S. Zhu, X. Li, and S. Shen, "Multimodal deep network learning-based image annotation," IET Electron. Lett., vol. 51, no. 12, pp. 905-906, 2015.
    • (2015) IET Electron. Lett. , vol.51 , Issue.12 , pp. 905-906
    • Zhu, S.1    Li, X.2    Shen, S.3
  • 69
    • 84876033996
    • Structured feature selection and task relationship inference for multi-task learning
    • H. Fei and J. Huan, "Structured feature selection and task relationship inference for multi-task learning," Knowledge and Inform. Syst., vol. 35, no. 2, pp. 345-364, 2013.
    • (2013) Knowledge and Inform. Syst. , vol.35 , Issue.2 , pp. 345-364
    • Fei, H.1    Huan, J.2
  • 70
    • 84913586072
    • Exploring inter-feature and interclass relationships with deep neural networks for video classification
    • Z. Wu, Y.-G. Jiang, J. Wang, J. Pu, and X. Xue, "Exploring inter-feature and interclass relationships with deep neural networks for video classification," in Proc. ACM Int. Conf. Multimedia, 2014, pp. 167-176.
    • (2014) Proc. ACM Int. Conf. Multimedia , pp. 167-176
    • Wu, Z.1    Jiang, Y.-G.2    Wang, J.3    Pu, J.4    Xue, X.5
  • 71
    • 84946215753
    • Large-margin multi-modal deep learning for RGB-D object recognition
    • Nov.
    • A. Wang, J. Lu, J. Cai, T. J. Cham, and G. Wang, "Large-margin multi-modal deep learning for RGB-D object recognition," IEEE Trans. Multimedia, vol. 17, no. 11, pp. 1887-1898, Nov. 2015.
    • (2015) IEEE Trans. Multimedia , vol.17 , Issue.11 , pp. 1887-1898
    • Wang, A.1    Lu, J.2    Cai, J.3    Cham, T.J.4    Wang, G.5
  • 72
    • 84973866198
    • MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition
    • A. Wang, J. Cai, J. Lu, and T.-J. Cham, "MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition," in Proc. IEEE Int. Conf. Computer Vision, 2015, pp. 1125-1133.
    • (2015) Proc. IEEE Int. Conf. Computer Vision , pp. 1125-1133
    • Wang, A.1    Cai, J.2    Lu, J.3    Cham, T.-J.4
  • 73
    • 0027636611
    • Learning and development in neural networks: The importance of starting small
    • J. Elman, "Learning and development in neural networks: The importance of starting small," Cognition, vol. 48, no. 1, pp. 71-99, 1993.
    • (1993) Cognition , vol.48 , Issue.1 , pp. 71-99
    • Elman, J.1
  • 76
    • 0027662338
    • Pruning algorithms-A survey
    • R. Reed, "Pruning algorithms-A survey," IEEE Trans. Neural Netw., vol. 4, no. 5, pp. 740-747, 1993.
    • (1993) IEEE Trans. Neural Netw. , vol.4 , Issue.5 , pp. 740-747
    • Reed, R.1
  • 77
    • 84973901098
    • Learning the structure of deep convolutional networks
    • J. Feng and T. Darrell, "Learning the structure of deep convolutional networks," in Proc. Int. Conf. Computer Vision, 2015, pp. 2749-2757.
    • (2015) Proc. Int. Conf. Computer Vision , pp. 2749-2757
    • Feng, J.1    Darrell, T.2
  • 79
    • 0025477595
    • Genetic algorithms and neural networks: Optimizing connections and connectivity
    • D. Whitley, T. Starkweather, and C. Bogart, "Genetic algorithms and neural networks: Optimizing connections and connectivity," Parallel Comput., vol. 14, no. 3, pp. 347-361, 1990.
    • (1990) Parallel Comput. , vol.14 , Issue.3 , pp. 347-361
    • Whitley, D.1    Starkweather, T.2    Bogart, C.3
  • 82
    • 84949985138
    • Taking the human out of the loop: A review of Bayesian optimization
    • B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas, "Taking the human out of the loop: A review of Bayesian optimization," Proc. IEEE, vol. 104, no. 1, pp. 148-175, 2016.
    • (2016) Proc. IEEE , vol.104 , Issue.1 , pp. 148-175
    • Shahriari, B.1    Swersky, K.2    Wang, Z.3    Adams, R.P.4    De Freitas, N.5
  • 91
    • 84884231503
    • Vision meets robotics: The Kitti data set
    • A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The Kitti data set," Int. J. Robotics Res., vol. 32, no. 11, pp. 1231-1237, 2013.
    • (2013) Int. J. Robotics Res. , vol.32 , Issue.11 , pp. 1231-1237
    • Geiger, A.1    Lenz, P.2    Stiller, C.3    Urtasun, R.4
  • 92
  • 93
    • 84956626439
    • UTD-MHAD: A multimodal data set for human action recognition utilizing a depth camera and a wearable inertial sensor
    • C. Chen, R. Jafari, and N. Kehtarnavaz, "UTD-MHAD: A multimodal data set for human action recognition utilizing a depth camera and a wearable inertial sensor," in Proc. 2015 IEEE Int. Conf. Image Processing (ICIP), 2015, pp. 168-172.
    • (2015) Proc. 2015 IEEE Int. Conf. Image Processing (ICIP) , pp. 168-172
    • Chen, C.1    Jafari, R.2    Kehtarnavaz, N.3
  • 98
    • 84952649086
    • Design, implementation and validation of a novel open framework for agile development of mobile health applications
    • O. Banos, C. Villalonga, R. Garcia, A. Saez, M. Damas, J. A. Holgado-Terriza, S. Lee, H. Pomares, and I. Rojas, "Design, implementation and validation of a novel open framework for agile development of mobile health applications," Biomedical Eng. Online, vol. 14, no. 2, p. S6, 2015. [Online]. Available: https://doi.org/10.1186/1475-925X-14-S2-S6
    • (2015) Biomedical Eng. Online , vol.14 , Issue.2 , pp. S6
    • Banos, O.1    Villalonga, C.2    Garcia, R.3    Saez, A.4    Damas, M.5    Holgado-Terriza, J.A.6    Lee, S.7    Pomares, H.8    Rojas, I.9
  • 101
    • 85040639683
    • Exploiting feature and class relationships in video categorization with regularized deep neural networks
    • Y.-G. Jiang, Z. Wu, J. Wang, X. Xue, and S.-F. Chang, "Exploiting feature and class relationships in video categorization with regularized deep neural networks," IEEE Trans. Pattern Anal. Mach. Intell., 2017. doi: https://doi.org/10.1109/TPAMI.2017.2670560
    • (2017) IEEE Trans. Pattern Anal. Mach. Intell.
    • Jiang, Y.-G.1    Wu, Z.2    Wang, J.3    Xue, X.4    Chang, S.-F.5
  • 102
    • 84908121477
    • KinectFaceDB: A kinect database for face recognition
    • Nov.
    • R. Min, N. Kose, and J.-L. Dugelay, "KinectFaceDB: A kinect database for face recognition," IEEE Trans. Syst., Man, Cybern., Syst., vol. 44, no. 11, pp. 1534-1548, Nov. 2014.
    • (2014) IEEE Trans. Syst., Man, Cybern., Syst. , vol.44 , Issue.11 , pp. 1534-1548
    • Min, R.1    Kose, N.2    Dugelay, J.-L.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.