SCOPUS Information Search Platform

Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017

Volume 2017-January, 2017, Pages 5729-5738

Self-supervised video representation learning with odd-one-out networks

(4) Fernando, Basura (a); Bilen, Hakan (b); Gavves, Efstratios (c); Gould, Stephen (a)

a AUSTRALIAN NATIONAL UNIVERSITY (Australia)

b UNIVERSITY OF OXFORD (United Kingdom)

c UNIVERSITY OF AMSTERDAM (Netherlands)

Author keywords

[No Author keywords available]

Indexed keywords

CLASSIFICATION (OF INFORMATION); LEARNING SYSTEMS; NEURAL NETWORKS; VIDEO RECORDING;

ACTION CLASSIFICATIONS; ACTION RECOGNITION; CONVOLUTIONAL NEURAL NETWORK; MANUAL ANNOTATION; STATE-OF-THE-ART METHODS; SUPERVISED LEARNING METHODS; TEMPORAL REPRESENTATIONS; VIDEO REPRESENTATIONS;

COMPUTER VISION;

EID: 85041924012 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2017.607 Document Type: Conference Paper

Times cited : (400)

References (46)

1
- 85007158864
- arXiv preprint 1
- S. Abu-El-Haija, N. Kothari, J. Lee, P. Natsev, G. Toderici, B. Varadarajan, and S. Vijayanarasimhan. YouTube-8M: A large-scale video classification benchmark. arXiv preprint arXiv:1609.08675, 2016. 1
- (2016) YouTube-8M: A Large-Scale Video Classification Benchmark
- Abu-El-Haija, S.¹ Kothari, N.² Lee, J.³ Natsev, P.⁴ Toderici, G.⁵ Varadarajan, B.⁶ Vijayanarasimhan, S.⁷

2
- 84973926501
- Learning to see by moving
- In 1, 2
- P. Agrawal, J. Carreira, and J. Malik. Learning to see by moving. In ICCV, 2015. 1, 2
- (2015) ICCV
- Agrawal, P.¹ Carreira, J.² Malik, J.³

3
- 85162003508
- Slow, decorrelated features for pretraining complex cell-like networks
- In, 1, 2
- Y. Bengio and J. S. Bergstra. Slow, decorrelated features for pretraining complex cell-like networks. In NIPS, 2009. 1, 2
- (2009) NIPS
- Bengio, Y.¹ Bergstra, J.S.²

4
- 85044505731
- Action recognition with dynamic image networks
- abs/1612.00738, 5
- H. Bilen, B. Fernando, E. Gavves, and A. Vedaldi. Action recognition with dynamic image networks. CoRR, abs/1612.00738, 2016. 5
- (2016) CoRR
- Bilen, H.¹ Fernando, B.² Gavves, E.³ Vedaldi, A.⁴

5
- 84986334053
- Dynamic image networks for action recognition
- In 1, 2, 3, 5, 6, 8
- H. Bilen, B. Fernando, E. Gavves, A. Vedaldi, and S. Gould. Dynamic image networks for action recognition. In CVPR, 2016. 1, 2, 3, 5, 6, 8
- (2016) CVPR
- Bilen, H.¹ Fernando, B.² Gavves, E.³ Vedaldi, A.⁴ Gould, S.⁵

6
- 0024220237
- Auto-association by multilayer perceptrons and singular value decomposition
- 1, 2
- H. Bourlard and Y. Kamp. Auto-association by multilayer perceptrons and singular value decomposition. Biological cybernetics, 59(4-5):291-294, 1988. 1, 2
- (1988) Biological Cybernetics , vol.59 , Issue.4-5 , pp. 291-294
- Bourlard, H.¹ Kamp, Y.²

7
- 85044540872
- Learning transformational invariants from natural movies
- In 2
- C. Cadieu and B. A. Olshausen. Learning transformational invariants from natural movies. In NIPS, 2008. 2
- (2008) NIPS
- Cadieu, C.¹ Olshausen, B.A.²

8
- 85044534882
- Visual permutation learning
- In 1, 2
- R. S. Cruz, B. Fernando, A. Cherian, and S. Gould. Visual permutation learning. In CVPR, 2017. 1, 2
- (2017) CVPR
- Cruz, R.S.¹ Fernando, B.² Cherian, A.³ Gould, S.⁴

9
- 85198028989
- Imagenet: A large-scale hierarchical image database
- In 8
- J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, 2009. 8
- (2009) CVPR
- Deng, J.¹ Dong, W.² Socher, R.³ Li, L.-J.⁴ Li, K.⁵ Fei-Fei, L.⁶

10
- 84973916088
- Unsupervised visual representation learning by context prediction
- In 1, 2
- C. Doersch, A. Gupta, and A. A. Efros. Unsupervised visual representation learning by context prediction. In ICCV, 2015. 1, 2
- (2015) ICCV
- Doersch, C.¹ Gupta, A.² Efros, A.A.³

11
- 84906504048
- Decaf: A deep convolutional activation feature for generic visual recognition
- abs/1310.1531, 6
- J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. CoRR, abs/1310.1531, 2013. 6
- (2013) CoRR
- Donahue, J.¹ Jia, Y.² Vinyals, O.³ Hoffman, J.⁴ Zhang, N.⁵ Tzeng, E.⁶ Darrell, T.⁷

12
- 84937964776
- Discriminative unsupervised feature learning with convolutional neural networks
- In, 1, 2
- A. Dosovitskiy, J. T. Springenberg, M. Riedmiller, and T. Brox. Discriminative unsupervised feature learning with convolutional neural networks. In NIPS, 2014. 1, 2
- (2014) NIPS
- Dosovitskiy, A.¹ Springenberg, J.T.² Riedmiller, M.³ Brox, T.⁴

13
- 84986290213
- Discriminative hierarchical rank pooling for activity recognition
- In 2
- B. Fernando, P. Anderson, M. Hutter, and S. Gould. Discriminative hierarchical rank pooling for activity recognition. In CVPR, 2016. 2
- (2016) CVPR
- Fernando, B.¹ Anderson, P.² Hutter, M.³ Gould, S.⁴

14
- 84959223985
- Modeling video evolution for action recognition
- In 2
- B. Fernando, E. Gavves, J. Oramas, A. Ghodrati, and T. Tuytelaars. Modeling video evolution for action recognition. In CVPR, 2015. 2
- (2015) CVPR
- Fernando, B.¹ Gavves, E.² Oramas, J.³ Ghodrati, A.⁴ Tuytelaars, T.⁵

15
- 84986281831
- Rank pooling for action recognition
- 2, 3, 5
- B. Fernando, E. Gavves, J. Oramas, A. Ghodrati, and T. Tuytelaars. Rank pooling for action recognition. TPAMI, PP(99):1-1, 2016. 2, 3, 5
- (2016) TPAMI , Issue.99 , pp. 1
- Fernando, B.¹ Gavves, E.² Oramas, J.³ Ghodrati, A.⁴ Tuytelaars, T.⁵

16
- 84998887168
- Learning end-to-end video classification with rank-pooling
- In 3
- B. Fernando and S. Gould. Learning end-to-end video classification with rank-pooling. In ICML, 2016. 3
- (2016) ICML
- Fernando, B.¹ Gould, S.²

17
- 33845594569
- Dimensionality reduction by learning an invariant mapping
- In 8
- R. Hadsell, S. Chopra, and Y. LeCun. Dimensionality reduction by learning an invariant mapping. In CVPR, 2006. 8
- (2006) CVPR
- Hadsell, R.¹ Chopra, S.² LeCun, Y.³

18
- 85013912362
- Going deeper into action recognition: A survey
- 3
- S. Herath, M. Harandi, and F. Porikli. Going deeper into action recognition: A survey. Image and Vision Computing, 60:4-21, 2017. 3
- (2017) Image and Vision Computing , vol.60 , pp. 4-21
- Herath, S.¹ Harandi, M.² Porikli, F.³

19
- 0000999440
- Learning and relearning in boltzmann machines
- 2
- G. E. Hinton and T. J. Sejnowski. Learning and relearning in boltzmann machines. Parallel distributed processing: Explorations in the microstructure of cognition, 1:282-317, 1986. 2
- (1986) Parallel Distributed Processing: Explorations in The Microstructure of Cognition , vol.1 , pp. 282-317
- Hinton, G.E.¹ Sejnowski, T.J.²

20
- 0002834189
- Autoencoders, minimum description length, and helmholtz free energy
- 1, 2
- G. E. Hinton and R. S. Zemel. Autoencoders, minimum description length, and helmholtz free energy. NIPS, 1994. 1, 2
- (1994) NIPS
- Hinton, G.E.¹ Zemel, R.S.²

21
- 84911453664
- Action localization by tubelets from motion
- In 3
- M. Jain, J. C. van Gemert, H. Jégou, P. Bouthemy, and C. G. M. Snoek. Action localization by tubelets from motion. In CVPR, 2014. 3
- (2014) CVPR
- Jain, M.¹ Van Gemert, J.C.² Jégou, H.³ Bouthemy, P.⁴ Snoek, C.G.M.⁵

22
- 84870183903
- 3d convolutional neural networks for human action recognition
- 5
- S. Ji, W. Xu, M. Yang, and K. Yu. 3d convolutional neural networks for human action recognition. PAMI, 35(1):221-231, 2013. 5
- (2013) PAMI , vol.35 , Issue.1 , pp. 221-231
- Ji, S.¹ Xu, W.² Yang, M.³ Yu, K.⁴

23
- 84911364368
- Large-scale video classification with convolutional neural networks
- In 1
- A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In CVPR, 2014. 1
- (2014) CVPR
- Karpathy, A.¹ Toderici, G.² Shetty, S.³ Leung, T.⁴ Sukthankar, R.⁵ Fei-Fei, L.⁶

24
- 84876231242
- ImageNet classification with deep convolutional neural networks
- In 3
- A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012. 3
- (2012) NIPS
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

25
- 84856682691
- Hmdb: A large video database for human motion recognition
- In 6
- H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. Hmdb: a large video database for human motion recognition. In ICCV, 2011. 6
- (2011) ICCV
- Kuehne, H.¹ Jhuang, H.² Garrote, E.³ Poggio, T.⁴ Serre, T.⁵

26
- 85018976356
- Learning local image descriptors with deep siamese and triplet convolutional networks by minimising global loss functions
- 2
- B. G. V. Kumar, G. Carneiro, and I. D. Reid. Learning local image descriptors with deep siamese and triplet convolutional networks by minimising global loss functions. CoRR, 2015. 2
- (2015) CoRR
- Kumar, B.G.V.¹ Carneiro, G.² Reid, I.D.³

27
- 0032203257
- Gradient-based learning applied to document recognition
- 1
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998. 1
- (1998) Proceedings of The IEEE , vol.86 , Issue.11 , pp. 2278-2324
- LeCun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

28
- 84863380535
- Unsupervised feature learning for audio classification using convolutional deep belief networks
- In 2
- H. Lee, P. Pham, Y. Largman, and A. Y. Ng. Unsupervised feature learning for audio classification using convolutional deep belief networks. In NIPS, 2009. 2
- (2009) NIPS
- Lee, H.¹ Pham, P.² Largman, Y.³ Ng, A.Y.⁴

29
- 84986305500
- Vlad3: Encoding dynamics of deep features for action recognition
- In 2
- Y. Li, W. Li, V. Mahadevan, and N. Vasconcelos. Vlad3: Encoding dynamics of deep features for action recognition. In CVPR, 2016. 2
- (2016) CVPR
- Li, Y.¹ Li, W.² Mahadevan, V.³ Vasconcelos, N.⁴

30
- 85006028843
- arXiv preprint 3
- Z. Li, E. Gavves, M. Jain, and C. G. M. Snoek. Videolstm convolves, attends and flows for action recognition. arXiv preprint arXiv:1607.01794, 2016. 3
- (2016) Videolstm Convolves, Attends and Flows for Action Recognition
- Li, Z.¹ Gavves, E.² Jain, M.³ Snoek, C.G.M.⁴

31
- 85015698797
- arXiv preprint 2, 7, 8
- I. Misra, C. L. Zitnick, and M. Hebert. Unsupervised learning using sequential verification for action recognition. arXiv preprint arXiv:1603.08561, 2016. 2, 7, 8
- (2016) Unsupervised Learning Using Sequential Verification for Action Recognition
- Misra, I.¹ Zitnick, C.L.² Hebert, M.³

32
- 71149084945
- Deep learning from temporal coherence in video
- In 1, 2, 8
- H. Mobahi, R. Collobert, and J. Weston. Deep learning from temporal coherence in video. In ICML, 2009. 1, 2, 8
- (2009) ICML
- Mobahi, H.¹ Collobert, R.² Weston, J.³

33
- 84911385613
- Seeing the arrow of time
- In, 1, 2
- L. C. Pickup, Z. Pan, D. Wei, Y. Shih, C. Zhang, A. Zisserman, B. Schölkopf, and W. T. Freeman. Seeing the arrow of time. In CVPR, 2014. 1, 2
- (2014) CVPR
- Pickup, L.C.¹ Pan, Z.² Wei, D.³ Shih, Y.⁴ Zhang, C.⁵ Zisserman, A.⁶ Schölkopf, B.⁷ Freeman, W.T.⁸

34
- 84947041871
- Imagenet large scale visual recognition challenge
- 1
- O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet large scale visual recognition challenge. IJCV, 115(3):211-252, 2015. 1
- (2015) IJCV , vol.115 , Issue.3 , pp. 211-252
- Russakovsky, O.¹ Deng, J.² Su, H.³ Krause, J.⁴ Satheesh, S.⁵ Ma, S.⁶ Huang, Z.⁷ Karpathy, A.⁸ Khosla, A.⁹ Bernstein, M.¹⁰

35
- 84884955228
- arXiv preprint 5
- K. Soomro, A. R. Zamir, and M. Shah. Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402, 2012. 5
- (2012) Ucf101: A Dataset of 101 Human Actions Classes from Videos in The Wild
- Soomro, K.¹ Zamir, A.R.² Shah, M.³

36
- 84969544782
- Unsupervised learning of video representations using lstms
- In 2
- N. Srivastava, E. Mansimov, and R. Salakhutdinov. Unsupervised learning of video representations using lstms. In ICML, 2015. 2
- (2015) ICML
- Srivastava, N.¹ Mansimov, E.² Salakhutdinov, R.³

37
- 84973863239
- Human action recognition using factorized spatio-temporal convolutional networks
- In 2, 3, 5
- L. Sun, K. Jia, D.-Y. Yeung, and B. E. Shi. Human action recognition using factorized spatio-temporal convolutional networks. In ICCV, 2015. 2, 3, 5
- (2015) ICCV
- Sun, L.¹ Jia, K.² Yeung, D.-Y.³ Shi, B.E.⁴

38
- 84928547704
- Sequence to sequence learning with neural networks
- In 5
- I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014. 5
- (2014) NIPS
- Sutskever, I.¹ Vinyals, O.² Le, Q.V.³

39
- 84965114137
- arXiv preprint 3
- D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. arXiv preprint arXiv:1412.0767, 2014. 3
- (2014) Learning Spatiotemporal Features with 3d Convolutional Networks
- Tran, D.¹ Bourdev, L.² Fergus, R.³ Torresani, L.⁴ Paluri, M.⁵

40
- 84962815548
- Matconvnet - Convolutional neural networks for matlab
- In 6
- A. Vedaldi and K. Lenc. Matconvnet - convolutional neural networks for matlab. In Proceeding of the ACM Int. Conf. on Multimedia, 2015. 6
- (2015) Proceeding of The ACM Int. Conf. on Multimedia
- Vedaldi, A.¹ Lenc, K.²

41
- 84876945537
- Dense trajectories and motion boundary descriptors for action recognition
- 3
- H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Dense trajectories and motion boundary descriptors for action recognition. IJCV, 103:60-79, 2013. 3
- (2013) IJCV , vol.103 , pp. 60-79
- Wang, H.¹ Kläser, A.² Schmid, C.³ Liu, C.-L.⁴

42
- 84898805910
- Action recognition with improved trajectories
- In 3
- H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013. 3
- (2013) ICCV
- Wang, H.¹ Schmid, C.²

43
- 85019099168
- Temporal segment networks: Towards good practices for deep action recognition
- In 2, 3, 5, 8
- L. Wang, Y. Xiong, Z. Wang, Y. Qiao, D. Lin, X. Tang, and L. Van Gool. Temporal segment networks: towards good practices for deep action recognition. In ECCV, 2016. 2, 3, 5, 8
- (2016) ECCV
- Wang, L.¹ Xiong, Y.² Wang, Z.³ Qiao, Y.⁴ Lin, D.⁵ Tang, X.⁶ Van Gool, L.⁷

44
- 84973889989
- Unsupervised learning of visual representations using videos
- In 1, 2, 8
- X. Wang and A. Gupta. Unsupervised learning of visual representations using videos. In ICCV, 2015. 1, 2, 8
- (2015) ICCV
- Wang, X.¹ Gupta, A.²

45
- 0036546660
- Slow feature analysis: Unsupervised learning of invariances
- 1, 2
- L. Wiskott and T. J. Sejnowski. Slow feature analysis: Unsupervised learning of invariances. Neural computation, 14(4):715-770, 2002. 1, 2
- (2002) Neural Computation , vol.14 , Issue.4 , pp. 715-770
- Wiskott, L.¹ Sejnowski, T.J.²

46
- 84973898486
- Exploiting image-trained CNN architectures for unconstrained video classification
- In 3
- S. Zha, F. Luisier, W. Andrews, N. Srivastava, and R. Salakhutdinov. Exploiting image-trained CNN architectures for unconstrained video classification. In BMVC, 2015. 3
- (2015) BMVC
- Zha, S.¹ Luisier, F.² Andrews, W.³ Srivastava, N.⁴ Salakhutdinov, R.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.