Volume 82, Issue 3, 2011, Pages 399-443

Knows what it knows: A framework for self-aware learning

Author keywords

Active learning; Computational learning theory; Exploration; Knows What It Knows (KWIK); Mistake bound; Probably Approximately Correct (PAC); Reinforcement learning

Indexed keywords

ACTIVE LEARNING; COMPUTATIONAL LEARNING THEORY; KNOWS WHAT IT KNOWS (KWIK); MISTAKE BOUNDS; PROBABLY APPROXIMATELY CORRECT

EID: 79958797519     PISSN: 08856125     EISSN: 15730565     Source Type: Journal    
DOI: 10.1007/s10994-010-5225-4     Document Type: Article
Times cited : (117)

References (61)
  • 1
    • 31844444663
    • Exploration and apprenticeship learning in reinforcement learning
    • DOI 10.1145/1102351.1102352, ICML 2005 - Proceedings of the 22nd International Conference on Machine Learning
    • Abbeel, P., & Ng, A. Y. (2005). Exploration and apprenticeship learning in reinforcement learning. In Proceedings of the twenty-second international conference on machine learning (pp. 1-8). (Pubitemid 43183309)
    • (2005) ICML 2005 - Proceedings of the 22nd International Conference on Machine Learning , pp. 1-8
    • Abbeel, P.1    Ng, A.Y.2
  • 2
    • 0000710299
    • Queries and concept learning
    • Angluin, D. (1988). Queries and concept learning. Machine Learning, 2, 319-342.
    • (1988) Machine Learning , vol.2 , pp. 319-342
    • Angluin, D.1
  • 4
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Auer, P. (2002). Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3, 397-422.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
    • Auer, P.1
  • 5
    • 1942450194
    • (Technical Report CMU-RI-TR-01-25). Robotics Institute, Carnegie Mellon University, Pittsburgh, PA
    • Bagnell, J., Ng, A. Y., & Schneider, J. (2001). Solving uncertain Markov decision problems (Technical Report CMU-RI-TR-01-25). Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.
    • (2001) Solving Uncertain Markov Decision Problems
    • Bagnell, J.1    Ng, A.Y.2    Schneider, J.3
  • 8
    • 0028517062
    • Separating distribution-free and mistake-bound learning models over the Boolean domain
    • Blum, A. (1994). Separating distribution-free and mistake-bound learning models over the Boolean domain. SIAM Journal on Computing, 23, 990-1000.
    • (1994) SIAM Journal on Computing , vol.23 , pp. 990-1000
    • Blum, A.1
  • 9
    • 0346942368
    • Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
    • Boutilier, C., Dean, T., & Hanks, S. (1999). Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11, 1-94. (Pubitemid 129628760)
    • (1999) Journal of Artificial Intelligence Research , vol.11 , pp. 1-94
    • Boutilier, C.1    Dean, T.2    Hanks, S.3
  • 10
    • 0041965975
    • R-MAX-A general polynomial time algorithm for near-optimal reinforcement learning
    • Brafman, R. I., & Tennenholtz, M. (2002). R-MAX-a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
    • Brafman, R.I.1    Tennenholtz, M.2
  • 13
    • 20544462399
    • Minimizing regret with label efficient prediction
    • DOI 10.1109/TIT.2005.847729
    • Cesa-Bianchi, N., Lugosi, G., & Stoltz, G. (2005). Minimizing regret with label efficient prediction. IEEE Transactions on Information Theory, 51, 2152-2162. (Pubitemid 40843632)
    • (2005) IEEE Transactions on Information Theory , vol.51 , Issue.6 , pp. 2152-2162
    • Cesa-Bianchi, N.1    Lugosi, G.2    Stoltz, G.3
  • 14
    • 33745738567
    • Worst-case analysis of selective sampling for linear classification
    • Cesa-Bianchi, N., Gentile, C., & Zaniboni, L. (2006). Worst-case analysis of selective sampling for linear classification. Journal of Machine Learning Research, 7, 1205-1230. (Pubitemid 44015299)
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1205-1230
    • Cesa-Bianchi, N.1    Gentile, C.2    Zaniboni, L.3
  • 16
  • 17
    • 0028424239
    • Improving generalization with active learning
    • Cohn, D. A., Atlas, L., & Ladner, R. E. (1994). Improving generalization with active learning. Machine Learning, 15, 201-221.
    • (1994) Machine Learning , vol.15 , pp. 201-221
    • Cohn, D.A.1    Atlas, L.2    Ladner, R.E.3
  • 18
    • 84990553353
    • A model for reasoning about persistence and causation
    • Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5, 142-150.
    • (1989) Computational Intelligence , vol.5 , pp. 142-150
    • Dean, T.1    Kanazawa, K.2
  • 21
    • 2542446495
    • Master's thesis, Department of Computer Science, University of Waterloo, Ontario, Canada
    • Fong, P. W. L. (1995b). A quantitative study of hypothesis selection. Master's thesis, Department of Computer Science, University of Waterloo, Ontario, Canada.
    • (1995) A Quantitative Study of Hypothesis Selection
    • Fong, P.W.L.1
  • 23
    • 0031209604
    • Selective Sampling Using the Query by Committee Algorithm
    • Freund, Y., Seung, H. S., Shamir, E., & Tishby, N. (1997b). Selective sampling using the query by committee algorithm. Machine Learning, 28, 133-168. (Pubitemid 127506338)
    • (1997) Machine Learning , vol.28 , Issue.2-3 , pp. 133-168
    • Freund, Y.1    Seung, H.S.2    Shamir, E.3    Tishby, N.4
  • 24
    • 24344500472
    • Generalization bounds for averaged classifiers
    • DOI 10.1214/009053604000000058
    • Freund, Y., Mansour, Y., & Schapire, R. E. (2004). Generalization bounds for averaged classifiers. The Annals of Statistics, 32, 1698-1722. (Pubitemid 41250282)
    • (2004) Annals of Statistics , vol.32 , Issue.4 , pp. 1698-1722
    • Freund, Y.1    Mansour, Y.2    Schapire, R.E.3
  • 25
    • 0004236492
    • (2nd ed.). Baltimore: The Johns Hopkins University Press
    • Golub, G. H., & Van Loan, C. F. (1989). Matrix computations (2nd ed.). Baltimore: The Johns Hopkins University Press.
    • (1989) Matrix Computations
    • Golub, G.H.1    Van Loan, C.F.2
  • 27
    • 84947403595
    • Probability inequalities for sums of bounded random variables
    • Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58, 13-30.
    • (1963) Journal of the American Statistical Association , vol.58 , pp. 13-30
    • Hoeffding, W.1
  • 28
    • 23244466805
    • Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London
    • Kakade, S. M. (2003). On the sample complexity of reinforcement learning. Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London.
    • (2003) On the Sample Complexity of Reinforcement Learning
    • Kakade, S.M.1
  • 31
    • 0028460231
    • Efficient distribution-free learning of probabilistic concepts
    • DOI 10.1016/S0022-0000(05)80062-5
    • Kearns, M. J., & Schapire, R. E. (1994). Efficient distribution-free learning of probabilistic concepts. Journal of Computer and System Sciences, 48, 464-497. (Pubitemid 124013300)
    • (1994) Journal of Computer and System Sciences , vol.48 , Issue.3 , pp. 464-497
    • Kearns, M.J.1    Schapire, R.E.2
  • 32
    • 0036832954
    • Near-optimal reinforcement learning in polynomial time
    • Kearns, M. J., & Singh, S. P. (2002). Near-optimal reinforcement learning in polynomial time. Machine Learning, 49, 209-232.
    • (2002) Machine Learning , vol.49 , pp. 209-232
    • Kearns, M.J.1    Singh, S.P.2
  • 34
    • 0036832951
    • A sparse sampling algorithm for near-optimal planning in large Markov decision processes
    • DOI 10.1023/A:1017932429737
    • Kearns, M. J., Mansour, Y., & Ng, A. Y. (2002). A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Machine Learning, 49, 193-208. (Pubitemid 34325686)
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 193-208
    • Kearns, M.1    Mansour, Y.2    Ng, A.Y.3
  • 37
    • 0037400054
    • An empirical study of two approaches to sequence learning for anomaly detection
    • Lane, T., & Brodley, C. E. (2003). An empirical study of two approaches to sequence learning for anomaly detection. Machine Learning, 51, 73-107.
    • (2003) Machine Learning , vol.51 , pp. 73-107
    • Lane, T.1    Brodley, C.E.2
  • 40
    • 78649496546
    • Reducing reinforcement learning to KWIK online regression
    • doi:10.1007/s10472-010-9201-2
    • Li, L., & Littman, M. L. (2010). Reducing reinforcement learning to KWIK online regression. Annals of Mathematics and Artificial Intelligence. doi:10.1007/s10472-010-9201-2.
    • (2010) Annals of Mathematics and Artificial Intelligence
    • Li, L.1    Littman, M.L.2
  • 43
    • 34250091945
    • Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm
    • Littlestone, N. (1987). Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2, 285-318.
    • (1987) Machine Learning , vol.2 , pp. 285-318
    • Littlestone, N.1
  • 45
    • 0027684215
    • Prioritized sweeping: Reinforcement learning with less data and less real time
    • Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13, 103-130.
    • (1993) Machine Learning , vol.13 , pp. 103-130
    • Moore, A.W.1    Atkeson, C.G.2
  • 49
    • 0028497385
    • An upper bound on the loss from approximate optimal-value functions
    • Singh, S. P., & Yee, R. C. (1994). An upper bound on the loss from approximate optimal-value functions. Machine Learning, 16, 227.
    • (1994) Machine Learning , vol.16 , pp. 227
    • Singh, S.P.1    Yee, R.C.2
  • 59
    • 0021518106
    • A theory of the learnable
    • Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134-1142.
    • (1984) Communications of the ACM , vol.27 , pp. 1134-1142
    • Valiant, L.G.1
  • 60
    • 79958846996
    • Exploring compact reinforcement-learning representations with linear regression
    • A refined version is available as Technical Report DCS-tr-660, Department of Computer Science, Rutgers University, December, 2009
    • Walsh, T. J., Szita, I., Diuk, C., & Littman, M. L. (2009). Exploring compact reinforcement-learning representations with linear regression. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence (UAI-09) (pp. 591-598). A refined version is available as Technical Report DCS-tr-660, Department of Computer Science, Rutgers University, December, 2009.
    • (2009) Proceedings of the Twenty-fifth Conference on Uncertainty in Artificial Intelligence (UAI-09) , pp. 591-598
    • Walsh, T.J.1    Szita, I.2    Diuk, C.3    Littman, M.L.4
  • 61
    • 49549125826
    • Maximizing classifier utility when training data is costly
    • Weiss, G. M., & Tian, Y. (2006). Maximizing classifier utility when training data is costly. SIGKDD Explorations, 8, 31-38.
    • (2006) SIGKDD Explorations , vol.8 , pp. 31-38
    • Weiss, G.M.1    Tian, Y.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.