Akaike, H. (1973) ‘Information theory and an extension of the maximum likelihood principle’, in Petrov, B.N. and Csáki, F. (Eds.): 2nd International Symposium on Information Theory, Akadémiai Kiadó, Budapest, pp.267–281.
Andersson, A., Davidsson, P. and Lindén, J. (1999) ‘Measure-based classifier performance evaluation’, Pattern Recognition Letters, Vol. 20, No. 11–13, pp.1165–1173.
Bailey, T.L. and Elkan, C. (1993) ‘Estimating the accuracy of learned concepts’, 13th International Joint Conference on Artificial Intelligence, Chambéry, France, pp.895–901.
Burges, C.J.C. (1998) ‘A tutorial on support vector machines for pattern recognition’, Data Mining and Knowledge Discovery, Vol. 2, No. 2, pp.121–167.
Caruana, R. and Niculescu-Mizil, A. (2004) ‘Data mining in metric space: an empirical analysis of supervised learning performance criteria’, 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, pp.69–78.
Dietterich, T.G. (1998) ‘Approximate statistical tests for comparing supervised classification learning algorithms’, Neural Computation, Vol. 10, No. 7, pp.1895–1923.
Domingos, P. (1999) ‘The role of Occam’s razor in knowledge discovery’, Data Mining and Knowledge Discovery, Vol. 3, No. 4, pp.409–425.
Duda, R.O., Hart, P.E. and Stork, D.G. (2000) Pattern Classification, 2nd ed., John Wiley & Sons, USA.
Efron, B. (1983) ‘Estimating the error rate of a prediction rule: improvement on cross-validation’, Journal of the American Statistical Association, Vol. 78, No. 382, pp.316–330.
Efron, B. and Tibshirani, R.J. (1993) ‘An introduction to the bootstrap’, Monographs on Statistics and Applied Probability, Chapman & Hall, New York City, New York, USA.
Egan, J.P. (1975) ‘Signal detection theory and ROC analysis’, Cognition and Perception, Academic Press, London.
Giraud-Carrier, C., Vilalta, R. and Brazdil, P. (2004) ‘Introduction to the special issue on meta-learning’, Machine Learning, Vol. 54, No. 3, pp.187–193.
Gordon, D.F. and Desjardins, M. (1995) ‘Evaluation and selection of biases in machine learning’, Machine Learning, Vol. 20, pp.5–22.
Grünwald, P. (2005) ‘Minimum description length tutorial’, in Grünwald, P.I., Myung, J. and Pitt, M. (Eds.): Advances in Minimum Description Length – Theory and Applications, MIT Press, Cambridge, Massachusetts, USA, pp.23–81.
Hand, D.J. and Till, R.J. (2001) ‘A simple generalisation of the area under the ROC curve for multiple class classification problems’, Machine Learning, Vol. 45, pp.171–186.
Jain, A.K., Dubes, R.C. and Chen, C-C. (1987) ‘Bootstrap techniques for error estimation’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 9, No. 5, pp.628–633.
King, R.D., Feng, C. and Sutherland, A. (1995) ‘STATLOG: comparison of classification algorithms on large real-world problems’, Applied Artificial Intelligence, Vol. 9, No. 3, pp.259–287.
Kohavi, R. (1995) ‘A study of cross-validation and bootstrap for accuracy estimation and model selection’, 14th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, Montréal, Québec, Canada, pp.1137–1145.
Kolmogorov, A.N. (1965) ‘Three approaches to the quantitative definition of information’, Problems of Information Transmission, Vol. 1, No. 1, pp.1–7.
Lavesson, N. and Davidsson, P. (2006) ‘Quantifying the impact of learning algorithm parameter tuning’, National Conference on Artificial Intelligence, AAAI Press, Boston, Massachusetts, USA, pp.395–400.
Mitchell, T.M. (1980) The Need for Biases in Learning Generalizations, Technical Report CBM-TR-117, Rutgers University, New Brunswick, New Jersey, USA.
Mitchell, T.M. (1997) ‘Machine learning’, Computer Science Series, McGraw-Hill, Singapore, international edition.
Murata, N., Yoshizawa, S. and Amari, S-I. (1994) ‘Network information criterion – determining the number of hidden units for an artificial neural network model’, IEEE Transactions on Neural Networks, Vol. 5, No. 6, pp.865–872.
Nakhaeizadeh, G. and Schnabl, A. (1997) ‘Development of multi-criteria metrics for evaluation of data mining algorithms’, 3rd International Conference on Knowledge Discovery and Data Mining, Newport Beach, AAAI Press, Menlo Park, California, USA, pp.37–42.
Provost, F. and Fawcett, T. (1997) ‘Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions’, 3rd International Conference on Knowledge Discovery and Data Mining, Newport Beach, AAAI Press, Menlo Park, California, USA.
Provost, F., Fawcett, T. and Kohavi, R. (1998) ‘The case against accuracy estimation for comparing induction algorithms’, 15th International Conference on Machine Learning, Morgan Kaufmann, Madison, Wisconsin, USA, pp.445–453.
Quenouille, M.H. (1956) ‘Notes on bias in estimation’, Biometrika, Vol. 43, pp.353–360.
Raftery, A.E. (1986) ‘Choosing models for cross-classifications’, American Sociological Review, Vol. 51, pp.145–146.
Ratsaby, J., Meir, R. and Maiorov, V. (1996) ‘Towards robust model selection using estimation and approximation error bounds’, 9th Annual Conference on Computational Learning Theory, ACM Press, New York City, New York, USA, pp.57–67.
Rissanen, J. (1978) ‘Modeling by shortest data description’, Automatica, Vol. 14, pp.465–471.
Russell, S. and Norvig, P. (2003) ‘Artificial intelligence: a modern approach’, Prentice Hall Series in Artificial Intelligence, 2nd ed., Prentice-Hall, Upper Saddle River, New Jersey, USA.
Salzberg, S.L. (1999) ‘On comparing classifiers: a critique of current research and methods’, Data Mining and Knowledge Discovery, Vol. 1, pp.1–12.
Sathiya Keerthi, S., Chapelle, O. and DeCoste, D. (2006) ‘Building support vector machines with reduced classifier complexity’, Journal of Machine Learning Research, Vol. 7, pp.1493–1515.
Schaffer, C. (1993) ‘Overfitting avoidance as bias’, Machine Learning, Vol. 10, No. 2, pp.153–178.
Schaffer, C. (1994) ‘A conservation law for generalization performance’, 11th International Conference on Machine Learning, Morgan Kaufmann, New Brunswick, New Jersey, USA, pp.259–265.
Schuurmans, D. (1997) ‘A new metric-based approach to model selection’, 14th National Conference on Artificial Intelligence, AAAI Press, Providence, Rhode Island, USA, pp.552–558.
Schwarz, G. (1978) ‘Estimating the dimension of a model’, Annals of Statistics, Vol. 6, No. 2, pp.461–464.
Stone, M. (1974) ‘Cross-validatory choice and assessment of statistical predictions’, Journal of the Royal Statistical Society, Series B, Vol. 36, pp.111–147.
Sugiyama, M. and Ogawa, H. (2001) ‘Subspace information criterion for model selection’, Neural Computation, Vol. 13, No. 8, pp.1863–1890.
Tornay, S.C. (1938) Ockham: Studies and Selections, Open Court Publishing Company, La Salle, Illinois, USA.
Vapnik, V. and Chervonenkis, A. (1971) ‘On the uniform convergence of relative frequencies of events to their probabilities’, Theory of Probability and its Applications, Vol. 16, pp.264–280.
Vilalta, R. and Drissi, Y. (2002) ‘Perspective view and survey of meta learning’, Artificial Intelligence Review, Vol. 18, No. 2, pp.77–95.
Weiss, G.M. and Provost, F. (2001) The Effect of Class Distribution on Classifier Learning: An Empirical Study, Technical Report ML-TR-44, Department of Computer Science, Rutgers University, New Brunswick, New Jersey, USA.
Witten, I.H. and Frank, E. (2005) Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed., Morgan Kaufmann, San Francisco, California, USA.
Wolpert, D.H. (2001) The Supervised Learning No-Free-Lunch Theorems, Technical Report, NASA Ames Research Center, Moffett Field, California, USA.
We assume that each V is a finite set, but the framework as such is not limited in theory to this assumption.
This process is sometimes called model selection. The term, originating from statistics, is frequently but also ambiguously used in the machine learning community. Consequently, it is often followed by some definition, e.g., “the objective is to select a good classifier from a set of classifiers” (Kohavi, 1995), “a mechanism for […] selecting a hypothesis among a set of candidate hypotheses based on some pre-specified quality measure” (Ratsaby et al., 1996).
In terms of our framework, an overfitted classifier scores a very high accuracy on Ts but it has an unacceptably low accuracy when classifying other instances. The goal is to find a classifier with an acceptable accuracy for typical instances of the problem domain.