EFFECTS OF SAMPLE SIZE RATIO ON THE PERFORMANCE OF THE QUADRATIC DISCRIMINANT FUNCTION
DOI:
https://doi.org/10.51406/jnset.v11i2.1968
Keywords:
Heteroscedastic, Unbalanced data, Discriminant function, Prior probabilities, Misclassification
Abstract
This study investigated the performance of the heteroscedastic (quadratic) discriminant function under the non-optimal condition of unbalanced group representation in the populations. The asymptotic performance of the classification function as the Mahalanobis distance between groups increases was considered under this condition. The results show that, for small sample sizes, misclassification of observations from the smaller group escalates once the sample size ratio exceeds 1:2. When the data set is balanced, the function is more sensitive to sample size than to the distance between groups, whereas classification of the underrepresented group improves as that distance increases. The Quadratic Discriminant Function was also more robust to unbalanced data than the Linear Discriminant Function.
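The kind of experiment described above can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes two bivariate normal groups with hypothetical covariance matrices (identity and twice the identity), takes the priors as the training-sample proportions, and uses a mean shift `delta` as a stand-in for the separation between groups. The function names `fit_group`, `quadratic_score`, and `error_rates` are illustrative only.

```python
import numpy as np


def fit_group(x):
    """Return the sample mean vector and sample covariance matrix of one group."""
    return x.mean(axis=0), np.cov(x, rowvar=False)


def quadratic_score(x, mean, cov, prior):
    """Quadratic discriminant score for each row of x (larger favours the group)."""
    diff = x - mean
    inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every row to the group mean.
    maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * logdet - 0.5 * maha + np.log(prior)


def error_rates(n1, n2, delta, p=2, n_test=2000, seed=0):
    """Estimate per-group misclassification rates for training sizes n1 and n2.

    Assumed setup (not from the paper): group 1 ~ N(0, I), group 2 ~ N(mu2, 2I),
    where mu2 is zero except for a shift of `delta` in the first coordinate.
    """
    rng = np.random.default_rng(seed)
    mu1 = np.zeros(p)
    mu2 = np.zeros(p)
    mu2[0] = delta
    cov1 = np.eye(p)
    cov2 = 2.0 * np.eye(p)

    # Unbalanced training samples; priors estimated from the sample proportions.
    m1, s1 = fit_group(rng.multivariate_normal(mu1, cov1, n1))
    m2, s2 = fit_group(rng.multivariate_normal(mu2, cov2, n2))
    pi1 = n1 / (n1 + n2)
    pi2 = 1.0 - pi1

    def predict(x):
        score1 = quadratic_score(x, m1, s1, pi1)
        score2 = quadratic_score(x, m2, s2, pi2)
        return np.where(score1 >= score2, 1, 2)

    # Independent test samples of equal size from each group.
    test1 = rng.multivariate_normal(mu1, cov1, n_test)
    test2 = rng.multivariate_normal(mu2, cov2, n_test)
    err1 = float(np.mean(predict(test1) != 1))
    err2 = float(np.mean(predict(test2) != 2))
    return err1, err2


if __name__ == "__main__":
    # Group 1 is the smaller group; the ratio n1:n2 goes from 1:1 to 1:4.
    for n1, n2 in [(20, 20), (20, 40), (20, 80)]:
        for delta in (1.0, 2.0, 3.0):
            e1, e2 = error_rates(n1, n2, delta)
            print(f"n1:n2={n1}:{n2} delta={delta} err1={e1:.3f} err2={e2:.3f}")
```

Running the script prints the estimated error rate of each group for several sample size ratios and separations. A single seed gives only a rough picture; averaging over many replications would be needed before drawing conclusions.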