EFFECTS OF DIFFERENT TYPE OF COVARIATES AND SAMPLE SIZE ON PARAMETER ESTIMATION FOR MULTINOMIAL LOGISTIC REGRESSION MODEL

Authors

  • Hamzah Abdul Hamid Institute of Engineering Mathematics, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis, Malaysia
  • Yap Bee Wah Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
  • Xian-Jin Xie Department of Clinical Sciences & Simmons Cancer Center, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd. Dallas, Texas, USA

DOI:

https://doi.org/10.11113/jt.v78.10036

Keywords:

Parameter estimation, simulation, multinomial logistic regression, skewed covariate

Abstract

The sample size and the distribution of the covariates may affect many statistical modeling techniques. This paper investigates the effects of sample size and covariate distribution on parameter estimates for the multinomial logistic regression model. A simulation study was conducted with continuous covariates drawn from different distributions (symmetric normal, positively skewed, negatively skewed). In addition, categorical covariates were simulated to investigate their effect on parameter estimation. The simulation results show that the effect of skewed and categorical covariates diminishes as sample size increases. The parameter estimates for a normally distributed covariate appear to be less affected by sample size. For a multinomial logistic regression model with a single covariate, a sample size of at least 300 is required to obtain unbiased estimates when the covariate is positively skewed or categorical. A much larger sample size is required when the covariate is negatively skewed.
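
The kind of simulation described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' procedure: the true coefficients in `TRUE_BETA`, the covariate generators in `COVARIATES` (a centred gamma distribution standing in for the skewed cases, a binary variable for the categorical case), the number of replications, and all function names are assumptions that do not come from the paper. The model is a three-category baseline-logit model with a single covariate, fitted by Newton-Raphson maximum likelihood.

```python
import math
import random

# Hypothetical true (intercept, slope) for categories 1 and 2; category 0 is the reference.
TRUE_BETA = [(0.2, 0.5), (-0.2, -0.5)]

# Hypothetical stand-ins for the covariate types named in the abstract.
COVARIATES = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "positively skewed": lambda rng: rng.gammavariate(2.0, 1.0) - 2.0,
    "negatively skewed": lambda rng: 2.0 - rng.gammavariate(2.0, 1.0),
    "categorical (binary)": lambda rng: float(rng.random() < 0.5),
}

def probabilities(x, beta):
    """Category probabilities (reference category first) at covariate value x."""
    weights = [1.0] + [math.exp(b0 + b1 * x) for b0, b1 in beta]
    total = sum(weights)
    return [w / total for w in weights]

def simulate(n, draw_x, beta, rng):
    """Draw n covariate values and multinomial responses from the model."""
    xs = [draw_x(rng) for _ in range(n)]
    ys = [rng.choices(range(len(beta) + 1), weights=probabilities(x, beta))[0]
          for x in xs]
    return xs, ys

def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def fit(xs, ys, n_categories=3, iterations=25, tol=1e-8):
    """Newton-Raphson maximum-likelihood fit; returns [(intercept, slope), ...]."""
    k = n_categories - 1
    theta = [0.0] * (2 * k)  # layout: [b10, b11, b20, b21]
    for _ in range(iterations):
        beta = [(theta[2 * j], theta[2 * j + 1]) for j in range(k)]
        grad = [0.0] * (2 * k)                          # score vector
        info = [[0.0] * (2 * k) for _ in range(2 * k)]  # negative Hessian
        for x, y in zip(xs, ys):
            p = probabilities(x, beta)[1:]
            design = (1.0, x)
            for j in range(k):
                resid = (1.0 if y == j + 1 else 0.0) - p[j]
                for a in range(2):
                    grad[2 * j + a] += resid * design[a]
                    for m in range(k):
                        w = p[j] * ((1.0 if j == m else 0.0) - p[m])
                        for b in range(2):
                            info[2 * j + a][2 * m + b] += w * design[a] * design[b]
        step = solve(info, grad)
        theta = [t + s for t, s in zip(theta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return [(theta[2 * j], theta[2 * j + 1]) for j in range(k)]

def mean_bias(name, n, replications=10, seed=1):
    """Average (estimate - truth) per parameter for one covariate type and sample size."""
    rng = random.Random(seed)
    truth = [v for pair in TRUE_BETA for v in pair]
    bias = [0.0] * len(truth)
    for _ in range(replications):
        xs, ys = simulate(n, COVARIATES[name], TRUE_BETA, rng)
        estimate = [v for pair in fit(xs, ys) for v in pair]
        for i, (e, t) in enumerate(zip(estimate, truth)):
            bias[i] += (e - t) / replications
    return bias

if __name__ == "__main__":
    for name in COVARIATES:
        for n in (100, 300):
            print(name, n, [round(v, 3) for v in mean_bias(name, n)])
```

Comparing the printed average bias across covariate types and sample sizes mirrors the comparison the abstract reports. A real study would use far more replications and would need to handle small samples in which the maximum-likelihood estimate fails to exist (separation), which this sketch does not attempt.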

Published

2016-12-15

How to Cite

EFFECTS OF DIFFERENT TYPE OF COVARIATES AND SAMPLE SIZE ON PARAMETER ESTIMATION FOR MULTINOMIAL LOGISTIC REGRESSION MODEL. (2016). Jurnal Teknologi, 78(12-3). https://doi.org/10.11113/jt.v78.10036