Development and Standardization of Biology Achievement Test

Document Type : Original Article

Authors

1 PhD in Measurement and Assessment; Teacher at Baqer Al-Uloom High School, Education District 4, Shiraz, Iran

2 Professor of Educational Psychology; Faculty Member, Psychology and Educational Sciences Department, Shiraz University, Iran

Abstract

The general purpose of this study was to develop and standardize an achievement test to measure student learning in the secondary school biology program. Both classical test theory and item response theory (IRT) models were used to address the research objectives. The preliminary instrument consisted of 150 multiple-choice items and was administered to a sample of 300 male and female students. The final instrument comprised two parallel forms of 50 items each, which were administered to a normative sample of 938 male and female students in Shiraz. The estimated internal consistency reliability coefficients for the two forms were 0.89 and 0.88, respectively. Factor analysis indicated that both forms were saturated by a single general factor. There was no significant difference between the mean scores of boys and girls, so standardized and percentile norms were calculated for all subjects combined. The IRT analysis showed that more than 92% of the items fit the three-parameter logistic model. The test information function was bell-shaped, provided more information over a wide ability range from -0.5 to +2.5, and reached its maximum at +1.5 on the ability continuum. It can therefore be concluded that the Biology Achievement Test provides a more accurate estimate of the true score for subjects whose ability is above average.
 

Keywords

