Publication Date
2023
Document Type
Dissertation/Thesis
First Advisor
Eads, Michael
Degree Name
Ph.D. (Doctor of Philosophy)
Legacy Department
Department of Physics
Abstract
Predicting students’ performance to identify which students are at risk of receiving a D/Fail/Withdraw (DFW) grade, and ensuring their timely graduation, is not just desirable but necessary at most educational institutions. In the US, not only are Science, Technology, Engineering, and Mathematics (STEM) majors becoming less popular among students, but the graduation rate of STEM students is also steadily declining. The lack of STEM graduates in the US is a serious problem that will place the country at a disadvantage as a competitor in international technological advancement. To secure the country’s status as an international technological leader, US institutions must be more vigilant in predicting the grades of STEM students in order to improve student retention in STEM fields. Early grade prediction allows a school to monitor students’ course progress and increases their chances of graduating. Predicting grades is highly beneficial for at-risk STEM students because it allows for timely pedagogical interventions that can better equip the students for success in their courses and prevent dropouts. Identifying at-risk students is a complicated problem because there are many factors to consider. Traditional approaches to analyzing and using students’ data have had mixed results in identifying factors that help predict students’ performance early in the semester. Machine learning can identify patterns that other approaches cannot, making it a promising method for grade prediction. This study uses machine learning algorithms to identify the key factors that predict which students are at risk early in the semester. The results of this study show that demographic variables such as gender, academic level, and age are of little value in predicting student success.
The factors with the highest correlation to the course grade were the students’ cumulative college grade point average (GPA) and the grades that the students received on the first four homework assignments of the semester. These variables are provided as input to three machine learning algorithms: logistic regression, decision tree, and random forest. Using machine learning, grades can be predicted with 80% to 97% accuracy in the first two to four weeks of the semester, which allows the university to intervene early.
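As a rough illustration of the approach described in the abstract, the sketch below trains a logistic regression classifier on cumulative GPA and the first four homework grades to flag at-risk students. It is not the dissertation's code or data: the students are synthetic, the helper names (`make_student`, `train`, `predict_risk`) are hypothetical, and the classifier is a hand-written gradient-descent version using only the Python standard library. The decision tree and random forest models named in the abstract are not shown.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic data is reproducible


def clamp01(v):
    return max(0.0, min(1.0, v))


def make_student():
    """Return (features, at_risk) for one synthetic student (invented data)."""
    ability = random.random()  # hidden trait that drives every grade
    gpa = 4.0 * clamp01(ability + random.gauss(0, 0.10))
    homework = [100.0 * clamp01(ability + random.gauss(0, 0.15)) for _ in range(4)]
    # Rescale GPA and homework to [-0.5, 0.5] so gradient descent is well behaved.
    features = [gpa / 4.0 - 0.5] + [h / 100.0 - 0.5 for h in homework]
    at_risk = 1 if ability < 0.4 else 0  # 1 stands in for a DFW outcome
    return features, at_risk


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_risk(weights, bias, features):
    """Estimated probability that the student is at risk."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(data, epochs=500, lr=1.0):
    """Fit logistic regression by full-batch gradient descent."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in data:
            error = predict_risk(weights, bias, features) - label
            grad_b += error
            for j, x in enumerate(features):
                grad_w[j] += error * x
        bias -= lr * grad_b / len(data)
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
    return weights, bias


train_set = [make_student() for _ in range(200)]
test_set = [make_student() for _ in range(100)]
weights, bias = train(train_set)

# Flag a student as at risk when the estimated probability is at least 0.5.
correct = sum((predict_risk(weights, bias, x) >= 0.5) == bool(y) for x, y in test_set)
accuracy = correct / len(test_set)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here reflects only the invented data and says nothing about the 80% to 97% range reported in the abstract. In practice, a library implementation of the three named algorithms would likely replace the hand-written training loop.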
Recommended Citation
Fatima, Saba, "Using Machine Learning to Predict Student Outcomes" (2023). Graduate Research Theses & Dissertations. 7142.
https://huskiecommons.lib.niu.edu/allgraduate-thesesdissertations/7142
Extent
108 pages
Language
eng
Publisher
Northern Illinois University
Rights Statement
In Copyright
Rights Statement 2
NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.
Media Type
Text