Publication Date

2023

Document Type

Dissertation/Thesis

First Advisor

Eads, Michael

Degree Name

Ph.D. (Doctor of Philosophy)

Legacy Department

Department of Physics

Abstract

Predicting students’ performance to identify those at risk of receiving a D/Fail/Withdraw (DFW) grade, and so support their timely graduation, is not just desirable but necessary in most educational institutions. In the US, not only are Science, Technology, Engineering, and Mathematics (STEM) majors becoming less popular among students, but the graduation rate of STEM students is also steadily declining. The lack of STEM graduates in the US is a serious problem that will place the country at a disadvantage in international technological advancement. To secure the country's status as an international technological leader, US institutions must be more vigilant in predicting the grades of STEM students in order to improve retention in STEM fields. Early grade prediction allows a school to monitor students’ course progress and increases their chances of graduating. It is especially beneficial for at-risk STEM students because it allows for timely pedagogical interventions that better equip them for success in their courses and help prevent dropouts. Identifying at-risk students is a complicated problem because there are many factors to consider. Traditional approaches to analyzing and using students’ data have had mixed results in identifying factors that help predict performance early in the semester. Machine learning can identify patterns that other approaches cannot, making it a promising, currently available method for grade prediction. This study uses machine learning algorithms to identify the key factors that predict which students are at risk early in the semester. The results show that demographic variables such as gender, academic level, and age are of little value in predicting student success.
The factors with the highest correlation to the course grade were the students’ cumulative college grade point average (GPA) and the grades they received on the first four homework assignments of the semester. These variables are provided as input to three machine learning algorithms: logistic regression, decision tree, and random forest. Using machine learning, grades can be predicted with 80% to 97% accuracy in the first two to four weeks of the semester, which allows the university to intervene early.
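
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch trains the three named classifier types on cumulative GPA plus four early homework scores using scikit-learn. It is not the thesis's code: the data are synthetic placeholders generated on the spot, and the variable names, the 30% at-risk cutoff, and all hyperparameters are assumptions made purely for illustration. The accuracies it produces say nothing about the results reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_students = 400  # hypothetical class size, not from the thesis

# Synthetic placeholder features (NOT data from the thesis):
# cumulative GPA and scores on the first four homework assignments.
gpa = rng.uniform(1.5, 4.0, size=n_students)
homework = np.clip(
    rng.normal(loc=50 + 10 * gpa[:, None], scale=12, size=(n_students, 4)),
    0,
    100,
)

# Synthetic label: flag the lowest 30% of a noisy combined score as "at risk".
# The 30% cutoff is an arbitrary assumption for this sketch.
score = 0.5 * (gpa / 4 * 100) + 0.5 * homework.mean(axis=1)
score += rng.normal(scale=5, size=n_students)
at_risk = (score < np.percentile(score, 30)).astype(int)

X = np.column_stack([gpa, homework])  # 5 input columns per student
X_train, X_test, y_train, y_test = train_test_split(
    X, at_risk, test_size=0.25, stratify=at_risk, random_state=0
)

# The three classifier families named in the abstract, with assumed settings.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out students.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: held-out accuracy = {acc:.2f}")
```

With real course records, the synthetic arrays would be replaced by actual GPA and homework columns, and the label by the observed DFW outcome.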

Recommended Citation

Fatima, Saba, "Using Machine Learning to Predict Student Outcomes" (2023). Graduate Research Theses & Dissertations. 7142.
https://huskiecommons.lib.niu.edu/allgraduate-thesesdissertations/7142

Extent

108 pages

Language

eng

Publisher

Northern Illinois University

Rights Statement

In Copyright

Rights Statement 2

NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.

Media Type

Text

Included in

Physics Commons