Abstract—This study uses a machine learning technique, a boosted tree model, to relate student cognitive achievement in the 2018 data from the Programme for International Student Assessment (PISA) to other features of the student learning process, capturing the complex and nonlinear relationships in the data. The SHapley Additive exPlanations (SHAP) approach is then used to interpret the fitted model, revealing the relative importance of each feature in predicting cognitive achievement. Instruction time emerges as an important predictor, but the relationship between its value and its contribution to the prediction is nonlinear: a weekly learning time of more than 35 hours is associated with a smaller positive, or even a negative, contribution to the predicted outcome. We discuss how this method could be used to flag problems in the student population related to learning time or other features.
Index Terms—Learning factor analysis, machine learning, SHAP values, PISA.
The authors are with Alef Education, UAE (e-mail: firstname.lastname@example.org).