Classification of Educational Skills for University Students in Computer Programming Classes

Abstract—This paper analyzes the relationships among the educational skills that students should achieve in each computer programming class using a student self-assessment questionnaire. The questionnaire survey, containing 25 educational skills, was conducted in computer programming classes at the author's university using a computer-assisted web-interviewing technique. The questionnaire results are analyzed using agglomerative hierarchical clustering based on Ward’s method and a self-organizing map, which is a machine learning method. This study shows that the students can be classified into four clusters: highly skilled students, students with high learning and thinking skills but low executing skills, students with high learning and executing skills but low thinking skills, and students with lower skills.


I. INTRODUCTION
The skills and attributes that should be learned in school have been changing. The United States Department of Education announced the 21st century skills, and the Definition and Selection of Competencies: Theoretical and Conceptual Foundations (DeSeCo) project of the Organisation for Economic Co-operation and Development (OECD) proposed the key competencies. The National Institute for Educational Policy Research in Japan conducted research on curricula that foster attributes and skills.
Tokai University, to which the author belongs, has set four key abilities as specific evaluation indicators since 2009: thinking, communication, challenging, and accomplishment. The faculty has set the appropriate abilities and evaluation indicators in the syllabus as skills to be taught in class. However, the method of setting these skills is neither theoretical nor systematic. This study aims to set effective educational skills and educational performance indicators analytically for each class.
Meta-analyses [1]-[3], data envelopment analyses [4], [5], and mediational analysis [6] have been used in research on the evaluation of indexes and skills. The data envelopment analysis in [4] examined the relative efficiency of Australian universities. Furthermore, the technical and scale efficiency were estimated for the population of Australian universities in [5]. The achievement goals, motivational study strategies, and exam performance were examined using mediational analysis in [6]. The author of this paper has examined the relationship between students' educational skills using multidimensional scaling in [7] and a self-organizing map (SOM) in [8], [9].
Manuscript received December 20, 2020; revised April 13, 2021. T. Taniguchi is with IT Education Center, Tokai University, Hiratsuka, Kanagawa 2591292, Japan (e-mail: taniguchi@tokai-u.jp).
To investigate the relationships between educational skills and students' consciousness, this paper conducts a questionnaire survey about educational skills before and after students attend computer programming classes. This study used a self-assessment questionnaire to analyze the relationship among students' educational skills. Students who took the introduction and the advanced computer programming classes were the participants of the survey. The questionnaire survey covered the 25 educational skills in [10], [11]. The SOM in [12] and agglomerative hierarchical clustering based on Ward's method were used to analyze the questionnaire data. The SOM, which is a machine learning method, is an unsupervised neural network method and an efficient tool for visualizing the relationships in complicated data. This study classified the students into four clusters: highly skilled students, students with high learning and thinking skills but low executing skills, students with high learning and executing skills but low thinking skills, and students with lower skills.

A. Participants
The questionnaire survey was conducted before and after computer programming classes in the autumn semester of 2018. Students from 11 faculties at Tokai University who were enrolled in the introduction and advanced courses were surveyed. The introduction course comprises "introduction to computer programming" and "basic computer programming." "Applied computer programming," "computer algorithm," and "computer graphics" are in the advanced course. Students can take these classes regardless of their year level or faculty. Table I shows the year level, gender, and faculty of the survey respondents. All participants volunteered to take part in the study, for a cumulative total of 443 student participants.

B. Procedure
A computer-assisted web-interviewing technique was used to collect the questionnaire data. The participants were asked to complete the online questionnaire covering the educational skills. The questionnaire and information about the purpose of the study were provided to the participants on web pages. All participants joined voluntarily and read the informed consent terms on the questionnaire web pages.

D. Data Analysis
The Mann-Whitney U test was used to evaluate whether the students' self-assessed educational skills changed between before and after the classes. Tables III and IV present the results from the introduction and advanced courses, respectively, where the bold numbers indicate that the means increased after the classes compared with before the classes.
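The before/after comparison described above can be illustrated with a minimal sketch of a two-sided Mann-Whitney U test. This is not the author's analysis code: it assumes the normal approximation with a tie correction and no continuity correction, and the function name `mann_whitney_u` and the rating lists are hypothetical. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    # Normal approximation; the tie term shrinks the variance.
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sd_u == 0:  # every observation is identical
        return u, 1.0
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p


# Hypothetical 5-point self-ratings for one skill (not data from this study).
before = [2, 3, 3, 4, 2, 3, 4, 3]
after = [3, 3, 4, 4, 3, 2, 4, 5]
u_stat, p_value = mann_whitney_u(before, after)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

Because questionnaire ratings take only a few distinct values, ties are frequent, which is why the sketch includes the tie correction.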
Agglomerative hierarchical clustering based on Ward's method was applied to the questionnaire results to classify the participants into groups according to their educational skills. The dendrograms in Figs. 1, 3, 5, and 7 are tree diagrams representing the clustering results.
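As an illustration of Ward's method, the following sketch repeatedly merges the two clusters whose union causes the smallest increase in the within-cluster sum of squares. It is a naive implementation written for readability, not the procedure used to produce the figures; the function name `ward_clustering` and the sample ratings are hypothetical. A library routine such as `scipy.cluster.hierarchy.linkage` with `method="ward"` is the usual choice for real data.

```python
def ward_clustering(points, n_clusters):
    """Naive agglomerative clustering with Ward's merge criterion.

    points: list of equal-length numeric lists (one row per respondent).
    Returns (clusters, heights): clusters is a list of lists of row indices,
    heights lists the merge cost of each merge in the order performed.
    """
    clusters = [[i] for i in range(len(points))]
    heights = []
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        squared_dist = sum((p - q) ** 2 for p, q in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * squared_dist

    while len(clusters) > n_clusters:
        cost, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        heights.append(cost)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters, heights


# Hypothetical ratings of four respondents on three skills.
ratings = [[5, 5, 5], [5, 4, 5], [1, 2, 1], [2, 1, 1]]
groups, merge_heights = ward_clustering(ratings, n_clusters=2)
print(groups, merge_heights)
```

The recorded merge costs are the quantities a dendrogram displays on its height axis, although libraries may report them on a different scale.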
The SOM is an efficient tool for visualizing multidimensional data; here, the data are the relationships among the 21 educational skills based on the students' consciousness, as captured by the questionnaire results. The remaining four educational skills (i.e., communication, collaboration, relationship, and artistic skills) were not covered in the classes; therefore, they were excluded. This study used the SOM-Toolbox of MATLAB in [13] to create and visualize the SOM for the datasets. The questionnaire results for the introduction and advanced classes were used as the datasets. These data were normalized such that each variable had a unit variance. The SOM results in this study were obtained regardless of the initial values because the SOMs were initialized and trained through principal component analysis.
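The normalization step can be sketched as follows. The text states only that each variable was scaled to unit variance; this sketch additionally assumes that each variable is centered to zero mean and that the sample standard deviation (n - 1 denominator) is used, neither of which is stated in the paper. The function name `normalize_unit_variance` and the sample rows are hypothetical.

```python
import statistics


def normalize_unit_variance(rows):
    """Scale each column to zero mean and unit sample variance.

    rows: list of equal-length numeric lists (one row per respondent).
    Columns with zero variance are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # n - 1 denominator
    return [
        [(value - m) / s if s > 0 else 0.0
         for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical ratings: three respondents, two skills on different scales.
raw = [[1, 10], [2, 20], [3, 30]]
print(normalize_unit_variance(raw))
```

Scaling in this way prevents a skill with a wider spread of ratings from dominating the Euclidean distances on which the SOM is based.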

III. RESULTS
First, the data were analyzed for statistical significance using the Mann-Whitney U test. Table III shows that almost all the means of the educational skills increased after the classes compared with before, except for inquiring skills. Table IV shows that the mean values for approximately half of the educational skills increased. However, the U tests found no significant differences (p < 0.05) in the educational skills in either the introduction or the advanced courses. Therefore, there is no statistically significant difference between before and after the classes.
The IT Education Center at Tokai University has been conducting an educational skill survey of students for several years. We examined the relationship among educational skills using multidimensional scaling in [7] and showed, using a SOM analysis in [8], that the educational skills were classified into different skill groups according to the difficulty of the ICT courses. We showed in [9] that the students could be classified into several groups based on their attributes, respective academic faculties, and academic years. This paper analyzed the relationships between the questionnaire results before and after classes. The students attending classes were classified into several groups by applying agglomerative hierarchical clustering based on Ward's method to the questionnaire results. The dendrograms presented in Figs. 1 and 3 illustrate the clustering results before and after the introduction classes, respectively. Figs. 5 and 7 illustrate the before and after results of the advanced classes. The right graph in Fig. 1 illustrates that the gradient is steep at a point less than 4. Therefore, the number of clusters before the introduction classes is set to 4. Similarly, the number of clusters after the introduction classes is set to 4. The numbers of clusters before and after the advanced classes are 4 and 3, respectively. There is no direct relationship between the results of the dendrograms and the following analysis of the SOM. This study considered these cluster results of the dendrograms because deciding on the number of clusters is difficult for a SOM analysis.
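Reading the number of clusters from the point where the merge-height curve becomes steep can be approximated by a simple heuristic: choose the number of clusters k for which the next merge (from k to k - 1 clusters) is much more costly than the previous one. The sketch below is one possible formalization and is an assumption, not the rule stated in the paper; the function name `pick_cluster_count` and the merge heights are hypothetical.

```python
def pick_cluster_count(heights, max_k=10):
    """Pick a cluster count from agglomerative merge heights.

    heights: merge heights in the order the merges were performed, so
    heights[i] is the cost of going from n - i to n - i - 1 clusters,
    where n = len(heights) + 1 is the number of observations.
    Returns the k (2 <= k <= max_k) with the largest jump in height
    between successive merges, or None if there are too few merges.
    """
    n = len(heights) + 1
    best_k, best_jump = None, float("-inf")
    for k in range(2, min(max_k, n - 1) + 1):
        # Cost of merging k -> k-1 clusters minus cost of merging k+1 -> k.
        jump = heights[n - k] - heights[n - k - 1]
        if jump > best_jump:
            best_k, best_jump = k, jump
    return best_k


# Hypothetical merge heights for eight observations (seven merges).
example_heights = [0.5, 0.6, 0.8, 1.0, 6.0, 7.0, 9.0]
print(pick_cluster_count(example_heights))
```

A heuristic of this kind only supports the choice; visual inspection of the dendrogram, as done in this study, remains the deciding step.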
The SOM method was applied to the questionnaire results to investigate the relationship among the educational skills. The unified distance matrices (U-matrices) are in the upper left of Figs. 2, 4, 6, and 8; the other matrices are the component planes. In the U-matrices, a large difference between neighboring reference vectors (yellow dot sequence) represents a cluster boundary. The component planes illustrate the 21 educational skills as SOM variables. In the component planes, the yellow portions represent a higher skill rating, and the blue portions represent a lower skill rating.
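To make the role of the U-matrix concrete, the following sketch computes, for each map node, the mean Euclidean distance between its reference vector and those of its adjacent nodes; large values mark cluster boundaries. It is a simplified illustration that assumes a rectangular lattice with four neighbors and one value per node, which differs from the layout a SOM toolbox may use. The function name `u_matrix` and the codebook are hypothetical.

```python
import math


def u_matrix(codebook):
    """Simplified U-matrix for a SOM on a rectangular grid.

    codebook[r][c] is the reference vector of the node at row r, column c.
    Returns a grid of the same shape holding the mean distance from each
    node to its up/down/left/right neighbors.
    """
    n_rows, n_cols = len(codebook), len(codebook[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            distances = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    distances.append(
                        math.dist(codebook[r][c], codebook[rr][cc]))
            if distances:  # a 1x1 map has no neighbors
                result[r][c] = sum(distances) / len(distances)
    return result


# Hypothetical 2x3 map: two left columns are similar, right column differs.
codebook = [
    [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]],
    [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]],
]
for row in u_matrix(codebook):
    print([round(v, 2) for v in row])
```

In this toy map the values rise toward the right-hand column, which is where a boundary between two groups of nodes would be drawn.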

IV. DISCUSSION
The U-matrix in Fig. 2 shows two cluster boundaries before the introduction classes. The figure also shows that the top-left boundary represents the student group with lower educational skills and the bottom-right one represents the group with higher skills. The U-matrix in Fig. 4 shows two cluster boundaries after the introduction classes. The top boundary represents the lower educational skills, whereas the bottom one represents the higher skills, in the same manner as before. Both the upper and lower areas in Fig. 4 are wider after the classes than before. The U-matrices in Figs. 2 and 4 show that the number of students with lower skills and the number with higher skills both increased after the classes. Hence, the problem that should be addressed is the increasing number of students with lower skills.
In the component planes in Figs. 2 and 4, the yellow and blue portions represent a higher skill rating and a lower skill rating, respectively. These component planes can be classified into three types according to their color patterns. Table V shows the three types of educational skill classifications according to the yellow shapes: the square area represents the learning skills, and the bottom-right and bottom-left corner triangle areas represent the executing skills and thinking skills, respectively. From the cluster boundaries in the U-matrices and these color patterns, the students are classified into four clusters: the bottom boundary students with higher skills, the middle-left boundary students with high learning and thinking skills but low executing skills, the middle-right boundary students with high learning and executing skills but lower thinking skills, and the top boundary students with lower skills overall. Interestingly, the middle-left boundary students have the opposite nature of the middle-right boundary ones.
Figs. 6 and 8 show the SOM analysis for the advanced classes. The U-matrices of these figures show one clear cluster before the advanced classes and one thin boundary after them. The top-left boundary represents the students with lower educational skills, and the bottom one represents those with higher skills, similar to the introduction courses. In addition, these U-matrices show that the number of students with lower skills decreased after the classes because the top-left boundary was smaller after the classes than before. The component planes in Figs. 6 and 8 can be classified into three types according to the color pattern, similar to those of the introduction classes. Table VI shows the educational skill classifications according to the yellow shapes. From the cluster boundaries in the U-matrices and these color patterns, the students are classified into four clusters, similar to the introduction classes.
Therefore, the computer programming students can be classified into four groups regardless of the class difficulty:
1) Students with higher educational skills
2) Students with high learning and thinking skills but low executing skills
3) Students with high learning and executing skills but low thinking skills
4) Students with lower educational skills
From these results, the following four measures for the computer programming classes can be considered:
1) Offering more difficult assignments and advanced subjects
2) Offering more assignments and extending computer programming time
3) Compelling the students to think deeply about programming algorithms
4) Compelling the students to ask questions frequently

V. CONCLUSION
This paper has analyzed the relationships among the educational skills that students should achieve in each computer programming class through a student self-assessment questionnaire. The questionnaire survey with 25 educational skills was administered in computer programming classes at Tokai University using a computer-assisted web-interviewing technique. The questionnaire results were analyzed using agglomerative hierarchical clustering based on Ward's method and a self-organizing map, which is a machine learning method. This study has shown that the students can be classified into four clusters: highly skilled students, students with high learning and thinking skills but low executing skills, students with high learning and executing skills but low thinking skills, and students with low skills. Using these results, present and future work will focus on improving the class syllabus and contents.