Abstract—This paper introduces two heuristic methods, based on crisp and fuzzy partitions, for selecting a subset of instances from the training data set in high-dimensional problems. This subset is called the representative training data set (RTR). A proposed genetic algorithm (GA) then learns a compact fuzzy rule-based system (FRBS) from the instances of the RTR. Because the RTR is considerably smaller than the initial training data set, the time cost of learning the FRBS decreases significantly. The number of fuzzy rules is reduced, and the rules themselves are shorter. A smaller rule base is closely related to the interpretability of the FRBS. As a result, the final FRBS achieves an acceptable balance between interpretability and accuracy.
Index Terms—Crisp partition, fuzzy partition, fuzzy rule set reduction, data reduction techniques, genetic algorithm, interpretability.
Tri Minh Huynh is with the Department of Information Technology, Sai Gon University, Ho Chi Minh City, Viet Nam (e-mail: email@example.com).
Cite: Tri Minh Huynh, "Two New Heuristic Methods Based on Crisp and Fuzzy Partitions for Training Data Reduction," International Journal of Information and Education Technology, vol. 1, no. 4, pp. 273-279, 2011.