Enhanced naïve Bayes classification framework with data reduction and transformation techniques

Student thesis: PhD Thesis

Abstract

The Bayesian classification framework has been widely used in many fields, but the covariance matrix is usually difficult to estimate reliably. To alleviate this problem, many naive Bayes (NB) approaches with good performance have been developed. However, the assumption of conditional independence between attributes in NB rarely holds in reality. Various attribute-weighting schemes have been developed to address this problem. Among them, class-specific attribute weighted naive Bayes (CAWNB) has recently achieved good performance by using classification feedback to optimize the attribute weights of each class. However, the derived model may be over-fitted to the training dataset, especially when the dataset is insufficient to train a model with good generalization performance. Moreover, the Bayesian classification framework often relies on discretization to handle various data types. Existing discretization methods often aim to maximize the discriminant power of the discretized data, while overlooking the fact that the primary goal of data discretization in classification is to improve generalization performance. As a result, the data tend to be over-split into many small bins, since undiscretized data retain the maximal discriminant information.
In this thesis, we exploit the intrinsic structure of the data using data reduction and transformation methods. In Chapter 3, we propose a regularization technique to improve the generalization capability of the naive Bayes classifier, which balances the trade-off between discrimination power and generalization capability. In Chapter 4, we boost the discriminant power of naive Bayes by developing a semi-supervised discretization framework with an adaptive discriminative selection criterion. In Chapter 5, a well-designed discretization scheme using a Max-Relevancy-Min-Divergence (MRmD) criterion is introduced to better balance the generalization ability and discrimination power of the subsequent classifier. To reduce data noise and alleviate the weakness of NB in capturing feature correlations, a feature augmentation framework employing a stacked autoencoder is proposed in Chapter 6. These contributions are discussed in detail as follows.
Firstly, we propose a regularization technique to improve the generalization capability of the naive Bayes classifier, which balances the trade-off between discrimination power and generalization capability. More specifically, by introducing a regularization term, the proposed method, namely regularized naive Bayes (RNB), can well capture the data characteristics when the dataset is large, and exhibit good generalization performance when the dataset is small. RNB is compared with state-of-the-art naive Bayes methods. Experiments on 33 machine-learning benchmark datasets demonstrate that RNB significantly outperforms the other NB methods.
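The decision rule behind such a regularized, attribute-weighted NB can be illustrated with a minimal sketch. Here the class-specific weights are assumed to be already learned (e.g., via CAWNB-style optimization), and `alpha` is an illustrative regularization coefficient that interpolates toward the uniform weights of standard naive Bayes; the function name and interface are hypothetical, not from the thesis.

```python
import numpy as np

def weighted_nb_log_posterior(log_prior, log_likelihood, class_weights, alpha):
    """Sketch of a regularized weighted naive Bayes decision rule.

    log_prior:      (C,)   log P(c) for each of the C classes
    log_likelihood: (C, D) log P(a_d | c) for one test sample
    class_weights:  (C, D) learned class-specific attribute weights
    alpha:          regularization coefficient in [0, 1]; alpha = 0
                    recovers standard naive Bayes (uniform weights)
    """
    uniform = np.ones_like(class_weights)
    # Interpolate between the learned class-specific weights and the
    # uniform weights of standard NB to control over-fitting.
    w = alpha * class_weights + (1.0 - alpha) * uniform
    return log_prior + (w * log_likelihood).sum(axis=1)
```

The predicted class is the argmax of the returned log-posterior vector; with a small training set, a smaller `alpha` pulls the model toward the better-generalizing standard NB.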
Secondly, we design a semi-supervised adaptive discriminative discretization (SADD) scheme to address the significant information loss in previous discretization methods and improve the performance of naive Bayes classifiers. To make full use of both labeled and unlabeled data, a pseudo-labeling technique is utilized to estimate pseudo labels for the unlabeled data. Then, an adaptive discriminative selection criterion is proposed to further reduce the information loss, and the resulting discretization scheme achieves a better trade-off between generalization ability and discrimination power. Experimental results on 31 machine-learning datasets validate the effectiveness of the proposed SADD.
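The pseudo-labeling step described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the base classifier (`GaussianNB`) and the confidence `threshold` for keeping a pseudo-labeled sample are assumptions made for the example.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def pseudo_label(X_lab, y_lab, X_unlab, threshold=0.9):
    """Assign pseudo labels to unlabeled data using a classifier
    trained on the labeled data, keeping only confident predictions.
    The labeled and pseudo-labeled samples can then jointly drive
    the subsequent discriminative discretization."""
    clf = GaussianNB().fit(X_lab, y_lab)
    proba = clf.predict_proba(X_unlab)
    conf = proba.max(axis=1)          # confidence of the top prediction
    keep = conf >= threshold          # discard low-confidence samples
    return X_unlab[keep], proba.argmax(axis=1)[keep]
```

In practice the confidence cut-off trades pseudo-label quality against the amount of extra data made available to the discretizer.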
Thirdly, we propose a Max-Dependency-Min-Divergence (MDmD) criterion that maximizes both the discriminant information and the generalization ability of the discretized data, so that the performance of the NB classifier can be improved. More specifically, the Max-Dependency criterion maximizes the statistical dependency between the discretized data and the classification variable, while the Min-Divergence criterion explicitly minimizes the JS-divergence between the training data and the validation data for a given discretization scheme. Since the high-order joint distributions required by MDmD are difficult to estimate reliably, it is relaxed to a more practical Max-Relevancy-Min-Divergence (MRmD) criterion. The proposed MRmD is compared with state-of-the-art discretization algorithms under the naive Bayes classification framework on 45 machine-learning benchmark datasets, and it significantly outperforms all the compared methods on most of the datasets.
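The Min-Divergence term compares, for a candidate discretization scheme, how similarly the training and validation data populate the resulting bins. A minimal sketch of the Jensen-Shannon divergence it minimizes is given below; taking raw bin counts as input and the smoothing constant `eps` are choices made for this example.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (base 2) between two discrete
    distributions, e.g. the bin-occupancy histograms of the training
    and validation data under one candidate discretization scheme.
    Symmetric and bounded in [0, 1]; 0 means identical histograms."""
    p = np.asarray(p, dtype=float) + eps   # smooth empty bins
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()        # normalize counts
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log2(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A scheme that over-splits the data into many small bins tends to produce training and validation histograms that disagree, which this term penalizes.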
Fourthly, we enhance the discriminant power of NB classifiers by a stacked autoencoder that consists of two autoencoders serving different purposes. The first autoencoder shrinks the initial features into a compact representation in order to remove noise and redundant information. The second autoencoder boosts the discriminant power of the features by expanding them into a higher-dimensional space, in which samples from different classes can be better separated. Integrating this feature augmentation with the state-of-the-art regularized naive Bayes yields the proposed FAR-NB, which greatly enhances the discrimination power of the model. FAR-NB is compared with state-of-the-art NB classifiers and achieves superior classification performance.
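The compress-then-expand structure can be illustrated with a forward-pass sketch. The dimensions, activations, and random placeholder weights below are assumptions for illustration only; in the thesis both stages are trained as autoencoders before their encoders are stacked.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(X, W1, W2):
    """Sketch of two-stage feature augmentation: a compressing encoder
    maps the D input features to a smaller code (noise removal), then
    an expanding encoder lifts the code to a higher-dimensional space
    where classes may become more separable."""
    h = np.tanh(X @ W1)   # compression: D -> d, with d < D
    z = np.tanh(h @ W2)   # expansion:   d -> D', with D' > D
    return z

D, d, D2 = 20, 8, 64      # illustrative dimensions, not from the thesis
W1 = rng.normal(0, 0.1, (D, d))
W2 = rng.normal(0, 0.1, (d, D2))
X = rng.normal(size=(5, D))
Z = augment(X, W1, W2)    # augmented features fed to the NB classifier
```

The augmented representation `Z` would then replace the raw features as input to the regularized naive Bayes classifier.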
The contributions of this thesis are summarized as follows:
• We propose a regularized naive Bayes classifier to automatically balance the generalization ability and discrimination power by optimizing the attribute weights.
• We propose a semi-supervised adaptive discriminative discretization scheme to reduce the significant information loss incurred by the discretization step in state-of-the-art naive Bayes classifiers.
• We propose to boost the performance of the NB classifier from a discretization perspective, using a Max-Relevancy-Min-Divergence discretization scheme.
• We propose a feature augmentation method employing a stacked autoencoder to enhance the discrimination power of the NB classifier by exploiting the intrinsic structure of the data residing in the original feature space.
Date of Award: Jul 2024
Original language: English
Awarding Institution
  • University of Nottingham
Supervisors: Jianfeng Ren, Ruibin Bai, Yuan Yao & Tieyan Liu

Keywords

  • Naive Bayes
  • Classification
  • Data reduction
  • Data transformation
