Epileptic seizure detection: a comparative study between deep and traditional machine learning techniques
Rekha Sahu1, Satya Ranjan Dash2, Lleuvelyn A Cacha3, Roman R Poznanski4, Shantipriya Parida5,*
1School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, 751024, India
2School of Computer Application, KIIT University, Bhubaneswar, Odisha, 751024, India
3Faculty of Health Science, Universiti Sultan Zainal Abidin, Gong Badak Campus, Darul Iman, Terengganu, 21300, Malaysia
4Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin, Besut Campus, Besut, Terengganu, 22200, Malaysia
5Idiap Research Institute, Centre du Parc, Rue Marconi 19, Martigny, CH-1920, Switzerland
Abstract
Electroencephalography (EEG) is the recording of the brain's electrical activity and can be used to diagnose seizure disorders. By identifying brain activity patterns and their correspondence with symptoms and diseases, it is possible to give patients an accurate diagnosis and appropriate drug therapy. This work aims to categorize electroencephalography signals recorded on different channels in order to classify and predict epileptic seizures. The dataset of electroencephalography recordings contains 179 attributes and 11,500 instances. The instances fall into five categories, one of which corresponds to epileptic seizure. We apply traditional, ensemble, and deep machine learning techniques and compare their performance on the epileptic seizure detection task. A one-dimensional convolutional neural network and ensemble machine learning techniques, namely bagging, boosting (AdaBoost, gradient boosting, and XGBoost), and stacking, are implemented. Traditional machine learning techniques, namely decision tree, random forest, extra tree, ridge classifier, logistic regression, K-Nearest Neighbor, Gaussian Naive Bayes, and kernel Support Vector Machine (polynomial and Gaussian), are also used for classifying and predicting epileptic seizures. Before applying the ensemble and traditional techniques, we preprocess the dataset using the Karl Pearson coefficient of correlation to eliminate irrelevant attributes. The classification and prediction accuracy of each classifier is then estimated using k-fold cross-validation, and the Receiver Operating Characteristic Area Under the Curve is reported for each classifier. After ranking and comparing the algorithms, we find that the convolutional neural network and the extra tree bagging classifier perform better than all other ensemble and traditional classifiers.
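The correlation-based preprocessing step can be illustrated with a minimal sketch. It assumes that an attribute is treated as irrelevant when the absolute Pearson correlation between that attribute and the class label falls below a cut-off. The abstract does not state the criterion or the cut-off, so the function names, the `threshold` parameter, and the toy data below are all hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def select_features(rows, labels, threshold=0.1):
    """Return indices of columns whose |r| with the label is >= threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    n_features = len(rows[0])
    kept = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        if abs(pearson_r(column, labels)) >= threshold:
            kept.append(j)
    return kept

# Toy data: column 0 tracks the label, column 1 is constant,
# column 2 is uncorrelated with the label.
rows = [
    [1.0, 5.0, 2.0],
    [2.0, 5.0, 1.0],
    [8.0, 5.0, 1.0],
    [9.0, 5.0, 2.0],
]
labels = [0, 0, 1, 1]
kept = select_features(rows, labels, threshold=0.5)
print(kept)
```

In this toy example only the first column survives the filter. On the real dataset the same loop would run over all attributes, with the cut-off chosen by the practitioner.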
Submitted: 03 February 2020
Accepted: 04 March 2020
Published: 30 March 2020
*Corresponding Author(s):
Shantipriya Parida
E-mail: shantipriya.parida@idiap.ch
Figure 1. Overview of the work on the EEG dataset, with the implementation of CNN, ensemble, and traditional machine learning algorithms. The EEG dataset is preprocessed (except for the CNN model) to eliminate irrelevant features and is split into training and test datasets. The training and test datasets are used to train the traditional, ensemble, and deep learning models and to classify recordings as epileptic seizure or non-seizure.
Table 1 Summary of the epileptic EEG data. All Set (A-E) contains five healthy subjects.
Subjects | Set A 100 subjects | Set B 100 subjects | Set C 100 subjects | Set D 100 subjects | Set E 100 subjects
Patient's state | Epilepsy seizure | Having tumor | Healthy | Eye closed | Eye opened
Number of text files containing recordings of EEG signals | 100 files, each with 4096 samples of one EEG time series | 100 files, each with 4096 samples of one EEG time series | 100 files, each with 4096 samples of one EEG time series | 100 files, each with 4096 samples of one EEG time series | 100 files, each with 4096 samples of one EEG time series
Time duration (s) | 23.6 | 23.6 | 23.6 | 23.6 | 23.6
Figure 2. Performance of the CNN model, shown as the ROC curve of the CNN classifier. The area under the curve is 0.99, which indicates a high true positive rate at low false positive rates.
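For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on hypothetical scores. It illustrates the metric only and is not the evaluation code used for the figures.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores: label 1 = seizure, label 0 = non-seizure.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration. A rank-based computation would be preferable for a dataset of 11,500 instances.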
Table 2 Accuracy and Standard Deviation of different machine learning techniques.
Machine Learning Techniques | Accuracy | Standard Deviation
Decision tree | 0.8886 | +/- 0.0014
Random Forest classifier | 0.9517 | +/- 0.0009
Extra tree classifier | 0.9435 | +/- 0.0030
Kernel Support Vector Machine (polynomial) | 0.9349 | +/- 0.0010
Kernel Support Vector Machine (Gaussian) | 0.9420 | +/- 0.0037
Naïve Bayes classifier | 0.9430 | +/- 0.0011
Logistic regression | 0.8048 | +/- 0.0006
K-nearest neighbor classifier | 0.9301 | +/- 0.0015
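An accuracy and standard deviation pair like those in Table 2 can be obtained by k-fold cross-validation. The sketch below assumes that the reported spread is the standard deviation of the per-fold accuracies, which the text does not state. The nearest-centroid classifier, the value of `k`, and the toy data are stand-ins chosen to keep the example self-contained.

```python
import statistics

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier: assign each test point to the closest class mean."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in test_x]

def cross_val_accuracy(xs, ys, k):
    """Return (mean, standard deviation) of accuracy over k folds."""
    fold_scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        test_x = [xs[i] for i in fold]
        test_y = [ys[i] for i in fold]
        preds = nearest_centroid_predict(train_x, train_y, test_x)
        fold_scores.append(sum(p == t for p, t in zip(preds, test_y)) / len(fold))
    return statistics.mean(fold_scores), statistics.pstdev(fold_scores)

# Toy one-feature data with alternating labels so every fold sees both classes.
xs = [[0.0], [10.0], [1.0], [11.0], [0.5], [10.5], [1.5], [9.5]]
ys = [0, 1, 0, 1, 0, 1, 0, 1]
mean_acc, std_acc = cross_val_accuracy(xs, ys, k=4)
print(f"{mean_acc:.4f} +/- {std_acc:.4f}")
```

The folds here are contiguous and unshuffled for simplicity. Real data would normally be shuffled or stratified by class before splitting.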
Figure 3. Representation of the ROC curve of traditional machine learning techniques. Random Forest: ROC, AUC = 1.000; Extra Tree: ROC, AUC = 1.000; K-NN: ROC, AUC = 0.997; Logistic Regression: ROC, AUC = 0.538; Decision Tree: ROC, AUC = 0.767.
Table 3 Different Bagging Classifiers' accuracy.
Base Estimators for Bagging | Accuracy (average manipulation) | Standard Deviation (average manipulation) | Accuracy (voting to estimators) | Standard Deviation (voting to estimators)
K-nearest neighbors classifier | 0.9393 | +/- 0.0005 | 0.93 | +/- 0.00
Kernel Support Vector Machine (Gaussian) | 0.9448 | +/- 0.0015 | 0.94 | +/- 0.01
Ridge classifier | 0.8000 | +/- 0.0001 | 0.80 | +/- 0.00
Logistic regression | 0.8008 | +/- 0.0003 | 0.80 | +/- 0.00
Decision tree classifier | 0.9019 | +/- 0.0041 | 0.89 | +/- 0.00
Naïve Bayes classifier (Gaussian) | 0.9427 | +/- 0.0017 | 0.94 | +/- 0.00
Kernel Support Vector Machine (polynomial) | 0.9309 | +/- 0.0019 | - | -
Random Forest classifier | 0.9474 | +/- 0.0014 | 0.95 | +/- 0.00
Extra tree classifier | 0.966 | +/- 0.0007 | 0.95 | +/- 0.00
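Table 3 reports two ways of combining the bagged estimators. The sketch below shows how the two rules can disagree, on the assumption that "average manipulation" means averaging the estimators' predicted probabilities and that "voting" means a majority vote over their hard labels. This reading of the column headers is an interpretation, and the probabilities are hypothetical.

```python
from collections import Counter

def combine_by_average(prob_lists, threshold=0.5):
    """Average each sample's seizure probability across estimators, then threshold."""
    n_estimators = len(prob_lists)
    averaged = [sum(sample) / n_estimators for sample in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]

def combine_by_vote(prob_lists, threshold=0.5):
    """Threshold each estimator first, then take the majority label per sample."""
    hard = [[1 if p >= threshold else 0 for p in probs] for probs in prob_lists]
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*hard)]

# Hypothetical seizure probabilities from three bagged estimators on four samples.
# Each inner list holds one estimator's outputs.
prob_lists = [
    [0.90, 0.40, 0.20, 0.45],
    [0.80, 0.40, 0.10, 0.45],
    [0.70, 0.95, 0.30, 0.70],
]

averaged_labels = combine_by_average(prob_lists)
voted_labels = combine_by_vote(prob_lists)
print(averaged_labels)
print(voted_labels)
```

The second and fourth samples show the difference: one confident estimator can pull the average above the threshold even when it is outvoted. An odd number of estimators is used so that the vote cannot tie.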
Figure 4. ROC AUC representation of bagging classifiers. Bagging Random Forest: ROC, AUC = 0.995; Bagging Extra Tree: ROC, AUC = 0.998; Meta-bagging K-NN: ROC, AUC = 0.994; Meta-bagging Logistic Regression: ROC, AUC = 0.570; Meta-bagging Decision Tree: ROC, AUC = 0.935.
Table 4 Accuracy of Boosting algorithms implementation.
Boosting Methods | Accuracy | Standard Deviation
Ada Boost | 0.93 | +/- 0.00
Gradient boosting algorithm | 0.95 | +/- 0.00
XG Boost Algorithm | 0.95 | +/- 0.00
Figure 5. ROC AUC of Boosting classifiers. Ada Boost: ROC, AUC = 0.965; Grad Boost: ROC, AUC = 0.980; XGB Boost: ROC, AUC = 0.981.
Table 5 Stacking implementation accuracy.
Base Estimators for Stacking | Accuracy (voting to estimators) | Standard Deviation (voting to estimators)
K-nearest neighbors classifier | 0.9301 | +/- 0.0015
Logistic regression | 0.8048 | +/- 0.0006
Decision tree classifier | 0.8886 | +/- 0.0014
Naïve Bayes classifier (Gaussian) | 0.9430 | +/- 0.0011
Random forest classifier | 0.9470 | +/- 0.0029
Extra tree classifier | 0.9435 | +/- 0.0030
Stack classifier (second-level classifier: logistic regression) | 0.9510 | +/- 0.0009
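The last row of Table 5 uses logistic regression as the second-level classifier. The sketch below illustrates the idea: the first-level classifiers' predictions become the input features of a logistic regression fitted to the true labels. The gradient-descent fit, the learning rate, the epoch count, and the toy predictions are assumptions for illustration, not the configuration used in the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j in range(d):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(features, weights, bias):
    """Label a sample 1 when the fitted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) >= 0.5 else 0
        for x in features
    ]

# Hypothetical first-level outputs: each row holds two base classifiers'
# predicted labels for one sample. The first classifier is reliable here,
# the second is not.
base_predictions = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
true_labels = [1, 1, 0, 0, 1, 0]

weights, bias = fit_logistic(base_predictions, true_labels)
stacked = predict(base_predictions, weights, bias)
print(stacked)
```

In practice the first-level predictions fed to the second level are usually produced out-of-fold, so that the second-level classifier is not trained on outputs the base classifiers generated for their own training data.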
Figure 6. ROC AUC representation of the stacking implementation. Stacking Random Forest: ROC, AUC = 1.000; Stacking Extra Tree: ROC, AUC = 1.000; Stacking K-NN: ROC, AUC = 0.997; Stacking Logistic Regression: ROC, AUC = 0.538; Stacking Decision Tree: ROC, AUC = 0.767; Stacking 2nd level classifier logistic regression: ROC, AUC = 1.000.
Table 6 Summary of optimal accuracy of classification for comparative study.
Classifiers | Accuracy | ROC AUC
CNN | 0.96 | 0.99
Extra Tree Bagging (Average) | 0.96 | 1.00
Gradient Boosting | 0.95 | 0.98
XG Boosting | 0.95 | 0.98
Stacking | 0.95 | 1.00
Random Forest | 0.95 | 1.00
Figure 7. Performance summary showing the ROC AUC of all the optimal classifiers (traditional, ensemble, and deep learning). CNN and Bagging Extra Tree outperform the classifiers based on the conventional machine learning approach.