Abstract found on PubMed
Objective: The accurate prediction of seizure freedom after epilepsy surgery remains challenging. We investigated whether 1) training more complex models, 2) recruiting larger sample sizes, or 3) using data-driven selection of clinical predictors would improve our ability to predict post-operative seizure outcome using clinical features. We also conducted the first substantial external validation of a machine learning model trained to predict post-operative seizure outcome.
Methods: We performed a retrospective cohort study of 797 children who had undergone resective or disconnective epilepsy surgery at a tertiary center. We extracted patient information from medical records and trained three models – a logistic regression, a multilayer perceptron, and an XGBoost model – to predict one-year post-operative seizure outcome on our dataset. We evaluated the performance of a recently published XGBoost model on the same patients. We further investigated the impact of sample size on model performance, using learning curve analysis to estimate performance at samples up to N=2,000. Finally, we examined the impact of predictor selection on model performance.
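The learning curve analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic (generated by the hypothetical helper `make_synthetic`), the model is a plain logistic regression fitted by gradient descent, and all names and hyperparameters are assumptions made for illustration. It shows only the empirical part of such an analysis, training on progressively larger subsets and scoring each model on the same held-out set; extrapolating to sizes beyond the available sample would need an additional curve-fitting step that is not shown.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_synthetic(n, rng, true_w=(1.0, -1.0, 0.5)):
    # Synthetic stand-in for clinical predictors; NOT the study data.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in true_w]
        p = sigmoid(sum(w * xi for w, xi in zip(true_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y

def fit_logistic(X, y, epochs=200, lr=0.1):
    # Batch gradient descent on the log-loss, without regularization.
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, t in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - t
            for j in range(d):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def accuracy(w, b, X, y):
    # Fraction of cases where the thresholded prediction matches the label.
    hits = 0
    for x, t in zip(X, y):
        pred = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) >= 0.5
        hits += int(pred == (t == 1))
    return hits / len(X)

def learning_curve(sizes, n_test=500, seed=0):
    # Train on the first n pool samples for each n; score on one fixed test set.
    rng = random.Random(seed)
    X_pool, y_pool = make_synthetic(max(sizes), rng)
    X_test, y_test = make_synthetic(n_test, rng)
    curve = []
    for n in sizes:
        w, b = fit_logistic(X_pool[:n], y_pool[:n])
        curve.append((n, accuracy(w, b, X_test, y_test)))
    return curve

curve = learning_curve([50, 100, 200, 400, 800])
for n, acc in curve:
    print(n, round(acc, 3))
```

In a sketch like this, held-out accuracy typically rises quickly at small sample sizes and then flattens near the ceiling set by how informative the predictors are, which is the pattern a plateauing learning curve describes.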
Results: Our logistic regression achieved an accuracy of 72% (95% CI = 68-75%, AUC = 0.72), while our multilayer perceptron and XGBoost both achieved accuracies of 71% (multilayer perceptron: 95% CI = 67-74%, AUC = 0.70; our XGBoost: 95% CI = 68-75%, AUC = 0.70). There was no significant difference in performance between our three models (all P > 0.4), and they all performed better than the external XGBoost, which achieved an accuracy of 63% on our data (95% CI = 59-67%, AUC = 0.62; P = 0.005 versus our logistic regression, P = 0.01 versus our multilayer perceptron, P = 0.01 versus our XGBoost). All models showed improved performance with increasing sample size, but limited improvements beyond our current sample. The best model performance was achieved with data-driven feature selection.
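To give a sense of how wide a confidence interval on an accuracy estimate is at this sample size, the sketch below computes a Wilson score interval for a proportion. The abstract does not state how its intervals were derived, so the choice of the Wilson method, the function name `wilson_interval`, and the use of the full cohort size (N = 797) as the denominator are all assumptions.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    # Wilson score interval for a binomial proportion (z=1.96 gives ~95%).
    # Assumption: this is one common choice, not necessarily the study's method.
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Accuracy of 72% evaluated on an assumed 797 patients.
lo, hi = wilson_interval(0.72, 797)
print(f"{lo:.3f} to {hi:.3f}")
```

Under these assumptions the interval spans roughly 0.69 to 0.75, similar in width to the reported one, which suggests that sampling uncertainty of a few percentage points either way is expected for a cohort of this size.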
Significance: We show that neither the deployment of complex machine learning models nor the assembly of thousands of patients alone is likely to generate significant improvements in our ability to predict post-operative seizure freedom. We instead propose that improved feature selection alongside collaboration, data standardization, and model sharing is required to advance the field.