Development and validation of a meta-learner for combining statistical and machine learning prediction models in individuals with depression.
BORIS DOI
Date of Publication
May 16, 2022
Publication Type
Article
Division/Institute
Author
Liu, Qiang
De Crescenzo, Franco
Ostinelli, Edoardo Giuseppe
Li, Zhenpeng
Tomlinson, Anneka
Cipriani, Andrea
Series
BMC Psychiatry
ISSN or ISBN (if monograph)
1471-244X
Publisher
BioMed Central
Language
English
Publisher DOI
PubMed ID
35578254
Uncontrolled Keywords
Description
BACKGROUND
The debate over whether machine learning models offer advantages over standard statistical methods when making predictions is ongoing. We discuss the use of a meta-learner model combining both approaches as an alternative.
METHODS
To illustrate the development of a meta-learner, we used a dataset of 187,757 people with depression. Using 31 variables, we aimed to predict two outcomes measured 60 days after initiation of antidepressant treatment: severity of depressive symptoms (continuous) and all-cause dropouts (binary). We fitted a ridge regression and a multi-layer perceptron (MLP) deep neural network as two separate prediction models ("base-learners"). We then developed two "meta-learners", combining predictions from the two base-learners. To compare the performance across the different methods, we calculated mean absolute error (MAE, for continuous outcome) and the area under the receiver operating characteristic curve (AUC, for binary outcome) using bootstrapping.
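The general idea of combining two base-learners and scoring the result with a bootstrapped MAE can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the two prediction vectors merely stand in for the ridge and MLP base-learners, the meta-learner is reduced to a single least-squares blending weight, and all function names (`fit_blend_weight`, `blend`, `bootstrap_mae`) are hypothetical.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_blend_weight(y_true, pred_a, pred_b):
    """Least-squares weight w for the blend w*pred_a + (1 - w)*pred_b."""
    num = sum((t - b) * (a - b) for t, a, b in zip(y_true, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    # If both base-learners agree everywhere, any weight gives the same blend.
    return num / den if den else 0.5


def blend(w, pred_a, pred_b):
    """Apply a fitted blending weight to two sets of base-learner predictions."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]


def bootstrap_mae(y_true, y_pred, n_boot=200, seed=0):
    """Approximate 95% percentile interval for MAE by resampling cases."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(mae([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Synthetic stand-ins (assumption): a continuous outcome and two noisy predictors.
rng = random.Random(42)
y = [rng.gauss(10, 5) for _ in range(400)]
pred_ridge = [t + rng.gauss(0, 4) for t in y]  # stand-in for the ridge base-learner
pred_mlp = [t + rng.gauss(0, 3) for t in y]    # stand-in for the MLP base-learner

# Fit the blending weight on one half, evaluate on the other half.
half = 200
w = fit_blend_weight(y[:half], pred_ridge[:half], pred_mlp[:half])
pred_meta = blend(w, pred_ridge[half:], pred_mlp[half:])
low, high = bootstrap_mae(y[half:], pred_meta)

print("blend weight on ridge stand-in:", round(w, 3))
print("MAE ridge stand-in:", round(mae(y[half:], pred_ridge[half:]), 3))
print("MAE MLP stand-in:", round(mae(y[half:], pred_mlp[half:]), 3))
print("MAE blended:", round(mae(y[half:], pred_meta), 3))
print("bootstrap 95% interval for blended MAE:", (round(low, 3), round(high, 3)))
```

Fitting the weight on one half and evaluating on the other keeps the combination step from being scored on the same cases it was tuned on. The same resampling loop could be applied to AUC for the binary outcome.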
RESULTS
Compared to the best performing base-learner (MLP base-learner, MAE at 4.63, AUC at 0.59), the best performing meta-learner showed a 2.49% decrease in MAE at 4.52 for the continuous outcome and a 6.47% increase in AUC at 0.60 for the binary outcome.
CONCLUSIONS
A meta-learner approach may effectively combine multiple prediction models. Choosing between statistical and machine learning models may not be necessary in practice.
File(s)
File | File Type | Format | Size | License | Publisher/Copyright statement | Content | |
---|---|---|---|---|---|---|---|
s12888-022-03986-0.pdf | text | Adobe PDF | 1009.95 KB | published |