This manuscript considers a data-mining approach for the prediction of mild obstructive sleep disordered breathing, defined as an elevated respiratory disturbance index (RDI), in 5,530 participants in a community-based study, the Sleep Heart Health Study.
The prediction algorithm was built using modern ensemble learning methods, specifically boosting, which allowed potential high-dimensional interactions between predictor variables or classifiers to be assessed. To evaluate the algorithm's performance, the data were split into training and validation sets, and classification was examined across varying thresholds on the predicted probability of a high RDI (≥ 7 events per hour in the given results).
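
To make the threshold-based evaluation concrete, the following is a minimal sketch of how sensitivity and specificity could be computed on a validation set at several classification thresholds. It is not the authors' code: the function names, the example probabilities and labels, and the thresholds 0.3, 0.5, and 0.7 are all hypothetical and chosen only for illustration.

```python
def confusion_at_threshold(probs, labels, threshold):
    """Count (TP, FP, TN, FN) when 'high RDI' is predicted for probs >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted_high = p >= threshold
        if predicted_high and y:
            tp += 1
        elif predicted_high and not y:
            fp += 1
        elif not predicted_high and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(probs, labels, threshold):
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, tn, fn = confusion_at_threshold(probs, labels, threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical validation-set predictions (NOT data from the manuscript):
# predicted probability of a high RDI, and the observed label (1 = high RDI).
val_probs = [0.10, 0.35, 0.40, 0.62, 0.80, 0.90]
val_labels = [0, 0, 1, 0, 1, 1]

# Hypothetical low, moderate, and high classification thresholds.
for threshold in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(val_probs, val_labels, threshold)
    print(f"threshold={threshold:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is why results reported at a single threshold should be read alongside the others.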
Based on a moderate classification threshold from the boosting algorithm, the estimated post-test odds of a high R...