Environmental and omics-related marker panels for the prediction of autoantibody positivity through integrated machine learning feature selection

This abstract has open access
Abstract Summary

Background: A significant gap in the study of the onset and progression of the autoimmune response in Type 1 Diabetes (T1D) is the lack of biomarkers that can accurately predict and monitor these processes. Two goals of The Environmental Determinants of Diabetes in the Young (TEDDY) study are therefore to identify new biomarkers and to obtain a mechanistic understanding of the autoimmune response leading to T1D.

Methods: As a first step towards these goals, we performed integrative machine learning on multiple omics datasets (genomics, metabolomics and lipidomics), together with associated metadata, from a nested case-control cohort in TEDDY, a prospective study of children at higher genetic risk of developing diabetes. We applied a pipeline that first identified the best machine learning classification algorithm for each data type. We then performed feature selection in the context of a Bayesian integration across the multiple data sources, using cross-validation and optimization via simulated annealing, which generated a likelihood that each feature is important to the panel predicting seroconversion. A hold-out set of 25% of the available case-control pairs was used to validate the model.

Results: We performed predictive modeling with the goal of identifying the features that separate cases from controls at 3, 6, 9 and 12 months prior to seroconversion. The number of case-control pairs with adequate data across the multiple omics and demographic data ranged from 208 to 336 across the time points. The matched, cross-validated training sets returned an average area under the Receiver Operating Characteristic (ROC) curve of over 0.71, and the feature sets included dozens of disparate markers that were distributed relatively uniformly across the various data sources. This proof-of-principle study demonstrates the ability to identify candidate biomarker panels for prediction of time to seroconversion.
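The Methods above can be illustrated with a small, self-contained sketch. It is not the TEDDY pipeline: the data are synthetic, the function names (`split_pairs`, `auc`, `subset_auc`, `anneal`) are hypothetical, and the scorer is a plain sum of selected features that assumes higher values indicate cases. The Bayesian integration across data sources and the inner cross-validation are omitted. The sketch only shows three ideas named in the abstract: holding out 25% of matched pairs as whole pairs, scoring a feature panel by ROC AUC, and using simulated annealing to estimate how often each feature belongs in the panel.

```python
import math
import random

def split_pairs(pair_ids, holdout_frac=0.25, seed=0):
    """Split matched case-control pairs so both members of a pair stay together."""
    ids = sorted(set(pair_ids))
    random.Random(seed).shuffle(ids)
    n_holdout = round(len(ids) * holdout_frac)
    return ids[n_holdout:], ids[:n_holdout]  # (training pairs, hold-out pairs)

def auc(scores, labels):
    """ROC AUC by comparing every case with every control; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def subset_auc(rows, labels, mask):
    """Score each sample by the sum of its selected features, then compute AUC.
    Assumption: features are oriented so that higher values indicate cases."""
    scores = [sum(x for x, keep in zip(row, mask) if keep) for row in rows]
    return auc(scores, labels)

def anneal(rows, labels, n_iter=500, t0=0.1, seed=0):
    """Simulated annealing over feature subsets. Returns the best mask found and
    the fraction of iterations in which each feature was in the current subset."""
    rng = random.Random(seed)
    n = len(rows[0])
    mask = [rng.random() < 0.5 for _ in range(n)]
    current = subset_auc(rows, labels, mask)
    best_mask, best = mask[:], current
    counts = [0] * n
    for i in range(n_iter):
        temp = t0 * (1 - i / n_iter) + 1e-9      # linear cooling schedule
        cand = mask[:]
        j = rng.randrange(n)
        cand[j] = not cand[j]                    # flip one feature in or out
        score = subset_auc(rows, labels, cand)
        # Always accept improvements; accept worse subsets with a probability
        # that shrinks as the temperature falls.
        if score >= current or rng.random() < math.exp((score - current) / temp):
            mask, current = cand, score
        if current > best:
            best_mask, best = mask[:], current
        for k, keep in enumerate(mask):
            counts[k] += keep
    return best_mask, [c / n_iter for c in counts]

# Synthetic stand-in data: 80 matched pairs, 6 features, of which
# features 0 and 1 are shifted upward in cases (label 1).
rng = random.Random(1)
n_pairs, n_features = 80, 6
data = []  # (pair_id, label, feature row)
for pid in range(n_pairs):
    for label in (1, 0):
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        if label == 1:
            row[0] += 1.0
            row[1] += 1.0
        data.append((pid, label, row))

train_ids, holdout_ids = split_pairs([d[0] for d in data])
train_set, holdout_set = set(train_ids), set(holdout_ids)
train = [d for d in data if d[0] in train_set]
holdout = [d for d in data if d[0] in holdout_set]

best_mask, inclusion_freq = anneal([d[2] for d in train], [d[1] for d in train])
holdout_auc = subset_auc([d[2] for d in holdout], [d[1] for d in holdout], best_mask)
print("inclusion frequency per feature:", [round(f, 2) for f in inclusion_freq])
print("hold-out AUC:", round(holdout_auc, 3))
```

Splitting by pair identifier rather than by individual sample keeps each case and its matched control on the same side of the split, which avoids leaking matching information into the hold-out set. The inclusion frequencies are only a rough analogue of the per-feature likelihood described in the abstract.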

Submission ID: IDS13180
Computing & Analytics Division, Pacific Northwest National Laboratory
University of South Florida, USA
Center for Public Health Genomics University of Virginia, United States
Institute of Diabetes Research, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich-Neuherberg, Germany
University of Turku
Pacific Northwest Diabetes Research Institute, USA
Department of Epidemiology, Colorado School of Public Health, University of Colorado Denver, Aurora, Colorado, United States
Barbara Davis Center for Diabetes, University of Colorado School of Medicine
Biological Sciences Division, Pacific Northwest National Laboratory
Biological Sciences Division, Pacific Northwest National Laboratory
Computing & Analytics Division, Pacific Northwest National Laboratory
Computing & Analytics Division, Pacific Northwest National Laboratory
Computing & Analytics Division, Pacific Northwest National Laboratory
Barbara Davis Center

KEY DATES

Event dates:
Thursday 25 October - Monday 29 October 2018

Contact
British Society for Immunology
+44 (0)20 3019 5901
congress@immunology.org