Predicting alcohol dependence from multi-site brain structural measures

Open Access
Authors
  • A. Heinz
  • R. Hester
  • K. Hutchinson
  • F. Kiefer
  • O. Korucuoglu
  • T. Lett
  • C.-S.R. Li
  • E. London
  • V. Lorenzetti
  • L. Maartje
  • R. Momenan
  • C. Orr
  • M. Paulus
  • L. Schmaal
  • R. Sinha
  • Z. Sjoerds
  • D.J. Stein
  • E. Stein
  • R.J. van Holst
  • D. Veltman
  • H. Walter
  • R.W. Wiers
  • M. Yucel
  • P.M. Thompson
  • P. Conrod
  • N. Allgaier
  • H. Garavan
Publication date 01-2022
Journal Human Brain Mapping
Volume | Issue number 43 | 1
Pages (from-to) 555-565
Number of pages 11
Organisations
  • Faculty of Social and Behavioural Sciences (FMG) - Psychology Research Institute (PsyRes)
Abstract

To identify neuroimaging biomarkers of alcohol dependence (AD) from structural magnetic resonance imaging, it may be useful to develop classification models that are explicitly generalizable to unseen sites and populations. This problem was explored in a mega-analysis of previously published datasets from 2,034 AD and comparison participants spanning 27 sites curated by the ENIGMA Addiction Working Group. Data were grouped into a training set used for internal validation, comprising 1,652 participants (692 AD, 24 sites), and a test set used for external validation, comprising 382 participants (146 AD, 3 sites). An exploratory data analysis was first conducted, followed by an evolutionary search-based feature selection to identify site-generalizable, high-performing subsets of brain measurements. Exploratory data analysis revealed that inclusion of case-only and control-only sites led to the inadvertent learning of site effects. Cross-validation methods that do not properly account for site can drastically overestimate performance. Evolutionary-based feature selection leveraging leave-one-site-out cross-validation, to combat unintentional learning of site effects, identified cortical thickness in the left superior frontal gyrus and right lateral orbitofrontal cortex, cortical surface area in the right transverse temporal gyrus, and left putamen volume as final features. Ridge regression restricted to these features yielded a test-set area under the receiver operating characteristic curve of 0.768. These findings evaluate strategies for handling multi-site data with varied underlying class distributions and identify potential biomarkers for individuals with current AD.
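
The site-effect problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used ridge regression and an evolutionary feature search); the toy sites, labels, and the helper names `leave_one_site_out` and `site_memorizer` are hypothetical and chosen only for illustration. The "model" deliberately learns nothing but each site's majority label, which is enough to look accurate under a site-naive split when case-only and control-only sites are present, and which fails once whole sites are held out.

```python
from collections import Counter, defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx), holding out one whole site per fold."""
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(i for s, idx in by_site.items() if s != held_out for i in idx)
        yield held_out, train_idx, test_idx


def site_memorizer(train_sites, train_labels):
    """Deliberately poor 'model': predicts the majority label of a known site,
    and the overall training majority for a site it has never seen."""
    overall = Counter(train_labels).most_common(1)[0][0]
    per_site = defaultdict(Counter)
    for site, label in zip(train_sites, train_labels):
        per_site[site][label] += 1

    def predict(site):
        if site in per_site:
            return per_site[site].most_common(1)[0][0]
        return overall

    return predict


# Hypothetical toy data: site A is case-only, site B is control-only, site C is mixed.
sites = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]

# Site-naive split: every site appears in both the training and the test half.
naive_train = list(range(0, len(sites), 2))
naive_test = list(range(1, len(sites), 2))
predict = site_memorizer([sites[i] for i in naive_train], [labels[i] for i in naive_train])
naive_acc = sum(predict(sites[i]) == labels[i] for i in naive_test) / len(naive_test)

# Leave-one-site-out: the held-out site is never seen during training.
correct = 0
for held_out, train_idx, test_idx in leave_one_site_out(sites):
    predict = site_memorizer([sites[i] for i in train_idx], [labels[i] for i in train_idx])
    correct += sum(predict(sites[i]) == labels[i] for i in test_idx)
loso_acc = correct / len(sites)

print(f"site-naive accuracy: {naive_acc:.3f}")
print(f"leave-one-site-out accuracy: {loso_acc:.3f}")
```

On this toy data the site-naive estimate is far higher than the leave-one-site-out estimate, even though the model uses no brain measurements at all. The gap arises entirely from site identity acting as a proxy for diagnosis.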

Document type Article
Note In special issue: The ENIGMA Consortium: the first 10 years. - With supplementary files
Language English
Published at https://doi.org/10.1002/hbm.25248
Other links https://github.com/sahahn/Alc_Dep https://www.scopus.com/pages/publications/85092600277