Predictive Analytics

Artificial Intelligence Assisted Prediction of Late Onset Cardiomyopathy among Childhood Cancer Survivors

PI: Fatma Gunturken
Co-Investigators (UTHSC CBMI): Robert L. Davis, Oguz Akbilgic
Co-Investigators (St. Jude): Gregory T. Armstrong, John Jefferies, Kirsten K. Ness, Daniel M. Green, John Lucas, Deokumar Srivastava, Melissa M. Hudson, Leslie L. Robison, Daniel Mulrooney, Elsayed Z. Soliman, Ibrahim Karabayir

Early identification of childhood cancer survivors at high risk for treatment-related cardiomyopathy may improve outcomes by enabling timely intervention. We plan to use the baseline electrocardiogram (ECG) recommended by the Children’s Oncology Group (COG) guidelines to predict future cardiomyopathy. We will apply signal processing and deep learning tools to 12-lead ECGs obtained from 1,217 adult survivors of childhood cancer who are ≥ 18 years of age and ≥ 10 years from diagnosis. Subjects will be limited to those without evidence of cardiomyopathy at baseline who are prospectively followed in the St. Jude Lifetime Cohort (SJLIFE) Study. Data will include clinical and echocardiographic assessments of cardiac function performed at baseline and follow-up evaluations and graded per a modified version of the Common Terminology Criteria for Adverse Events (CTCAE). Machine learning approaches will include genetic algorithms and extreme gradient boosting (XGBoost).
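The inclusion criteria above can be written as a simple filter. The sketch below is illustrative only: the `Survivor` record and its field names are hypothetical and do not reflect the SJLIFE data model.

```python
from dataclasses import dataclass


@dataclass
class Survivor:
    # Hypothetical record layout, used only for illustration.
    age_years: float
    years_since_diagnosis: float
    cardiomyopathy_at_baseline: bool


def is_eligible(s: Survivor) -> bool:
    """Apply the stated criteria: adult (>= 18 years), >= 10 years from
    diagnosis, and no evidence of cardiomyopathy at baseline."""
    return (
        s.age_years >= 18
        and s.years_since_diagnosis >= 10
        and not s.cardiomyopathy_at_baseline
    )


def select_cohort(survivors):
    """Return only the survivors who meet every criterion."""
    return [s for s in survivors if is_eligible(s)]
```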

The Research Enterprise Data Warehouse (rEDW) consolidates data from multiple clinical sources to provide a unified view of each patient. It allows researchers and clinicians to retrieve patient- and encounter-level data. The rEDW contains data on patient demographics, laboratory test results, pharmacy information, pathology reports, hospital admission and discharge dates, ICD-9/10 codes, CPT codes, discharge summaries, and progress notes.

Currently, rEDW contains data (starting from 01/01/2014) from

  • Methodist Le Bonheur Healthcare
  • UT Medical Center Knoxville 

We are in the process of receiving data from

  • University Clinical Health
  • Regional One Health
  • Thomas, Nashville
  • Erlanger Health Systems, Chattanooga 

MLH (incl. Hospitals, clinics):

  • Total Encounters – 11,404,889
    • Pediatric (<=18) – 3,211,129 (32.69%)
    • Adult (>18) – 8,193,760 (67.31%)
  • Total Patients – 1,361,448
    • Gender
      • Male – 616,305 (45.27%)
      • Female – 744,164 (54.66%)
      • Unknown – 979 (0.07%)
    • Patient Type
      • Pediatric (<=18) – 428,539 (31.48%)
      • Adults (>18) – 932,909 (68.52%)
    • Race
      • Asian – 16,198 (1.19%)
      • Black or African American – 636,903 (46.78%)
      • White – 567,536 (41.69%)
      • Multiple – 11,528 (0.85%)
      • Other/Unknown – 119,694 (8.79%)
      • Hispanic – 9,589 (0.70%) 
  • Procedure – 17,389,039
  • Diagnosis – 59,626,426
  • Lab events – 161,377,648
  • Medications – 77,619,494
  • Vital Signs – 234,849,995


  • ICD-9 diagnosis and procedure information is available for patient encounters from January 1st, 2014 to September 30th, 2015
  • ICD-10 diagnosis and procedure information is available for patient encounters from October 1st, 2015 onward
  • Data refresh is done on a daily basis
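
When querying diagnoses or procedures across the coding transition, the encounter date determines which code system applies. A minimal sketch of that rule follows, assuming the dates listed above. The function and constant names are hypothetical and are not part of any rEDW interface.

```python
from datetime import date

# Dates taken from the notes above; the names are illustrative only.
REDW_START = date(2014, 1, 1)
ICD10_START = date(2015, 10, 1)


def icd_version(encounter_date: date) -> str:
    """Return the code system expected for an encounter on the given date."""
    if encounter_date < REDW_START:
        raise ValueError("encounter predates the data available in the rEDW")
    return "ICD-10" if encounter_date >= ICD10_START else "ICD-9"
```

For example, `icd_version(date(2015, 9, 30))` falls in the ICD-9 period, while an encounter one day later falls in the ICD-10 period.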

UTMCK (Hospital visits only):

  • Total Encounters – 673,747
    • Pediatric (<=18) – 46,720 (6.94%)
    • Adult (>18) – 626,956 (93.06%)
    • Unknown – 71 (0.01%)
  • Total Patients – 294,896
    • Gender
      • Male – 136,569 (46.31%)
      • Female – 158,254 (53.66%)
      • Unknown – 74 (0.03%)
    • Patient Type
      • Pediatric (<=18) – 37,090 (12.58%)
      • Adults (>18) – 257,781 (87.41%)
      • Unknown – 25 (0.01%)
    • Race
      • Asian – 1,689 (0.57%)
      • Black or African American – 20,974 (7.11%)
      • White – 258,033 (87.48%)
      • Multiple – 5,862 (1.99%)
      • Other/Unknown – 8,412 (2.85%)
  • Procedure – 1,246,984
  • Diagnosis – 5,720,026
  • Lab events – 26,171,258
  • Medications – 15,747,446
  • Vital Signs – 100,352,948


  • Data from 01/01/2014 to 05/31/2020

PIs: Akbilgic and Kamaleswaran

Co-Investigators: Samuel Goldman, UCSF and Robert Davis, MD, MPH


Parkinson’s disease (PD) is a progressive, disabling neurodegenerative disorder that is understood to have three developmental phases: preclinical, premotor (or prodromal), and motor [1]–[3]. This project will discover novel noninvasive electrocardiogram (EKG) features of PD onset that can be processed within an artificial intelligence framework to identify patients in the prodromal phase of PD. Such knowledge can facilitate the recruitment of subjects for disease-modifying clinical trials. PD is a systemic disorder with widespread anatomic involvement and consequent nonmotor symptoms, including cardiac sympathetic denervation [4],[5]. In their existing MJFF-funded work, co-PI Goldman and his team have shown statistically significant differences in heart rate variability (HRV) metrics, determined from standard 10-second EKGs, in prevalent PD subjects compared to controls [6]. The same HRV metrics did not, however, robustly distinguish prodromal PD from controls. As suggested by our preliminary data, artificial intelligence and machine learning can be used to uncover hidden, novel EKG markers that enable accurate classification of subjects with prodromal PD. Building on Goldman’s MJFF work, our hypothesis is that artificial intelligence can help distinguish between subjects who will develop PD and those who will not. We will test our hypothesis using two cohorts, one for model building and internal validation and one for external validation. In the first cohort, we will use standard EKG recordings from participants in the Honolulu Asia Aging Study (HAAS) to develop a classification model that distinguishes prodromal PD from controls. We will externally validate the HAAS-derived model using EKGs from members of Methodist Le Bonheur Healthcare (MLH), Memphis, TN.
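
To illustrate what an HRV metric from a short EKG looks like, the sketch below computes two widely used time-domain measures, SDNN and RMSSD, from R-peak times. The text above does not specify which HRV metrics were used in the cited work, so the choice of these two is an assumption, and the function names are illustrative.

```python
import math
from statistics import mean, stdev


def rr_intervals_ms(r_peak_times_s):
    """Convert R-peak times (seconds) into successive RR intervals (ms)."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (needs >= 2 intervals)."""
    return stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR differences (needs >= 2 intervals)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(mean(d * d for d in diffs))
```

A 10-second recording typically contains only about ten beats, so estimates of this kind rest on very few intervals.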

PI: Fatma Gunturken

Co-Investigators: Kenneth Ataga, Oguz Akbilgic, Robert L. Davis

Predicting Rapid Kidney Function Decline in Sickle Cell Disease

The objective of this study is to predict rapid decline in kidney function, and to identify its risk factors, in patients with severe sickle cell disease (SCD) genotypes (HbSS/HbSβ0 thalassemia) using machine learning. The design is a retrospective cohort study of patients who received care at a single academic medical center and were followed from 2004 to 2013. We will define rapid decline in kidney function using estimated glomerular filtration rate (eGFR) loss thresholds of >3.0 or >5.0 mL/min/1.73 m² per year. Logistic regression and machine learning algorithms, including classification and regression trees (CART), random forest, AdaBoost, gradient boosting, and extreme gradient boosting (XGBoost), will be used to predict rapid decline in kidney function six and twelve months in advance.
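
One way to turn the eGFR loss thresholds into an outcome label is to estimate each patient's annual eGFR slope and compare the loss against the threshold. The sketch below uses an ordinary least-squares slope over repeated measurements. This is an assumption: the study description does not state how the annual loss is computed, and the function names are hypothetical.

```python
def annual_egfr_slope(times_years, egfr_values):
    """Least-squares slope of eGFR (mL/min/1.73 m^2) versus time in years.

    A negative slope means kidney function is declining.
    """
    n = len(times_years)
    if n != len(egfr_values) or n < 2:
        raise ValueError("need at least two paired measurements")
    t_mean = sum(times_years) / n
    y_mean = sum(egfr_values) / n
    sxx = sum((t - t_mean) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("measurements must span more than one time point")
    sxy = sum((t - t_mean) * (y - y_mean)
              for t, y in zip(times_years, egfr_values))
    return sxy / sxx


def is_rapid_decline(times_years, egfr_values, threshold=3.0):
    """True when the annual eGFR loss exceeds the threshold (3.0 or 5.0)."""
    return -annual_egfr_slope(times_years, egfr_values) > threshold
```

With measurements of 100, 96, and 92 at years 0, 1, and 2, the estimated loss is 4 per year, which counts as rapid decline under the 3.0 threshold but not under the 5.0 threshold.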

May 26, 2022