Kim, Park, Choi, Lim, Ok, Noh, Song, Kang, Lee, and Kim: Application of Machine Learning to Predict Weight Loss in Overweight and Obese Patients on Korean Medicine Weight Management Program
Original Article
The Journal of Korean Medicine 2020; 41(2): 58-79.
Objectives
The purpose of this study is to predict weight loss by applying machine learning to real-world clinical data from overweight and obese adults enrolled in a weight loss program at 4 Korean Medicine obesity clinics.
Methods
From January 2017 to May 2019, we collected data from overweight and obese adults (BMI ≥ 23 kg/m²) who registered for a 3-month Gamitaeeumjowi-tang prescription program. Predictive analyses were conducted at each of three prescription time points. At each point, the reduced rate and reduced weight expected at the next prescription order were predicted as a binary classification, using the highest quartile, the median, or the lowest quartile as the classification benchmark. For the median benchmark, further analysis was conducted after variable selection. The data set sizes were 25,988 for the first analysis, 6,304 for the second, and 833 for the third. Five-fold cross-validation was used to guard against overfitting.
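The binary labeling and 5-fold split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`benchmark_value`, `binarize_by_benchmark`, `k_fold_indices`) and the ten reduced-rate values are hypothetical, and two details are assumptions because the source does not state them: values equal to the benchmark are counted in the positive class, and folds are formed after a seeded shuffle.

```python
import random
import statistics


def benchmark_value(values, benchmark):
    """Return the cut-off for 'lowest_quartile', 'median', or 'highest_quartile'."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"lowest_quartile": q1, "median": q2, "highest_quartile": q3}[benchmark]


def binarize_by_benchmark(values, benchmark="median"):
    """Label a value 1 if it is at or above the cut-off, else 0 (assumed convention)."""
    cut = benchmark_value(values, benchmark)
    return [1 if v >= cut else 0 for v in values]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded shuffle is an assumption
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Hypothetical reduced rates (%) for ten patients.
reduced_rate = [1.2, 2.5, 3.1, 3.8, 4.0, 4.4, 5.0, 5.6, 6.1, 6.9]
labels = binarize_by_benchmark(reduced_rate, "median")
splits = list(k_fold_indices(len(reduced_rate), k=5))
```

A classifier would then be fitted on each `train` index set and scored on the matching `test` set, with the five scores averaged.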
Results
Prediction accuracy increased from the 1st to the 2nd and 3rd analyses. After variable selection with the median as the benchmark, the artificial neural network showed the highest accuracy for reduced rate in the 1st (54.69%), 2nd (73.52%), and 3rd (81.88%) prediction analyses. Prediction performance was additionally assessed by AUC: Random Forest showed the highest AUC for reduced weight in the 1st (0.640), 2nd (0.816), and 3rd (0.939) prediction analyses.
Conclusions
Applying machine learning to predict weight loss showed that accuracy improved when initial weight loss information was used. The approach could potentially be used to screen for patients who need intensive intervention because their expected weight loss is low.
A schematic diagram of prediction analyses of weight loss.
The analysis for predicting weight loss was divided into three parts, and the weight loss at each time point refers to the change from the initial point of treatment to the point of weight report.
Fig. 4
Receiver operating characteristics (ROC) curves
Table 1A
Independent Variables Used in the First Analysis (n=25,988)
Reduced rate of 1st benchmark (%) = (initial weight − weight at 2nd prescription) / initial weight × 100
Reduced rate of 2nd benchmark (%) = (initial weight − weight at 3rd prescription) / initial weight × 100
Reduced rate of 3rd benchmark (%) = (initial weight − weight at last weight report) / initial weight × 100
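As a worked illustration of the three formulas above, the sketch below computes the reduced rate at each benchmark for one hypothetical patient. The function names and the weights are invented for the example. The source does not give a formula for reduced weight; it is assumed here to be the plain difference from the initial weight, in line with the figure caption's description of weight loss as the change from the initial point of treatment.

```python
def reduced_weight(initial_kg, later_kg):
    """Weight lost since the start of treatment, in kg (assumed definition)."""
    return initial_kg - later_kg


def reduced_rate(initial_kg, later_kg):
    """Weight lost since the start of treatment, as a percentage of initial weight."""
    if initial_kg <= 0:
        raise ValueError("initial weight must be positive")
    return (initial_kg - later_kg) / initial_kg * 100


# Hypothetical patient: weights (kg) at the start and at each later report.
initial = 80.0
later_weights = {
    "1st benchmark (2nd prescription)": 76.0,
    "2nd benchmark (3rd prescription)": 74.0,
    "3rd benchmark (last weight report)": 72.0,
}

rates = {name: reduced_rate(initial, w) for name, w in later_weights.items()}
losses = {name: reduced_weight(initial, w) for name, w in later_weights.items()}

for name in later_weights:
    print(f"{name}: {losses[name]:.1f} kg, {rates[name]:.2f}%")
```

Every benchmark uses the initial weight as the denominator, so the rates are cumulative from the start of treatment rather than changes between consecutive visits.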
Table 3
Features of First and Fourth Quartile based on First Prediction Analysis
| Variable | Reduced Rate: More Than Upper 25% (n=6,554) | Reduced Rate: Less Than Lower 25% (n=6,527) | Reduced Weight: More Than Upper 25% (n=6,499) | Reduced Weight: Less Than Lower 25% (n=6,507) |
| --- | --- | --- | --- | --- |
| Age (years) | 34.92 ± 9.58 | 37.52 ± 10.59 | 34.59 ± 9.37 | 37.84 ± 10.67 |
| Gender, female (n, %) | 5,930 (90) | 5,918 (91) | 5,400 (83) | 6,118 (94) |
| Gender, male (n, %) | 624 (10) | 609 (9) | 1,099 (17) | 389 (6) |
| Height (cm) | 162.28 ± 6.67 | 161.95 ± 6.71 | 164.22 ± 7.32 | 160.94 ± 6.26 |
| Weight (kg) | 72.76 ± 10.78 | 72.74 ± 10.5 | 77.68 ± 12.01 | 70.06 ± 9.26 |
| BMI (kg/m²) | 27.56 ± 3.01 | 27.68 ± 3.07 | 28.71 ± 3.23 | 27.01 ± 2.81 |
| Diet 1 (n, %) | 1,908 (29) | 1,426 (22) | 1,902 (29) | 1,393 (21) |
| Diet 2 (n, %) | 2,689 (41) | 2,436 (37) | 2,593 (40) | 2,482 (38) |
| Diet 3 (n, %) | 1,957 (30) | 2,665 (41) | 2,004 (31) | 2,632 (40) |
| RR (%) | 5.89 ± 0.82 | 1.67 ± 0.79 | 5.76 ± 0.98 | 1.71 ± 0.84 |
| RW (kg) | 4.29 ± 0.87 | 1.22 ± 0.61 | 4.42 ± 0.77 | 1.18 ± 0.56 |
Data are expressed as n (%) for categorical variables and mean ± SD for continuous variables.
Diet 1: Weight Loss Experience_None; Diet 2: Diet, exercise only or weight loss drug for less than 3 months; Diet 3: Weight Loss Experience_Weight Loss Drug over 3 Months; RR: Reduced Rate; RW: Reduced Weight
Table 4
Model Performance according to Variables Ranking Based on Feature Importance
1st Analysis

| Ranking | Reduced Rate: Variables | DT (%) | RF (%) | LR (%) | ANN (%) | Reduced Weight: Variables | DT (%) | RF (%) | LR (%) | ANN (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 7 | | | | | | Diet 1 | 52.19 | 52.19 | 52.19 | 52.44 |
| 6 | | | | | | MD_S | 54.51 | 54.51 | 54.51 | 54.71 |
| 5 | | | | | | Diet 3 | 54.51 | 54.51 | 54.51 | 54.71 |
| 4 | | | | | | Gender | 55.07 | 55.07 | 55.07 | 55.22 |
| 3 | Diet 3 | 52.39 | 52.39 | 52.39 | 53.44 | Age | 55.91 | 55.61 | 55.46 | 56.48 |
| 2 | Weight | 52.53 | 52.64 | 52.39 | 53.44 | BMI | 58.61 | 59.20 | 58.56 | 58.90 |
| 1 | Age | 54.06 | 54.05 | 53.79 | 54.69 | Weight | 58.78 | 60.06 | 59.05 | 59.95 |

2nd Analysis

| Ranking | Reduced Rate: Variables | DT (%) | RF (%) | LR (%) | ANN (%) | Reduced Weight: Variables | DT (%) | RF (%) | LR (%) | ANN (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8 | | | | | | Gender | 54.65 | 54.65 | 54.65 | 53.44 |
| 7 | | | | | | SAS 1–2_G | 57.14 | 57.14 | 57.14 | 57.63 |
| 6 | SAS 1–2_G | 54.44 | 54.44 | 54.44 | 55.50 | MD_S | 57.77 | 57.77 | 57.77 | 58.60 |
| 5 | SWL1–2_B | 56.40 | 56.40 | 56.40 | 56.76 | SWL1–2_B | 59.73 | 59.73 | 59.83 | 60.01 |
| 4 | Weight | 57.24 | 56.50 | 56.50 | 57.12 | Age | 58.56 | 58.67 | 60.15 | 61.02 |
| 3 | Age | 55.97 | 57.03 | 57.40 | 59.18 | SWL1–2_G | 61.31 | 62.21 | 62.58 | 63.52 |
| 2 | SWL1–2_G | 60.94 | 61.52 | 62.42 | 62.21 | Weight | 64.16 | 64.64 | 64.11 | 66.23 |
| 1 | RR 1–2 | 72.04 | 71.83 | 70.51 | 73.52 | RR 1–2 | 73.41 | 73.15 | 72.20 | 75.33 |

3rd Analysis

| Ranking | Reduced Rate: Variables | DT (%) | RF (%) | LR (%) | ANN (%) | Reduced Weight: Variables | DT (%) | RF (%) | LR (%) | ANN (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | Age | 54.80 | 52.80 | 53.60 | 58.95 | Age | 59.20 | 59.20 | 58.00 | 61.95 |
| 3 | Weight3 | 52.80 | 56.80 | 50.40 | 57.63 | Weight | 59.20 | 64.80 | 60.80 | 67.71 |
| 2 | RR 1–2 | 70.80 | 73.60 | 69.20 | 71.55 | RW 2–3 | 71.60 | 76.80 | 72.00 | 76.11 |
| 1 | RR 1–3 | 80.00 | 81.20 | 81.60 | 81.88 | RW 1–3 | 86.00 | 86.00 | 82.00 | 83.67 |
DT: Decision Tree; RF: Random Forest; LR: Logistic Regression; ANN: Artificial Neural Network; Diet 3: Weight Loss Experience_Weight Loss Drug over 3 Months; MD_S: Patients with Medication Dose Change_Stable; Diet 1: Weight Loss Experience_None; SAS 1–2_G: Satiety and Appetite Suppression 1–2_Good; SWL1–2_B: Satisfaction with Weight Loss 1–2_Bad; SWL1–2_G: Satisfaction with Weight Loss 1–2_Good; RR: Reduced Rate; RW: Reduced Weight
Table 5
Prediction Model Performance Results based on AUC
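| Analysis | Outcome | Algorithm | Sensitivity | Specificity | AUC |
| --- | --- | --- | --- | --- | --- |
| 1st Analysis | Reduced Rate | DT | 0.557 | 0.524 | 0.551 |
| 1st Analysis | Reduced Rate | RF | 0.591 | 0.489 | 0.557 |
| 1st Analysis | Reduced Rate | LR | 0.576 | 0.499 | 0.546 |
| 1st Analysis | Reduced Rate | ANN | 0.574 | 0.497 | 0.550 |
| 1st Analysis | Reduced Weight | DT | 0.709 | 0.465 | 0.630 |
| 1st Analysis | Reduced Weight | RF | 0.609 | 0.591 | 0.640 |
| 1st Analysis | Reduced Weight | LR | 0.582 | 0.602 | 0.631 |
| 1st Analysis | Reduced Weight | ANN | 0.426 | 0.748 | 0.620 |
| 2nd Analysis | Reduced Rate | DT | 0.722 | 0.719 | 0.791 |
| 2nd Analysis | Reduced Rate | RF | 0.748 | 0.690 | 0.789 |
| 2nd Analysis | Reduced Rate | LR | 0.734 | 0.677 | 0.785 |
| 2nd Analysis | Reduced Rate | ANN | 0.805 | 0.606 | 0.785 |
| 2nd Analysis | Reduced Weight | DT | 0.784 | 0.687 | 0.802 |
| 2nd Analysis | Reduced Weight | RF | 0.764 | 0.700 | 0.816 |
| 2nd Analysis | Reduced Weight | LR | 0.761 | 0.685 | 0.801 |
| 2nd Analysis | Reduced Weight | ANN | 0.880 | 0.557 | 0.798 |
| 3rd Analysis | Reduced Rate | DT | 0.762 | 0.836 | 0.890 |
| 3rd Analysis | Reduced Rate | RF | 0.787 | 0.836 | 0.890 |
| 3rd Analysis | Reduced Rate | LR | 0.852 | 0.781 | 0.897 |
| 3rd Analysis | Reduced Rate | ANN | 0.828 | 0.789 | 0.880 |
| 3rd Analysis | Reduced Weight | DT | 0.873 | 0.848 | 0.937 |
| 3rd Analysis | Reduced Weight | RF | 0.873 | 0.848 | 0.939 |
| 3rd Analysis | Reduced Weight | LR | 0.847 | 0.795 | 0.905 |
| 3rd Analysis | Reduced Weight | ANN | 0.788 | 0.856 | 0.920 |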
DT: Decision Tree; RF: Random Forest; LR: Logistic Regression; ANN: Artificial Neural Network; AUC: Area Under the Curve (0.90 – 1.00: excellent, 0.80 – 0.90: good, 0.70 – 0.80: fair, 0.60 – 0.70: poor, 0.50 – 0.60: fail)
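The three metrics reported in Table 5 can be computed as in the sketch below, which also applies the AUC grading scale from the footnote. This is an illustration with hypothetical labels and scores, not the authors' evaluation code. AUC is computed here through its rank interpretation: the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. Two choices are assumptions: the lower bound of each grade band is treated as inclusive, and values below 0.50, which the footnote's scale does not cover, are grouped with "fail".

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grade_auc(value):
    """Map an AUC to the footnote's grades (inclusive lower bounds are an assumption)."""
    if value >= 0.90:
        return "excellent"
    if value >= 0.80:
        return "good"
    if value >= 0.70:
        return "fair"
    if value >= 0.60:
        return "poor"
    return "fail"  # also returned below 0.50, which the footnote does not cover


# Hypothetical labels (1 = above the benchmark) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f} ({grade_auc(area)})")
```

Sensitivity and specificity depend on the chosen probability threshold, whereas AUC summarizes ranking quality across all thresholds.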