Abstract Submission Status

14F-250 Oral Presentation

A Hybrid Model To Predict The Long-term Survival In Lung Cancer Patients Using C5.0 Decision Tree Algorithm
김대준1, 유우식1, 정희석1, 이창영1, 백효채1, 정경영1, 손창식2, 김윤년2
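1 Department of Thoracic and Cardiovascular Surgery, Severance Hospital, Yonsei University College of Medicine; 2 Department of Medical Informatics, Keimyung University College of Medicine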

Purpose: Data mining is a powerful technology that has been applied in various fields of science. The aim of this study was to develop a prediction model of long-term survival in lung cancer patients using a decision tree algorithm.

Methods: Conventional statistical approaches were used for feature selection: the chi-square test, Fisher's exact test, the Mann-Whitney U test, and forward stepwise (Wald) logistic regression. After pre-processing, the final model was constructed using the C5.0 decision tree algorithm in Clementine 12.0.
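To make the feature selection step above more concrete, the following is a minimal sketch of univariate screening with the chi-square test, written in plain Python. It is not the authors' Clementine 12.0 workflow: the feature names, the 2x2 counts, and the 0.05 threshold are hypothetical, and the sketch covers only the chi-square test (without continuity correction), not Fisher's exact test, the Mann-Whitney U test, or logistic regression.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom, the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def screen_features(tables, alpha=0.05):
    """Keep the features whose chi-square p-value falls below alpha."""
    kept = []
    for name, (a, b, c, d) in tables.items():
        _, p_value = chi_square_2x2(a, b, c, d)
        if p_value < alpha:
            kept.append(name)
    return kept


# Hypothetical counts, NOT data from the study. Each tuple is
# (factor present & died, factor present & survived,
#  factor absent & died,  factor absent & survived).
example_tables = {
    "hypothetical_feature_a": (40, 10, 20, 30),
    "hypothetical_feature_b": (10, 20, 30, 40),
}

selected = screen_features(example_tables)
print(selected)
```

Features that pass such a screen would then be candidates for the multivariable step and for the decision tree. Fisher's exact test is usually preferred over the chi-square test when expected cell counts are small.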

Results: From January 2001 to February 2010, 1538 curative resections were performed, and 1422 patients had stage I-IIIA disease. After excluding 104 patients who received neoadjuvant treatment and 29 patients with operative mortality, 1289 patients were enrolled. In univariate analysis, 26 variables were risk factors for long-term survival. Univariate Cox proportional hazards regression analysis based on the Classification & Regression Tree (CART) model revealed 8 rulesets, with accuracy, sensitivity, specificity, and AUC of 73.5%, 59.9%, 81.7%, and 70.5%, respectively. The rulesets revealed the following risk factors for long-term survival: (1) open surgery, stage I, functional class 1-2, FEV1 <88.8% (HR 1.499, 95% CI 1.085-2.071, p=0.014); (2) open surgery, stage II-IIIA, functional class 0, multistation nodal involvement (HR 2.194, 95% CI 1.772-2.715, p<0.001); and (3) open surgery, stage II-IIIA, functional class 1-3, FEV1 <116.5% (HR 2.385, 95% CI 1.935-2.940, p<0.001).
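As a reading aid for the hazard ratios above, the sketch below checks that each reported HR, 95% CI, and p-value are mutually consistent. It assumes the intervals are Wald intervals that are symmetric on the log scale, which is the usual form for Cox regression output but is not stated in the abstract. The function name and the ruleset labels are illustrative only.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_check(hr, ci_low, ci_high):
    """Recover the standard error, z statistic, and two-sided p-value
    implied by a hazard ratio and its 95% CI, assuming a Wald interval
    that is symmetric around log(HR)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# Values copied from the Results; labels are illustrative.
reported = {
    "ruleset 1": (1.499, 1.085, 2.071),
    "ruleset 2": (2.194, 1.772, 2.715),
    "ruleset 3": (2.385, 1.935, 2.940),
}

implied_p = {}
for label, (hr, low, high) in reported.items():
    _, z, p_value = wald_check(hr, low, high)
    implied_p[label] = p_value
    print(f"{label}: z = {z:.2f}, implied p = {p_value:.3g}")
```

Under this assumption the implied p-value for the first ruleset is about 0.014, and those for the second and third are well below 0.001, in line with the reported values.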

Conclusion: The decision tree algorithm can be applied to predict long-term survival in lung cancer patients. This suggests that data mining tools have the potential to provide new insights into the management of cancer patients.


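Corresponding author: 김대준
Department of Thoracic and Cardiovascular Surgery, Severance Hospital, Yonsei University College of Medicine
Contact: 김대준, Tel: 02-2228-2140, E-mail: kdjcool@yuhs.ac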
