Open Access

Preliminary Evaluation of a Novel Artificial Intelligence-based Prediction Model for Surgical Site Infection in Colon Cancer

Cancer Diagnosis & Prognosis 2022 Nov-Dec; 2(6): 691-696. DOI: 10.21873/cdp.10161
Received 23 June 2022 | Revised 04 December 2023 | Accepted 22 August 2022
Corresponding author
Junichi Mazaki, 6-7-1 Nishishinjuku, Shinjuku-ku, Tokyo 160-0023, Japan. Tel: +81 333426111, Fax: +81 333451437


Background/Aim: There are few studies on artificial intelligence-based prediction models for colon cancer built using clinicopathological factors. Here, we aimed to perform a preliminary evaluation of a novel artificial intelligence-based prediction model for surgical site infection (SSI) in patients with stage II-III colon cancer. Patients and Methods: The medical records of 730 patients who underwent radical surgery for stage II-III colon cancer between 2000 and 2018 at our institute were retrospectively analyzed. Kaplan–Meier curves were used to examine the association between SSI and oncological outcomes (recurrence-free survival time). Next, we used the machine learning software Prediction One to predict SSI. Receiver-operating characteristic curve analysis was used to evaluate the accuracy of the artificial intelligence model. Results: Recurrence-free survival was significantly poorer in patients with SSI (p=0.005, 95% confidence interval=4892.061-5525.251). The area under the curve of the artificial intelligence model in predicting SSI was 0.731. Conclusion: As SSI is an important prognostic factor associated with oncological outcomes, the prediction of SSI occurrence is important. Based on our preliminary evaluation, the artificial intelligence model for predicting SSI in patients with stage II-III colon cancer was as accurate as the previously reported model derived through conventional statistical analysis.
Keywords: Artificial intelligence (AI), colon cancer, surgical site infection (SSI)

An increasing amount of evidence suggests that postoperative infection is associated with poorer long-term outcomes in various malignancies (1,2). Surgical site infection (SSI) is a common postoperative complication that occurs in 5-40% of patients undergoing colorectal surgery (3,4). Postoperative SSI, an important marker of surgical quality, increases treatment costs, delays the initiation of adjuvant therapy, affects quality of life, and may be associated with premature mortality (2).

Preoperative weakening of immunity and undernutrition are listed as risk factors for SSI in the US Centers for Disease Control and Prevention (CDC) guidelines, and various systemic inflammatory and nutritional scores have been reported as useful SSI predictors (5,6). The roles of the systemic inflammatory response and nutritional status in SSI have been increasingly recognized, as reflected in the numerous immunological and nutritional markers shown to affect SSI incidence in different types of cancer, including colon cancer. These markers include the modified Glasgow Prognostic score (mGPS), C-reactive protein to albumin ratio (CAR, which is an index of cancer cachexia), Controlling Nutritional Status score (CONUT score, which detects potential malnutrition), prognostic nutritional index (PNI, which is calculated using the serum albumin levels and peripheral lymphocyte count), and lymphocyte to monocyte ratio (LMR, which reflects the balance between the tumor-promoting environment and antitumor immunity) (7-9).
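For orientation, the ratio-based markers above can be computed from routine admission laboratory values. The following Python sketch uses commonly cited definitions that are not spelled out in this paper and should be read as assumptions: PNI as 10 × albumin (g/dL) + 0.005 × lymphocyte count (/mm³), and mGPS with cut-offs of CRP 1.0 mg/dL and albumin 3.5 g/dL. The function and parameter names are hypothetical, and CONUT is omitted because it also requires total cholesterol.

```python
def lymphocyte_monocyte_ratio(lymphocytes_per_mm3, monocytes_per_mm3):
    """LMR: peripheral lymphocyte count divided by monocyte count."""
    return lymphocytes_per_mm3 / monocytes_per_mm3


def crp_albumin_ratio(crp_mg_dl, albumin_g_dl):
    """CAR: serum C-reactive protein divided by serum albumin."""
    return crp_mg_dl / albumin_g_dl


def prognostic_nutritional_index(albumin_g_dl, lymphocytes_per_mm3):
    """PNI (assumed definition): 10 x albumin + 0.005 x lymphocyte count."""
    return 10 * albumin_g_dl + 0.005 * lymphocytes_per_mm3


def modified_glasgow_prognostic_score(crp_mg_dl, albumin_g_dl):
    """mGPS (assumed cut-offs): 0 if CRP <= 1.0 mg/dL; otherwise
    1 if albumin >= 3.5 g/dL, or 2 if albumin < 3.5 g/dL."""
    if crp_mg_dl <= 1.0:
        return 0
    return 2 if albumin_g_dl < 3.5 else 1


# Hypothetical admission values, for illustration only (not study data).
lmr = lymphocyte_monocyte_ratio(1500, 500)
car = crp_albumin_ratio(0.2, 4.0)
pni = prognostic_nutritional_index(4.0, 1500)
mgps = modified_glasgow_prognostic_score(0.2, 4.0)
```

The units (mg/dL for CRP, g/dL for albumin) are also an assumption; a laboratory that reports CRP in mg/L would need the cut-offs rescaled.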

Many studies have attempted to predict SSI following colon surgery by using prediction models, which include various clinicopathological factors. Such models rely on conventional statistical analyses, such as multivariate analyses and nomograms. The predictive ability of these models was high, as assessed by a receiver-operating characteristic (ROC) curve analysis or the concordance index (C-index), and their area under the curve (AUC) values were all approximately 0.75 (10-14). This led us to consider the applicability of artificial intelligence (AI)-based prediction models, which have developed rapidly in recent years.

The third AI boom is under way, and AI is evolving rapidly. AI, particularly machine learning and deep learning, has been applied in clinical cancer research, where it has markedly improved cancer prediction performance. However, to our knowledge, no studies have been published on building AI-based prediction models for colon cancer using clinicopathological factors.

Therefore, in this study, we aimed to build a novel AI-based prediction model (AI model) for SSI in stage II-III colon cancer using immunological and nutritional markers and to perform a preliminary evaluation of its performance.

Patients and Methods

Patients. The medical records of 730 patients who underwent radical surgery for stage II-III colon cancer between January 2000 and August 2018 at Tokyo Medical University Hospital were retrospectively analyzed. Information pertaining to the following 25 variables was obtained: age, sex, body mass index (BMI), number of hospitalization days, smoking status, insulin use, elective/emergency surgery, laparoscopic/open surgery, histological cancer type, maximum tumor diameter (tumor size), sidedness (tumor location), blood loss, operative time, pathological T-stage (T-stage), pathological N-stage (N-stage), pathological stage, venous invasion (v), lymphatic invasion (ly), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), LMR, CAR, mGPS, PNI, and CONUT. Patients were generally hospitalized 2 days before surgery, and we evaluated the laboratory data obtained on the day of admission. This study was approved by the Institutional Review Board of Tokyo Medical University Hospital.

SSI. The definition of SSI included superficial and deep incision infections classified based on CDC criteria, as well as wound infections with the following characteristics: presence of purulent fluid or pus in the wound incision and local redness and warming of the surgical site. This study included patients with Clavien-Dindo grade 1-3b complications. ROC curve analysis was used to investigate the incidence of SSI and predictive accuracy of each factor (5,6).
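As a reminder of what the ROC-based accuracy measure represents, the AUC equals the probability that a randomly chosen patient with SSI receives a higher score than a randomly chosen patient without SSI, with ties counted as one half. The sketch below is a minimal pure-Python illustration of that pairwise definition; it is not the routine used by SPSS or Prediction One, and the function name and toy values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via the pairwise (Mann-Whitney) formulation.

    labels: 1 for SSI, 0 for no SSI; scores: higher means higher predicted risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # SSI case ranked above non-SSI case
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(positives) * len(negatives))


# Toy example (not study data): 3 of the 4 SSI/non-SSI pairs are ranked correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The double loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based implementation would be preferable for large datasets.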

Conventional statistical analysis. A univariable analysis of the association between SSI and oncological outcomes [recurrence-free survival (RFS)] was performed using the Kaplan–Meier method and log-rank test. All statistical analyses were performed using the SPSS software (IBM® SPSS® Statistics for Windows, Version 25.0; IBM, Chicago, IL, USA). p<0.05 was considered statistically significant.
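The Kaplan–Meier method estimates survival as a running product over event times: at each time with at least one event, the current estimate is multiplied by (1 − events / number at risk). The following sketch illustrates that product-limit calculation in plain Python under stated assumptions; it does not reproduce the SPSS analysis, it omits the log-rank test, and the function name and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time per patient; events: 1 if recurrence or death
    was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients leave the risk set
    return curve


# Toy follow-up in years (not study data); the second patient is censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients reduce the number at risk for later time points without lowering the survival estimate themselves, which is why the curve only steps down at observed events.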

AI. We used the machine learning software Prediction One (Sony Network Communications Inc.; Higashishinagawa, Shinagawa-ku, Tokyo, Japan) to predict SSI incidence in stage II-III colon cancer. The software generates feature vectors from the dataset using standard preprocessing methods, including one-hot encoding for categorical variables and normalization for numerical variables. Gradient boosting trees and a neural network are used as the supervised machine learning models in the software. Each model was trained with hyperparameter tuning; subsequently, an ensemble of the two trained models was constructed for prediction. To evaluate the accuracy of the AI model, we calculated the AUC using ROC curve analysis with internal validation. Prediction One also evaluates the “importance of variables” (IOV) using a method based on permutation feature importance. This method calculates the difference in model output when a single variable is removed. The value of the difference indicates the extent to which the model depends on the variable. The difference was computed for each covariate and averaged over the dataset. An IOV value of ≥0.020 was considered significant.
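The idea behind a permutation-style importance measure can be sketched as follows. This is a generic illustration, not Prediction One's internal implementation, which is not described here: the sketch shuffles the values of one variable across patients (rather than removing the variable) and records the mean absolute change in the model output. The function names, the toy model, and the toy data are all hypothetical.

```python
import random


def permutation_importance(predict, rows, n_repeats=10, seed=0):
    """Mean absolute change in model output when one column is shuffled.

    predict: function mapping one row (a sequence of feature values) to a score.
    rows: list of rows; every row has the same number of features.
    Returns one importance value per feature.
    """
    rng = random.Random(seed)
    baseline = [predict(row) for row in rows]
    importances = []
    for j in range(len(rows[0])):
        total_change = 0.0
        for _ in range(n_repeats):
            shuffled = [row[j] for row in rows]
            rng.shuffle(shuffled)
            for row, new_value, base in zip(rows, shuffled, baseline):
                permuted = list(row)
                permuted[j] = new_value  # break the link between feature j and the row
                total_change += abs(predict(permuted) - base)
        importances.append(total_change / (n_repeats * len(rows)))
    return importances


def toy_model(row):
    """Hypothetical risk score that depends only on the first feature."""
    hospital_days, _ignored = row
    return min(1.0, 0.02 * hospital_days)


# Toy rows of (hospitalization days, an unrelated value); not study data.
toy_rows = [(5, 1.0), (10, 3.0), (20, 2.0), (40, 0.5)]
toy_iov = permutation_importance(toy_model, toy_rows)
```

In this toy setting the second feature receives an importance of exactly zero because the model ignores it, whereas the first receives a positive value. Many implementations instead report the drop in a performance metric such as AUC after shuffling; which variant a given tool uses should be checked against its documentation.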

Results

Patient and tumor characteristics. The baseline patient and tumor characteristics are summarized in Table I. SSI occurred in 92 patients (12.6%). Among these patients, 73 had superficial infections, and 19 had deep infections. The Clavien-Dindo classification grades were as follows: grade 1, 62 patients; grade 2, 11 patients; grade 3a, 13 patients; and grade 3b, 6 patients.

Kaplan–Meier analysis of RFS in patients with and without SSI. Eighty-four patients had local or systemic recurrence, and 89 patients died. In total, 134 patients experienced recurrence or died. The median (range) time to recurrence was 1.53 (0.03-7.9) years, and the time to recurrence or death was 1.84 (0.01-17.4) years. The median (range) RFS was 4.46 (0.01-19.1) years. The risk of recurrence or death significantly increased after the patient developed SSI (p=0.005) (Figure 1).

AI analysis. The AI model was used to predict the occurrence of SSI based on 25 covariates. Of the 730 patients, data from 630 and 100 patients were used for model training and validation, respectively. The ROC curve for the AI model is shown in Figure 2. The AUC value was 0.731, which was comparable to that of the previously reported model derived through conventional statistical analysis. Prediction One was used to calculate the IOV of each factor in predicting SSI occurrence. The IOV values are presented in Table II. The following are the factors with IOV ≥0.020 presented in decreasing order of the IOV values: hospitalization (0.0816), blood loss (0.0453), LMR (0.0408), insulin use (0.0407), differentiation (0.0390), laparoscopy (0.0351), emergency surgery (0.0336), tumor size (0.0334), lymphatic invasion (0.0296), CAR (0.0267), operative time (0.0254), CA19-9 (0.0249), and smoking status (0.0245).

Discussion

This study involved a preliminary evaluation of an AI-based SSI prediction model that included various clinicopathological factors observed in patients with colon cancer. In previous models based on conventional statistical analysis, all the reported values for the AUC and C-index were approximately 0.75 (10,11,13,14). The accuracy of our AI model, with an AUC value of 0.731, was as good as that of the previously reported models. Moreover, the AI model uses easily obtainable data on simple clinicopathological factors, and the cost of constructing the model is low.

Postoperative infections contribute to poor survival (1,2). SSI is one of the most important complications of colorectal surgery (3,4). Therefore, the identification of SSI predictors that can be used in clinical practice is a pressing issue. A cancerous state often activates a systemic inflammatory reaction, and inflammation in turn reduces immunity (7). In addition, impaired nutrient intake is occasionally seen during the perioperative period in patients with colorectal cancer, which leads to an immunocompromised state (8). Therefore, nutritional management is indispensable for the success of colorectal surgery. Although various systemic inflammatory and nutritional scores have been used to predict postoperative complications (9), no useful SSI predictors have been reported for colon cancer. In this study, Prediction One ranked LMR and CAR high in terms of IOV. This finding suggests that inflammatory and nutritional scores are associated with SSI.

The use of machine learning has become widespread in clinical research. The types of learning used by computers are subclassified into categories, such as supervised and unsupervised learning. Supervised learning begins with the prediction of a known output or target. Therefore, it is often used for the estimation of risk in medical research (15,16). In deep learning, unsupervised learning is initially used to identify robust features, and subsequently, these features are refined and can ultimately be used as predictors in the final supervised model. Deep neural networks (DNNs), also known as deep learning networks, are used in many AI applications (17). Multiple or multivariate logistic regression fits multiple parameters in prediction models by assuming that predictors are linearly and additively related to an outcome. However, nonlinear problems commonly occur when these models are applied in the field of human physiology because of complex interactions. Therefore, linear models might not be capable of adequately predicting outcomes, which may explain the differences observed in predictive accuracy between the multivariate and AI models.
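For reference, the linear and additive assumption mentioned above can be written out explicitly. In a multiple logistic regression with predictors $x_1, \dots, x_k$ and SSI probability $p$ (the symbols are introduced here for illustration and do not appear elsewhere in this paper), the log-odds are modelled as a weighted sum:

```latex
\[
\log\frac{p}{1-p} \;=\; \beta_0 + \sum_{i=1}^{k} \beta_i x_i
\]
```

Each predictor contributes its own term $\beta_i x_i$ regardless of the values of the others, so interactions and threshold effects are captured only if they are added to the model by hand, whereas tree ensembles and neural networks can learn such structure from the data.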

In general, conventional statistical analysis involving a conventional linear model focuses on explaining data and is said to be inferior to AI in terms of its predictive ability. Although AI is useful for prediction, there are important concerns owing to the opaque, black-box nature of most AI algorithms. Building prediction models with explainable AI mechanisms is a powerful and transparent alternative to black-box AI models (18). In the present study, Prediction One calculated not only the accuracy of the model but also the contribution of each factor to the outcome, thereby enabling a better understanding of the model and its constituents. The IOV values and independent risk factors identified through multivariate analysis do not exactly correspond, but IOV is an informative variable for describing the model.

This study has several limitations worth noting. First, this was a retrospective, single-center study. Second, we did not determine whether the patients had hematological or autoimmune disorders, which may have influenced the preoperative laboratory data. Third, the prediction model did not distinguish between superficial and deep incisional infections. Given that this is the first study of its kind that used Prediction One, SSI was defined broadly in order to improve prediction accuracy. Fourth, external validation was not performed. Prospective studies and external validation are needed to confirm and improve the performance of the AI model.

In conclusion, based on our preliminary analysis, the AI model was found to be useful for predicting SSI in patients with stage II-III colon cancer. In this study, SSI was found to be associated with worse oncological outcomes; therefore, the prediction of SSI occurrence is important in colon cancer.

Conflicts of Interest

The Authors have no conflicts of interest to disclose in relation to this study.

Authors’ Contributions

Yuki Ohno and Junichi Mazaki conceived the idea of the study. Yuki Ohno developed the statistical analysis plan and conducted the statistical analyses. Ryutaro Udo, Tomoya Tago, Kenta Kasahara, Masanobu Enomoto and Tetsuo Ishizaki contributed to the interpretation of the results. Yuki Ohno drafted the original manuscript. Yuichi Nagakawa supervised the conduct of this study. All Authors reviewed the manuscript draft and revised it critically for intellectual content. All Authors approved the final version of the manuscript to be published.

References

1 Matsumoto Y Tsujimoto H Ono S Shinomiya N Miyazaki H Hiraki S Takahata R Yoshida K Saitoh D Yamori T Yamamoto J & Hase K Abdominal infection suppresses the number and activity of intrahepatic natural killer cells and promotes tumor growth in a murine liver metastasis model. Ann Surg Oncol. 23(Suppl 2) S257 - S265 2016. PMID: 25752891. DOI: 10.1245/s10434-015-4466-7
2 Tsujimoto H Ueno H Hashiguchi Y Ono S Ichikura T & Hase K Postoperative infections are associated with adverse outcome after resection with curative intent for colorectal cancer. Oncol Lett. 1(1) 119 - 125 2010. PMID: 22966268. DOI: 10.3892/ol_00000022
3 Huh JW Lee WY Park YA Cho YB Kim HC Yun SH & Chun HK Oncological outcome of surgical site infection after colorectal cancer surgery. Int J Colorectal Dis. 34(2) 277 - 283 2019. PMID: 30426197. DOI: 10.1007/s00384-018-3194-4
4 Troillet N Aghayev E Eisenring MC Widmer AF & Swissnoso First results of the Swiss National surgical site infection surveillance program: who seeks shall find. Infect Control Hosp Epidemiol. 38(6) 697 - 704 2017. PMID: 28558862. DOI: 10.1017/ice.2017.55
5 Berríos-Torres SI Umscheid CA Bratzler DW Leas B Stone EC Kelz RR Reinke CE Morgan S Solomkin JS Mazuski JE Dellinger EP Itani KMF Berbari EF Segreti J Parvizi J Blanchard J Allen G Kluytmans JAJW Donlan R Schecter WP & Healthcare Infection Control Practices Advisory Committee Centers for disease control and prevention guideline for the prevention of surgical site infection, 2017. JAMA Surg. 152(8) 784 - 791 2017. PMID: 28467526. DOI: 10.1001/jamasurg.2017.0904
6 Mangram AJ Horan TC Pearson ML Silver LC & Jarvis WR Guideline for prevention of surgical site infection, 1999. Hospital Infection Control Practices Advisory Committee. Infect Control Hosp Epidemiol. 20(4) 250 - 278 1999. PMID: 10219875. DOI: 10.1086/501620
7 Sagawa M Yoshimatsu K Yokomizo H Yano Y Okayama S Usui T Yamaguchi K Shiozawa S Shimakawa T Katsube T Kato H & Naritaka Y Worse preoperative status based on inflammation and host immunity is a risk factor for surgical site infections in colorectal cancer surgery. J Nippon Med Sch. 84(5) 224 - 230 2017. PMID: 29142183. DOI: 10.1272/jnms.84.224
8 Mohri Y Inoue Y Tanaka K Hiro J Uchida K & Kusunoki M Prognostic nutritional index predicts postoperative outcome in colorectal cancer. World J Surg. 37(11) 2688 - 2692 2013. PMID: 23884382. DOI: 10.1007/s00268-013-2156-9
9 Tokunaga R Sakamoto Y Nakagawa S Izumi D Kosumi K Taki K Higashi T Miyata T Miyamoto Y Yoshida N & Baba H Comparison of systemic inflammatory and nutritional scores in colorectal cancer patients who underwent potentially curative resection. Int J Clin Oncol. 22(4) 740 - 748 2017. PMID: 28213742. DOI: 10.1007/s10147-017-1102-5
10 Gervaz P Bandiera-Clerc C Buchs NC Eisenring MC Troillet N Perneger T & Harbarth S Scoring system to predict the risk of surgical-site infection after colorectal resection. Br J Surg. 99(4) 589 - 595 2012. PMID: 22231649. DOI: 10.1002/bjs.8656
11 de Campos-Lobato LF Wells B Wick E Pronty K Kiran R Remzi F & Vogel JD Predicting organ space surgical site infection with a nomogram. J Gastrointest Surg. 13(11) 1986 - 1992 2009. PMID: 19760301. DOI: 10.1007/s11605-009-0968-6
12 Goulart A Ferreira C Estrada A Nogueira F Martins S Mesquita-Rodrigues A Sousa N & Leão P Early inflammatory biomarkers as predictive factors for freedom from infection after colorectal cancer surgery: a prospective cohort study. Surg Infect (Larchmt). 19(4) 446 - 450 2018. PMID: 29624484. DOI: 10.1089/sur.2017.294
13 Bergquist JR Thiels CA Etzioni DA Habermann EB & Cima RR Failure of colorectal surgical site infection predictive models applied to an independent dataset: do they add value or just confusion. J Am Coll Surg. 222(4) 431 - 438 2016. PMID: 26847588. DOI: 10.1016/j.jamcollsurg.2015.12.034
14 Grant R Aupee M Buchs NC Cooper K Eisenring MC Lamagni T Ris F Tanguy J Troillet N Harbarth S & Abbas M Performance of surgical site infection risk prediction models in colorectal surgery: external validity assessment from three European national surveillance networks. Infect Control Hosp Epidemiol. 40(9) 983 - 990 2019. PMID: 31218977. DOI: 10.1017/ice.2019.163
15 Breiman L Random forests. Mach Learn. 45 5 - 32 2001. DOI: 10.1023/A:1010933404324
16 Deo RC Machine learning in medicine. Circulation. 132(20) 1920 - 1930 2015. PMID: 26572668. DOI: 10.1161/CIRCULATIONAHA.115.001593
17 Sze V Chen Y Yang T & Emer J Efficient processing of deep neural networks: a tutorial and survey. Proceedings of the IEEE. 105(12) 2295 - 2329 2017. DOI: 10.1109/JPROC.2017.2761740
18 Castelvecchi D Can we open the black box of AI. Nature. 538(7623) 20 - 23 2016. PMID: 27708329. DOI: 10.1038/538020a