Computer Science and Information Systems 2023 Volume 20, Issue 1, Pages: 381-404
https://doi.org/10.2298/CSIS220227061S
The application of machine learning techniques in prediction of quality of life features for cancer patients
Savić Miloš (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), svc@dmi.uns.ac.rs
Kurbalija Vladimir (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), kurba@dmi.uns.ac.rs
Ilić Mihailo (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), milic@dmi.uns.ac.rs
Ivanović Mirjana (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), mira@dmi.uns.ac.rs
Jakovetić Dušan (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), dusan.jakovetic@dmi.uns.ac.rs
Valachis Antonios (Department of Oncology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden), antonios.valachis@oru.se
Autexier Serge (German Research Center for Artificial Intelligence GmbH, Cyber-Physical Systems Bremen, Germany), serge.autexier@dfki.de
Rust Johannes (German Research Center for Artificial Intelligence GmbH, Cyber-Physical Systems Bremen, Germany), johannes.rust@dfki.de
Kosmidis Thanos (Care Across Ltd London, England), thanos.kosmidis@careacross.com
Quality of life (QoL) is one of the major issues for cancer patients. With the advent of medical databases containing large amounts of relevant QoL information, it becomes possible to train predictive QoL models using machine learning (ML) techniques. However, training such models poses several challenges, mostly due to data privacy concerns and missing values in patient data. In this paper, we analyze several classification and regression ML models that predict QoL indicators for breast and prostate cancer patients. Three different approaches are employed for imputing missing values, and several privacy-preserving settings are tested. The examined ML models are trained on datasets formed from two databases containing a large number of anonymized medical records of cancer patients from Sweden. Two learning scenarios are considered: centralized and federated learning. In the centralized learning scenario, all patient data from the different data sources is collected at a central location prior to model training. Federated learning, in contrast, enables collective training of ML models without data sharing. The results of our experimental evaluation show that the predictive power of federated models is comparable to that of centrally trained models for short-term QoL predictions, whereas for long-term periods centralized models provide more accurate QoL predictions. Furthermore, we provide insights into the quality of data preprocessing tasks (missing value imputation and differential privacy).
Keywords: Quality of Life, Cancer Patients, Predictive Models, Federated Learning, Breast Cancer, Prostate Cancer
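The difference between the two learning scenarios described in the abstract can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes a toy one-feature linear regression model, synthetic data, and the standard federated averaging scheme (clients train locally, and a server averages their parameters weighted by sample count). All function names, hyperparameters, and data here are hypothetical.

```python
import random


def local_update(w, b, data, lr=0.1, epochs=5):
    """Run a few epochs of full-batch gradient descent on one dataset."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_training(clients, rounds=200):
    """Federated averaging: raw data never leaves a client, only (w, b) does."""
    w, b = 0.0, 0.0
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_update(w, b, d) for d in clients]
        # The server averages the parameters, weighted by client sample count.
        w = sum(len(d) * u[0] for d, u in zip(clients, updates)) / total
        b = sum(len(d) * u[1] for d, u in zip(clients, updates)) / total
    return w, b


def centralized_training(clients, rounds=200):
    """Centralized baseline: pool all client data first, then train once."""
    pooled = [point for d in clients for point in d]
    w, b = 0.0, 0.0
    for _ in range(rounds):
        w, b = local_update(w, b, pooled)
    return w, b


def make_client(n, lo, hi, rng):
    """Synthetic data source (hypothetical): y = 2x + 1 plus small noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.05)) for x in xs]


rng = random.Random(0)
# Two data sources with different sizes and feature ranges.
clients = [make_client(40, 0.0, 1.0, rng), make_client(60, 0.5, 2.0, rng)]

fed_w, fed_b = federated_training(clients)
cen_w, cen_b = centralized_training(clients)
print(f"federated:   w={fed_w:.3f}, b={fed_b:.3f}")
print(f"centralized: w={cen_w:.3f}, b={cen_b:.3f}")
```

In this toy setting both scenarios recover parameters close to the generating values, because the two clients share the same underlying relationship. Real patient data from different sources can be far more heterogeneous, which is one reason federated and centralized models may diverge in practice.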