Computer Science and Information Systems 2023 Volume 20, Issue 1, Pages: 381-404
https://doi.org/10.2298/CSIS220227061S


The application of machine learning techniques in prediction of quality of life features for cancer patients

Savić Miloš (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), svc@dmi.uns.ac.rs
Kurbalija Vladimir (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), kurba@dmi.uns.ac.rs
Ilić Mihailo (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), milic@dmi.uns.ac.rs
Ivanović Mirjana (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), mira@dmi.uns.ac.rs
Jakovetić Dušan (Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia), dusan.jakovetic@dmi.uns.ac.rs
Valachis Antonios (Department of Oncology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden), antonios.valachis@oru.se
Autexier Serge (German Research Center for Artificial Intelligence GmbH, Cyber-Physical Systems, Bremen, Germany), serge.autexier@dfki.de
Rust Johannes (German Research Center for Artificial Intelligence GmbH, Cyber-Physical Systems, Bremen, Germany), johannes.rust@dfki.de
Kosmidis Thanos (Care Across Ltd, London, England), thanos.kosmidis@careacross.com

Quality of life (QoL) is one of the major issues for cancer patients. With the advent of medical databases containing large amounts of relevant QoL information, it becomes possible to train predictive QoL models using machine learning (ML) techniques. However, training predictive QoL models poses several challenges, mostly due to data privacy concerns and missing values in patient data. In this paper, we analyze several classification and regression ML models predicting QoL indicators for breast and prostate cancer patients. Three different approaches are employed for imputing missing values, and several privacy-preservation settings are tested. The examined ML models are trained on datasets formed from two databases containing a large number of anonymized medical records of cancer patients from Sweden. Two learning scenarios are considered: centralized and federated learning. In the centralized learning scenario, all patient data coming from different data sources are collected at a central location prior to model training. Federated learning, on the other hand, enables collective training of machine learning models without data sharing. The results of our experimental evaluation show that the predictive power of federated models is comparable to that of centrally trained models for short-term QoL predictions, whereas for long-term periods centralized models provide more accurate QoL predictions. Furthermore, we provide insights into the quality of data preprocessing tasks (missing value imputation and differential privacy).
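
The contrast between the two learning scenarios can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-feature linear regression model, synthetic data, and plain federated averaging (FedAvg) as the aggregation rule, none of which is stated in the abstract. The names local_update, federated_average, and make_site are hypothetical.

```python
import random

def local_update(w, b, data, lr=0.05, epochs=5):
    """Run a few epochs of gradient descent on one site's private data."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def federated_average(sites, rounds=200):
    """FedAvg (assumed here): only parameters leave a site, never records."""
    w, b = 0.0, 0.0
    total = sum(len(d) for d in sites)
    for _ in range(rounds):
        # Each site starts from the current global model and trains locally.
        updates = [(local_update(w, b, d), len(d)) for d in sites]
        # The server averages the parameters, weighted by local sample count.
        w = sum(p[0] * n for p, n in updates) / total
        b = sum(p[1] * n for p, n in updates) / total
    return w, b

def make_site(n, rng):
    """Synthetic data y = 2x + 1 plus small noise (purely illustrative)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)) for x in xs]

rng = random.Random(0)
sites = [make_site(40, rng), make_site(60, rng)]

# Federated scenario: the two sites never pool their records.
w_fed, b_fed = federated_average(sites)

# Centralized scenario: all records are collected in one place first.
w_cen, b_cen = local_update(0.0, 0.0, sites[0] + sites[1], epochs=1000)
```

In this toy setting both scenarios recover nearly the same slope and intercept. With real, heterogeneous patient data the two can diverge, which is consistent with the gap the abstract reports for long-term predictions.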

Keywords: Quality of Life, Cancer Patients, Predictive Models, Federated Learning, Breast Cancer, Prostate Cancer

