Risk stratification of cancer patients, that is, the prediction of disease outcome on an individual basis, is a key ingredient in making therapeutic decisions. In recent years, gene expression profiling has been successfully combined with the clinical and histological criteria traditionally used for such predictions. Many research groups have introduced and tested gene expression signatures: sets of genes whose expression values in a tumor can be used to predict the outcome of the disease. One well-known example is the 70-gene signature, on which we recently tested several machine learning techniques in order to maximize its predictive power. Genetic Programming (GP) was shown to perform significantly better at classifying patients than other techniques, including Support Vector Machines, Multilayered Perceptrons, and Random Forests. GP has the further advantage over other methods of performing automatic feature selection. Importantly, by using a weighted average of false positives and false negatives in the definition of the fitness, we showed that GP can outperform all the other methods in minimizing false negatives (one of the main goals in clinical applications) without compromising the overall minimization of incorrectly classified instances. The solutions returned by GP are also appealing from a clinical point of view: they are simple, easy to understand, and built from a rather limited subset of the available features.
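The fitness based on a weighted average of false positives and false negatives can be illustrated with a minimal sketch. The abstract does not give the exact formula or the weights, so the function name `weighted_error_fitness`, the parameter `fn_weight`, its default value, and the label convention (1 = poor outcome, 0 = good outcome) are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def weighted_error_fitness(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    fn_weight: float = 0.75,  # hypothetical weight; the abstract does not state one
) -> float:
    """Return a weighted average of false negatives and false positives.

    Lower values are better. Labels are assumed to be 1 for a poor
    outcome (positive class) and 0 for a good outcome.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not 0.0 <= fn_weight <= 1.0:
        raise ValueError("fn_weight must lie in [0, 1]")

    # A false negative is a poor-outcome patient predicted as good outcome.
    false_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # A false positive is a good-outcome patient predicted as poor outcome.
    false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    return fn_weight * false_negatives + (1.0 - fn_weight) * false_positives


# Toy labels, invented purely to exercise the function.
y_true = [1, 1, 0, 0, 1]
y_pred = [0, 1, 1, 0, 0]
fitness = weighted_error_fitness(y_true, y_pred)
```

With `fn_weight` above 0.5, a candidate classifier is penalized more for missing a poor-outcome patient than for raising a false alarm, which is one plausible way to steer the search toward fewer false negatives. Setting `fn_weight` to 0.5 makes the score proportional to the plain count of misclassified instances.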
Giacobini, M., Provero, P., Vanneschi, L., Mauri, G. (2014). Towards the Use of Genetic Programming for the Prediction of Survival in Cancer. In S. Cagnoni, M. Mirolli, M. Villani (Eds.), Evolution, Complexity and Artificial Life (pp. 177-192). Springer Berlin Heidelberg [10.1007/978-3-642-37577-4_12].
Towards the Use of Genetic Programming for the Prediction of Survival in Cancer
Vanneschi, Leonardo; Mauri, Giancarlo
2014