Consonni, V., Todeschini, R., Ballabio, D., Grisoni, F. (2019). On the Misleading Use of Q2F3 for QSAR Model Comparison. Molecular Informatics, 38(1) [10.1002/minf.201800029].
On the Misleading Use of Q2F3 for QSAR Model Comparison
Consonni, V; Todeschini, R; Ballabio, D; Grisoni, F
2019
Abstract
Quantitative Structure-Activity Relationship (QSAR) models play a central role in medicinal chemistry, toxicology and computer-assisted molecular design, and they support regulatory decisions and the reduction of animal testing. Assessing their predictive ability is therefore an essential step for any prospective application. Many metrics have been proposed to estimate the predictive ability of QSAR models, which has created confusion about how models should be evaluated and properly compared. Recently, we showed that the metric Q2F3 is particularly well suited for comparing the external predictivity of different models developed on the same training dataset. However, when comparing models developed on different training data, this function becomes inadequate and only dispersion measures such as the root-mean-square error (RMSE) should be used. The intent of this work is to clarify the correct and incorrect uses of Q2F3 by discussing its behavior with respect to the training data distribution and illustrating some cases in which Q2F3 estimates may be misleading. We therefore encourage the use of dispersion measures when models trained on different datasets have to be compared and evaluated.
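The contrast the abstract draws between Q2F3 and RMSE can be illustrated with a small numerical sketch. It assumes the definition of Q2F3 commonly given in the QSAR literature, namely one minus the ratio of the mean squared prediction error on the external set to the mean squared deviation of the training responses from their own mean; the formula is not stated on this page, so treat it as an assumption. All data values and function names below are hypothetical and chosen only for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error of predictions on an external set."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def q2_f3(y_ext, y_pred, y_train):
    """Q2F3 under the assumed definition:
    1 - (PRESS / n_ext) / (TSS_train / n_train),
    where TSS_train uses the mean of the training responses."""
    n_ext = len(y_ext)
    n_tr = len(y_train)
    mean_tr = sum(y_train) / n_tr
    press = sum((y - p) ** 2 for y, p in zip(y_ext, y_pred))
    tss_tr = sum((y - mean_tr) ** 2 for y in y_train)
    return 1.0 - (press / n_ext) / (tss_tr / n_tr)


# Hypothetical external set: every prediction is off by 0.5 units.
y_ext = [4.0, 5.0, 6.0, 7.0]
y_pred = [4.5, 4.5, 6.5, 6.5]

# Two hypothetical training sets with the same mean but different spread.
train_narrow = [4.5, 5.0, 5.5, 6.0, 6.5]
train_wide = [1.5, 3.5, 5.5, 7.5, 9.5]

print(rmse(y_ext, y_pred))                 # 0.5
print(q2_f3(y_ext, y_pred, train_narrow))  # 0.5
print(q2_f3(y_ext, y_pred, train_wide))    # 0.96875
```

In this toy case the prediction errors, and hence the RMSE, are identical, yet Q2F3 rises from 0.5 to about 0.97 simply because the second training set has a wider response distribution. This is the kind of situation in which, under the assumed definition, comparing Q2F3 across models trained on different data could mislead.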