Partial and total order ranking strategies, which from a mathematical point of view rest on elementary methods of discrete mathematics, appear to be an attractive and simple tool for data analysis. Moreover, order ranking strategies seem useful not only for data exploration but also for developing order-ranking models, as a possible alternative to conventional QSAR methods. In fact, when the data are characterised by uncertainties, order methods can be used as an alternative to statistical methods such as multiple linear regression (MLR), since they do not require a specific functional relationship between the independent variables and the dependent variables (responses). A ranking model is a relationship between a set of dependent attributes, investigated experimentally, and a set of independent attributes, i.e. the model variables. As in regression and classification models, variable selection is one of the main steps in finding predictive models. In the present work, the Genetic Algorithm - Variable Subset Selection (GA-VSS) approach is proposed as the variable selection method to search for the best ranking models within a wide set of predictor variables. The ranking based on the selected subsets of variables is compared with the experimental ranking and evaluated by a set of similarity indices for partial ranking and by Spearman's rank index for total ranking. A case study application is presented on a partial order ranking model developed for 12 congeneric phenylureas selected as similarly acting mixture components and analysed according to their toxicity on Scenedesmus vacuolatus. © 2006 Springer-Verlag Berlin Heidelberg.
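The comparison between a model ranking and an experimental ranking described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names (`relation`, `pair_agreement`, `spearman`) are hypothetical, the partial order is assumed to be the usual product order (one object ranks above another if it is at least as large on every attribute), `pair_agreement` is a simple stand-in for the similarity indices mentioned in the abstract, and the Spearman formula used here assumes rankings without ties.

```python
from itertools import combinations


def relation(a, b):
    """Compare two objects described by attribute tuples under the product order.

    Returns "=", ">", "<", or "||" (incomparable).
    """
    a, b = tuple(a), tuple(b)
    if a == b:
        return "="
    if all(x >= y for x, y in zip(a, b)):
        return ">"
    if all(x <= y for x, y in zip(a, b)):
        return "<"
    return "||"  # attributes disagree, so the objects are incomparable


def pair_agreement(model, experiment):
    """Fraction of object pairs with the same order relation in both rankings.

    Illustrative similarity measure only (an assumption of this sketch).
    """
    pairs = list(combinations(range(len(model)), 2))
    same = sum(
        relation(model[i], model[j]) == relation(experiment[i], experiment[j])
        for i, j in pairs
    )
    return same / len(pairs)


def spearman(rank_x, rank_y):
    """Spearman's rank coefficient for two total rankings without ties."""
    n = len(rank_x)
    d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


if __name__ == "__main__":
    # Three objects: two model variables vs. one experimental response.
    model = [(1, 1), (2, 2), (3, 1)]
    experiment = [(1,), (2,), (3,)]
    print(pair_agreement(model, experiment))
    print(spearman([1, 2, 3, 4], [4, 3, 2, 1]))
```

In a variable selection setting, a score of this kind could serve as the fitness that a genetic algorithm maximises over candidate variable subsets; that use is likewise an assumption of this sketch.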
Pavan, M., Consonni, V., Gramatica, P., Todeschini, R. (2006). New QSAR modelling approach based on ranking models by genetic algorithms - Variable subset selection (GA-VSS). In R. Bruggeman, L. Carlsen (Eds.), Partial Order in Environmental Sciences and Chemistry (pp. 185-224). Berlin: Springer-Verlag.
New QSAR modelling approach based on ranking models by genetic algorithms - Variable subset selection (GA-VSS)
CONSONNI, VIVIANA;TODESCHINI, ROBERTO
2006