Fraccalvieri, D. (2011). Comparison of protein dynamics: a new methodology based on self-organizing maps. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2011).
Comparison of protein dynamics: a new methodology based on self-organizing maps
FRACCALVIERI, DOMENICO
2011
Abstract
The close relationship between protein dynamics, in terms of both global and local flexibility, and protein function is now widely acknowledged. Flexibility is involved in the binding of proteins to small molecules and in mechanisms involving domain motions, and it also underlies signal transduction processes and allosteric interactions. Computational methods, in particular molecular dynamics (MD) simulations, are widely used to investigate dynamic properties and processes that occur on the ps to μs timescale. MD simulations can generate a large ensemble of molecular structures that samples the accessible conformational space of a protein, and functionally relevant conformations are generally identified by comparing and grouping the conformations obtained. Data-mining techniques such as clustering provide one means of grouping and analysing the information in an MD trajectory, but the results are often influenced by the type of algorithm, and the optimal choice of parameters is often case dependent. On the other hand, modern computer technology has made scientific data simpler to analyse and visualise. Among these techniques, Self-Organizing Maps (SOMs) are an invaluable data-mining tool. In this thesis, a novel strategy to analyse and compare conformational ensembles of protein domains was developed using a two-level approach that combines SOMs and complete linkage clustering. First, the conformations extracted from the MD simulations were encoded as suitable input data for the SOM analysis. Second, the effects of the typical SOM parameters on the analysis of these data were studied using an experimental design approach. Third, a rule was proposed to define the number of clusters that best summarises the information in the map.
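The two-level idea described above (a SOM first, then complete linkage clustering of the map's prototype vectors) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the thesis implementation: it assumes each conformation has already been encoded as a plain numeric vector, and the function names, map size, learning-rate schedule and the fixed `n_clusters` argument are all hypothetical choices made for the example. The thesis's rule for choosing the number of clusters is not reproduced here.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns {(row, col): prototype vector}."""
    rng = random.Random(seed)
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise each prototype from a randomly chosen input vector.
    som = {(r, c): list(rng.choice(data))
           for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = min(som, key=lambda k: sq_dist(som[k], x))
            for (r, c), w in som.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = lr * math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(len(w)):
                    w[i] += h * (x[i] - w[i])
            step += 1
    return som


def complete_linkage(points, n_clusters):
    """Agglomerative clustering; cluster distance = largest pairwise distance.

    Squared distances are used throughout, which preserves the ordering
    of distances and therefore the merge sequence.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(sq_dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters


def two_level_clustering(data, rows=4, cols=4, n_clusters=2, seed=0):
    """Level 1: SOM on the data. Level 2: complete linkage on the prototypes.

    Returns one cluster label per input vector (the label of its
    best-matching map unit).
    """
    som = train_som(data, rows, cols, seed=seed)
    units = list(som)
    groups = complete_linkage([som[u] for u in units], n_clusters)
    unit_label = {units[i]: g for g, members in enumerate(groups)
                  for i in members}
    labels = []
    for x in data:
        bmu = min(som, key=lambda k: sq_dist(som[k], x))
        labels.append(unit_label[bmu])
    return labels
```

Clustering the map prototypes instead of the raw conformations keeps the second level cheap, since the number of map units is usually far smaller than the number of MD frames. For real trajectories a vectorised implementation would be preferable to these pure-Python loops.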
Finally, this new protocol was applied to the conformational and functional analysis of two case studies: (a) a group of single-site mutants of the α-spectrin SH3 (Spc-SH3) domain and (b) the bound and unbound states of a group of protein-protein complexes involving proteins of the RAS superfamily. The results demonstrated the potential of this approach for analysing large ensembles of molecular conformations, namely the possibility of producing a topological mapping of the conformational space in a simple 2D visualisation and of effectively highlighting differences in conformational dynamics that are directly related to biological function.

File | Size | Format |
---|---|---|
Phd_unimib_041566.pdf (open access; attachment type: doctoral thesis) | 6.03 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.