Analysis and Visual Summarization of Molecular Dynamics Simulation

Files
Devadoss_0-257299.pdf (size: 5.42 MB, downloads: 1076)
Date
2014
Open Access publication
Open Access Green
Publication type
Dissertation
Publication status
Published
Abstract

The three-dimensional structure of a protein defines not only its size and shape, but also its function. The biological functions of proteins are generally controlled by cooperative motions or correlated fluctuations of their three-dimensional structures. Experimental techniques such as X-ray crystallography and NMR spectroscopy are extremely valuable and, at the moment, irreplaceable tools for determining protein structures in atomic detail [1, 2]. Mechanistic details can be deduced from these techniques either by finding a sequence of stable end-states of conformational transitions or by trapping long-lived intermediates using molecular modifications. The exact conformational transitions between these states are, however, very difficult to characterize experimentally. Theoretical methods such as molecular dynamics simulations starting from the experimental structure can fill this gap. One example of such cooperative structural changes is the closing mechanism of DNA polymerase I, which catalyzes all DNA synthesis in nature, often with astounding speed and accuracy. The hand-like arrangement of this enzyme, comprising a thumb, a palm and a fingers domain, plays an important role by inducing structural rearrangements in the form of a movement of the fingers domain towards the thumb domain, i.e., the transition from the open to the closed form, during nucleotide insertion [47-49].


To study this mechanism in more detail and to identify reasons for the increased fidelity of the DNA polymerase I mutants identified in the group of Andreas Marx [50], a specific kind of molecular dynamics simulation, targeted molecular dynamics (TMD) [106], is anticipated. To understand how such simulations behave with respect to the chosen parameters, and to have an unbiased analysis approach at hand, the aim of my thesis was to provide the prerequisites for starting simulations of the closing mechanism of DNA polymerase I (Klentaq1).


The first challenge was to design a new (unbiased) criterion that characterizes global changes while also allowing the locally changing parts to be identified. Molecular dynamics (MD) simulation, a method complementary to experimental techniques in elucidating key aspects of biological processes, computes the complete ensemble of conformations as a function of time. The results of MD simulations are stored as trajectories, which are very large and take a long time to analyze. Measures such as distances, contacts, hydrogen bonds, angles, torsion angles and the radius of gyration, and methods such as secondary structure analysis and principal component analysis, are used to analyze MD simulation results. Among these, the most commonly used procedure for quantifying similarity is to calculate the root mean square deviation (RMSD): the root mean square distance between corresponding atomic positions in two structures after one structure has been rotated and translated to align it optimally onto the other. This is a global measure and provides no information about which parts of the structures change. Apart from the traditional measures, TimeScapes [23], an automated method that uses a coarse-grained representation of the amino acid side chains (representative atoms) and calculates the distances between all pairs of representative atoms, is very useful for detecting potentially important structure-changing events in long MD trajectories. The above-mentioned measures are either global measures that carry no distinct information about local changes or local measures that carry no information about the overall changes.


To overcome the problems associated with the use of RMSD and other measures, in this thesis I propose to use Cα torsion angles [45], i.e., torsion angles derived from four consecutive Cα atoms, as an unbiased measure for analyzing MD simulation results. They provide a highly valuable similarity measure on the global as well as the substructure scale and can help to find major events, i.e., the parts of the protein involved in the structural changes (spatial domain) and the times at which the changes occurred (temporal domain).
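
The definition of a Cα torsion angle can be illustrated with a short sketch. It is not the in-house program used in the thesis; the function names (`dihedral`, `ca_torsions`, `progression_matrix`) and the plain-tuple coordinate format are assumptions made for illustration only. A chain of N Cα atoms yields N − 3 torsion angles per structure.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees, in [-180, 180], defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def ca_torsions(ca_coords):
    """Torsion angles of every run of four consecutive C-alpha positions."""
    return [dihedral(*ca_coords[i:i + 4]) for i in range(len(ca_coords) - 3)]

def progression_matrix(trajectory):
    """One row of torsion angles per structure (m structures x n angles)."""
    return [ca_torsions(frame) for frame in trajectory]
```

Each row of the resulting matrix describes one structure of the trajectory and each column follows one torsion angle over time.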


An in-house program was used to calculate the Cα torsion angles, and an m × n matrix, called the Cα torsion angle progression matrix, was formed, where m is the number of structures in the MD trajectory and n is the number of Cα torsion angles. A method named the Cα torsion angles total score (CATATS) method was developed. It is based on a total similarity score that uses the differences in the Cα torsion angles between conformations to characterize the conformational transitions taking place. This method was used to describe the global similarities derived from the Cα torsion angles in a number of biological test systems of different sizes. Three artificial high-temperature unfolding simulations of polypeptides of different lengths, α-Conotoxin (16 amino acids), Crambin (46 amino acids) and Ubiquitin (76 amino acids), were carried out. The total scores of each simulation were arranged as a CATATS matrix, and an RMSD matrix was formed in the same way. These matrices were represented as heat maps. This approach not only provides an easy and quick way to compare a single structure with other structures, but is also very useful for visualizing similar conformations and grouping them into clusters. The main disadvantage of the CATATS method is that it cannot conclusively separate the influence of highly flexible parts from changes in other parts of the structure: significant events (large changes of single central torsion angles) are masked by the many thermally fluctuating torsion angles.
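
A pairwise score matrix of this kind could be assembled roughly as follows. The abstract does not state the exact CATATS scoring function, so this sketch substitutes a simple stand-in: the sum of absolute torsion-angle differences, taken on the circle so that 170° and −170° differ by 20°. The names `angular_difference`, `total_score` and `score_matrix` are hypothetical.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def total_score(row_a, row_b):
    """Stand-in total score: summed periodic differences over all torsion angles.

    Assumption: the thesis' actual CATATS score may be defined differently.
    """
    return sum(angular_difference(a, b) for a, b in zip(row_a, row_b))

def score_matrix(progression):
    """m x m matrix of total scores between all pairs of structures."""
    return [[total_score(a, b) for b in progression] for a in progression]
```

Such a matrix is symmetric with zeros on the diagonal, so it can be rendered as a heat map in the same way as a pairwise RMSD matrix. Because every torsion angle contributes equally to the sum, many small fluctuations can outweigh one large change, which matches the masking effect described above.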
To identify the major structurally changing parts, filter out the important amino acids, and determine the times at which the structural changes happen, another method based on single Cα torsion angles was developed. It was tested on an unconstrained 20 ns simulation of an open-form ternary complex of the large fragment of Thermus aquaticus DNA polymerase I (Klentaq1). The Cα torsion angle progression matrix was represented graphically as heat maps. By visual inspection of the heat map, and after removing the rigid and the flexible parts, two significantly changing regions were identified: the first belongs to the thumb domain (torsion angle numbers 181–230, corresponding to residues 475–527) and the second to the fingers domain (torsion angle numbers 341–400, corresponding to residues 635–697). The heat maps of these regions showed that DNA polymerase I traversed a couple of biologically relevant conformational changes during the course of the MD simulation. The transitions leading to the (meta)stable intermediate were identified by clustering the Cα torsion angles with a cutoff criterion that specifies the minimum dissimilarity at which two structures are considered to belong to different clusters. Based on this clustering, the (meta)stable structure was identified at around the halfway point of the 20 ns simulation. RMSD values calculated with respect to the open and the closed form confirmed that this metastable structure is an intermediate of the closing process; it was named the half-closed form. The torsion angles responsible for the partial closing mechanism are numbers 354 to 387 (the largest flexible group), 395 to 407 and 438 to 453 (two additional groups) from the fingers domain, and numbers 182 to 202 from the thumb domain.
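
The role of the cutoff criterion can be shown with a minimal sketch. The abstract does not name the clustering algorithm, so a simple leader-style clustering is used here as an assumption: a structure joins the first cluster whose reference structure is less than `cutoff` away and otherwise starts a new cluster. The dissimilarity function and all names are hypothetical.

```python
def torsion_dissimilarity(row_a, row_b):
    """Summed periodic difference (degrees) between two rows of torsion angles."""
    total = 0.0
    for a, b in zip(row_a, row_b):
        d = abs(a - b) % 360.0
        total += min(d, 360.0 - d)
    return total

def leader_clusters(rows, cutoff, dissimilarity=torsion_dissimilarity):
    """Assign each structure (in time order) a cluster label.

    Two structures end up in different clusters when their dissimilarity
    to the cluster reference reaches the cutoff.
    """
    leaders, labels = [], []
    for row in rows:
        for label, leader in enumerate(leaders):
            if dissimilarity(row, leader) < cutoff:
                labels.append(label)
                break
        else:
            leaders.append(row)            # no cluster close enough: open a new one
            labels.append(len(leaders) - 1)
    return labels
```

Read along the time axis, a long run of identical labels indicates a (meta)stable state and a change of label marks a transition. Restricting the rows to the columns of one changing region, for example the fingers domain, would keep thermally fluctuating angles elsewhere from dominating the dissimilarity.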
Finally, a detailed analysis was carried out that concentrated on the spatial regions highlighted by the Cα torsion angle changes. The TimeScapes approach identified a salt bridge that brought the fingers and the thumb domains closer to each other and initiated the closing mechanism. Because Cα torsion angles, RMSD and TimeScapes give complementary information, combining them allows considerably more insight to be extracted from MD simulations.


The second challenge was to identify optimal parameters for the TMD simulations of the closing mechanism of DNA polymerase I. These events have to be accelerated in order to be observed in a computationally feasible MD simulation. TMD simulation, a method well suited to calculating transition pathways by continuously reducing the RMSD between the initial and the target structure by means of steering forces, was carried out to observe the transition from the open to the closed enzyme form of DNA polymerase I. Six TMD simulations with different constraints (cases) were carried out. For more details on the constraints, the reader can refer to Table 4.2 in chapter 4 of the thesis. Among the six cases, case 2 and case 4 showed better results, with the incoming nucleotide paired with its pairing base from the template and with RMSD values of 0.478 Å and 0.533 Å to the target structure, respectively.
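
The steering idea behind TMD can be sketched as a harmonic penalty on the difference between the current RMSD and a target RMSD that is lowered step by step. This is a simplified illustration, not the setup used in the thesis: the force constant, the linear schedule and the scaling by the number of atoms are assumptions (the scaling differs between simulation programs), and the optimal superposition of the two structures is omitted, so the coordinates are assumed to be already fitted.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long, already superimposed coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    sq = sum((a[i] - b[i]) ** 2
             for a, b in zip(coords_a, coords_b) for i in range(3))
    return math.sqrt(sq / len(coords_a))

def target_rmsd_at(step, n_steps, start, end=0.0):
    """Linearly decreasing target RMSD (assumed schedule)."""
    return start + (end - start) * step / n_steps

def tmd_restraint_energy(current_rmsd, target_rmsd, n_atoms, force_constant):
    """Harmonic TMD penalty; zero when the current RMSD matches the target."""
    return 0.5 * force_constant * n_atoms * (current_rmsd - target_rmsd) ** 2
```

The requirement that `rmsd` receives two structures with the same number of atoms mirrors the prerequisite, noted in the thesis, that the initial and the target structure must contain an equal number of atoms.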


The influence of different starting structures on the progression of the TMD simulation was analyzed with starting structures taken from a normal MD simulation after 1 ns (case 4), 5 ns (case 5), and 10 ns (case 6). In case 5 and case 6, the final orientation of the pairing base was totally different, because the target force applied during the TMD simulation pushed the pairing base and its surroundings in different ways, which led to deadlock situations. This behavior is caused, on the one hand, by the non-optimal starting structure, since the nucleobases in particular change their orientation in the simulation used to generate the starting structures. On the other hand, it is also caused by suboptimal constraints used in the TMD simulation. In case 2, the flipping of Tyr 671 away from the template, the movement of the O-helix and a sequence of structural changes in the binding site allowed the pairing base to form a Watson-Crick base pair with the incoming nucleotide and completed the transition from the open to the closed enzyme form in an appropriate manner, even when starting from the 10 ns starting structure. Thus, I propose the setup of case 2 as the preferred protocol for investigating the closing mechanism of DNA polymerase I.


With the knowledge obtained from these six cases, the optimal choices for the following parameters and important simulation details were identified: (1) the atoms that had to be added or removed in the initial and the target structures to fulfil the prerequisite of an equal number of atoms; (2) the target-fit-masks, which are used to best-fit the target structure to the simulation structure; (3) the target-rms-masks, which are used to calculate the RMSD; (4) the positional constraints, which are used to prevent rotational and translational motion of the system during the TMD simulation; and (5) constraints up to the Cγ atoms, which are able to enforce the relevant changes with fewer deadlocks in single side chains, with no constraints on the equivalent atoms in lipophilic side chains such as VAL, ILE, and LEU.


With the identification of the optimal parameters, including the set of atoms to be constrained, and the development of an unbiased method for analyzing the structural changes in the system and the time series of these changes in the large number of anticipated simulations, all prerequisites for studying the closing mechanism of DNA polymerase I are now fulfilled. To understand the mechanism and the influence of the involved species, longer TMD simulations with appropriate positional constraints as well as target-fit-masks and target-rms-masks according to case 2 will be carried out in the near future. For statistical significance, this study will be continued with multiple parallel simulations to characterize the influence of different incoming nucleotides, different mismatched base pairs and different mutants of DNA polymerase I on the closing mechanism enforced by TMD simulation.

Subject (DDC)
540 Chemistry
Keywords
Molecular Dynamics Simulations, Analysis, C-alpha Torsion Angles, TimeScapes, Time Series, Torsion Angle Heatmaps, Visualization, DNA polymerase I, Klentaq
Cite
ISO 690
DEVADOSS, Fredrick Robin, 2014. Analysis and Visual Summarization of Molecular Dynamics Simulation [Dissertation]. Konstanz: University of Konstanz
BibTeX
@phdthesis{Devadoss2014Analy-29131,
  year={2014},
  title={Analysis and Visual Summarization of Molecular Dynamics Simulation},
  author={Devadoss, Fredrick Robin},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29131">
    <dc:language>eng</dc:language>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/29131/3/Devadoss_0-257299.pdf"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-10-16T11:25:13Z</dcterms:available>
    <dcterms:title>Analysis and Visual Summarization of Molecular Dynamics Simulation</dcterms:title>
    <dcterms:issued>2014</dcterms:issued>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/>
    <dc:contributor>Devadoss, Fredrick Robin</dc:contributor>
    <dc:rights>terms-of-use</dc:rights>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:creator>Devadoss, Fredrick Robin</dc:creator>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/29131/3/Devadoss_0-257299.pdf"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/29131"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-10-16T11:25:13Z</dc:date>
  </rdf:Description>
</rdf:RDF>
Date of dissertation examination
September 19, 2014
University thesis note
Konstanz, Univ., Diss., 2014