When I was investigating the differences between protein structures obtained by X-ray crystallography and NMR spectroscopy, I found a paper (Andrec et al., 2007) comparing structures of several proteins resolved with both X-ray and NMR. The average root-mean-square deviation (RMSD) between NMR and X-ray structures is 1.4Å (max. 3.6Å), and the average RMSD between different NMR structures of the same protein is 0.4Å (max. 1.3Å). I've checked some of the proteins studied in this paper, and they are mostly 100-200 residues long.
However, there are plenty of papers (e.g. Grant et al., 2009; Kumaraswami et al., 2010, dealing with similarly sized proteins) that base their statements about conformational transitions on structures with an RMSD difference of 1-2Å.
I wonder what RMSD between two structures is considered reliable enough to draw solid conclusions about conformational transitions (e.g. upon ligand binding), as opposed to simple thermal fluctuations or perturbations caused by the method used to obtain the structure?
Of course, the best way (at least in my opinion) to distinguish actual conformational transition from thermal fluctuations is to measure its lifetime, but it is often not an option.
Andrec M, Snyder DA, Zhou Z, Young J, Montelione GT, Levy RM. 2007. A large data set comparison of protein structures determined by crystallography and NMR: statistical test for structural differences and the effect of crystal packing. Proteins 69: 449-65.
Grant BJ, Gorfe AA, McCammon JA. 2009. Ras conformational switching: simulating nucleotide-dependent conformational transitions with accelerated molecular dynamics. PLoS Computational Biology 5(3): e1000325.
Kumaraswami M, Newberry KJ, Brennan RG. 2010. Conformational plasticity of the coiled-coil domain of BmrR is required for bmr operator binding: the structure of unliganded BmrR. Journal of Molecular Biology 398: 264-75.
The best way to distinguish "actual conformational transition from thermal fluctuations" in a protein is to (determine and) compare atomic-resolution structures of the ligand-free and ligand-bound protein.
Calculating the root mean square deviation of atomic structures
We calculate the RMSD of domains in adenylate kinase as it transitions from an open to closed structure, and look at calculating weighted RMSDs.
Last updated: June 26, 2020 with MDAnalysis 1.0.0
Minimum version of MDAnalysis: 1.0.0
MDAnalysis implements RMSD calculation using the fast QCP algorithm ([The05]). Please cite ([The05]) when using the MDAnalysis.analysis.align module in published work.
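To make the quantity concrete, here is a minimal sketch of a plain and a weighted RMSD in standard-library Python. It is not the QCP algorithm and performs no superposition, so it only matches a fitted RMSD when the two structures are already aligned. The function name `rmsd` and the toy coordinates are illustrative assumptions, not part of MDAnalysis.

```python
import math

def rmsd(a, b, weights=None):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    No superposition is performed: the structures are compared as given.
    If weights (e.g. atomic masses) are supplied, a weighted RMSD is returned.
    """
    if len(a) != len(b):
        raise ValueError("coordinate sets must have the same number of atoms")
    if weights is None:
        weights = [1.0] * len(a)
    # Weighted sum of squared per-atom distances
    total = sum(w * math.dist(p, q) ** 2 for w, p, q in zip(weights, a, b))
    return math.sqrt(total / sum(weights))

# Toy three-atom example (coordinates in Å, made up for illustration)
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mob = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 3.0)]

plain = rmsd(ref, mob)                               # all atoms count equally
weighted = rmsd(ref, mob, weights=[12.0, 12.0, 1.0])  # light atom counts less
```

In the toy data only the third atom moves, so down-weighting it lowers the RMSD; this is the kind of effect a mass-weighted RMSD is meant to capture.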
Protein motions are commonly quantified by measuring structural differences between conformers. The extent of these differences is called conformational diversity. These motions are essential to understanding protein biology. We have found that the distribution of conformational diversity in a large dataset of proteins can be explained in terms of three sets sharing structure-based features that emerge from the conformer population of each protein. The first set, which we call rigid, involves proteins showing almost no backbone movements but important changes in tunnels. In order of increasing conformational diversity, the other sets are called partially disordered and malleable; both show disordered regions and important cavities, but they behave differently from each other. The features shared within each set could represent conformational mechanisms related to biological functions.
Citation: Monzon AM, Zea DJ, Fornasari MS, Saldaño TE, Fernandez-Alberti S, Tosatto SCE, et al. (2017) Conformational diversity analysis reveals three functional mechanisms in proteins. PLoS Comput Biol 13(2): e1005398. https://doi.org/10.1371/journal.pcbi.1005398
Editor: Christine A. Orengo, University College London, UNITED KINGDOM
Received: September 10, 2016 Accepted: February 2, 2017 Published: February 13, 2017
Copyright: © 2017 Monzon et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: Funding for this research was provided by COST Action (BM1405) Non-Globular Proteins-net (SCET) and Universidad Nacional de Quilmes (PUNQ 1004/11) (GP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Molecular machines convert chemical energy to mechanical work in the process of carrying out their specific tasks. Often these proteins are fueled by ATP binding and hydrolysis, enabling switching between different conformations. The ATP-dependent chaperone GroEL is a molecular machine that opens and closes its barrel-like structure in order to provide a folding cage for unfolded proteins. The quest to fully understand and control GroEL and other molecular machines is enhanced by complementing experimental work with computational approaches. Here, we provide a description of the molecular basis for the conformational changes in the GroEL subunit by performing extensive molecular dynamics simulations. The simulations sample the conformational population for the different nucleotide-free and bound states in the isolated subunit. The results reveal that the conformations of the subunit when isolated resemble those of the subunit integrated in the GroEL complex. Moreover, the molecular dynamics simulations allow following detailed changes in individual interatomic interactions brought about by ATP-binding.
Citation: Skjaerven L, Grant B, Muga A, Teigen K, McCammon JA, Reuter N, et al. (2011) Conformational Sampling and Nucleotide-Dependent Transitions of the GroEL Subunit Probed by Unbiased Molecular Dynamics Simulations. PLoS Comput Biol 7(3): e1002004. https://doi.org/10.1371/journal.pcbi.1002004
Editor: Jianpeng Ma, Baylor College of Medicine, United States of America
Received: October 6, 2010 Accepted: December 9, 2010 Published: March 10, 2011
Copyright: © 2011 Skjaerven et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The Norwegian Research Council is acknowledged for CPU resources granted through the NOTUR supercomputing program (http://www.notur.no/) and Bergen Center for Computational Science for providing powerful computer facilities (http://www.bccs.uni.no/). Work at CSIC/UPV/EHU was financed by MICINN (Grant BUF2007-64452). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
The three-dimensional structure of proteins is determined by many interactions, such as covalent bonds, hydrogen bonds, and the hydrophobic effect. Although the local structure remains intact during most conformational transitions, it was found that conformational changes of proteins are often associated with the opening of one or more hydrogen bonds. Because detection of these unstable hydrogen bonds is essential for protein flexibility prediction, we have developed a method to identify such unstable hydrogen bonds based on their local environment [4]. The method rests on estimating the likelihood that the hydrogen bond is attacked by water molecules. Irrespective of the geometry in the input structure, a hydrogen bond that can be attacked by water molecules is more likely to open than a hydrogen bond that is shielded by hydrophobic residues [9-12]. In the constraint definition process in tCONCOORD, we calculate a solvation score for each hydrogen bond. If this score exceeds a predefined threshold, the hydrogen bond is labeled as unstable and is not included as a geometrical constraint. Thus, in the subsequently generated ensemble these hydrogen bonds are not necessarily conserved.
The tCONCOORD-GUI allows the user to load the results of the constraint definition process and to visualize the defined constraints (see Fig. 2). The structure is displayed in a PyMOL window controlled by the tCONCOORD-GUI. A protein chain is represented as a simplified Cα model for clarity. Different kinds of interactions are represented as colored arrows. Moreover, each interaction is listed in a table, and detailed information, e.g., hydrogen bond geometries or energies, is displayed in a text field. Hydrogen bonds that are not taken into account because of high solvation probabilities are also displayed. The threshold can be changed interactively, so that the user can thoroughly study all hydrogen bonds and set up simulations with different thresholds.
Depending on the particular question, it might be useful to switch off interactions, e.g., to study the influence of a certain hydrogen bond on the conformational flexibility. The tCONCOORD-GUI therefore allows the user to interactively influence the constraint definition process by selectively enabling and disabling interactions.
We have performed parallel calculations based on three different coarse-grained potential functions.
A simple harmonic potential function, which models all interactions between residues within a cutoff radius (13 Å) as springs to calculate a subset of the slow normal modes (29), serves as a control. It is given by

$$V_{\mathrm{harmonic}} = \frac{k}{2} \sum_{\substack{i<j \\ |\mathbf{R}_{ij}^{0}| < R_{c}}}^{N} \left( |\mathbf{R}_{ij}| - |\mathbf{R}_{ij}^{0}| \right)^{2} \qquad (1)$$

A uniform spring constant k of 1 kcal/(mol·Å²) is adopted for all residues that are closer than the cutoff distance R_c. R_ij represents the displacement vector, and R_ij^0 represents the equilibrium distance between the positions of Cα atoms i and j. N is the number of residues.
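The harmonic control model described above can be sketched in a few lines of standard-library Python. This assumes the usual elastic-network form, (k/2)(d − d0)² summed over Cα pairs whose reference distance is below the cutoff; the function name `harmonic_energy` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math

def harmonic_energy(coords, ref_coords, k=1.0, cutoff=13.0):
    """Elastic-network energy in kcal/mol.

    A spring of constant k (kcal/(mol*A^2)) connects every pair of C-alpha
    atoms whose distance in the reference structure is below the cutoff (A).
    """
    energy = 0.0
    n = len(ref_coords)
    for i in range(n):
        for j in range(i + 1, n):
            d0 = math.dist(ref_coords[i], ref_coords[j])  # equilibrium distance
            if d0 < cutoff:
                d = math.dist(coords[i], coords[j])        # current distance
                energy += 0.5 * k * (d - d0) ** 2
    return energy

# Toy three-residue chain: only residues 0 and 1 are within the 13 A cutoff
reference = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

e_ref = harmonic_energy(reference, reference)       # zero by construction
e_stretched = harmonic_energy(stretched, reference)  # one spring stretched by 2 A
```

Because the energy is zero at the reference structure by construction, this kind of potential is only meaningful for motions around a known conformation.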
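An alternative approach introduces additional restraints on neighboring Cα atoms to model pseudo bonds between consecutive residues. The modified potential becomes

$$V_{\mathrm{bond\text{-}restrained}} = \frac{k_{\mathrm{bond}}}{2} \sum_{i=1}^{N-1} \left( |\mathbf{R}_{i,i+1}| - |\mathbf{R}_{i,i+1}^{0}| \right)^{2} + \frac{k_{\mathrm{nonbond}}}{2} \sum_{\substack{i<j-1 \\ |\mathbf{R}_{ij}^{0}| < R_{c}}}^{N} \left( |\mathbf{R}_{ij}| - |\mathbf{R}_{ij}^{0}| \right)^{2} \qquad (2)$$

In this bond-restrained model, kbond is the spring constant of the interaction between neighbor Cα atoms, and knonbond is the spring constant of the interaction between nonbonded neighbor Cα atoms. A value of 70 kcal/(mol·Å²) is adopted for kbond based on the statistical analysis and Boltzmann inversion of the crystallographic data.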
The fully elaborated VAMM force field is

$$V_{\mathrm{VAMM}} = V_{\mathrm{bonded}} + V_{\mathrm{angle}} + V_{\mathrm{dihedral}} + V_{\mathrm{nonbonded}} + V_{\mathrm{local}} \qquad (3)$$

where each of the terms is defined in a companion paper (19). In brief, Vbonded is the same bond-restraint term as for VBond-restrained in Eq. 2, including separate definitions for trans- and cis-peptide bonds and for disulfide bonds; Vangle is for restraints on virtual bond angles τ, which depend on the type of secondary structure; Vdihedral is for restraints on virtual dihedral angles θ, which also depend on secondary structure; Vnonbonded is for restraints on nonbonded interactions between atom pairs more than five residues apart, which are specific to amino acid identities; and Vlocal is for restraints that maintain local geometries.
Iterative Normal-Mode-Driven Transitions.
We computed the transition pathway between two alternative conformations, A (e.g., open-state ADK) and B (e.g., closed-state ADK), by moving each structure toward the other in iterations of moves directed at each step along the normal mode of greatest engagement with its target structure. Only protein residues are modeled; bound ligands are implicit. For each structure, from the starting points at A0 and B0 through intermediates Ai and Bi at each step i, the Hessian matrix is constructed from the particular potential function used in that analysis (Eq. 1, 2, or 3) and diagonalized to compute normal modes (30, 31).
In general, 20 slow modes are considered sufficient to span the slow fluctuations of macromolecules that sample the functional motions (17). Thus, we screened the 20 lowest-frequency normal modes for each of conformations Ai and Bi to find that normal mode from each vibrating state having the greatest Marques–Sanejouand overlap factor (32) with the alternate, target state. This factor, also called the involvement coefficient by Ma and Karplus (33), measures how much a given normal mode contributes to the molecular displacement between two conformers. The involvement coefficient, Iik, is the projection of the normal-mode vector onto the linear displacement: where at step i, Lkj is the component from eigenvector k acting on atom j, and ΔRj is the displacement vector between the intermediate states generated from alternative starting structures. Both for the mode from Ai with highest involvement coefficient into the direction of Bi and for the mode of Bi with highest involvement coefficient into the direction Ai, each direction of the mode must be tested, which we do by calculating the pairwise rmsd values between four possible conformers generated from states Ai and Bi after applying shifts dictated by the respective modes in each of the alternative directions.
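The mode-screening step can be illustrated with a short sketch. It assumes the commonly used normalized form of the Marques–Sanejouand overlap, |Σ_j L_kj · ΔR_j| divided by the norms of the mode and of the displacement; the exact normalization used in the paper may differ. The names `involvement_coefficient` and `best_mode`, and the toy vectors, are hypothetical.

```python
import math

def involvement_coefficient(mode, displacement):
    """Normalized overlap between one normal mode and a displacement.

    mode and displacement are lists of per-atom (x, y, z) vectors.
    Returns a value in [0, 1]; 1 means the mode points exactly along
    (or exactly against) the displacement.
    """
    dot = sum(m * d for mv, dv in zip(mode, displacement) for m, d in zip(mv, dv))
    norm_mode = math.sqrt(sum(m * m for mv in mode for m in mv))
    norm_disp = math.sqrt(sum(d * d for dv in displacement for d in dv))
    return abs(dot) / (norm_mode * norm_disp)

def best_mode(modes, displacement):
    """Index of the mode with the greatest involvement coefficient."""
    return max(range(len(modes)),
               key=lambda k: involvement_coefficient(modes[k], displacement))

# Toy two-atom system: the target displacement is a rigid shift along x
delta_r = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
modes = [
    [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)],    # orthogonal to the displacement
    [(-1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],  # antiparallel: full overlap
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],    # partial overlap
]
chosen = best_mode(modes, delta_r)
```

Because the absolute value discards the sign, the antiparallel mode scores as high as a parallel one, which is why both directions of the selected mode still have to be tested, as the text describes.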
The shifts for each move for residue j are found by where C is a constant chosen as 0.005, λk is the eigenvalue of mode k, Ljk is the eigenvector for mode k acting at atom j, and ± refers to the alternative directions for this mode. Tests showed that the nature of conformational transitions is relatively insensitive to the value of C within a certain interval (Table S2), and 0.005 was chosen so that the calculations are performed within a feasible CPU time and without unrealistic distortions due to large step sizes.
The two new conformers from shift sets giving minimal rmsd values then become the input structures for the next iteration step. The process continues until intermediate conformers Ai and Bi converge to within the convergence criterion, which is usually set as an rmsd value of 1 Å. This cutoff value is appropriate because the normal modes calculated from two structures that are closer than 1 Å rmsd are not significantly different (34), and two such structures are generally very similar, except at flexible loops. A more lenient criterion might be set in cases where complicated motions, such as in flexible loops or local folding events, are also involved, in contrast to the transition observed in ADK.
Computations of conformational transitions with the simple harmonic potential (Eq. 1) or the bond-restrained harmonic potential (Eq. 2) use the algorithm shown in Fig. S3. Normal modes and iterative moves are made as described above, sampling the 20 lowest-frequency modes and moving in the direction of normal modes having greatest involvement in the directions toward the respective target structures.
Computations of conformational transition pathways with the VAMM potential function (Eq. 3) use the algorithm shown in Fig. S4. The VAMM algorithm is similar to the control algorithm, but additional requirements are imposed. In the initial step, secondary structures needed for VAMM are assigned from the crystal structure coordinates of conformers A0 and B0 by using the DSSP algorithm (35). This step is followed by a truncated Newton minimization of the Cα system in the VAMM force field to a gradient of 0.1 kcal/(mol·Å) (36). Hessian matrices are constructed from each of these minimized structures by using VAMM and are then diagonalized for normal-mode evaluations. The procedure is iterated as described above to generate successive intermediates Ai and Bi. As the calculation proceeds, secondary structures of the computed intermediate states are updated at intervals of 0.1 Å rmsd (rmsdUPDATE) between the calculated intermediate states. rmsdUPDATE values between 0.1 and 0.5 Å generated reliable and consistent results on calculation of the ADK transition. In addition, energy minimizations of the intermediate states are performed whenever the energy gradient rises above 1 kcal/(mol·Å). This approach keeps the system near local energy minima and also helps to relax artificially high strains. The convergence criterion is again 1 Å rmsd.
Distributions of Virtually Bonded Conformation Parameters.
Probability contour plots for relevant conformational parameters were evaluated from the same Top500 database (21) as used to produce Ramachandran plots by the MolProbity software (37). Details are in SI Text.
Strain Energy of Intermediate States.
Strains that accumulate during conformational transitions are calculated with a double-well potential similar to that for the plastic network model (18). Details are in SI Text.
RMSD during conformational transition in proteins - Biology
The information (AIRs) used to drive the docking together with i-RMSD, l-RMSD and fnat figures for the protein-protein, protein-DNA systems in post-sampling are available as supplementary material (Data S1). All models generated for the various complexes together with their statistics are available for download from the SBGrid data repository 36 (https://data.sbgrid.org/labs/32/, https://doi.org/10.15785/SBGRID/707).
Generating conformational transition paths with low potential-energy barriers for proteins
Knowledge of conformational transition paths in proteins can be useful for understanding protein mechanisms. Recently, we introduced the As-Rigid-As-Possible (ARAP) interpolation method for generating interpolation paths between two protein conformations. The method was shown to preserve the rigidity of the initial conformation well along the path. However, because the method is purely geometry-based, the generated paths may be physically inconsistent, since atomic interactions are ignored. Therefore, in this article, we introduce a new method to generate conformational transition paths with low potential-energy barriers for proteins. The method is composed of three processing stages. First, ARAP interpolation is used to generate an initial path. Then, the path conformations are improved by a clash remover. Finally, Nudged Elastic Band, a path-optimization method, is used to produce a low-energy path. The paths obtained from the method show large energy reductions relative to those obtained from the ARAP interpolation method alone. The results also show that ARAP interpolation is a good candidate for generating an initial path because it leads to lower potential-energy paths than two other common methods for path interpolation.
Information about solvent concentration and the experimental procedures applied in protein crystallization is not always available from the PDB files (i.e. it is incomplete or absent). To solve this problem, we built a consensus list of organic solvents and non-aqueous crystallization media that are commonly used in the crystallization process. To do this, we referred to crystallographic manuals and research articles. Then, we used this list to search crystal structures (without mutations and with resolution < 4 Å) from the database of Conformational Diversity in the Native State of proteins (CoDNaS), a conformational diversity database based on a collection of redundant structures for the same protein, linked with physicochemical and biological information. The presence of these organic molecules in the crystal, indicated in the HETATM records of the PDB files, was used to distinguish structures from aqueous and non-aqueous environments, and to build the “large” dataset. The large dataset contains 1737 proteins with 3474 conformers. We also considered a second dataset, obtained by web scraping and hand curation, that collects structures related to soaking and co-crystallization methods in organic solvents; it contains 33 proteins and 2755 structures. In this case, bibliographic databases were explored by web scraping to gather research articles related to soaking and co-crystallization methods in organic solvents and/or non-aqueous media. Using text mining, all the articles found were analyzed and related to a PDB structure. The structures obtained were linked with their respective CoDNaS entries in order to get the conformers for each protein. This second dataset was treated as a “control”, and all its tendencies were contrasted with those in the “large” dataset.
Pairs of conformers were checked for the presence of bound ligands in order to obtain bound-bound and unbound-unbound pairs of conformers and so avoid bias in the analysis of conformational diversity. The presence of bound ligands was evaluated using the BioLiP database.
Both datasets were presented and analyzed as having three subgroups of pairs of conformers: those in which both conformers contained any of the common organic solvents and/or non-aqueous media used in protein structure estimations in our list (see Additional file 1: Table S1) or were structures obtained from research articles related to co-crystallization and soaking in organic solvents (OO); those in which only one of them had the organic molecules in its crystal (AO); and those in which no organic solvent was found (AA). In each set, we only considered the highest C-alpha Root Mean Square Deviation (RMSD) between the corresponding conformers for a given protein. Therefore, we obtained three subgroups for the large dataset (AA, AO and OO with 9680, 1737 and 2062 pairs of conformers, respectively) and three subgroups for the control dataset (AA, AO and OO with 33, 31 and 25 pairs of conformers, respectively).
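The subgroup assignment and the selection of the highest-RMSD pair per protein can be sketched as follows. This is an illustrative reading of the procedure, not the authors' scripts; the record layout, the function names `classify_pair` and `highest_rmsd_per_protein`, and the sample values are all assumptions.

```python
def classify_pair(organic_a, organic_b):
    """Assign a conformer pair to OO, AO or AA from two booleans that say
    whether each conformer's crystal contains an organic solvent."""
    if organic_a and organic_b:
        return "OO"
    if organic_a or organic_b:
        return "AO"
    return "AA"

def highest_rmsd_per_protein(pairs):
    """Keep, for each (protein, subgroup), the pair with the largest RMSD.

    pairs: iterable of (protein, conf_a, conf_b, organic_a, organic_b, rmsd).
    Returns {(protein, subgroup): (conf_a, conf_b, rmsd)}.
    """
    best = {}
    for protein, conf_a, conf_b, organic_a, organic_b, rmsd in pairs:
        key = (protein, classify_pair(organic_a, organic_b))
        if key not in best or rmsd > best[key][2]:
            best[key] = (conf_a, conf_b, rmsd)
    return best

# Made-up records: protein id, two conformer ids, solvent flags, C-alpha RMSD (Å)
pairs = [
    ("P1", "a", "b", False, False, 0.4),
    ("P1", "a", "c", False, True, 1.9),
    ("P1", "b", "c", False, True, 2.3),
    ("P2", "x", "y", True, True, 0.8),
]
selected = highest_rmsd_per_protein(pairs)
```

Keeping one maximum-RMSD pair per protein and subgroup means every protein contributes at most one value to each distribution being compared.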
To estimate the structural dissimilarity between conformers, we used the C-alpha RMSD, which was calculated using MAMMOTH. The accessible surface area (ASA) is the surface area of a biomolecule that is accessible to a solvent. ASA calculations for each conformer were obtained using NACCESS (S. Hubbard and J. Thornton. 1993. NACCESS, Computer Program. Department of Biochemistry Molecular Biology, University College London). Global ASA corresponds to the sum of the absolute ASA values of all residues, and relative ASA is calculated for each amino acid in the protein by expressing its accessible surface as a percentage of that observed in an Ala-X-Ala tripeptide.
To obtain a measurement of the amino acid movements, we calculated the number of buried amino acids (residues with relative ASA lower than 25% were considered buried, and those over 25% were considered exposed) for the three populations. All the data were processed using our own scripts coded in Python.
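The burial criterion is simple enough to state as code. The sketch below assumes relative ASA values expressed as percentages and treats a value of exactly 25% as exposed, a boundary case the text leaves open; the function names and the sample values are hypothetical.

```python
def burial_state(relative_asa, threshold=25.0):
    """Label one residue from its relative ASA (in percent).

    Assumption: a value exactly equal to the threshold counts as exposed.
    """
    return "buried" if relative_asa < threshold else "exposed"

def count_buried(relative_asa_values, threshold=25.0):
    """Number of residues in a conformer whose relative ASA is below the threshold."""
    return sum(1 for value in relative_asa_values if value < threshold)

# Made-up relative ASA values (%) for the same four residues in two conformers
conformer_1 = [3.2, 25.0, 80.1, 10.0]
conformer_2 = [30.5, 12.0, 79.0, 9.5]

buried_1 = count_buried(conformer_1)
buried_2 = count_buried(conformer_2)
# Residues whose burial state differs between the two conformers
switched = sum(
    1 for a, b in zip(conformer_1, conformer_2)
    if burial_state(a) != burial_state(b)
)
```

Note that two conformers can have the same number of buried residues while different residues are buried, which is why the sketch also counts per-residue switches.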
To explore the transitions between the different secondary structures, we defined the secondary structure for each conformer using DSSP. The per-position RMSD of C-alpha atoms and of all residue atoms was calculated using ProFit (Martin, A. C. R. and Porter, C. T. http://www.bioinf.org.uk/software/profit/). Disorder was assumed to be represented by missing electron density coordinates in a structure determined by X-ray diffraction. To define intrinsically disordered regions (IDRs), we only considered segments with five or more consecutive missing residues that were not in the amino or carboxyl terminal ends of the protein sequence (the first and last 20 residues of the chain were excluded). Fold class and superfamily were studied using the CATH database. As the control and large datasets showed the same trend in terms of backbone RMSD, these structural analyses were performed only on the large dataset.
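The IDR definition above translates directly into a scan over missing residue numbers. The sketch below assumes 1-based residue numbering and discards any segment that overlaps the first or last 20 residues, which is one possible reading of the exclusion rule; the function name `find_idrs` and the example data are hypothetical.

```python
def find_idrs(missing, chain_length, min_len=5, terminal=20):
    """Return (start, end) segments of consecutive missing residues.

    missing: set of 1-based residue numbers without coordinates.
    A segment is kept only if it has at least min_len residues and lies
    entirely outside the first and last `terminal` residues of the chain.
    """
    segments = []
    start = None
    # Iterate one position past the end so a run reaching the C-terminus closes
    for pos in range(1, chain_length + 2):
        if pos <= chain_length and pos in missing:
            if start is None:
                start = pos
        elif start is not None:
            end = pos - 1
            long_enough = end - start + 1 >= min_len
            internal = start > terminal and end <= chain_length - terminal
            if long_enough and internal:
                segments.append((start, end))
            start = None
    return segments

# Made-up 100-residue chain with four gaps in the electron density
missing = set(range(1, 9)) | set(range(40, 47)) | set(range(60, 63)) | set(range(90, 100))
idrs = find_idrs(missing, chain_length=100)
```

Here only residues 40 to 46 qualify: the gaps at 1 to 8 and 90 to 99 fall in the excluded termini, and the gap at 60 to 62 is shorter than five residues.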
All data obtained were retrieved and processed using home-made scripts coded in Python.
Radii of gyration and H-bonds
Radii of gyration for all PDB structures were estimated using the MMTSB tools (http://blue11.bch.msu.edu/mmtsb/Main_Page). To calculate the number of hydrogen bonds we used HBPLUS. Comparisons between conformers were made using our own Python scripts.
Tunnels and cavities calculation
The number of cavities and tunnels, as well as their properties, were estimated for all conformers using Fpocket and MOLE. All data obtained were retrieved and processed using our own scripts coded in Python.
Dataset distributions were assumed to be continuous and non-parametric, which was confirmed by D’Agostino and Pearson’s normality test. Comparisons within groups were made with the Kolmogorov-Smirnov test, as appropriate. One-way ANOVA was used for multigroup comparisons. A P-value < 0.05 was taken to indicate statistical significance.