Every surface of your body harbors a flourishing microbial ecosystem. This is particularly true of the gastrointestinal system, which runs from your mouth and esophagus (with a detour to the nose), through the stomach, into the small and large intestine and the colon 291. These environments differ in a number of properties, including pH and O2 levels. Near the mouth and esophagus O2 levels are high and microbes can use aerobic (O2-dependent) respiration to extract energy from food. Moving through the system, O2 levels decrease until anaerobic (without O2) mechanisms are necessary. At different positions along the length of the gastrointestinal tract, microbes with different ecological preferences and adaptabilities are found.
One challenge associated with characterizing the exact complexity of the microbiome present at various locations is that the organisms present often depend upon one another for growth; when isolated from one another they do not grow. The standard way to count bacteria is to grow them in the lab using plates of growth media. Samples are diluted so that single bacteria land (in isolation from one another) on the plate. When they grow and divide, they form macroscopic colonies and it is possible to count the number of “colony forming units” (CFUs) per original sample volume. This provides a measure of the number of individual bacteria present. If an organism cannot form a colony under the assay conditions, it will appear to be absent from the population. But as we have just mentioned, some bacteria are totally dependent on others and therefore do not grow in isolation. To avoid this issue, newer molecular methods use DNA sequence analyses to identify which organisms are present without having to grow them 292. This type of analysis reveals the true complexity of the microbial ecosystems living on and within us 293.
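The plate-count arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory protocol: the function name, the colony count, the dilution factor and the plated volume are all hypothetical values chosen for the example.

```python
def cfu_per_ml(colonies, dilution_factor, volume_plated_ml):
    """Estimate colony forming units per mL of the original sample.

    colonies: number of colonies counted on the plate
    dilution_factor: total fold-dilution applied before plating (e.g. 1e6)
    volume_plated_ml: volume spread on the plate, in mL
    """
    if dilution_factor <= 0 or volume_plated_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    # Each colony is assumed to arise from one isolated cell (one CFU).
    return colonies * dilution_factor / volume_plated_ml


# Hypothetical plate: 150 colonies from 0.1 mL of a 10^6-fold dilution.
estimate = cfu_per_ml(150, 1e6, 0.1)
print(f"{estimate:.2e} CFU/mL")
```

Organisms that cannot form a colony in isolation contribute nothing to `colonies`, which is why this estimate undercounts a mixed, interdependent community.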
For our purposes, we will focus on one well known, but relatively minor member of this microbial community, Escherichia coli 294. E. coli is a member of the Enterobacteriaceae family of bacteria and is found in the colon of birds and mammals 295. E. coli is what is known as a facultative aerobe: it can survive in both anaerobic and aerobic environments. This flexibility, as well as E. coli’s generally non-fastidious nutrient requirements, makes it easy to grow in the laboratory. Moreover, the commonly used laboratory strain of E. coli, known as K12, does not cause disease in humans. That said, there are other strains of E. coli, such as E. coli O157:H7, that are pathogenic (disease-causing). E. coli O157:H7 contains 1,387 genes not found in E. coli K12. It is estimated that the two E. coli strains diverged from a common ancestor ~4 million years ago. The details of what makes E. coli O157:H7 pathogenic are a fascinating topic, but beyond our scope here 296.
Adaptive behavior and gene networks (the lac response): Lactose is a disaccharide (a sugar) composed of D-galactose and D-glucose. It is synthesized, biologically, exclusively by female mammals. Mammals use lactose in milk as a source of calories (energy) for infants. One reason (it is thought) is that lactose is not easily digested by most microbes. The lactose synthesis system is derived from an evolutionary modification of an ancestral gene that encodes the enzyme lysozyme. Through duplication and mutation, a gene encoding the protein α-lactalbumin was generated. α-lactalbumin is expressed only in mammary glands, where it forms a complex with a ubiquitously expressed protein, galactosyltransferase, to form the protein lactose synthase 297.
E. coli is capable of metabolizing lactose, but only when there are no better (easier) sugars to eat. If glucose or other compounds are present in the environment, the genes required to metabolize lactose are turned off. Two genes are required for E. coli to metabolize lactose. The first encodes lactose permease. Lactose, being large and highly hydrophilic, cannot pass through the E. coli cell membrane. Lactose permease is a membrane protein that allows lactose to enter the cell, moving down its concentration gradient. The second gene involved in lactose utilization encodes the enzyme β-galactosidase, which splits lactose into D-galactose and D-glucose, both of which can be metabolized by proteins expressed constitutively (that is, all of the time) within the cell. So how exactly does this system work? How are the lactose utilization genes turned off in the absence of lactose, and how are they turned on when lactose is present and energy is needed? The answers illustrate general principles of the interaction networks controlling gene expression.
In E. coli, like many bacteria, multiple genes are organized into what are known as operons. In an operon, a single regulatory region controls the expression of multiple genes. It is also common in bacteria that multiple genes involved in a single metabolic pathway are located in the same operon (the same region of the DNA). A powerful approach to the study of genes is to look for relevant mutant phenotypes. As we said, wild type (that is, normal) E. coli can grow on lactose as their sole energy source. So to understand lactose utilization, we can look for mutant E. coli that cannot grow on lactose 298. To make the screen for such mutations more relevant, we first check to make sure that the mutant can grow on glucose. Why? Because we are not really interested (in this case) in mutations in genes that disrupt standard metabolism, for example the ability to use glucose; we seek to understand the genes involved in the specific process of lactose metabolism. Such an analysis revealed a number of distinct classes of mutations: some led to an inability to respond to the presence of lactose in the medium, others led to de-repression, that is, the constant expression of two genes involved in the ability to metabolize lactose, lactose permease and β-galactosidase. In these mutant strains both genes were expressed whether or not lactose was present. By mapping (using the Hfr system, see above) where these mutations are in the genome of E. coli, and through a number of other experiments, the following model was generated.
The genes encoding lactose permease (lacY) and β-galactosidase (lacZ) are part of an operon, known as the lac operon. This operon is regulated by two distinct factors. The first is the product of a constitutively active gene, lacI, which encodes a polypeptide that assembles into a tetrameric protein that acts as a transcriptional repressor. In a typical cell there are ~10 lac repressor proteins present. The lac repressor protein binds to sites in the promoter of the lac operon. When bound to these sites the repressor protein blocks transcription (expression) of the lac operon. The repressor’s binding sites within the lac operon promoter appear to be its only functionally significant binding sites in the entire E. coli genome. The second regulatory element in the system is known as the activator site. It can bind the catabolite activator protein (or CAP), which is encoded by a gene located outside of the lac operon. The DNA binding activity of CAP is regulated by the binding of a co-factor, cyclic adenosine monophosphate (cAMP). cAMP accumulates in the cell when nutrients, specifically free energy delivering nutrients (like glucose), are low. Its presence acts as a signal that the cell needs energy. In the absence of cAMP, CAP does not bind to or activate expression of the lac operon, but in its presence (that is, when energy is needed), the CAP-cAMP complex is active, binds to a site in the lac operon promoter, and recruits and activates RNA polymerase, leading to the synthesis of lactose permease and β-galactosidase RNAs and proteins. However, even if energy levels are low (and cAMP levels are high), the lac operon is inactive in the absence of lactose because of the binding of the lac repressor protein to sites (labelled O1, O2, and O3) in the regulatory region of the lac operon.
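The two-factor control described in this model can be summarized as a small truth table. The Python sketch below is a deliberately idealized, deterministic reading of the paragraph: it treats each regulatory state as simply true or false, and the function and parameter names are hypothetical labels introduced for illustration.

```python
from itertools import product


def lac_operon_expressed(energy_low, lactose_in_cell):
    """Idealized on/off logic for the lac operon.

    energy_low: True when free-energy nutrients such as glucose are scarce
    lactose_in_cell: True when lactose has entered the cell
    """
    camp_high = energy_low                 # cAMP accumulates when energy is needed
    cap_active = camp_high                 # CAP binds DNA only with cAMP bound
    repressor_bound = not lactose_in_cell  # without lactose the repressor stays bound
    # Expression needs the activator bound AND the repressor released.
    return cap_active and not repressor_bound


for energy_low, lactose_in_cell in product([False, True], repeat=2):
    state = "ON" if lac_operon_expressed(energy_low, lactose_in_cell) else "off"
    print(f"energy_low={energy_low!s:5}  lactose_in_cell={lactose_in_cell!s:5}  ->  {state}")
```

Only one of the four combinations (low energy and lactose inside the cell) gives expression, which matches the statement that the operon stays inactive without lactose even when cAMP levels are high.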
So what happens when lactose appears in the cell’s environment? Well, obviously nothing, since the cells are expressing the lac repressor, so no lactose permease is present, and lactose cannot enter the cell without it. But that prediction assumes that, at the molecular level, the system works perfectly and deterministically. This is not the case, however; the system is stochastic, that is, subject to the effects of random processes: it is noisy and probabilistic. Given the small number of lac repressor molecules per cell (~10), there is a small but significant chance that, at random, the lac operon of a particular cell will be free of bound repressor. If this occurs under conditions in which CAP is active then, if lactose is present, we see the effect of a positive feedback loop 299. When lactose is added, those cells that have, by chance, expressed both lactose permease and β-galactosidase (a small percentage of the total cell population) will respond: lactose will enter these cells (since the permease is present) and, since β-galactosidase is also present, it will be converted to allolactose (a reaction catalyzed by β-galactosidase). Allolactose binds to, and inhibits the activity of, the lac repressor protein. In the presence of allolactose the repressor no longer inhibits lac operon expression and there is a further increase (~1000-fold) in the rate of expression of lactose permease and β-galactosidase. β-galactosidase also catalyzes the hydrolysis of lactose into D-galactose and D-glucose, which are then used to drive cellular metabolism. Through this process, the cell goes from essentially no expression of the lac operon to full expression, and with full expression becomes able to metabolize lactose. At the same time, those cells that did not (by chance) express lactose permease and β-galactosidase will not be able to metabolize lactose at all. So even though all of the E. coli cells present in a culture may be genetically identical, they can express different phenotypes due to the stochastic nature of gene expression. An example of such behavior is presented in the PhET gene expression basics applet 300. In the case of the lac system, over time the noisy nature of gene expression leads to more and more cells activating their copy of the lac operon. Once “on”, as long as lactose is present in the system, its entry into the cell and its conversion into allolactose will keep the lac repressor protein in an inactive state and allow continued expression of the lac operon.
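The population-level consequence of this noisy switch can be illustrated with a toy simulation. The sketch below is an assumption-laden caricature rather than a quantitative model: `p_leak` is a hypothetical per-step probability that an "off" cell is transiently free of repressor and makes enough permease and β-galactosidase for the feedback loop to lock in, and the cell and step counts are arbitrary.

```python
import random


def simulate_population(n_cells, n_steps, p_leak, lactose_present, seed=0):
    """Return the fraction of cells with the lac operon 'on' after each step.

    p_leak is a made-up parameter standing in for the chance, per time step,
    that a repressed cell escapes repression long enough to switch on.
    """
    rng = random.Random(seed)          # seeded so the run is reproducible
    on = [False] * n_cells             # genetically identical cells, all start 'off'
    fractions = []
    for _ in range(n_steps):
        for i in range(n_cells):
            # Positive feedback: once on, a cell stays on while lactose persists.
            if not on[i] and lactose_present and rng.random() < p_leak:
                on[i] = True
        fractions.append(sum(on) / n_cells)
    return fractions


with_lactose = simulate_population(1000, 50, 0.05, lactose_present=True)
without_lactose = simulate_population(1000, 50, 0.05, lactose_present=False)
print("fraction on with lactose, steps 1, 10, 50:",
      with_lactose[0], with_lactose[9], with_lactose[-1])
print("fraction on without lactose, step 50:", without_lactose[-1])
```

In this sketch the "on" fraction rises gradually rather than all at once, so at intermediate times the culture is a mixture of two phenotypes even though every cell follows the same rules.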
What happens if lactose disappears from the environment? What determines how long it takes for the cells to return to the state in which they no longer express the lac operon? The answer is determined by the effects of cell division and regulatory processes. In the absence of lactose the allolactose concentration falls, the lac repressor protein returns to its active state, and it again inhibits expression of the lac operon. No new lactose permease and β-galactosidase will be synthesized and their concentrations will fall due to degradation. At the same time, and again because their synthesis has stopped, with each cell division the concentrations of lactose permease and β-galactosidase will decrease by ~50%. With time the proteins will be diluted (and degraded) and so the cells return to the initial state, that is, with the lac operon off and no copies of either lactose permease or β-galactosidase present.
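The combined effect of dilution and degradation can be made concrete with a short calculation. The following Python sketch assumes that synthesis has stopped completely, that each division halves the concentration, and that a fixed fraction of the remaining protein is degraded per generation; the degradation fraction and the threshold are hypothetical values chosen for illustration.

```python
def remaining_fraction(generations, degraded_per_generation=0.0):
    """Fraction of the original protein concentration left after some divisions.

    Each generation halves the concentration (dilution by division) and removes
    an additional, assumed fraction by degradation.
    """
    keep = 0.5 * (1.0 - degraded_per_generation)
    return keep ** generations


def generations_to_fall_below(threshold, degraded_per_generation=0.0):
    """Smallest number of generations after which the fraction is below threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    n = 0
    while remaining_fraction(n, degraded_per_generation) >= threshold:
        n += 1
    return n


print(remaining_fraction(1))                 # one division, dilution only
print(generations_to_fall_below(1e-3))       # dilution only
print(generations_to_fall_below(1e-3, 0.2))  # dilution plus assumed 20% degradation
```

With dilution alone, roughly ten divisions are needed to drop below one thousandth of the starting level, since 2^10 = 1024; any degradation shortens that time.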
Decrease of energy spilling in Escherichia coli continuous cultures with rising specific growth rate and carbon wasting
Growth substrates, aerobic/anaerobic conditions, specific growth rate (μ) etc. strongly influence Escherichia coli cell physiology in terms of cell size, biomass composition, gene and protein expression. To understand the regulation behind these different phenotype properties, it is useful to know carbon flux patterns in the metabolic network which are generally calculated by metabolic flux analysis (MFA). However, rarely is biomass composition determined and carbon balance carefully measured in the same experiments which could possibly lead to distorted MFA results and questionable conclusions. Therefore, we carried out both detailed carbon balance and biomass composition analysis in the same experiments for more accurate quantitative analysis of metabolism and MFA.
We applied advanced continuous cultivation methods (A-stat and D-stat) to continuously monitor E. coli K-12 MG1655 flux and energy metabolism dynamic responses to change of μ and glucose-acetate co-utilisation. Surprisingly, a 36% reduction of ATP spilling was detected with increasing μ and carbon wasting to non-CO2 by-products under constant biomass yield. The apparent discrepancy between constant biomass yield and decline of ATP spilling could be explained by the rise of carbon wasting from 3 to 11% in the carbon balance which was revealed by the discovered novel excretion profile of E. coli pyrimidine pathway intermediates carbamoyl-phosphate, dihydroorotate and orotate. We found that carbon wasting patterns are dependent not only on μ, but also on glucose-acetate co-utilisation capability. Accumulation of these compounds was coupled to the two-phase acetate accumulation profile. Acetate overflow was observed in parallel with the reduction of TCA cycle and glycolysis fluxes, and induction of pentose phosphate pathway.
It can be concluded that acetate metabolism is one of the major regulating factors of central carbon metabolism. More importantly, our model calculations with actual biomass composition and detailed carbon balance analysis in steady state conditions with -omics data comparison demonstrate the importance of a comprehensive systems biology approach for more advanced understanding of metabolism and carbon re-routing mechanisms potentially leading to more successful metabolic engineering.
Organization and sequence of the genome
Sequence analysis. To obtain the contiguous genome sequence, a combined approach was used that involved the systematic sequence analysis of selected large-insert clones (cosmids and BACs) as well as random small-insert clones from a whole-genome shotgun library. This culminated in a composite sequence of 4,411,529 base pairs (bp) (Figs 1, 2), with a G + C content of 65.6%. This represents the second-largest bacterial genome sequence currently available (after that of Escherichia coli) 9. The initiation codon for the dnaA gene, a hallmark for the origin of replication, oriC, was chosen as the start point for numbering. The genome is rich in repetitive DNA, particularly insertion sequences, and in new multigene families and duplicated housekeeping genes. The G + C content is relatively constant throughout the genome (Fig. 1), indicating that horizontally transferred pathogenicity islands of atypical base composition are probably absent. Several regions showing higher than average G + C content (Fig. 1) were detected; these correspond to sequences belonging to a large gene family that includes the polymorphic G + C-rich sequences (PGRSs).
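The G + C statistics quoted above (a genome-wide value and its variation along the chromosome) are simple to compute once a sequence is in hand. The Python sketch below shows one way to do it; the toy sequence, window size and function names are hypothetical, and this is not the software used by the authors.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def gc_windows(sequence, window, step):
    """G + C fraction in successive windows, for spotting atypical regions."""
    return [gc_content(sequence[i:i + window])
            for i in range(0, len(sequence) - window + 1, step)]


toy = "ATATATATGCGCGCGCGGCCGGCCATGC"   # made-up sequence, not M. tuberculosis DNA
print(f"overall G + C: {gc_content(toy):.1%}")
print("per-window G + C:", gc_windows(toy, window=8, step=4))
```

A region whose windowed value departs strongly from the genome-wide average is the kind of signal that would flag a horizontally acquired island or, as in the text, a G + C-rich gene family.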
The outer circle shows the scale in Mb, with 0 representing the origin of replication. The first ring from the exterior denotes the positions of stable RNA genes (tRNAs are blue, others are pink) and the direct repeat region (pink cube); the second ring inwards shows the coding sequence by strand (clockwise, dark green; anticlockwise, light green); the third ring depicts repetitive DNA (insertion sequences, orange; 13E12 REP family, dark pink; prophage, blue); the fourth ring shows the positions of the PPE family members (green); the fifth ring shows the PE family members (purple, excluding PGRS); and the sixth ring shows the positions of the PGRS sequences (dark red). The histogram (centre) represents G + C content, with <65% G + C in yellow, and >65% G + C in red. The figure was generated with software from DNASTAR.
Genes for stable RNA. Fifty genes coding for functional RNA molecules were found. These molecules were the three species produced by the unique ribosomal RNA operon, the 10Sa RNA involved in degradation of proteins encoded by abnormal messenger RNA, the RNA component of RNase P, and 45 transfer RNAs. No 4.5S RNA could be detected. The rrn operon is situated unusually, as it occurs about 1,500 kilobases (kb) from the putative oriC; most eubacteria have one or more rrn operons near to oriC to exploit the gene-dosage effect obtained during replication 10. This arrangement may be related to the slow growth of M. tuberculosis. The genes encoding tRNAs that recognize 43 of the 61 possible sense codons were distributed throughout the genome and, with one exception, none of these uses A in the first position of the anticodon, indicating that extensive wobble occurs during translation. This is consistent with the high G + C content of the genome and the consequent bias in codon usage. Three genes encoding tRNAs for methionine were found; one of these genes (metV) is situated in a region that may correspond to the terminus of replication (Figs 1, 2). As metV is linked to defective genes for integrase and excisionase, perhaps it was once part of a phage or similar mobile genetic element.
Insertion sequences and prophages. Sixteen copies of the promiscuous insertion sequence IS6110 and six copies of the more stable element IS1081 reside within the genome of H37Rv 8 . One copy of IS1081 is truncated. Scrutiny of the genomic sequence led to the identification of a further 32 different insertion sequence elements, most of which have not been described previously, and of the 13E12 family of repetitive sequences which exhibit some of the characteristics of mobile genetic elements (Fig. 1). The newly discovered insertion sequences belong mainly to the IS3 and IS256 families, although six of them define a new group. There is extensive similarity between IS1561 and IS1552 with insertion sequence elements found in Nocardia and Rhodococcus spp., suggesting that they may be widely disseminated among the actinomycetes.
Most of the insertion sequences in M. tuberculosis H37Rv appear to have inserted in intergenic or non-coding regions, often near tRNA genes (Fig. 1). Many are clustered, suggesting the existence of insertional hot-spots that prevent genes from being inactivated, as has been described for Rhizobium 11 . The chromosomal distribution of the insertion sequences is informative as there appears to have been a selection against insertions in the quadrant encompassing oriC and an overrepresentation in the direct repeat region that contains the prototype IS6110. This bias was also observed experimentally in a transposon mutagenesis study 12 .
At least two prophages have been detected in the genome sequence and their presence may explain why M. tuberculosis shows persistent low-level lysis in culture. Prophages phiRv1 and phiRv2 are both ∼ 10 kb in length and are similarly organized, and some of their gene products show marked similarity to those encoded by certain bacteriophages from Streptomyces and saprophytic mycobacteria. The site of insertion of phiRv1 is intriguing as it corresponds to part of a repetitive sequence of the 13E12 family that itself appears to have integrated into the biotin operon. Some strains of M. tuberculosis have been described as requiring biotin as a growth supplement, indicating either that phiRv1 has a polar effect on expression of the distal bio genes or that aberrant excision, leading to mutation, may occur. During the serial attenuation of M. bovis that led to the vaccine strain M. bovis BCG, the phiRv1 prophage was lost 13 . In a systematic study of the genomic diversity of prophages and insertion sequences (S.V.G. et al., manuscript in preparation), only IS1532 exhibited significant variability, indicating that most of the prophages and insertion sequences are currently stable. However, from these combined observations, one can conclude that horizontal transfer of genetic material into the free-living ancestor of the M. tuberculosis complex probably occurred in nature before the tubercle bacillus adopted its specialized intracellular niche.
Genes encoding proteins. 3,924 open reading frames were identified in the genome (see Methods), accounting for ∼91% of the potential coding capacity (Figs 1, 2). A few of these genes appear to have in-frame stop codons or frameshift mutations (irrespective of the source of the DNA sequenced) and may either use frameshifting during translation or correspond to pseudogenes. Consistent with the high G + C content of the genome, GTG initiation codons (35%) are used more frequently than in Bacillus subtilis (9%) and E. coli (14%), although ATG (61%) is the most common translational start. There are a few examples of atypical initiation codons, the most notable being the ATC used by infC, which begins with ATT in both B. subtilis and E. coli 9, 14. There is a slight bias in the orientation of the genes (Fig. 1) with respect to the direction of replication, as ∼59% are transcribed with the same polarity as replication, compared with 75% in B. subtilis. In other bacteria, genes transcribed in the same direction as the replication forks are believed to be expressed more efficiently 9, 14. Again, the more even distribution in gene polarity seen in M. tuberculosis may reflect the slow growth and infrequent replication cycles. Three genes (dnaB, recA and Rv1461) have been invaded by sequences encoding inteins (protein introns) and in all three cases their counterparts in M. leprae also contain inteins, but at different sites 15 (S.T.C. et al., unpublished observations).
Protein function, composition and duplication. By using various database comparisons, we attributed precise functions to ∼40% of the predicted proteins and found some information or similarity for another 44%. The remaining 16% resembled no known proteins and may account for specific mycobacterial functions. Examination of the amino-acid composition of the M. tuberculosis proteome by correspondence analysis 16, and comparison with that of other microorganisms whose genome sequences are available, revealed a statistically significant preference for the amino acids Ala, Gly, Pro, Arg and Trp, which are all encoded by G + C-rich codons, and a comparative reduction in the use of amino acids encoded by A + T-rich codons such as Asn, Ile, Lys, Phe and Tyr (Fig. 3). This approach also identified two groups of proteins rich in Asn or Gly that belong to new families, PE and PPE (see below). The fraction of the proteome that has arisen through gene duplication is similar to that seen in E. coli or B. subtilis (∼51%; refs 9, 14), except that the level of sequence conservation is considerably higher, indicating that there may be extensive redundancy or differential production of the corresponding polypeptides. The apparent lack of divergence following gene duplication is consistent with the hypothesis that M. tuberculosis is of recent descent 6.
Note the extreme position of M. tuberculosis and the shift in amino-acid preference reflecting increasing G + C content from left to right. Abbreviations used: Ae, Aquifex aeolicus; Af, Archaeoglobus fulgidus; Bb, Borrelia burgdorferi; Bs, B. subtilis; Ce, Caenorhabditis elegans; Ec, E. coli; Hi, Haemophilus influenzae; Hp, Helicobacter pylori; Mg, Mycoplasma genitalium; Mj, Methanococcus jannaschii; Mp, Mycoplasma pneumoniae; Mt, M. tuberculosis; Mth, Methanobacterium thermoautotrophicum; Sc, Saccharomyces cerevisiae; Ss, Synechocystis sp. strain PCC6803. F1 and F2, first and second factorial axes 16.
DISCUSSION
FIG 7 Summary of how an E. coli nitrogen starvation response impacts cross-feeding. The arrow thickness indicates relative flux. (A) Low NH4+ excretion levels by R. palustris (Rp) limit the ability of E. coli (Ec) to obtain NH4+ by diffusion across the membrane as NH3. Low NH4+ availability is sensed by E. coli through the sensor kinase NtrB, which phosphorylates the response regulator NtrC (see reference 19 for details on how nitrogen availability is sensed and transmitted). NtrC upregulates the expression of many genes involved in scavenging nitrogen, including the gene for the high-affinity NH4+ transporter AmtB. Higher AmtB levels allow E. coli to acquire the small amounts of NH4+ excreted by R. palustris, supporting E. coli growth and the mutualistic excretion of organic acids, which R. palustris uses as a carbon source. (B) Without NtrC, E. coli AmtB levels remain low, and R. palustris has a competitive advantage in reacquiring excreted NH4+. Starved for nitrogen, E. coli growth and organic acid cross-feeding slow, thereby threatening the stability of the mutualism. PTS, phosphotransferase system. (Adapted from reference 12.)
Noroviruses (NoVs) are one of the leading causes of acute gastroenteritis, including both outbreaks and endemic infections. The development of preventive strategies, including vaccines, for the most susceptible groups (children <5 years of age, the elderly and individuals exposed to crowding, such as military personnel and travelers) is desirable. However, NoV vaccine development has faced many difficulties, including genetic/antigenic diversity, limited knowledge of NoV immunology and the viral cycle, lack of a permissive cell line for cultivation and lack of a widely available and successful animal model. Vaccine candidates rely on inoculation of virus-like particles (VLPs) formed by the main capsid protein VP1, subviral particles made from the protruding domain of VP1 (P-particles) or viral vectors with a NoV capsid gene insert produced by bioengineering technologies. Polyvalent vaccines including multiple NoV genotypes and/or other viruses acquired by the enteric route have been developed. A VLP vaccine candidate has reached phase II clinical trials and several others are in pre-clinical stages of development. In this article we discuss the main challenges facing the development of a NoV vaccine and the current status of prevailing candidates.
Dickson Lab @ MSU
Dickson A and Brooks III CL*. PLOS Comp. Biol. (2013)
For cells to function, the concentrations of all proteins in the cell must be maintained at the proper levels (proteostasis). This task - complicated by cellular stresses, protein misfolding, aggregation, and degradation - is performed by a collection of chaperones that alter the configurational landscape of a given client protein through the formation of protein-chaperone complexes. The set of all such complexes and the transitions between them form the proteostasis network. Recently, a computational model was introduced (FoldEco) that synthesizes experimental data into a system-wide description of the proteostasis network of E. coli. This model describes the concentrations over time of all the species in the system, which include different conformations of the client protein, as well as protein-chaperone complexes. We apply to this model a recently developed analysis tool to calculate mediation probabilities in complex networks. This allows us to determine the probability that a given chaperone system is used to mediate transitions between client protein conformations, such as folding, or the correction of misfolded conformations. We determine how these probabilities change both across different proteins, as well as with system parameters, such as the synthesis rate, and in each case reveal in detail which factors control the usage of one chaperone system over another. We find that the different chaperone systems do not operate orthogonally and can compensate for each other when one system is disabled or overworked, and that this can complicate the analysis of knockout experiments, where the concentration of native protein is compared both with and without the presence of a given chaperone system. This study also gives a general recipe for conducting a transition-path-based analysis on a network of coupled chemical reactions, which can be useful in other types of networks as well.
The nik operon of Escherichia coli encodes a periplasmic binding-protein-dependent transport system for nickel
Laboratoire de Génétique Moléculaire des Microorganismes, CNRS-URA 1486, INSA, Bâtiment 406, 20 Avenue Albert Einstein, 69621 Villeurbanne Cedex, France.
The complete nucleotide sequence of the Escherichia coli nik locus, which has been suggested to encode the specific transport system for nickel, has been determined. It was found to contain five overlapping open reading frames that form a single transcription unit. The deduced amino acid sequence of the nik operon shows that its five gene products, NikA to NikE, are highly homologous to components of oligopeptide- and dipeptide-binding protein-dependent transport systems from several Gram-negative and Gram-positive species. NikA represents the periplasmic binding protein, NikB and NikC are similar to integral membrane components of periplasmic permeases, and NikD and NikE possess typical ATP-binding domains, suggesting a role in coupling energy to the transport process. Insertion mutations in nik genes totally abolished the nickel-containing hydrogenase activity under nickel limitation and markedly altered the rate of nickel transport. Taken together, these data support the notion that the nik operon encodes a typical periplasmic binding-protein-dependent transport system for nickel.
A simplification of Cobelli’s glucose–insulin model for type 1 diabetes mellitus and its FPGA implementation
Cobelli’s glucose–insulin model is the only computer simulator of glucose–insulin interactions accepted by the Food and Drug Administration as a substitute for animal trials. However, it consists of multiple differential equations that make it hard to implement on a hardware platform. In this investigation, Cobelli’s model is simplified by the Padé approximant method and implemented on a field-programmable gate array-based platform as a hardware model for predicting glucose changes in subjects with type 1 diabetes mellitus. Compared with the original Cobelli’s model, the implemented hardware model provides a nearly perfect approximation in predicting glucose changes, with rather small root-mean-square errors (RMSEs) and maximum errors. The RMSE results for 30 subjects show that the method for simplifying and implementing Cobelli’s model has good robustness and applicability. The successful hardware implementation of Cobelli’s model will promote a wider adoption of this model that can substitute for animal trials, provide fast and reliable glucose and insulin estimation, and ultimately assist the further development of an artificial pancreas system.
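To give a feel for the two ingredients named in this abstract, the Python sketch below builds a Padé approximant and scores it with a root-mean-square error. It is a generic textbook illustration and not the authors' simplification of Cobelli's model: the target function (the exponential), the [1/1] order and the sample grid are all assumptions chosen because they are easy to check by hand.

```python
import math


def pade_1_1_exp(x):
    """[1/1] Pade approximant of exp(x): a ratio of two first-order polynomials."""
    return (1 + x / 2) / (1 - x / 2)


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


xs = [i / 10 for i in range(-5, 6)]   # arbitrary grid from -0.5 to 0.5
approx = [pade_1_1_exp(x) for x in xs]
exact = [math.exp(x) for x in xs]
error = rmse(approx, exact)
print(f"RMSE of the [1/1] Pade approximant on [-0.5, 0.5]: {error:.4f}")
```

A rational approximation of this kind needs only additions, multiplications and one division, which is the sort of arithmetic that maps comfortably onto hardware.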
GhoT increases persistence
Initially, we examined the 14 E. coli transcripts that lack GCU sites to determine if these transcripts were related to the ability of MqsR to increase persistence 6. The effect of producing MqsR from pCA24N-mqsR in each of the 14 isogenic single-gene knockouts (ghoT, hisL, kilR, pheL, ralR, tnaC, trpL, yahH, ybfQ, ybhT, yciG, ygaQ, yheV, and ymdF) (see Supplementary Results, Supplementary Table 1) was investigated 2 h after the addition of ampicillin (100 µg/mL) to determine if the ability of MqsR to increase persistence was altered. We found that deleting ghoT had one of the largest effects on MqsR-mediated persistence (Supplementary Table 2). In strains where the kanamycin gene replacement might create a polar effect, the resistance gene was removed and the strains retested. In no case could the effects be ascribed to polarity (Supplementary Table 2). A time course study further confirmed that deleting ghoT significantly reduced MqsR-mediated persistence (27 ± 2-fold reduction for ΔghoT/pCA24N-mqsR vs. BW25113/pCA24N-mqsR) and made the cell behave similarly to the wild-type strain without MqsR production (Fig. 1a). Corroborating the dependence of MqsR on GhoT to increase persistence, producing GhoT in a ΔghoT strain also increased persistence 48 ± 3-fold, to levels seen while producing MqsR (Fig. 1b). In fresh LB medium, some of these persister cells revived with a lag time of roughly 4 h, which is comparable to reported values 24 (Fig. 1c).
Cell survival (%) after ampicillin (100 µg/mL) treatment for 2, 4, and 6 h with MqsR production with and without ghoT (a), or with GhoT production (b). wt indicates the wild-type host (E. coli BW25113). (c) Revival of GhoT-induced persister cells was tested by producing GhoT in BW25113 ΔghoT/pCA24N-ghoT while treating cells with ampicillin (100 µg/ml) for 2 hours. Growth in fresh LB medium was compared to control cells that lacked ampicillin treatment. At least three independent cultures of each strain were evaluated for each experiment, and error bars indicate standard error of mean.
GhoT affects the membrane and produces ghost cells
The organization of the ghoST operon and the impact of GhoT on persistence suggested that it might be a TA pair (Supplementary Fig. 1a). Hence, we tested whether or not GhoT is a toxin. GhoT is predicted to be a small (57 aa), highly hydrophobic protein with two transmembrane domains (residues 7 to 27 and 37 to 57) 25 . When GhoT was produced in the wild-type strain, which contains a chromosomal gene for the putative antitoxin GhoS, the turbidity of the culture decreased (Fig. 2a), and this decrease was due to cell lysis since cell cultures became clear (Fig. 2b). Corroborating these results, production of GhoT caused 60% of the cells to adopt “ghost” morphologies as observed using phase contrast microscopy (Fig. 2c); ghost cells are dead or dying cells in which the damaged membrane causes the cell poles to appear dense and the center to appear transparent 26 . Therefore, GhoT is a toxin that, when overproduced, lyses cells by disrupting the cell membrane to form ghost cells.
(a) Cell growth in LB medium for cells producing GhoT and GhoS. Note the chromosomal copy of ghoS in the wild-type strain allows for some growth with toxin GhoT production. GhoSX is truncated GhoS with a stop codon introduced at Tyr16. Three independent cultures of each strain were evaluated, and error bars indicate standard error of mean (n = 3). (b) Cell culture at the end of growth in (a) at 20 h to show the clearance and lysis due to production of GhoT. Scale bar represents 1 cm. (c) Cell morphology after incubating for 8 h at 37°C. Scale bar represents 5 µm. For (a), (b) and (c), Empty: BW25113/pCA24N/pBS(Kan), GhoT: BW25113/pCA24N-ghoT/pBS(Kan), GhoS: BW25113/pCA24N/pBS(Kan)-ghoS, GhoT + GhoS: BW25113/pCA24N-ghoT/pBS(Kan)-ghoS, and GhoT + GhoSX: BW25113/pCA24N-ghoT/pBS(Kan)-ghoSX. Plasmids were retained with kanamycin (50 µg/mL) and chloramphenicol (30 µg/mL); 0.5 mM IPTG was used at time 0 to produce the plasmid-based proteins. Three independent cultures of each strain were evaluated. (d) Growth on LB plates with kanamycin (50 µg/mL), chloramphenicol (30 µg/mL), and IPTG (1 mM, to induce ghoT via pCA24N-ghoT). In the absence of a chromosomal copy of ghoS, there is no growth with toxin GhoT production. ΔghoS is BW25113 ΔghoS and ΔghoT is BW25113 ΔghoT. p refers to pCA24N and p-ghoT refers to pCA24N-ghoT, respectively. Three independent cultures of each strain were evaluated. Scale bar represents 1 cm.
GhoS is an antitoxin and GhoT/GhoS form a TA pair
Like toxin GhoT, GhoS is also a small protein (98 aa). There are 27 bp between the two genes, which include a putative RBS for ghoT; therefore, ghoST are predicted to comprise a single operon. Unlike GhoT, production of GhoS was not toxic, and it completely counteracted the toxicity of GhoT (Fig. 2a). Furthermore, production of GhoS with GhoT reduced the formation of ghost cells by 18-fold based on microscopic observation of ~500 cells (Fig. 2c), whereas producing GhoS alone did not cause ghost cells to form.
Replacement of antitoxin ghoS with a kanamycin cassette 27 is not lethal, likely due to the polar effect on downstream ghoT. However, when GhoT was produced via pCA24N-ghoT, deletion of ghoS was lethal as growth was completely inhibited in the ΔghoS mutant but not in the ΔghoT mutant, which has an intact chromosomal copy of ghoS ( Fig. 2d ). This is a typical feature of TA systems 20 , although polar mutations can mask this effect if the antitoxin gene precedes the toxin gene.
As shown by reverse transcription polymerase chain reaction (RT-PCR), ghoST form a single operon since they are co-transcribed. Specifically, a single band of ~400 bp was detected using a forward primer in the first gene (ghoS-f) and a reverse primer in the second gene (ghoT-r) (Supplementary Table 3) using cDNA synthesized from total RNA as the template (Supplementary Fig. 1b). As controls, the same band was detected using genomic DNA as template but not for total RNA. Collectively, these results demonstrate that GhoS is an antitoxin, that ghoT and ghoS are co-transcribed, and that they form a TA system.
GhoS is a proteic monomeric antitoxin
To demonstrate that GhoS functions as a proteic antitoxin, we introduced a stop codon by a single nucleotide change into ghoS DNA at corresponding amino acid position 16 (Tyr16) and tested its impact on cell growth. We found that the early termination mutation abolished the ability of GhoS to block the toxicity of GhoT for both cell growth ( Fig. 2a ) and ghost cell formation. We also found that antitoxin GhoS is not degraded (Supplementary Fig. 2a) in response to stress 28 , whereas most antitoxins are degraded (Supplementary Fig. 2b), and found that GhoS does not bind its own promoter (Supplementary Fig. 3). In addition, size exclusion chromatography, dynamic light scattering (Supplementary Fig. 4) and biomolecular NMR experiments (see below) demonstrated that GhoS is a monomer in solution. Collectively, these results show that GhoS is a non-canonical antitoxin because it does not regulate its own transcription, it is stable, and it is a monomer in solution.
GhoS adopts a ferredoxin-like fold similar to CAS2
Analysis of the GhoS protein sequence using BLAST revealed that while it is conserved among multiple species of E. coli, it is not similar to any protein whose structure or function is known. Because function is more highly conserved than sequence, we used biomolecular NMR spectroscopy to determine the structure of GhoS and, in turn, gain insights into its biological function. In the sequence-specific backbone assignment, 95 of the expected 96 backbone amide NH pairs (3 prolines) are assigned, with the missing residue corresponding to the N-terminal cloning artifact His(−1) (Supplementary Fig. 5). A total of 2479 Nuclear Overhauser Enhancement (NOE)-derived distance constraints were used for the structure calculation (~25 NOE constraints/residue) using a simulated annealing protocol within the program CYANA 29 and refined in explicit solvent using CNS 32 . The GhoS model has excellent stereochemistry (see Supplementary Methods) and the root-mean-square-deviation value about the mean coordinate positions of the backbone atoms for residues 5 to 95 is 0.36 ± 0.08 Å (20 models in the ensemble; Supplementary Fig. 6a). NMR and refinement statistics are reported in Supplementary Table 4. The three-dimensional GhoS structure consists of three α-helices and five β-strands (Fig. 3a) and is stabilized by two hydrophobic clusters. The central hydrophobic core consists of residues Tyr10, Val12, Phe14, Tyr16, Phe24, Leu27, Met31, Met34, Phe36, Phe55, Ile57, Ile66, Ile70, Leu77, Ile80, Phe82, Leu84, and Ile86 (Supplementary Fig. 6b). The structure is also stabilized by a second hydrophobic cluster comprised of Val11, Val40, Leu50, Ala56, Met87, Val89, Tyr92, and Phe93 (Supplementary Fig. 6c).
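The ensemble statistic quoted above (RMSD about the mean coordinate positions, reported as mean ± spread over the models) can be sketched as follows. This is a minimal illustration, not the procedure used by CYANA or CNS: it assumes the models are already superimposed, omits the fitting step, and uses toy coordinates and hypothetical function names.

```python
import math


def mean_structure(models):
    """Average each atom's x, y, z over all models (assumed pre-superimposed)."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(model[a][k] for model in models) / n_models for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def ensemble_rmsd(models):
    """Mean and sample standard deviation of each model's RMSD to the mean structure."""
    mean = mean_structure(models)
    values = [rmsd(model, mean) for model in models]
    average = sum(values) / len(values)
    variance = sum((v - average) ** 2 for v in values) / (len(values) - 1)
    return average, math.sqrt(variance)


# Toy ensemble: two models of a two-atom "backbone", offset by 2 units along z.
toy_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
average, spread = ensemble_rmsd(toy_models)
print(f"RMSD about mean: {average:.2f} +/- {spread:.2f}")
```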
(a) Ribbon model of the lowest-energy conformer of GhoS, with the secondary structural elements and termini labeled; putative catalytically important residues are shown as sticks and labeled. Figure prepared with PyMOL (http://www.pymol.org/). (b) Two micrograms of in vitro synthesized wild-type ghoT transcript (207 nt, lane 1) were incubated without (−) or with 30 µg of purified GhoS and its variants at 37°C for 3 h. Two mutants, F14A and F55A, eluted from the size exclusion column as monomers (M) and dimers (D), so both forms were tested (F14A, 40% dimer; F55A, 32% dimer). The reduced activity of GhoS with point mutations is shown by the presence of un-cleaved transcript as indicated by an arrow. M indicates low range ssRNA ladder. (c) Circular dichroism (CD) spectra demonstrating that native GhoS (dark blue) and all the GhoS mutants are folded (sample concentrations ~20 µM). (d) Co-expression of GhoT with wild-type (WT) GhoS and the GhoS variants via BL21(DE3)/pCA24N-ghoT harboring the pRP1B(Kan)-ghoS constructs (0.1 mM IPTG was used). Scale bar represents 1.1 cm.
A search for structural homologs of GhoS using the structure-based alignment program DALI 33 identified five proteins with Z-scores of 5.8 to 6.3, all of which adopt a ferredoxin-like fold, characterized by a split α-β sandwich (β-α-β-β-α-β; Supplementary Table 5). This superfold is highly populated with functionally diverse proteins, such as ribosomal proteins, DNA binding proteins, and ribonucleases 34 . Of the five structures with the best Z-scores, only two were of similar size to GhoS: SSO1404 (PDBID 2I8E, 88 residues; Z-score = 6.1) 35 and SSO8090 (PDBID 3EXC, 78 residues; Z-score = 5.8). These proteins (and three other family members: TT1823, Z-score = 5.6; PF1117, Z-score = 5.1; DvuCAS2, Z-score = 5.0 36 ) belong to the CRISPR-associated (CAS2) family. SSO1404 and SSO8090 are sequence-specific endoribonucleases that preferentially cleave single-stranded RNA 37 . The structures of GhoS and the CAS2 protein SSO1404 monomer (CAS2 proteins are dimers in vitro) overlap well (Supplementary Fig. 6d,e, Supplementary Fig. 7). The primary difference between them is the position of β-strand β2. In GhoS, β2 and β2’ form a short two-stranded β-sheet that interacts with the C-terminal α-helix, α3. In contrast, in the CAS2 proteins, β2 projects upwards to form the fourth β-strand of the β-sheet in the ferredoxin fold. Thus, GhoS adopts an atypical ferredoxin fold in which the central β-sheet is made up of three and not four β-strands.
GhoS is an endoribonuclease that cleaves ghoT mRNA
The sequence identity between GhoS and the CAS2 proteins is low, between 10%. However, when the structures of SSO1404 and GhoS are superimposed, five of the six SSO1404 catalytic residues are structurally conserved in GhoS (Fig. 3a, Supplementary Fig. 6d,e; GhoS/SSO1404: Phe14/Tyr9, Asp15/Asp10, Arg26/Arg19, Arg28/Arg31 and Phe55/Phe37). Systematically converting these five GhoS residues to alanines revealed that, in vitro, the Arg28Ala, Phe55Ala and, to a lesser extent, Arg26Ala substitutions reduced the ability of antitoxin GhoS to cleave ghoT mRNA (Fig. 3b, Supplementary Fig. 8; circular dichroism shows all mutants are folded, Fig. 3c); this effect was also corroborated for the Arg28Ala variant in vivo (Fig. 3d). Thus Arg28 appears to be important for GhoS activity.
Because GhoS production is not toxic but instead increases growth (Fig. 2a), these results suggest that GhoS is a sequence-specific endoribonuclease. Thus, we investigated whether GhoS cleaves ghoT mRNA. Using quantitative real-time, reverse-transcription PCR (qRT-PCR), we found that the ghoT portion of the ghoST transcript in the wild-type strain was 21 ± 2-fold less stable than the ghoS portion of the transcript in the stationary phase (see Supplementary Table 6 for all of the qRT-PCR data). Corroborating this result, production of GhoS via pCA24N-ghoS reduced the ghoT portion of the transcript 5 ± 1-fold relative to the empty plasmid. Cleavage by GhoS appears specific since ompA (−1.1 ± 0.2), ompF (1.2 ± 0.1), ralR (−1.7 ± 0.4), or purA (−1.9 ± 0.5) transcript levels were not affected by GhoS production.
In vitro, GhoS cleaved the ghoT portion of the transcript (207 nt, Supplementary Table 7) at multiple sites and generated, after full-digestion, fragments of approximately 52, 65, 87, 91 and 116 nt ( Fig. 4a & Supplementary Fig. 8), whereas there was less degradation of the ghoS portion of the transcript under the same conditions. As expected, heat denaturation of GhoS abolished the ability to cleave the transcripts. Very little degradation of the ATP synthase subunit gene atpE and ompA in vitro was observed. Finally, no degradation was observed of: (i) total RNAs in vitro, (ii) 23S and 16S rRNAs in vivo with GhoS, and (iii) tRNAs in vitro (Supplementary Fig. 9).
(a) GhoS cleavage reaction with native transcripts of ghoT (207 nt), ghoS (311 nt), atpE (189 nt) and ompA (211 nt). HI indicates heat inactivated GhoS. The blue arrows indicate the main fragments generated after cleavage. M indicates the low range ssRNA ladder. (b) Predicted secondary structure of in vitro synthesized ghoT mRNA. Capital red letters indicate the changed nt for mutations m1, m2, m3, and m4. S1, S2, S3, S4, and S5 indicate the cleavage sites based on RNA sequencing. The four main sections in the structure are indicated with numbers i, ii, iii and iv. The RNA secondary structure was obtained using Mfold software. (c) GhoS cleavage reaction with transcripts of ghoT with mutations m1, m2, m3, and m4 (207 nt). The red arrows indicate the fragments generated or increased in the mutant transcripts after cleavage. (d) Predicted secondary structure of in vitro synthesized ghoTm1m2 mRNA. The mutated ghoTm1m2 cleavage site is indicated by two solid red lines. (e) GhoS cleavage reaction with transcripts of ghoT with mutations m1, m2, and m1m2 (207 nt). The green arrows indicate the reduced fragments after the introduction of the second mutation. For the reactions shown in (a), (c), and (e), 2 µg of in vitro synthesized transcripts were incubated without (−) or with (+) 30 µg of purified GhoS at 37°C for 3 h and analyzed by gel electrophoresis.
Using RNA sequencing, we found that GhoS cleaves specifically ghoT mRNA at nt positions 30/31, 51/52, 66/68, 115/116, and 154/155 (positions S1 to S5, Fig. 4b ). Analysis of the cleaved products identified a putative cleavage site corresponding to 5’-UNNU(A/C)N(A/G)(A/U)A(A/U)-3’. To corroborate GhoS cleavage at the 51/52 nt site, we altered the ghoT mRNA fragment via mutation m1 (AUAUU to CGCGC at nt position 52, Fig. 4b ) and found a reduction in overall cleavage and increase in larger fragments (e.g., 87 and 124 nt) as would be expected for loss of this site ( Fig. 4c ). Additional mutations (m2 at nt 125 and m3 at nt 59) reduced cleavage as expected given their proximity to cleavage sites 66/68 and 115/116, and the m4 change (nt 132) had little effect since the transcript was completely degraded as with the wild-type ghoT mRNA ( Fig. 4c ).
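The putative site 5’-UNNU(A/C)N(A/G)(A/U)A(A/U)-3’ can be turned into a regular expression to scan an RNA sequence for candidate matches. The sketch below is illustrative only: the helper names are hypothetical, the example sequence is invented (it is not ghoT), and the text does not state where within the motif cleavage occurs, so the code reports motif start positions rather than cut positions.

```python
import re

# Degenerate motif as written in the text: N = any base, (X/Y) = either base.
MOTIF = "UNNU(A/C)N(A/G)(A/U)A(A/U)"


def motif_to_regex(motif):
    """Translate '(A/C)'-style alternatives and 'N' into regex character classes."""
    pattern = re.sub(
        r"\(([ACGU](?:/[ACGU])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return pattern.replace("N", "[ACGU]")


def find_sites(rna, motif=MOTIF):
    """Return (1-based start, matched text) for every match, overlaps included."""
    sequence = rna.upper().replace("T", "U")  # accept DNA-style input as well
    # A lookahead lets overlapping matches be reported individually.
    regex = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Invented 17-nt sequence containing one motif instance starting at position 4.
example = "GGCUAAUACGAAAUGGC"
print(motif_to_regex(MOTIF))
print(find_sites(example))
```

A match from such a scan is only a candidate: the double-mutant experiment described next in the text probes how sequence and secondary structure each contribute to cleavage.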
To investigate the importance of RNA secondary structure on GhoS cleavage, the stems disrupted by mutations m1 or m2 were recovered by the introduction of both mutations into the plasmid carrying single mutant ghoT alleles. The recovery of the cleavage pattern to that of the wild-type ghoT transcript in the double mutant transcript would indicate the importance of RNA secondary structure over sequence recognition in GhoS cleavage, while a reduction in cleavage would indicate that sequence recognition is important. We found that the introduction of both mutations m1m2 to restore the stem of the predicted secondary structure ( Fig. 4d ) generated a unique cleavage pattern distinct from that of the native ghoT mRNA ( Fig. 4e ). A reduction of the fragments accumulated due to the m1 mutation along with an increase in large partially-cleaved or un-cleaved fragments compared to the native transcript suggests the importance of sequence recognition during GhoS cleavage. Therefore, GhoS is a specific RNase that limits translation of toxin GhoT by cleaving ghoT transcripts.
To provide more evidence of the specificity of the RNase activity of GhoS, we analyzed changes in mRNA levels during production of GhoS compared to the strain with an empty plasmid with a DNA microarray so that we could investigate in vivo which of the cell’s transcripts may be cleaved by GhoS. Under these conditions, only 20 transcripts had altered mRNA levels and all were found to be reduced (Supplementary Table 9); there were no induced genes. These genes down-regulated due to GhoS production were all involved in the biosynthesis/transport of purines and pyrimidines. These results suggest that GhoS selectively cleaves only a few cellular targets.
To further corroborate the findings in the DNA microarray, qRT-PCR was performed with seven genes (purM, purH, purE, pyrI, pyrB, carA and carB) with total RNA isolated under the same culture conditions as the DNA microarray experiment. In each case, the qRT-PCR results showed a decreased RNA abundance upon GhoS production, which matched the microarray results (Supplementary Table 9). Although ghoT expression was unaltered in the microarray analysis, qRT-PCR performed with duplicate samples on three independent occasions, showed that ghoT expression was decreased at least 3-fold upon production of GhoS.
GhoT increases early biofilm formation
Since TA systems affect biofilm formation 4, 5 and since we identified mqsR as one of the highly regulated genes in E. coli biofilm cells when compared to planktonic cells 4 , we investigated the impact of GhoT/GhoS on biofilm formation. Deletion of ghoT decreased biofilm formation at 8 h in LB medium at 30°C (4.6-fold) and 37°C (4.9-fold), while deletion of ghoS increased biofilm formation significantly (up to 6.1-fold) at 30°C and 37°C at 8 h (Supplementary Fig. 10a). Swimming motility was also slightly reduced when ghoT was deleted, while the deletion of ghoS increased cell motility ~2-fold (Supplementary Fig. 10b). These results show that GhoS and GhoT impact early biofilm formation and swimming motility.
The conventional BN modelling style fails to integrate with other higher-level cellular descriptive models such as the ODE. Our study is an attempt to bridge the gap between these two paradigms. The multi-bit Boolean encoding of the ODE model developed in this work is similar to a conventional BN in its construction. However, the Boolean vector representing a particular node is designed to provide finer resolution to mimic high-level functional behaviour. We show that our multi-bit Boolean model can simulate the high-level ODE description of the chemotaxis module of E. coli. One limitation of the current model is that the accuracy is dependent upon the number of bits used in the BN modelling. However, the proposed coarse discretisation is very useful for the aforementioned bridging purpose. The chemotaxis response and the drift velocity estimated from our multi-bit BN simulation trajectories are consistent with experimental results published in the literature [ 19 , 21 ]. Taken together, by simulating E. coli chemotaxis, we conclude that multi-bit Boolean methodology is ideal for simulating dynamic high-level biological systems. Furthermore, we propose that the multi-bit Boolean model developed in this study can be used for designing bio-inspired machines such as nanobots. Nanobots are tiny machines that can operate without human interaction, such as nanobotic swimmers [ 41 , 42 ]. These futuristic machines are designed for multiple medical applications including diagnosis and testing of tissues and recording of vital parameters in the bloodstream (e.g. temperature, pressure). One such nanobot (termed a logibot) based on the Boolean model has been published recently in [ 43 ]. The logibots designed in [ 43 ] propel according to the pH levels in their environment. Similarly, the modelling approach designed in the current study will enable the development of nanobots capable of sensing chemical concentration variations and regulating their movement accordingly.
We propose that the multi-bit BN strategy can provide a platform to design nanobots that mimic Brownian-like motion and drift. Such features will enable nanobots to perform sophisticated functions, including distributed sensing and targeted drug delivery.
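The stated limitation, that accuracy depends on the number of bits, can be illustrated with a simple quantisation sketch. This is not the authors’ encoding or update rules, which are not given here; it only assumes that a continuous quantity on a known range is mapped to a fixed-length Boolean vector. The function names and the range [0, 1] are hypothetical.

```python
def encode(value, bits, lo=0.0, hi=1.0):
    """Map a value in [lo, hi] to a list of `bits` Booleans, most significant first."""
    levels = (1 << bits) - 1              # highest integer representable with `bits` bits
    clipped = min(max(value, lo), hi)     # keep the value inside the assumed range
    index = round((clipped - lo) / (hi - lo) * levels)
    return [bool((index >> i) & 1) for i in reversed(range(bits))]


def decode(vector, lo=0.0, hi=1.0):
    """Recover the (quantised) value represented by a Boolean vector."""
    levels = (1 << len(vector)) - 1
    index = 0
    for bit in vector:
        index = (index << 1) | int(bit)
    return lo + (hi - lo) * index / levels


# Quantisation error for the same value at different resolutions.
value = 0.37
for bits in (2, 4, 8):
    approx = decode(encode(value, bits))
    print(bits, "bits ->", round(approx, 4), "error", round(abs(approx - value), 4))
```

With a uniform grid of this kind the worst-case error is half a quantisation step, so each additional bit roughly halves it.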