Information

Isolating a specific gene (specifically TRAV* series of genes) for sequencing

Isolating a specific gene (specifically TRAV* series of genes) for sequencing



We are searching data for your request:

Forums and discussions:
Manuals and reference books:
Data from registers:
Wait the end of the search in all databases.
Upon completion, a link will appear to access the found materials.

I'm trying to figure out (for pedagogical purposes) the right way to isolate and PCR the regions coding the T-Cell receptor.

My understanding is that I would need to use restriction enzymes that target regions near the start and end of the gene. I was looking at a bunch of different restriction enzymes on this site: https://www.addgene.org/mol-bio-reference/restriction-enzymes/

What I'm not sure about is: how do you pick the combination that will specifically cut out that gene and not any of the other ones? It seems like the restriction enzymes only consist of a few base pairs, and I would guess they can slice at many many sites.

Furthermore once it is cut, how do I isolate just the DNA for the gene I want to sequence as opposed to the rest of the junk DNA?


Restriction enzymes are usually only used when you do molecular cloning using plasmids or other forms of shorter DNA moleclues, since - like you said - using them on genomic DNA would cut it into much more than thousands of pieces.

Instead what you can do is perform your PCR directly on the genomic DNA (or potentially cDNA, if you want to get rid of introns). Doing PCR on genomic DNA is often a bit tricky (i.e. you'll likely have to try different conditions to get a product), but generally possible. If you want to analyse the whole coding region it might be better to extract mRNA instead of DNA and reverse transcribe it, so that can amplify the coding region as one block, that doesn't contain Introns - PCR on cDNA is also easier than on genomic DNA.

Another thing you need to look out for is that T-Cell-Receptors (similar to antibodies) are different in every T-Cell. This means that you can't use Sanger sequencing, which will only give you a proper result if you input has the exact same sequence. To get the sequences of a mix of T-cells you would need to use next-gen sequencing.


The fruitless gene affects female receptivity and species isolation

Female mate rejection acts as a major selective force within species, and can serve as a reproductive barrier between species. In spite of its critical role in fitness and reproduction, surprisingly little is known about the genetic or neural basis of variation in female mate choice. Here, we identify fruitless as a gene affecting female receptivity within Drosophila melanogaster, as well as female Drosophila simulans rejection of male D. melanogaster. Of the multiple transcripts this gene produces, by far the most widely studied is the sex-specifically spliced transcript involved in the sex determination pathway. However, we find that female rejection behaviour is affected by a non-sex-specifically spliced fruitless transcript. This is the first implication of fruitless in female behaviour, and the first behavioural role identified for a fruitless non-sex-specifically spliced transcript. We found that this locus does not influence preferences via a single sensory modality, examining courtship song, antennal pheromone perception, or perception of substrate vibrations, and we conclude that fruitless influences mate choice via the integration of multiple signals or through another sensory modality.

1. Introduction

Female mate choice can impact both sexual selection within species and reproductive isolation between species. Females typically have a greater investment than males in reproduction resulting from a single mating [1–3], and thus benefit from fewer but higher-quality matings, which they achieve by displaying mate discrimination [4–6]. Female discrimination can select for higher-quality males within a species, and allow females to avoid maladaptive matings with males from other species. Since behavioural isolation due to mate choice divergence is likely to be the earliest factor to arise in the speciation process [7], these female preferences can contribute to the divergence and maintenance of species separation [8,9]. While a small number of loci have been identified as candidate genes for naturally-occurring variation in female preference behaviour both within [10–14] and between species [15–19], very few loci affecting female mating behaviour have been confirmed.

The Drosophila genus has multiple closely related species that are sexually isolated, making it a commonly used model for studying behavioural isolation [14,20]. Drosophila melanogaster and D. simulans are two species that are behaviourally isolated: while D. melanogaster females mate at high frequencies with D. melanogaster males, D. simulans females do not [21,22]. However, hybrid females from the permissive cross between D. melanogaster females and D. simulans males [23,24] will mate with D. melanogaster males [25]. The receptivity of hybrid females towards D. melanogaster males therefore resembles D. melanogaster female behaviour, which suggests that D. melanogaster loci affecting female receptivity are likely to be dominant, or semi-dominant, over the corresponding D. simulans non-receptive loci. These D. simulans recessive loci are probably responsible for the rejection of D. melanogaster males.

D. melanogaster and D. simulans do not produce fertile hybrid offspring [23,26]. As a consequence, recombination mapping is not possible between these two species. Therefore, in a previous study we used deficiency mapping to identify small regions of the 3rd chromosome that influence species-specific female preference [19]. This uses pre-existing D. melanogaster lines that are missing a small portion of their genome (a deficiency) F1 inter-species hybrids produced from these lines contain one full set of both parents' genomes, with the exception of the deficient region, where only D. simulans genome is present (figure 1a). A key aspect of this assay is that it is not testing the effect of disrupting a gene's function, since the non-disrupted allele is functional. Instead, it is assessing naturally-occurring variation in the allele that is unmasked [28]. If hybrid females bearing a deficiency behave more like D. simulans females by rejecting D. melanogaster males rather than mating with them, the deficient region was further fine-mapped using smaller deficient regions. Here, we used this approach to identify a locus affecting inter-species female rejection behaviour: fruitless.

Figure 1. Genetic mapping identifies the fruitless gene as influencing female rejection of heterospecific males. (a) Schematic of the 3rd chromosomes (blue and red lines) in deficiency complementation mapping. D. melanogaster (mel, top left) females bearing a deficiency or disruption (broken blue line) are crossed to D. simulans (sim, bottom left red lines) produce hybrids inheriting intact chromosomes from both species (bottom right sim/mel Bal ) and hybrids inheriting a disruption in the D. melanogaster chromosome (top right sim/mel Dis ). Gene(s) within this disrupted region are only expressed from the sim homologue's alleles. (b) Genetic constructs used to assay female rejection of heterospecific males. Rectangular bars represent deficiencies blurred ends represent imprecisely known breakpoints scale is approximate. The three deficiencies at top (marked with asterisk) are from [19]. Orange is statistically significant (p < 0.05) grey is not statistically significant. Arrowed box represents location and direction of fru gene compared with the deficiencies note that image is shown in orientation relative to fru. At bottom, fru is expanded and shown 5′–3′ to represent the fru P1–P5 first exons, common exons C1–C5, and 3′ exons A–D boxes are exons, black boxes are coding. The relative locations of transposable element insertions are represented by numbered inverted triangles: 1 = fru GAL4 , 2 = fru MI05459 , 3 = fru NP0021 , 4 = fru MI01850 , 5 = fru KG00116 dashed lines represent targeted fru ΔP1 and fru ΔP2 deletion locations. Image not to scale. (c) Representation of fru transcripts. Adapted from [27]. (d) Proportion mated when paired with D. melanogaster males for control pure species females (1 h assay) without a disruption (mel/mel Bal ), with a disruption (mel/mel Dis ), and hybrid females (24 h assay) without a disruption (sim/mel Bal ), when compared with hybrid females with a disruption (sim/mel Dis ). Comparisons where sim/mel Dis females have a significant reduction in mating have p-values shown in bold. (e) Transcript presence in flies homozygous for the Minos insertion fru MI05459 compared with flies that have had the Minos element excised. The housekeeping gene RpL32 is used as a control. Each sample was done with a biological replicate. (f) Female receptivity is rescued when the Minos element within fru MI05459 is removed (n = 32). (Online version in colour.)

The frutless (fru) gene is best known for its role in sex determination. It has a highly conserved genetic sequence across the orders Diptera [29], Hymenoptera [30] and Blattaria [31], suggesting an ancient origin and a common function among different insects [32]. In D. melanogaster, the fru gene encodes a set of transcription factors generated via alternative exons at both the 5′ and 3′ ends (figure 1c) [33–36]. Transcripts begin with one of five first exons (P1–P5), followed by a central common region shared by all transcripts that contains a BTB (protein–protein interaction) domain, and ending with one of four final exons (A–D), three of which (A–C) contain a zinc-finger DNA-binding domain [37,38]. The function of fru has been primarily studied in relation to the P1 transcript group's role in generating male courtship behaviour [35,39–41]. P1 transcripts are sex-specifically spliced at the P1-S exon by the products of the transformer and transformer-2 genes to produce male- (fru M ) and female- (fru F ) specific transcripts (figure 1c) [37]. fru F transcripts are not translated into functional proteins, but are detectable in the central nervous system of wild-type females [38]. Fru M proteins translated from fru M are expressed in approximately 3% of the neurons in the central nervous system [34,42] as well as the peripheral nervous system [37,43]. fru M behavioural phenotypes are specified by either a single, or combination, of isoforms [44,45], and Fru M proteins overwhelmingly target genes involved in nervous system development [45].

The well-studied P1 transcript does not produce a functional protein in females and appears to not have a role in female mating behaviour [35]. While far less studied than the P1 transcripts, a subset of the non-sex-specifically spliced transcripts (P2–P5) are necessary for adult viability and external morphology [30,46,47]. The P2 transcript is most strongly expressed in the eye, while the P3 and P4 transcripts are expressed widely throughout development [36,48]. None of these other transcripts have been assayed for their effect on female behaviour. We show here that one of these non-sex-specifically spliced transcripts affects both intra- and inter-species female rejection behaviour.

2. Material and methods

(a) Drosophila strains and crosses

Flies were maintained on standard food medium (Bloomington Drosophila Stock Center) under a 14 : 10 light : dark cycle at 24°C and approximately 80% relative humidity. All Drosophila melanogaster disruption stocks are listed in electronic supplementary material, table S1. Deficiencies and gene disruptions were maintained over 3 rd chromosome balancers (Bal) TM3 or TM6C. Unless otherwise noted, all stocks were obtained from the Bloomington Drosophila Stock Center (Bloomington, Indiana), as were line #3703 (w 1118 /Dp(1Y)y + CyO/nub 1 b 1 sna Sco lt 1 stw 3 MKRS/TM6B, Tb 1 ), fru sat15 (Df(3R)fru sat15 /TM6B, Tb 1 ), and the Minos transposase stock (w 1118 sna Sco /SM6a, P2.4). The fru GAL4 and fru 4-40 lines were provided by Dr Barbara Taylor. The D. melanogaster stock with GFP-tagged sperm (w P19B(3)) was provided by Dr John Belote. The D. melanogaster fru ΔP1 and fru ΔP2 were made and provided by Dr Megan C. Neville at the Centre for Neural Circuits and Behaviour, University of Oxford. Wild-type D. melanogaster BJS is an isofemale line collected by Dr Brent Sinclair. Wild-type D. simulans Florida City (FC) was provided by Dr Jerry Coyne (collected from Florida City, Florida Coyne 1989) wild-type D. simulans 199 (stock # 14021-0251.199 Nanyuki, Kenya) and wild-type D. simulans 216 (stock # 14021-0251.216 Winters, California) were obtained from the Drosophila Species Stock Center (San Diego, California).

(b) Mating assays

Five to six Virgin D. melanogaster females bearing a disruption over a balancer (mel Dis /mel Bal ) aged 5–7 days were crossed with 25–30 D. simulans FC or 3–4 D. melanogaster BJS non-virgin 5–10-day-old males to produce F1 hybrid sim/mel Dis and sim/mel Bal or mel/mel Dis and mel/mel Bal offspring, respectively. Assays were conducted as two replicates, whereby the flies used in assays were generated via two independent sets of crosses at slightly separate time points.

Virgin 5–7-day-old F1 females of the four genotypes produced above were paired with a 5–7-day-old virgin D. melanogaster GFP-sperm male in a no-choice mating assay. Assays were carried out at 24°C, approximately 50% relative humidity, and 1–2 h of ‘lights on’ in 30 ml plastic vials containing approximately 5 ml of standard food medium. All four genotypes (sim/mel Dis , sim/mel Bal , mel/mel Dis and mel/mel Bal ) were assayed in equal numbers within any given assay to account for uncontrolled environmental effects that could influence mating activity (e.g. [49]). Flies were observed for 1 hour (‘1 h mating assay’) and scored for latency to courtship and latency to copulation. From these measures, the proportion that copulated out of those that were courted was calculated for pure species females. Assays involving F1 hybrid sim/mel Dis and sim/mel Bal females were left within the vials for an additional 24 h (‘24 h mating assay’), at which time the female was decapitated, her reproductive tract dissected and imaged with fluorescent microscopy, and scored for the presence or absence of fluorescently-labelled sperm. Hybrid females were scored as if all females were courted in the 24 h period. We compared the amount of copulation out of those that were courted in a 1 h assay for pure species females and the total proportion that copulated in a 24 assay for hybrid females. This comparison at different time scales was necessary due to the far lower level of receptivity of hybrid females towards these males. We found no difference in the occurrence of courtship for disruption versus intact hybrid females for all assays (electronic supplementary material, table S1 p > 0.05). The proportion that copulated was compared using logistic regression (α < 0.05) with the independent variables of species (mel/mel or sim/mel) and genotype (Dis or Bal) and the dependent variable of whether copulation occurred after courtship (yes or no). We considered a result biologically significant only if the sim/mel Dis females had reduced copulation compared with sim/mel Bal . Data were checked for overdispersion by comparing the null and residual deviances no overdispersion was present. Analyses were done using R version 3.6.1 [46].

To test whether the reduction in receptivity due to fru occurs across D. simulans or is specific to strain FC, we repeated the above assays for fru MI01850 mel Dis /mel Bal crossed to D. simulans stock 199 and D. simulans stock 216, and analysed as above.

To test whether these females experience a general disruption in female receptivity, we paired fru 4-40 , fru GAL4 and fru KG00116 sim/mel Dis and sim/mel Bal females in no-choice mating assays as above, but using D. simulans FC males instead of D. melanogaster males, scoring for the presence or absence of sperm using a compound microscope. The proportion of females containing sperm for sim/mel Dis was compared with those for sim/mel Bal using a two-proportion Z-test (α < 0.05).

To test whether homozygous deletions induce changes in female receptivity in within-species assays, we paired homozygous fru ΔP1 and fru ΔP2 D. melanogaster females with D. melanogaster males in 1 h no choice mating assays and scored for courtship and copulation. Canton-S melanogaster females were used as a control.

To test whether females bearing a targeted deletion of the first exon of fru P2 (fru ΔP2 , see below) display active courtship rejection behaviours versus are passively non-receptive, we placed virgin F1 hybrid fru ΔP2 females aged 5–7 days in single pairs with 5-day-old virgin D. melanogaster GFP-sperm males in a 10 min no-choice mating assay. We scored males for courtship latency, time spent actively courting, and number of copulation attempts, and scored females for number of ovipositor extrusions, times kicking, times displaying wing scissoring, and total time spent walking away from the male after he approached.

(c) RT-PCR

RNA was extracted from two biological replicates, each with 20 pure species or homozygous disruption 5–7-day-old females, using the TRIzol plus Purelink RNA purification kit (Thermofisher Cat# A33254). The fru MI05459 line is homozygous lethal, so was crossed with stock #3703 to create the genotype y 1 w* fru MI05459 /TM6B, Tb 1 . Adult females were dissected out of late stage non-Tb 1 pupal cases [33], which are homozygous for the disruption. RNA was quantified using a Nanophotometer P300 (Implen, Inc.) and 2 µg of total RNA was used for cDNA synthesis using Maxima First Strand cDNA Synthesis Kit with DsDNAase (Thermofisher, Cat# K1671) using oligo dTs. Two technical replicate were done for each biological replicate. RT-PCR was performed using a forward primer within each of the respective exons and a reverse primer within the common region of fru (electronic supplementary material, table S2). RpL32 was used as a control (electronic supplementary material, figure S1).

(d) Excision of MiMIC insertion

The Minos mediated Integration Cassette (MiMIC) in fru MI05459 was excised by crossing to a Minos transposase stock using standard protocols [47,50]. Resulting males were screened for the excision of the MiMIC insertion by PCR genotyping using primers flanking the MiMIC insertion site (electronic supplementary material, table S2). Offspring produced from individuals with excisions were crossed together to create a stable stock with genotype y 1 w* [excised: Mifru MI05459 ]. The excisions were confirmed by sequencing the region flanking the MiMIC insertion site (electronic supplementary material table S2 and figure S2).

Five to six virgin females from the clean excision stock and virgin females from the original fru MI05459 Minos disruption stock were separately crossed with 25–30 non-virgin 5–10-day-old D. simulans FC males to create F1 hybrid sim/mel Dis+ , sim/mel Dis , and sim/mel Bal females, where Dis+ indicates the excised Minos disruption. F1 hybrid females were aged 5–7 days and then paired separately with 5–7-day-old virgin D. melanogaster GFP-sperm males in 24 h no choice mating assays, as above.

(e) fruitless mutant lethality analysis

To test whether the MiMIC insertion in fru MI05459 disrupts P3 and P4 transcripts, we made ten separate crosses of 3 fru sat15 males with 5 fru MI05459 females. We scored the proportion of offspring that were fru sat15 / fru MI05459 that were produced across the first 10 days of offspring eclosion.

(f) Sensory modality assays

In all sensory modality assays, 5–7-day-old virgin females from the aforementioned interspecific crosses were paired with 5–7-day-old virgin D. melanogaster GFP-sperm males in 24 h no-choice mating assays, as described above. To test the effect of courtship song, males were anaesthetized and de-winged 48 h before the mating assays by clipping their wings (wing−) using dissection tweezers immediately distal to the hinge [51]. Control males were anaesthetized at the same time but were not de-winged (wing+). Males were given 48 h to recover then paired in no-choice mating assays. Assays were first conducted with fru MI05459 sim/mel Dis females and sim/mel Bal controls. The assays were conducted again with fru MI05459 and two additional deletions: fru 4-40 and fru GAL4 . A four-way comparison of copulation occurrence of males with or without wings when paired with sim/mel Dis or sim/mel Bal hybrid females were compared using logistic regression, as above. Copulation occurrence for wing+ versus wing− males in the additional disruption assays were compared using a two-proportion Z-test (α < 0.05).

To test whether fruitless is acting on female perception of olfactory or auditory signals, 48 h before the assay fru MI05459 sim/mel Dis females and sim/mel Bal control females were placed under CO2 anaesthesia and either left unaltered (ant+) or dissection tweezers were used to remove the last two segments of the females' antennae (ant−), which includes the aristae. Females were given 48 h to recover, then paired with males in no-choice mating assays with these females. The assays were conducted again with fru MI05459 and fru GAL4 . A four-way comparison of copulation occurrence of sim/mel Dis or sim/mel Bal hybrid females with or without antennae were compared using logistic regression, as above. Copulation occurrence of ant+ versus ant− females in the additional disruption assays were compared using a two-proportion Z-test (α < 0.05).

To test whether substrate vibrations had an effect in female preference, 5–7-day-old virgin hybrid fru GAL4 sim/mel Dis and corresponding sim/mel Bal females were paired with 5–7-day-old virgin D. melanogaster GFP sperm males in a Plexiglas mating arena [52] placed on a granite base. The base, if left bare, does not allow transmission of the substrate vibrations produced by small insects [53]. The mating arena was coated with Insect-a-Slip Insect Barrier from BioQuip Products (Cat# 2871C), to prevent flies from climbing the walls or ceiling of the arena. For half of the chambers in the arena, the granite base was coated with 1% agarose gel approximately 2 mm thick to add a substrate for the control assays. The mating assays were performed for 24 h, as described above. A four-way comparison of copulation occurrence of sim/mel Dis and sim/mel Bal hybrid females with or without substrate were compared using logistic regression, as above.

3. Results and discussion

(a) Identification and confirmation of fruitless

We used overlapping, small deficiencies to fine-map one of the regions identified by Laturney & Moehring [19] to identify a small genomic region containing a single gene: fruitless (fru figure 1b electronic supplementary material, table S1). The small deletion fru 4-40 has one breakpoint within the fru gene and is known to eliminate the sex-specific transcripts of fru [33] the unrefined upstream breakpoint is encompassed by a non-significant deletion we tested (Df(3R)07280).

To confirm whether fru had an effect on female preference, we used the same conceptual approach as above, but using two alleles known to disrupt sex-specific fru expression (fru GAL4 and fru NP0021 [27,33,42,54,55]) and three transposable element insertions in fru that have yet to be fully characterized but may disrupt fru function (fru MI01850 , fru KG00116 and fru MI05459 [56–58]). When there is an insertion in the D. melanogaster allele of fru in an inter-species hybrid, so that only the D. simulans fru allele is probably functional (sim/mel Dis ), female receptivity towards D. melanogaster males was significantly reduced compared with controls in five of the six lines we tested (figure 1b,d electronic supplementary material, table S1). The non-significant effect of fru GAL4 may be because its position in the 5′ region of the gene does not disrupt the causal transcripts. Combining the six independent tests of fru disruption (fru 4-40 , fru GAL4 , fru NP0021 , fru MI01850 , fru KG00116 and fru MI05459 ) into a meta-analysis gives a very low meta-p value (Fisher's method: p < 10 −8 ), supporting the conclusion that this locus affects female receptivity.

We then established that the effect on female receptivity is not due to aspects of the underlying genetic background. First, we observe that it is not due to hemizygosity, as fru hemizygous pure-species D. melanogaster females did not display a reduction in receptivity (figure 1d). In other words, the disruption itself does not generally affect female receptivity, and the effect depends on which allele is disrupted or ‘unmasked’.

The behaviour we observe is also not simply a general reduction in female receptivity towards all males due to removing the melanogaster allele and unmasking the simulans allele, rather than being due to the signals provided by D. melanogaster males in particular. We matched the fru allele being expressed to the species of the male in the assay by pairing sim/mel Dis females with D. simulans males. If the unmasking of the simulans allele does not affect general female receptivity, then when these females are tested with simulans males they should accept males at the same rate as control females. Females bearing any of the three melanogaster fru disruptions that we assayed mated similarly to the controls lacking this disruption (sim/mel Bal ) when paired with D. simulans males (electronic supplementary material, table S3), indicating that females with only the simulans allele are not suffering a general reduction in receptivity. Further, our results are not due to defective fru processing causing masculinization of mel Dis females as this would be expected to reduce receptivity [59,60], which we did not observe in our control crosses it also would induce aggression [61,62], which was not directly apparent in any of our assays.

To ensure that the effect of fru is species specific, we determined that this effect is not only present in a single strain of D. simulans by testing the fru MI01850 gene disruption in hybrids that were generated using two additional strains of D. simulans. As with the original strain, these strains of D. simulans produced the same significant effect of fru (p = 0.0001, p < 0.0001 electronic supplementary material, table S4). While this confirms the broad effect of this gene in more than one strain of simulans, we cannot rule out that some strains from the species' worldwide distribution may not display this effect this warrants additional testing.

If fru is indeed involved in species-specific female preference behaviour, D. melanogaster-like female preference behaviour should be rescued after excising the transposable element insertion from one of the fru disruption lines. We chose to excise the fru MI05459 Minos element insertion as it showed a strong effect on female preference (figure 1d electronic supplementary material, table S1). The fru MI05459 insertion disrupts at least some fru transcripts, as it qualitatively reduces transcription of P2 and eliminates transcription of P5 (figure 1e electronic supplementary material, figure S1) and fru MI05459 /fru 4-40 males are behaviourally sterile (electronic supplementary material, table S5), as would be expected if the P1 transcript is functionally disrupted [33,34]. Homozygous deletion of P3 and P4 cause lethality when we pair disruption fru MI05459 with a known fru null (fru sat15 ), fru MI05459 /fru sat15 offspring are viable (25% expected, 24% seen: 86/357), indicating that this disruption does not appear to affect the functionality of transcripts P3 and P4. After excision, sequencing and RT-PCR to confirm a clean removal of the Minos element (figure 1e electronic supplementary material, figure S2), we assayed female receptivity (figure 1f). Hybrid females with the excised Minos element mated significantly more (19/32) than hybrids bearing the disruption (3/32 p < 0.0001, Z = 4.21) and had only a slight non-significant reduction in mating frequency compared with control hybrids in which fru is intact (26/32 p = 0.0549, Z = 1.92). Thus, by removing the disruption in fru, we regained female receptivity towards D. melanogaster males, confirming the effect of fru on species-specific female preference.

(b) Identification of the fru transcript affecting female preference

To identify which fru transcript may be affecting female rejection behaviour, we first assessed the presence or absence of P1–P5 transcripts in females for five of the disruption lines we tested (electronic supplementary material, figure S1) fru 4-40 was previously shown to disrupt P1 and P2 transcripts [33]. While the presence of fru transcript does not indicate functional transcript [34], most of the lines we tested had P1, P3 and P4 transcripts present, while several had reduced or absent P2 or P5. Based on these preliminary observations, and the potential to rule out P1, we then tested precise deletions of either the D. melanogaster fru P1 first exon or fru P2 first exon (fru ΔP1 and fru ΔP2 , respectively) to determine if the transcripts from one of these exons are affecting inter-species female rejection. As we cannot be certain that a P2 deletion does not also subtly affect P5 expression, and to determine if a P2 transcript could plausibly affect behaviour via the female brain, we used RT-PCR to confirm that the P2 transcript, but not the P5 transcript, is expressed in female brains (figure 2d). Hybrid females bearing a melanogaster P1 or P2 disruption appear to be as attractive to D. melanogaster males as non-disruption females. Males initiate courtship (n = 32, p= 0.867, F = 3.094) and spend a total time courting (n = 32, p= 0.742, F = 0.299 electronic supplementary material, table S6) hybrid disruption females similarly to control females.

Figure 2. Proportion of females mating when they contain fru ΔP1 (independent deletions #1 and #2) or fru ΔP2 targeted deletions of their melanogaster allele. Note that, for all hybrid females, the simulans fru allele is intact. (a) fru ΔP1 (#1 n = 39) or fru ΔP2 (n = 50) hybrid females, compared with hybrid Canton-S/simulans controls in a 24 h assay. (b) Hybrid females with a disruption (sim/mel Dis light purple) compared with hybrid females without a disruption (sim/mel Bal dark purple), pure species females with (mel/mel Dis light blue) and without (mel/mel Bal dark blue) a heterozygous disruption. Pure-species females assayed for 1 h hybrid females assayed for 24 h. p-Values are for the interaction term between species (sim/mel versus mel/mel) and genotype (Dis versus Bal). (c) Pure species D. melanogaster females paired with D. melanogaster males for 1 h (n = 30) females are wild-type (Canton-S) or have a homozygous deletion (fru ΔP1 or fru ΔP2 ). (d) RT-PCR product from tissues taken from a female body (thorax and abdomen), head without the brain, brain, and whole fly for fru P2, fru P5, and the control gene RpL32. (Online version in colour.)

The receptivity of females was tested in two ways. First, for a comparison with the genetic background in which the deletions were made (Canton-S), hybrid females with the disruption were compared with hybrids made from wild-type Canton-S melanogaster (figure 2a). Second, to ensure that the disruption itself is not affecting behaviour, we tested the disruptions in the same manner as our original assays (figure 2b). Hybrid females bearing fru ΔP1 , and thus only expressing D. simulans P1 alleles, did not show reduced receptivity towards D. melanogaster males. In contrast, fru ΔP2 hybrid females that only express D. simulans P2 alleles show reduced mating with D. melanogaster males (figure 2a,b electronic supplementary material, tables S1 and S7). Note that this finding does not rule out a potential role of other fru transcripts in female receptivity. Males actively court fru ΔP2 hybrid females: during a 10 min assay males courted an average of 70% of the time and made 17 copulation attempts (n = 10 electronic supplementary material, table S8). The fru ΔP2 hybrid females are also not simply lacking the display of copulation acceptance, but actively reject courting males an average of 26 times in a 10 min assay, via displays of ovipositor extrusion, kicking, and/or wing scissoring, and spent an average of 3 min in the 10 min assay actively moving away from the courting male (n = 10 electronic supplementary material, table S8). The effect of fru ΔP2 on female receptivity is the first indication that one of the non-sex-specific fru transcripts may play an important role in behaviour. This finding challenges theoretical models that consider female preference genes to be separate from genes underlying the production of male traits, primarily due to the different morphological features involved in these two traits and the potential cost of intralocus sexual antagonism (e.g. [63,64] but see [65,66]). The separate fru transcripts encoding female preference and male courtship traits may have facilitated this gene's preference-trait pleiotropy.

(c) Effect of homozygous P2 deletions in pure-species D. melanogaster females

We also tested whether the loss of P1 or P2 affects within-species female preference by assaying receptivity in pure species pairings. Male D. melanogaster court fru ΔP1 and fru ΔP2 homozygous deletion females for a similar amount of time as Canton-S control females (n = 30, p= 0.207, F = 1.606) but there is a difference in the time it takes for males to initiate courtship (n = 30, p= 0.0002, F = 3.103 electronic supplementary material, table S9). In pairwise comparisons, the latency to courtship is shorter for fru ΔP1 (n = 30, p= 0.00028, t= 1.672) and fru ΔP2 (n = 30, p= 0.0028, t= 1.672) females compared with the control, indicating that males are faster to initiate courtship with deletion females. However, although males court these females more rapidly, D. melanogaster females homozygous for the fru ΔP2 deletion exhibit reduced mating with D. melanogaster males compared with fru ΔP1 and Canton-S females (n = 30, p= 0.01, Z = 2.334 figure 2c electronic supplementary material, table S9). When paired with D. simulans males in a 5-day mating assay period, there was no difference in copulation frequency among the homozygous deletions (P1: 1/15 P2: 0/15) compared with control (0/15 p= 0.154, Z = 1.017), but all three groups of females show very low mating in this inter-species assay, reducing our power to detect differences. These results suggest that the loss of D. melanogaster fru P2 significantly reduces female receptivity within species, but the loss of either fru P1 or fru P2 does not appear to increase female receptivity towards heterospecific males.

(d) Female evaluation of sensory signals via fru

Lastly, we wanted to assess whether the fru gene is involved in a female's assessment of an individual male courtship component, rather than in the integration of male courtship signals. Drosophila male courtship involves multimodal signals exchanged through a series of stereotypic courtship steps. During this exchange, the female assesses the male based on these signals and chooses to either copulate with or reject the male. The primary modalities that have historically been examined are the male auditory signal of courtship song produced by wing vibrations, which differs between these two species [67–69], and chemical pheromones, which also differ between these two species [70–72]. Removal or alteration of male song, or removal of female perception of song does not cause females to become fully receptive to heterospecific males [63,64], presumably because all of the other courtship components are still heterospecific, and a single negative courtship signal can be sufficient to induce female rejection. However, it is possible that the rejection induced by a single gene may result from the processing of a single component of courtship, and removal of that component might induce D. melanogaster-like receptivity in hybrid females bearing a fru disruption. Female hybrids bearing a disruption in another gene had their receptivity rescued upon removal of male courtship song [73], indicating that it is possible for single genes to affect receptivity via the presence of a single ‘negative’ signal.

To test this, we attempted to remove or alter two primary individual components of either male courtship or female perception in mating assays involving female hybrids bearing a fru disruption that unmasks the D. simulans allele of fru. If fru is acting via only one of these sensory modalities, then removing that component will increase the receptivity of these hybrids towards D. melanogaster males. Removal of the wings of D. melanogaster males caused either no effect or a significant reduction in mating when these males were paired with hybrid females bearing a fru disruption (figure 3a electronic supplementary material, tables S10 and S11), similar to what is seen when wings are removed in pure species matings of D. melanogaster [64,67]. Therefore, removal of this auditory signal does not rescue female receptivity in relation to the D. simulans fru allele. We then tested whether perception of auditory or olfactory information affects receptivity in the context of fru by removing the last two antennal segments and aristae of the female these are the primary organs for sensing odorants and auditory signals, respectively [74]. We observed no significant difference in copulation upon removal of these organs (figure 3b electronic supplementary material, tables S12 and S13). Lastly, we tested whether fru is acting via a recently discovered sensory modality: substrate-borne abdominal vibrations during courtship [75,76]. We used a granite surface to eliminate the transfer of these signals [53], but female receptivity did not increase upon removal of substrate vibrations in the context of fru (figure 3c electronic supplementary material, table S14). Thus, fru does not appear to be acting through these single sensory modalities, and is probably either acting through the integration of multiple modalities, or via a signal or sensory organ other than the ones we assessed here.

Figure 3. The removal of sensory modalities does not increase female receptivity towards D. melanogaster males. (a) Proportion of hybrid disruption fru MI05459 and control sim/mel Bal females that mate when paired with males with wings removed (wing−) versus those with wings intact (wing+). The assay was repeated for this disruption plus two additional disruptions (fru 4-40 , fru GAL4 ). (b) Proportion of hybrid disruption fru MI05459 and control sim/mel Bal females that mate when their last two antennal segments have been removed (ant−), which removes the primary sensory organs for both olfactory and auditory signals, compared with females with intact antennae (ant+). The assay was repeated for this disruption and fru GAL4 . (c) Proportion of hybrid disruption fru MI05459 or fru GAL4 and control sim/mel Bal females that mated when placed on a vibrationless (vib−) versus control (vib+) substrate.

Identifying a genetic basis for behavioural isolation is critical for understanding the speciation process and the evolutionary history of such a reproductive barrier [77,78]. Here, we identify fru as a behavioural isolation gene that influences D. simulans female rejection of D. melanogaster males. This effect appears to be through the P2 group of transcripts of fru, the first behavioural role of a non-sex-specific transcript of this gene. Since the fru gene plays a role in sex determination and has highly conserved genetic sequence across multiple orders, it is possible that splice variants of this gene, and other genes involved in sex-differentiation pathways, may affect behavioural isolation in other species pairs.

Data accessibility

All data are provided in the manuscript and electronic supplementary material tables.


The early beginning of DNA sequencing

The progress from the first isolation of DNA by Friedrich Mietscher in 1869 to next generation sequencing (high-throughput sequencing) was the result of continuous efforts of the science community. After Watson, Crick and Franklin discovered the structure of the DNA in 1953, many attempts were made to sequence DNA. Eventually, in 1965, Robert Holley sequenced the first tRNA, for which he was awarded the Nobel Prize in 1986. In his Nobel Prize speech, he said, “without minimizing the pleasure of receiving awards and prizes, I think it is true that the greatest satisfaction for a scientist comes from carrying a major piece of research to a successful conclusion” (Holley, 1968). In 1972, Walter Fiers was first to sequence the DNA of a complete gene (the gene encoding the coat protein of the bacteriophage MS2) by utilising RNAses to digest the virus RNA and isolate oligonucleotides, and then separating them via electrophoresis/chromatography (Declercq et al. 2019 Min Jou et al., 1972).


Abstract

Background:

Genome-wide transcriptome analysis has greatly advanced our understanding of the regulatory networks underlying basic cardiac biology and mechanisms driving disease. However, so far, the resolution of studying gene expression patterns in the adult heart has been limited to the level of extracts from whole tissues. The use of tissue homogenates inherently causes the loss of any information on cellular origin or cell type-specific changes in gene expression. Recent developments in RNA amplification strategies provide a unique opportunity to use small amounts of input RNA for genome-wide sequencing of single cells.

Methods:

Here, we present a method to obtain high-quality RNA from digested cardiac tissue from adult mice for automated single-cell sequencing of both the healthy and diseased heart.

Results:

After optimization, we were able to perform single-cell sequencing on adult cardiac tissue under both homeostatic conditions and after ischemic injury. Clustering analysis based on differential gene expression unveiled known and novel markers of all main cardiac cell types. Based on differential gene expression, we could identify multiple subpopulations within a certain cell type. Furthermore, applying single-cell sequencing on both the healthy and injured heart indicated the presence of disease-specific cell subpopulations. As such, we identified cytoskeleton-associated protein 4 as a novel marker for activated fibroblasts that positively correlates with known myofibroblast markers in both mouse and human cardiac tissue. Cytoskeleton-associated protein 4 inhibition in activated fibroblasts treated with transforming growth factor β triggered a greater increase in the expression of genes related to activated fibroblasts compared with control, suggesting a role of cytoskeleton-associated protein 4 in modulating fibroblast activation in the injured heart.

Conclusions:

Single-cell sequencing on both the healthy and diseased adult heart allows us to study transcriptomic differences between cardiac cells, as well as cell type-specific changes in gene expression during cardiac disease. This new approach provides a wealth of novel insights into molecular changes that underlie the cellular processes relevant for cardiac biology and pathophysiology. Applying this technology could lead to the discovery of new therapeutic targets relevant for heart disease.

Introduction

Clinical Perspective

What Is New?

Single-cell sequencing can be used to identify subpopulation-specific and novel cell type-specific markers in the healthy and diseased heart.

Identification of new disease-driven cell populations provides insights into gene expression changes that are triggered by ischemic injury.

Myozenin2-enriched cardiomyocytes form a distinct subpopulation of cardiomyocytes in the healthy heart.

Cytoskeleton-associated protein 4 is a novel marker for activated fibroblasts that positively correlates with known myofibroblast markers in both murine and human diseased hearts.

In vitro experiments suggest a modulating function for cytoskeleton-associated protein 4 in myofibroblast activation.

What Are the Clinical Implications?

Single-cell sequencing of the adult heart allows us to examine molecular mechanisms that drive the cellular processes underlying heart disease.

New biology discovered by single-cell sequencing can ultimately lead to the development of novel therapeutic strategies.

We identified cytoskeleton-associated protein 4 as a new marker for activated cardiac fibroblasts during ischemic injury that appears to attenuate myofibroblast activation.

The heart consists of a collection of different cell types that coordinately regulate cardiac function. 1 Changes in cellular composition and function are mechanistically responsible for cardiac remodeling and repair during disease. Identifying the underlying differences in gene expression between cell types or transcriptome heterogeneity across cells of the same type will greatly help to improve our understanding of cellular changes under both healthy and diseased conditions. 2 RNA amplification strategies have provided the opportunity to use small amounts of input RNA for genome-wide gene expression analysis at single-cell resolution. The analysis of individual cells randomly drawn from a sample allows for an unbiased view of mRNAs present in different cell types of an organ, which will provide a more accurate classification of cellular diversity through differences in gene expression. 2–5 Recently, transcriptional profiling of cardiac cell populations during heart development by single-cell sequencing has led to the identification of lineage-specific gene programs that underlie early cardiac development. 4,6 However, so far, no studies have focused on single-cell transcriptomics of the adult heart.

Here we present a method for obtaining high-quality RNA from digested adult cardiac tissue for automated single-cell sequencing of both the healthy and diseased heart. Using this approach, we were able to collect reliable gene expression data for all main cardiac cell types. Clustering analysis uncovered both known and novel markers of certain cell populations and led to the identification of multiple subpopulations within a certain cell type. Single-cell sequencing analysis of both healthy hearts and hearts suffering from ischemic injury indicated that cardiac damage gives rise to new subpopulations of known cell types. Using our single-cell sequencing data, we were able to identify cytoskeleton-associated protein 4 (CKAP4) as a novel marker for activated fibroblasts, which we were able to validate in cardiac samples from patients suffering from ischemic heart disease. Additionally, in a set of in vitro experiments, we were able to show that CKAP4 is involved in myofibroblast activation.

Altogether, our data for the first time show the feasibility of using single-cell sequencing on the adult heart to study transcriptomic differences between cardiac cell types and the heterogeneity in gene expression within 1 cell population. This method will greatly advance the molecular insights into cellular mechanisms that are relevant for cardiac remodeling and function.

Methods

An expanded methods section, a step-by-step protocol for the single-cell sequencing approach, and any associated references are available in the online-only Data Supplement. The data, analytic methods, and study materials will be made available by the authors to other researchers for purposes of reproducing the results or replicating the procedure.

Experimental Animals

All animal studies were performed in accordance with institutional guidelines and regulations of the Animal Welfare Committee of the Royal Netherlands Academy of Arts and Sciences. C57BL/6J mice were subjected to sham (control) or ischemia reperfusion (IR) surgery as previously reported. 7 Hearts were collected and analyzed 3 days after surgery.

Digestion of the Heart

After collecting the infarcted areas (infarct and border zone region) or the corresponding region of control hearts, the tissue was digested for 15 minutes and used for subsequent RNA isolation or single-cell sorting and sequencing.

Flow Cytometry

The freshly collected cardiac cell lysates were resuspended in DMEM, and living single cells were sorted into 384-well plates based on multiple scatter properties and DAPI exclusion. After cell sorting, the plates were immediately centrifuged and stored at −80°C.

Library Preparation and Sequencing of Single Cells

The SORT-seq procedure was applied as described previously. 5 Illumina sequencing libraries were prepared using the TruSeq small RNA primers (Illumina) and sequenced paired-end at 75-bp read length with Illumina NextSeq.

Data Analysis of Single-Cell RNA Sequencing

Paired-end reads from Illumina sequencing were mapped with BWA-ALN 8 to the reference genome GRCm38/mm10. For quantification of transcript abundance, the number of transcripts containing unique molecular identifiers per cell-specific barcode was counted for each gene. Next, the RaceID2 algorithm was used to cluster cells based on K-medoids clustering, visualize cell clusters using t-distributed Stochastic Neighbor Embedding (t-SNE), and compute genes up- or downregulated in all cells within the cluster compared with cells not in the cluster. 2,3

Bulk Sequencing and Data Analysis

For bulk sequencing, RNA from cardiac tissue was isolated using Trizol. Subsequently, libraries were prepared and sequenced using a similar protocol as described for single-cell RNA sequencing.

Pathway Analysis and Gene Ontology

To investigate whether differentially expressed genes in subgroups of cells share a similar biological function, enrichment analyses on these genes were performed using the gene ontology enrichment tool and the Kyoto Encyclopedia of Genes and Genomes pathway analysis using DAVID. 9 Significant enrichment of genes in gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathway analyses are shown as P values corrected for multiple testing using the Benjamini–Hochberg method.

Human Heart Samples

Approval for studies on human tissue samples was obtained from the Medical Ethics Committee of the University Medical Center Utrecht, The Netherlands (12#387). In this study, we included tissue from the left ventricular free wall of patients with ischemic heart disease (infarct, border zone, and remote) and left ventricular free wall of nonfailing donor hearts.

Gene expression values obtained by quantitative polymerase chain reaction (qPCR) were plotted for correlation analysis.

Statistical Analysis (qPCR)

The number of samples (n) used in each experiment is indicated in the legend or shown in the figures. The results are presented as mean±standard error of the mean. For qPCR analysis, statistical analyses were performed using PRISM (GraphPad Software Inc). If 2 groups were compared, a Student’s t test was used.

Results

Isolation of High-Quality RNA From Digested Hearts From Adult Mice

To create a reliable gene expression atlas of all cardiac cell types, we first aimed to determine the optimal method for tissue digestion and RNA extraction to obtain high-quality RNA from single-cell suspensions of the adult heart. To do so, mice were euthanized, after which the hearts were perfused with cold perfusion buffer. After collecting the anterior wall of the left ventricle, the tissue was washed in ice-cold perfusion buffer, kept on ice, minced into small pieces, and transferred into a glass vial with cold digestion buffer. After digestion of the tissue, the cell suspension was gently pipetted up and down, after which the lysate was passed through a 100-μm cell strainer. The strainer was then rinsed with DMEM, and the cells were collected for subsequent RNA extraction or flow cytometry (Figure 1A and 1B).

Figure 1. Sorting single cells from the adult heart. A, Schematic representation of the heart and all main cardiac cell types. B, Work flow of the protocol. C, Gating strategy to sort single cells based on different scatter properties. D, Schematic image of the heart highlighting the area selected for enzymatic digestion images of cells before D’ and after sorting D”. E, Representative bioanalyzer results for RNA quality for the indicated number of sorted cells from control heart. This quality step was performed on each heart used for digestion and downstream single-cell analysis. FSC indicates forward scatter N/A, not applicable RIN, RNA integrity number and SSC, sideward scatter.

To optimize our digestion protocol, we started out by testing 4 different digestion solutions that are commonly used to digest muscle tissue containing liberase, collagenase II, pancreatin, or trypsin. 10–13 Based on cellular imaging and assessment of RNA quality by RNA integrity number (RIN), the digestion solution containing Liberase appeared most optimal for dispersing adult cardiac tissue while maintaining intact RNA (Figure IA and IB in the online-only Data Supplement).

To compare both the influence of the solution containing the single-cell suspension and time the samples were kept on ice before further processing, we tested both DPBS with 5% FBS 14 and DMEM and collected the cells for RNA analysis either immediately or after having been on ice for 30 or 60 minutes. Although the time on ice did not seem to influence the RIN, using DMEM appeared to provide higher quality RNA when lysing cardiac tissue (Figure IC in the online-only Data Supplement).

Next, we examined whether digestion in a 37°C incubator would be better than a 37°C shaking water bath using 2 different concentrations of the digestion enzyme Liberase. Based on RIN, 0.5 mg liberase yielded good-quality RNA, and visually it was evident that using the shaking water bath was better at digesting the pieces of cardiac tissue into suspension (Figure ID in the online-only Data Supplement). RNA isolated from mechanically homogenized cardiac tissue was taken along as a positive control (control heart).

To test whether the method of RNA isolation would influence the RNA quality, we used both the mirVana RNA isolation kit (Thermo Fisher Scientific) and Trizol (Invitrogen) to isolate RNA from the digested cardiac cells. RNA integrity was comparable for samples isolated with Trizol or mirVana (Figure IE in the online-only Data Supplement).

Together these data indicated that, based on cell morphology and RNA quality, using liberase to digest adult cardiac tissue for 15 minutes in a 37°C shaking water bath before Trizol RNA isolation, was optimal for obtaining a single-cell suspension of the heart (Figure IE in the online-only Data Supplement, indicated by an arrow).

Single-Cell Sorting Strategy

After enzymatically dispersing cardiac tissue, flow cytometry was used to separate the cells. Empirically, we found that using a large nozzle size (130 µm) allowed for sorting all range of cardiac cells without damaging larger cardiomyocytes. We based our gating strategy on multiple scatter properties, including DAPI to sort for living cells and green autofluorescence to sort for more complex cells that contain cytoskeletal filaments (Figure 1C). 15,16 Our results indicated that on average 89.4% of the cells from control hearts were viable after sorting, of which 92% showed an autofluorescent signal (Figure 1C). Additionally, to enrich for bigger cells, we selected for cells with a higher forward scatter width. Imaging the cells after sorting indicated that the cells remained intact and suitable for further sequencing applications (Figure 1D and 1E). Because cardiomyocytes are notoriously difficult to isolate by sorting strategies, we wanted to confirm the quality of the isolated cardiomyocytes after sorting. To do so, we crossed αmyosin-heavy chain-Cre transgenic mice with R26 lox-Stop-lox tdTomato mice to mark the cardiomyocyte population (Figure IIA in the online-only Data Supplement). After sorting we obtained 88.8% living cells (Figure IIB in the online-only Data Supplement), in line with data obtained in Figure IC in the online-only Data Supplement. To confirm that these were living cells, we resorted the single-cell lysate, which indicated 99.5% of these sorted cells were alive. (Figure IIC in the online-only Data Supplement). Imaging of the Tomato signals showed the cardiomyocytes to be intact after the sorting procedure (Figure IID in the online-only Data Supplement). To ensure that the quality of RNA remained intact even after sorting, we collected different quantities of cardiac cells and isolated RNA. Bioanalyzer results indicate that after cell separation, the RNA isolated from the dispersed and sorted cells remained of good quality as indicated by RIN (Figure IE in the online-only Data Supplement).

Single-Cell Sequencing to Identify Gene Expression Signatures in All Main Cardiac Cell Types

Using the SORT-Seq protocol, 5 on average we detected 16 874 raw unique reads per cell. The distribution of the readcounts across cells indicated that the reads come from single cells because we did not observe the bi- or multimodal distribution of reads one would expect when detecting a transcript from doublets or multiplets (Figure IIE in the online-only Data Supplement). After applying filtering procedures for quality and input (see online-only Data Supplement), 426 cells from 3 different control hearts were used for downstream in silico analysis.

For the identification and analysis of all main cardiac cell types, we used the RaceID2 algorithm. 3,17 K-medoids clustering of 1-Pearson correlation coefficients revealed 14 distinct cell clusters in the adult heart (Figure 2A). The separation between the different cell clusters was further validated by a t-distributed Stochastic Neighbor Embedding (t-SNE) map showing lower intracluster cell-to-cell distance compared with the intercluster distances (Figure 2B). Next, we assessed which genes were differentially expressed within each cluster compared to all other clusters (Database I in the online-only Data Supplement). We used the abundance of known marker genes to determine the cell identity of the different cardiac cell clusters (Figure 2C and Database I in the online-only Data Supplement). We were able to identify clusters belonging to all major cell populations in the heart (Figure IIIA in the online-only Data Supplement). The t-SNE maps showing the expression of some of these marker genes indicated the presence of cardiomyocytes (Figure 2D), fibroblasts (Figure 2E), endothelial cells (Figure 2F), and macrophages (Figure 2G).

Figure 2. Clustering of cardiac cells based on gene expression differences. A, Heatmap showing distances in cell-to-cell transcriptomes of 426 cells obtained from 3 hearts. Distances are measured by 1-Pearson’s correlation coefficient. K-medoids clustering identified 14 different cell clusters depicted on the x and y axes of the heatmap. B, t-SNE map indicating transcriptome similarities among all individual cells. Different numbers and colors highlight the different clusters identified by K-medoids clustering shown in A. C, Tables showing a list of known marker genes of main cardiac cell types used to identify the subpopulations of cells identified in A. D through G, t-SNE maps indicating the expression of selected, well-established cellular markers in cell populations identified as cardiomyocytes (D), fibroblasts (E), endothelial cells (F), and macrophages (G). Data are shown as normalized transcript counts on a color-coded logarithmic scale. t-SNE indicates t-distributed Stochastic Neighbor Embedding.

Taken together, these findings demonstrate that by using our method, we are able to generate a single-cell gene expression profiling of the adult heart, which can identify all major cell types.

Contribution of Mitochondrial Transcripts Differ for Individual Cell Types

While investigating gene expression signatures in all main cardiac cell types (Figure IIIA and Database I in the online-only Data Supplement), we observed that the contribution of mitochondrial and genomic transcripts varied among different cells. All cardiomyocytes have higher percentages of mitochondrial transcripts (58% to 86% of total transcripts) when compared with other cardiac cell types (Figure IIIB through IIIE in the online-only Data Supplement). Because mitochondrial transcripts are so abundant, we excluded them from the clustering analysis and focused only on the differential expression of genomic genes.

Single-Cell Sequencing Identifies Cell Type-Specific Subpopulations in the Healthy Heart

Single-cell sequencing of the adult heart revealed multiple clusters within the same cell type (Figure 2C and Figure III in the online-only Data Supplement). These cells are bioinformatically clustered based on the differential expression of genes, while they also express marker genes that identify them as a specific cell type. According to the gene expression of marker genes, we were able to identify 4 different clusters of cardiomyocytes, 2 clusters of endothelial cells, 2 clusters of fibroblasts, 2 clusters of macrophages, 1 cluster of smooth muscle cells, and 1 cluster of erythrocytes (Figure 2C and Figure III in the online-only Data Supplement).

To explore these clusters in more detail, we focused on the cardiomyocyte clusters. These clusters are defined as cardiomyocytes based on the enriched expression of cardiomyocyte marker genes compared with other cells (Figure 3A, indicated in red, and Figure 3B). The enriched expression of a divergent set of genes classifies them as separate cell clusters based on the RaceID2 parameters (Figure 3A, indicated in black, and Figure 3B). It is interesting to note that the relative expression of the well-known cardiomyocyte markers genes showed a high level of variation between the different cardiomyocyte clusters, and as expected we observed them to be enriched compared with the fibroblasts (Figure 3C).

Figure 3. Gene expression differences across CM subpopulations. A, Table showing the top 25 highest expressed and enriched genes per cluster. Genes highlighted in red are known as cardiomyocyte markers. B, t-SNE maps showing the distribution in the expression of selected cardiomyocyte markers that are enriched in indicated cardiomyocyte clusters. Expression is shown as normalized transcript counts on a color-coded scale. C, Heatmap of average normalized number of reads across the 4 cardiomyocyte clusters and a fibroblast cluster as a control. t-SNE indicates t-distributed stochastic neighbor embedding. CM indicates cardiomyocyte.

On the basis of the differential expression of cardiomyocyte marker genes, cluster 4 appeared to be the most divergent from the other cardiomyocyte clusters (Figure 3C). A t-SNE map confirmed that this cluster does express cardiomyocyte markers, such as cardiac troponin T, Tnnt2 (Figure 4A). However, it is the only subpopulation of cardiomyocytes that is enriched for Myozenin2 (Myoz2) expression (Figures 3 and 4B). To determine whether the clustering of Myoz2 expression is a product of stoichastic gene expression within the cardiomyocytes of the heart, we aimed to validate this cluster in an independent mouse model. To do so, we specifically sorted and sequenced cardiomyocytes from cardiac tissue from mice in which we labeled the cardiomyocytes with tdTomato (α myosin-heavy chain -Cre transgenic mice crossed with R26 lox-Stop-lox tdTomato mice) (Figure IIA in the online-only Data Supplement). Similar to the control hearts (Figure IVA in the online-only Data Supplement), we also detected a Myoz2-enriched cardiomyocyte cluster (Figure IVB in the online-only Data Supplement). In comparing the highly expressed genes in the Myoz2-enriched cardiomyocyte clusters from either C57BL/6J or tdTomato mice, we identified a large overlap in enriched genes from both clusters (Figure IVC in the online-only Data Supplement). The overlap in gene enrichment strongly suggests the clustering data to be reliable and that the Myoz2-enriched cardiomyocyte cluster is indeed different from the other cardiomyocyte clusters.

Figure 4. Myozenin 2 expression is restricted to a subset of cardiomyocytes. A and B, t-SNE maps showing expression of Tnnt2 (A) and Myoz2 (B) across cells from control hearts. Expression is shown as normalized transcript counts on a color-coded scale. C, Representative confocal images of control hearts stained for TNNT2 (green), MYOZ2 (red), and DAPI (blue). Immunohistochemistry was performed on 3 control hearts. D and E, The Kyoto Encyclopedia of Genes and Genomes pathway analysis of top 200 upregulated (D) or downregulated (E) genes in the Myoz2–expressing cardiomyocyte cluster (cluster 4) compared with all other cardiomyocyte clusters (clusters 1, 3, and 9). t-SNE indicates t-distributed stochastic neighbor embedding.

Immunohistochemistry indicated MYOZ2 to be enriched in a layer of cardiomyocytes (costained with TNNT2) located at the epicardial surface of the heart (Figure 4C). Gene ontology analysis of the most differentially regulated genes in the Myoz2-enriched cluster versus other cardiomyocytes indicated the highly detected genes to be involved in degenerative disorders of the central nervous system (Figure 4D), whereas the lowly expressed genes appeared to be involved in cardiac diseases (Figure 4E). This finding is interesting because Myoz2, also known as Calsarcin-1, is an inhibitor of the pathological, prohypertrophic phosphatase Calcineurin. 18,19

In conclusion, our data show that single-cell sequencing analysis on cardiac cells can serve to identify subpopulations of a certain cell type, thus indicating a large heterogeneity among cells from the same cell type. Additionally, this approach allows for the detection of novel cell type-enriched gene expression, providing a basis for discovering new gene functions in certain cell types.

Single-Cell Sequencing of the Injured Heart

Ischemic heart disease is the most common form of cardiovascular disease inducing a remodeling response across the damaged area that involves fibroblast activation, immune cell infiltration, neoangiogenesis, and a change in cardiomyocyte function. 20,21 The identification of new regulators, transcription factors, and molecular pathways that are relevant for these cellular processes could eventually lead to the development of new therapeutic strategies for patients suffering from this disease.

To determine whether our method would allow for studying the influence of disease on inter- and intracellular changes in gene expression, we exposed mice to ischemic injury (IR) and collected samples 3 days after IR. 7 Using the same isolation procedure as described earlier, we isolated 509 individual cells from the infarcted region of heart (Figures V and VI in the online-only Data Supplement). We pooled these cells with the 426 cells obtained previously from the corresponding region of control hearts for in silico analysis with RaceID2. Using our optimized protocol, we were able to obtain good-quality RNA from single-cell suspensions from the infarct region, as assessed by RIN (Figure VIB in the online-only Data Supplement). K-medoids clustering of 1-Pearson correlation revealed 17 different cell clusters in all pooled cells from control and 3 days after IR hearts (Figure VIC in the online-only Data Supplement). Using the abundance of known marker genes to determine the cell identity of the different cardiac cell clusters, we were able to identify clusters belonging to all major cell populations in the heart (Figure VIIA in the online-only Data Supplement), which was validated by t-SNE maps showing enriched marker gene expression in these individual clusters (Figure VIIB through VIIE in the online-only Data Supplement). To see how ischemic injury would affect the expression of mitochondrial and genomic genes within all cell types, we compared the average relative expression of genomic and mitochondrial genes within each cluster obtained from either healthy or diseased hearts (Figure VIIIA through VIIID in the online-only Data Supplement). On average, the expression of mitochondrial genes decreased after ischemic injury within all cell clusters.

Inter- and Intracellular Gene Expression Changes Induced by Ischemic Injury

By generating a t-SNE map to indicate transcriptome similarities between all individual cells, we were able to see that the majority of clusters contained cells from both control and injured hearts (Figure 5A and 5B). However, injury triggered the appearance of disease-enriched cell clusters (Figure 5A through 5D and Figure VIIIE in the online-only Data Supplement).

Figure 5. Single-cell sequencing of the ischemic heart. A and B, t-SNE map indicating transcriptome similarities among 935 individual cells. A, Colors highlight the conditions of the hearts from which the cells where derived (control in green and 3 dpIR in pink). B, Numbers highlight the clusters identified in Figure VIC in the online-only Data Supplement. C and D, Enlargement of the t-SNE map from (A) and (B), focusing in on the fibroblast clusters. The dotted circle highlights clusters containing mainly cells from the diseased hearts. E, Enlargement of the t-SNE map from fibroblast clusters showing higher expression of specific genes in the clusters from the diseased hearts compared with clusters from both control and 3 dpIR hearts. Expression is depicted as normalized transcript count on a color-coded logarithmic (log2) scale. F, Validation of the increased expression of genes found upregulated in fibroblasts from diseased compared with control hearts by qPCR on whole hearts (n=5 2-sample t test *P<0.05 control versus 3 dpIR. G, Pie graph showing the number of significantly (P<0.05) up- and downregulated genes in diseased fibroblast clusters compared with the control fibroblast cluster. Genes were selected with a log2-fold change of ≥1.5 or −1.5, respectively. H, Kyoto Encyclopedia of Genes and Genomes pathway analysis and gene ontology term enrichment on genes that were significantly upregulated in (G). I, Expression of the upregulated 11 genes in the diseased fibroblast clusters compared with the control fibroblast cluster as identified in (G). J, Heatmap showing the differential expression of the 11 upregulated genes in the diseased fibroblast clusters in bulk sequencing of the infarct region of control and 3 dpIR hearts. K, t-SNE map showing the expression of Ckap4 across all cells sequenced from control and diseased hearts. Expression is depicted as a normalized transcript count on a color-coded scale. L, Enlargement of the t-SNE map from fibroblast clusters showing a higher expression of Ckap4 in the diseased fibroblast clusters (clusters 13 and 15) compared with the control fibroblast cluster (cluster 7).3 dpIR indicates 3 days after IR Ckap4, cytoskeleton-associated protein 4 IR, ischemia reperfusion qPCR, quantitative polymerase chain reaction and t-SNE, t-distributed stochastic neighbor embedding.

Focusing on the fibroblasts, we were able to identify 3 different fibroblast clusters (7, 13, and 15) (Figure 5B through 5D). By examining the library origin, we observed that the cells in clusters 13 and 15 predominantly stemmed from the injured hearts, whereas cluster 7 contained cells from both conditions (Figure 5C and 5D). Cells from clusters 13 and 15 were characterized by the relatively high expression of Postn, Wisp1, and Tnc, previously associated with fibroblast activation, 22–24 and t-SNE maps again confirmed the enriched expression of these genes in the disease-enriched fibroblast clusters (Figure 5E and Figure IX in the online-only Data Supplement). The disease-specific induction of these genes was validated by qPCR on cardiac tissue from either control or injured hearts (Figure 5F).

Ckap4 Expression Is Specifically Increased in Activated Fibroblasts

Having identified a population of fibroblasts stemming from both healthy and diseased hearts (cluster 7) and a population stemming predominantly from diseased hearts (cluster 13 and 15), we next determined the differentially expressed genes between these populations. Using a log2-fold change of 1.5 or −1.5 or more with a P value of 0.05, we found 11 genes upregulated and 20 genes downregulated in the disease-enriched fibroblasts population (Figure 5G). In using the same parameters for defining differentially expressed genes in larger cell clusters, the number of differentially expressed genes between cells from the healthy and ischemic hearts became much larger. For example, we found 205 genes differentially expressed between healthy and ischemic cardiomyocytes coming from clusters 1, 3, 4, and 8, whereas 191 genes were differentially expressed in macrophages coming from either the healthy or ischemic hearts forming cluster 5, 9, 14, and 16 (data not shown). The upregulated genes in these diseased-enriched clusters were related to various processes associated with extracellular matrix deposition and collagen deposition (Figure 5H), a hallmark of the fibrotic response after ischemic injury of the heart. 25 In line with this observation, among the enriched genes, we detected various collagen genes and markers known to be expressed in activated fibroblasts, such as Postn(Figure 5I). 22 The identification of clusters that are enriched for disease-related genes from ischemic hearts further underscores the validity of our clustering approach.

In addition to known markers for activated fibroblasts, we identified Ckap4 to be upregulated in this population of activated fibroblasts. Ckap4 is a transmembrane protein and can act as a receptor for various ligands in different cells. 26,27 However, its function in cardiac fibroblasts remains unknown. We could confirm the upregulation of Ckap4 after io on ischemic injury in the whole heart by bulk RNA sequencing (Figure 5J). However, our single-cell sequencing data allowed us to detect the upregulation of Ckap4 specifically in activated fibroblasts and not in any other cell types, which was confirmed by t-SNE maps (Figure 5K and 5L). By immunohistochemistry, we validated the induction of CKAP4 in hearts after ischemic injury, which overlapped with cells expressing the fibroblasts marker vimentin (Figure 6A). To investigate the functional relevance of Ckap4 expression in activated fibroblasts, we inhibited Ckap4 expression in fibroblasts, after which we exposed them to TGFβ, a well-known inducer of myofibroblasts formation (Figure 6B). 28 TGFβ was able to induce Ckap4 expression when compared with control. Small interfering RNA-mediated inhibition of Ckap4 resulted in a dose-dependent reduction of Ckap4 (Figure 6C) on the mRNA and protein level (Figure 6C and 6D). The qPCR analysis showed that TGFβ treatment led to an increase in the expression of markers for activated fibroblasts and that inhibition of Ckap4 further potentiated this effect under such conditions (Figure 6E). Although this finding warrants further investigation, this seems to imply that CKAP4 in activated fibroblast functions to dampen the expression of genes related to activated fibroblast.

Figure 6. Ckap4 is specifically expressed in fibroblasts after IR and in activated fibroblasts in vitro. A, Representative confocal images of control (left) and 3 dpIR (right) mouse heart stained for CKAP4 (red) and known markers for different cardiac cell types (green): endothelial cells (PECAM1, upper), cardiomyocytes (ACTN2, middle), and fibroblasts (VIM, lower). Immunohistochemistry was performed on 3 hearts per condition. B, Schematic overview of Ckap4 knockdown experiments in activated NIH/3T3 fibroblasts. C, qPCR for Ckap4 after TGFβ stimulation in NIH/3T3 fibroblasts after Ckap4 siRNA treatment or scrambled siRNA treatment. Expression levels are relative to vehicle-treated NIH/3T3 cells transfected with scrambled siRNA (n=3 to 10 2-sample t test #P<0.005 versus scrambled siRNA vehicle treated *P<0.005 versus scrambled siRNA vehicle treated $P<0.005 versus scrambled siRNA TGFβ treated). D, Representative confocal images of NIH/3T3 cells on vehicle or TGFβ treatment without or with Ckap4 knockdown. E, qPCR for marker genes of activated fibroblasts in NIH/3T3 cells on vehicle or TGFβ treatment with or without Ckap4 knockdown. Expression levels are relative to vehicle-treated NIH/3T3 cells transfected with scrambled siRNA (n=5 to 9 2-sample t test #P<0.05 versus scrambled siRNA vehicle treated *P<0.05 versus scrambled siRNA vehicle treated $P<0.005 versus scrambled siRNA vehicle treated). 3 dpIR indicates 3 days afer IR Ckap4, cytoskeleton-associated protein 4 IR, ischemia reperfusion NIH, National Institutes of Health qPCR, quantitative polymerase chain reaction siRNA, small interfering RNA TGFβ, transforming growth factor β and VIM, vimentin.

Also in cardiac samples from patients suffering from ischemic heart disease, we were able to confirm a positive correlation in expression between CKAP4 and genes known to be induced in activated cardiac fibroblasts (Figure 7A through 7C). In addition, immunohistochemistry showed a strong overlap between CKAP4 expression and various well-established fibroblast markers in human ischemic hearts (Figure 8D), suggesting that CKAP4 also has a role in activated fibroblasts in humans.

Figure 7. CKAP4 is coexpressed with markers of activated fibroblasts in human ischemic hearts. A through C, Pearson correlation between the expression of CKAP4 and markers for activated fibroblasts POSTN(A), CTHRC1 (B), and FN1 (C) determined by qPCR analysis on human cardiac tissue (n=30, including control hearts and 3 different parts of the ischemic heart: remote, border zone, and infarct, n=35). D, Representative confocal images from human ischemic hearts stained for CKAP4 (red) and markers for different cardiac cell types (green): endothelial cells (PECAM1, upper left), cardiomyocytes (ACTN2, upper right), and fibroblasts (ACTA2, middle left DDR2, middle right VIM, lower left PDGFR, lower right). Immunohistochemistry was performed on 3 hearts per condition. CKAP4, cytoskeleton-associated protein 4 qPCR, quantitative polymerase chain reaction andVIM, vimentin.

Taken together, our single-cell RNA sequencing data on healthy and diseased hearts demonstrate that we are able to identify disease-specific subpopulations of various cell types. Comparing gene expression patterns between healthy and diseased subpopulations within cell types allowed us to detect cell type-specific upregulation of various genes. Using this approach, we identified Ckap4 as a marker specifically upregulated in activated fibroblasts.

Discussion

Here we show that our optimized technique to isolate and sort adult cardiac cells in combination with a high throughput method to sequence single cells with SORT-seq 2,5,29 provides a unique opportunity to reliably obtain single-cell gene expression data of the adult mammalian heart (Figure X in the online-only Data Supplement and step-by-step protocol in the online-only Data Supplement). Among the sequenced cells, we were able to identify all major cardiac cell types, including cardiomyocytes, fibroblasts, endothelial cells, and macrophages. 1,30

A major advantage of single-cell sequencing is the ability to detect heterogeneity within a certain cell type in the heart. Clustering analysis of our single-cell sequencing data shows differentially expressed genes in subsets of cells that are likely to contribute to the functional diversity within different cell types. For example, we found 4 clusters of cardiomyocytes that when compared to fibroblast, as expected, show a significant enrichment for cardiomyocyte marker gene expression. However, our data also show that there is substantial heterogeneity in the expression of well-established cardiomyocyte markers among the different cardiomyocyte subpopulation (Figure 3C). Although we currently do not know the biological relevance of this observation, it could imply that there are functionally different cardiomyocyte populations already in a healthy heart.

In addition, we show that only a subset of cardiomyocytes express Myoz2, a protein belonging to a family of sarcomeric proteins. 18 Myoz2 has been shown to tether α-actinin to calcineurin, a well-known inducer of cardiomyocyte hypertrophy, 31 thereby inhibiting the pathological hypertrophic response of cardiomyocytes. 18 Myoz2 expression limited to only a subset of cardiomyocytes, predominantly located toward the epicardial region of the heart, raises the possibility that subpopulations of cardiomyocytes respond differently to calcineurin-mediated hypertrophy, with some being more resistant than others. A challenge for single-cell sequencing studies is to discriminate between difference based on stoichastic changes in gene expression or biology. 32 To show that the differences in expression in the Myoz2-enriched cardiomyocte cluster are biologically relevant, we additionally clustered sequenced cardiomyocytes from an additional mouse model (Figure IV in the online-only Data Supplement). Also in this study, we were able to identify a distinct cluster of cardiomyocytes to be enriched for Myoz2 expression. When comparing the enriched genes in both Myoz2 clusters, we detected a large overlap, suggesting that these clusters of Myoz2-enriched cardiomyocytes indeed resemble each other and might be functionally different compared with the other cardiomyocytes. Determining the biological function of this subset of cardiomyocytes could yield valuable insights into cellular mechanisms responsible for cardiac function and pathologies.

Comparing single-cell sequencing datasets from both the healthy and injured heart revealed that disease gives rise to new subpopulations of known cell types that appear specific for or enriched with cells coming from the diseased heart. Although it is known that the cellular composition of the heart changes during pathological stress, 33 we are limited in our ability to detect genome-wide changes in expression specifically occurring within each cell type during cardiac stress. Our single-cell sequencing data now allow us to identify transcriptome-wide differences in all mayor cardiac cell types among different conditions. By defining disease-related cell subpopulations and the associated changes in gene expression, we expect to identify novel molecular mechanisms that are relevant for the cellular changes underlying heart disease.

The potential of the described approach is nicely exemplified by the identification Ckap4 as a novel marker for activated fibroblasts. Ckap4 was previously reported to have a function in cell proliferation during tumor progression, 34 but its function in the heart so far has been unstudied. In our dataset, we found Ckap4 to be expressed in the same cell population as Postn, Ctrc1, and Fn1, well-known markers for cardiac myofibroblasts. 22–24 Immunohistochemistry on both control and ischemic hearts confirmed the expression of CKAP4 to be specific for the stressed heart and to overlap with vimentin, a marker for fibroblasts in the ischemic heart. In vitro loss of function of Ckap4 showed an overactivation of myofibroblast-related genes before and after TGFβ stimulation, suggesting a role for Ckap4 during fibroblast activation. Moreover, we observed a positive correlation between CKAP4 and fibroblasts markers in cardiac biopsies from patients suffering from ischemic heart disease.

In an attempt to collect living cells for our single-cell analysis, we biased our flow cytometry gating toward larger cells. As a result, we preferentially isolated cardiomyocytes (71% for sequenced cells in control heart and 59% in diseased hearts) while it is estimated that 30% of cardiac cells are cardiomyocytes. 30 Nonetheless, we were still able to detect also other cell populations. Further optimization of our gating strategy is required to obtain a representative spectrum of cardiac cell types after cell sorting by flow cytometry.

Inherent to the technique, the low-sequencing efficiency prevents us from detecting lower expressed genes in a cell. Because the majority of reads are currently being spent on mitochondrial genes (≤23% to 84% of reads for cardiomyocytes clusters from both sham and IR), efforts to separate out transcripts derived from the mitochondria could greatly enhance the sequencing efficiency.

Although future developments will continue to optimize the single-cell sequencing technology on organs, our study for the first time indicates the feasibility of using this technique on adult cardiac tissue. Our data indicate the potential of this method to identify transcriptional differences between cardiac cell types and to study the heterogeneity in gene expression between the different subpopulations. Together these discoveries create major opportunities to unveil new gene functions for cellular biology and organ function.


Conclusions

The generation of insertional transformant libraries of Chlamydomonas and the subsequent PCR-based screening of those libraries for disruptions in specific genes provides a reliable, moderate-throughput, reverse screening strategy. This strategy can be applied to those organisms for which there are methods for high efficient transformation, but for which homologous recombination occurs at low frequency. Organisms that would benefit from this approach include algae, plants and other eukaryotes. In this study, we identified 82.5% of the screened insertion sites in Library 1 (

100,000 transformants). Naturally, using a large population of transformants significantly increases the chances of identifying a transformant with an insertion in a specific target gene. However, progressive screens for individual target gene disruptions allow for the termination of the screen once an appropriate insertion into the target gene is identified this approach could significantly reduce the final number of transformants that would need to be screened. Assuming the use of 4 target primers for a gene of interest and the presence of the transformant of interest within a population of 100,000 transformants, the number of PCR reactions needed to identify a superpool that contains such a transformant would be between 4 and about 416 (depending on whether it is identified in the first or last superpool tested).

For constructing transformant libraries, we recommend the use of a heterologous, plasmid-free, linear marker gene DNA. This type of marker gene DNA allows for the design of stable, highly specific marker primers positioned at the end of the marker gene coding region, which would facilitate the identification of interrupted target genes. Also, the use of a plasmid-free marker gene makes it easier to characterize the genomic regions flanking the inserted DNA using procedures such as RESDA-PCR [27] and TAIL-PCR [28]. Finally, using a plasmid-free marker gene coupled with electroporation-based methodology for transformation of Chlamydomonas results in the generation of transformants with no or very small deletions, which makes it easier to characterize the insertion site and the exact lesion that is generated, which in turn adds to the value of the PCR-based reverse genetics approach.

Unfortunately, the absence of a convenient method to maintain large number of Chlamydomonas transformants requires that the screen be completed with a one-use (without long-term maintenance of transformants) or a cryopreserved pooled library. With the one-use libraries, once an interrupted target gene is identified within the DNA isolated from the pooled population of transformants, the isolation of that specific transformant can be rapid and feasible. In contrast, with the cryopreserved pooled libraries this process can be time consuming and difficult because some transformants may grow slowly or be under-represented within the population. Hence, PCR-based screening of a moderately large number of target genes using a one-use library represent a reliable method for obtaining specific mutants over a relatively short time interval. Cryopreserved pooled libraries would be most advantageous when performing many small independent screens separated in time it would avoid the generation and arraying of transformants for each screen.


Isolation of neoantigen-specific T cells from tumor and peripheral lymphocytes

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Gartner, J. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Horovitz-Fried, M. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Shamalov, K. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Trebska-McGowan, K. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Bliskovsky, V. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Parkhurst, M. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Prickett, T. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Crystal, J. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Rosenberg, S. in: JCI | PubMed | Google Scholar

1 Laboratory of Tumor Immunology and Immunotherapy, Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel.

3 Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), NIH, Bethesda, Maryland, USA.

Address correspondence to: Cyrille J. Cohen, Laboratory of Tumor Immunology and Immunotherapy, Gonda Building 204, Room 105, The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900-02, Israel. Phone: 972.3.531.7037 E-mail: [email protected]

Find articles by Robbins, P. in: JCI | PubMed | Google Scholar

Published September 21, 2015 - More info

Adoptively transferred tumor-infiltrating T lymphocytes (TILs) that mediate complete regression of metastatic melanoma have been shown to recognize mutated epitopes expressed by autologous tumors. Here, in an attempt to develop a strategy for facilitating the isolation, expansion, and study of mutated antigen–specific T cells, we performed whole-exome sequencing on matched tumor and normal DNA isolated from 8 patients with metastatic melanoma. Candidate mutated epitopes were identified using a peptide-MHC–binding algorithm, and these epitopes were synthesized and used to generate panels of MHC tetramers that were evaluated for binding to tumor digests and cultured TILs used for the treatment of patients. This strategy resulted in the identification of 9 mutated epitopes from 5 of the 8 patients tested. Cells reactive with 8 of the 9 epitopes could be isolated from autologous peripheral blood, where they were detected at frequencies that were estimated to range between 0.4% and 0.002%. To the best of our knowledge, this represents the first demonstration of the successful isolation of mutation-reactive T cells from patients’ peripheral blood prior to immune therapy, potentially providing the basis for designing personalized immunotherapies to treat patients with advanced cancer.

Cancer results from the accumulation of mutations, some of which lead to uncontrolled cell growth and tumor metastasis and are termed drivers, along with other mutations, termed passengers, that may not contribute to the tumorigenic phenotype ( 1 ). Recent advances in next-generation sequencing enabled the rapid assessment of the mutational landscape of human cancers and revealed that melanomas can harbor between fewer than 100 and more than 2,000 nonsynonymous mutations ( 2 – 5 ). Driver as well as passenger mutations could potentially lead to the expression of mutated proteins recognized by the immune system as foreign ( 6 – 11 ). When encountering peptides presented by MHC molecules on the surface of target cells, T cells are able to recognize single amino acid differences between mutated and nonmutated epitopes, providing an opportunity for the development of novel adoptive immunotherapies ( 12 ).

The administration of autologous tumor-infiltrating T lymphocytes (TILs) can lead to complete, durable tumor regressions in patients with metastatic melanoma ( 13 , 14 ). Understanding the nature and the specificity of these tumor-reactive lymphocytes is crucial for the successful dissemination of adoptive immunotherapy. Studies carried out since the early 1990s, primarily using cDNA library screening methods, demonstrated that tumor-reactive T cells recognized epitopes derived from a variety of molecules ( 15 ). The majority of tumor antigens identified using these approaches could be grouped into several general categories: differentiation antigens, whose expression is limited to a single tissue cancer germline antigens that are not expressed in normal adult tissues with the exception of testis and antigens derived from tumor-specific somatically mutated genes. In this regard, we recently evaluated the ability of TILs to recognize candidate mutated epitopes identified using whole-exome sequencing of autologous tumors in conjunction with HLA class I–binding algorithms. Using this approach, we identified 7 mutated peptide epitopes recognized by 3 therapeutic bulk TIL cultures that mediated objective tumor regressions ( 6 ). We also demonstrated that mutation-reactive T cells present in TILs could be identified by screening autologous antigen-presenting cells that were transfected with tandem minigene libraries expressing cancer mutations identified from patients’ tumors ( 9 ). Importantly, we recently demonstrated that mutation-specific T cells can also be found in solid tumors other than melanoma ( 16 ). Tumor-infiltrating CD4 + T cells derived from a patient with cholangiocarcinoma were found to recognize a mutated epitope derived from ERBB2IP, and administration of a bulk lymphocyte population containing a high percentage of T cells reactive with this epitope mediated an objective clinical response without toxicity ( 16 ).

Strategies that target mutated tumor-specific antigens may have advantages over those that target nonmutated self-antigens. Many tissue-specific differentiation antigens have been identified however, targeting these molecules with potent effector T cells has been associated with autoimmunity, leading in some cases to dose-limiting toxicities ( 17 , 18 ). In addition, T cells recognizing mutated (“foreign”) antigens are not subject to central tolerance and thus may potentially express higher-affinity T cell receptors (TCRs) than do those directed against self-antigens ( 7 , 8 ).

Here, we sought to develop an approach that would facilitate the isolation, expansion, and study of T cells specific for mutated cancer antigens. Nonsynonymous somatic mutations were initially identified by carrying out whole-exome and RNA-sequencing (RNA-seq) analyses of tumors from metastatic melanoma patients. Large panels of MHC tetramers containing candidate mutated peptides that were predicted using MHC-binding algorithms to bind with high affinity to patients’ HLA molecules were then generated using a previously described UV-mediated peptide exchange strategy ( 19 ). Screening of fresh human tumor digests and bulk TIL cultures for their ability to bind mutated tetramers resulted in the identification of mutation-reactive T cells in bulk populations from 5 of the 8 patients evaluated using this approach. Furthermore, this report demonstrates that T cells specific for neoantigens, which represented as little as 0.002% of the peripheral T cell population prior to therapy, can be sorted and expanded in vitro by a factor of 1,000-fold or more. Cells isolated from peripheral blood or in vitro–cultured TILs using this approach can provide a readily accessible source of functional TCRs that can potentially be used to genetically engineer autologous T cells and provide the means to develop personalized adoptive immunotherapies for cancer.

Use of whole-exome sequencing approach to identify and isolate T cells specific for mutated antigens from fresh tumor digests. A schematic representation of the experimental approach followed in this study is presented in Figure 1A. Initially, DNA isolated from tumors that were resected from 8 patients with metastatic melanoma was subjected to whole-exome sequencing to identify nonsynonymous mutations. In addition, RNA was isolated to carry out RNA-seq analysis from the same tumor samples in an attempt to limit candidate peptides to those derived from expressed gene products. Liberal filters were used to generate nonsynonymous somatic exome variant calls to minimize the chance of false-negatives, as described in the Methods section. The number of exome variants initially identified from each of the tumors varied from a low of approximately 300 to a high of over 10,000, as shown in Supplemental Table 1 supplemental material available online with this article doi:10.1172/JCI82416DS1. After filtering expressed transcripts, mutated protein products were used to predict potential epitopes restricted to patients’ HLA class I alleles using a peptide-binding algorithm ( 20 ). Between 16 and 48 of the mutated peptides that were predicted to bind with high affinity to HLAA*01:01 or A*02:01 were then synthesized (Supplemental Table 2) and used to generate panels of fluorescent MHC tetramers using a previously described UV-mediated peptide exchange procedure( 19 ).

Analysis of FTDs. (A) Schematic representation of the experimental strategy. (B) Approximately 3 × 10 4 to 5 × 10 4 cells from FTDs were incubated for 3 to 4 days in lymphocyte medium and then stained with panels of mutated epitope MHC tetramers. The cells were analyzed by flow cytometry (gated on the lymphocyte population). For each patient, the frequency of tetramer-positive cells is indicated (y axis) for each peptide screened (x axis). Patients 3713, 3466, and 3879 were screened for binding to HLA-A*02:01 tetramers and patient 3919 for binding to HLA-A1 tetramers. (C) Tetramer (Tet) staining for selected mutated epitopes identified for each patient. The percentage of positive cells is indicated. (D) Cultured FTD cells were incubated with T2 cells pulsed with the predicted mutated epitopes or incubated with T2 cells pulsed with irrelevant peptide (Ctrl, see Methods) for 16 hours, and IFN-γ secretion was measured by ELISA. Pt., patient.

Analysis of short-term in vitro cultures derived from fresh tumor digests (FTDs) obtained from 4 of the 7 patients with available fresh tumor samples contained T cells that bound to 1 or more of the candidate tetramers containing mutated candidate epitopes (Figure 1B). The FTD cultures from patients 3713 and 3466 each bound to 3 HLA-A2 tetramers, and the FTD cultures from patients 3879 and 3919 each bound to a single HLA-A2 and A1 tetramer, respectively. The percentages of tetramer-positive cells detected in these cell populations ranged from 0.4%, observed with the TEAD1 and ERBB2 epitopes recognized by FTD from patient 3466, and 7.2% for the HLA-A1–restricted TRIP12 epitope recognized by FTD from patient 3919 (Figure 1C and Table 1). Further evaluation demonstrated that the FTD cultures recognized T2 cells pulsed with the cognate mutated epitopes (Figure 1D). In some cases, identification of the optimal T cell epitopes required screening of additional peptides that were not present in the initial screening assays. The mutated TEAD1 peptide with the highest-ranking score, VLENFTIFLV, ranked in the top 0.35% of a random set of natural peptides and had a predicted IC50 of 13 nM. While tetramer staining and reactivity were observed with a TEAD1 peptide, VLENFTIFLV, that was predicted to possess a relatively high affinity for HLA-A2 (Supplemental Figure 1), staining was significantly improved when using a 9-mer peptide that was shifted by 1 amino acid toward the amino terminus of the TEAD1 protein SVLENFTIFL (Supplemental Figure 1). This peptide, however, only ranked in the top 1.8% of peptides and was predicted to possess a binding affinity of 65 nM.

Mutated peptides identified

Approximately 10 3 to 8 × 10 4 tetramer-positive T cells were then electronically sorted and stimulated using anti-CD3 Ab in the presence of irradiated feeder cells, which led to an approximately 1,000-fold expansion 7–10 days after stimulation. Tetramer staining revealed that between 50% and 99% of the T cells that were sorted and expanded using this approach bound to the original tetramer used to isolate the cells (Figure 2A). Each of the expanded T cell cultures recognized the appropriate cognate-mutated epitope, but not the WT version, in cocultures with pulsed T2 cells (Figure 2B), indicating that this screening and isolation strategy represents a robust method for expanding antigen-reactive T cells from FTDs.

Sorting and expansion of mutated antigen–specific T cells. (A) Tetramer-positive T cells were sorted from FTDs and expanded in vitro by stimulation with OKT3. Ten days later, T cells were assessed for tetramer binding. (B) T cells that were sorted and expanded in vitro were cocultured with T2 cells pulsed with the cognate mutated epitope or cocultured with T2 cells pulsed with the corresponding WT peptide. Sixteen hours after the start of the coculture, IFN-γ secretion was measured by ELISA. Results are presented as the mean ± SEM (n = 3). P < 0.05 by Student’s t test for the difference in IFN-γ secretion between mutated (Mut pep) and WT peptides (WT pep).

Isolation of tetramer-reactive T cells from treatment TIL cultures. The proportion of T cells specific for the mutated epitopes in bulk TIL cultures used to treat the patients described above were then evaluated for their ability to recognize the mutated epitopes identified using FTD cultures. All of the specificities identified in the FTDs (Figure 1) were also detected in the expanded TIL cultures by tetramer staining (Figure 3A). With the exception of AHNAK, TILs secreted IFN-γ in response to all of the epitopes identified using FTD cultures (Figure 3B). The relatively low frequency of AHNAK-specific cells in the TIL culture, which was estimated to be approximately 0.1%, might explain the lack of reactivity in this assay nevertheless, we could perform a peptide titration assay using AHNAK-specific T cells that were sorted and expanded from infusion TILs (Supplemental Figure 2). The relative frequencies of these cell populations often differed dramatically between the FTDs and the infusion TILs. For example, the frequency of AHNAK-specific cells decreased from 1.1% to 0.1 % following in vitro expansion, whereas TEAD1-specific T cells increased from 0.4% to 11% after expansion. As above, mutated antigen–specific T cells could be sorted to purity levels ranging from 66% to 95% (Figure 3C) and recognized the appropriate peptides (data not shown).

Treatment TILs recognize mutated antigens. (A) Bulk-treatment TIL cultures were stained with the indicated tetramer and analyzed by flow cytometry. (B) TILs were cocultured with T2 cells pulsed with the indicated mutated epitope at different concentrations (ranging from 10 –6 to 10 –13 M) or cocultured with T2 cells pulsed with irrelevant peptide (Ctrl, at 10 –6 M). Sixteen hours after the start of the coculture, IFN-γ secretion was measured by ELISA. (C) Tetramer-positive cell populations were sorted from infusion TILs and expanded using OKT3. Ten days after the start of the expansion, T cells were assessed for tetramer binding. (D) Bulk-treatment TILs for patient 3703 were analyzed for HLA-A*02:01/mutated epitope tetramer binding. The frequency of tetramer-positive cells (y axis) for each peptide screened (x axis) is indicated (left panel). A positive mutated epitope, NSDHL, was identified (middle panel), and a representative staining plot of 2 tetramer-positive cell populations is shown (right panel). (E) NSDHL hi and NSDHL lo cell populations were sorted and expanded, and 10 days after the start of the expansion, T cells were assessed for NSDHL-tetramer binding (left panels). Additionally, these sorted cell populations were assessed for CD4 and CD8 expression (middle panels). The NSDHL hi cell population, which consisted of 98% CD8 + T cells, and the NSDHL lo cell population, which consisted of 99% CD4 + T cells, were analyzed for peptide reactivity in coculture with T2 cells pulsed with titrated doses of the mutated NSDHL epitope. Sixteen hours after the start of the coculture, IFN-γ secretion was measured in the culture supernatant by ELISA (right panels). Results are representative of 2 independent experiments.

One of the limitations of these studies is a lack of FTDs due to the low viability or relatively small size of some resected tumor samples nevertheless, for some patients such as number 3703, for whom no FTD was available, expanded TILs were available for testing. Screening of a panel of MHC tetramers containing 27 HLA-A*02:01– and 48 HLA-A*11:01–restricted mutated candidate epitopes identified by sequencing DNA from 3703 uncultured fresh tumor revealed that TILs from patient 3703 recognized a single HLA-A2–restricted mutated NSDHL epitope(Figure 3D). Two cell populations that differed in their intensity of binding to the mutated NSDHL tetramer were electronically sorted and expanded in vitro. Analysis of these cells following expansion indicated that T cells that bound low levels of NSDHL (NSDHL lo ) predominantly expressed CD4, whereas the NSDHL cell population that bound high levels of NSDHL tetramer (NHDH hi ) predominantly expressed CD8 (Figure 3E). Their functional avidities were then evaluated by examining their response to T2 cells that were pulsed with between 10 –6 M and 10 –13 M concentrations of the NSDHL peptide. The results indicated that the NSDHL hi and NSDHL lo cell populations possessed similar avidities, recognizing targets pulsed with a minimum of 10 –10 M peptide (Figure 3E).

Mutated antigen–specific T cells can be isolated from peripheral blood lymphocytes. In an attempt to deal with the limited availability of fresh tumor samples from some patients, efforts were made to isolate T cells reactive with the mutated epitopes from the patients’ peripheral blood mononuclear cells (PBMCs). As the frequency of such cells was expected to be extremely low, peripheral lymphocytes obtained prior to TIL treatment were incubated with MHC tetramers that had been labeled with the allophycocyanin (APC) and phycoerythrin (PE) fluorophores, allowing cell separations to be carried out using a 2D gating strategy. The levels of tetramer-specific T cells detected in peripheral blood were generally indistinguishable from background staining, with the exception of the TRIP12 HLA-A2 tetramer, which bound to 0.4% of the PBMCs from patient 3919 (Figure 4A). Following 10 days of in vitro expansion of electronically sorted cells stained with the 2 fluorophores, between 0.4% and 46 % of the cells bound to the appropriate tetramer, with the exception of cells reactive with the mutated AHNAK epitope (Figure 4, B and C). In addition, T cells reactive with the ERBB2-mutated epitopes only constituted 0.4% of the T cells obtained after a single expansion in vitro however, following a second round of enrichment performed with a cell sorter, 82% of the expanded population of T cells recognized the mutated ERBB2 epitope (Figure 4C). These data indicate that T cells specific for mutated tumor antigens can be isolated and expanded from the peripheral blood of most melanoma patients.

Sorting and expansion of mutated antigen–specific T cells from peripheral blood. (A) Cells (1 × 10 7 to 4 × 10 7 ) from pretreatment pheresis samples were thawed and incubated overnight in lymphocyte medium, followed by staining simultaneously with the indicated PE- and APC-labeled tetramers. The frequency of double-positive cells is indicated. (B) The tetramer-positive cell population was sorted and expanded in vitro, and 10 days after expansion, these cells were assessed for tetramer binding. The percentage of positive cells (gated on the CD8 + cell population) is indicated. (C) Similarly, mutated ErbB2–specific cells were sorted from peripheral blood cells. As the relative frequency of these cells was low after the first expansion (left histogram), a second sorting step was performed, and these cells were expanded and then analyzed for ErbB2-tetramer binding (right histogram).

Evaluation of T cell reactivity against mutated epitopes. Additional studies were then carried out to determine whether T cells that had been sorted using MHC tetramers could efficiently recognize both target pulsed with the mutated epitopes and cognate tumor cells. To that end, mutation-reactive T cells were cocultured with either peptide-pulsed T2 or autologous tumor cells, if available. All the T cell populations secreted high levels of IFN-γ in response to the cognate epitope presented by T2 cells (Figure 5, A–D), but failed to respond to the corresponding WT peptides (data not shown). Moreover, all of the T cells tested recognized autologous tumor cells, but not HLA-matched control melanoma cells, with the exception of COL18A1-reactive T cells, which failed to secrete significant levels of IFN-γ when cocultured with autologous fresh tumor cells from patient 3466. This may be due to a defect in processing this particular epitope, as T cells reactive with the mutated TEAD1 and ERBB2 epitopes recognized fresh tumor cells from patient 3466, indicating that these antigens were endogenously processed and presented by the tumor cell line (Figure 5B).

Tumor reactivity of T cells specific for mutated antigens. T cells that were isolated using mutated tetramer complexes were analyzed for peptide and tumor reactivity. T cells from patients 3713 (A), 3466 (B), 3879 (C), and 3919 (D) were cocultured with either peptide-pulsed T2/T2A1 cells (as indicated) or tumor cells. Sixteen hours after the start of the coculture, IFN-γ secretion was measured by ELISA in the culture supernatant. Results for patients 3713, 3879, and 3919 are presented as the mean ± SEM (n = 3). P > 0.05 by Student’s t test for the difference in IFN-γ secretion compared with the negative control.

Persistence of mutated antigen–specific T cells and TCR isolation. Studies were then carried out to evaluate the persistence of T cells reactive with mutated epitopes following adoptive TIL transfer. Specific T cells recognizing all of the mutated epitopes identified using the tetramer-binding approach, with the exception of the AHNAK peptide, could be detected in patients’ peripheral blood 2–12 months after treatment (Figure 6, A and B). Interestingly, the frequencies of the respective T cell populations in the post-treatment samples were similar to those in the original TIL cultures (Figure 3), and T cells reactive with some epitopes, such as the NSDHL peptide, could be detected 1 year after treatment.

Persistence of and TCR isolation from T cells specific for mutated antigens. (A) Cells obtained from pheresis in patients 2–3 months following adoptive TIL transfer were stained with the indicated MHC tetramer and analyzed for binding. The percentage of tetramer-positive cells among the gated lymphocyte population is indicated. (B) Similarly, pheresis samples from patient 3703 obtained 2, 5, 8, and 12 months following treatment were analyzed for NSDHL tetramer binding. The percentage of positive T cells among the gated lymphocytes is indicated. (C) Sorted T cells specific for mutated SRPX (left panel) were assessed for TCR BV13 and BV28 expression (right panel). (D) The α and β chains for the BV13 (CDR3b: CASSFFGSNQPQHF) and BV28 (CDR3b: CASGSSGQGEYGELFF) TCR were cloned from expanded T cells sorted from patient 3713 infusion bag cells stained with SRPX tetramer. OKT3-stimulated PBMCs were transfected with RNAs encoding the appropriate α and β pairs or with GFP. Transfected T cells were cocultured with either mutated SRPX–pulsed T2 cells (left panel) or tumor cells from patient 3713 (right panel), along with the appropriate negative controls (right panel: T2 pulsed with MART126-35 -2L epitope and melanoma cells from patient 3466). Bulk-treatment TILs from patient 3713 were used as a positive control for effector cells. Sixteen hours after the start of the coculture, IFN-γ secretion was measured in the culture supernatant. Results are presented as the mean ± SEM (n = 3) for 2 individual donors. P < 0.05 by Student’s t test for the difference in IFN-γ secretion compared with the negative control.

Another potential immunotherapeutic application of this tetramer-based isolation approach would be to clone the T cell receptor α (TRA) and β (TRB) chains expressed by mutated antigen–specific T cell populations ( 12 ). In an attempt to validate this approach, T cells reactive with the HLA-A2–restricted mutated SRPX epitope were isolated from the TIL population that had been administered to patient 3713. Staining with anti-TRB Abs indicated that approximately 80% of the SRPX-reactive T cells expressed BV13, whereas nearly 20% of the T cells expressed BV28. (Figure 6C). The TRA and TRB chains were then cloned from these 2 populations of tetramer-reactive T cells, and OKT3-stimulated PBMCs were transfected with in vitro–transcribed RNA encoding the appropriate TRA and TRB chains. Cells transfected with TCRs isolated from either population specifically secreted IFN-γ in response to T2 cells pulsed with the SRPX peptide and the autologous 3713 tumor cell line but not when cocultured with irrelevant control targets (Figure 6D). These findings indicate that it may be feasible to develop treatments based on the transfer of TCRs isolated from T cells sorted using tetramers bearing mutated epitopes to autologous peripheral T cells.

Objective clinical response rates of up to 70% have been observed in melanoma patients treated with expanded populations of TILs that were selected on the basis of reactivity with autologous tumor cells ( 13 ), and recent studies have indicated that melanoma TILs frequently recognize somatically mutated gene products ( 6 , 9 ). Ongoing studies are being conducted in additional patients to evaluate the prevalence of neoantigen-reactive T cells in TILs derived from other melanoma patients, which may represent an alternative to the use of autologous tumor cell targets in screening assays for the selection of cultures with tumor reactivity. The adoptive transfer of TILs containing high numbers of CD4 + T cells that recognize a mutated HLA class II–restricted neoantigen led to a partial regression of multiple lung metastases in a patient with cholangiocarcinoma ( 16 ), and current efforts are focused on replicating these results in other patients.

One of the factors potentially limiting the effectiveness of this approach, however, is the frequency of neoantigen-reactive T cells in bulk populations of TILs. Attempts to address this issue have focused on purifying tumor-reactive T cells from bulk TILs and PBMCs using HLA class I tetramers generated with candidate neoepitopes identified by whole-exome sequencing. The present study demonstrates that it is possible to design large panels of MHC tetramers presenting mutated epitopes and use them to identify and sort tumor-reactive T cells from tumor samples and from patients’ peripheral blood, where they are represented at levels estimated to be as low as 0.002% of the total lymphocyte population. While T cells specific for mutated epitopes could be identified and isolated from 5 of 8 patients in our study (Table 1), it may be possible to enhance the efficiency of this process. The strategy presented herein relies on several factors such as the precise determination of tumor genomic data, the fidelity of epitope prediction algorithms, and the availability of T cells. Robust algorithms capable of accurately predicting peptide immunogenicity ( 21 ) and proteasomal processing ( 22 ) are needed to determine the optimal threshold for the inclusion of epitopes in the initial screening phase. Identification of the optimal TEAD1 epitope, which was not present in the initial screening assays carried out with TILs from patient 3466, illustrates one of the problems encountered using current peptide prediction algorithms. Further improvements could take advantage of advanced tetramer screening methods such as combinatorial tetramer staining and mass cytometric analysis to facilitate characterization of the mutated epitopes ( 23 – 25 ). Additional conditional MHC ligands ( 26 ) for less frequent HLA alleles are also needed to expand the use of this approach to a wider patient population. These studies may be facilitated by identification of factors that influence the generation of neoantigen-reactive T cells. Although the expression of tumor antigens may bear some relationship to the induction of T cell responses, many additional factors such as the influence of processing mechanisms and the overall avidity of the TCR for particular peptide-MHC complexes can also influence the recognition of these epitopes.

Analysis of phenotypic and functional characteristics of tetramer-positive tumor-reactive T cells detected in fresh tumor and peripheral blood prior to therapy will also provide important clues to the factors that influence their ability to mediate tumor regression, although these studies are hampered by the relatively low frequency and unknown purity of many of the T cell populations isolated using these methods. Recent advances in single-cell whole-transcriptome analysis, coupled with functional TCR analysis, will hopefully lead to a better understanding of the factors that influence the in vivo activity of tumor-reactive T cells.

Comparisons of the frequencies of mutated epitope–specific T cells in the FTDs and in expanded TIL cultures provide evidence that individual clonotypes may undergo significant expansion or contraction following in vitro culture. For example, T cells reactive with the AHNAK epitope were almost lost following the in vitro expansion protocol and did not appear to persist in the patient after transfer. These findings indicate that some specificities present at early stages of TIL cultures may contract over time and would warrant the use of FTDs or briefly expanded cultures in the epitope/tetramer screening phases. It may also be possible to identify additional T cell specificities by screening peripheral lymphocytes using pools of tetramers containing multiple candidate epitopes. In addition, use of an initial CD3 + T cell enrichment step may enhance the efficiency of a subsequent sort carried out with a peptide-MHC tetramer.

Interestingly, phenotypic analysis demonstrated that approximately 30% of the T cells reactive with an HLA-A2–restricted NSDHL epitope isolated from a patient’s TILs expressed CD4, a phenomenon that was not observed with the other populations of T cells examined in this study (data not shown). Two HLA class I–restricted CD4 + T cells that recognize nonmutated tumor antigens were previously identified ( 27 , 28 ) however, unlike the results presented in this report, corresponding populations of CD8 + T cells were not identified in the same patients studied to identify tumor-reactive CD4 + T cells. Multiple neoantigens have been identified as targets of CD4 + T cells ( 29 ) however, all of the previous targets were recognized in the context of HLA class II gene products. While they generally appear to possess reduced cytolytic activity relative to CD8 + T cells, tumor-reactive CD4 + T cells have been shown to play a significant role in mediating tumor regression in both animal models and patients ( 16 , 30 – 33 ). The physiological role and significance of such nonclassical MHC class I–restricted T cells and their involvement in patients’ antitumor response is unclear, but recognition of class I–expressing tumor cells in vivo could lead to the secretion of cytokines that modulate the function of tumor-promoting cells in the tumor microenvironment and support cytotoxic T lymphocyte (CTL) activity ( 30 , 34 ).

Our preliminary data also demonstrate that multiple clonotypes specific for the same mutated epitope can be isolated from individual patients. A more detailed analysis of the phenotypes of these clonotypes and the biophysical properties of the TCR might shed light on the extent, depth, and requirements for efficient antitumor immune responses. From a practical standpoint, the ability to quickly isolate mutated epitope–specific TCRs, as shown herein, is of importance if one wants to harness the power of T cell redirection using TCR gene transfer ( 12 ). The clinical implementation of this approach will require a more flexible platform for TCR expression such as a transposon-based system ( 35 ), which is not dependent on labor-intensive testing of viral supernatants. The majority of candidate epitopes were restricted to a single patient tumor sample however, peptides encompassing the common driver mutations NRAS Q61R and RAC1 P29S were identified as candidate HLA-A1 epitopes in tumors from patients 3879 and 3919. In addition, NRAS Q61R and RAC1 P29S mutations were both observed in a tumor sample from patient 3702, and an HLA-A2–binding peptide encompassing the RAC1 P29S mutation was identified as a candidate epitope for patients 3702 and 3879. Previous studies demonstrated that the HLA-A1–restricted NRAS peptide evaluated in the current study was recognized by tumor-reactive T cells ( 36 ), and although T cells reactive with either of these epitopes were not identified in these patients, these cells could potentially be present at low levels in tumor or peripheral blood of other patients expressing the appropriate HLA alleles. Additionally, recent studies demonstrate that immune checkpoint inhibitors can generate and/or amplify an immune response against mutated antigens ( 37 – 40 ). As a result of their systemic nature, these treatments can lead to severe autoimmune toxicities ( 41 , 42 ). The occurrence of severe autoimmune toxicities following treatment with an approach that targets neoantigens using MHC tetramer–sorted lymphocytes or cells engineered to express mutated epitope–specific TCRs is unlikely, however, as expression of these antigens is strictly limited to tumor cells. Combining checkpoint inhibitors with the adoptive transfer of enriched populations of neoantigen-reactive T cells may also represent an effective treatment option for patients who progress following treatment with individual therapies.

In conclusion, our findings demonstrate that it is feasible to use MHC tetramers to isolate T cells specific for mutated epitopes in patients with metastatic melanoma and lay the groundwork for designing novel and personalized immunotherapeutic strategies for the treatment of advanced cancers.

Subjects and cell lines. Lymphocytes were cultured in complete lymphocyte medium (50% AIM V, 50% RPMI 1640 Invitrogen, Life Technologies) supplemented with 5% human serum and 300 IU IL-2. Melanoma tumor cells were generated at the Surgery Branch of the NCI and cultured in RPMI 1640 supplemented with 5% FCS. T2 (CRL-1992 ATCC) is a lymphoblastoid cell line deficient in transporter associated with antigen processing (TAP) function, facilitating the binding of cell-surface HLA-A*02:01 molecules with exogenously pulsed peptides. T2A1 is an HLA-A1 transfectant of T2 that can also present HLA-A1–restricted epitopes. All cells were maintained in an incubator at 37°C and 5% CO2. Bulk TIL cultures were expanded from FTDs as previously described ( 43 , 44 ).

Exome-sequencing and RNA-seq analyses. Whole-exome sequencing was carried out at Personal Genome Diagnostics Inc. (Baltimore, Maryland, USA) and the Genome Technology Access Center (Washington University, St. Louis, Missouri, USA), as previously described ( 6 , 9 ). Transcriptome analysis was performed on RNA isolated from fresh tumor cells or tissue culture cell lines. Briefly, total RNA was isolated from fresh tumor fragments using an RNAeasy kit (91355 QIAGEN) and RNA-seq libraries were prepared using SureSelect RNA Library Preparation kits (95051 Agilent Technologies) according to the manufacturers’ instructions. Samples were run on a MiSeq (92122 Illumina), and approximately 4 × 10 7 paired-end reads were obtained per sample. Variants were called if there were more than 2 variant reads, a minimum of 10% variant allele frequency (VAF), and less than 1% VAF in the normal DNA. In addition, common SNPs with a VAF of greater than 1% were eliminated from the analysis. Expression was evaluated by determining the fragment per kilobase per million reads (FPKM) values from the RNA-seq analysis for transcripts encoding individual epitopes, and the candidates that were screened were limited to those encoded by transcripts with a minimum FPKM of 1. A summary of the number of nonsynonymous somatic variant exome calls is shown in Supplemental Table 1. Whole genomic data were deposited in the NCBI’s Sequence Read Archive (SRA) database (accession no. SRP062169).

Epitope prediction and generation of MHC tetramers. For tumor from patient 3713, peptides encoded by variant transcripts with an FPKM value of greater than 3 were evaluated, whereas for tumors from patients 3466, 3703, 3879, 3919, 3903, 3702, and 3926, peptides encoded by variants with a minimum FPKM of 1 were evaluated. Candidate 9 and 10 amino acid peptides containing mutated residues that were predicted to bind with high affinity to patients’ MHC class I molecules were identified using the Immune Epitope Database (IEDB) prediction algorithm (http://tools.immuneepitope.org/mhci/) ( 20 ) and were synthesized by Peptide 2.0 or Atlantic Peptides as crude peptides. The mutated and corresponding WT peptides that were recognized by T cells were then synthesized and HPLC purified to greater than 95% by Peptide 2.0 or Atlantic peptides.

Biotinylated HLA monomers (for HLA-A*02:01, A*01:01, and A*11:01) loaded with UV-cleavable epitopes ( 26 ) were produced by the NIH Tetramer Core Facility at Emory University (Atlanta, Georgia, USA). The sequences of the peptides used in this study are detailed in Supplemental Table 2 and were synthesized by Peptide 2.0. Large panels of MHC tetramers were generated using the protocol detailed by Rodenko et al. ( 19 ). Briefly, HLA loaded with UV-sensitive peptide monomers were subjected to UV light in the presence of 50 μM individual candidate synthetic peptides for 1 hour on ice. Then, the monomers were tetramerized in the presence of fluorescent (PE or APC) streptavidin (Life Technologies) and kept at 4°C for further use.

FACS analysis and Abs. Prior to tetramer analysis, T cells were incubated in PBS containing 5% FCS and dasatinib (50 nM) for 30 minutes at 37°C to enhance tetramer binding ( 45 ). Between 2 × 10 4 and 5 × 10 4 cells were stained with 1 ng/μl MHC tetramer in PBS containing 5% FCS for 45 minutes at room temperature. Following 2 washes in PBS, immunofluorescence was analyzed as the relative log fluorescence of live cells gated on the on populations of cells with relatively low levels of forward and side scatter. Immunofluorescence of the gated lymphocyte population was measured using a FACSCanto (BD) or CyAn-ADP (Beckman Coulter) flow cytometer. Fluorophore-labeled anti-human CD8 (clone SK1), CD4 (clone L200), BV13 (clone AF23), and BV28 (clone CH92) Abs were purchased from BD-Immunotech.

Sorting and rapid expansion protocol. Following a 60-minute incubation with fluorophore-conjugated Abs or tetramers, T cells were isolated using a FACSAria cell sorter (BD) and collected in sterile PBS containing 50% FCS. T cells were expanded to large numbers using a rapid-expansion protocol (REP) with IL-2, anti-CD3 Ab, and irradiated feeder cells ( 46 ).

Analysis of T cell responses. T cells were tested for reactivity in cytokine release assays using commercially available ELISA kits for IFN-γ (Thermo Scientific) at an effector-to-target ratio of 1:1. Briefly, between 2 × 10 4 to 10 5 T cells were incubated for 16 hours at 37°C with a similar number of tumor cells or peptide-pulsed T2 cells. Unless indicated otherwise, T2 cells were pulsed with 1 μM specific peptide or control (pp65, NLVPMVATV) peptide for 2 hours at 37°C, and then washed 2–3 times prior to the start of the coculture.

TCR isolation and expression. RNA was isolated from approximately 10 6 tetramer-sorted cells. A 5′-RACE (rapid amplification of cDNA ends) procedure was performed using the SMARTer RACE kit (Clontech) and primers for the constant regions. The PCR products were cloned into a TOPO cloning vector (Invitrogen, Life Technologies). Twenty to forty clones for each chain were sequenced. TRA and TRB chains identified using the IMGT/V-Quest website (http://www.imgt.org/IMGT_vquest/vquest) ( 47 ) and subsequently cloned in a plasmid for mRNA production ( 48 – 50 ). Briefly, in vitro–transcribed mRNA encoding the identified TRA and TRB chains was generated using the AmpliCap-Max T7 High Yield Message Maker Kit (CELLSCRIPT) and purified using an RNA Clean-Up and Concentration Kit (Norgen Biotek Corp.). PBMCs were separated by centrifugation on a Ficoll-Hypaque cushion and stimulated with 50 ng/ml OKT3 and 300 IU/ml IL-2. Electroporation was performed at 400 V and 500 μs using an ECM 830 Square Wave Electroporation System (BTX Instrument Division, Harvard Apparatus Inc.). Transfections were performed using 1–2 μg in vitro–transcribed mRNA encoding the appropriate TRA and TRB chains per 10 6 PBMCs.

Statistics. Data samples were compared using a 2-tailed Student’s t test where indicated, and a P value of less than 0.05 was considered statistically significant.

Study approval. All of the patients were enrolled in a randomized trial comparing adoptive TIL transfer plus nonmyeloablative chemotherapy with or without 1,200 cGy total body irradiation (11C0123), with the exception of patient 3466, who was enrolled in a trial evaluating the response to unselected TILs (07C0176). Patients were eligible for these trials if they were 18 years or older had measurable metastatic melanoma and an Eastern Cooperative Oncology Group (ECOG) performance status of 0 or 1 a life expectancy of more than 3 months and no evidence of active systemic infections, coagulation disorders, or other active major medical or cardiovascular or immunodeficiency diseases. Patients had progressive metastatic melanoma upon their entrance into the study, and at least 4 weeks had elapsed after any previous treatment. All patients had a metastatic lesion of greater than 2 cm in diameter that could be resected for growth of TILs and had cells that grew in culture to sufficient quantities with in vitro antitumor activity as described in Methods. All materials from patients were obtained during the course of clinical trials approved by the IRB of the NCI, and all patients provided written informed consent prior to their participation in this study.

We thank Arnold Mixon and Shawn Farid for their excellent technical assistance during the cell-sorting procedures the Adelson Medical Research Foundation (AMRF) for their generous support and the NIH Tetramer Core Facility (contract HHSN272201300006C) for the provision of MHC tetramers.

Conflict of interest: The authors have declared that no conflict of interest exists.

Reference information: J Clin Invest. 2015125(10):3981–3991. doi:10.1172/JCI82416.


Material & Methods

Zebrafish maintenance

Zebrafish were maintained in a zebtec semi-closed recirculation housing system (Techniplast, Italy) with a constant water temperature of 28 °C and a conductivity of 500 μS. The fish are exposed to a daily 14 h light and 10 h darkness cycle and fed 2 times/day with dry food (SDS, UK) and once with Artemia (Ocean Nutrition). The Tg(fli1a:EGFP) zebrafish were obtained from the zebrafish international resource center (ZIRC). The Tg(fli1a:EGFP) were crossed and embryos were collected. The clutch of embryos (50–150) were transferred to a petri dish with E3 media and kept in an incubator at a temperature of 28 °C and a daily light and darkness cycle (see higher) until the moment of further processing. For all our experiments, roughly 10 times 100 embryos were used for FACS sorting. All embryos used were younger than 120 h post fertilization (hpf), and thus no ethical approval was needed.

Embryo dissociation for FACS sorting

A clutch of 50–150 embryos of age 3–5 days post fertilization (dpf) was euthanized using an overdose of tricaine and transferred to a 35 mm culture dish. The tricaine was removed as much as possible and 3 ml of preheated (28 °C) dissociation solution (1× PBS, trypsin 0.25%, 1 mM EDTA) was added. During an incubation period of 90 min at 28 °C, the larvae dissociation was advanced by pipetting up and down with a 1 ml pipette every 15 min. When the larvae were completely dissociated, the reaction was stopped by adding 3 μl 1 M CaCl2 (final concentration 1 mM) and 300 μl FBS (10% final concentration). The cells were transferred to a 15 ml tube, pelleted (5 min 800 g), washed with PBS and resuspended in 2 ml resuspension solution (Leibovitz’s L-15 medium + L-Glutamine without Phenol Red, FCS 10%, 0.8 mM CaCl2). The cell solution was passed several times through a 40 μM mesh filter prior to cell sorting. This cell suspension was analysed by flow cytometry on a Bio-Rad S3e cell sorter. Forward and side scatter was used to gate for live, single cells. GFP positive cells, which ranged between 5 and 15% of the parental population, were sorted and collected. No other markers were used for flow cytometry. Representative samples (a negative control vs. GFP positive cells) with the gating strategy are shown in the supplemental data (Additional file 6: Figure S4).

The sorted cells were collected in either the lysis buffer of the RNA isolation kit or into a collection medium (the used resuspension buffer). The cells were kept on ice during the whole procedure.

RNA isolation and quality control

RNA from sorted cells was isolated with the use of the RNeasy plus micro kit (Qiagen, 74,034) or the RNAqueous-Micro Total RNA Isolation Kit (Ambion, AM1931). For the RNeasy plus micro kit, the RLT buffer was supplemented with 2-mercaptoethanol (10 μl per 1 ml RLT) as suggested. The integrity and the concentration of the RNA was analyzed with the Fragment Analyzer (Advanced Analytical Technologies) High Sensitivity RNA Analysis Kit (DNF-472-0500). The PROSize software version 3.0.1.5 determined a RQN RNA integrity score considering the entire electropherogram.

5′ – 3′ delta-Cq RNA integrity assay

This RT-qPCR assay aims to assess the integrity of a low abundant reference gene mRNA, hprt1, as a representative of all mRNAs in the sample. By using the oligo dT primed reverse transcriptase iScript Select kit (Bio-Rad 1,708,896), the cDNA synthesis starts from the 3′ polyA tail, proceeding to the 5′ end of the mRNA transcript. Upon mRNA degradation, the cDNA synthesis is interrupted and not completed to the 5′ end of the transcript. We designed two RT-qPCR assays, 1 targeting the 5′ end (hprt1 5′-fw: CGTTTTGCAGTAGCTTGTCAGA hprt1 5′-rev: ACACCCGCTCTAAGTCAGC), and 1 targeting the 3′ end of the hprt1 cDNA (hprt1 3′-fw: GAGGAGCGTTGGATACAGA hprt1 3′-rev: CTCGTTGTAGTCAAGTGCAT). The assays were developed using the NCBI ‘pick primer’ tool and efficiencies were determined based on a relative 6-point 4-fold dilution series (32, 8, 2, 0.5, 0.125, 0.032 ng cDNA of a pool of embryos of various ages). The higher the 5′-3′ dCq value, the more mRNA degradation. RNA used for this assay was isolated from 1 embryo using the RNAqueous micro or the RNeasy plus micro kit as described in the manual. Starting from 250 ng RNA, cDNA was synthesized using the oligodT primers of the iScript select cDNA synthesis kit.

For RT-qPCR, 2.5 μl 2× SsoAdvanced SYBR Green supermix (Bio-Rad) was mixed with 5 ng input cDNA and 250 nM (finale concentration) of each forward and reverse primer in a 384 well plate (Bio-Rad) and run on a LightCycler 480 (Roche). Thermocycler conditions were set as follows: 95 °C for 2 min, followed by 44 cycles of 95 °C for 5 s, 60 °C for 30 s, 72 °C for 1 s and finally a melting curve analysis was performed at 95 °C for 5 s followed by 60 °C for 1 min, gradual heating to 95 °C at a ramp-rate of 0.11 °C/s followed by cooling to 37 °C for 3 min. Cq values were exported and RT-qPCR data were analysed with qbase+ version 2.6.1 (Biogazelle).

Genomic DNA contamination assessment

For the gDNA contamination assay, RNA was isolated from 2 embryos and diluted to 1 ng/μl to match input amount of sorted cells. Five samples for each kit were isolated without including the gDNA elimination step of the procedure, another five samples for each kit were isolated with the standard RNA isolation method including its gDNA elimination step. Each RNA sample was split into two: half of the RNA was subjected to a Heat & Run (ArticZymes 80,200–50, following the guidelines of the kit) gDNA removal step to remove possible contaminating gDNA. The other halve of the sample was used in further steps without extra gDNA removal. The Heat & Run kit enables gDNA removal without the need of an extra purification step, therefore preventing the loss of sparse RNA. qPCR for the reference gene elfa (elfa-fw: GGAGACTGGTGTCCTCAA elfa-rev: GGTGCATCTCAACAGACTT) and the ERE reference gene [22] loopern4 (loopern4-fw: TGAGCTGAAACTTTACAGACACAT loopern4-rev: AGACTTTGGTGTCTCCAGAATG) was performed on 2 ng RNA from each sample to test for contaminating gDNA. As qPCR is not possible using RNA as template, amplification signals the presence of gDNA in the RNA sample. The RNA was mixed with 2.5 μl 2× SsoAdvanced SYBR Green supermix (Bio-Rad) and 250 nM (finale concentration) of each forward and reverse primer in a 384 well plate (Bio-Rad) and run on a LightCycler 480 (Roche). Thermocycler conditions were as follows: 95 °C for 2 min, followed by 44 cycles of 95 °C for 5 s, 60 °C for 30 s, 72 °C for 1 s and finally a melting curve analysis was performed at 95 °C for 5 s followed by 60 °C for 1 min, gradual heating to 95 °C at a ramp-rate of 0.11 °C/s followed by cooling to 37 °C for 3 min. Cq values were exported and RT-qPCR data were analysed with qbase+ version 2.6.1 (Biogazelle).

Determination of the maximum dilution point of the lysis buffer of the RNeasy plus micro and the RNaqueous micro kit

A clutch of Tg(fli1a:EGFP) embryos was dissociated and processed for FACS sorting as described above. A range of cells (5000 10,000 20,000 30,000 50,000 75,000 100,000 150,000 and 200,000) were sorted either directly into the lysis buffer of the RNA isolation kit or in a collection medium (= the used resuspension buffer: Leibovitz’s L-15 medium + L-Glutamine without Phenol Red, FCS 10%, 0.8 mM CaCl2). When cells were sorted in a collection medium, they were centrifuged (5 min at 1200 rmp), supernatant was removed and the pellet was dissolved in the lysis buffer of the RNA isolation kit. RNA was isolated following the manual of the RNA isolation kit. The integrity and RNA concentration was measured with the Fragment Analyzer as described earlier.

Time series to test RNA preservation capacity of the lysis buffer after FACS sorting

20,000 GFP+ cells from Tg(fli1a:EGFP) embryos were FACS sorted following the procedure described above. The cells were either sorted in lysis buffer of the RNA isolation kit or into a collection medium (Leibovitz’s L-15 medium + L-Glutamine without Phenol Red, FCS 10%, 0.8 mM CaCl2). The sorted cells were kept on ice for a specific amount of time (0 min, 15 min, 30 min, 1 h, 2 h or 24 h) before RNA isolation was carried out. The cells that were sorted into a collection medium were pelleted (5 min, 1200 rpm) and dissolved in lysis buffer right at the start of the RNA isolation process. In addition, to test if longer storage was possible, samples were frozen in liquid nitrogen and stored at − 80 °C for a week before proceeding to RNA isolation. RNA quality was determined with the Fragment Analyzer as described above.

CDNA synthesis and RT-qPCR

Following the Heat & Run guidelines, 1 μl HL-dsDNase and 2 μl 10× Rxn buffer (Heat & Run, Articzymes) were added to 20 μl of each RNA sample. Samples were incubated for 10 min at 37 °C and 5 min at 58 °C for gDNA removal. cDNA synthesis and amplification was prepared using the SMART-Seq v4 Ultra Low Input RNA Kit (Takara Biosystems 634,890). cDNA concentrations were measured using the Qubit dsDNA high sensitivity assay kit (Invitrogen, Q32851). Each sample was diluted to 0.5 ng/μl and 2 μl was mixed with 2.5 μl 2× SsoAdvanced SYBR Green supermix (Bio-Rad) and 250 nM (finale concentration) of each forward and reverse primer in a 384 well plate (Bio-Rad) and run on a LightCycler 480 (Roche). Thermocycler conditions were as follows: 95 °C for 2 min, followed by 44 cycles of 95 °C for 5 s, 60 °C for 30 s, 72 °C for 1 s and finally a melting curve analysis was performed at 95 °C for 5 s followed by 60 °C for 1 min, gradual heating to 95 °C at a ramp-rate of 0.11 °C/s followed by cooling to 37 °C for 3 min. Cq values were exported and RT-qPCR data were analysed with qbase+ version 2.6.1 (Biogazelle). The ERE reference genes trd7, loopern4, hatn10 were used for normalization. For primer sequences, see Table 5.

Library prep, sequencing and data analysis

1 ng of SMART-Seq v4 amplified cDNA (Takara Biosystems 634,890) was used for Nextera XT DNA library prep (Illumina, FC-131-1024). For sample and input specifications see Table 2.

Libraries were sequenced using a NextSeq 500 (Illumina). Quality control on fastq files was performed with FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).

Reads were aligned to Danio rerio reference genome GRCz10 with STAR v2.4.2a using a two-pass strategy. Genes were quantified using the Danio_rerio.GRCz10.91.gtf transcriptome.

The DESeq2 R-package (version 1.20.0) was used for count normalization and differential gene expression analysis. The normalized read counts were used to generate PCA plots, heatmaps and the correlation matrix. Pearson’s correlation coefficient was calculated on log transformed normalized read counts. Pre-ranked Gene Set Enrichment Analysis (GSEA) was preformed using GenePattern 2.0 (https://cloud.genepattern.org/gp/pages/login.jsf).


Access options

Get full journal access for 1 year

All prices are NET prices.
VAT will be added later in the checkout.
Tax calculation will be finalised during checkout.

Get time limited or full article access on ReadCube.

All prices are NET prices.


Acknowledgments

We thank P. Sengupta (Brandeis University), J. Culotti (Mount Sinai Hospital, Toronto) D. Colon-Ramos (Yale University), M. Hammarlund (Yale University) and the C. elegans Genetics Center (University of Minnesota) for strains, D. Colon-Ramos and S. Jang (Yale University) for the confocal image of NSM in Fig. 3, Vanderbilt Flow Cytometry Shared Resource for FACS analysis and Vanderbilt Technologies for Advanced Genomics (VANTAGE) for generating microarray and RNA-Seq data sets.


4 EXPERIMENTAL PROCEDURES

4.1 Strains, fungal transformation, and plasmid construction

M. oryzae wild-type strain O-137 was used as a recipient strain to generate fungal transformants using Agrobacterium tumefaciens-mediated transformation (Khang et al., 2005 ). M. oryzae strains were cultured on oatmeal agar plates at 24 °C under continuous light and stored frozen at −20 °C to maintain full pathogenicity (Valent et al., 1991 ). See Tables S3 and S4 for the list of M. oryzae strains and plasmids used in this study. To monitor PWL2 expression at single-cell resolution, M. oryzae transformant CKF3538 was made by sequential transformation of pCK1292 and pCK1714 into wild-type strain O-137. pCK1292 was generated by cloning a 0.5-kb EcoRI-BamHI fragment of pBV167 (RP27 promoter) (Shipman et al., 2017 ) and 1.7-kb BamHI-HindIII fragment of pAN582 (pBV359, tdTomato:Nos terminator) (Nelson et al., 2007 ) in EcoRI-HindIII sites of pBV141 (pBGt) (Kim et al., 2011 ). pCK1714 was generated by cloning a 1.7-kb EcoRI-BsrGI fragment of pCK1298 (PWL2p:EGFP), a 0.12-kb BsrGI-NotI fragment of pBV118 (pd2EGFP-1) (Li et al., 1998 ), and a 0.5-kb NotI-XhoI fragment of pBV1102 (PWL2 3′ terminator region) in EcoRI-SalI sites of pBV1 (pBHt2) (Mullins et al., 2001 ).

To identify the cis-element in the PWL2 promoter, fungal transformants with fluorescently labelled nuclei were generated. In particular, M. oryzae transformant CKF3276 was generated by sequential transformation of pCK1528 and pCK1586 into wild-type strain O-137. pCK1528 was generated by cloning a 1.0-kb EcoRI-BamHI fragment of pBV126 (RP27 promoter) (Khang et al., 2010 ), a 1.4-kb BamHI-BsrGI fragment of pAN582 (tdTomato:NLS), and a 0.4-kb BsrGI-HindIII fragment of pBV578 (Nos terminator) in EcoRI-HindIII sites of pBV141. pCK1586 was generated by cloning a 872-bp EcoRI-BamHI fragment of pCK1574 (PWL2 promoter) and a 1.3-kb BamHI-XhoI fragment of pCK1576 (sfGFP without ATG:NLS:PWL2 3′ terminator region) in EcoRI-SalI sites of pBV1. sfGFP was first cloned from sfGFP-Lifeact-7 (pCK1349), a gift from Michael Davidson (Addgene plasmid # 54739). A series of constructs with deletions, replacements, and mutations in the PWL2 promoter was generated by restriction enzyme digestion and ligation, or PCR using the primers in Table S5 (Figures S8 and S9). The intact or manipulated PWL2 promoter fragments and sfGFP:NLS:PWL2 3′ terminator region were first cloned into pJET1.2 (Thermo Fisher Scientific) to confirm the introduction of the desired alteration by DNA sequencing, and later the fragments were inserted into the binary vector pBV1.

After two rounds of selections on TB3 (0.3% yeast extract, 0.3% casamino acid, 20% sucrose) and V8 (8% V8 vegetable juice [Campbell's], pH 7) media containing 200 µg/ml of hygromycin (Fisher BioReagents) or 800 µg/ml of G418 (Fisher BioReagents), and 200 µM of cefotaxime (Gold Biotechnology), at least 10 independent transformants for each construct were selected and purified by single spore isolation. All positive transformants showed similar fluorescence patterns and behaved indistinguishably when compared to the wild-type strain under microscope examination. Due to the position effect of transgenic genes (Chen & Zhang, 2016 Soanes et al., 2002 ), two or three independent transformants for each construct were randomly chosen for further analysis.

4.2 Infection assays

Rice sheath inoculations were performed using the susceptible rice YT16 as previously described (Jones & Khang, 2018 ). Briefly, excised leaf sheaths (5–8 cm long) from 19- to 21-day old plants were inoculated with a spore suspension (10 5 spores/ml in distilled water). The inoculated sheaths were hand-trimmed and immediately used for confocal microscopy. Inoculations on onion epidermal peels were performed as previously described (Xu et al., 1997 ). In heat-killed inoculations, pretrimmed sheaths or onion epidermal peels were incubated in 70 °C water for 25 min, cooled down to room temperature, and then inoculated with a spore suspension.

4.3 RNA isolation and RT-qPCR

For the RT-qPCR assay, 15 infected rice sheaths at each time point of 18, 25, 33, and 38 hpi were collected as described (Mosquera et al., 2009 ), frozen immediately in liquid nitrogen, and stored at −80 °C for RNA extraction. The TRIzol method (Invitrogen) was used to extract total RNA. Genomic DNA was eliminated by Turbo DNase (Ambion) according to the manufacturer's instructions. Complementary DNA (cDNA) was synthesized using the ImProm II Reverse Transcriptase system (Promega) from 500 ng of total RNA extracted from infected tissue or mycelia grown for 5 days in complete medium (CM). RT-qPCR was performed with the MX3005P (Stratagene) systems using the VeriQuest SYBR Green qPCR Master Mix (2× Thermo Fisher). Each reaction (final volume 14 µl) contained 7 µl of VeriQuest SYBR Green qPCR Master Mix, 1.5 µl of each the forward and reverse primers (3.3 nM concentrations for each, Table S5), 2 µl of cDNA, and 2 µl of distilled water. Thermocycler conditions were as follows: 2 min at 50 °C, 10 min at 95 °C followed by 40 cycles of 95 °C for 30 s, 60 °C for 30 s, and 72 °C for 30 s. The specificity of each primer pair was checked by incorporation of a final dissociation cycle. The relative expression level was calculated using the 2 −∆Ct method (Livak & Schmittgen, 2001 ) with the M. oryzae actin gene (MGG_03982) as the reference gene (Che Omar et al., 2016 ). Briefly, the average threshold cycle (Ct) was normalized to that of the actin gene for each sample as 2 -ΔCt , where ΔCt = (Ct, target gene − Ct, actin). Two technical replications for each of three biological replications were performed. Mean and standard deviation were calculated from three biological replicates.