Efficient CRISPR‐based genome editing using tandem guide RNAs and editable surrogate reporters

Cleavage efficiency plays a key role in clustered regularly interspaced short palindromic repeat (CRISPR)‐based gene editing, particularly when the given guide RNA exhibits low cleavage activity. Here, we describe the packaging of tandem guide RNAs and single‐strand annealing‐based surrogate reporter cassettes into the CRISPR/CRISPR‐associated protein 9 vector, which increased gene‐editing efficiency by 4.94–6.31‐fold and simultaneously enriched the proportion of genetically modified cells. This strategy may substantially improve genome‐editing efficiency for demanding applications.

The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system evolved in bacteria and archaea as a self-defense mechanism against invading phage DNA [1-3]. Molecular biologists have retooled the Cas9 nuclease into 'molecule-sized programmable scissors' that, directed by a single guide (sg) RNA sequence, can precisely cleave the target at potentially any position in the genomes of diverse species [4-7]. This technology offers the power to manipulate genomes and holds great promise for clinical applications, such as for disease modeling and therapy [8,9], as well as for altering the genomes of embryos or gametes [10-14].
CRISPR-based genome editing remains in its infancy and requires further optimization. An ideal gene-editing system would allow precise manipulation at any genomic locus with high efficiency and specificity, while facilitating subsequent identification and isolation of the genetically modified cells. While various prediction algorithms can help design sgRNAs that maximize on-target efficiency and minimize off-target events [15-19], many factors can influence whether the sgRNAs function as predicted [19,20]; these factors include expression levels of Cas9 and sgRNA [21], delivery efficiency [22], and characteristics of target cells [23]. Furthermore, identifying edited cells is typically performed using fluorescence-activated cell sorting (FACS) or antibiotic selection, which can lack sensitivity to detect the small proportion of Cas9-positive cells that can be edited. Limited Cas9 cleavage efficiency and relatively insensitive selection approaches to isolate edited cells hinder the wider application of CRISPR-based genome editing in biological research and clinical applications.
Among various approaches to enhancing cleavage efficiency [24-26], one of the more promising is to ensure adequate levels of sgRNA. A given sgRNA can show low or undetectable activity, and its recognition sequence requires a specific protospacer adjacent motif (PAM) at the end to initiate sgRNA-mediated DNA recognition [27,28]. This means that the choice of sgRNA is often quite limited, especially when a specific change must be introduced at a specific site. Increasing the level of sgRNA can significantly improve the efficiency of on-target cleavage [21].
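The PAM constraint can be illustrated with a minimal sketch that scans one strand of a sequence for candidate protospacers followed by a PAM. The NGG motif (that of Streptococcus pyogenes Cas9), the 20-nt protospacer length, the function name `find_protospacers`, and the demo sequence are assumptions made for illustration and are not taken from this study.

```python
import re

def find_protospacers(seq, spacer_len=20):
    """List (start, protospacer, pam) for each NGG PAM on the strand given.

    Assumes an SpCas9-style 3' NGG PAM; other nucleases need other motifs.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    # A lookahead lets overlapping candidate sites all be reported.
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]

# Hypothetical sequence, not from the DAZL locus.
demo = "TTGACCTGAATGCTACGTTAGCTGGAACCA"
for start, protospacer, pam in find_protospacers(demo):
    print(start, protospacer, pam)
```

A short window around a desired edit often yields few or no hits in such a scan, which is the situation in which raising sgRNA dosage becomes attractive. Scanning the reverse complement as well would cover the opposite strand.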
One approach to achieving the sensitive selection of edited cells is based on the single-strand annealing (SSA) DNA repair pathway. This pathway involves annealing of repeat sequences that flank a double-strand break (DSB). The process is initiated when a DSB occurs between two repeated sequences oriented in the same direction; the subsequent bridging of the DSB leads to deletion of one of the repeats [29,30]. Using an SSA-based surrogate reporter provides a robust, unbiased indicator of CRISPR editing performance and enriches for edited cells [31,32].
Here, we combine both of these approaches in an 'all-in-one' strategy in which Cas9 nuclease, sgRNA, and an SSA-based surrogate reporter are copackaged in a single CRISPR/Cas9 vector to provide a simplified workflow offering more efficient cleavage and enrichment of edited cells. This strategy was validated by targeting the deleted in azoospermia-like (DAZL) gene, and the results suggest that 'all-in-one' editing can greatly simplify and expedite the CRISPR workflow, as well as maximize gene-editing efficiency even at sites that are difficult to edit.

SSA-based 'all-in-one' vector system
The vector backbone is derived from the pX330 vector, which has numerous, well-positioned restriction enzyme sites and lacks many elements unnecessary for CRISPR-based editing that inflate vector size. In addition, the vector contains three basic components required for genome editing: (a) a custom-designed sgRNA cassette expressed from the U6 promoter, (b) two truncated mCherry-encoding fragments designed to detect DSB-induced SSA events at the target site, and (c) an expression cassette encoding a fusion of copGFP and Cas9 bridged by peptide 2A (Fig. 1A).
The two mCherry fragments share a 0.3-kb region of homology. The target sequence is inserted between the two split mCherry genes, and an in-frame stop codon inserted between the mCherry-up sequences prevents possible readthrough of the truncated mCherry gene. If a Cas9-based DSB lies between two repeat sequences, it can lead to SSA-mediated repair, ultimately leading to a deletion of one of the repeats. In this way, both mCherry and copGFP proteins will be produced when SSA occurs, whereas only copGFP will be expressed if SSA does not occur. The fluorescent signal of copGFP can be used to measure transfection efficiency and identify Cas9-positive cells. The fluorescent signal of mCherry can be used to measure the efficiency of on-target mutations and select edited cells using FACS.
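The reporter logic described above can be sketched as a toy string model: two direct repeats flank a stop codon and the target site, and SSA repair of a break between them collapses the repeats into one copy, removing the intervening sequence. All sequences below are short hypothetical stand-ins (the actual homology region is 0.3 kb), and `ssa_repair` is an illustrative name, not part of the published system.

```python
def ssa_repair(reporter, repeat, cut_index):
    """Collapse two direct repeats flanking a break at cut_index into one copy."""
    first = reporter.find(repeat)
    second = reporter.find(repeat, first + len(repeat))
    if first == -1 or second == -1:
        raise ValueError("reporter needs two direct copies of the repeat")
    # SSA only applies when the break lies between the two repeat copies.
    if not (first + len(repeat) <= cut_index <= second):
        return reporter
    # Keep everything through the first repeat, then resume after the second.
    return reporter[:first + len(repeat)] + reporter[second + len(repeat):]

# Hypothetical stand-in sequences for the reporter elements.
upstream, repeat, stop = "ATGGTG", "CCCGGGAAA", "TAA"
target, downstream = "GATTACAGATTACA", "TTTGGC"
reporter = upstream + repeat + stop + target + repeat + downstream

cut = reporter.index(target) + len(target) // 2  # Cas9 break inside the target
repaired = ssa_repair(reporter, repeat, cut)
print(repaired == upstream + repeat + downstream)
```

In this model the stop codon and target site are lost together with one repeat, which corresponds to restoration of an intact mCherry reading frame only after cleavage.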
The expression cassettes in the all-in-one vector include multiple promoters to allow users to select the most appropriate plasmids for robust expression and efficient cleavage. Experiments with empty vectors and four promoters (EF1a, CMV, CAG, and PGK) showed that the promoters exhibited quite different, dynamic properties. In the case of the Cas9-copGFP expression cassette, the EF1a promoter drove robust expression of copGFP and gave results similar to those with the CMV and CAG promoters. In the case of the SSA-based surrogate reporter expression cassette, signal from mCherry was observed in HEK 293T cells at 48 h after transfection with mock vector containing CAG, CMV, and PGK promoters (Fig. 1B,C). This may be due to low levels of leakage or recombination-deletion from the mock vector. The ratio of copGFP to mCherry was lowest with the PGK promoter and highest with the CAG and CMV promoters; this may reflect differences in promoter activity. To minimize background noise and increase signal-to-noise ratio, the PGK promoter was used to drive SSA-based expression of mCherry, the EF1a promoter was used to drive expression of the Cas9-peptide 2A-copGFP fusion, and the U6 promoter was used to drive expression of sgRNA.
Detection and on-target efficiency of human editing events with the 'all-in-one' system
We examined whether the percentage of copGFP-positive cells that were also mCherry-positive correlated with the efficiency of Cas9-induced indels at endogenous loci. We set up editing reactions with three sgRNAs predicted to show different cleavage activities (dazl.sgRNA.1, dazl.sgRNA.2, dazl.sgRNA.3). These sgRNAs target the sequence between the last exon and the 3′ UTR of the endogenous human DAZL gene and thereby guide Cas9-mediated cleavage (Fig. 2A). All-in-one vectors were constructed containing each sgRNA and its target site sequence. A negative-control vector was constructed containing only the target site sequence. Cells were transfected and allowed to undergo genomic editing for 2 days, after which they were sorted based on fluorescence to isolate the copGFP-positive/mCherry-negative population and the copGFP/mCherry dual-positive population. Genomic DNA was harvested and amplified by nested PCR; the amplicons were digested using T7 endonuclease I or Sanger-sequenced.
These results indicate that mCherry expression is a reliable indicator of the efficiency of on-target mutations achieved using our 'all-in-one' system. They also indicate that this system facilitates assessment of sgRNA and Cas9 performance, which may accelerate guide screening and facilitate identification of edited cells.
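The readout used here, the share of copGFP-positive cells that are also mCherry-positive, can be computed from per-cell fluorescence values as in the following minimal sketch. The event intensities and gate thresholds are invented for illustration; actual gates would be set on the cytometer against untransfected and negative-control populations.

```python
def dual_positive_fraction(events, gfp_gate, mcherry_gate):
    """Fraction of copGFP-positive events that are also mCherry-positive.

    events: iterable of (copGFP_intensity, mCherry_intensity) pairs.
    """
    gfp_positive = [e for e in events if e[0] >= gfp_gate]
    if not gfp_positive:
        raise ValueError("no copGFP-positive events above the gate")
    dual = [e for e in gfp_positive if e[1] >= mcherry_gate]
    return len(dual) / len(gfp_positive)

# Hypothetical intensities in arbitrary units.
events = [(900, 50), (1200, 700), (30, 20), (1500, 40), (1100, 650)]
fraction = dual_positive_fraction(events, gfp_gate=500, mcherry_gate=500)
print(f"{fraction:.0%} of copGFP-positive cells are mCherry-positive")
```

Normalizing to the copGFP-positive population rather than to all cells keeps the estimate independent of transfection efficiency.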

Incorporation of multiple copies of sgRNA to increase cleavage efficiency
In CRISPR-based mutagenesis, Cas9 nuclease requires a PAM sequence adjacent to the sgRNA target sequence. The PAM sequence can be positioned at several locations for any given target site, so Web-based prediction algorithms are usually used to identify locations more likely to be cleaved efficiently and specifically. In some cases, only one PAM site may be available, and it may be predicted to lead to inefficient or undetectable cleavage. This raises the question of how to achieve the desired editing efficiency independent of sequence-based activity [33,34]. One possibility is to optimize sgRNA expression: higher sgRNA levels can lead to more efficient cleavage. We reasoned that, as vectors can be designed to produce multiple sgRNAs for simultaneous editing at multiple target sites, perhaps we could simply encode repeats of the same sgRNA in our 'all-in-one' system to boost sgRNA expression and thereby cleavage efficiency. Therefore, we designed novel all-in-one vectors containing two, three, or four copies of an sgRNA expression cassette, with the aim of converting sequence-based activity into quantity-based activity and thereby improving sgRNA performance. At 2-3 days after transfection, the all-in-one vector encoding two copies of the sgRNA led to a larger proportion of on-target edited cells than a vector carrying only one copy: two copies combined with surrogate reporter cassettes increased gene-editing efficiency by 6.31-fold with dazl.sgRNA.1 and 4.94-fold with dazl.sgRNA.2. No further difference was observed after prolonging the incubation period or encoding three or four copies of the sgRNA in the vector (Fig. 3A,B). These results indicate that encoding two copies of the sgRNA can maximize gene-editing efficiency within the shortest time.
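The reasoning behind choosing two copies, namely that additional copies gave no further gain, can be expressed as picking the smallest copy number whose efficiency lies within a tolerance of the best observed value. The sketch below uses invented efficiencies purely to show the rule; the per-copy values underlying Fig. 3 are not reproduced here, and the 5% tolerance is an arbitrary assumption.

```python
def smallest_saturating_copy_number(efficiency_by_copies, tolerance=0.05):
    """Smallest copy number whose efficiency is within `tolerance` of the best."""
    best = max(efficiency_by_copies.values())
    for copies in sorted(efficiency_by_copies):
        if efficiency_by_copies[copies] >= best * (1 - tolerance):
            return copies

# Hypothetical editing efficiencies (%), keyed by sgRNA copy number.
hypothetical = {1: 8.0, 2: 40.0, 3: 41.0, 4: 39.5}
print(smallest_saturating_copy_number(hypothetical))
```

Preferring the smallest saturating copy number also keeps the vector as compact as possible.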

Assessment of off-target cleavage
A major challenge in using CRISPR/Cas9 for gene editing is the high incidence of genome cleavage at off-target sites [5,20,35,36]. The possibility that sgRNAs may lead to off-target cleavage at sites showing partial homology is always present, and may even be worse when expressing multiple copies of the sgRNA. Therefore, we measured the probability of off-site cleavage with our 'all-in-one' system carrying multiple copies of dazl.sgRNA.1 and dazl.sgRNA.2. We synthesized a contiguous series of potential off-target site sequences predicted for dazl.sgRNAs (Fig. 3C), which we inserted between SSA-reporter cassettes. The resulting vector was transfected into HEK 293T cells, and mCherry-positive populations were compared. The results suggest that the probability of off-site cleavage is similar for vectors carrying one or two sgRNA copies.
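Sites of partial homology are typically shortlisted by counting mismatches against the sgRNA spacer. The following minimal sketch ranks candidate sites this way; it is not the prediction tool used in this study, and the spacer, the candidate sites, and the four-mismatch cutoff are hypothetical.

```python
def mismatches(spacer, site):
    """Number of positions at which an equal-length site differs from the spacer."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), site.upper()))

def rank_off_targets(spacer, candidates, max_mismatches=4):
    """Candidate sites with 1..max_mismatches mismatches, closest first."""
    scored = [(mismatches(spacer, site), site) for site in candidates]
    return sorted(s for s in scored if 0 < s[0] <= max_mismatches)

# Hypothetical 20-nt spacer and candidate genomic sites.
spacer = "GACCTGAATGCTACGTTAGC"
candidates = [
    "TACCTGTATGCTACGATAGC",
    "GACCTGAATGCTACGTTAGT",
    "TTTTTTTTTTCTACGTTAGC",
]
for count, site in rank_off_targets(spacer, candidates):
    print(count, site)
```

The shortlisted sequences are the kind of sites that could be concatenated and placed between the reporter repeats, as done for the predicted dazl.sgRNA off-targets.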

Further validation of the 'all-in-one' system with mouse genes
To assess the performance of the 'all-in-one' gene-editing system against additional targets, we programmed the vector with sgRNAs against the following mouse genes involved in spermatogenesis: the PLZF gene encodes promyelocytic leukemia zinc finger (also ZBTB16), a selective marker that favors the renewal of spermatogonial stem cells over their differentiation; and the ACR gene encodes acrosin, the major protease in the acrosome of mature spermatozoa. Several sgRNAs were screened for their ability to induce cleavage at a target site between the last exon and the 3′ UTR of these genes (Fig. 4A,B)   to 20.48% for sgRNA.1 and 23.10% for sgRNA.2. Editing efficiency increased to 54.32% when two copies of sgRNA.3 were used. Conversely, a vector carrying one copy each of sgRNA.1, sgRNA.2, and sgRNA.3 resulted in editing efficiency of only 20.89% (Fig. 4C-E), consistent with the fact that one sgRNA can bind the sense strand while other sgRNAs bind the antisense strand, decreasing editing efficiency.
As in the PLZF case, using a vector carrying two copies of the sgRNA.3 against ACR led to editing efficiency of 40.49%, which was 1.32-fold more efficient than a vector carrying only one copy (Fig. 4D,F). In contrast to the PLZF case, a vector carrying one copy each of sgRNA.1, sgRNA.2, sgRNA.3, and sgRNA.4 led to editing efficiency of 39.32%, which was similar to that obtained with two copies of sgRNA.1.
These results with PLZF and ACR suggest that using more sgRNAs improves the efficiency of gene editing with the 'all-in-one' system.
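The comparisons reported for PLZF and ACR reduce to simple ratios, shown below as a worked calculation using the values quoted in the text. Reading '1.32-fold more efficient' as a ratio of 1.32 between the two-copy and one-copy vectors is an interpretation, and the implied one-copy value is therefore an estimate rather than a reported measurement.

```python
def fold_change(test_pct, reference_pct):
    """Ratio of two editing efficiencies given in percent."""
    if reference_pct <= 0:
        raise ValueError("reference efficiency must be positive")
    return test_pct / reference_pct

# PLZF: two copies of sgRNA.3 (54.32%) versus the three-sgRNA cocktail (20.89%).
plzf_two_copy_vs_cocktail = fold_change(54.32, 20.89)

# ACR: two copies of sgRNA.3 gave 40.49%; a ratio of 1.32 implies this
# one-copy efficiency (an inferred value, not one stated in the text).
acr_one_copy_implied = 40.49 / 1.32

print(round(plzf_two_copy_vs_cocktail, 2), round(acr_one_copy_implied, 2))
```

On these figures, duplicating the best PLZF guide outperformed the mixed cocktail by roughly 2.6-fold.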
Our findings with vectors simultaneously encoding multiple different sgRNAs against the same gene ('cocktail sgRNAs') suggest that these sgRNAs should be carefully designed to avoid unwanted results. If sgRNAs in the cocktail target different strands (some sense, others antisense), Cas9 cleavage may be blocked. If all sgRNAs target the same strand (sense or antisense), Cas9-mediated cleavage can occur efficiently, even when sequences overlap. This is because when one sgRNA molecule binds the target sequence, other sgRNA molecules can no longer bind to it, allowing Cas9 to complex with the sgRNA and genomic DNA.
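The design rule above, that all sgRNAs in a cocktail should target the same strand, can be checked before cloning with a minimal sketch such as the following. Here a spacer is labeled 'sense' if its sequence occurs in the target as written and 'antisense' if its reverse complement does; this labeling convention, the function names, and the sequences are assumptions for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def spacer_strand(target, spacer):
    """Return 'sense', 'antisense', or None if the spacer is absent."""
    target, spacer = target.upper(), spacer.upper()
    if spacer in target:
        return "sense"
    if reverse_complement(spacer) in target:
        return "antisense"
    return None

def cocktail_targets_one_strand(target, spacers):
    """True only if every spacer maps to the target on the same strand."""
    strands = {spacer_strand(target, s) for s in spacers}
    return len(strands) == 1 and None not in strands

# Hypothetical target region and spacers derived from it.
target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
same_a, same_b = target[3:13], target[15:25]
opposite = reverse_complement(target[20:30])

print(cocktail_targets_one_strand(target, [same_a, same_b]))
print(cocktail_targets_one_strand(target, [same_a, opposite]))
```

A cocktail that fails this check would be expected, on the findings above, to edit less efficiently than duplicated copies of a single guide.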

Conclusion
The power of CRISPR/Cas9 for genomic editing will undoubtedly make it a focus of continued optimization for basic and clinical applications. Here, we demonstrate that an 'all-in-one' CRISPR/Cas9 system that contains two tandem copies of the sgRNA or cocktail sgRNAs targeting the same strand in order to promote cleavage activity, as well as an SSA-based 'editable reporter' to visually indicate editing efficiency, can provide a simplified platform for modifying genes and selecting individual cells in which editing has been successful. This system may help accelerate the development of CRISPR/Cas9 for diverse biomedical applications.

Design and synthesis of sgRNAs
The sgRNAs were designed using the CRISPR tool (http://crispr.mit.edu), and their sequences as well as the target sequences are listed in Table S1.

Construction of the 'all-in-one' vector
'All-in-one' vectors were constructed using a pX330 backbone. Digestion-ligation or seamless cloning techniques were used to subclone sgRNA/Cas9 cassettes and other components. Briefly, empty vectors were used as template to generate a PCR product, which was cloned into a suitably digested vector using the in-fusion technique. The sgRNAs and target sequences were cloned into vectors using digestion-ligation. Cassettes carrying multiple copies of sgRNAs were constructed using in-fusion cloning.
Cultures were incubated at 37°C with 5% CO2. Cells were seeded into 24-well plates 1 day prior to transfection and then transfected using Lipofectamine 3000 (Life Technologies) following the manufacturer's recommended protocol.

Fluorescence-activated cell sorting
At 2 or 3 days post-transfection, cells were resuspended, sorted based on copGFP or mCherry expression into 96-well plates using the BD Influx cell sorter (BD Biosciences, San Jose, CA, USA), and expanded. Depending on the experiment, certain cell populations were maintained and expanded to accumulate editing events.

Direct nested PCR of sorted cells and detection of on-target mutations
To examine on-target editing events, target sites within sorted cells were amplified using direct nested PCR and KOD FX Neo polymerase (Toyobo, Tokyo, Japan). In this reaction, the genomic region encompassing the sgRNA target sequence was amplified using 'external/internal' primer pairs (Table S2). The PCR product was analyzed in a commercial T7 endonuclease I assay (Vazyme, Nanjing, China) following the manufacturer's recommended protocol, or it was Sanger-sequenced. For the endonuclease assay, PCR product (approximately 200 ng) was mixed with 1 μL of 10× buffer and DNA-free water to a final volume of 10 μL, and then subjected to the following re-annealing protocol to enable heteroduplex formation: 95°C for 5 min, ramp from 95°C to 85°C at −2°C·s⁻¹, ramp from 85°C to 25°C at −0.1°C·s⁻¹, and holding at 4°C for 1 min. Re-annealed products were digested with T7 endonuclease I at 37°C for 15 min, analyzed by agarose gel electrophoresis, and quantified based on relative band intensities.
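Quantification from relative band intensities can be sketched as follows. The conversion from the cleaved fraction to an indel percentage uses a formula commonly applied to mismatch-cleavage assays, indel % = 100 × (1 − √(1 − f_cleaved)); whether this exact formula was applied in this study is an assumption, and the band intensities are invented. The second function simply totals the re-annealing program from the ramp rates given above.

```python
import math

def t7e1_indel_percent(parental, cleaved_bands):
    """Estimate indel frequency (%) from gel band intensities.

    parental: intensity of the uncleaved band.
    cleaved_bands: intensities of the cleavage products.
    """
    total = parental + sum(cleaved_bands)
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = sum(cleaved_bands) / total
    # Heteroduplexes form by random re-annealing, hence the square root.
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))

def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Duration of a linear temperature ramp."""
    return abs(start_c - end_c) / abs(rate_c_per_s)

# Hypothetical band intensities in arbitrary units.
indel = t7e1_indel_percent(parental=50, cleaved_bands=[30, 20])

# 95 C hold (5 min), two ramps, then the 1 min hold at 4 C.
program_seconds = 5 * 60 + ramp_seconds(95, 85, 2) + ramp_seconds(85, 25, 0.1) + 60
print(round(indel, 1), round(program_seconds / 60, 1))
```

The slow second ramp dominates the program, which runs for roughly 16 min excluding the final cooling step to 4°C.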

Supporting information
Additional Supporting Information may be found online in the supporting information section at the end of the article: Table S1. List of sgRNAs and target site primers used in the study. Table S2. Primers used in the T7 endonuclease I assay.