Predicting and detecting CRISPR/Cas9-associated off-target effects

The gene superfamily of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) nucleases evolved in archaea and bacteria as a mechanism of acquired host defence against repeated virus transduction and/or plasmid transfection.  To date, several different CRISPR/Cas systems have been isolated and classified according to their molecular structure, substrate specificity and the conditions that support their optimum activity [1-3].

CRISPR/Cas9, a well-studied member of this superfamily, was repurposed as a gene editing tool in eukaryotic experimental models [4-6].  It comprises a single guide RNA (sgRNA) that recognizes its DNA target by Watson-Crick base-pairing and directs the Cas9 endonuclease to a pre-determined target locus next to a protospacer adjacent motif (PAM), where Cas9 generates site-specific double-strand breaks (DSBs) in the DNA [7].  In eukaryotes, Cas9-induced DSBs can be repaired in either of two ways:

(i) Non-homologous end-joining (NHEJ): an efficient, but error-prone mechanism that can introduce insertion/deletion (indel) mutations at the repair site and/or elsewhere in the genome.  NHEJ is suitable for gene inactivation or deletion mutations that do not rely on preservation of the sequences flanking the affected site.

(ii) Homology-directed repair (HDR): the method of choice in mitotically-active cells when precise genetic modification is required and a donor template with homology to the sequences flanking the DSB is provided [7].  In post-mitotic, CRISPR/Cas9-expressing cells, HDR can be achieved if the DNA template is delivered by an adeno-associated virus (AAV) vector [8].

The simplicity, versatility, cost-effective design and ease of use of CRISPR/Cas9 make it a popular choice for modelling animal and plant development and disease.  However, many potential complications, some of which are listed below, hinder its widespread application in vivo.  Among them are Cas9-associated off-target DSBs.  Here, we describe some of the methods and resources that are available to predict and detect them.

Complications of CRISPR/Cas9 gene editing technology in eukaryotes [7, 9-15]

  • Delivery of the CRISPR/Cas9 system by an integrating or non-integrating viral vector results in prolonged expression of the CRISPR/Cas9 system in targeted cells/tissues.
  • CRISPR/Cas9 expression is not cell- / tissue-specific.
  • Ectopic expression of CRISPR/Cas9 presents a risk of malignant transformation, cell death or other, altered cell/tissue phenotype.
  • Delivery of CRISPR-Cas9 by integrating viral vectors is highly efficient, but integration occurs at random sites within the host genome.
  • Delivery of Cas9 mRNA in vivo is inefficient due to its large size.
  • mRNA/sgRNA or RNP complexes are unstable after delivery.
  • CRISPR/Cas9 exhibits DNA target non-specificity, resulting in off-target cleavage/binding/editing.
  • Insertions / deletions (indels) are induced by faulty NHEJ of DSBs.
  • Mosaicism in mouse embryos is attributed to failure of nucleases to introduce DSBs at the 1-cell stage.
  • Asynchronous cell division reduces the efficiency of DSB repair by HDR.
  • Multiple alleles – repair of DSBs by NHEJ produces (i) cohorts of mice with different mutations from the same targeting constructs, requiring genome sequencing to verify the nature and position of the specific mutation; and/or (ii) mice with mosaics of multiple mutations, requiring breeding to segregate and isolate mice that carry single mutations.
  • CRISPR/Cas9 (±vector) is immunogenic in vivo.

 

Even when the most stringent measures have been taken to avoid off-target activity, it is prudent to screen the entire genome for unintended changes, but only after the on-target mutation has been confirmed [16].  The monomeric Cas9 nuclease is more prone to off-target effects than the dimeric zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs).  This is attributed to its smaller size, its ability to recognize shorter target sequences, and the ability of the gRNA to tolerate a certain number of mismatches [16].  The resources that have been developed to identify off-target changes include:

1.      Off-target predictive modelling.
2.      Protocols that detect off-target DSBs at predicted sites (biased detection).
3.      Protocols that detect off-target DSBs across the entire genome (unbiased detection). 

1. Off-target predictive modelling:  These programs are all based on sequence similarity, but differ in the search algorithms and search parameters they use (e.g. maximum number of mismatches, permissible PAMs, completeness of search) [17].  Without experimental validation, they have limited predictive power and may yield high false-positive rates [18].
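
The similarity search that these programs share can be illustrated with a minimal Python sketch.  It is not the algorithm of any tool listed in this section: the function names, the forward-strand-only scan, the default of three mismatches and the NGG PAM pattern are all assumptions chosen for illustration.

```python
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    position: int    # 0-based start of the candidate protospacer
    site: str        # genomic sequence aligned to the guide
    pam: str         # bases immediately 3' of the candidate site
    mismatches: int  # mismatched positions relative to the guide


def pam_matches(pam: str, pattern: str = "NGG") -> bool:
    """Return True if pam fits pattern, where N matches any base."""
    return len(pam) == len(pattern) and all(
        p == "N" or p == b for p, b in zip(pattern, pam)
    )


def find_candidate_sites(
    genome: str,
    guide: str,
    max_mismatches: int = 3,       # hypothetical search parameter
    pam_pattern: str = "NGG",      # hypothetical permissible PAM
) -> Iterator[Hit]:
    """Scan the forward strand for guide-like sites followed by a PAM."""
    genome, guide = genome.upper(), guide.upper()
    n, k = len(guide), len(pam_pattern)
    for i in range(len(genome) - n - k + 1):
        pam = genome[i + n:i + n + k]
        if not pam_matches(pam, pam_pattern):
            continue
        site = genome[i:i + n]
        mismatches = sum(a != b for a, b in zip(guide, site))
        if mismatches <= max_mismatches:
            yield Hit(i, site, pam, mismatches)
```

A complete search would also scan the reverse complement, and some tools additionally allow DNA or RNA bulges.  Raising max_mismatches or relaxing pam_pattern lengthens the candidate list, which is one reason why different programs can return different predictions for the same guide.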

1.1   Cas-OFFinder [19]

1.2   CRISPOR [20-21]

1.3   CHOPCHOP v3 [22-23]

1.4   CRISPR-DO (CRISPR Design and Optimization) [6, 24]

1.5   CRISPR ML [17, 26]

1.6   CRISPR-offinder [25]

1.7   CROP-IT (CRISPR Off-target/cleavage Prediction and Identification Tool) [17]

1.8   COSMID (CRISPR Off-target Sites with Mismatches, Insertions, and Deletions) [27]

2.  Protocols that detect off-target DSBs at predicted sites (biased detection): Methods that have been developed to detect on-target mutations can also be used for the analysis of off-target changes, provided that the sites are known [16].  The following protocols are examples:

2.1   AFLP (Amplified Fragment Length Polymorphisms) [16]

               Features

  • Preferentially detects large indels and very large chromosomal deletions (> 1 x 10^6 bp).
  • Able to determine whether insertion or deletion.
  • Moderate throughput.
  • Misses small indels.

 

2.2   CAPS (Cleaved Amplified Polymorphic Sequence) analysis [16]

               Features

  • Preferentially detects all mutations.
  • Highly sensitive, convenient.
  • Moderate throughput.
  • Unable to determine type of mutation.
  • Detection depends on the presence of a restriction enzyme recognition site at, or < 5 bp away from, the nuclease cut site (i.e. the availability of restriction sites covering the mismatch may be a limiting factor).
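
Whether a candidate site is suitable for CAPS can be checked in silico before any bench work.  The sketch below is only an illustration: the function name, the coordinate convention and the exact-match search (no ambiguity codes) are assumptions, and the 5 bp window is taken from the feature list above.

```python
def restriction_sites_near_cut(amplicon, cut_index, recognition_seq, window=5):
    """List recognition sites lying at, or fewer than `window` bp from, a cut.

    cut_index follows a hypothetical convention: the nuclease cuts between
    amplicon[cut_index - 1] and amplicon[cut_index].
    Returns (site_start, distance_bp) tuples; distance 0 means the cut falls
    inside the recognition site or directly at its edge.
    """
    amplicon = amplicon.upper()
    recognition_seq = recognition_seq.upper()
    hits = []
    start = amplicon.find(recognition_seq)
    while start != -1:
        end = start + len(recognition_seq)  # exclusive end of the site
        distance = max(start - cut_index, cut_index - end, 0)
        if distance < window:
            hits.append((start, distance))
        start = amplicon.find(recognition_seq, start + 1)
    return hits
```

For example, with an EcoRI site (GAATTC) as recognition_seq, an empty result indicates that this enzyme cannot be used for the site and that another enzyme or another detection method is needed.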

 

2.3   Fluorescent PCR-capillary gel electrophoresis [16]

               Features

  • Preferentially detects small indels.
  • Single nucleotide resolution, therefore can detect frameshift mutations.
  • High throughput.
  • Sensitivity: 1.0 %.
  • Does not reveal which nucleotides have been inserted/deleted.
  • Unable to detect SNPs, large indels.
  • Overestimates mutations > 30 bp.
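
Because only fragment length is measured, interpretation reduces to simple arithmetic on the size difference from the wild-type amplicon.  A minimal sketch follows, assuming that the amplicon lies within a coding sequence and that fragment sizes have already been called to the nearest base pair; the function name and return format are hypothetical.

```python
def classify_fragment(observed_bp, wild_type_bp):
    """Classify a fragment by its size difference from the wild-type amplicon.

    Returns (size_change_bp, kind, frameshift).  A size change that is not a
    multiple of 3 shifts the reading frame.  Substitutions leave the length
    unchanged and therefore look like wild type.
    """
    delta = observed_bp - wild_type_bp
    if delta > 0:
        kind = "insertion"
    elif delta < 0:
        kind = "deletion"
    else:
        kind = "no size change"
    frameshift = delta % 3 != 0
    return delta, kind, frameshift
```

The result says how many bases were gained or lost, but not which ones, in line with the limitations listed above.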

 

2.4   Heteroduplex mobility assay by PAGE [16]

               Features

  • Preferentially detects small indels.
  • Sensitivity: 0.5 %.
  • Unable to determine type of mutation.
  • Moderate throughput.
  • Misses large indels.

 

2.5  HRMA (High-Resolution Melting Analysis) [16]

               Features

  • Preferentially detects small indels.
  • Sensitivity: 2.0 %.
  • Determines whether mutation is insertion or deletion.
  • High throughput.
  • Misses large indels.

 

2.6   Loss of a primer binding site [16]

               Features

  • Preferentially detects indels.
  • Sensitivity: 10.0 %.
  • Able to determine type of mutation.
  • High throughput.
  • Misses substitutions.

 

2.7   Mismatch Cleavage Assay [16]

               Features

  • Preferentially detects small indels.
  • Sensitivity: 0.5% - 3.0%.
  • Unable to determine type of mutation.
  • Moderate throughput.
  • T7 endonuclease I (T7E1) can overlook single nucleotide changes.

 

2.8   NGS (Next Generation Sequencing) [16]

               Features

  • Preferentially detects all mutations.
  • Able to determine whether insertion or deletion.
  • High throughput.
  • Sensitivity: 0.01 %.
  • Misses large indels.
  • Expensive.
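
A quoted sensitivity can be related to sequencing depth with a simple sampling calculation.  If a mutation is present at allele frequency f, the probability of seeing at least one mutant read among n reads is 1 - (1 - f)^n, so n must be at least ln(1 - p) / ln(1 - f) to reach a detection probability p.  The sketch below assumes independent sampling of reads and ignores sequencing error, which in practice usually sets the real detection limit; the function name and the default p = 0.95 are hypothetical.

```python
import math


def reads_needed(allele_frequency, detection_probability=0.95):
    """Minimum reads to observe at least one mutant read with given probability."""
    if not 0 < allele_frequency < 1:
        raise ValueError("allele_frequency must be between 0 and 1")
    if not 0 < detection_probability < 1:
        raise ValueError("detection_probability must be between 0 and 1")
    n = math.log(1 - detection_probability) / math.log(1 - allele_frequency)
    return math.ceil(n)
```

At a frequency of 0.01 % (f = 0.0001) this gives roughly 30,000 reads per site for a single mutant read, and reliable calling needs considerably more.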

 

2.9   Sanger Sequencing [16]

               Features

  • Preferentially detects all mutations.
  • Able to determine type of mutation.
  • Easy, fast, widely available.
  • Sensitivity: 1.0 – 2.0 %.
  • Low throughput.
  • Costly.
  • Labour-intensive.

 

3.  Protocols that detect genome-wide (unbiased), off-target DSBs:

3.1   BLESS (Direct In Situ Breaks Labeling, Enrichment on Streptavidin and next-generation Sequencing) [16, 28]

               Features

  • Direct in situ labelling of DSBs.
  • Semi-quantitative due to lack of appropriate controls for PCR amplification biases, limiting applications and scalability.
  • Only detects DSBs present at the time of labelling.
  • Reference genome required.

 

3.2   BLISS (Breaks Labeling In Situ and Sequencing) [28]

               Features

  • Detects endogenous and exogenous DSBs in low-input samples of cancer cells, embryonic stem cells and liver tissue.
  • Unbiased profiling of on- and off-target DSBs introduced by Cas9 and Cpf1 nucleases.
  • Direct labeling of DSBs in fixed cells or tissue sections on a solid surface.
  • Low-input requirement by linear amplification of tagged DSBs by in vitro transcription.
  • Quantification of DSBs through unique molecular identifiers.
  • Easy scalability and multiplexing.
  • Versatile, sensitive, quantitative.
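
The role of unique molecular identifiers (UMIs) in quantification can be shown with a short sketch: reads that share both a break position and a UMI are treated as amplification copies of one labelled DSB end and are counted once.  This is an illustration, not the published BLISS pipeline; the input tuple layout and the exact-match comparison of UMIs (no tolerance for sequencing errors) are assumptions.

```python
from collections import defaultdict


def count_unique_breaks(reads):
    """Count distinct UMIs per break site.

    reads: iterable of (chromosome, position, strand, umi) tuples, one per
    aligned read (hypothetical layout).
    Returns {(chromosome, position, strand): number_of_distinct_umis}.
    """
    umis_per_site = defaultdict(set)
    for chromosome, position, strand, umi in reads:
        umis_per_site[(chromosome, position, strand)].add(umi)
    return {site: len(umis) for site, umis in umis_per_site.items()}
```

Counting UMIs rather than reads removes the amplification bias that makes methods without such identifiers only semi-quantitative.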

 

3.3   Chromatin-immunoprecipitation sequencing (ChIP-Seq) [14, 28]

               Features

  • Genome-wide (unbiased) detection.
  • Does not label DSBs directly.
  • Unable to identify DNA breakpoints with single-nucleotide resolution.
  • Most off-target DNA-binding sites recognized by “dead” Cas9 (dCas9) are not cleaved at all by Cas9 in cells.

 

3.4   CIRCLE-Seq (Circularization for In vitro Reporting of CLeavage Effects by Sequencing)

(https://github.com/tsailabSJ/circleseq)  [18, 29]

               Features

  • Uses next-generation sequencing technology; does not require a reference genome sequence.
  • Eliminates the high background of random reads observed with Digenome-seq.
  • Strips DNA of all bound proteins before nuclease digestion, resulting in efficient cleavage of all potential sites by the Cas9 nuclease.
  • Identifies bona fide, off-target mutations associated with cell-type-specific SNPs.
  • For most Cas9-gRNA complexes tested, CIRCLE-Seq can identify all off-target sites in human genomic DNA found by GUIDE-Seq and High-Throughput Genome-Wide Translocation Sequencing (HTGTS).
  • Potential for profiling off-target mutations in organisms lacking full genomic sequence, or outbred populations with considerable sequence heterogeneity, because reference genome sequence is not required.
  • Introduces DSBs in vitro, which can lead to significant under/over-estimates of the number of target sites that are actually modified in cellular models or in vivo.
  • Difficult to predict repair outcomes from in vitro data.

Note - Nuclease concentration within the cell, delivery method (RNP vs plasmid vs integrating virus vs non-integrating virus) and chromatin accessibility all significantly affect editing outcomes and are generally missed by in vitro off-target assays [18].

 

3.5   Digenome-Seq (www.rgenome.net/digenome/) [12, 14, 28-29]

               Features

  • Sensitivity:  < 0.1 %.
  • Cost-effective.
  • Relies on nuclease cleavage of genomic DNA, sequencing adapter ligation to all free ends (nuclease- and non-nuclease-induced), high-throughput sequencing, and bioinformatic identification of nuclease-cleaved sites exhibiting signature uniform mapping end positions.
  • Requires a large number of reads (~4 x 10^8).
  • High background of random genomic DNA reads makes it challenging to identify low-frequency nuclease-induced cleavage events.
  • Introduces DSBs in vitro, but this approach may not be representative of relevant nuclease concentrations and of cellular properties, e.g. chromatin environment, nuclear architecture, which might influence the frequency of DNA breaking and repair.
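
The "signature uniform mapping end positions" mentioned above can be illustrated as follows: at an in vitro cut site, many forward-strand reads start at the same coordinate and many reverse-strand reads end there, whereas randomly sheared fragments have scattered ends.  The sketch is a simple thresholding illustration, not the published Digenome-Seq scoring scheme; the read tuple layout, the single-chromosome scope and the min_reads threshold are assumptions.

```python
from collections import Counter


def candidate_cut_sites(reads, min_reads=5):
    """Return positions where read ends pile up on both sides of a blunt cut.

    reads: iterable of (start, end, strand) tuples in 0-based, half-open
    coordinates on one chromosome (hypothetical layout).
    A cut between position c - 1 and c produces "+" reads with start == c
    and "-" reads with end == c.
    """
    reads = list(reads)
    forward_starts = Counter(start for start, _end, strand in reads if strand == "+")
    reverse_ends = Counter(end for _start, end, strand in reads if strand == "-")
    return sorted(
        position
        for position, count in forward_starts.items()
        if count >= min_reads and reverse_ends[position] >= min_reads
    )
```

Requiring support from both strands helps to separate true cleavage sites from the background of random reads, but low-frequency cleavage events can still fall below any fixed threshold.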

 

3.6   DISCOVER-Seq (Discovery of In-Situ Cas Off-targets and VERification by SEQuencing) [16, 18, 30]

               Features

  • Permits direct detection in vivo, in real time (e.g. cell lines, patient-derived iPS cells and live animal models during adenoviral gene editing).
  • Identifies the exact sites in the genome where a DSB has occurred. 
  • Directly provides data on true Cas9 cleavage events.
  • Detects the repair factor, MRE11, one of the first responders to the cleavage site. 
  • No need to purify DNA or to use specific cellular models.
  • Can be applied to multiple CRISPR/Cas systems.
  • Robustly identifies true in vivo off-targets that are missed by CIRCLE-Seq or GUIDE-Seq.
  • Determines bona fide off-targets in a 1-step procedure.
  • In vivo sensitivity not as high as in vitro CIRCLE-Seq.
  • Threshold of detection of indels: > 0.8 %.

 

3.7   DSBCapture [28, 31]

               Features

  • Captures DSBs in situ.
  • Directly maps DSBs at single nucleotide resolution, enabling DSB origin to be determined (i.e. whether drug-induced / radiation-induced / etc.).
  • Requires substantial amounts of input material (millions of cells).
  • Labour-intensive.
  • Semi-quantitative due to lack of appropriate controls for PCR amplification biases, limiting applications and scalability.

 

3.8   End-Seq [28, 32]    

               Features

  • Able to resolve DSBs at single nucleotide level in vivo.
  • Semi-quantitative due to lack of appropriate controls for PCR amplification biases, limiting applications and scalability.

 

3.9   Exome sequencing [16, 18, 28]

               Features

  • Unbiased detection of mutations in coding regions.
  • Less expensive than whole-genome sequencing.
  • Does not detect mutations in non-coding regions.
  • Reference genome required.

 

3.10   GUIDE-Seq (Genome-wide, Unbiased Identification of DSBs Enabled by Sequencing) [13-14]

https://github.com/aryeelab/guideseq

               Features

  • Detects DSBs introduced by RNA-guided nucleases (RGNs) and potentially other nucleases.
  • Identification of RGN-independent genome breakpoint ‘hotspots’.
  • Greater detection sensitivity than ChIP-Seq and existing predictive algorithms.
  • Revealed that truncated gRNAs exhibit substantially reduced RGN-induced off-target DSBs.
  • Safety evaluation of CRISPR nucleases before clinical use.
  • Sensitivity: 0.3 %.
  • Tests nuclease cutting in a cellular context but relies on the integration of exogenous double-stranded oligonucleotides (dsODNs), which is inefficient in primary cells and not applicable in vivo.
  • dsODNs integrate in only ~30-50 % of DSBs.
  • Detects and quantifies DSBs repaired by NHEJ, but not by other repair pathways.
  • Reference genome required.
  • False negatives.
  • Limited by chromatin accessibility.

 

3.11   HTGTS (High-Throughput Genome-Wide Translocation Sequencing) [14, 16, 30]

               Features

  • Detects large chromosomal rearrangements.
  • Detects potential off-target sites, and reveals collateral damage caused by recurrent translocations.
  • Detects and quantifies DSBs repaired by NHEJ, but not by other repair pathways.
  • Relies on concurrence of DSBs, even though chromosomal translocations are relatively rare.
  • Reference genome required.
  • False negatives.
  • Limited by chromatin accessibility.

 

3.12   IDLV (Integrase-Defective Lentiviral Vector)-mediated DNA break capture [14, 28]

               Features

  • Programmable.
  • Sensitivity: 1.0 %.
  • Detects and quantifies DSBs repaired by NHEJ, but not by other repair pathways.
  • Tests nuclease cutting in a cellular context but relies on the integration of an exogenous DNA oligo that is inefficient in primary cells and not applicable in vivo.
  • Many bona fide off-target sites cannot be captured.

 

3.13   Whole genome sequencing [16, 30]

               Features

  • Applies NGS of library prepared from whole genome.
  • Unbiased, comprehensive analysis.
  • Detects SNPs, indels and structural variants.
  • Expensive.
  • Only detects higher frequency off-target sites.
  • Not sensitive enough to detect off-target sites in bulk populations.
  • Reference genome required.

 

REFERENCES

1. GASIUNAS, G, BARRANGOU, R, HORVATH, P, et al. “Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria”. Proc Natl Acad Sci USA 2012; 109: E2579-E2586.

2. JORE, MM, LUNDGREN, M, van DUIJN, E, et al. “Structural basis for CRISPR RNA-guided DNA recognition by Cascade”. Nat Struct Mol Biol 2011; 18(5): 529-536.

3. RATH, D, AMLINGER, L, RATH, A, et al. “The CRISPR-Cas immune system: biology, mechanisms and applications”. Biochimie 2015; 117: 119-128.

4. JINEK, M, CHYLINSKI, K, FONFARA, I, et al. “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity”. Science 2012; 337(6096): 816-821.

5. CONG, L, RAN, FA, COX, D, et al. “Multiplex genome engineering using CRISPR/Cas systems”. Science 2013; 339(6121): 819-823.

6. HSU, PD, SCOTT, DA, WEINSTEIN, JA, et al. “DNA targeting specificity of RNA-guided Cas9 nucleases”. Nat Biotechnol 2013; 31: 827-832.

7. DAI, W-J, ZHU, L-Y, YAN, Z-Y, et al. “CRISPR-Cas9 for in vivo gene therapy: promise and hurdles”. Mol Ther Nucleic Acids 2016; 5: e349.  doi: 10.1038/mtna.2016.58.

8. NISHIYAMA, J, MIKUNI, T, YASUDA, R. “Virus-mediated genome editing via homology-directed repair in mitotic and postmitotic cells in mammalian brain”. Neuron 2017; 96(4): 755-768.e5.

9. CHEN, Y, LIU, X, ZHANG, Y, et al. “A self-restricted CRISPR system to reduce off-target effects”. Mol Ther 2016; 24(9): 1508-1510.

10. WU, X, SCOTT, DA, KRIZ, AJ, et al. “Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells”. Nat Biotechnol 2014; 32(7): 670-676.

11. KOO, T, LEE, J & KIM, J-S. “Measuring and reducing off-target activities of programmable nucleases including CRISPR-Cas9”. Mol Cells 2015; 38(6): 475-481.

12. KIM, D, BAE, S, PARK, J, et al. “Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells”. Nat Methods 2015; 12: 237-243.

13. TSAI, SQ, ZHENG, Z, NGUYEN, NT, et al. “GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases”. Nat Biotechnol 2015; 33(2): 187-197.

14. ZHANG, X-H, TEE, LY, WANG, X-G, et al. “Off-target effects in CRISPR/Cas9-mediated genome engineering”.  Mol Ther Nucleic Acids 2015; 4(11): e264.  doi: 10.1038/mtna.2015.37.

15. YEADON, J. “Pros and cons of ZNFs, TALENs and CRISPR/Cas”. Jackson Laboratory. (https://www.jax.org/news-and-insights/jax-blog/2014/march/pros-and-cons-of-znfs-talens-and-crispr-cas).

16. ZISCHEWSKI, J, FISCHER, R & BORTESI, L. “Detection of on-target and off-target mutations generated by CRISPR/Cas9 and other sequence-specific nucleases”. Biotechnol Adv 2017; 35(1): 95-104.

17. LISTGARTEN, J, WEINSTEIN, M, KLEINSTIVER, BP, et al. “Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs”. Nature Biomed Eng 2018; 2(1): 38-47.

18. WIENERT, B, WYMAN, SK, RICHARDSON, CD, et al. “Unbiased detection of CRISPR off-targets in vivo using DISCOVER-Seq”.  Science 2019; 364(6437): 286-289.

19. BAE, S, PARK, J & KIM, J-S. “Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases”. Bioinformatics 2014; 30(10): 1473-1475.

20. HAEUSSLER, M, SCHÖNIG, K, ECKERT, H, et al. “Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR”. Genome Biol 2016; 17(1): 148.  doi: 10.1186/s13059-016-1012-2.

21. CONCORDET, J-P & HAEUSSLER, M. “CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens”. Nucleic Acids Res 2018; 46(W1): W242-W245.

22. LABUN, K, MONTAGUE, TG, GAGNON, JA, et al. “CHOPCHOP v2: a web tool for the next generation of CRISPR genome engineering”.  Nucleic Acids Res 2016; 44(W1): W272-W276.

23. MONTAGUE, TG, CRUZ, JM, GAGNON, JA, et al. “CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing”. Nucleic Acids Res 2014; 42(W1): W401-W407.

24. MA, J, KÖSTER, J, QIN, Q, et al. “CRISPR-DO for genome-wide CRISPR design and optimization”. Bioinformatics 2016; 32(21): 3336-3338.

25. ZHAO, C, ZHENG, X, QU, W, et al. “CRISPR-offinder: a CRISPR guide RNA design and off-target searching tool for user-defined protospacer adjacent motif”.  Int J Biol Sci 2017; 13(12): 1470-1478.

26. DOENCH, JG, FUSI, N, SULLENDER, M, et al.  “Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9”. Nat Biotechnol 2016; 34(2): 184-191.

27. CRADICK, TJ, QIU, P, LEE, CM, et al.  “COSMID: a web-based tool for identifying and validating CRISPR/Cas off-target sites”.  Mol Ther Nucleic Acids 2014; 3: e214.  doi: 10.1038/mtna.2014.64.

28. YAN, WX, MIRZAZADEH, R, GARNERONE, S, et al. “BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks”. Nat Commun 2017; 8: 15058.  doi: 10.1038/ncomms15058.

29. TSAI, SQ, NGUYEN, NT, MALAGON-LOPEZ, J, et al. “CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets”. Nat Methods 2017; 14(6): 607-614.

30. MARTIN, F, SÁNCHEZ-HERNÁNDEZ, S, GUTIÉRREZ-GUERRERO, A, et al. “Biased and unbiased methods for the detection of off-target cleavage by CRISPR/Cas9: an overview”.  Int J Mol Sci 2016; 17: 1507-1514.

31. LENSING, SV, MARSICO, G, HÄNSEL-HERTSCH, R, et al.  “DSBCapture: in situ capture and direct sequencing of dsDNA breaks”. Nat Methods 2016; 13(10): 855-857.

32. CANELA, A, SRIDHARAN, S, SCIASCIA, N, et al. “DNA breaks and end resection measured genome-wide by end sequencing”. Mol Cell 2016; 63(5): 898-911.
