Oh Transgene, Where Art Thou?
Mapping Transgene Insertion

Over the last three decades, transgenic mice have become a critical in vivo modeling tool in biomedical research. The transgenic technique1, whereby an exogenous gene is inserted into the mouse genome by direct injection of DNA into the pronuclei of a zygote, has enabled thousands of new transgenic lines to be created2.

However, the technique is not without limitations. One of the biggest drawbacks of using pronuclear injection to generate transgenic mice is that the transgene cannot be directed to a specific chromosomal location in the mouse genome. Integration of the transgene is a random event.

Without knowing where integration takes place, it is impossible to completely predict the consequences of a given genetic modification.

Interpreting Phenotypic Data from Transgenic Models

There are several factors worth considering when interpreting the phenotypes of transgenic lines:

  1. The regulatory or coding region of a critical endogenous gene may be disrupted by insertion of the transgene3,4,5, potentially complicating the interpretation of phenotypes. It has been estimated that 5-10% of transgenic mice carry phenotypes unrelated to the function of the transgene6.
  2. The precise location where the integration event occurs may lead to mosaic patterns of transgene expression, a phenomenon known as position effect7.
  3. Multiple copies of the transgene are often inserted as head-to-tail concatemers, resulting in variability in copy number between or within founder lines. High transgene copy numbers may lead to epigenetic modification and transgene silencing8.

Because of the potential risks associated with random transgenesis, it is important to know the precise location of the transgene integration site. How can you effectively map transgene insertion?

Existing Methods for Mapping Transgene Insertion

The characterization of transgenic animals has historically been a challenging process. Mapping transgene insertion has typically been achieved using Fluorescence In-Situ Hybridization (FISH) or PCR-based methods, each of which brings its own limitations to the project.

FISH is a low-resolution visualization technique that is labor-intensive and cannot verify sequence integrity or detect tandem insertions at the integration site9. PCR-based approaches, such as inverse PCR or ligation-mediated PCR, offer better resolution than FISH, but require knowledge of the restriction sites within the transgene and specific sequence information10.

As the cost of sequencing has come down significantly, whole genome sequencing11 and sequencing with capture probes12 have recently been used to identify transgene insertion sites. Although next-generation sequencing (NGS)-based methods offer better clarity and quicker turnaround than FISH and PCR-based methods, standard paired-end reads (typically <400 bp) cannot reliably detect all structural variations in and around the transgene insertion site, particularly when the region is rich in repetitive sequences.

Efficiently Mapping Transgene Insertion Sites with TLA

In January 2017, an alternative NGS-based method, targeted locus amplification (TLA), was successfully used to identify the transgene integration sites in seven commonly used Cre and CreERT2 transgenic lines13. TLA is a novel targeted enrichment strategy that combines the principle of proximity ligation with NGS to selectively amplify and sequence the transgene and the surrounding genomic region, which can span tens to hundreds of kilobases.

With only one primer pair, complementary to a short sequence unique to the transgene, crosslinked and ligated DNA fragments surrounding the transgene insertion site are selectively amplified and sequenced. Analysis of the coverage profile of the sequencing reads and of the breakpoint sequences then allows precise identification of the transgene integration site14.
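The coverage-profile idea can be illustrated with a toy calculation. The sketch below is not the published TLA analysis pipeline; it only shows the general principle that reads from a proximity-ligation enrichment pile up around the integration site, so the genomic bin with the most reads is a reasonable first candidate. The function names, the 100 kb bin size, and the toy read positions are all assumptions made for illustration.

```python
from collections import defaultdict


def bin_coverage(read_positions, bin_size=100_000):
    """Count reads per (chromosome, bin index) from (chrom, position) pairs."""
    counts = defaultdict(int)
    for chrom, pos in read_positions:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def candidate_integration_region(read_positions, bin_size=100_000):
    """Return (chrom, start, end, n_reads) for the most heavily covered bin.

    Returns None when there are no reads. This is a hypothetical helper:
    a real analysis would also inspect breakpoint-spanning reads.
    """
    counts = bin_coverage(read_positions, bin_size)
    if not counts:
        return None
    (chrom, bin_index), n_reads = max(counts.items(), key=lambda kv: kv[1])
    start = bin_index * bin_size
    return chrom, start, start + bin_size, n_reads


# Toy data (invented): 40 reads clustered on chr3 plus scattered background.
toy_reads = [("chr3", 1_250_000 + i * 500) for i in range(40)]
toy_reads += [("chr1", 5_000_000), ("chr7", 80_000_000), ("chrX", 12_345_678)]

region = candidate_integration_region(toy_reads)
print(region)
```

With the toy data, the clustered chr3 reads dominate and the scattered background reads do not, which mirrors the single sharp coverage peak expected at an integration site.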

The main advantage of TLA over conventional approaches is that it generates complete sequence information for a region of interest. TLA thus enables the detection of all single nucleotide variants and structural changes (both in the transgene and at the integration site), and it requires very little prior knowledge of the transgene sequence.

In all seven transgenic lines, the TLA analyses detected structural changes (either deletions or genomic duplications) at the transgene integration sites. This illustrates the importance of testing for rearrangements around the site of transgene insertion. An example of a TLA analysis is provided below.

Advantages of Transgene Mapping Analysis

TLA whole genome coverage and analysis plot of a Tyr-CreERT2 transgenic animal. Upper panel: TLA sequence coverage; mouse chromosomes 1 through X are arranged on the Y-axis, and the X-axis shows chromosomal position. Lower panel: graphic representation of the transgene integration site and structural changes. Grey: flanking genomic sequence. Blue: transgene and corresponding genomic coordinates of the transgene sequence (mouse, human or rat genome). Mouse genome assembly: mm9; human genome assembly: hg19; rat genome assembly: rn5. (Credit: this figure is kindly provided by Cergentis.)

By fully characterizing the transgene insertion site, researchers gain a better understanding of how the location of a transgene contributes to phenotypic outcomes and can uncover potential instability in transgene expression.

Knowing the location of transgene insertion is also useful for planning intercrosses in the event that a transgene and the desired allele to be selected are located on the same chromosome. This can reduce the cost of maintaining a transgenic colony.
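A small linkage calculation shows why the chromosome matters. The sketch below is illustrative only. It assumes the genetic distance between the transgene and the second allele is already known in centimorgans (deriving that from physical coordinates requires a genetic map, which is not covered here), and it uses Haldane's mapping function, one common model that assumes no crossover interference. The function names are hypothetical.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction for a genetic distance in centimorgans.

    Uses Haldane's mapping function (an assumed model, no interference).
    """
    if distance_cm < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def gametes_with_both(distance_cm, phase="trans"):
    """Expected fraction of gametes carrying both the transgene and the allele.

    The parent is assumed hemizygous for the transgene and heterozygous
    for the allele of interest.
    phase="trans": the two sit on opposite homologs, so only recombinant
                   gametes carry both.
    phase="cis":   the two sit on the same homolog, so non-recombinant
                   gametes carry both.
    """
    r = haldane_recombination_fraction(distance_cm)
    if phase == "trans":
        return r / 2.0
    if phase == "cis":
        return (1.0 - r) / 2.0
    raise ValueError("phase must be 'trans' or 'cis'")


# Compare a tightly linked pair with the unlinked expectation of 0.25.
for distance in (1, 10, 50):
    print(distance, round(gametes_with_both(distance, "trans"), 4))
```

For loci on different chromosomes the expected fraction is 0.25. Under this model, a transgene and allele in trans on the same chromosome always yield less than that, so more offspring must be screened to recover animals carrying both.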

When the integration site is unknown, a quantitative copy number variation (CNV) analysis is used to distinguish wild-type from hemizygous and homozygous animals. The resolution and reliability of this analysis vary with transgene size and copy number. Once the integration site is known, researchers can replace the quantitative analysis with targeted genotyping assays using standard PCR, which dramatically cuts colony management costs while also being more specific and reliable.
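As a rough illustration of how a targeted PCR assay turns into a genotype call, the sketch below assumes a hypothetical two-amplicon design: one amplicon spans the intact wild-type locus at the integration site, and the other spans a transgene-genome junction. It further assumes the wild-type amplicon cannot be produced from an allele carrying the transgene. The function name and band labels are invented for this example.

```python
def call_genotype(wild_type_band, junction_band):
    """Call zygosity from the presence or absence of two PCR bands.

    wild_type_band: True if the amplicon across the intact locus is seen.
    junction_band:  True if the transgene-genome junction amplicon is seen.
    Assumes the wild-type amplicon is absent from transgene-bearing alleles.
    """
    if wild_type_band and junction_band:
        return "hemizygous"
    if junction_band:
        return "homozygous"
    if wild_type_band:
        return "wild-type"
    return "no call"  # neither band: failed reaction, repeat the assay


# Example: band patterns scored from a gel for three hypothetical animals.
gel_results = {
    "mouse_01": (True, False),
    "mouse_02": (True, True),
    "mouse_03": (False, True),
}
calls = {animal: call_genotype(*bands) for animal, bands in gel_results.items()}
print(calls)
```

Unlike a quantitative copy number estimate, each call here depends only on whether a band is present, which is why junction-based assays tend to be simpler to score.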

The power and simplicity of TLA technology have opened the door for researchers to efficiently identify transgene integration sites while also providing important data regarding transgene integrity and the integrity of the surrounding genome. This information is invaluable for interpreting all data generated from a transgenic model and for planning future studies and intercrosses.

1. Gordon, J.W.; Scangos, G.A.; Plotkin, D.J.; Barbosa, J.A.; Ruddle, F.H. Genetic transformation of mouse embryos by microinjection of purified DNA. Proc. Natl. Acad. Sci. USA. 1980, 77, 7380-7384.
2. Eppig, J.T.; Blake, J.A.; Bult, C.J.; Kadin, J.A.; Richardson, J.E. The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse. Nucleic Acids Res. 2012, 40, D881-D886.
3. Mukai, H.Y.; Motohashi, H.; Ohneda, O.; Suzuki, N.; Nagano, M.; Yamamoto, M. Transgene insertion in proximity to the c-myb gene disrupts erythroid-megakaryocytic lineage bifurcation. Mol. Cell. Biol. 2006, 26, 7953-7965.
4. Durkin, M.E.; Keck-Waggoner, C.L.; Popescu, N.C.; Thorgeirsson, S.S. Integration of a c-myc transgene results in disruption of the mouse Gtf2ird1 gene, the homologue of the human GTF2IRD1 gene hemizygously deleted in Williams-Beuren syndrome. Genomics. 2001, 73, 20-27.
5. Meisler, M.H. Insertional mutation of 'classical' and novel genes in transgenic mice. Trends Genet. 1992, 8, 341-344.
6. Yong, C.S.M.; Sharkey, J.; Duscio, B.; Venville, B.; Wei, W.-Z.; Jones, R.F.; ... Kershaw, M.H. Embryonic lethality in homozygous human Her-2 transgenic mice due to disruption of the Pds5b gene. PLoS ONE. 2015, 10, e0136817.
7. Dobie, K.W.; Lee, M.; Fantes, J.A.; Graham, E.; Clark, A.J.; ... McClenaghan, M. Variegated transgene expression in mouse mammary gland is determined by the transgene integration locus. Proc. Natl. Acad. Sci. USA. 1996, 93, 6659-6664.
8. Garrick, D.; Fiering, S.; Martin, D.I.; Whitelaw, E. Repeat-induced gene silencing in mammals. Nat. Genet. 1998, 18, 56-59.
9. Kulnane, L.; Lehman, E.; Hock, B.; Tsuchiya, K.D.; Lamb, B.T. Rapid and efficient detection of transgene homozygosity by FISH of mouse fibroblasts. Mammalian Genome. 2002, 13, 223-226.
10. Liang, Z.; Breman, A.M.; Grimes, B.R.; Rosen, E.D. Identifying and genotyping transgene integration loci. Transgenic Res. 2008, 17, 979-983.
11. Ji, Y.; Abrams, N.; Zhu, W.; Salinas, E.; Yu, Z.; ... Restifo, N.P. Identification of the genomic insertion site of Pmel-1 TCR α and β transgenes by next-generation sequencing. PLoS ONE. 2014, 9, e96650.
12. Dubose, A.J.; Lichtenstein, S.T.; Narisu, N.; Bonnycastle, L.L.; Swift, A.J.; Chines, P.S.; Collins, F.S. Use of microarray hybrid capture and next-generation sequencing to identify the anatomy of a transgene. Nucleic Acids Res. 2013, 41, e70.
13. Cain-Hom, C.; Splinter, E.; van Min, M.; Simonis, M.; van de Heijning, M.; Martinez, M.; Asghari, V.; Cox, J.C.; Warming, S. Efficient mapping of transgene integration sites and local structural changes in Cre transgenic mice using targeted locus amplification. Nucleic Acids Res. 2017, pii, gkw1329.
14. de Vree, P.J.; de Wit, E.; Yilmaz, M.; van de Heijning, M.; Klous, P.; ... de Laat, W. Targeted sequencing by proximity ligation for comprehensive variant detection and local haplotyping. Nat Biotechnol. 2014, 32, 1019-1025.