Oh Transgene, Where Art Thou? Mapping Transgene Insertion

Key Takeaways

Transgenic mice are essential in biomedical research, but the random integration of transgenes during pronuclear injection can disrupt native genes and complicate phenotype interpretation, with 5-10% of transgenic mice showing unrelated phenotypes.
Traditional mapping techniques, such as FISH and PCR, are limited in resolution and efficiency, while newer methods like whole genome sequencing offer improvements in deciphering transgene insertion sites.
Targeted Locus Amplification (TLA) is an advanced NGS-based method that allows precise mapping of transgene integration sites, detecting all genetic variations with minimal prior sequence knowledge, improving understanding of transgene impact, and reducing colony management costs.

Over the last three decades, transgenic mice have become a critical in vivo modeling tool in biomedical research. The transgenic technique¹, in which an exogenous gene is inserted into the mouse genome by direct injection of DNA into the pronuclei of a zygote, has enabled thousands of new transgenic lines to be created².

However, the technique is not without limitations. One of the biggest drawbacks of using pronuclear injection to generate transgenic mice is that the transgene cannot be directed to a specific chromosomal location in the mouse genome. Integration of the transgene is a random event.

Without knowing where integration takes place, it is impossible to completely predict the consequences of a given genetic modification.

Interpreting Phenotypic Data from Transgenic Models

There are several factors worth considering when interpreting the phenotypes of transgenic lines:

The regulatory or coding region of a critical endogenous gene may be disrupted by insertion of the transgene³,⁴,⁵, potentially complicating the interpretation of phenotypes. It has been estimated that 5-10% of transgenic mice carry phenotypes unrelated to the function of the transgene⁶.
The precise location where the integration event occurs may lead to mosaic patterns of transgene expression, a phenomenon known as position effect⁷.
Multiple copies of the transgene are often inserted as head-to-tail concatemers, resulting in variability in copy number between or within founder lines. High transgene copy numbers may lead to epigenetic modification and transgene silencing⁸.

Because of the potential risks associated with random transgenesis, it is important to know the precise location of the transgene integration site. How can you effectively map transgene insertion?

Existing Methods for Mapping Transgene Insertion

The characterization of transgenic animals has historically been a challenging process. Mapping transgene insertion has typically been achieved using fluorescence in situ hybridization (FISH) or PCR-based methods, each of which brings its own limitations to the project.

FISH is a low-resolution visualization technique that is labor-intensive and cannot verify sequence integrity or detect tandem insertions at the integration site⁹. PCR-based approaches, such as inverse PCR or ligation-mediated PCR, offer better resolution than FISH, but require knowledge of the restriction sites within the transgene and specific sequence information¹⁰.
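To illustrate the kind of prior knowledge that inverse PCR depends on, the following minimal sketch scans a transgene sequence for restriction enzyme recognition sites. The function name `find_sites` and the toy sequence are hypothetical and purely illustrative; only the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are real. Both sites are palindromic, so scanning one strand is sufficient here.

```python
def find_sites(sequence: str, recognition_site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq = sequence.upper()
    site = recognition_site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping occurrences are not missed.
        start = seq.find(site, start + 1)
    return positions


# Toy transgene fragment (hypothetical, for illustration only).
toy_transgene = "ATGGAATTCCGTTAGGATCCAAGAATTCTT"

ecori_sites = find_sites(toy_transgene, "GAATTC")  # EcoRI
bamhi_sites = find_sites(toy_transgene, "GGATCC")  # BamHI
print("EcoRI:", ecori_sites)
print("BamHI:", bamhi_sites)
```

An enzyme that cuts once inside the transgene and again in the unknown flanking genome is what makes a circularized inverse PCR template possible, which is why the transgene sequence must be known in advance.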

As the cost of sequencing has come down significantly, whole genome sequencing¹¹ and sequencing with capture probes¹² have recently been used to decipher transgene insertion sites. Although next-generation sequencing (NGS)-based methods offer better clarity and quicker turnaround than FISH and PCR-based methods, standard paired-end reads (typically <400 bp) cannot reliably detect all structural variations in and around the transgene insertion site, particularly when the region is rich in repetitive sequences.

Efficiently Mapping Transgene Insertion Sites with TLA

In January 2017, an alternative NGS-based method, targeted locus amplification (TLA), was successfully used to identify the transgene integration sites in seven commonly used Cre and CreERT2 transgenic lines¹³. TLA is a novel targeted enrichment strategy that combines the principle of proximity ligation with NGS to selectively amplify and sequence the transgene together with tens to hundreds of kilobases of the surrounding genomic region.

Using only one primer pair, complementary to a short sequence unique to the transgene, crosslinked and ligated DNA fragments surrounding the transgene insertion site are selectively amplified and sequenced. By analyzing the coverage profile of the sequencing reads and the breakpoint sequences, TLA allows the precise identification of the transgene integration site¹⁴.
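The coverage-profile idea can be sketched in a few lines. This is a simplified illustration, not the published TLA analysis pipeline: it assumes reads have already been aligned and reduced to (chromosome, position) pairs, bins them at a hypothetical 100 kb resolution, and reports the most heavily covered bin as the candidate integration region. The function names, bin size, and toy data are all assumptions made for this example. Pinpointing the exact breakpoint would additionally require inspecting reads that span the transgene-genome junction.

```python
from collections import Counter


def bin_coverage(read_positions, bin_size=100_000):
    """Count aligned reads per (chromosome, bin index)."""
    counts = Counter()
    for chrom, pos in read_positions:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def candidate_integration_bin(read_positions, bin_size=100_000):
    """Return (chrom, bin_start, bin_end, read_count) for the most covered bin."""
    counts = bin_coverage(read_positions, bin_size)
    if not counts:
        raise ValueError("no reads supplied")
    (chrom, idx), n_reads = counts.most_common(1)[0]
    return chrom, idx * bin_size, (idx + 1) * bin_size, n_reads


# Toy data (hypothetical): 60 reads clustered at one locus plus scattered background.
reads = [("chr7", 45_010_000 + i * 500) for i in range(60)]
reads += [("chr1", 3_000_000), ("chr12", 88_000_000), ("chrX", 10_500_000)]

print(candidate_integration_bin(reads))
```

Because proximity ligation enriches for sequence physically close to the primer site, coverage is expected to peak around the integration site and fall off with distance, which is what a simple peak search like this relies on.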

The main advantage of TLA over conventional approaches is that it generates complete sequence information for a region of interest. TLA thus enables the detection of all single nucleotide variants and structural changes, both in the transgene and at the integration site, while requiring very little prior knowledge of the transgene sequence.

In all seven transgenic lines, the TLA analyses detected structural changes (either deletions or genomic duplications) at the transgene integration sites. This illustrates the importance of testing for rearrangements around the site of transgene insertion. An example of a TLA analysis is provided below.

Advantages of Transgene Mapping Analysis

Figure: TLA whole genome coverage and analysis plot of a Tyr-CreERT2 transgenic animal. Upper panel: TLA sequence coverage; mouse chromosomes 1 through X are arranged on the Y-axis, and the X-axis shows chromosomal position. Lower panel: graphic representation of the transgene integration site and structural changes. Grey: flanking genomic sequence. Blue: transgene and corresponding genomic coordinates of the transgene sequence (mouse, human or rat genome). Mouse genome assembly: mm9; human genome assembly: hg19; rat genome assembly: rn5. (Credit: figure kindly provided by Cergentis.)

By fully characterizing the transgene insertion site, researchers gain a better understanding of how the location of a transgene contributes to phenotypic outcomes and can uncover potential instability in transgene expression.

Knowing the location of transgene insertion is also useful for planning intercrosses in the event that a transgene and the desired allele to be selected are located on the same chromosome. This can reduce the cost of maintaining a transgenic colony.
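As a rough planning aid, the expected recombination fraction between a mapped transgene and a second allele can be estimated from their physical distance. The sketch below is a simplification resting on two assumptions that are not from the studies cited here: a genome-wide average of about 0.5 cM per Mb in mouse (local rates vary considerably) and Haldane's map function, r = 0.5 × (1 − e^(−2d)) with d in Morgans. The function name and the coordinates are hypothetical.

```python
import math


def recombination_fraction(chrom_a, pos_a, chrom_b, pos_b, cm_per_mb=0.5):
    """Estimate the recombination fraction between two loci.

    cm_per_mb is an assumed average rate; real local rates differ.
    Loci on different chromosomes assort independently (r = 0.5).
    """
    if chrom_a != chrom_b:
        return 0.5
    distance_morgans = abs(pos_a - pos_b) / 1e6 * cm_per_mb / 100
    # Haldane's map function converts map distance to recombination fraction.
    return 0.5 * (1 - math.exp(-2 * distance_morgans))


# Hypothetical example: transgene and target allele 10 Mb apart on chr7.
r = recombination_fraction("chr7", 45_000_000, "chr7", 55_000_000)
print(f"Estimated recombination fraction: {r:.4f}")
```

A small estimated fraction means that many offspring may need to be screened to recover a recombinant carrying both alleles on the same chromosome, which is useful to know before a cross is set up.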

When the integration site is unknown, a quantitative copy number variation (CNV) analysis is used to distinguish wild-type from hemizygous and homozygous animals. The resolution and reliability of this approach vary with transgene size and copy number. Once the integration site is known, researchers can replace the quantitative analysis with targeted genotyping assays using standard PCR, which dramatically cuts colony management costs while also being more specific and reliable.
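The logic of a locus-specific PCR genotyping assay can be sketched as a simple lookup. This example assumes a hypothetical two-amplicon design: one amplicon spans the intact (wild-type) integration locus, and the other spans the junction between the transgene and the flanking genome. The function name and the band-calling scheme are illustrative assumptions; an actual assay design depends on the structural changes found at the specific integration site.

```python
def call_genotype(wt_band: bool, junction_band: bool) -> str:
    """Call zygosity from presence/absence of two locus-specific PCR products.

    wt_band:       amplicon across the intact, unmodified locus
    junction_band: amplicon across the transgene-genome junction
    """
    if wt_band and junction_band:
        return "hemizygous"
    if junction_band:
        return "homozygous"
    if wt_band:
        return "wild-type"
    return "no call"  # neither product amplified; repeat the reaction


# Hypothetical gel results for four animals.
results = {
    "mouse_01": (True, False),
    "mouse_02": (True, True),
    "mouse_03": (False, True),
    "mouse_04": (False, False),
}
for animal, (wt, junction) in results.items():
    print(animal, call_genotype(wt, junction))
```

Unlike a quantitative CNV readout, each call here depends only on whether a band is present, which is why a known integration site allows a simpler and more robust assay.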

The power and simplicity of TLA technology have opened the door for researchers to efficiently identify transgene integration sites while also obtaining important data on the integrity of the transgene and of the surrounding genome. This information is invaluable for interpreting all data generated from a transgenic model and for planning future studies and intercrosses.

References:

1. Gordon, J.W.; Scangos, G.A.; Plotkin, D.J.; Barbosa, J.A.; Ruddle, F.H. Genetic transformation of mouse embryos by microinjection of purified DNA. Proc. Natl. Acad. Sci. USA. 1980, 77, 7380-7384.

2. Eppig, J.T.; Blake, J.A.; Bult, C.J.; Kadin, J.A.; Richardson, J.E. The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse. Nucleic Acids Res. 2012, 40, D881-D886.

3. Mukai, H.Y.; Motohashi, H.; Ohneda, O.; Suzuki, N.; Nagano, M.; Yamamoto, M. Transgene insertion in proximity to the C-myb gene disrupts erythroid-megakaryocytic lineage bifurcation. Mol. Cell. Biol. 2006, 26, 7953-7965.

4. Durkin, M.E.; Keck-Waggoner, C.L.; Popescu, N.C.; Thorgeirsson, S.S. Integration of a c-myc transgene results in disruption of the mouse Gtf2ird1 gene, the homologue of the human GTF2IRD1 gene hemizygously deleted in Williams-Beuren syndrome. Genomics. 2001, 73, 20-27.

5. Meisler, M.H. Insertional mutation of 'classical' and novel genes in transgenic mice. Trends Genet. 1992, 8, 341-344.

6. Yong, C.S.M.; Sharkey, J.; Duscio, B.; Venville, B.; Wei, W.-Z.; Jones, R.F.; ... Kershaw, M.H. Embryonic lethality in homozygous human Her-2 transgenic mice due to disruption of the Pds5b gene. PLoS One. 2015, 10, e0136817.

7. Dobie, K.W.; Lee, M.; Fantes, J.A.; Graham, E.; Clark, A.J.; ... McClenaghan, M. Variegated transgene expression in mouse mammary gland is determined by the transgene integration locus. Proc. Natl. Acad. Sci. USA. 1996, 93, 6659-6664.

8. Garrick, D.; Fiering, S.; Martin, D.I.; Whitelaw, E. Repeat-induced gene silencing in mammals. Nat. Genet. 1998, 18, 56-59.

9. Kulnane, L.; Lehman, E.; Hock, B.; Tsuchiya, K.D.; Lamb, B.T. Rapid and efficient detection of transgene homozygosity by FISH of mouse fibroblasts. Mamm. Genome. 2002, 13, 223-226.

10. Liang, Z.; Breman, A.M.; Grimes, B.R.; Rosen, E.D. Identifying and genotyping transgene integration loci. Transgenic Res. 2008, 17, 979-983.

11. Ji, Y.; Abrams, N.; Zhu, W.; Salinas, E.; Yu, Z.; ... Restifo, N.P. Identification of the genomic insertion site of Pmel-1 TCR α and β transgenes by next-generation sequencing. PLoS One. 2014, 9, e96650.

12. Dubose, A.J.; Lichtenstein, S.T.; Narisu, N.; Bonnycastle, L.L.; Swift, A.J.; Chines, P.S.; Collins, F.S. Use of microarray hybrid capture and next-generation sequencing to identify the anatomy of a transgene. Nucleic Acids Res. 2013, 41, e70.

13. Cain-Hom, C.; Splinter, E.; van Min, M.; Simonis, M.; van de Heijning, M.; Martinez, M.; Asghari, V.; Cox, J.C.; Warming, S. Efficient mapping of transgene integration sites and local structural changes in Cre transgenic mice using targeted locus amplification. Nucleic Acids Res. 2017, pii, gkw1329.

14. de Vree, P.J.P.; de Wit, E.; Yilmaz, M.; van de Heijning, M.; Klous, P.; Verstegen, M.J.A.M.; Wan, Y.; Teunissen, H.; Krijger, P.H.L.; Geeven, G.; et al. Targeted sequencing by proximity ligation for comprehensive variant detection and local haplotyping. Nat. Biotechnol. 2014, 32, 1019-1025.
