Bias Free Linear Vector for Cloning Recalcitrant DNA

Bias Free Linear Vector for Cloning Recalcitrant DNA & Accelerating Sequence Finishing
Ronald Godiska1, Rebecca Hochstein1, Sarah Vande Zande1, Nikolai Ravin2, Attila Karsi3, David A. Mead1
1 Lucigen Corporation, Middleton, WI 53562 2 Centre Bioengineering, Russian Academy of Science, Moscow, Russia 3 Mississippi State University
Abstract
We have developed a novel linear vector for unbiased cloning of 0-30 kb inserts in E. coli. This vector, termed “pJAZZ”, shows unprecedented ability to maintain large inserts from very AT-rich genomes. The otherwise difficult-to-clone genome of Flavobacterium columnare (70% AT, 3.2 Mb) was sequenced to sevenfold coverage using the pJAZZ vector, with only 10 sequencing gaps. The linear vector was able to maintain 20-30 kb fragments from Lactobacillus helveticus (65% AT) and 2-4 kb inserts from Piromyces (up to 96% AT), which were unclonable in conventional plasmids. Unlike fosmid cloning, the construction of large-insert libraries (10-20 kb) in pJAZZ is simple and robust, using standard methods of transformation and plasmid purification. We are evaluating the use of a single pJAZZ shotgun library to eliminate the need for multiple libraries, making finishing easier and more cost-effective.
Enhanced stability of inserts in the pJAZZ vector is attributed to both the lack of supercoiling and the lack of transcriptional interference. Torsional strain inherent to supercoiled plasmids can induce localized melting and generate secondary structures, which are substrates for deletion or rearrangement by resolvases and replication enzymes. For example, the instability of tandem repeats and palindromic sequences is presumably due to cleavage of hairpin structures or to replication slippage across the secondary structures. Most conventional plasmid vectors also induce strong transcription and translation of inserted fragments, and they allow transcription from cloned promoters to interfere with plasmid stability. As a result, certain DNA sequences are deleterious or highly unstable, leading to sequence “stacking”, clone gaps, or a complete inability to construct libraries, especially from AT-rich genomes or toxic cDNAs.
The transcription-free, linear pJAZZ vector also minimizes “sequence gaps” caused by secondary structures, as shown by its stable cloning of inverted repeats and di- and tri-nucleotide repeats.
Replication of the pJAZZ Linear Vector
The pJAZZ vector is derived from the linear phage N15 of E. coli. The repA and telN genes of N15 encode replicase and protelomerase. Bi-directional replication is initiated by RepA and is carried out by the host DNA polymerase. TelN cleaves the replicated telomeres and covalently joins the 5’ and 3’ strands of each free end, re-creating the terminal hairpin loops (telL and telR).
Linear plasmid DNA is electroporated into BigEasy™ TSA E. coli cells, which contain N15 genes needed for partitioning and copy number regulation. Linear plasmid DNA is efficiently isolated using standard methods, such as alkaline lysis and binding to a silica matrix.
[Figure: replication of the pJAZZ linear vector. Labels: telL, telR, RepA, TelN, Insert.]
Minimal Size Bias
Sheared E. coli genomic DNA was cloned into the pJAZZ vector or pUC19 without size selection. As expected, the size distribution of the pUC19 inserts was clearly skewed toward smaller inserts. In contrast, the distribution of the pJAZZ inserts closely matched the input DNA.
[Figure: bar chart comparing pJAZZ and pUC19. X-axis: insert size classes (0-1 kb, 1-2 kb, 2-3 kb, 3-4 kb, 4-5 kb, 5-6 kb, 6-8 kb, >8 kb). Y-axis scale: 0-40.]
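The size comparison above amounts to sorting measured insert sizes into the chart's size classes and comparing the fraction of clones in each class. A minimal sketch of that binning follows. It assumes the chart plots the percentage of clones per class, which the poster does not state, and the insert sizes are invented for illustration; they are not data from this study.

```python
from collections import Counter

# Size classes taken from the chart axis.
BIN_LABELS = ["0-1 kb", "1-2 kb", "2-3 kb", "3-4 kb",
              "4-5 kb", "5-6 kb", "6-8 kb", ">8 kb"]
UPPER_EDGES = [1, 2, 3, 4, 5, 6, 8]  # upper bound (kb) of every class but the last


def size_bin(insert_kb):
    """Return the size-class label for one insert size in kb.

    A size that falls exactly on a boundary goes to the higher class
    (an arbitrary choice made for this sketch).
    """
    for label, upper in zip(BIN_LABELS, UPPER_EDGES):
        if insert_kb < upper:
            return label
    return BIN_LABELS[-1]


def percent_by_bin(insert_sizes_kb):
    """Percentage of clones that fall in each size class."""
    counts = Counter(size_bin(s) for s in insert_sizes_kb)
    total = len(insert_sizes_kb)
    return {label: 100.0 * counts[label] / total for label in BIN_LABELS}


# Hypothetical insert sizes (kb), invented for illustration only.
example_sizes = [0.6, 1.4, 2.2, 2.9, 3.5, 4.1, 5.7, 7.0, 9.3, 12.0]
distribution = percent_by_bin(example_sizes)
for label in BIN_LABELS:
    print(f"{label:>7}: {distribution[label]:.0f}%")
```

Running the same function on the input DNA and on each library would give directly comparable distributions.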
pJAZZ™ Linear Vectors for High Stability Cloning
The linear pJAZZ vectors lack the structural and lysis genes of N15. All versions of the pJAZZ vectors are also transcription-free, and they appear to behave similarly. Dual selection ensures that recombinant clones contain both arms of the vector. The origin of replication is used to select for the left arm in the pJAZZ-OC vectors.
TelN, protelomerase; repA, replication protein; cB, replication regulator; Cmr, chloramphenicol resistance; terminators; black circles, closed hairpin telomeres.
The ends of the vectors are free to rotate during replication, so cloned inserts are not subject to torsional stress caused by supercoiling. Transcriptional terminators at the cloning site minimize transcriptional interference between the insert and the vector, increasing insert stability. The pJAZZ vectors are low-copy (~5/cell) to further promote stable propagation of inserts, and their copy number can be induced 5-20X for DNA preparation.
The E. coli host strain for the pJAZZ vectors contains several required genes of N15: telN encodes protelomerase for replication, sopAB is needed for stable maintenance, and antA regulates copy number.
Cloning “Unclonable” DNA
AT-rich genomic DNA
Using conventional circular vectors, AT-rich DNA is often difficult to clone in E. coli, producing very few stable, intact clones (Figure A).
A) Lactobacillus helveticus (>67% AT) inserts of 1-2 kb are unstable in pUC19
[Gel image. Lane labels: M, V. Size markers: 16, 8, 6, 4, 2 kb. Intact clones and deleted clones are indicated.]
In contrast, large clones of extremely AT-rich DNA are stable in the linear, transcription-free pJAZZ vectors (Figures B, C, D, E).
B) L. helveticus inserts of 10-20 kb (67% AT) in the pJAZZ vector
[Gel image. Size markers: 23, 9.4, 6.6 kb. Bands: left arm (12 kb); right arm (2 kb) ran off the gel.]
C) Piromyces inserts of 2-6 kb (85-96% AT), with a sequence trace from one of the clones.
[Gel image. Size markers: 10, 6, 4, 2, 1 kb. Bands: left arm (12 kb); right arm (2 kb).]
GC-rich genomic DNA
The GC-rich genome of B. animalis (67% GC) was sheared to 6-20 kb, end-repaired, and cloned into the pJAZZ vector. Clones from this library were similar in size and number to those from the AT-rich L. helveticus library (see below). Uncut DNA from transformants was 24-40 kb, corresponding to inserts of 10-26 kb.
[Gel image of uncut clone DNA. Size markers: 97, 48, 33, 15 kb. Panels: L. helveticus (65% AT); B. animalis (67% GC).]
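The insert sizes quoted for the uncut clones follow from subtracting the two vector arms from the total clone length. The short sketch below reproduces that arithmetic, assuming the 12 kb and 2 kb arm sizes given on the gel labels elsewhere in this poster; the function name is illustrative.

```python
LEFT_ARM_KB = 12   # left vector arm, as labeled on the gels
RIGHT_ARM_KB = 2   # right vector arm, as labeled on the gels


def insert_size_kb(uncut_clone_kb, left_arm_kb=LEFT_ARM_KB, right_arm_kb=RIGHT_ARM_KB):
    """Insert size = uncut linear clone size minus both vector arms."""
    return uncut_clone_kb - (left_arm_kb + right_arm_kb)


for total_kb in (24, 40):
    print(f"uncut clone {total_kb} kb -> insert {insert_size_kb(total_kb)} kb")
```

With these arm sizes, uncut clones of 24 kb and 40 kb correspond to inserts of 10 kb and 26 kb, matching the range stated in the text.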
Repetitive DNA
Repetitive DNA sequences can be extremely difficult to clone in circular plasmids. For example, mollusk cDNA of 0.3-2 kb was unclonable in all circular plasmids tested. However, the pJAZZ linear vector produced a library of full length clones. Repetitive DNA from two of the clones is shown (Figures A & B).
[Figures A and B: repetitive DNA from two of the clones.]
Cloning Large PCR products
PCR products of 15-20 kb were amplified from phage lambda DNA and cloned into the pJAZZ vector. Plasmid DNA from transformants was cut with NotI to excise the insert, then analyzed by gel electrophoresis. The left vector arm migrates at 10 kb; the right arm (2 kb) was run off the gel.
[Gel image: phage lambda PCR fragments. Size markers: 23, 9.4, 6.6 kb.]
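When an insert is excised with NotI, the expected gel pattern is the two vector arms plus the insert, unless the insert itself contains NotI sites, in which case it is split into smaller fragments. The sketch below predicts band sizes under simplifying assumptions: NotI sites flank the cloning site, any linker sequence and overhangs are ignored, and the arm sizes are the approximate values quoted in the text. The function names and test sequences are illustrative only.

```python
import re

NOTI_SITE = "GCGGCCGC"   # NotI recognition sequence
NOTI_CUT_OFFSET = 2      # NotI cuts after the second base: GC^GGCCGC


def internal_noti_fragments(insert_seq):
    """Split an insert at internal NotI sites; return fragment lengths in bp."""
    seq = insert_seq.upper()
    # A lookahead is used so that overlapping sites are also found.
    cut_positions = [m.start() + NOTI_CUT_OFFSET
                     for m in re.finditer(f"(?={NOTI_SITE})", seq)]
    boundaries = [0] + cut_positions + [len(seq)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


def expected_bands_bp(insert_seq, left_arm_bp=10_000, right_arm_bp=2_000):
    """Expected band sizes (bp), largest first, after a NotI digest of a clone."""
    bands = [left_arm_bp, right_arm_bp] + internal_noti_fragments(insert_seq)
    return sorted(bands, reverse=True)


# Synthetic test inserts, not real clone sequences.
clean_insert = "ACGT" * 250                          # 1000 bp, no NotI site
cut_insert = "A" * 500 + NOTI_SITE + "T" * 300       # one internal NotI site

print(expected_bands_bp(clean_insert))
print(expected_bands_bp(cut_insert))
```

An insert band count greater than one in this prediction would signal that a clone's gel pattern cannot be read as a single insert size.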
Efficient Library Construction and Assembly
The pJAZZ vector was used to construct genomic libraries of Flavobacterium columnare (75% AT, 3.2 Mb). Clones of 2-10 kb were readily obtained, and gave sequence reads of ~800 bp.
Automated assembly closely followed the theoretical Lander-Waterman curve, yielding 21 major contigs. Further analysis revealed just 10 sequence gaps. Therefore, this AT-rich genome was completely cloned and nearly completely sequenced without the use of fosmid or BAC libraries.
- Attila Karsi, Mississippi State University
[Plot: number of contigs (y-axis, 0-1400) vs. coverage (x-axis, 0-8).]
E) Number of contigs vs. genomic sequence coverage of Flavobacterium columnare.
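For reference, the Lander-Waterman model predicts the expected number of contigs (including single-read contigs) as N·exp(−cσ), where N is the number of reads, c = N·L/G is the coverage, L is the read length, G is the genome size, and σ = 1 − T/L for a minimum detectable overlap T. The sketch below evaluates that expression with the genome size and read length given in the text. The 40 bp minimum overlap is an assumed value, not one reported here, and the function names are illustrative.

```python
import math

GENOME_BP = 3_200_000    # F. columnare genome size from the text (3.2 Mb)
READ_BP = 800            # approximate read length from the text
MIN_OVERLAP_BP = 40      # ASSUMED overlap needed to join two reads (not from the source)


def reads_for_coverage(coverage, genome_bp=GENOME_BP, read_bp=READ_BP):
    """Number of reads needed to reach a given fold coverage."""
    return coverage * genome_bp / read_bp


def expected_contigs(coverage, genome_bp=GENOME_BP, read_bp=READ_BP,
                     min_overlap_bp=MIN_OVERLAP_BP):
    """Lander-Waterman expected contig count: N * exp(-c * sigma)."""
    n_reads = reads_for_coverage(coverage, genome_bp, read_bp)
    sigma = 1 - min_overlap_bp / read_bp
    return n_reads * math.exp(-coverage * sigma)


for c in range(0, 9):
    print(f"{c}x coverage: {reads_for_coverage(c):>7.0f} reads, "
          f"~{expected_contigs(c):.0f} expected contigs")
```

The model assumes reads are sampled uniformly from the genome, so an assembly that tracks this curve is consistent with a library that has little cloning bias.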
D) Clostridium inserts of 6-12 kb (65% AT)
- Bob Fulton, Debbie Moeller, Wash. Univ. Genome Sequencing Center
[Gel image. Size markers: 10, 6, 4, 2, 1 kb. Bands: left arm (12 kb); right arm (2 kb).]
E) Ichthyophthirius multifiliis inserts of 6-12 kb (85% AT)
- Donna Cassidy-Hanley, Cornell Univ.
[Gel image. Size markers: 10, 6, 4, 2, 1 kb. Bands: left arm (10 kb); right arm (2 kb).]
DNAs were sheared to 2-6 kb (Figures A & C) or 6-20 kb (Figures B, D, E), end-repaired, size-selected, and cloned into the vector. Plasmid DNA from pJAZZ transformants was cut with NotI to excise the insert and analyzed by gel electrophoresis. Vector bands are 12 kb and 2 kb. AT content of inserts was as high as 96% (Figure C).
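The AT content quoted for each insert is the fraction of A and T bases in its sequence. A minimal sketch of that calculation follows; the function name is illustrative, the example sequence is synthetic, and ambiguous bases such as N are simply excluded, which is an assumption of this sketch.

```python
def at_percent(seq):
    """Percentage of A/T among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


# Synthetic example sequence, not taken from any clone in this study.
example = "AATTAATTGC"
print(f"{at_percent(example):.0f}% AT")
```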
In E. coli, clones of 50-100 CGG repeats from the Fragile X locus are highly unstable in circular vectors, and the frequency of deletion is increased by transcription and supercoiling. Consistent with these observations, the linear, transcription-free pJAZZ vector was able to maintain fragments containing 220 copies of the CGG repeat, which has not been achieved with circular vectors.
(CGG)220
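One way to check whether a repeat tract has been maintained is to count the longest uninterrupted run of the repeat unit in the clone's sequence and compare it with the expected copy number. The sketch below does this with a regular expression. The sequence is built synthetically to stand in for a clone, and the function name is illustrative; this is not the authors' analysis method.

```python
import re


def longest_repeat_run(seq, unit="CGG"):
    """Largest number of consecutive, uninterrupted copies of `unit` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


# Synthetic stand-in for a clone: flanking bases around 220 CGG copies.
clone_seq = "AT" + "CGG" * 220 + "TA"
copies = longest_repeat_run(clone_seq)
print(f"longest CGG run: {copies} copies")
```

A copy number lower than expected would indicate deletion within the tract during propagation.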
Summary
• Linear, transcription-free pJAZZ vectors allow cloning of repetitive DNA and large AT-rich or GC-rich DNAs. Libraries of 10-20 kb clones of AT-rich genomic DNA were routinely created with the linear transcription-free vector. Likewise, repetitive DNAs were maintained without rearrangement.
• Efficient Genomic Cloning and Sequencing. Powerful, unbiased cloning greatly reduced the need for manual finishing of genomic libraries. A 3-Mb microbial genome was assembled without the use of fosmid clones.
• Rapid and simple protocol. No vector preparation or special techniques are needed to generate high quality, large-insert libraries of otherwise “unclonable” DNAs.
[Gel image: insert DNA and 1 kb ladder. Ladder bands: 1, 2, 3, 4, 5, 6, 8, 10 kb.]
Size distribution of pJAZZ vs pUC19 clones. The size distribution of 85-100 inserts in each vector is graphed. The insert DNA and 1 kb ladder are shown below the chart.
Lucigen Corporation 2120 W. Greenview Dr., Ste 9 Middleton, WI 53562 www.lucigen.com