Genetic Engineering
CRISPR, recombinant DNA, and synthetic biology.
Genetic engineering is the direct manipulation of an organism's genes using biotechnology. It involves the introduction of foreign DNA into host organisms or the modification of existing DNA to alter characteristics or produce biological products. This field encompasses a range of technologies from traditional recombinant DNA techniques to cutting-edge genome editing and synthetic biology approaches.
Recombinant DNA Technology
Historical Development
Recombinant DNA technology emerged in the early 1970s. In 1973, Stanley Cohen and Herbert Boyer inserted foreign DNA into a bacterial plasmid and introduced it into E. coli, creating the first genetically modified organism.
Key Components and Techniques
Restriction Enzymes
Recognition and Cleavage
Types of Ends
- Sticky ends: Staggered cuts leaving short single-stranded overhangs (typically 2-4 bases)
- Blunt ends: Double-strand cuts producing flush termini
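The difference between the two end types can be illustrated with a small digest sketch. It assumes a toy enzyme table holding only EcoRI (G^AATTC) and SmaI (CCC^GGG); the names `ENZYMES`, `end_type`, and `digest` are hypothetical helpers, and only the top strand of a linear sequence is modeled.

```python
# Toy enzyme table: recognition site and cut offset on the top strand.
# Both sites are palindromic, so the bottom strand is cut at the mirrored position.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "SmaI": ("CCCGGG", 3),   # CCC^GGG
}


def end_type(enzyme):
    """Describe the ends an enzyme leaves, based on its cut offset."""
    site, cut = ENZYMES[enzyme]
    bottom_cut = len(site) - cut  # mirrored cut, in top-strand coordinates
    overhang = abs(bottom_cut - cut)
    if overhang == 0:
        return "blunt"
    side = "5'" if cut < bottom_cut else "3'"
    return f"sticky ({overhang}-nt {side} overhang)"


def digest(seq, enzyme):
    """Cut a linear top-strand sequence at every recognition site."""
    site, cut = ENZYMES[enzyme]
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut])
        start = pos + cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


print(end_type("EcoRI"), "|", end_type("SmaI"))
print(digest("AAGAATTCTT", "EcoRI"))
```

A real digest would also have to handle circular templates, non-palindromic sites, and methylation sensitivity, none of which this sketch attempts.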
DNA Ligases
DNA Vectors
Plasmids
Essential Elements
- ori: Origin of replication (allows plasmid to replicate)
- antibiotic resistance gene: Selection marker
- MCS: Multiple cloning site with restriction sites
Phage and Large-Insert Vectors
- Bacteriophage λ: 8-22 kb insert capacity
- Cosmids: 35-45 kb capacity (contain cos sites for packaging)
- BACs: Bacterial Artificial Chromosomes (~300 kb capacity)
- YACs: Yeast Artificial Chromosomes (~1000 kb capacity)
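As a rough illustration, the capacities listed above can be turned into a lookup that suggests the smallest suitable vector for a given insert. The λ and cosmid ranges come from the list; the 0 kb lower bounds for BACs and YACs, and the names `VECTOR_CAPACITY_KB` and `pick_vector`, are assumptions made for this sketch.

```python
# (name, min_kb, max_kb), ordered from smallest to largest capacity.
# Lower bounds of 0 for BACs and YACs are placeholders, not measured limits.
VECTOR_CAPACITY_KB = [
    ("Bacteriophage lambda", 8, 22),
    ("Cosmid", 35, 45),
    ("BAC", 0, 300),
    ("YAC", 0, 1000),
]


def pick_vector(insert_kb):
    """Return the first vector whose capacity range contains the insert size."""
    for name, low, high in VECTOR_CAPACITY_KB:
        if low <= insert_kb <= high:
            return name
    return None  # insert too large for any listed vector


for size in (15, 40, 150, 800, 2000):
    print(size, "kb ->", pick_vector(size))
```

In practice, small inserts would normally go into an ordinary plasmid, which the list above does not assign a capacity to, so the sketch leaves plasmids out.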
Cloning Strategies
Traditional Cloning
TA Cloning
Golden Gate Assembly
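Golden Gate Assembly relies on Type IIS enzymes such as BsaI, which cut outside their recognition site and leave user-defined 4-nt overhangs that set the order of the parts. A quick sanity check on a planned overhang set might look like the sketch below; `check_overhangs` is a hypothetical helper and the three rules it applies are a simplified subset of what dedicated design tools consider.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_overhangs(overhangs):
    """Flag overhangs that could mis-ligate in a one-pot assembly."""
    problems = []
    seen = set()
    for raw in overhangs:
        oh = raw.upper()
        rc = reverse_complement(oh)
        if len(oh) != 4:
            problems.append(f"{oh}: not 4 nt")
        if oh == rc:
            problems.append(f"{oh}: palindromic (can self-ligate)")
        if oh in seen or rc in seen:
            problems.append(f"{oh}: duplicates or complements another overhang")
        seen.add(oh)
    return problems


print(check_overhangs(["AATG", "GCTT", "CGCT"]))  # no conflicts expected
print(check_overhangs(["AATG", "CATT", "AATT"]))
```

The check says nothing about ligation fidelity between near-matching overhangs, which matters for larger assemblies.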
Transformation and Selection
Bacterial Transformation
Transformation Efficiency
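Transformation efficiency is conventionally reported as colony-forming units (CFU) per microgram of plasmid DNA. The sketch below shows the arithmetic; the function name, its parameters, and the example numbers are illustrative assumptions rather than values from a specific protocol.

```python
def transformation_efficiency(colonies, dna_ng, fraction_plated):
    """Return CFU per microgram of DNA.

    colonies        -- colonies counted on the plate
    dna_ng          -- total DNA added to the transformation, in nanograms
    fraction_plated -- fraction of the recovered culture spread on the plate
    """
    dna_plated_ug = (dna_ng / 1000.0) * fraction_plated
    return colonies / dna_plated_ug


# Example: 150 colonies from 1 ng of plasmid, with 10% of the culture plated.
efficiency = transformation_efficiency(colonies=150, dna_ng=1.0, fraction_plated=0.1)
print(f"{efficiency:.2e} CFU/ug")
```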
Selection Methods
- Antibiotic resistance: Most common selection marker
- Auxotrophic complementation: Nutritional selection
- Blue-white screening: Identifies recombinant clones by insertional inactivation of lacZα (white colonies carry an insert)
CRISPR-Cas Systems
Discovery and Development
Bacterial Adaptive Immunity
- Adaptation: New spacer acquisition from invader DNA
- Expression: Pre-crRNA transcription
- Processing: crRNA maturation
- Interference: Target recognition and cleavage
Mechanism of Action
Class 1 Systems
- Type I: Cascade complex (multi-subunit effector)
- Type III: Csm/Cmr complexes (targeting RNA)
Class 2 Systems
- Type II: Single protein (Cas9)
- Type V: Single protein (Cas12)
- Type VI: Single protein (Cas13)
Cas9-Mediated Genome Editing
Mechanism
Components
- crRNA: CRISPR RNA (contains guide sequence)
- tracrRNA: Trans-activating crRNA (facilitates processing)
- sgRNA: Single guide RNA (fusion of crRNA and tracrRNA)
PAM Recognition
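SpCas9 requires an NGG PAM immediately 3' of the 20-nt protospacer and cuts about 3 bp upstream of the PAM. A minimal forward-strand scan could be sketched as follows; `find_spcas9_targets` is a hypothetical helper, and a complete search would also scan the reverse complement.

```python
import re


def find_spcas9_targets(seq, guide_len=20):
    """List (protospacer, PAM, cut_index) for NGG PAMs on the given strand.

    cut_index is the 0-based index of the first base 3' of the predicted cut.
    """
    seq = seq.upper()
    targets = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= guide_len:  # need a full-length protospacer upstream
            protospacer = seq[pam_start - guide_len:pam_start]
            targets.append((protospacer, match.group(1), pam_start - 3))
    return targets


demo = "TTGACCTGAAGCTAGGCTAACGTTCAGGACTGGTACCA"
for protospacer, pam, cut in find_spcas9_targets(demo):
    print(protospacer, pam, cut)
```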
R-Loop Formation
CRISPR Variants
Modified Cas Systems
Base Editors
Prime Editors
Applications
- Gene knockouts: Induce indels via NHEJ
- Gene knock-ins: HDR-mediated insertion
- Gene regulation: dCas9 (catalytically dead Cas9) systems for transcriptional activation or repression
- Epigenome editing: dCas fusions with epigenetic modifiers
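For knockouts, what usually matters is whether an NHEJ indel shifts the reading frame. The sketch below classifies an edit by its length change, assuming the indel lies inside a coding exon; `classify_indel` is a hypothetical helper.

```python
def classify_indel(ref_len, edited_len):
    """Classify an edit by net length change relative to the reference."""
    delta = edited_len - ref_len
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # Only length changes that are multiples of 3 preserve the reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{abs(delta)}-bp {kind} ({frame})"


for edited in (99, 97, 101, 103):
    print(classify_indel(100, edited))
```

In-frame indels can still disrupt function if they remove critical residues, so a frameshift call is a heuristic rather than a guarantee of a knockout.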
Off-Target Effects and Safety
Predicting Off-Targets
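Most off-target predictors start by listing genomic sites that resemble the guide and sit next to a valid PAM. A deliberately naive version of that first step is sketched below. It scans one strand only, counts mismatches without position weighting, and uses hypothetical helper names; tools such as CRISPOR apply far more refined scoring.

```python
def count_mismatches(guide, site):
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def candidate_off_targets(guide, genome, max_mismatches=3):
    """Return (index, site, mismatches) for NGG-adjacent sites near the guide."""
    genome = genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 2):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] == "GG":  # NGG PAM directly 3' of the site
            mismatches = count_mismatches(guide, site)
            if mismatches <= max_mismatches:
                hits.append((i, site, mismatches))
    return hits


guide = "ACGTACGTACGTACGTACGT"
genome = guide + "AGG" + "TT" + "TCGTACGTACGTACGTACGA" + "TGG"
for hit in candidate_off_targets(guide, genome):
    print(hit)
```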
Reducing Off-Targets
- High-fidelity Cas9 variants: eSpCas9, SpCas9-HF1
- Truncated gRNAs: 17-18 nt instead of 20 nt
- Alternative PAM requirements: Cas9 orthologs and engineered variants such as SaCas9 and KKH SaCas9
Synthetic Biology
Definition and Scope
Synthetic biology is the design and construction of new biological parts, devices, and systems, or the redesign of existing natural biological systems for useful purposes.
Standardized Parts
BioBricks
Registry of Standard Biological Parts
- Part categories: Promoters, RBSs, coding sequences, terminators
- Characterization: Quantified expression levels, induction requirements
Genetic Circuits
Toggle Switch
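The toggle switch consists of two repressors that inhibit each other's expression, which gives the circuit two stable states. A minimal simulation of the standard two-variable model, du/dt = α1/(1 + v^β) - u and dv/dt = α2/(1 + u^γ) - v, is sketched below using forward Euler integration. The parameter values are illustrative assumptions, not fitted measurements.

```python
def simulate_toggle(u0, v0, alpha1=10.0, alpha2=10.0, beta=2.0, gamma=2.0,
                    dt=0.01, steps=5000):
    """Integrate the two-repressor toggle model with forward Euler.

    u and v are dimensionless repressor concentrations.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha1 / (1.0 + v ** beta) - u
        dv = alpha2 / (1.0 + u ** gamma) - v
        u += dt * du
        v += dt * dv
    return u, v


# Two starting points on opposite sides of the diagonal settle into opposite states.
print(simulate_toggle(2.0, 1.0))
print(simulate_toggle(1.0, 2.0))
```

Whichever repressor starts ahead ends up dominant, which is the memory behavior the circuit is named for.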
Oscillators (Repressilator)
AND Gates
Applications
Metabolic Engineering
Biosensors
Advanced Techniques
Homologous Recombination
Gene Targeting
Gene Replacement
Conditional Mutagenesis
Cre-LoxP System
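Cre recombinase acts on 34-bp loxP sites, each made of two 13-bp inverted repeats around an 8-bp spacer. When two sites share the same orientation, the DNA between them is excised and a single loxP site remains. The sketch below models only that excision case on a plain string; `cre_excise` is a hypothetical helper, and inversion between oppositely oriented sites is not modeled.

```python
# loxP: 13-bp repeat + 8-bp asymmetric spacer + 13-bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(seq):
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site is left behind, as after Cre-mediated excision.
    """
    first = seq.find(LOXP)
    if first == -1:
        return seq
    second = seq.find(LOXP, first + len(LOXP))
    if second == -1:
        return seq  # a single site cannot recombine with itself
    return seq[:first] + seq[second:]


floxed = "AAAA" + LOXP + "GGGGGG" + LOXP + "TTTT"
print(len(LOXP), cre_excise(floxed))
```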
Flp-FRT System
Epigenome Editing
DNA Methylation Editing
Chromatin Remodeling
Regulatory Considerations
Risk Assessment
Environmental Risk
- Gene flow: Transfer to wild populations
- Fitness effects: Impact on ecosystem dynamics
- Horizontal gene transfer: Movement between species
Human Health Risk
- Allergenicity: Potential allergic reactions
- Antibiotic resistance markers: Selection pressure
- Toxicity: Production of harmful compounds
Ethical Considerations
Germline Editing
- Heritable genetic modifications
- Intergenerational consent issues
- Designer baby concerns
Agricultural Applications
- Coexistence: GM vs. non-GM crops
- Labeling: Consumer right to know
- Patenting: Ownership of genetic sequences
Current Applications
Medicine
Gene Therapy
CAR-T Cell Therapy
Agriculture
Crop Improvement
- Herbicide resistance: Roundup Ready crops
- Biotic stress tolerance: Bt crops
- Abiotic stress tolerance: Drought-resistant varieties
- Nutritional enhancement: Golden Rice
Industrial Biotechnology
Biomanufacturing
- Recombinant proteins: Insulin, growth hormones, antibodies
- Enzymes: Industrial catalysts
- Biofuels: Ethanol, biodiesel, alkanes
- Bioplastics: PHA, PLA production
Computational Tools
Design Software
CRISPR Design
- CHOPCHOP: Guide RNA design and specificity
- CRISPOR: Comprehensive CRISPR design tool
- Benchling: Molecular biology design platform
Pathway Design
- Pathway Tools: Metabolic pathway analysis
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- BioBuilder: Synthetic biology design platform
Analysis Tools
- BLAST: Sequence similarity searches
- Clustal Omega: Multiple sequence alignment
- SnapGene: DNA visualization and cloning design
Real-World Application: CRISPR Therapeutic Development
The development of CRISPR-based therapeutics involves careful consideration of delivery, specificity, and safety.
Therapeutic CRISPR Design
import math
# CRISPR therapeutic development analysis
crispr_params = {
    'target_gene': 'HBB',  # Beta-globin gene for sickle cell disease
    'guide_length': 20,  # Nucleotides
    'pam_sequence': 'NGG',  # Required PAM for SpCas9
    'genome_build': 'hg38',  # Human genome reference
    'off_target_score': 0.85,  # Specificity score (0-1, higher is better)
    'editing_efficiency': 0.75,  # 75% editing efficiency
    'delivery_method': 'electroporation',  # Method to deliver components
    'cell_type': 'hematopoietic_stem_cells',  # Target cells
    'therapeutic_strategy': 'correction'  # Type of edit needed
}
# Assign position-specific mismatch penalties
# Using simplified specificity model; positions are counted from the PAM-proximal end
mismatch_penalties = []
for i in range(crispr_params['guide_length']):  # For each position in guide RNA
    if i < 12:  # Seed region - less tolerant of mismatches
        mismatch_penalty = 10
    else:  # Non-seed region - more tolerant of mismatches
        mismatch_penalty = 1
    mismatch_penalties.append(mismatch_penalty)
# Predict on/off-target binding probability
# Using thermodynamic model
GAS_CONSTANT = 1.987e-3  # kcal/(mol*K), matching the kcal/mol binding energies
TEMPERATURE = 310  # K (37°C)
target_binding_energy = -35  # kcal/mol (estimate)
off_target_binding_energy = -30  # kcal/mol (weaker binding)
k_on_ratio = math.exp(-(target_binding_energy - off_target_binding_energy) / (GAS_CONSTANT * TEMPERATURE))
# Calculate editing outcomes
original_allele = 1.0
edited_allele = crispr_params['editing_efficiency'] * original_allele
remaining_original = original_allele - edited_allele
# For sickle cell correction (E6V to E6E)
# Need to either correct the mutation or upregulate fetal hemoglobin
correction_outcome = {
    'normal_alleles': edited_allele,
    'sickle_alleles': remaining_original,
    'compensated_alleles': 0  # If using alternative approach
}
# Estimate therapeutic threshold
therapeutic_threshold = 0.15 # Need 15% normal alleles for clinical improvement
therapeutic_success = edited_allele > therapeutic_threshold
# Calculate predicted clinical outcome
predicted_clinical_improvement = edited_allele / 2 * 100 # Assuming heterozygous state is beneficial
print(f"CRISPR therapeutic design for {crispr_params['target_gene']}:")
print(f" Guide RNA length: {crispr_params['guide_length']} nt")
print(f" PAM sequence: {crispr_params['pam_sequence']}")
print(f" Editing efficiency: {crispr_params['editing_efficiency']*100:.1f}%")
print(f" Predicted on/off-target ratio: {k_on_ratio:.2f}")
print(f" Therapeutic threshold ({therapeutic_threshold*100}%): {'Achieved' if therapeutic_success else 'Not achieved'}")
print(f" Predicted clinical improvement: {predicted_clinical_improvement:.1f}%")
print(f" Delivery method: {crispr_params['delivery_method']}")
# Safety assessment
if crispr_params['off_target_score'] < 0.9:
    safety_concern = "High off-target risk - extensive validation needed"
else:
    safety_concern = "Acceptable specificity - proceed with development"
print(f" Safety assessment: {safety_concern}")
# Potential complications
potential_issues = []
if crispr_params['editing_efficiency'] > 0.9:
    potential_issues.append("High efficiency may increase risk of unwanted modifications")
if crispr_params['editing_efficiency'] < 0.1:
    potential_issues.append("Low efficiency may not achieve therapeutic benefit")
if crispr_params['off_target_score'] < 0.8:
    potential_issues.append("High off-target risk needs mitigation")
print(f" Potential issues: {potential_issues if potential_issues else ['None identified']}")
Clinical Trial Considerations
Factors in translating CRISPR to therapeutic applications.
Your Challenge: Vector Design and Cloning Strategy
Design a vector system for expressing a therapeutic protein and outline the cloning strategy.
Goal: Engineer a recombinant DNA construct for therapeutic protein production.
Design Parameters
import math
# Therapeutic protein design parameters
# Note: 'glycosylation' is listed to exercise the host-compatibility checks below;
# native human insulin is not glycosylated.
protein_design = {
    'target_protein': 'Human insulin',
    'accession_number': 'P01308',
    'length': 51,  # Amino acids
    'molecular_weight': 5808,  # Da
    'signal_peptide': True,  # Secreted protein
    'required_modifications': ['disulfide_bonds', 'glycosylation'],
    'expression_host': 'E.coli',
    'selection_marker': 'ampicillin',
    'promoter_type': 'inducible',  # Constitutive or inducible
    'copy_number': 'medium',  # Low, medium, or high copy plasmid
    'codon_optimization': True  # For expression host
}
# Calculate codon adaptation index (CAI) for E. coli expression
# Simplified calculation based on codon frequency
def calculate_cai(sequence, host_codons):
    # This would normally use a reference set of highly expressed genes
    # For this exercise, we'll simulate a CAI calculation
    cai_score = 0.75  # Simulated score
    return cai_score
# Vector backbone requirements
vector_features = {
    'ori': 'ColE1 origin',  # Medium copy number (roughly 15-20 copies per cell)
    'promoter': 'Ptac',  # IPTG-inducible
    'ribosome_binding_site': 'strong',  # AGGAGGT sequence
    'terminator': 'T1 from E. coli rrnB',  # Strong terminator
    'selection': 'ampR',  # Ampicillin resistance
    'multiple_cloning_site': ['BamHI', 'EcoRI', 'XhoI', 'XbaI']  # Common sites
}
# Calculate insert size for cloning
protein_coding_seq = 'ATGAAATTTATCATCGCCCTGGTGATCGTTATCCTGGCGCTGGCCCAGCCCGGCGAA'  # Placeholder coding sequence (illustrative; not the native human insulin sequence)
poly_histidine_tag = 'CACCATCACCACCACCAC'  # 6xHis tag for purification
stop_codon = 'TAG'  # Stop codon
full_insert = protein_coding_seq + poly_histidine_tag + stop_codon
insert_length = len(full_insert)
# Calculate expression optimization
if protein_design['codon_optimization']:
    codon_adaptation_index = 0.82  # Optimized for E.coli
else:
    codon_adaptation_index = 0.55  # Native sequence
# Predict expression level based on design parameters
promoter_strength = 0.8 if protein_design['promoter_type'] == 'inducible' else 1.0 # Inducible is usually strong
rbs_strength = 0.9 if vector_features['ribosome_binding_site'] == 'strong' else 0.5
copy_number_factor = 10 if protein_design['copy_number'] == 'high' else 3 # High vs medium copy
predicted_expression_level = codon_adaptation_index * promoter_strength * rbs_strength * copy_number_factor
# Consider potential issues
expression_issues = []
if codon_adaptation_index < 0.6:
    expression_issues.append("Codon bias may reduce expression")
if 'glycosylation' in protein_design['required_modifications']:
    if protein_design['expression_host'] == 'E.coli':
        expression_issues.append("E.coli lacks glycosylation machinery")
if predicted_expression_level < 1.0:
    expression_issues.append("Low expression level predicted")
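# Calculate production yield estimation
culture_volume = 1  # Liters
cell_density = 4  # OD600
dry_weight_per_od = 0.5  # g/L dry cell weight per OD600 unit (OD600 of 4 is approximately 2 g/L)
protein_fraction = 0.5  # Assumed fraction of dry cell weight that is protein
expression_level = 0.1  # Fraction of total protein as target
estimated_yield = cell_density * dry_weight_per_od * protein_fraction * expression_level  # grams of target protein per liter
total_yield = estimated_yield * culture_volume  # grams for the whole culture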
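The `calculate_cai` stub above returns a simulated score. For comparison, one common definition of the codon adaptation index is the geometric mean of each codon's relative adaptiveness, where a codon's weight is its count in a reference gene set divided by the count of the most frequent synonymous codon. The sketch below follows that definition. The helper names and the two toy reference sequences are assumptions for illustration, not real E. coli usage data.

```python
import math
from collections import Counter

# Standard genetic code, built from the conventional TCAG-ordered table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons_of(seq):
    """Yield complete codons from an in-frame DNA sequence."""
    seq = seq.upper()
    for i in range(0, len(seq) - len(seq) % 3, 3):
        yield seq[i:i + 3]


def relative_adaptiveness(reference_seqs):
    """Weight = codon count / count of the most used synonymous codon."""
    counts = Counter(c for seq in reference_seqs for c in codons_of(seq))
    weights = {}
    for aa in set(CODON_TABLE.values()):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        top = max(counts[c] for c in family)
        for codon in family:
            if counts[codon] > 0:
                weights[codon] = counts[codon] / top
    return weights


def cai_from_reference(sequence, weights):
    """Geometric mean of codon weights, skipping Met, Trp, stops, unseen codons."""
    logs = [
        math.log(weights[c])
        for c in codons_of(sequence)
        if CODON_TABLE.get(c) not in ("M", "W", "*") and c in weights
    ]
    return math.exp(sum(logs) / len(logs)) if logs else 0.0


toy_reference = ["AAAAAAAAG", "GAAGAAGAAGAG"]  # made-up sequences, illustration only
weights = relative_adaptiveness(toy_reference)
print(round(cai_from_reference("AAAGAA", weights), 3))
print(round(cai_from_reference("AAGGAA", weights), 3))
```

With a real reference set of highly expressed host genes, the same functions could replace the hard-coded score, though handling of unseen codons would need more thought than the simple skip used here.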
Design a recombinant DNA construct for therapeutic protein production.
Hint:
- Consider the expression host and optimize for it
- Include proper regulatory elements
- Plan the cloning strategy with compatible restriction sites
- Consider protein purification and secretion
# TODO: Design the recombinant construct
vector_backbone = "" # Name of vector backbone to use
promoter_selected = "" # Promoter to use for expression
cloning_strategy = "" # Step-by-step cloning approach
expression_level_prediction = 0 # Predicted expression level (relative units)
purification_tags = [] # Tags for protein purification
safety_considerations = [] # Safety considerations for therapeutic use
# Select appropriate vector
if protein_design['copy_number'] == 'high':
    vector_backbone = "pET series (T7 promoter)"
elif protein_design['copy_number'] == 'medium':
    vector_backbone = "pGEX series (glutathione S-transferase fusion)"
else:
    vector_backbone = "pACYC series (low copy, good for toxic proteins)"
# Select appropriate promoter
if protein_design['promoter_type'] == 'inducible':
    promoter_selected = "T7 or Ptac (IPTG-inducible)"
else:
    # trc and lacUV5 are lac-regulated (IPTG-inducible), so they are not constitutive choices
    promoter_selected = "Constitutive promoter (e.g., Anderson J23100)"
# Design cloning strategy
# Step 1: Design oligos for gene synthesis with optimal codons
# Step 2: PCR amplify with restriction sites for cloning
# Step 3: Digest vector and insert with compatible enzymes
# Step 4: Ligate and transform
# Step 5: Select and verify clones
cloning_strategy = [
    "Synthesize gene with optimized codons for E. coli",
    f"Add restriction sites for {vector_features['multiple_cloning_site'][0]} and {vector_features['multiple_cloning_site'][1]}",
    f"Digest vector with {vector_features['multiple_cloning_site'][0]} and {vector_features['multiple_cloning_site'][1]}",
    "Ligate insert into linearized vector",
    "Transform into competent E. coli cells",
    "Select on antibiotic plates and verify by sequencing"
]
# Calculate expression level
expression_level_prediction = predicted_expression_level
# Add purification tags
if protein_design['required_modifications'] and 'purification' in str(protein_design['required_modifications']):
    purification_tags = ["6xHistidine tag", "FLAG tag"]
else:
    purification_tags = ["6xHistidine tag"]  # Standard for E. coli
# Consider safety factors
safety_considerations = []
if protein_design['expression_host'] == 'E.coli':
    safety_considerations.append("Endotoxin removal required for therapeutic use")
if protein_design['required_modifications']:
    if 'glycosylation' in str(protein_design['required_modifications']):
        safety_considerations.append("E. coli cannot glycosylate proteins - may affect function")
if protein_design['selection_marker'] == 'ampicillin':  # selection_marker is defined in protein_design
    safety_considerations.append("Antibiotic resistance marker requires removal for clinical use")
# Print results
print(f"Vector backbone: {vector_backbone}")
print(f"Promoter selected: {promoter_selected}")
print(f"Cloning strategy: {cloning_strategy}")
print(f"Expression level prediction: {expression_level_prediction:.2f}")
print(f"Purification tags: {purification_tags}")
print(f"Estimated yield: {estimated_yield:.3f} g/L")
print(f"Safety considerations: {safety_considerations}")
# Design validation
if expression_level_prediction > 0.5 and not expression_issues:
    design_assessment = "Promising design - likely successful expression"
elif expression_level_prediction > 0.2:
    design_assessment = "Workable design - moderate expression expected"
else:
    design_assessment = "Suboptimal design - consider improvements"
print(f"Design assessment: {design_assessment}")
How would you modify your design if the therapeutic protein required proper eukaryotic post-translational modifications?
ELI10 Explanation
Simple analogy for better understanding
Self-Examination
How does the CRISPR-Cas system work and what makes it so precise?
What are the key steps involved in creating recombinant DNA molecules?
What are the applications and potential risks of synthetic biology?