Molecular Biology
DNA replication, transcription, translation, and gene regulation.
Molecular biology is the study of biological activity at the molecular level, particularly focusing on the interactions between DNA, RNA, and proteins that drive cellular processes. Understanding these fundamental processes is essential for comprehending how genetic information is stored, expressed, and regulated in living organisms.
DNA Structure and Replication
DNA Structure
Watson and Crick proposed the double-helix structure of DNA in 1953.
Base Pairing Rules
- Adenine (A) pairs with Thymine (T) via 2 hydrogen bonds
- Guanine (G) pairs with Cytosine (C) via 3 hydrogen bonds
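The stability relationship: G-C pairs (3 hydrogen bonds) are more stable than A-T pairs (2 hydrogen bonds), so GC-rich DNA requires more energy to separate.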
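The base-pairing rules can be turned into a quick comparison of two sequences. The following is a minimal sketch, not a thermodynamic model: the helper names `gc_content` and `total_hydrogen_bonds` are hypothetical, and counting hydrogen bonds is only a rough proxy for duplex stability.

```python
# Hydrogen bonds contributed by the pair each base forms (A-T: 2, G-C: 3).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def total_hydrogen_bonds(seq: str) -> int:
    """Hydrogen bonds holding seq to its fully complementary strand."""
    return sum(H_BONDS[base] for base in seq.upper())


at_rich = "ATATATAT"
gc_rich = "GCGCGCGC"
for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    print(name, f"GC content = {gc_content(seq):.2f},",
          f"hydrogen bonds = {total_hydrogen_bonds(seq)}")
```

For two duplexes of equal length, the GC-rich one ends up with more hydrogen bonds, which matches the stability relationship stated above.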
DNA Strands
- 5' to 3' direction: Phosphate group at 5', hydroxyl at 3'
- Antiparallel: The two strands run in opposite directions
- Complementarity: The sequence of one strand determines the sequence of the other
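Because the strands are antiparallel and complementary, the partner of any strand can be derived by complementing each base and reversing the result. A short sketch, assuming sequences are written 5' to 3' and contain only A, C, G, and T (the function name is illustrative):

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


top = "ATGGCC"
bottom = reverse_complement(top)
print(f"5'-{top}-3'")
print(f"3'-{bottom[::-1]}-5'")  # shown 3' to 5' so the pairs line up
```

Applying `reverse_complement` twice returns the original strand, which is one way to check the complementarity rule.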
DNA Replication
DNA replication is semiconservative, meaning each new DNA molecule consists of one original strand and one newly synthesized strand:
$$\text{DNA} \xrightarrow{\text{DNA polymerase}} \text{DNA (50\% parental, 50\% new)}$$
Replication Process
- Initiation
  - DNA helicase unwinds the double helix
  - Single-strand binding proteins (SSBs) stabilize unwound DNA
  - Primase synthesizes RNA primers
- Elongation
  - DNA polymerase III adds nucleotides in 5' to 3' direction
  - Leading strand: Continuous synthesis
  - Lagging strand: Discontinuous synthesis (Okazaki fragments)
- Termination
  - RNA primers removed (RNase H; in bacteria, chiefly the 5' to 3' exonuclease activity of DNA polymerase I)
  - Gaps filled by DNA polymerase I
  - DNA ligase seals nicks
Replication Enzymes
| Enzyme | Function |
|---|---|
| DNA Helicase | Unwinds double helix |
| DNA Polymerase III | Main replication enzyme (prokaryotes) |
| DNA Polymerase α, δ, ε | Replication enzymes (eukaryotes) |
| Primase | Synthesizes RNA primers |
| DNA Ligase | Joins Okazaki fragments |
| Topoisomerase | Relieves supercoiling ahead of the replication fork |
Proofreading and Repair
- 3' to 5' exonuclease: Immediate error correction
- Mismatch repair: Post-replication correction
- Nucleotide excision repair: Damage-induced repair
Transcription
RNA Polymerase Mechanism
Transcription copies the sequence information of a DNA template strand into RNA.
Transcription Process
- Initiation
  - RNA polymerase binds promoter region
  - Transcription factors assist binding
  - DNA unwinds to form transcription bubble
- Elongation
  - RNA polymerase moves 3' to 5' along template DNA
  - RNA transcript grows 5' to 3'
  - DNA helix reforms behind polymerase
- Termination (prokaryotes)
  - Rho-independent (intrinsic) termination: Hairpin loop followed by a U-rich sequence
  - Rho-dependent termination: Rho protein factor releases the transcript
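The direction rules in the elongation step can be checked with a few lines of code. A minimal sketch, assuming the template strand is supplied 5' to 3' and ignoring promoters and terminators (the function name `transcribe` is illustrative):

```python
# Map each template DNA base to the RNA base that pairs with it.
_DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_5to3: str) -> str:
    """Return the mRNA (5' to 3') made from a template strand.

    The polymerase reads the template 3' to 5', so the complemented
    sequence is reversed to give the transcript in 5' to 3' order.
    """
    return template_5to3.upper().translate(_DNA_TO_RNA)[::-1]


coding_strand = "ATGGCCTAA"    # 5' to 3'
template_strand = "TTAGGCCAT"  # 5' to 3', reverse complement of the coding strand

mrna = transcribe(template_strand)
print(mrna)
```

The transcript has the same sequence as the coding strand with U in place of T, which is why gene sequences are conventionally quoted from the coding strand.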
Prokaryotic vs. Eukaryotic Transcription
| Feature | Prokaryotes | Eukaryotes |
|---|---|---|
| Location | Cytoplasm | Nucleus |
| RNA Polymerases | Single enzyme | RNA Pol I, II, III |
| Coupling | Transcription/translation coupled | Sequential processes |
| Processing | Minimal | Extensive processing |
Translation
The Genetic Code
The genetic code is degenerate and nearly universal.
Code Characteristics
- Degenerate: Multiple codons for single amino acid
- Nearly universal: Conserved across almost all organisms, with minor variations (e.g., mitochondrial codes)
- Non-overlapping: Each base read once
- Commaless: No punctuation between codons
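Degeneracy is easy to see by building the standard codon table and counting how many codons map to each amino acid. The sketch below encodes the standard table as a 64-character string in U, C, A, G order; `*` marks a stop codon, and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons, ordered UUU, UUC, UUA, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

CODON_TABLE = {"".join(codon): residue
               for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)}

codons_per_residue = Counter(CODON_TABLE.values())
for residue, count in sorted(codons_per_residue.items(), key=lambda kv: -kv[1]):
    print(residue, count)
```

Leucine, serine, and arginine each have six codons, while methionine and tryptophan have one, so 61 sense codons cover only 20 amino acids.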
Translation Process
- Initiation
  - Small ribosomal subunit binds mRNA
  - tRNA carrying methionine (fMet in prokaryotes) binds start codon (AUG)
  - Large ribosomal subunit joins, forming complete ribosome
- Elongation
  - A site: Accepts incoming aminoacyl-tRNA
  - P site: Holds growing peptide chain
  - E site: Exit for uncharged tRNA
  - Peptide bond formation: Peptidyl transferase activity
- Termination
  - Stop codons (UAA, UAG, UGA) recognized by release factors
  - Polypeptide released from ribosome
  - Ribosome subunits dissociate
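The three phases map onto a simple reading loop: find the first AUG, read non-overlapping codons, and stop at a stop codon. This is a sketch only. It uses the standard codon table, scans for the first AUG rather than modeling ribosome binding, and the function name `translate` is an illustrative choice.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
CODON_TABLE = {"".join(codon): residue for codon, residue in zip(
    product("UCAG", repeat=3),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")            # initiation at the start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):  # elongation, one codon at a time
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":              # termination at UAA, UAG, or UGA
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("GGAUGGCCUUUUAAGG"))
```

In the example, the reading frame is AUG GCC UUU UAA, giving Met-Ala-Phe before the stop codon ends translation.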
tRNA Structure and Function
- Anticodon: 3-nucleotide sequence complementary to mRNA codon
- Amino acid attachment: At 3' end (CCA sequence)
- Secondary structure: Cloverleaf formation stabilized by H-bonds
Gene Regulation
Prokaryotic Gene Regulation: The Lac Operon
The lac operon is a classic model of gene regulation:
Components
- lacZ: β-galactosidase (breaks down lactose)
- lacY: Permease (lactose transport)
- lacA: Thiogalactoside transacetylase (not required for lactose breakdown)
- lacI: Repressor gene (regulates operon)
Regulation Mechanism
- Negative control: Repressor binding blocks transcription
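- Induction: When lactose is present, its isomer allolactose binds and inactivates the repressor
- Catabolite repression: Glucose lowers cAMP levels, so the cAMP-CRP activator complex does not form and lac transcription stays low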
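The two controls combine into a small truth table. The following sketch treats both signals as simple on/off switches, which is a deliberate simplification of the real, graded response; the function name and the output labels are illustrative.

```python
def lac_operon_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative lac operon output for a given sugar environment."""
    repressor_bound = not lactose_present    # allolactose releases the repressor
    camp_crp_active = not glucose_present    # low glucose allows cAMP-CRP to form
    if repressor_bound:
        return "off (basal only)"
    return "high" if camp_crp_active else "low"


for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> "
              f"{lac_operon_expression(lactose, glucose)}")
```

Only the combination of lactose present and glucose absent gives high expression, which is the behavior the negative and positive controls are meant to produce together.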
Eukaryotic Gene Regulation
Transcriptional Control
- Chromatin Remodeling
  - Histone modifications: Acetylation, methylation
  - DNA methylation: Generally repressive
  - Chromatin accessibility: Open vs. closed domains
- Transcription Factors
  - General TFs: Basal transcription machinery
  - Specific TFs: Bind enhancers and silencers
  - Coactivators/corepressors: Modulate TF activity
Post-transcriptional Control
- RNA Processing
  - 5' capping: Protection and ribosome binding
  - 3' polyadenylation: Stability and transport
  - Splicing: Intron removal and exon joining
- Alternative Splicing
Post-translational Control
- Protein modifications: Phosphorylation, glycosylation
- Protein degradation: Ubiquitin-proteasome pathway
- Regulatory proteins: Bind target proteins to activate or inhibit them
Advanced Topics in Gene Expression
RNA Processing in Eukaryotes
Pre-mRNA Splicing
The spliceosome removes introns and joins exons:
- Splice sites: Conserved sequences (GU-AG rule)
- Branch point: Critical for splicing reaction
- Lariat intermediate: Intron structure during splicing
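The GU-AG rule can be expressed as a check performed before an intron is removed. This is a minimal sketch, assuming intron coordinates are already known and supplied as half-open `(start, end)` index pairs; it does not model the branch point or the lariat, and the example sequences are made up.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove the given introns and join the remaining exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Enforce the GU-AG rule at the 5' and 3' splice sites.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} violates the GU-AG rule")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


exon1 = "AUGGCC"
intron = "GUAAGUUACUAACUUUUCAG"  # hypothetical intron, begins GU and ends AG
exon2 = "GGAUAA"
pre_mrna = exon1 + intron + exon2

mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mature)
```

The mature transcript is simply the two exons joined, and an interval that does not start with GU and end with AG is rejected.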
Epigenetic Regulation
DNA Methylation
- Context: Typically CpG dinucleotides
- Effect: Generally repressive to transcription
- Maintenance: Preserved during DNA replication
Histone Modifications
- Acetylation: Generally activating (neutralizes positive charge)
- Methylation: Can activate or repress (context-dependent)
- Phosphorylation: Often involved in DNA damage response
Molecular Techniques
PCR (Polymerase Chain Reaction)
PCR Process
- Denaturation: 94-98°C (DNA strands separate)
- Annealing: 50-65°C (primers bind)
- Extension: 72°C (DNA synthesis by Taq polymerase)
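Each cycle of denaturation, annealing, and extension can at most double the number of target molecules, so amplification is exponential. A short sketch of that arithmetic, where the `efficiency` parameter (the fraction of templates copied per cycle) is an assumed simplification:

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of PCR cycles.

    efficiency = 1.0 means perfect doubling in every cycle.
    """
    return initial_copies * (1 + efficiency) ** cycles


for cycles in (10, 20, 30):
    ideal = pcr_copies(1, cycles)
    realistic = pcr_copies(1, cycles, efficiency=0.9)
    print(f"{cycles} cycles: ideal {ideal:.3g}, 90% efficiency {realistic:.3g}")
```

Thirty ideal cycles turn a single template into roughly a billion copies, which is why PCR can detect very small amounts of starting DNA.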
PCR Applications
- Diagnostic: Pathogen detection
- Research: Gene cloning, sequencing
- Forensic: DNA fingerprinting
Recombinant DNA Technology
Restriction Enzymes
- Palindromic recognition: 4-8 base pairs
- Sticky ends: Single-strand overhangs
- Blunt ends: Both strands cut at the same position, leaving no overhang
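Both properties can be illustrated with EcoRI, which recognizes GAATTC and cuts after the first G to leave sticky ends. The sketch below works on the top strand only and assumes a linear molecule; the `digest` helper and the test sequence are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site == site.translate(_COMPLEMENT)[::-1]


def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear top strand cut_offset bases into every occurrence of site."""
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


ECORI_SITE = "GAATTC"  # EcoRI cuts G^AATTC
dna = "AAAGAATTCTTTGAATTCGGG"

print(is_palindromic(ECORI_SITE))
print(digest(dna, ECORI_SITE, cut_offset=1))
```

Two recognition sites give three fragments, and each internal fragment begins with AATTC, the top-strand part of the overhang. Setting `cut_offset` to the middle of a site would model a blunt cutter instead.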
Modern Developments
CRISPR-Cas Systems
- Guide RNA: Directs Cas nuclease to target
- PAM sequence: Required for recognition
- Versatility: Can target nearly any genomic sequence that lies next to a suitable PAM
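The PAM requirement is what limits where a guide RNA can be designed. The sketch below assumes the commonly used SpCas9 convention (an NGG PAM immediately 3' of a 20-nucleotide protospacer) and scans only the strand given; other Cas nucleases use different PAMs, and the function name and example sequence are illustrative.

```python
import re


def find_targets(sequence: str, protospacer_length: int = 20):
    """List (protospacer, PAM, PAM position) for every NGG PAM on this strand."""
    targets = []
    # A lookahead lets overlapping PAM candidates be found.
    for match in re.finditer(r"(?=([ACGT]GG))", sequence):
        pam_start = match.start()
        if pam_start >= protospacer_length:  # need room for a full protospacer
            protospacer = sequence[pam_start - protospacer_length:pam_start]
            targets.append((protospacer, match.group(1), pam_start))
    return targets


dna = "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
for protospacer, pam, position in find_targets(dna):
    print(f"protospacer {protospacer}  PAM {pam} at index {position}")
```

A real design tool would also scan the reverse complement and score off-target matches; this example only shows why a target must sit next to a PAM.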
Single-cell Analysis
- Single-cell RNA-seq: Transcriptome of individual cells
- Spatial transcriptomics: Location-specific gene expression
- Lineage tracing: Cell fate determination
Real-World Application: Antibiotic Resistance Mechanisms
Antibiotic resistance provides a practical example of molecular biology principles in action.
Mechanism Analysis
# Antibiotic resistance mechanisms at the molecular level
antibiotic_resistance = {
    'ampicillin_resistance': {
        'mechanism': 'Beta-lactamase production',
        'gene': 'bla',
        'protein': 'Beta-lactamase enzyme',
        'function': 'Hydrolyzes beta-lactam ring'
    },
    'tetracycline_resistance': {
        'mechanism': 'Efflux pump expression',
        'gene': 'tetA',
        'protein': 'Tetracycline efflux protein',
        'function': 'Pumps antibiotic out of cell'
    },
    'kanamycin_resistance': {
        'mechanism': 'Enzymatic modification',
        'gene': "aph(3')-II",
        'protein': 'Aminoglycoside phosphotransferase',
        'function': 'Phosphorylates antibiotic, preventing binding'
    }
}

# Calculate mutation rates affecting resistance (illustrative values)
mutation_rate = 1e-10  # per base pair per generation (order of magnitude for E. coli)
genome_size = 4.6e6  # base pairs for E. coli
per_genome_rate = mutation_rate * genome_size  # ~4.6e-4 mutations per genome per generation

# Estimate time to the first mutation in a single lineage.
# Simplification: every mutation is counted, although only a small
# fraction of mutations actually confer resistance.
bacterial_generations_per_day = 12  # illustrative; ideal growth allows more
resistance_probability = 1 - (1 - per_genome_rate)**bacterial_generations_per_day  # probability per day
print(f"Estimated bacterial mutations per genome per generation: {per_genome_rate:.2e}")
print(f"Resistance development probability per day: {resistance_probability:.2e}")
print(f"Average time to first resistance mutation: {1/resistance_probability/365:.1f} years (single lineage)")

# Calculate selection pressure effects (a toy score, not a measured coefficient)
drug_concentration = 10  # relative to MIC
fitness_cost = 0.05  # 5% fitness cost for resistance gene
selection_coefficient = drug_concentration * (1 - fitness_cost)
print(f"Selection coefficient with drug pressure: {selection_coefficient:.2f}")
print("This demonstrates how antibiotic use accelerates resistance evolution")

# Molecular mechanism of beta-lactam resistance
print("\nBeta-lactamase mechanism:")
print(f" - Gene: {antibiotic_resistance['ampicillin_resistance']['gene']}")
print(f" - Function: {antibiotic_resistance['ampicillin_resistance']['function']}")
print(" - Result: Antibiotic inactivation through hydrolysis")
Evolutionary Implications
Understanding molecular mechanisms helps explain the rapid evolution of antibiotic resistance.
Your Challenge: Gene Expression Analysis
Analyze the regulation of a hypothetical gene and predict how mutations would affect expression levels.
Goal: Use molecular biology principles to analyze gene regulation and predict expression outcomes.
Gene Regulatory Sequence
# Hypothetical gene regulatory region
gene_data = {
    'promoter_strength': 0.8,  # Relative strength (0-1)
    'operator_sites': [
        {'type': 'activator', 'affinity': 0.9},  # High affinity binding site
        {'type': 'repressor', 'affinity': 0.6}   # Medium affinity binding site
    ],
    'upstream_enhancer': True,
    'polyadenylation_signal': 'AATAAA',
    'splice_sites': {
        'donor': 'GTATGGT',
        'acceptor': 'CAGG',
        'branch_point': 'TACTAAC'
    },
    'length': 8500  # base pairs (including introns)
}

# Calculate expression level based on regulatory elements
basal_expression = gene_data['promoter_strength'] * 100  # arbitrary units

# Calculate activator effect (positive regulation)
activator_affinity = gene_data['operator_sites'][0]['affinity']
activator_effect = basal_expression * activator_affinity * 0.5  # 50% increase potential

# Calculate repressor effect (negative regulation)
repressor_affinity = gene_data['operator_sites'][1]['affinity']
repressor_effect = basal_expression * repressor_affinity * 0.3  # 30% decrease potential

# Calculate net expression level
net_expression = basal_expression + activator_effect - repressor_effect

# Calculate splicing and processing efficiency
splice_strength = 0.85  # efficiency factor
processing_efficiency = 0.9 if gene_data['polyadenylation_signal'] == 'AATAAA' else 0.6

# Calculate mature mRNA abundance
mature_mrna = net_expression * splice_strength * processing_efficiency

# Calculate protein production (assuming 100% translation efficiency)
mrna_half_life = 4  # hours (illustrative value for a spliced, polyadenylated mRNA)
protein_synthesis_rate = mature_mrna * 10  # 10 proteins per mRNA per hour

# Simulate mutation effects.
# Simplification: every mutation is modeled as a change in effective
# promoter strength; activator and repressor effects keep their wild-type values.
mutations_to_test = [
    {'name': 'promoter_mutation', 'effect': -0.3},  # 30% decrease in promoter strength
    {'name': 'enhancer_deletion', 'effect': -0.2},  # 20% decrease for enhancer
    {'name': 'polyA_mutation', 'effect': -0.4}      # 40% decrease for polyA processing
]
expression_effects = {}
for mutation in mutations_to_test:
    mutated_strength = max(0.1, gene_data['promoter_strength'] + mutation['effect'])
    mutated_expression = mutated_strength * 100 + activator_effect - repressor_effect
    expression_effects[mutation['name']] = mutated_expression
Analyze the gene regulation system and predict the effects of mutations on expression levels.
Hint:
- Consider how regulatory elements (promoter, operator, enhancer) affect transcription
- Calculate the combined effects of activators and repressors
- Evaluate the impact of post-transcriptional modifications
- Estimate protein production from mRNA levels
# TODO: Calculate gene expression parameters
basal_expression_level = 0  # Arbitrary units (0-100 scale)
net_regulation_effect = 0   # Combined activator/repressor effect
mature_mrna_amount = 0      # Molecules per cell
protein_concentration = 0   # Molecules per cell
half_life_hours = 0         # RNA stability
expression_fold_change = 0  # Effect of regulatory mutations

# Calculate basal expression from promoter
basal_expression_level = gene_data['promoter_strength'] * 100

# Calculate net regulation (activator + repressor effects)
activator_contribution = gene_data['operator_sites'][0]['affinity'] * 50  # Scale factor
repressor_contribution = gene_data['operator_sites'][1]['affinity'] * 30  # Scale factor
net_regulation_effect = activator_contribution - repressor_contribution

# Calculate mature mRNA considering processing efficiency
processing_efficiency = 0.9 if gene_data['polyadenylation_signal'] == 'AATAAA' else 0.6
mature_mrna_amount = (basal_expression_level + net_regulation_effect) * processing_efficiency

# Calculate protein concentration (assuming 5 proteins per mRNA)
protein_concentration = mature_mrna_amount * 5

# Calculate fold change with mutations
control_expression = mature_mrna_amount
mutant_expression = control_expression * 0.7  # Example with 30% reduction
expression_fold_change = mutant_expression / control_expression

# RNA half-life calculation
if gene_data['polyadenylation_signal'] == 'AATAAA':
    half_life_hours = 4  # Stable message
else:
    half_life_hours = 1  # Less stable

# Print results
print(f"Basal expression level: {basal_expression_level:.1f} units")
print(f"Net regulation effect: {net_regulation_effect:.1f} units")
print(f"Mature mRNA amount: {mature_mrna_amount:.1f} molecules/cell")
print(f"Protein concentration: {protein_concentration:.1f} molecules/cell")
print(f"RNA half-life: {half_life_hours} hours")
print(f"Expression fold change with mutations: {expression_fold_change:.2f}")

# Regulatory assessment
if expression_fold_change < 0.5:
    regulation_type = "Strong downregulation"
elif expression_fold_change < 0.8:
    regulation_type = "Moderate downregulation"
elif expression_fold_change > 2.0:
    regulation_type = "Strong upregulation"
else:
    regulation_type = "Normal regulation"
print(f"Regulation assessment: {regulation_type}")
What would be the most effective strategy to increase expression of this gene for protein production purposes?
Self-Examination
What are the key differences between DNA replication, transcription, and translation?
How does the lac operon regulate gene expression in bacteria?
What is the role of RNA splicing in eukaryotic gene expression?