Molecular Biology

Molecular biology is the study of biological activity at the molecular level, particularly focusing on the interactions between DNA, RNA, and proteins that drive cellular processes. Understanding these fundamental processes is essential for comprehending how genetic information is stored, expressed, and regulated in living organisms.

DNA Structure and Replication

DNA Structure

The double helix structure of DNA was elucidated by Watson and Crick in 1953:

\text{DNA Structure} = \text{Antiparallel strands} + \text{Complementary base pairing} + \text{Right-handed helix}

Base Pairing Rules

Adenine (A) pairs with Thymine (T) via 2 hydrogen bonds
Guanine (G) pairs with Cytosine (C) via 3 hydrogen bonds

The stability relationship (a qualitative trend; base stacking also contributes to duplex stability):

\text{Stability} \propto \text{GC content} \times \text{number of hydrogen bonds}
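As a rough illustration of how GC content relates to duplex stability, the sketch below computes the GC fraction of a sequence and a Wallace-rule melting temperature estimate, Tm ≈ 2·(A+T) + 4·(G+C) °C. The Wallace rule is only a rule of thumb for short oligonucleotides, and the function names `gc_content` and `wallace_tm` are illustrative choices, not part of the original material.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Estimate melting temperature in deg C with the Wallace rule (short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Compare an AT-only and a GC-only oligo of the same length
for oligo in ("ATATATATATATAT", "GCGCGCGCGCGCGC"):
    print(oligo, f"GC={gc_content(oligo):.2f}", f"Tm~{wallace_tm(oligo)} C")
```

Under this approximation the GC-only oligo gets twice the estimated melting temperature of the AT-only one, which mirrors the trend stated above.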

DNA Strands

5' to 3' direction: Phosphate group at 5', hydroxyl at 3'
Antiparallel: The two strands run in opposite directions
Complementarity: The sequence of one strand determines the sequence of the other
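Complementarity and antiparallel orientation can be sketched in a few lines of code. The helper names below are hypothetical; the example assumes an unambiguous DNA alphabet (A, C, G, T only).

```python
# Translation table implementing the base pairing rules: A<->T, G<->C
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the complementary strand, read in the same left-to-right order."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the partner strand written 5' to 3' (complement, then reverse)."""
    return complement(seq)[::-1]


strand = "ATGGCC"
print("5'-" + strand + "-3'")
print("3'-" + complement(strand) + "-5'")            # partner strand, antiparallel
print("5'-" + reverse_complement(strand) + "-3'")    # same partner, written 5' to 3'
```

Because the strands are antiparallel, the partner strand has to be reversed before it can be written in the conventional 5' to 3' direction.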

DNA Replication

DNA replication is semiconservative, meaning each new DNA molecule consists of one original strand and one newly synthesized strand:

\text{DNA} \xrightarrow{\text{DNA Polymerase}} \text{DNA (50\% parental, 50\% new)}
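The semiconservative model makes a simple quantitative prediction for successive rounds of replication. The sketch below, with the hypothetical function name `strand_composition`, counts duplex types starting from one fully parental duplex and assumes every duplex replicates exactly once per generation.

```python
def strand_composition(generations: int) -> dict:
    """Count duplex types after repeated semiconservative replication,
    starting from a single fully parental duplex."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    total = 2 ** generations
    # The two original strands persist, each paired with a new strand
    hybrid = 2 if generations >= 1 else 0
    fully_parental = 1 if generations == 0 else 0
    fully_new = total - hybrid - fully_parental
    return {
        "total_duplexes": total,
        "hybrid": hybrid,
        "fully_new": fully_new,
        "parental_strand_fraction": 1 / total,  # 2 parental strands out of 2 * total
    }


for n in range(4):
    print(n, strand_composition(n))
```

After one round every duplex is a hybrid (50% parental strands); in later rounds the two hybrid duplexes remain while fully new duplexes accumulate.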

Replication Process

Initiation
- DNA helicase unwinds the double helix
- Single-strand binding proteins (SSBs) stabilize unwound DNA
- Primase synthesizes RNA primers
Elongation
- DNA polymerase III adds nucleotides in 5' to 3' direction
- Leading strand: Continuous synthesis
- Lagging strand: Discontinuous synthesis (Okazaki fragments)
Termination
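- RNA primers removed (by the 5' to 3' exonuclease activity of DNA polymerase I in bacteria; by RNase H and FEN1 in eukaryotes)
- Gaps filled by DNA polymerase I (bacteria) or DNA polymerase δ (eukaryotes)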
- DNA ligase seals nicks

Replication Enzymes

Enzyme	Function
DNA Helicase	Unwinds double helix
DNA Polymerase III	Main replication enzyme (prokaryotes)
DNA Polymerase α, δ, ε	Replication enzymes (eukaryotes)
Primase	Synthesizes RNA primers
DNA Ligase	Joins Okazaki fragments
Topoisomerase	Relieves supercoiling ahead of the replication fork

Proofreading and Repair

\text{Error rate} \approx 1 \times 10^{-10} \text{ per base pair per replication (with proofreading and mismatch repair)}

3' to 5' exonuclease: Immediate error correction
Mismatch repair: Post-replication correction
Nucleotide excision repair: Damage-induced repair
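To put the quoted error rate in perspective, the following sketch estimates how many errors to expect when a genome is copied once. It assumes errors are independent and equally likely at every position, and it uses the E. coli genome size of 4.6 million base pairs as an example; the function names are illustrative.

```python
import math


def expected_errors(error_rate: float, genome_size: float) -> float:
    """Expected number of replication errors per genome copy."""
    return error_rate * genome_size


def prob_error_free(error_rate: float, genome_size: float) -> float:
    """Probability that one genome copy contains no errors at all.

    Computes (1 - error_rate) ** genome_size via log1p for numerical stability.
    """
    return math.exp(genome_size * math.log1p(-error_rate))


genome_size = 4.6e6   # base pairs (E. coli, used here as an example)
error_rate = 1e-10    # per base pair, the figure quoted above

print(f"Expected errors per replication: {expected_errors(error_rate, genome_size):.2e}")
print(f"Probability of an error-free copy: {prob_error_free(error_rate, genome_size):.6f}")
```

At this error rate, a new error appears only about once in every few thousand genome copies.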

Transcription

RNA Polymerase Mechanism

Transcription converts DNA sequence information into RNA:

\text{DNA (template)} \xrightarrow{\text{RNA Polymerase}} \text{mRNA}

Transcription Process

Initiation
- RNA polymerase binds promoter region
- Transcription factors assist binding
- DNA unwinds to form transcription bubble
Elongation
- RNA polymerase moves 3' to 5' along template DNA
- RNA transcript grows 5' to 3'
- DNA helix reforms behind polymerase
Termination
- Rho-independent (intrinsic) termination (prokaryotes): GC-rich hairpin followed by a U-rich sequence
- Rho-dependent termination (prokaryotes): Rho protein factor releases the transcript
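The directionality of elongation can be made concrete with a small sketch: the transcript is complementary to the template strand and antiparallel to it, so it matches the coding strand with U in place of T. The function name `transcribe` and the convention that the template is supplied written 5' to 3' are assumptions made for this example.

```python
# Pairing used during transcription: template A->U, C->G, G->C, T->A
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_5to3: str) -> str:
    """Return the mRNA (5' to 3') made from a template strand written 5' to 3'.

    The polymerase reads the template 3' to 5', so the complemented
    sequence is reversed to give the transcript in 5' to 3' order.
    """
    return template_5to3.upper().translate(DNA_TO_RNA)[::-1]


coding_strand = "ATGGCC"      # 5' to 3'
template_strand = "GGCCAT"    # 5' to 3', reverse complement of the coding strand
print(transcribe(template_strand))
```

The result equals the coding strand with T replaced by U, which is why gene sequences are conventionally reported as the coding strand.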

Prokaryotic vs. Eukaryotic Transcription

Feature	Prokaryotes	Eukaryotes
Location	Cytoplasm	Nucleus
RNA Polymerases	Single enzyme	RNA Pol I, II, III
Coupling	Transcription/translation coupled	Sequential processes
Processing	Minimal	Extensive processing

Translation

The Genetic Code

The genetic code is degenerate and nearly universal:

\text{64 codons} \rightarrow \text{20 amino acids} + \text{start/stop signals}

Code Characteristics

Degenerate: Multiple codons for single amino acid
Universal: Conserved across nearly all organisms (mitochondria and a few species use variant codes)
Non-overlapping: Each base read once
Commaless: No punctuation between codons
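Degeneracy can be checked directly by building the standard codon table and counting how many codons map to each amino acid. The sketch below uses the conventional compact encoding of the standard code (bases ordered U, C, A, G; `*` marks a stop codon); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons assigned to each amino acid (or stop signal)
degeneracy = Counter(CODON_TABLE.values())

print(f"Total codons: {len(CODON_TABLE)}")
print(f"Stop codons: {degeneracy['*']}")
for aa, count in sorted(degeneracy.items(), key=lambda item: -item[1]):
    print(aa, count)
```

Methionine and tryptophan each have a single codon, while leucine, serine, and arginine have six.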

Translation Process

Initiation
- Small ribosomal subunit binds mRNA
- tRNA carrying methionine (fMet in prokaryotes) binds start codon (AUG)
- Large ribosomal subunit joins, forming complete ribosome
Elongation
- A site: Accepts incoming aminoacyl-tRNA
- P site: Holds growing peptide chain
- E site: Exit for uncharged tRNA
- Peptide bond formation: Peptidyl transferase activity
Termination
- Stop codons (UAA, UAG, UGA) recognized by release factors
- Polypeptide released from ribosome
- Ribosome subunits dissociate
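The three stages above can be sketched as a simple reading loop: find the first AUG, read one codon at a time, and stop at a stop codon. This is a deliberately simplified model (it ignores ribosome binding sites, the Kozak context, and tRNA availability), and the function name `translate` is an illustrative choice.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order ('*' = stop codon)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")             # initiation: locate the start codon
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]   # elongation: one codon per step
        if amino_acid == "*":                     # termination: stop codon reached
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCCUUUUAAGG"))
```

The example message contains the codons AUG, GCC, UUU, and UAA after a two-base leader, so the peptide is Met-Ala-Phe.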

tRNA Structure and Function

\text{tRNA} = \text{Amino acid attachment} + \text{Anticodon loop} + \text{Secondary structure}

Anticodon: 3-nucleotide sequence complementary to mRNA codon
Amino acid attachment: At 3' end (CCA sequence)
Secondary structure: Cloverleaf formation stabilized by H-bonds

Gene Regulation

Prokaryotic Gene Regulation: The Lac Operon

The lac operon is a classic model of gene regulation:

\text{Lac Operon} = \text{Promoter} + \text{Operator} + \text{Structural genes} + \text{Regulatory gene}

Components

lacZ: β-galactosidase (breaks down lactose)
lacY: Permease (lactose transport)
lacA: Transacetylase (lactose metabolism)
lacI: Repressor gene (regulates operon)

Regulation Mechanism

Negative control: Repressor binding blocks transcription
Induction: Allolactose, an isomer formed from lactose, binds and inactivates the repressor
Catabolite repression: Glucose lowers cAMP levels, so the cAMP-CRP complex no longer activates the lac promoter
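The combined logic of repressor control and catabolite repression can be summarized as a small truth table. The sketch below is a qualitative Boolean model; the function name and the three output labels are assumptions for illustration, and "off" stands for the very low basal level seen when the repressor is bound.

```python
def lac_operon_expression(lactose: bool, glucose: bool) -> str:
    """Qualitative lac operon output for a given sugar environment."""
    repressor_bound = not lactose     # without inducer the repressor sits on the operator
    camp_crp_active = not glucose     # low glucose -> high cAMP -> active cAMP-CRP
    if repressor_bound:
        return "off"
    return "high" if camp_crp_active else "low"


for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon_expression(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```

Strong expression requires both conditions at once: lactose present to release the repressor and glucose absent to activate cAMP-CRP.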

Eukaryotic Gene Regulation

Transcriptional Control

Chromatin Remodeling
- Histone modifications: Acetylation, methylation
- DNA methylation: Generally repressive
- Chromatin accessibility: Open vs. closed domains
Transcription Factors
- General TFs: Basal transcription machinery
- Specific TFs: Bind enhancers and silencers
- Coactivators/corepressors: Modulate TF activity

Post-transcriptional Control

RNA Processing
- 5' capping: Protection and ribosome binding
- 3' polyadenylation: Stability and transport
- Splicing: Intron removal and exon joining
Alternative Splicing

\text{Exon selection} = f(\text{spliceosome}, \text{SR proteins}, \text{hnRNP proteins})

Post-translational Control

Protein modifications: Phosphorylation, glycosylation
Protein degradation: Ubiquitin-proteasome pathway
Regulatory proteins: Control protein activity

Advanced Topics in Gene Expression

RNA Processing in Eukaryotes

Pre-mRNA Splicing

\text{Primary transcript} \xrightarrow{\text{Spliceosome}} \text{Mature mRNA}

The spliceosome removes introns and joins exons:

Splice sites: Conserved sequences (GU-AG rule)
Branch point: Critical for splicing reaction
Lariat intermediate: Intron structure during splicing
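Intron removal can be sketched as string surgery on a pre-mRNA, with the GU-AG rule used as a sanity check on each intron. The function name `splice`, the coordinate convention (0-based, end-exclusive), and the example sequences are all hypothetical; real splice-site recognition depends on much more than the two terminal dinucleotides.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as sorted, non-overlapping (start, end) pairs
    and join the remaining exons."""
    exons = []
    position = 0
    for start, end in introns:
        intron = pre_mrna[start:end]
        # GU-AG rule: introns begin with GU and end with AG
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} violates the GU-AG rule")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


exon1 = "AUGGCC"
intron = "GUAAGUCUACUAACUUUCAG"   # made-up intron containing a UACUAAC branch point
exon2 = "UUUUAA"
pre_mrna = exon1 + intron + exon2
intron_coords = [(len(exon1), len(exon1) + len(intron))]

print(splice(pre_mrna, intron_coords))
```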

Epigenetic Regulation

DNA Methylation

\text{Cytosine} \xrightarrow{\text{DNA methyltransferase}} \text{5-methylcytosine}

Context: Typically CpG dinucleotides
Effect: Generally repressive to transcription
Maintenance: Preserved during DNA replication
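Because methylation mostly occurs at CpG dinucleotides, a common first step is to measure how CpG-rich a sequence is. The sketch below counts CpG sites and computes a widely used observed/expected ratio, (number of CpG × length) / (number of C × number of G). The function name `cpg_stats` is an illustrative choice.

```python
def cpg_stats(seq: str) -> tuple[int, float]:
    """Return (CpG count, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    cpg_count = seq.count("CG")     # "CG" cannot overlap itself, so count() is exact
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return cpg_count, 0.0
    ratio = (cpg_count * len(seq)) / (c_count * g_count)
    return cpg_count, ratio


for sequence in ("CGCGCGATCGCG", "CCCAGGGTTACA"):
    count, ratio = cpg_stats(sequence)
    print(sequence, f"CpG sites={count}", f"obs/exp={ratio:.2f}")
```

A ratio well below 1 indicates CpG depletion, which is typical of bulk vertebrate DNA, while CpG islands near promoters have much higher ratios.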

Histone Modifications

Acetylation: Generally activating (neutralizes positive charge)
Methylation: Can activate or repress (context-dependent)
Phosphorylation: Often involved in DNA damage response

Molecular Techniques

PCR (Polymerase Chain Reaction)

\text{Target DNA} \xrightarrow{\text{Thermal Cycling}} \text{Exponential amplification}

PCR Process

Denaturation: 94-98°C (DNA strands separate)
Annealing: 50-65°C (primers bind)
Extension: 72°C (DNA synthesis by Taq polymerase)
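The exponential character of PCR follows from each cycle at most doubling the number of target molecules. The sketch below uses the standard relation N = N0 · (1 + E)^n, where E is the per-cycle efficiency between 0 and 1; the function name and the 90% efficiency figure are assumptions chosen for illustration.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Number of target copies after a given number of PCR cycles.

    efficiency = 1.0 means perfect doubling in every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1 + efficiency) ** cycles


for cycles in (10, 20, 30):
    ideal = pcr_copies(1, cycles)
    realistic = pcr_copies(1, cycles, efficiency=0.9)   # assumed 90% efficiency
    print(f"{cycles} cycles: ideal={ideal:.3g}, at 90% efficiency={realistic:.3g}")
```

In practice amplification plateaus in late cycles as primers and nucleotides are depleted, so this formula only describes the exponential phase.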

PCR Applications

Diagnostic: Pathogen detection
Research: Gene cloning, sequencing
Forensic: DNA fingerprinting

Recombinant DNA Technology

Restriction Enzymes

\text{DNA} \xrightarrow{\text{Restriction enzyme}} \text{Specific recognition sequence cleavage}

Palindromic recognition: 4-8 base pairs
Sticky ends: Single-strand overhangs left by staggered cuts
Blunt ends: Both strands cut at the same position, leaving no overhang
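A restriction digest can be sketched as a search for the recognition sequence followed by a cut at a fixed offset. The example uses EcoRI, which recognizes GAATTC and cuts after the first G (G^AATTC), producing sticky ends. Only the top strand is modeled, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    site = site.upper()
    return site == site.translate(COMPLEMENT)[::-1]


def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut the top strand at every occurrence of `site`.

    cut_offset is the number of bases of the site left on the upstream fragment
    (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    fragments = []
    start = 0
    position = seq.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        position = seq.find(site, position + 1)
    fragments.append(seq[start:])
    return fragments


print(is_palindromic("GAATTC"))
print(digest("AAGAATTCGGGAATTCTT"))
```

Two recognition sites in a linear molecule give three fragments; each internal fragment begins with the AATTC left behind by the staggered cut.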

Modern Developments

CRISPR-Cas Systems

Guide RNA: Directs Cas nuclease to target
PAM sequence: Required for recognition
Versatility: Can target nearly any genomic sequence that lies next to a PAM
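The PAM requirement translates directly into a search procedure for candidate target sites. The sketch below assumes the commonly used SpCas9 nuclease, whose PAM is 5'-NGG-3' located immediately 3' of a 20-nucleotide protospacer. It scans only the given strand and does not score off-target risk; the function name is hypothetical.

```python
def find_cas9_targets(seq: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every NGG PAM preceded by a full protospacer."""
    seq = seq.upper()
    targets = []
    # i is the position of the first PAM base; the protospacer ends just before it
    for i in range(spacer_len, len(seq) - 2):
        pam = seq[i:i + 3]
        if pam[1:] == "GG":
            targets.append((i - spacer_len, seq[i - spacer_len:i], pam))
    return targets


example = "ACGTACGTACGTACGTACGT" + "AGG" + "CTTA"
for start, protospacer, pam in find_cas9_targets(example):
    print(start, protospacer, pam)
```

A complete design tool would also scan the reverse complement strand and filter candidates by predicted specificity.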

Single-cell Analysis

Single-cell RNA-seq: Transcriptome of individual cells
Spatial transcriptomics: Location-specific gene expression
Lineage tracing: Cell fate determination

Real-World Application: Antibiotic Resistance Mechanisms

Antibiotic resistance provides a practical example of molecular biology principles in action.

Mechanism Analysis

# Antibiotic resistance mechanisms at molecular level
antibiotic_resistance = {
    'ampicillin_resistance': {
        'mechanism': 'Beta-lactamase production',
        'gene': 'bla',
        'protein': 'Beta-lactamase enzyme',
        'function': 'Hydrolyzes beta-lactam ring'
    },
    'tetracycline_resistance': {
        'mechanism': 'Efflux pump expression',
        'gene': 'tetA',
        'protein': 'Tetracycline efflux protein',
        'function': 'Pumps antibiotic out of cell'
    },
    'kanamycin_resistance': {
        'mechanism': 'Enzymatic modification',
        'gene': 'aph(3\')-II',
        'protein': 'Aminoglycoside phosphotransferase',
        'function': 'Phosphorylates antibiotic, preventing binding'
    }
}

# Calculate mutation rates affecting resistance
mutation_rate = 1e-10  # per base pair per generation (same order as the replication error rate above)
genome_size = 4.6e6  # base pairs for E. coli
per_genome_rate = mutation_rate * genome_size  # ~4.6e-4 mutations per genome per generation

# Estimate time to resistance development
# Upper bound: treats every new mutation as a potential resistance mutation
bacterial_generations_per_day = 12  # assuming ideal growth
resistance_probability = 1 - (1 - per_genome_rate)**bacterial_generations_per_day  # probability per day, per cell lineage

print(f"Estimated bacterial mutations per genome per generation: {per_genome_rate:.2e}")
print(f"Resistance development probability per day: {resistance_probability:.2e}")
print(f"Average time to first resistance mutation: {1/resistance_probability/365:.1f} years (single lineage, ideal conditions)")

# Calculate selection pressure effects (toy index, not a formal selection coefficient)
drug_concentration = 10  # relative to MIC
fitness_cost = 0.05  # 5% fitness cost for resistance gene
selection_coefficient = drug_concentration * (1 - fitness_cost)

print(f"Selection coefficient with drug pressure: {selection_coefficient:.2f}")
print("This demonstrates how antibiotic use accelerates resistance evolution")

# Molecular mechanism of beta-lactam resistance
print(f"\nBeta-lactamase mechanism:")
print(f"  - Gene: {antibiotic_resistance['ampicillin_resistance']['gene']}")
print(f"  - Function: {antibiotic_resistance['ampicillin_resistance']['function']}")
print(f"  - Result: Antibiotic inactivation through hydrolysis")

Evolutionary Implications

Understanding molecular mechanisms helps explain the rapid evolution of antibiotic resistance.

Your Challenge: Gene Expression Analysis

Analyze the regulation of a hypothetical gene and predict how mutations would affect expression levels.

Goal: Use molecular biology principles to analyze gene regulation and predict expression outcomes.

Gene Regulatory Sequence

import math

# Hypothetical gene regulatory region
gene_data = {
    'promoter_strength': 0.8,  # Relative strength (0-1)
    'operator_sites': [
        {'type': 'activator', 'affinity': 0.9},   # High affinity binding site
        {'type': 'repressor', 'affinity': 0.6}    # Medium affinity binding site
    ],
    'upstream_enhancer': True,
    'polyadenylation_signal': 'AATAAA',
    'splice_sites': {
        'donor': 'GTATGGT',
        'acceptor': 'CAGG',
        'branch_point': 'TACTAAC'
    },
    'length': 8500  # base pairs (including introns)
}

# Calculate expression level based on regulatory elements
basal_expression = gene_data['promoter_strength'] * 100  # arbitrary units

# Calculate activator effect (positive regulation)
activator_affinity = gene_data['operator_sites'][0]['affinity']
activator_effect = basal_expression * activator_affinity * 0.5  # 50% increase potential

# Calculate repressor effect (negative regulation)  
repressor_affinity = gene_data['operator_sites'][1]['affinity']
repressor_effect = basal_expression * repressor_affinity * 0.3  # 30% decrease potential

# Calculate net expression level
net_expression = basal_expression + activator_effect - repressor_effect

# Calculate splicing efficiency
splice_strength = 0.85  # efficiency factor
processing_efficiency = 0.9 if gene_data['polyadenylation_signal'] == 'AATAAA' else 0.6

# Calculate mature mRNA abundance
mature_mrna = net_expression * splice_strength * processing_efficiency

# Calculate protein production (assuming 100% translation efficiency)
mrna_half_life = 4  # hours (this gene has introns and a polyA signal, so it is eukaryotic; bacterial mRNAs typically last minutes)
protein_synthesis_rate = mature_mrna * 10  # 10 proteins per mRNA per hour

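# Simulate mutation effects; each mutation acts on the element it targets
mutations_to_test = [
    {'name': 'promoter_mutation', 'target': 'promoter', 'effect': -0.3},       # promoter strength lowered by 0.3
    {'name': 'enhancer_deletion', 'target': 'transcription', 'effect': -0.2},  # 20% less transcription without the enhancer
    {'name': 'polyA_mutation', 'target': 'processing', 'effect': -0.4}         # 40% less efficient polyA processing
]

expression_effects = {}
for mutation in mutations_to_test:
    strength = gene_data['promoter_strength']
    transcription = 1.0
    processing = processing_efficiency
    if mutation['target'] == 'promoter':
        strength = max(0.1, strength + mutation['effect'])
    elif mutation['target'] == 'transcription':
        transcription += mutation['effect']
    else:
        processing *= 1 + mutation['effect']
    mutated_expression = (strength * 100 + activator_effect - repressor_effect) * transcription
    # Store mature mRNA for the mutant so it is comparable to mature_mrna above
    expression_effects[mutation['name']] = mutated_expression * splice_strength * processing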

Analyze the gene regulation system and predict the effects of mutations on expression levels.

Hint:

Consider how regulatory elements (promoter, operator, enhancer) affect transcription
Calculate the combined effects of activators and repressors
Evaluate the impact of post-transcriptional modifications
Estimate protein production from mRNA levels

# TODO: Calculate gene expression parameters
basal_expression_level = 0  # Arbitrary units (0-100 scale)
net_regulation_effect = 0   # Combined activator/repressor effect
mature_mrna_amount = 0      # Molecules per cell
protein_concentration = 0   # Units per cell
half_life_hours = 0         # RNA stability
expression_fold_change = 0  # Effect of regulatory mutations

# Calculate basal expression from promoter
basal_expression_level = gene_data['promoter_strength'] * 100

# Calculate net regulation (activator + repressor effects)
activator_contribution = gene_data['operator_sites'][0]['affinity'] * 50  # Scale factor
repressor_contribution = gene_data['operator_sites'][1]['affinity'] * 30  # Scale factor
net_regulation_effect = activator_contribution - repressor_contribution

# Calculate mature mRNA considering processing efficiency
processing_efficiency = 0.9 if gene_data['polyadenylation_signal'] == 'AATAAA' else 0.6
mature_mrna_amount = (basal_expression_level + net_regulation_effect) * processing_efficiency

# Calculate protein concentration (assuming 5 proteins per mRNA)
protein_concentration = mature_mrna_amount * 5

# Calculate fold change with mutations
control_expression = mature_mrna_amount
mutant_expression = control_expression * 0.7  # Example with 30% reduction
expression_fold_change = mutant_expression / control_expression

# RNA half-life calculation
if gene_data['polyadenylation_signal'] == 'AATAAA':
    half_life_hours = 4  # Stable message
else:
    half_life_hours = 1  # Less stable

# Print results
print(f"Basal expression level: {basal_expression_level:.1f} units")
print(f"Net regulation effect: {net_regulation_effect:.1f} units")
print(f"Mature mRNA amount: {mature_mrna_amount:.1f} molecules/cell")
print(f"Protein concentration: {protein_concentration:.1f} molecules/cell")
print(f"RNA half-life: {half_life_hours} hours")
print(f"Expression fold change with mutations: {expression_fold_change:.2f}")

# Regulatory assessment
if expression_fold_change < 0.5:
    regulation_type = "Strong downregulation"
elif expression_fold_change < 0.8:
    regulation_type = "Moderate downregulation"
elif expression_fold_change > 2.0:
    regulation_type = "Strong upregulation"
else:
    regulation_type = "Normal regulation"
    
print(f"Regulation assessment: {regulation_type}")

What would be the most effective strategy to increase expression of this gene for protein production purposes?
