Genetics
Mendelian genetics, population genetics, and genomics.
Genetics is the study of heredity and genetic variation in living organisms. It encompasses the mechanisms by which traits are passed from parents to offspring, the mathematical principles governing genetic frequencies in populations, and the comprehensive analysis of entire genomes. Understanding genetics is fundamental to biotechnology, medicine, agriculture, and evolutionary biology.
Mendelian Genetics
Historical Foundations
Gregor Mendel (1866) established the fundamental principles of inheritance through his work with pea plants. His laws form the foundation of modern genetics:
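- Law of Segregation: Each individual carries two alleles of a gene, and these separate during gamete formation so that each gamete receives exactly one.
- Law of Independent Assortment: Alleles of different genes are distributed to gametes independently of one another. This holds for genes on different chromosomes or far apart on the same chromosome.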
Basic Genetic Concepts
Alleles and Genotypes
- Allele: Alternative forms of a gene (A, a)
- Homozygous: Two identical alleles (AA, aa)
- Heterozygous: Two different alleles (Aa)
- Genotype: Genetic constitution (AA, Aa, aa)
- Phenotype: Observable characteristics
Monohybrid Crosses
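For a single gene with complete dominance, a cross between two heterozygotes (Aa × Aa) gives a genotype ratio of 1 AA : 2 Aa : 1 aa and a phenotype ratio of 3 dominant : 1 recessive.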
Dihybrid Crosses
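For two independently assorting genes, a cross between two double heterozygotes (AaBb × AaBb) gives a phenotype ratio of 9 A_B_ : 3 A_bb : 3 aaB_ : 1 aabb.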
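The classic cross ratios can be checked by enumerating gametes, as in a Punnett square. The sketch below is illustrative only: the function names (`gametes`, `phenotype`, `cross`) and the genotype string format (two letters per gene, uppercase meaning dominant) are assumptions made for this example, and it assumes complete dominance and independent assortment.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """List the gametes of a genotype such as 'AaBb' (two letters per gene)."""
    genes = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    # One allele per gene goes into each gamete (Law of Segregation);
    # genes combine freely (Law of Independent Assortment).
    return ["".join(combo) for combo in product(*genes)]


def phenotype(gamete1, gamete2):
    """Phenotype label under complete dominance: uppercase masks lowercase."""
    return "".join(a if a.isupper() else b for a, b in zip(gamete1, gamete2))


def cross(parent1, parent2):
    """Count offspring phenotypes over all equally likely gamete pairings."""
    counts = Counter()
    for g1 in gametes(parent1):
        for g2 in gametes(parent2):
            counts[phenotype(g1, g2)] += 1
    return counts


mono = cross("Aa", "Aa")      # monohybrid cross
di = cross("AaBb", "AaBb")    # dihybrid cross
print(dict(mono))
print(dict(di))
```

Here the label 'A' stands for the dominant phenotype and 'a' for the recessive one; in the dihybrid case 'AB', 'Ab', 'aB', and 'ab' correspond to the four phenotype classes.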
Deviations from Mendelian Ratios
Incomplete Dominance
Codominance
Multiple Alleles
Epistasis
Population Genetics
Hardy-Weinberg Equilibrium
The Hardy-Weinberg principle describes the genotype frequencies expected at equilibrium in an ideal population:
$$p^2 + 2pq + q^2 = 1, \qquad p + q = 1$$
Where:
- $p$ = frequency of dominant allele
- $q$ = frequency of recessive allele
- $p^2$ = frequency of homozygous dominant genotype
- $2pq$ = frequency of heterozygous genotype
- $q^2$ = frequency of homozygous recessive genotype
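One common use of these relationships is estimating carrier (heterozygote) frequency from the observed frequency of the homozygous recessive genotype. The following is a minimal sketch that assumes the population is in Hardy-Weinberg equilibrium; the function name and the input value of 1 in 2,500 are illustrative assumptions, not data from this lesson.

```python
import math


def hardy_weinberg_from_recessive(freq_aa):
    """Infer allele and genotype frequencies from the aa frequency (q^2)."""
    q = math.sqrt(freq_aa)   # recessive allele frequency
    p = 1.0 - q              # dominant allele frequency
    return {"p": p, "q": q, "AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Illustrative input: 1 in 2,500 individuals shows the recessive phenotype.
freqs = hardy_weinberg_from_recessive(1 / 2500)
print(f"q = {freqs['q']:.3f}, carrier frequency 2pq = {freqs['Aa']:.4f}")
```

The three genotype frequencies returned always sum to 1, which is a quick sanity check on any Hardy-Weinberg calculation.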
Assumptions for Equilibrium
- No mutations
- No gene flow
- Large population size
- Random mating
- No natural selection
Testing Hardy-Weinberg Equilibrium
Chi-square Test
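$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where $O$ is the observed and $E$ is the expected number of individuals of each genotype. The test uses counts, not proportions.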
Factors Disrupting Equilibrium
Mutation
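- Forward mutation: $A \to a$ at rate $\mu$
- Reverse mutation: $a \to A$ at rate $\nu$
- Equilibrium: $\hat{q} = \frac{\mu}{\mu + \nu}$, where $\hat{q}$ is the equilibrium frequency of allele $a$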
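The approach to mutational equilibrium can be sketched by iterating the one-generation recurrence. This example assumes the usual two-allele model in which A mutates to a at a forward rate (called `mu` here) and a mutates back to A at a reverse rate (called `nu` here), with no other forces acting; the rate values are illustrative assumptions.

```python
def mutation_equilibrium(mu, nu):
    """Equilibrium frequency of allele a under forward (mu) and reverse (nu) mutation."""
    return mu / (mu + nu)


def next_q(q, mu, nu):
    """One generation of mutation: gain from A -> a, loss from a -> A."""
    return q + mu * (1 - q) - nu * q


def q_after(q0, mu, nu, generations):
    """Frequency of allele a after a number of generations of mutation only."""
    q = q0
    for _ in range(generations):
        q = next_q(q, mu, nu)
    return q


mu, nu = 1e-5, 1e-6                      # illustrative rates per generation
q_hat = mutation_equilibrium(mu, nu)
q_late = q_after(0.0, mu, nu, 100_000)
print(f"equilibrium q = {q_hat:.4f}, q after 100,000 generations = {q_late:.4f}")
```

Even after 100,000 generations the frequency is still well short of equilibrium, which illustrates why mutation alone is a weak force for changing allele frequencies.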
Migration (Gene Flow)
$$\Delta p = m(p_m - p)$$
Where $m$ is migration rate, $p_m$ is source population frequency, $p$ is target population frequency.
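A short sketch of one-way gene flow, assuming the standard continent-island model in which a fraction of the target population is replaced by migrants from a source population each generation. The function name and parameter names (`p_source`, `m`) and the numbers are assumptions chosen for illustration.

```python
def migrate(p, p_source, m, generations=1):
    """Allele frequency in the target population after repeated one-way migration."""
    for _ in range(generations):
        # Each generation the target moves a fraction m of the way to the source.
        p += m * (p_source - p)
    return p


p_one = migrate(0.2, 0.8, 0.1)          # after one generation
p_fifty = migrate(0.2, 0.8, 0.1, 50)    # after fifty generations
print(f"after 1 generation: {p_one:.3f}, after 50 generations: {p_fifty:.3f}")
```

The gap between the two populations shrinks by a factor of (1 - m) every generation, so the target frequency converges on the source frequency.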
Genetic Drift
$$\mathrm{Var}(\Delta p) = \frac{pq}{2N_e}$$
Where $N_e$ is effective population size.
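Drift is easiest to appreciate by simulation. The sketch below assumes a basic Wright-Fisher model (non-overlapping generations, binomial sampling of 2N allele copies, no mutation, migration, or selection); the function name, population sizes, and seed are illustrative assumptions.

```python
import random


def wright_fisher(p0, n_e, generations, rng):
    """Simulate the allele frequency trajectory in a diploid population of size n_e."""
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Draw 2 * n_e allele copies at random from the current gene pool.
        copies = sum(1 for _ in range(2 * n_e) if rng.random() < p)
        p = copies / (2 * n_e)
        trajectory.append(p)
    return trajectory


rng = random.Random(42)                       # fixed seed for reproducibility
small = wright_fisher(0.5, 10, 50, rng)       # small population: large fluctuations
large = wright_fisher(0.5, 1000, 50, rng)     # large population: small fluctuations
print(f"N=10 final p: {small[-1]:.3f}, N=1000 final p: {large[-1]:.3f}")
```

Running it with several seeds shows the expected pattern: frequencies in the small population wander widely and often reach 0 or 1, while the large population stays close to its starting frequency.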
Natural Selection
Fitness and Selection Coefficient
- Fitness (W): Relative reproductive success
- Selection coefficient (s): $s = 1 - W$
Selection Models
- Selection against recessive: $W_{AA} = 1$, $W_{Aa} = 1$, $W_{aa} = 1 - s$
- Selection against dominant: $W_{AA} = 1 - s$, $W_{Aa} = 1 - hs$, $W_{aa} = 1$ ($h$ = dominance coefficient)
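The effect of selection on allele frequency can be sketched with the standard one-locus, two-allele recurrence. This example assumes random mating and constant genotype fitnesses supplied by the caller; the function name and the fitness values (selection against the recessive homozygote with s = 0.1) are illustrative assumptions.

```python
def select(q, w_AA, w_Aa, w_aa):
    """Frequency of allele a after one generation of viability selection."""
    p = 1 - q
    # Mean fitness: genotype frequencies weighted by their fitnesses.
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # Allele a survives in half of the Aa individuals and in all aa individuals.
    return (p * q * w_Aa + q * q * w_aa) / mean_w


s = 0.1
q = 0.5
history = [q]
for _ in range(100):
    q = select(q, 1.0, 1.0, 1.0 - s)   # selection against the recessive homozygote
    history.append(q)
print(f"q after 1 generation: {history[1]:.4f}, after 100: {history[-1]:.4f}")
```

The decline slows as the recessive allele becomes rare, because most remaining copies are hidden in heterozygotes where selection cannot act on them.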
Quantitative Genetics
Polygenic Inheritance
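For traits controlled by multiple genes, the phenotypic variance $V_P$ is the sum of a genetic and an environmental component:
$$V_P = V_G + V_E$$
Components of Genetic Variance
$$V_G = V_A + V_D + V_I$$
- Additive variance ($V_A$): Due to average allelic effects
- Dominance variance ($V_D$): Due to interaction within loci
- Epistatic variance ($V_I$): Due to interaction between loci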
Heritability
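Narrow-sense heritability
$$h^2 = \frac{V_A}{V_P}$$
Broad-sense heritability
$$H^2 = \frac{V_G}{V_P}$$
Here $V_A$ is the additive variance, $V_G$ the total genetic variance, and $V_P$ the total phenotypic variance.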
Response to Selection
$$R = h^2 S$$
Where $R$ is response to selection and $S$ is selection differential.
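A small worked example ties heritability to the response to selection (the breeder's equation). It assumes the phenotypic variance is simply the sum of additive, dominance, epistatic, and environmental components, with no covariance or interaction terms; the function names and all numbers are hypothetical values chosen for easy arithmetic.

```python
def heritabilities(v_a, v_d, v_i, v_e):
    """Return (narrow-sense h2, broad-sense H2) from variance components."""
    v_g = v_a + v_d + v_i      # total genetic variance
    v_p = v_g + v_e            # total phenotypic variance
    return v_a / v_p, v_g / v_p


def response_to_selection(h2, population_mean, selected_mean):
    """Breeder's equation: R = h2 * S, with S the selection differential."""
    s = selected_mean - population_mean
    return h2 * s


# Hypothetical variance components summing to a phenotypic variance of 10.
h2, H2 = heritabilities(v_a=4.0, v_d=1.0, v_i=0.5, v_e=4.5)
# Hypothetical trait: population mean 100, mean of selected parents 110.
r = response_to_selection(h2, 100.0, 110.0)
print(f"h2 = {h2:.2f}, H2 = {H2:.2f}, expected response R = {r:.1f}")
```

With these numbers the offspring mean is expected to shift by 4 units, less than half of the 10-unit advantage of the selected parents, because only the additive fraction of the variance is transmitted predictably.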
Genomics
Genome Structure and Organization
Prokaryotic Genomes
- Size: Usually 0.5-10 Mb
- Structure: Typically a single circular chromosome
- Gene density: High (85-90% coding)
Eukaryotic Genomes
- Size: Variable (yeast: 12 Mb, humans: 3,200 Mb)
- Structure: Linear chromosomes in nucleus
- Gene density: Lower (only ~1.5% coding in humans)
Genomic Technologies
DNA Sequencing
Sanger Sequencing
Next-Generation Sequencing (NGS)
Genotyping Technologies
- SNP arrays: High-throughput genotyping
- Microarrays: Gene expression profiling
- Whole-genome sequencing: Complete genome analysis
Comparative Genomics
Synteny Analysis
Phylogenetics
Modern Genomic Applications
Whole Genome Analysis
Annotation
- Gene prediction: Identify protein-coding regions
- Functional assignment: Assign functions to genes
- Regulatory element identification: Promoters, enhancers, etc.
Variation Analysis
- SNPs: Single nucleotide polymorphisms
- Indels: Insertions/deletions
- CNVs: Copy number variations
- Structural variants: Large-scale rearrangements
Functional Genomics
Gene Expression Analysis
Epigenomics
- DNA methylation: Gene expression regulation
- ChIP-seq: Protein-DNA interactions
- ATAC-seq: Chromatin accessibility
Population Genomics
Genetic Diversity Measures
Heterozygosity
$$H = 1 - \sum_i p_i^2$$
Where $p_i$ is frequency of allele $i$.
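Expected heterozygosity at a locus is one minus the sum of squared allele frequencies, which can be computed directly from allele counts. The sketch below uses a hypothetical function name and made-up counts, and it omits the small-sample correction that some tools apply.

```python
def expected_heterozygosity(allele_counts):
    """Expected heterozygosity (gene diversity) from a dict of allele counts."""
    total = sum(allele_counts.values())
    return 1.0 - sum((count / total) ** 2 for count in allele_counts.values())


two_alleles = expected_heterozygosity({"A": 50, "a": 50})
three_alleles = expected_heterozygosity({"A1": 30, "A2": 30, "A3": 30})
skewed = expected_heterozygosity({"A": 95, "a": 5})
print(f"{two_alleles:.3f} {three_alleles:.3f} {skewed:.3f}")
```

Diversity is highest when alleles are equally common and rises with the number of alleles; a locus dominated by one allele contributes little.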
Nucleotide Diversity
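$$\pi = \frac{2}{n(n-1)} \sum_{i<j} d_{ij}$$
Where $d_{ij}$ is number of differences between sequences $i$ and $j$, and $n$ is the number of sequences sampled. Dividing by the sequence length gives the per-site value.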
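Nucleotide diversity can be sketched as the average number of pairwise differences among sampled sequences, divided by sequence length. This example assumes the sequences are already aligned, equal in length, and free of gaps or missing data; the function name and the toy sequences are illustrative assumptions.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site among aligned sequences."""
    n = len(sequences)
    length = len(sequences[0])
    total_differences = sum(
        sum(a != b for a, b in zip(seq1, seq2))   # differences for one pair
        for seq1, seq2 in combinations(sequences, 2)
    )
    pairs = n * (n - 1) / 2
    return total_differences / pairs / length


toy_alignment = ["ACGT", "ACGA", "TCGA"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

For the toy alignment the three pairs differ at 1, 2, and 1 sites, giving a mean of 4/3 differences over 4 sites.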
Population Structure
F-statistics (Fixation Indices)
- $F_{IS}$: Inbreeding within subpopulations
- $F_{ST}$: Differentiation between subpopulations
- $F_{IT}$: Total inbreeding in total population
$$F_{ST} = \frac{H_T - H_S}{H_T}$$
Where $H_T$ is total heterozygosity and $H_S$ is subpopulation heterozygosity.
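The fixation index between subpopulations can be sketched from subpopulation allele frequencies by comparing the heterozygosity expected in the pooled population with the average expected within subpopulations. This example assumes a biallelic locus and equally sized subpopulations; the function name and frequencies are illustrative assumptions, and real analyses usually apply sample-size corrections.

```python
def fst(subpop_freqs):
    """F_ST for a biallelic locus from allele frequencies of equal-sized subpopulations."""
    k = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)                        # total expected heterozygosity
    h_s = sum(2 * p * (1 - p) for p in subpop_freqs) / k  # mean within-subpopulation value
    if h_t == 0:
        return 0.0   # locus is fixed everywhere, so there is no differentiation to measure
    return (h_t - h_s) / h_t


diverged = fst([0.2, 0.8])
identical = fst([0.5, 0.5])
print(f"F_ST diverged = {diverged:.2f}, identical = {identical:.2f}")
```

Values near 0 indicate subpopulations with similar allele frequencies, while values approaching 1 indicate that subpopulations are fixed for different alleles.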
Genome-Wide Association Studies (GWAS)
Applications in Medicine
Medical Genetics
Single-Gene Disorders
- Autosomal dominant: Huntington's disease, Marfan syndrome
- Autosomal recessive: Cystic fibrosis, sickle cell anemia
- X-linked: Duchenne muscular dystrophy, hemophilia
Complex Diseases
- Multifactorial: Diabetes, heart disease, cancer
- Polygenic: Continuous distribution of risk
Pharmacogenomics
Computational Tools
Genetic Analysis Software
- PLINK: Whole-genome association analysis
- BEAGLE: Genotype imputation
- STRUCTURE: Population structure analysis
- PhyML, RAxML: Phylogenetic tree construction
Real-World Application: Population Bottleneck Analysis
Population bottlenecks significantly impact genetic diversity and can be studied using population genetics principles.
Bottleneck Analysis
# Population bottleneck and genetic drift analysis
population_data = {
    'initial_size': 10000,       # N0 (original population size)
    'bottleneck_size': 50,       # Nb (size during bottleneck)
    'bottleneck_duration': 5,    # generations
    'recovery_time': 100,        # generations since recovery
    'mutation_rate': 2.5e-8,     # per site per generation
    'current_size': 50000        # N1 (current population size)
}
# Calculate heterozygosity after bottleneck
# Formula: Ht = H0 * (1 - 1/(2*Ne))^t
# Where Ne is the harmonic mean of the per-generation population sizes
# Effective population size over the entire period (bottleneck + recovery):
# Ne = t_total / (t_bottleneck / Nb + t_recovery / N1)
# This assumes the population reached its current size immediately after the bottleneck
generations_total = population_data['bottleneck_duration'] + population_data['recovery_time']
N_e = generations_total / (
    population_data['bottleneck_duration'] / population_data['bottleneck_size'] +
    population_data['recovery_time'] / population_data['current_size']
)
# Expected heterozygosity after bottleneck
H0 = 0.0005 # Initial heterozygosity
Ht = H0 * (1 - 1/(2 * N_e)) ** generations_total
# Calculate nucleotide diversity reduction
# Expected reduction: π_post = π_pre * (1 - 1/(2*Nb))^t_bottleneck
pi_reduction = (1 - 1/(2 * population_data['bottleneck_size'])) ** population_data['bottleneck_duration']
# Effective number of founding individuals (genetic perspective)
# Using: Nb = (4*Nm*Nf) / (Nm + Nf) where Nm=male, Nf=female
# Simplified: assuming equal sex ratio for bottleneck
effective_founders = (4 * (population_data['bottleneck_size']/2) * (population_data['bottleneck_size']/2)) / population_data['bottleneck_size']
print(f"Population bottleneck analysis:")
print(f" Original size: {population_data['initial_size']:,}")
print(f" Bottleneck size: {population_data['bottleneck_size']}")
print(f" Bottleneck duration: {population_data['bottleneck_duration']} generations")
print(f" Effective population size (harmonic mean): {N_e:.1f}")
print(f" Expected heterozygosity reduction factor: {(1 - Ht/H0):.3f}")
print(f" Nucleotide diversity reduction: {(1 - pi_reduction):.3f}")
print(f" Effective founding individuals: {effective_founders:.1f}")
# Estimate time to recover original diversity level
# Rough order-of-magnitude rule of thumb only: about 4 * Ne generations
recovery_generations = 4 * N_e
print(f" Approximate generations to recover original diversity: {recovery_generations:.0f}")
# Interpretation
if effective_founders < 100:
    bottleneck_severity = "Severe - significant genetic drift expected"
elif effective_founders < 500:
    bottleneck_severity = "Moderate - some genetic drift"
else:
    bottleneck_severity = "Mild - minimal genetic drift impact"
print(f" Bottleneck severity: {bottleneck_severity}")
Conservation Genetics Implications
Understanding population bottlenecks helps in conservation biology and species management.
Your Challenge: Hardy-Weinberg Analysis
Analyze genetic data to determine if a population is in Hardy-Weinberg equilibrium and calculate evolutionary parameters.
Goal: Use population genetics principles to analyze genetic data and assess evolutionary forces.
Population Data
import math
# SNP genotyping data for a population
genotype_counts = {
    'AA': 420,  # Number of homozygous dominant individuals
    'Aa': 480,  # Number of heterozygous individuals
    'aa': 100   # Number of homozygous recessive individuals
}
# Calculate total individuals and allele frequencies
total_individuals = genotype_counts['AA'] + genotype_counts['Aa'] + genotype_counts['aa']
total_alleles = 2 * total_individuals
# Calculate allele frequencies
p = (2 * genotype_counts['AA'] + genotype_counts['Aa']) / total_alleles # freq of A allele
q = (2 * genotype_counts['aa'] + genotype_counts['Aa']) / total_alleles # freq of a allele
# Expected genotype frequencies under HWE
expected_AA = p**2 * total_individuals
expected_Aa = 2 * p * q * total_individuals
expected_aa = q**2 * total_individuals
# Chi-square test
observed_values = [genotype_counts['AA'], genotype_counts['Aa'], genotype_counts['aa']]
expected_values = [expected_AA, expected_Aa, expected_aa]
chi_square = sum([(O - E)**2 / E for O, E in zip(observed_values, expected_values)])
# Degrees of freedom = number of genotypes - 1 - number of estimated parameters
# For 3 genotypes with 1 estimated parameter (p), df = 3 - 1 - 1 = 1
df = 1
# Calculate FIS (inbreeding coefficient)
# FIS = (He - Ho) / He, where He is expected heterozygosity and Ho is observed heterozygosity
He = 2 * p * q # Expected heterozygosity under HWE
Ho = genotype_counts['Aa'] / total_individuals # Observed heterozygosity
FIS = (He - Ho) / He
# Calculate other population genetics parameters
heterozygosity_reduction = 1 - (Ho / He) # Reduction from expected
allele_balance = abs(p - q) # Difference in allele frequencies
# Assess evolutionary forces
evolutionary_forces = []
if abs(FIS) > 0.1:
    if FIS > 0:
        evolutionary_forces.append("Inbreeding")
    else:
        evolutionary_forces.append("Outbreeding")
if chi_square > 3.84:  # Critical value for p=0.05 with df=1
    evolutionary_forces.append("Deviates from HWE")
Analyze the population genetics data and determine the evolutionary forces at play.
Hint:
- Calculate observed vs. expected genotype frequencies
- Perform chi-square test to assess Hardy-Weinberg equilibrium
- Calculate FIS to assess inbreeding/outbreeding
- Consider the implications of observed allele frequencies
# TODO: Calculate genetics parameters
allele_frequency_A = 0 # Frequency of allele A
allele_frequency_a = 0 # Frequency of allele a
inbreeding_coefficient = 0 # FIS value
chi_square_statistic = 0 # Chi-square test result
hwe_status = "" # In or out of equilibrium
evolutionary_force = "" # Type of evolutionary pressure
# Calculate allele frequencies from counts
total_individuals = genotype_counts['AA'] + genotype_counts['Aa'] + genotype_counts['aa']
total_alleles = 2 * total_individuals
allele_frequency_A = (2 * genotype_counts['AA'] + genotype_counts['Aa']) / total_alleles
allele_frequency_a = (2 * genotype_counts['aa'] + genotype_counts['Aa']) / total_alleles
# Calculate expected frequencies under HWE
expected_AA = allele_frequency_A**2 * total_individuals
expected_Aa = 2 * allele_frequency_A * allele_frequency_a * total_individuals
expected_aa = allele_frequency_a**2 * total_individuals
# Calculate observed heterozygosity
observed_het = genotype_counts['Aa'] / total_individuals
expected_het = 2 * allele_frequency_A * allele_frequency_a
# Calculate FIS (inbreeding coefficient)
inbreeding_coefficient = (expected_het - observed_het) / expected_het
# Calculate chi-square statistic
chi_square_statistic = ((genotype_counts['AA'] - expected_AA)**2 / expected_AA +
(genotype_counts['Aa'] - expected_Aa)**2 / expected_Aa +
(genotype_counts['aa'] - expected_aa)**2 / expected_aa)
# Assess HWE status (critical value for df=1 at alpha=0.05 is 3.84)
if chi_square_statistic > 3.84:
    hwe_status = "Out of equilibrium"
else:
    hwe_status = "In equilibrium"
# Determine evolutionary force based on FIS and HWE
if inbreeding_coefficient > 0.1:
    evolutionary_force = "Genetic drift or inbreeding"
elif inbreeding_coefficient < -0.1:
    evolutionary_force = "Outbreeding or selection"
else:
    evolutionary_force = "Random mating (no significant force)"
# Print results
print(f"Allele frequency A: {allele_frequency_A:.3f}")
print(f"Allele frequency a: {allele_frequency_a:.3f}")
print(f"Inbreeding coefficient (FIS): {inbreeding_coefficient:.3f}")
print(f"Chi-square statistic: {chi_square_statistic:.2f}")
print(f"Hardy-Weinberg status: {hwe_status}")
print(f"Dominant evolutionary force: {evolutionary_force}")
# Additional population genetics assessment
heterozygosity_ratio = observed_het / expected_het
if heterozygosity_ratio < 0.9:
    diversity_status = "Reduced heterozygosity"
elif heterozygosity_ratio > 1.1:
    diversity_status = "Increased heterozygosity"
else:
    diversity_status = "Normal heterozygosity"
print(f"Heterozygosity status: {diversity_status}")
How might the population's genetic structure be affected if the observed deviation from Hardy-Weinberg equilibrium is due to population subdivision?
Self-Examination
What are the key differences between Mendelian and polygenic inheritance patterns?
How do Hardy-Weinberg equilibrium principles apply to population genetics?
What are the main approaches used in modern genomics?