Geospatial & Geophysics
Learn the algorithms inside every GIS by building them from scratch in Python: great-circle distance, map projections from their raw formulas, point-in-polygon and convex hulls, geohashes and quadtrees, slope and hillshade from elevation grids, spatial interpolation, and locating an earthquake from its waves. Where the map comes from, not which button draws it.
10 projects, 250 hands-on levels, run in your browser.
Syllabus
- Coordinates & the Earth: Every map starts with a pair of numbers: latitude and longitude. This project builds fluency with them, validating and normalizing coordinates, converting between degrees-minutes-seconds and decimal degrees, working with bounding boxes, and measuring the spherical Earth itself, how long a degree really is, and why that depends on where you stand.
- Distance & Geodesics: How far apart are two points on a sphere? Not the straight ruler line, the great-circle arc. This project derives the haversine formula and uses it for everything navigation needs: the initial bearing to steer, the destination after sailing a course, the closest point on a route, how far off-track you have drifted, and the length of a whole GPS track.
- Map Projections: You cannot flatten a sphere without lying somewhere. A projection chooses which lie: distances, areas, or angles. This project builds the classics from their raw formulas, equirectangular, the Mercator that keeps angles true, and the Web Mercator tile pyramid that every online map serves, then measures exactly how each one distorts and where.
- Vector Geometry: Is this point inside that district? How big is this parcel? Where do these two roads cross? Vector geometry answers the questions maps exist to ask. This project builds ray-casting point-in-polygon, the shoelace formula for area and centroids, segment intersection from cross products, the gift-wrapping convex hull, and the Douglas-Peucker simplification that keeps map lines light.
- Spatial Indexing: Ten million points, one question: what's near me? Scanning everything is hopeless, so spatial indexes carve space into cells you can jump to directly. This project builds the grid index, the geohash that turns location into a sortable string, the quadtree that adapts to density, nearest-neighbor search, and a bounding-box tree, the same family of structures inside PostGIS and every map server.
- Rasters & DEMs: A raster is a grid of values draped over the Earth; a Digital Elevation Model is a raster of heights. This project builds the raster machinery, georeferencing between world and pixel coordinates, bilinear sampling between cells, and then reads the terrain itself: slope from finite differences, aspect, the hillshade that makes maps look three-dimensional, and the D8 flow directions that tell you where water goes.
- Spatial Interpolation: You measured elevation at forty stations; the map needs a value everywhere. Interpolation fills the gaps. This project builds the standard estimators, nearest-neighbor, inverse distance weighting with its power knob, and barycentric interpolation inside triangles, then grids a whole field of scattered samples and uses leave-one-out cross-validation to measure honestly which estimator is best.
- Spatial Analysis: Data on a map invites questions: how many incidents within 500 meters of a school? Which neighborhood does each sale belong to? Are the outbreaks clustered or scattered? This project builds the analyst's toolkit, buffers, point-in-polygon spatial joins, DBSCAN clustering, kernel-density heatmaps, and Moran's I, the statistic that says whether your pattern is real or random.
- Seismology & Geophysics: An earthquake announces itself twice: the fast P wave, then the slower, stronger S wave. The gap between them is a built-in distance meter, and three distance circles pin the epicenter. This project builds the seismologist's chain: travel times, the S-minus-P rule, epicenter trilateration by grid search, the logarithmic magnitude scales, and the STA/LTA detector that picks wave arrivals out of noisy ground motion.
- Capstone: A Mini GIS Engine: Time to assemble the whole stack. A GIS engine loads layers of features, indexes them, answers spatial queries, and computes analytics. This capstone builds exactly that: a layer store over GeoJSON-shaped features, a grid index for speed, the query API every map application calls (bbox, radius, polygon, nearest), per-layer statistics, and finally a complete site-selection analysis, find the best location subject to spatial constraints, end to end.
Key concepts
- 2D cross product: The signed area test (b-a) x (c-a): positive means a left turn, negative a right turn, zero collinear. The primitive under hulls, intersections, and orientatio…
- Antipode: The point exactly opposite a location through the center of the Earth: negate the latitude and shift the longitude by 180 degrees.
- Aspect: The compass direction a slope faces (downhill), from the gradient: atan2(-gx, -gy). Drives sun exposure, vegetation, and avalanche analysis.
- Barycentric coordinates: A point's position inside a triangle expressed as three weights (ratios of sub-triangle areas) that sum to one. Negative weight means outside; the weights…
- Bearing: The compass direction of travel, measured clockwise from north in degrees. The initial bearing of a great circle generally changes along the route.
- Bilinear interpolation: Estimating a value at a fractional cell position as the distance-weighted blend of the four surrounding cells. The default smooth resampler.
- Bounding box: The smallest axis-aligned rectangle (min_lon, min_lat, max_lon, max_lat) around a geometry. The cheapest spatial filter: test the box before any expensive geom…
- Buffer: The zone within a distance of a feature, 'within 500 m of a school'. Point buffers are disks, stored in vector engines as many-sided polygons.
- Centroid: A polygon's center of mass, computed from the same cross products as the shoelace area. Where the label goes on the map.
- Choropleth: A map that shades regions by an aggregated value, incidents per district, income per county. Built from a spatial join plus a group-by.
- Convex hull: The tightest convex fence around a point set, like a stretched rubber band. Gift wrapping (Jarvis march) walks it by repeatedly taking the most counter-clockwi…
- Cross-track distance: How far a point sits off a route: the perpendicular distance to the great circle (or segment) being followed. The core of map matching and off-course alerts.
- DBSCAN: Density-based clustering: core points (enough neighbors within eps) grow clusters by flood fill; sparse points stay noise. Finds hotspots without choosing a cl…
- Degrees-minutes-seconds (DMS): The traditional coordinate notation: 48 degrees 51 minutes 24 seconds. Converted to decimal degrees as d + m/60 + s/3600, negated for south and west.
- Digital elevation model (DEM): A raster of terrain heights. The substrate for slope, aspect, hillshade, viewsheds, and hydrology.
- Douglas-Peucker: The standard line-simplification algorithm: keep the endpoints, find the vertex farthest from the chord, and recurse only if it exceeds a tolerance. Keeps shap…
- Epicenter: The point on the surface directly above an earthquake's rupture. Located by trilateration from station distances; the depth point below it is the hypocente…
- Equal-area projection: A projection that preserves relative areas exactly (like the sinusoidal), at the cost of shearing shapes. The honest choice for mapping statistics.
- Equirectangular projection: The simplest projection: longitude becomes x, latitude becomes y. Cheap and fine near a chosen standard parallel, increasingly stretched elsewhere.
- Feature: A geometry plus its attributes (properties): a school with its name, a parcel with its owner. The unit of vector GIS data.
- Flow direction (D8): Each DEM cell drains to its steepest-descent neighbor among the eight. Following the arrows traces streams; cells with no lower neighbor are pits, usually data…
- Generalization: Simplifying geometry for a smaller scale: fewer vertices, merged features, dropped detail. What keeps a world map from carrying a centimeter coastline.
- Geodesic: The shortest path between two points on a curved surface. On a spherical Earth it is a great-circle arc; computing its length and direction is the heart of nav…
- Geohash: A location encoded as a short base-32 string by alternately bisecting longitude and latitude. Nearby places share prefixes, making location sortable and joinab…
- GeoJSON: The web's standard format for geographic features: geometries (Point, LineString, Polygon) plus properties, bundled into FeatureCollections. Coordinates ar…
- Geotransform: The mapping between pixel (row, col) and world (x, y): an origin plus a cell size. What ties an image to the Earth.
- GPS trace: A sequence of timestamped position fixes. Real traces carry noise and teleporting glitches; cleaning means speed filters and deduplication before any analysis.
- Great circle: The largest circle on a sphere, made by a plane through the center. The shortest path between two points on Earth follows a great circle, not a straight ruler…
- Gutenberg-Richter law: Earthquake bookkeeping's power law: each step down in magnitude brings roughly ten times more events. A healthy catalog shows it; a gap reveals missing dat…
- Haversine formula: The standard, numerically stable formula for great-circle distance between two lat/lon points: a = sin^2(dlat/2) + cos(lat1)cos(lat2)sin^2(dlon/2), d = 2R asin…
- Hillshade: Synthetic terrain lighting: shade each cell by how squarely it faces a virtual sun (cos of the angle between surface normal and light). What makes flat maps lo…
- Hotspot: A location of significantly elevated density or value, the peak of the KDE surface or a dense DBSCAN cluster. Where the analyst's finger lands.
- Inverse distance weighting (IDW): Interpolation where each sample's weight is 1/distance^p. The power p tunes locality: high p trusts only the nearest samples, low p smooths toward the mean.
- Kernel density estimation (KDE): The smooth heatmap: every point spreads a small kernel of influence, and the surface is their sum. Bandwidth controls how smooth.
- Latitude: The angle north or south of the equator, from -90 at the south pole to +90 at the north. One degree of latitude is about 111 km everywhere on Earth.
- Layer: A named collection of features of the same kind (all schools, all roads). A GIS engine manages, indexes, and queries layers.
- Leave-one-out cross-validation: The honest referee of interpolation: hold each sample out, predict it from the rest, and score the errors (RMSE). Lets you tune parameters without fooling your…
- Longitude: The angle east or west of the Greenwich meridian, from -180 to +180. Unlike latitude, a degree of longitude shrinks toward the poles by the cosine of the latit…
- Magnitude: The logarithmic size scale of earthquakes: each whole step is 10 times the shaking amplitude and about 32 times the energy.
- Map matching: Snapping noisy GPS fixes onto the road network: find the nearest plausible segment for each fix. Built on point-to-segment distance.
- Map projection: A rule for flattening the curved Earth onto a plane. Every projection distorts something, distance, area, or angle; choosing one means choosing which lie you c…
- Mercator projection: The projection that keeps angles true (y = ln tan(pi/4 + lat/2)), making compass courses straight lines. Its price: scale inflates as 1/cos(lat), so Greenland…
- Moran's I: The standard statistic for spatial autocorrelation: a neighbor-weighted correlation of a variable with itself, near +1 clustered, near -1 dispersed, near 0 ran…
- Nearest neighbor: The 'closest X' query. Brute force scans all points; indexes restrict the search to nearby cells; k-NN returns the k closest, ranked.
- NoData value: A sentinel (often -9999) marking raster cells with no measurement. Every honest raster pipeline masks it before statistics.
- P wave: The fastest seismic wave (about 6 km/s in the crust), a compression wave and the first arrival on a seismogram. The 'P' is for primary.
- Point in polygon: The most-executed test in GIS: cast a ray from the point and count polygon-edge crossings; an odd count means inside. Works for concave polygons too.
- Quadtree: A spatial tree that splits a square into four children only where points crowd, adapting to density. Range queries descend only into intersecting quadrants.
- R-tree: The index for extended geometries: features grouped under nested minimum bounding rectangles. Queries prune whole subtrees whose boxes miss the search window;…
- Raster: Geography as a grid of cells, each holding a value: elevation, imagery, land cover. Cheap to process per-cell, heavy to store, the complement of vector data.
- Ray casting: The point-in-polygon algorithm: shoot a horizontal ray to infinity and toggle inside/outside at every edge it crosses.
- Resampling: Reading a raster at positions between cells: nearest-neighbor snaps (blocky but value-preserving), bilinear blends the four neighbors (smooth).
- Rhumb line: A path of constant compass bearing. Longer than the great circle, but straight on a Mercator chart, which is why sailors loved Mercator.
- S wave: The slower (about 3.5 km/s), stronger shear wave that arrives second and does most of the shaking. It cannot travel through liquid.
- S-minus-P time: The gap between the two arrivals grows linearly with distance, so one stopwatch reading gives the distance to the quake: d = gap / (1/vs - 1/vp).
- Segment intersection: Two segments properly cross when each straddles the other's line, tested with four orientation signs; the parametric form then gives the exact crossing poi…
- Seismogram: A station's recording of ground motion over time. Detrended, filtered, and picked for arrivals, it is the raw material of seismology.
- Shoelace formula: Computes a polygon's area in one pass: half the absolute sum of x_i*y_{i+1} - x_{i+1}*y_i around the ring. The sign reveals the winding order.
- Slope: Terrain steepness, computed from finite differences of a DEM: atan of the gradient magnitude. The first derivative of elevation.
- Spatial autocorrelation: Tobler's first law quantified: near things are more related than far things. Positive autocorrelation means clustering, negative means dispersion like a ch…
- Spatial index: A structure that finds nearby features without scanning everything: grids, geohashes, quadtrees, R-trees. The difference between milliseconds and minutes on bi…
- Spatial interpolation: Estimating a field everywhere from samples somewhere: nearest, IDW, splines, TINs, kriging. Turning forty stations into a full surface.
- Spatial join: Attaching attributes by location instead of by key: each point gets the polygon that contains it. Point-in-polygon at scale, accelerated by bbox pruning.
- STA/LTA: The classic seismic detector: the ratio of short-term to long-term average energy spikes when a wave arrives. Crossing a threshold picks the arrival time.
- Tile pyramid: The zoom hierarchy of web maps: zoom 0 is one tile for the world, and each level quadruples the tile count (2^z per side). A tile is addressed as z/x/y.
- Triangulated irregular network (TIN): Samples connected into triangles; values interpolate linearly inside each via barycentric coordinates. Exact at samples and piecewise-planar between them.
- Trilateration: Locating a source from distances: each station's distance draws a circle, and three circles pin the epicenter. Solved in practice by minimizing a misfit ov…
- Vector data: Geography stored as geometry: points, lines, and polygons with coordinates. The counterpart of raster data, which stores a grid of values.
- Voronoi diagram: The partition of the plane by nearest site: each cell contains the locations closer to its site than to any other. Service areas, school catchments, and the ne…
- Watershed: The area whose water drains to a common outlet, delineated by following flow directions upstream. The fundamental unit of hydrology.
- Web Mercator: The Mercator variant behind virtually every web map, cut off at about 85.05 degrees so the world is a square, then served as a pyramid of 256-pixel tiles.
- Winding order: The direction a polygon's vertices march: counter-clockwise gives positive signed area. GeoJSON wants exterior rings counter-clockwise and holes clockwise.
- Zoom level: The level in the tile pyramid. Ground resolution in meters per pixel is about 156543 * cos(lat) / 2^z, halving with each zoom step.