Crate genom
Fast reverse geocoding library with enriched location data. Convert coordinates to detailed place information including timezone, currency, region, and more.
Features
- Simple API - Single function call: genom::lookup(lat, lon)
- Rich Data - Returns 16+ fields including timezone, currency, postal code, region
- Fast Lookups - Grid-based spatial indexing for sub-millisecond queries
- Zero Config - Database builds automatically on first install
- Thread-Safe - Global singleton with lazy initialization
- Compact - Efficient binary format with string interning
Quick Start
    use genom;

    fn main() {
        // Look up coordinates
        if let Some(place) = genom::lookup(40.7128, -74.0060) {
            println!("{}, {}", place.city, place.country_name);
            // Output: New York, United States
        }
    }
Module enrichment
Data enrichment module that adds computed fields to raw geographic data.
This module provides functionality to enrich basic place data with additional information such as:
- Country names from ISO codes
- Currency codes by country
- Continent information
- EU membership status
- Timezone calculations (offset, abbreviation, DST status)
All enrichment data is stored in static lazy-initialized hash maps for efficient lookup.
Module types
Struct Geocoder
pub struct Geocoder { /* private fields */ }
The core geocoding engine. Manages the spatial database and performs coordinate lookups.
Conceptual Role
Geocoder serves every geographic query. It handles:
- Zero-copy parse of the embedded binary blob
- Grid-based spatial indexing for O(1) lookups
- Nearest-neighbor search across grid cells
- String table resolution for compact storage
What This Type Does NOT Do
- Data enrichment (handled by enrichment module)
- Distance calculations (delegated to Location type)
- Thread synchronization (uses OnceLock for initialization)
Invariants
- After construction, the database is fully loaded and valid
- Grid keys are consistent with coordinate quantization
- String indices into the embedded blob are bounds-checked at parse time
Thread Safety
Geocoder::global() stores the instance in a static OnceLock, which requires the stored type to be both Send and Sync. The global instance is therefore safe to use from multiple threads, and all operations are read-only after initialization.
Implementations
pub fn global() -> &'static Self
Returns a reference to the global geocoder singleton.
Initialization
First call zero-copy-parses the embedded binary blob and builds two FxHashMap indexes. Subsequent calls return the cached instance. Initialization is thread-safe via OnceLock.
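The lazy, thread-safe initialization described above can be illustrated with a minimal sketch. The `Engine` type, its `build` function, and the single-entry index are hypothetical stand-ins, not the crate's actual internals; only `std::sync::OnceLock` is taken from the text, and the standard `HashMap` is used here in place of `FxHashMap`.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical stand-in for the geocoding engine.
pub struct Engine {
    index: HashMap<u32, u32>,
}

impl Engine {
    /// Stand-in for parsing the embedded blob and building the indexes.
    fn build() -> Self {
        let mut index = HashMap::new();
        index.insert(1, 100);
        Engine { index }
    }

    /// Returns the process-wide instance, building it on the first call.
    /// `OnceLock::get_or_init` guarantees `build` runs at most once,
    /// even if several threads race on the first call.
    pub fn global() -> &'static Engine {
        static INSTANCE: OnceLock<Engine> = OnceLock::new();
        INSTANCE.get_or_init(Engine::build)
    }

    /// Read-only query against the index built at startup.
    pub fn get(&self, key: u32) -> Option<u32> {
        self.index.get(&key).copied()
    }
}
```

Every call to `Engine::global()` returns the same `&'static` reference, so callers on any thread share one instance without taking a lock after initialization.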
Panics
Panics if database initialization fails (corrupted data, out of memory). This is intentional - the library cannot function without a valid database.
Examples
    use genom::Geocoder;

    let geocoder = Geocoder::global();
    let place = geocoder.lookup(51.5074, -0.1278);
pub fn lookup(&self, latitude: f64, longitude: f64) -> Option<Place>
Finds the nearest place to the given coordinates.
Algorithm
- Quantize coordinates into the 0.1° city grid cell
- Expanding-ring search across neighboring cells until a candidate is found
- Refine the postal code from the matching country's 0.01° postal grid
- Enrich with country/currency/continent/timezone metadata
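The first two steps can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: `Candidate`, `cell_of`, `nearest`, and the `max_ring` parameter are hypothetical names, the grid is a plain `HashMap` keyed by `(row, col)`, distances are compared in squared degrees, and longitude wrap-around at ±180° is ignored.

```rust
use std::collections::HashMap;

/// Assumed cell size: 0.1 degree, i.e. 10 cells per degree.
const CELLS_PER_DEGREE: f64 = 10.0;

/// Hypothetical candidate record with only the fields the search needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub name: &'static str,
    pub lat: f64,
    pub lon: f64,
}

/// Quantize a coordinate pair into integer (row, col) cell indices.
pub fn cell_of(lat: f64, lon: f64) -> (i32, i32) {
    (
        ((lat + 90.0) * CELLS_PER_DEGREE).floor() as i32,
        ((lon + 180.0) * CELLS_PER_DEGREE).floor() as i32,
    )
}

/// Visit square rings of growing radius around the query cell and return
/// the closest candidate in the first ring that contains any.
/// Stopping at the first non-empty ring is an approximation: a strictly
/// nearest search would also scan one further ring before deciding.
pub fn nearest<'a>(
    grid: &'a HashMap<(i32, i32), Vec<Candidate>>,
    lat: f64,
    lon: f64,
    max_ring: i32,
) -> Option<&'a Candidate> {
    let (row, col) = cell_of(lat, lon);
    for ring in 0..=max_ring {
        let mut best: Option<(&Candidate, f64)> = None;
        for dr in -ring..=ring {
            for dc in -ring..=ring {
                // Skip interior cells; they were covered by smaller rings.
                if dr.abs().max(dc.abs()) != ring {
                    continue;
                }
                if let Some(cands) = grid.get(&(row + dr, col + dc)) {
                    for c in cands {
                        let d = (c.lat - lat).powi(2) + (c.lon - lon).powi(2);
                        if best.map_or(true, |(_, bd)| d < bd) {
                            best = Some((c, d));
                        }
                    }
                }
            }
        }
        if let Some((c, _)) = best {
            return Some(c);
        }
    }
    None
}
```

Bounding the search with `max_ring` is what makes a query far from any populated cell return `None` instead of scanning the whole grid.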
Returns
Some(Place) if a location is found within search radius, None otherwise. Ocean coordinates typically return None unless near coastal cities.
Examples
    use genom::Geocoder;

    let geocoder = Geocoder::global();

    // Paris, France
    let place = geocoder.lookup(48.8566, 2.3522).unwrap();
    assert_eq!(place.city, "Paris");
    assert_eq!(place.country_code, "FR");
Struct Place
    pub struct Place {
        pub city: String,
        pub region: String,
        pub region_code: String,
        pub district: String,
        pub country_code: String,
        pub country_name: String,
        pub postal_code: String,
        pub timezone: String,
        pub timezone_abbr: String,
        pub utc_offset: i32,
        pub utc_offset_str: String,
        pub latitude: f64,
        pub longitude: f64,
        pub currency: String,
        pub continent_code: String,
        pub continent_name: String,
        pub is_eu: bool,
        pub dst_active: bool,
    }
The enriched output type containing complete geographic context for a location.
This struct is returned by lookup() and contains 18 fields providing comprehensive information about a geographic location.
Fields
city: String
City or locality name (e.g., "New York", "Tokyo", "Paris")
region: String
State, province, or administrative region full name (e.g., "California", "Tokyo", "Île-de-France")
region_code: String
ISO 3166-2 region code (e.g., "CA" for California, "13" for Tokyo)
district: String
County, district, or sub-region (e.g., "Los Angeles County", "Chiyoda")
country_code: String
ISO 3166-1 alpha-2 country code (e.g., "US", "JP", "FR")
country_name: String
Full country name (e.g., "United States", "Japan", "France")
postal_code: String
Postal or ZIP code (e.g., "10001", "100-0001", "75001")
timezone: String
IANA timezone identifier (e.g., "America/New_York", "Asia/Tokyo", "Europe/Paris")
timezone_abbr: String
Current timezone abbreviation (e.g., "EST", "JST", "CET"). Changes based on DST.
utc_offset: i32
Current UTC offset in seconds (e.g., -18000 for UTC-5, 32400 for UTC+9)
utc_offset_str: String
Formatted UTC offset string (e.g., "UTC-5", "UTC+9", "UTC+5:30")
latitude: f64
Precise latitude coordinate in decimal degrees (-90 to 90)
longitude: f64
Precise longitude coordinate in decimal degrees (-180 to 180)
currency: String
ISO 4217 currency code (e.g., "USD", "JPY", "EUR")
continent_code: String
Two-letter continent code (e.g., "NA" for North America, "AS" for Asia, "EU" for Europe)
continent_name: String
Full continent name (e.g., "North America", "Asia", "Europe")
is_eu: bool
Whether the location is in a European Union member state
dst_active: bool
Whether daylight saving time is currently active for this location
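The relationship between utc_offset (seconds) and utc_offset_str can be illustrated with a small sketch that reproduces the example strings listed above. `format_utc_offset` is a hypothetical helper, not part of the crate's documented API, and how the crate renders a zero offset is not stated in these docs, so the `"UTC+0"` choice here is an assumption.

```rust
/// Format a UTC offset given in seconds as "UTC+H" or "UTC+H:MM".
/// Hypothetical helper; minutes are printed only when non-zero.
pub fn format_utc_offset(offset_secs: i32) -> String {
    let sign = if offset_secs < 0 { '-' } else { '+' };
    // Work on the magnitude so hours and minutes are both non-negative.
    let total_minutes = offset_secs.unsigned_abs() / 60;
    let (hours, minutes) = (total_minutes / 60, total_minutes % 60);
    if minutes == 0 {
        format!("UTC{sign}{hours}")
    } else {
        format!("UTC{sign}{hours}:{minutes:02}")
    }
}
```

For instance, `format_utc_offset(19800)` yields `"UTC+5:30"`, matching the half-hour example in the field description.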
Examples
    use genom;

    let place = genom::lookup(40.7128, -74.0060).unwrap();
    println!("City: {}", place.city);
    println!("Country: {}", place.country_name);
    println!("Timezone: {} ({})", place.timezone, place.timezone_abbr);
    println!("Currency: {}", place.currency);
    println!("EU Member: {}", place.is_eu);
Struct Location
    pub struct Location {
        pub latitude: f64,
        pub longitude: f64,
    }
A coordinate pair with distance calculation capabilities.
This is a simple wrapper around latitude and longitude coordinates that provides utility methods for geographic calculations.
Fields
latitude: f64
Latitude in decimal degrees (-90 to 90)
longitude: f64
Longitude in decimal degrees (-180 to 180)
Implementations
pub fn new(latitude: f64, longitude: f64) -> Self
Constructs a new Location from coordinates.
Examples
    use genom::Location;

    let loc = Location::new(40.7128, -74.0060);
    assert_eq!(loc.latitude, 40.7128);
    assert_eq!(loc.longitude, -74.0060);
pub fn distance_to(&self, other: &Location) -> f64
Calculates the great-circle distance to another location using the haversine formula.
Returns the distance in kilometers. This calculation assumes a spherical Earth with radius 6371 km, which provides accuracy within 0.5% for most distances.
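For reference, the haversine formula on a sphere of radius 6371 km can be written as a free function. This is a standalone sketch of the formula itself; `haversine_km` is a hypothetical name and the crate's distance_to may differ in detail.

```rust
/// Mean Earth radius used by the spherical model, in kilometers.
const EARTH_RADIUS_KM: f64 = 6371.0;

/// Great-circle distance in kilometers between two points given in
/// decimal degrees, using the haversine formula.
pub fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lon2 - lon1).to_radians();

    // a = sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);

    // Clamp guards against rounding pushing sqrt(a) slightly above 1.
    2.0 * EARTH_RADIUS_KM * a.sqrt().min(1.0).asin()
}
```

Two antipodal points on the equator come out at half the sphere's circumference, π × 6371 km, which is a convenient sanity check.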
Examples
    use genom::Location;

    let nyc = Location::new(40.7128, -74.0060);
    let la = Location::new(34.0522, -118.2437);
    let distance = nyc.distance_to(&la);
    assert!(distance > 3900.0 && distance < 4000.0); // ~3936 km
Reference Binary format
The compiled database is a single contiguous blob (geo.bin, ~37 MB) embedded into your binary via include_bytes! and exposed as &'static [u8]. All previously-public storage types (CompactPlace, Database) are gone — there is no bincode, no Vec deserialization, no decompression.
Layout
- Header: magic "GEO1", version, 12×u32 section offsets/lengths
- Strings: interned UTF-8 table with u32 offsets
- Country codes: packed 2-byte ISO 3166-1 alpha-2 codes indexed by u16
- City grid: FxHashMap<u32, u32> cell → byte offset, varint+zigzag deltas
- Cities blob: per-cell varint stream of (Δlat, Δlon, name, a1, a2, a1c, tz, cc)
- Postal directory: per-country byte ranges
- Postal sections: country-local string table + 0.01° cell grid + delta-encoded entries
- Country polygon dir + blob: Natural Earth simplified rings with bbox, varint deltas
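Several sections above rely on varint + zigzag delta encoding. The sketch below shows the general technique, assuming the common LEB128-style varint (7 payload bits per byte, high bit as continuation flag) and the usual zigzag mapping; the exact wire format of geo.bin is not specified here, and all function names are hypothetical.

```rust
/// Map signed to unsigned so small magnitudes get small codes:
/// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
pub fn zigzag_encode(v: i64) -> u64 {
    ((v << 1) ^ (v >> 63)) as u64
}

/// Inverse of `zigzag_encode`.
pub fn zigzag_decode(v: u64) -> i64 {
    ((v >> 1) as i64) ^ -((v & 1) as i64)
}

/// Append `v` as a little-endian base-128 varint.
pub fn write_varint(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7f) | 0x80); // low 7 bits + continuation flag
        v >>= 7;
    }
    out.push(v as u8);
}

/// Read one varint starting at `*pos`, advancing `*pos` past it.
/// Returns `None` on truncated or over-long input.
pub fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Encode each value as the zigzag varint of its difference from the previous one.
pub fn encode_deltas(values: &[i64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0i64;
    for &v in values {
        write_varint(zigzag_encode(v.wrapping_sub(prev)), &mut out);
        prev = v;
    }
    out
}

/// Decode a stream produced by `encode_deltas`.
pub fn decode_deltas(buf: &[u8]) -> Option<Vec<i64>> {
    let (mut pos, mut prev, mut out) = (0usize, 0i64, Vec::new());
    while pos < buf.len() {
        prev = prev.wrapping_add(zigzag_decode(read_varint(buf, &mut pos)?));
        out.push(prev);
    }
    Some(out)
}
```

Because neighboring entries in a cell have nearly equal coordinates, the deltas are small and most of them fit in one or two bytes.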
Lookup hot path
- Zero-copy slicing of the embedded blob, with no allocations per query
- Two FxHashMap indexes built once at startup (grid + postal_by_cc)
- Distance computed in fixed-point i64 squared microdegrees
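The fixed-point distance in the last item can be sketched as below. Both function names are hypothetical, and this version compares plain squared differences with no cos(latitude) scaling of longitude, which is an assumption; the crate may weight longitude differently.

```rust
/// Convert decimal degrees to integer microdegrees (1e-6 degree units).
pub fn to_microdegrees(deg: f64) -> i64 {
    (deg * 1_000_000.0).round() as i64
}

/// Squared distance between two (lat, lon) points in microdegrees.
/// Only useful for comparing candidates: no square root is taken.
/// The worst case (180e6)^2 + (360e6)^2 is about 1.6e17, well inside i64.
pub fn dist_sq_microdeg(a: (i64, i64), b: (i64, i64)) -> i64 {
    let dlat = a.0 - b.0;
    let dlon = a.1 - b.1;
    dlat * dlat + dlon * dlon
}
```

Ranking candidates only needs an ordering, so the integer squared distance avoids both floating-point work and the square root on the hot path.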
Build pipeline
build.rs downloads GeoNames cities500, admin1/2, countryInfo, allCountries.zip (postal) and Natural Earth country polygons, then runs the encoder in build/builder.rs to emit OUT_DIR/geo.bin. Downloads are cached in OUT_DIR/geonames-cache/. Skip the build with --features no-build-database (lookups then return None).
Struct PlaceInput
    pub struct PlaceInput<'a> {
        pub city: &'a str,
        pub region: &'a str,
        pub region_code: &'a str,
        pub district: &'a str,
        pub country_code: &'a str,
        pub postal_code: &'a str,
        pub timezone: &'a str,
        pub latitude: f64,
        pub longitude: f64,
    }
Input structure for the enrich_place function.
This struct contains the basic geographic data that will be enriched with additional computed fields (country name, currency, continent, timezone details, etc.).
Uses borrowed string slices to avoid unnecessary allocations during the enrichment process.
Fields
city: &'a str
City name
region: &'a str
Region/state name
region_code: &'a str
Region code
district: &'a str
District/county name
country_code: &'a str
ISO country code
postal_code: &'a str
Postal/ZIP code
timezone: &'a str
IANA timezone identifier
latitude: f64
Latitude coordinate
longitude: f64
Longitude coordinate
Function lookup
pub fn lookup(latitude: f64, longitude: f64) -> Option<Place>
Performs reverse geocoding on the given coordinates, returning enriched place data if found.
Conceptual Role
This is the primary entry point for all geocoding operations. It abstracts away database access, spatial indexing, and data enrichment into a single call.
What This Function Does
- Accesses the global geocoder singleton (lazy initialization on first call)
- Performs grid-based spatial lookup to find nearest place
- Enriches raw data with timezone, currency, and regional information
- Returns None if no place is found within the search radius
Thread Safety
This function is thread-safe and can be called concurrently from multiple threads. The underlying database is initialized once and shared via a static OnceLock.
Performance
Sub-microsecond per lookup in steady state. First call incurs a small one-shot index build (~5 ms). Subsequent calls are lock-free reads.
Examples
    use genom;

    // Tokyo coordinates
    if let Some(place) = genom::lookup(35.6762, 139.6503) {
        println!("{}, {}", place.city, place.country_name);
        println!("Timezone: {}", place.timezone);
        println!("Currency: {}", place.currency);
    }

    // Ocean coordinates return None
    assert!(genom::lookup(0.0, -160.0).is_none());
Function enrich_place
pub fn enrich_place(input: PlaceInput) -> Place
Enriches basic place data with computed fields.
This function takes a PlaceInput containing basic geographic information and returns a fully enriched Place with additional computed fields.
Enrichment Process
- Timezone Parsing: Parses the IANA timezone to extract current offset, abbreviation, and DST status using chrono-tz
- Country Lookup: Maps country code to full country name using static hash map
- Currency Lookup: Maps country code to ISO 4217 currency code
- Continent Lookup: Maps country code to continent code and name
- EU Status: Checks if country is an EU member state
Static Data Sources
All enrichment data is stored in static LazyLock<FxHashMap> instances:
- COUNTRY_NAMES - 200+ country code to name mappings
- COUNTRY_CURRENCIES - 200+ country code to currency mappings
- COUNTRY_CONTINENTS - 200+ country code to continent mappings
- CONTINENT_NAMES - 7 continent code to name mappings
- EU_COUNTRIES - 27 EU member states
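The static-table pattern can be sketched with `std::sync::LazyLock` (stable since Rust 1.80). The tables below hold only a few illustrative entries, use the standard `HashMap`/`HashSet` rather than `FxHashMap`, and the helper functions `currency_of` and `is_eu` are hypothetical; the crate's real tables and accessors may be shaped differently.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::LazyLock;

/// Illustrative subset of a country-code to currency table.
/// The closure runs once, on first access.
static COUNTRY_CURRENCIES: LazyLock<HashMap<&'static str, &'static str>> =
    LazyLock::new(|| HashMap::from([("US", "USD"), ("JP", "JPY"), ("FR", "EUR")]));

/// Illustrative subset of the EU membership set.
static EU_COUNTRIES: LazyLock<HashSet<&'static str>> =
    LazyLock::new(|| HashSet::from(["FR", "DE"]));

/// Look up the ISO 4217 currency for an ISO 3166-1 alpha-2 code.
pub fn currency_of(country_code: &str) -> Option<&'static str> {
    COUNTRY_CURRENCIES.get(country_code).copied()
}

/// Whether the country code is in the (illustrative) EU set.
pub fn is_eu(country_code: &str) -> bool {
    EU_COUNTRIES.contains(country_code)
}
```

Because the tables store `&'static str` values, lookups hand back borrowed strings without allocating; unknown codes simply yield `None` or `false`.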
DST Detection
DST status is determined by comparing the current UTC offset with the minimum offset observed in January and July. If the current offset differs from the minimum, DST is active.
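The comparison rule above reduces to a few lines once the three offsets are known. In this sketch the offsets are passed in as plain seconds; in the crate they would come from chrono-tz, which is not used here. `is_dst_active` is a hypothetical name.

```rust
/// Decide whether DST is active from three UTC offsets in seconds:
/// the current one, and the ones observed in January and July.
/// The smaller of the January/July offsets is treated as standard time.
pub fn is_dst_active(current_offset: i32, january_offset: i32, july_offset: i32) -> bool {
    let standard = january_offset.min(july_offset);
    current_offset != standard
}
```

Sampling both January and July makes the rule work in either hemisphere: for America/New_York the minimum is the January offset (-18000), while for a southern-hemisphere zone such as Australia/Sydney it is the July offset (36000). Zones without DST have equal offsets all year, so the result is always false.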
Examples
    use genom::enrichment::{enrich_place, PlaceInput};

    let input = PlaceInput {
        city: "New York",
        region: "New York",
        region_code: "NY",
        district: "New York County",
        country_code: "US",
        postal_code: "10001",
        timezone: "America/New_York",
        latitude: 40.7128,
        longitude: -74.0060,
    };

    let place = enrich_place(input);
    assert_eq!(place.country_name, "United States");
    assert_eq!(place.currency, "USD");
    assert_eq!(place.continent_name, "North America");
    assert_eq!(place.is_eu, false);
Performance & Implementation Details
Memory Layout
- Single blob: ~37 MB geo.bin embedded via include_bytes!, lives in .rodata
- String interning: shared UTF-8 table, every string field is a u32 offset
- Varint + zigzag deltas: coordinates and IDs stored as 1–3 byte sequences
- Two-level grid: 0.1° cells for cities, per-country 0.01° cells for postal codes
Initialization
- include_bytes! bakes geo.bin into the executable
- On first lookup, headers are parsed into &'static [u8] slices
- Two FxHashMap indexes are built (city grid, postal-by-country)
- The Geocoder is cached in a OnceLock for the lifetime of the process
Thread Safety
After initialization, all operations are read-only and require no synchronization. Multiple threads can perform lookups concurrently without contention.
Build Process
build.rs generates the database during cargo build:
- Download GeoNames cities500, admin1/2, countryInfo, allCountries
- Download Natural Earth ne_110m_admin_0_countries shapefile
- Parse, simplify polygon rings (Douglas–Peucker)
- Intern strings, group cities by 0.1° cell, group postal codes per country
- Emit a single varint-encoded blob to OUT_DIR/geo.bin
- The library embeds the result with include_bytes!
Downloads are cached in OUT_DIR/geonames-cache/, so a clean rebuild without network works as long as the cache exists. Build time on first run: ~30–60 s.