Crate genom
Fast reverse geocoding library with enriched location data. Convert coordinates to detailed place information including timezone, currency, region, and more.
Features
- Simple API - Single function call: genom::lookup(lat, lon)
- Rich Data - Returns 16+ fields including timezone, currency, postal code, region
- Fast Lookups - Grid-based spatial indexing for sub-millisecond queries
- Zero Config - Database builds automatically during the first cargo build
- Thread-Safe - Global singleton with lazy initialization
- Compact - Efficient binary format with string interning
Quick Start
use genom;

fn main() {
    // Look up coordinates
    if let Some(place) = genom::lookup(40.7128, -74.0060) {
        println!("{}, {}", place.city, place.country_name);
        // Output: New York, United States
    }
}
Module enrichment
Data enrichment module that adds computed fields to raw geographic data.
This module provides functionality to enrich basic place data with additional information such as:
- Country names from ISO codes
- Currency codes by country
- Continent information
- EU membership status
- Timezone calculations (offset, abbreviation, DST status)
All enrichment data is stored in static lazy-initialized hash maps for efficient lookup.
Module types
Core data structures for geographic information.
This module defines the fundamental types used throughout the library:
- Place - Enriched output with complete geographic context
- Location - Simple coordinate pair with distance calculations
- CompactPlace - Compressed storage format using string table indices
- Database - Complete spatial database with grid index
Struct Geocoder
pub struct Geocoder { /* private fields */ }
The core geocoding engine. Manages the spatial database and performs coordinate lookups.
Conceptual Role
Geocoder is the engine behind all geographic queries. It handles:
- Database initialization and decompression
- Grid-based spatial indexing for O(1) lookups
- Nearest-neighbor search across grid cells
- String table resolution for compact storage
What This Type Does NOT Do
- Data enrichment (handled by enrichment module)
- Distance calculations (delegated to Location type)
- Thread synchronization (uses OnceLock for initialization)
Invariants
- After construction, the database is fully loaded and valid
- Grid keys are consistent with coordinate quantization
- String indices in CompactPlace are valid into strings vector
Thread Safety
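The global instance accessed via Geocoder::global() is safe to use from multiple threads because all operations are read-only after initialization. That instance is held in a static OnceLock, and a static of this kind requires its contents to be both Send and Sync, so Geocoder must implement both traits.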
Implementations
pub fn global() -> &'static Self
Returns a reference to the global geocoder singleton.
Initialization
First call initializes the database by decompressing the embedded binary data. Subsequent calls return the cached instance. Initialization is thread-safe via OnceLock.
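The lazy-singleton pattern described here can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual code: Engine, its places field, and load() are hypothetical stand-ins for the real Geocoder and its database decoding.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the database-backed engine.
struct Engine {
    places: Vec<&'static str>,
}

impl Engine {
    /// Stand-in for decompressing and deserializing the embedded data.
    fn load() -> Engine {
        Engine { places: vec!["Paris", "Tokyo"] }
    }

    /// Returns the process-wide instance, initializing it on first use.
    /// OnceLock guarantees that `load` runs at most once, even when
    /// several threads race on the first call.
    fn global() -> &'static Engine {
        static INSTANCE: OnceLock<Engine> = OnceLock::new();
        INSTANCE.get_or_init(Engine::load)
    }
}
```

Every call to Engine::global() returns a reference to the same value, so callers on any thread can share it without further locking.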
Panics
Panics if database initialization fails (corrupted data, out of memory). This is intentional - the library cannot function without a valid database.
Examples
use genom::Geocoder;

let geocoder = Geocoder::global();
let place = geocoder.lookup(51.5074, -0.1278);
pub fn lookup(&self, latitude: f64, longitude: f64) -> Option<Place>
Finds the nearest place to the given coordinates.
Algorithm
- Quantize coordinates to grid key (0.1° resolution)
- Search target cell and 8 neighboring cells
- Calculate haversine distance to all candidates
- Return nearest place, enriched with metadata
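The first two steps can be illustrated in isolation. The sketch below is an assumption about how quantization might look, not the crate's internals: grid_key and neighborhood are hypothetical names, and floor is used so that negative coordinates fall into the cell below, which may differ from the rounding the library applies. Wrap-around at the antimeridian is not handled.

```rust
/// Quantizes a coordinate pair to a 0.1° grid key.
/// Assumption: `floor` is used, so -74.006 maps to cell -741, not -740.
fn grid_key(lat: f64, lon: f64) -> (i16, i16) {
    ((lat * 10.0).floor() as i16, (lon * 10.0).floor() as i16)
}

/// Returns the target cell plus its 8 neighbours (a 3×3 block).
fn neighborhood(key: (i16, i16)) -> Vec<(i16, i16)> {
    let mut cells = Vec::with_capacity(9);
    for dlat in -1i16..=1 {
        for dlon in -1i16..=1 {
            cells.push((key.0 + dlat, key.1 + dlon));
        }
    }
    cells
}
```

Searching the neighbours as well as the target cell matters because the nearest place to a point near a cell edge can sit just across the boundary.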
Returns
Some(Place) if a location is found within search radius, None otherwise. Ocean coordinates typically return None unless near coastal cities.
Examples
use genom::Geocoder;

let geocoder = Geocoder::global();

// Paris, France
let place = geocoder.lookup(48.8566, 2.3522).unwrap();
assert_eq!(place.city, "Paris");
assert_eq!(place.country_code, "FR");
Struct Place
pub struct Place {
    pub city: String,
    pub region: String,
    pub region_code: String,
    pub district: String,
    pub country_code: String,
    pub country_name: String,
    pub postal_code: String,
    pub timezone: String,
    pub timezone_abbr: String,
    pub utc_offset: i32,
    pub utc_offset_str: String,
    pub latitude: f64,
    pub longitude: f64,
    pub currency: String,
    pub continent_code: String,
    pub continent_name: String,
    pub is_eu: bool,
    pub dst_active: bool,
}
The enriched output type containing complete geographic context for a location.
This struct is returned by lookup() and contains 18 fields providing comprehensive information about a geographic location.
Fields
city: String
City or locality name (e.g., "New York", "Tokyo", "Paris")
region: String
State, province, or administrative region full name (e.g., "California", "Tokyo", "Île-de-France")
region_code: String
Region code, i.e. the subdivision part of the ISO 3166-2 code (e.g., "CA" for California, from US-CA; "13" for Tokyo, from JP-13)
district: String
County, district, or sub-region (e.g., "Los Angeles County", "Chiyoda")
country_code: String
ISO 3166-1 alpha-2 country code (e.g., "US", "JP", "FR")
country_name: String
Full country name (e.g., "United States", "Japan", "France")
postal_code: String
Postal or ZIP code (e.g., "10001", "100-0001", "75001")
timezone: String
IANA timezone identifier (e.g., "America/New_York", "Asia/Tokyo", "Europe/Paris")
timezone_abbr: String
Current timezone abbreviation (e.g., "EST", "JST", "CET"). Changes based on DST.
utc_offset: i32
Current UTC offset in seconds (e.g., -18000 for UTC-5, 32400 for UTC+9)
utc_offset_str: String
Formatted UTC offset string (e.g., "UTC-5", "UTC+9", "UTC+5:30")
latitude: f64
Precise latitude coordinate in decimal degrees (-90 to 90)
longitude: f64
Precise longitude coordinate in decimal degrees (-180 to 180)
currency: String
ISO 4217 currency code (e.g., "USD", "JPY", "EUR")
continent_code: String
Two-letter continent code (e.g., "NA" for North America, "AS" for Asia, "EU" for Europe)
continent_name: String
Full continent name (e.g., "North America", "Asia", "Europe")
is_eu: bool
Whether the location is in a European Union member state
dst_active: bool
Whether daylight saving time is currently active for this location
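The relationship between utc_offset (seconds) and utc_offset_str can be sketched as follows. format_utc_offset is a hypothetical helper written to reproduce the documented examples ("UTC-5", "UTC+9", "UTC+5:30"). How the library renders a zero offset is not stated, so the "UTC+0" result here is an assumption.

```rust
/// Formats an offset in seconds as "UTC+9", "UTC-5" or "UTC+5:30".
/// Minutes are shown only when the offset is not a whole number of hours.
fn format_utc_offset(offset_secs: i32) -> String {
    let sign = if offset_secs < 0 { '-' } else { '+' };
    let abs = offset_secs.abs();
    let hours = abs / 3600;
    let minutes = (abs % 3600) / 60;
    if minutes == 0 {
        format!("UTC{sign}{hours}")
    } else {
        format!("UTC{sign}{hours}:{minutes:02}")
    }
}
```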
Examples
use genom;

let place = genom::lookup(40.7128, -74.0060).unwrap();
println!("City: {}", place.city);
println!("Country: {}", place.country_name);
println!("Timezone: {} ({})", place.timezone, place.timezone_abbr);
println!("Currency: {}", place.currency);
println!("EU Member: {}", place.is_eu);
Struct Location
pub struct Location {
    pub latitude: f64,
    pub longitude: f64,
}
A coordinate pair with distance calculation capabilities.
This is a simple wrapper around latitude and longitude coordinates that provides utility methods for geographic calculations.
Fields
latitude: f64
Latitude in decimal degrees (-90 to 90)
longitude: f64
Longitude in decimal degrees (-180 to 180)
Implementations
pub fn new(latitude: f64, longitude: f64) -> Self
Constructs a new Location from coordinates.
Examples
use genom::Location;

let loc = Location::new(40.7128, -74.0060);
assert_eq!(loc.latitude, 40.7128);
assert_eq!(loc.longitude, -74.0060);
pub fn distance_to(&self, other: &Location) -> f64
Calculates the great-circle distance to another location using the haversine formula.
Returns the distance in kilometers. This calculation assumes a spherical Earth with radius 6371 km, which provides accuracy within 0.5% for most distances.
Examples
use genom::Location;

let nyc = Location::new(40.7128, -74.0060);
let la = Location::new(34.0522, -118.2437);
let distance = nyc.distance_to(&la);
assert!(distance > 3900.0 && distance < 4000.0); // ~3936 km
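For reference, the haversine formula itself is short. The function below is a standalone sketch using the 6371 km radius stated above. haversine_km is a hypothetical free function, not the body of distance_to.

```rust
const EARTH_RADIUS_KM: f64 = 6371.0;

/// Great-circle distance in kilometres between two points given in
/// decimal degrees, assuming a spherical Earth.
fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlambda = (lon2 - lon1).to_radians();
    // a is the squared half-chord length between the two points.
    let a = (dphi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (dlambda / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_KM * a.sqrt().asin()
}
```

For the New York and Los Angeles coordinates used above, this evaluates to roughly 3936 km.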
Struct CompactPlace
pub struct CompactPlace {
    pub city: u32,
    pub region: u32,
    pub region_code: u32,
    pub district: u32,
    pub country_code: u32,
    pub postal_code: u32,
    pub timezone: u32,
    pub lat: i32,
    pub lon: i32,
}
Compressed storage format using string table indices and fixed-point coordinates.
This is the internal storage representation used in the database. All string fields are stored as u32 indices into a shared string table, and coordinates are stored as i32 fixed-point values (multiplied by 100,000).
This reduces memory footprint by approximately 70% compared to storing full Place structs.
Fields
city: u32
Index into the string table for the city name
region: u32
Index into the string table for the region name
region_code: u32
Index into the string table for the region code
district: u32
Index into the string table for the district name
country_code: u32
Index into the string table for the country code
postal_code: u32
Index into the string table for the postal code
timezone: u32
Index into the string table for the timezone identifier
lat: i32
Latitude as fixed-point integer (decimal degrees multiplied by 100,000; divide by 100,000 to recover decimal degrees)
lon: i32
Longitude as fixed-point integer (decimal degrees multiplied by 100,000; divide by 100,000 to recover decimal degrees)
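The fixed-point encoding can be sketched as a pair of conversions. to_fixed and to_degrees are hypothetical helpers. Rounding to the nearest integer is an assumption, since the library might truncate instead.

```rust
const SCALE: f64 = 100_000.0;

/// Encodes decimal degrees as a fixed-point i32 (degrees × 100,000).
/// The largest magnitude, 180° -> 18,000,000, fits comfortably in an i32.
fn to_fixed(degrees: f64) -> i32 {
    (degrees * SCALE).round() as i32
}

/// Decodes a fixed-point value back to decimal degrees.
fn to_degrees(fixed: i32) -> f64 {
    fixed as f64 / SCALE
}
```

With this scale the resolution is 0.00001°, which is about 1.1 m of latitude.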
Struct Database
pub struct Database {
    pub strings: Vec<String>,
    pub places: Vec<CompactPlace>,
    pub grid: FxHashMap<(i16, i16), Vec<u32>>,
}
The complete spatial database structure with string interning and grid index.
This struct contains all the data needed for geocoding operations. It uses string interning to deduplicate common strings and a spatial grid index for fast coordinate lookups.
Fields
strings: Vec<String>
Deduplicated string table. All string fields in CompactPlace are stored as indices into this vector. Common strings like country codes and timezone names are stored only once.
places: Vec<CompactPlace>
All geographic entries in compressed format. Each entry contains indices into the string table and fixed-point coordinates.
grid: FxHashMap<(i16, i16), Vec<u32>>
Spatial index mapping grid cells to place indices. The world is divided into 0.1° × 0.1° cells (~11km at equator). Each cell contains a vector of indices into the places vector.
Uses FxHashMap (from rustc-hash) for faster hashing of integer keys compared to the standard library's HashMap.
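String interning as described for the strings field can be sketched like this. Interner is a hypothetical build-time helper, and the standard HashMap stands in for FxHashMap so the example needs no external crates. Only the strings vector corresponds to something stored in Database.

```rust
use std::collections::HashMap;

/// Hypothetical interner: `strings` is the deduplicated table,
/// `index` is a build-time lookup from string to its position.
#[derive(Default)]
struct Interner {
    strings: Vec<String>,
    index: HashMap<String, u32>,
}

impl Interner {
    /// Returns the index of `s`, appending it to the table if unseen.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.index.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_owned());
        self.index.insert(s.to_owned(), id);
        id
    }

    /// Resolves an index back to its string, as a lookup would do
    /// when expanding a compact entry.
    fn resolve(&self, id: u32) -> &str {
        &self.strings[id as usize]
    }
}
```

Interning "US" twice yields the same index both times, so thousands of places in one country share a single stored copy of the code.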
Spatial Indexing Strategy
The grid divides the world into 0.1° × 0.1° cells. For a lookup:
- Quantize the input coordinates to a grid key: (lat * 100000 / 10000, lon * 100000 / 10000)
- Search the target cell and 8 neighboring cells (3×3 grid)
- Calculate haversine distance to all candidates in these cells
- Return the nearest place
This provides O(1) average-case lookup with a small constant factor (typically 10-50 candidates to check).
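Putting the strategy together, a simplified end-to-end search might look like the sketch below. Everything here is illustrative: Entry, cell_of, build_grid, and nearest are hypothetical names, coordinates are kept as f64 rather than fixed-point, the standard HashMap replaces FxHashMap, and floor-based quantization is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical simplified entry: a name plus coordinates in degrees.
struct Entry {
    name: &'static str,
    lat: f64,
    lon: f64,
}

type Grid = HashMap<(i16, i16), Vec<u32>>;

/// Maps a coordinate pair to its 0.1° cell (floor-based; an assumption).
fn cell_of(lat: f64, lon: f64) -> (i16, i16) {
    ((lat * 10.0).floor() as i16, (lon * 10.0).floor() as i16)
}

/// Haversine distance in km on a sphere of radius 6371 km.
fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let dphi = (lat2 - lat1).to_radians();
    let dlambda = (lon2 - lon1).to_radians();
    let a = (dphi / 2.0).sin().powi(2)
        + lat1.to_radians().cos() * lat2.to_radians().cos() * (dlambda / 2.0).sin().powi(2);
    2.0 * 6371.0 * a.sqrt().asin()
}

/// Builds the cell -> entry-index map.
fn build_grid(entries: &[Entry]) -> Grid {
    let mut grid = Grid::new();
    for (i, e) in entries.iter().enumerate() {
        grid.entry(cell_of(e.lat, e.lon)).or_default().push(i as u32);
    }
    grid
}

/// Scans the 3×3 block of cells around the query and keeps the closest entry.
fn nearest<'a>(entries: &'a [Entry], grid: &Grid, lat: f64, lon: f64) -> Option<&'a Entry> {
    let (row, col) = cell_of(lat, lon);
    let mut best: Option<(f64, &Entry)> = None;
    for dr in -1i16..=1 {
        for dc in -1i16..=1 {
            let Some(ids) = grid.get(&(row + dr, col + dc)) else { continue };
            for &id in ids {
                let e = &entries[id as usize];
                let d = haversine_km(lat, lon, e.lat, e.lon);
                if best.map_or(true, |(best_d, _)| d < best_d) {
                    best = Some((d, e));
                }
            }
        }
    }
    best.map(|(_, e)| e)
}
```

Because only nine cells are ever inspected, the cost depends on local place density rather than on the total size of the database, and a query with no places in any of the nine cells returns None.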
Struct PlaceInput
pub struct PlaceInput<'a> {
    pub city: &'a str,
    pub region: &'a str,
    pub region_code: &'a str,
    pub district: &'a str,
    pub country_code: &'a str,
    pub postal_code: &'a str,
    pub timezone: &'a str,
    pub latitude: f64,
    pub longitude: f64,
}
Input structure for the enrich_place function.
This struct contains the basic geographic data that will be enriched with additional computed fields (country name, currency, continent, timezone details, etc.).
Uses borrowed string slices to avoid unnecessary allocations during the enrichment process.
Fields
city: &'a str
City name
region: &'a str
Region/state name
region_code: &'a str
Region code
district: &'a str
District/county name
country_code: &'a str
ISO country code
postal_code: &'a str
Postal/ZIP code
timezone: &'a str
IANA timezone identifier
latitude: f64
Latitude coordinate
longitude: f64
Longitude coordinate
Function lookup
pub fn lookup(latitude: f64, longitude: f64) -> Option<Place>
Performs reverse geocoding on the given coordinates, returning enriched place data if found.
Conceptual Role
This is the primary entry point for all geocoding operations. It abstracts away database access, spatial indexing, and data enrichment into a single call.
What This Function Does
- Accesses the global geocoder singleton (lazy initialization on first call)
- Performs grid-based spatial lookup to find nearest place
- Enriches raw data with timezone, currency, and regional information
- Returns None if no place is found within the search radius
Thread Safety
This function is thread-safe and can be called concurrently from multiple threads. The underlying database is initialized once and shared via a static OnceLock.
Performance
Typical lookup time: <1ms. First call incurs database initialization overhead (~100ms to decompress and load). Subsequent calls are lock-free reads.
Examples
use genom;

// Tokyo coordinates
if let Some(place) = genom::lookup(35.6762, 139.6503) {
    println!("{}, {}", place.city, place.country_name);
    println!("Timezone: {}", place.timezone);
    println!("Currency: {}", place.currency);
}

// Ocean coordinates return None
assert!(genom::lookup(0.0, -160.0).is_none());
Function enrich_place
pub fn enrich_place(input: PlaceInput) -> Place
Enriches basic place data with computed fields.
This function takes a PlaceInput containing basic geographic information and returns a fully enriched Place with additional computed fields.
Enrichment Process
- Timezone Parsing: Parses the IANA timezone to extract current offset, abbreviation, and DST status using chrono-tz
- Country Lookup: Maps country code to full country name using static hash map
- Currency Lookup: Maps country code to ISO 4217 currency code
- Continent Lookup: Maps country code to continent code and name
- EU Status: Checks if country is an EU member state
Static Data Sources
All enrichment data is stored in static LazyLock<FxHashMap> instances:
- COUNTRY_NAMES - 200+ country code to name mappings
- COUNTRY_CURRENCIES - 200+ country code to currency mappings
- COUNTRY_CONTINENTS - 200+ country code to continent mappings
- CONTINENT_NAMES - 7 continent code to name mappings
- EU_COUNTRIES - 27 EU member states
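A lazily initialized static table of this kind can be sketched with std::sync::LazyLock (stable since Rust 1.80). The snippet reuses two of the names listed above but fills them with a tiny illustrative subset. The standard HashMap and HashSet stand in for FxHashMap, and currency_for and is_eu are hypothetical accessors.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::LazyLock;

// Illustrative subset only; the real tables hold 200+ entries.
static COUNTRY_CURRENCIES: LazyLock<HashMap<&'static str, &'static str>> =
    LazyLock::new(|| HashMap::from([("US", "USD"), ("JP", "JPY"), ("FR", "EUR")]));

// Illustrative subset only; the real set lists all 27 member states.
static EU_COUNTRIES: LazyLock<HashSet<&'static str>> =
    LazyLock::new(|| HashSet::from(["FR", "DE"]));

/// Looks up the ISO 4217 currency code for an ISO 3166-1 alpha-2 country code.
fn currency_for(country_code: &str) -> Option<&'static str> {
    COUNTRY_CURRENCIES.get(country_code).copied()
}

/// Reports whether the country code belongs to the EU set.
fn is_eu(country_code: &str) -> bool {
    EU_COUNTRIES.contains(country_code)
}
```

Each map is built on first access and then read without locking, so repeated enrichment calls cost one hash lookup per field.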
DST Detection
DST status is determined by comparing the current UTC offset with the minimum offset observed in January and July. If the current offset differs from the minimum, DST is active.
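The comparison described above reduces to a few lines once the three offsets are known. In this sketch the offsets are passed in as plain integers in seconds. Obtaining them from chrono-tz is left out, and dst_active as a free function is a hypothetical name.

```rust
/// Returns true when the current offset differs from the smaller of the
/// January and July offsets, which is taken to be standard time.
fn dst_active(current_offset: i32, january_offset: i32, july_offset: i32) -> bool {
    let standard = january_offset.min(july_offset);
    current_offset != standard
}
```

Sampling both January and July makes the check work in either hemisphere: New York is at -18000 s in January and -14400 s in July, while Sydney is at 39600 s in January and 36000 s in July, and in both cases the smaller value is the standard offset. A zone with no DST, such as Tokyo at 32400 s year-round, always yields false.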
Examples
use genom::enrichment::{enrich_place, PlaceInput};

let input = PlaceInput {
    city: "New York",
    region: "New York",
    region_code: "NY",
    district: "New York County",
    country_code: "US",
    postal_code: "10001",
    timezone: "America/New_York",
    latitude: 40.7128,
    longitude: -74.0060,
};

let place = enrich_place(input);
assert_eq!(place.country_name, "United States");
assert_eq!(place.currency, "USD");
assert_eq!(place.continent_name, "North America");
assert_eq!(place.is_eu, false);
Performance & Implementation Details
Memory Layout
The database uses a highly optimized memory layout:
- String Interning: Common strings stored once, reducing memory by ~60%
- Fixed-Point Coordinates: 32-bit integers instead of 64-bit floats, reducing coordinate storage by 50%
- Spatial Grid: O(1) lookup with small constant factor (10-50 candidates)
Initialization
The database is embedded in the binary at compile time using include_bytes!. On first access:
- Embedded binary data is decompressed
- Database struct is deserialized with bincode (~100ms)
- Reference is stored in a static OnceLock
- All subsequent accesses reuse the stored reference with no further initialization
Thread Safety
After initialization, all operations are read-only and require no synchronization. Multiple threads can perform lookups concurrently without contention.
Build Process
The database is built from GeoNames data during cargo build:
- Download GeoNames cities dataset
- Parse and filter entries
- Build string table with deduplication
- Create spatial grid index
- Serialize to binary format
- Embed in compiled binary