πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Design a Proximity Service

System Design ProblemsLocation-Based Search🟒 Free Lesson

Advertisement

System Design Problems

Design a Proximity Service

A proximity service finds nearby points of interest (restaurants, gas stations, ATMs) based on a user's geographic location. The system must efficiently query millions of locations within a given radius using geospatial data structures.

  • Nearby Search β€” Find all businesses within a radius
  • Geospatial Indexing β€” Efficient range queries on lat/lng coordinates
  • Real-time Updates β€” Business locations change; index must stay current

The core challenge is efficiently querying a 2D space. A naive approach checks every location (O(N)), but geospatial indexes reduce this to O(log N) or O(1) for bounded regions.

Requirements

Functional Requirements

  • Search for businesses near a location (lat, lng, radius)
  • Return results sorted by distance
  • Business details (name, address, hours, rating)
  • Search by category (restaurants, cafes, etc.)
  • Add/update/remove business locations
  • Filter by open hours, rating, price range

Non-Functional Requirements

  • Latency: Search results in < 200ms
  • Scale: 100M businesses, 10K QPS
  • Accuracy: Results must be within specified radius
  • Freshness: Location updates reflected within 1 minute
  • Availability: 99.99%

Geospatial queries are fundamentally different from text search. The data is 2D (latitude/longitude), requiring special indexes like geohashing, quad-trees, or R-trees.

Back-of-the-Envelope Estimation

Proximity Service Capacity

  • 100M businesses Γ— 500 bytes = 50 GB metadata
  • Search QPS: 10K (reads), 100 (writes)
  • 1 km radius in NYC: ~1000 businesses
  • 1 km radius in rural area: ~5 businesses
  • Average search radius: 5 km
  • Geohash precision (6 chars): ~1.2 km Γ— 0.6 km cells

Geospatial Indexing

Geohashing

DfGeohash

A geohash encodes a 2D latitude/longitude coordinate into a 1D string of characters. Longer strings represent smaller areas. Locations that share a geohash prefix are geographically close.

9q8yy9q8yz9q8y09q8y19q8yk9q8ym9q8yn9q8ypSearch RadiusGeohash grid with search radius (dashed circle)

Geohash Encoding

geohash=base32_encode(interleave(lat_bits,lng_bits))geohash = \text{base32\_encode}(\text{interleave}(\text{lat\_bits}, \text{lng\_bits}))

Here,

  • latbitslat_bits=Binary representation of latitude
  • lngbitslng_bits=Binary representation of longitude
  • interleaveinterleave=Alternating bits from lat and lng

Geohash Precision

Geohash "9q8yy" (5 chars): ~5 km Γ— 5 km area Geohash "9q8yy9" (6 chars): ~1.2 km Γ— 0.6 km area Geohash "9q8yy9u" (7 chars): ~153 m Γ— 153 m area Geohash "9q8yy9u4" (8 chars): ~38 m Γ— 19 m area

For nearby search, use 5-6 character geohashes to find candidate cells.

Query Algorithm

DfGeohash Range Query

To find nearby locations:

  1. Compute geohash of the search center
  2. Compute geohashes of 8 neighboring cells
  3. Query the database for all locations in these cells
  4. Filter by actual distance (Haversine formula)
  5. Return results within the radius, sorted by distance

Haversine Distance

d=2rΓ—arcsin⁑(sin⁑2Δϕ2+cos⁑ϕ1cos⁑ϕ2sin⁑2Δλ2)d = 2r \times \arcsin\left(\sqrt{\sin^2\frac{\Delta\phi}{2} + \cos\phi_1 \cos\phi_2 \sin^2\frac{\Delta\lambda}{2}}\right)

Here,

  • rr=Earth's radius (6371 km)
  • phi1,phi2phi_1, phi_2=Latitude of points 1 and 2 (radians)
  • DeltaphiDeltaphi=Difference in latitude
  • DeltalambdaDeltalambda=Difference in longitude

For very nearby points (< 1 km), use the simpler Pythagorean approximation: distance β‰ˆ √((Ξ”lat Γ— 111)Β² + (Ξ”lng Γ— 111 Γ— cos(lat))Β²). This avoids expensive trig functions.

High-Level Architecture

ClientCDNProximity ServiceGeohash EncoderNeighbor FinderDistance CalculatorResult RankerCache ManagerRedisGeohash IndexPostgreSQL+ PostGISElasticsearchgeo_pointProximity Service Architecture

Database Schema

CREATE TABLE businesses (
    id          UUID PRIMARY KEY,
    name        VARCHAR(200),
    address     TEXT,
    latitude    DECIMAL(10, 8),
    longitude   DECIMAL(11, 8),
    geohash     VARCHAR(12),
    category    VARCHAR(50),
    rating      DECIMAL(2, 1),
    open_hours  JSONB,
    created_at  TIMESTAMP
);

CREATE INDEX idx_geohash ON businesses(geohash);
CREATE INDEX idx_category ON businesses(category);

PostGIS provides native geospatial support in PostgreSQL. Use ST_DWithin for efficient radius queries and CREATE INDEX with GiST for geospatial indexing.

Caching Strategy

Cache popular geohash cells:

  • Cache key: geo:{geohash_6}:{category}
  • Cache value: List of business IDs sorted by rating
  • TTL: 5 minutes (balance freshness vs. performance)
  • Cache hit ratio target: > 80% for popular areas

Pre-compute and cache results for popular geohash cells (city centers, tourist areas). This reduces database load for the most frequent queries.

Practice Exercises

  1. Algorithm: Implement geohash encoding from latitude/longitude. What is the precision at 6 characters?

  2. Scale: If 100M businesses are distributed globally, estimate the Redis memory needed for geohash indexes at 6-character precision.

  3. Optimization: How would you handle a search in a sparse area (no businesses within 5 km)? Design a strategy to expand the search radius dynamically.

  4. Real-time: How would you update the geohash index when a business moves to a new location? Design an atomic update strategy.

Key Takeaways:

  • Geohashing encodes 2D coordinates into 1D strings for efficient range queries
  • Nearby search: query center cell + 8 neighbors, then filter by Haversine distance
  • Geohash precision (6 chars) provides ~1.2 km cells for efficient neighborhood queries
  • Redis geohash indexes enable sub-millisecond nearby lookups
  • PostGIS provides native geospatial support for complex radius queries

What to Learn Next

-> Design Google Maps Routing, traffic, and map tile infrastructure.

-> Database Indexing B-tree, GiST, and geospatial index structures.

-> Caching Strategies Caching geohash cells and popular search results.

-> Design Search Autocomplete Location-based typeahead suggestions.

-> Databases PostGIS and geospatial database support.

-> Consistent Hashing Geohash-based data distribution.

⭐

Premium Content

Design a Proximity Service

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert System Design Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement