Maintained · Jan 2026 – Apr 2026

Dhaka Restaurant Directory

Mapping every restaurant in Dhaka I could find — pulled from open geographic data, cleaned, and shipped as a live directory.

  • Python
  • SQL
  • APIs
  • Data Analytics
  • Web Deployment

01 — Problem

Why this project

Dhaka has no central, searchable directory of restaurants. Google Maps is incomplete; Foodpanda only shows places it delivers from; restaurant-discovery apps are paywalled or limited to the wealthier neighbourhoods.

I wanted to know what was actually out there — by area, by cuisine, by price band. The directory had to be free, transparent about its sources, and fast.

02 — Approach

How I tackled it

  1. Pull all `amenity=restaurant`, `amenity=cafe`, and `amenity=fast_food` nodes from OpenStreetMap inside a bounding box covering Greater Dhaka.

  2. Deduplicate aggressively: OSM contributors often submit the same place under multiple node IDs, sometimes years apart, sometimes with different spellings of the same name in English and Bengali.

  3. Normalise cuisine tags into a clean taxonomy (Bengali, Chinese, Continental, Thai, etc.) using a hand-curated mapping.

  4. Enrich with neighbourhood and thana lookups using a separate OSM administrative-boundary query.

  5. Ship as a static directory with search, filter, and a Folium-rendered map. Hosted on Vercel; rebuilds nightly.
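
The cuisine normalisation in step 3 can be sketched as a dictionary lookup loaded from a two-column CSV. This is a minimal illustration, not the project's actual mapping: the column names (`raw_tag`, `category`), the sample rows, and the `Other` fallback are all assumptions. OSM `cuisine` values are often semicolon-separated, so the sketch splits the value before mapping each part.

```python
import csv
import io

# Hypothetical excerpt of the mapping CSV; the real file maps 50+ tags to 14 categories.
SAMPLE_MAPPING_CSV = """raw_tag,category
bengali,Bengali
bangladeshi,Bengali
chinese,Chinese
thai,Thai
steak_house,Continental
"""


def load_mapping(csv_text):
    """Read raw_tag -> category pairs into a dict keyed by lower-cased tag."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["raw_tag"].strip().lower(): row["category"].strip() for row in reader}


def normalise_cuisine(raw_value, mapping, fallback="Other"):
    """Map an OSM cuisine value (possibly 'a;b;c') to a sorted list of categories."""
    tags = [t.strip().lower() for t in (raw_value or "").split(";") if t.strip()]
    if not tags:
        return [fallback]
    categories = {mapping.get(t, fallback) for t in tags}
    # Drop the fallback when at least one tag mapped to a real category.
    if len(categories) > 1:
        categories.discard(fallback)
    return sorted(categories)


mapping = load_mapping(SAMPLE_MAPPING_CSV)
print(normalise_cuisine("Bengali; chinese", mapping))
print(normalise_cuisine("bangladeshi;bengali", mapping))
print(normalise_cuisine(None, mapping))
```

Keeping the mapping in a CSV rather than in code means a new raw tag can be assigned a category without touching the pipeline itself.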

03 — Data sources

Where the data came from

| Source | Via | Rows |
| --- | --- | --- |
| OpenStreetMap nodes | Overpass API | {{TODO: real count}} |
| OSM admin boundaries (Dhaka thanas) | Overpass API | ~140 polygons |
| Cuisine taxonomy | Hand-curated CSV | 50+ tags → 14 categories |

04 — Pipeline

End-to-end flow

  1. Overpass query: amenity=restaurant|cafe|fast_food in Dhaka bbox

  2. Raw JSON dump: stored locally for reproducibility

  3. Deduplicate: Levenshtein + geo proximity within a 25 m radius

  4. Normalise cuisine tags: via curated mapping CSV

  5. Reverse-geocode thanas: point-in-polygon against admin boundaries

  6. SQLite + GeoJSON: single artefact per build

  7. Static site build: Folium map, Jinja2 list pages

  8. Deploy to Vercel: nightly rebuild via cron
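
The point-in-polygon step used for the thana lookup can be sketched with a plain ray-casting test. This is an illustrative stand-in using only the standard library; the project may well use a geometry library instead. The function names and the two rectangles are hypothetical, and the coordinates are made up rather than taken from real thana boundaries.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: True if (lon, lat) lies inside the ring.

    `ring` is a list of (lon, lat) vertices; repeating the first vertex
    at the end is allowed but not required.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def lookup_thana(lon, lat, thana_rings):
    """Return the first thana name whose ring contains the point, else None."""
    for name, ring in thana_rings.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None


# Made-up rectangles for illustration only; these are NOT real thana boundaries.
TOY_THANAS = {
    "Thana A": [(90.40, 23.78), (90.42, 23.78), (90.42, 23.80), (90.40, 23.80)],
    "Thana B": [(90.36, 23.73), (90.39, 23.73), (90.39, 23.76), (90.36, 23.76)],
}

print(lookup_thana(90.41, 23.79, TOY_THANAS))
print(lookup_thana(90.37, 23.75, TOY_THANAS))
print(lookup_thana(90.50, 23.90, TOY_THANAS))
```

With roughly 140 polygons a linear scan per point is likely fast enough; a spatial index would only matter at a much larger scale.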

05 — Code

A key snippet

Overpass query and dedupe heuristic

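import geopy.distance as gd
import overpy
from rapidfuzz import fuzz

api = overpy.Overpass()

# Selects by named area; the name lookup can match more than one area, so the
# result is worth checking against the intended Greater Dhaka extent.
# The regex is anchored so that values such as "internet_cafe" do not match.
QUERY = """
[out:json][timeout:60];
area["name"="Dhaka"]->.dhaka;
(
  node["amenity"~"^(restaurant|cafe|fast_food)$"](area.dhaka);
);
out body;
"""


def is_duplicate(a, b, name_thresh=88, distance_m=25):
    """Treat two nodes as one place if both their names and positions match."""
    name_a, name_b = a.tags.get("name", ""), b.tags.get("name", "")
    # Unnamed nodes cannot be matched by name, so never merge them.
    if not name_a or not name_b:
        return False
    name_ok = fuzz.ratio(name_a, name_b) >= name_thresh
    # overpy returns Decimal coordinates; cast to float for geopy.
    geo_ok = gd.distance((float(a.lat), float(a.lon)),
                         (float(b.lat), float(b.lon))).m <= distance_m
    return name_ok and geo_ok


nodes = api.query(QUERY).nodes
print(f"raw count: {len(nodes)}")
# ... pairwise dedupe (kdtree-bucketed in the real script) ...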
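
The closing comment refers to a bucketed pairwise pass that is not shown. The sketch below illustrates the general idea under several assumptions, and it is not the real script: it swaps the k-d tree for a uniform grid hash, geopy for a haversine formula, and rapidfuzz for the standard library's `difflib.SequenceMatcher`, so that it runs without third-party packages. `SequenceMatcher` scores on a 0 to 1 scale and is not equivalent to `fuzz.ratio`, so the `0.88` threshold here is only analogous to the `88` above. The record keys (`id`, `name`, `lat`, `lon`) and the sample places are made up.

```python
import math
from collections import defaultdict
from difflib import SequenceMatcher

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_duplicate_pairs(places, name_thresh=0.88, distance_m=25):
    """Return sorted (id_a, id_b) pairs that look like the same place."""
    if not places:
        return []
    max_abs_lat = max(abs(p["lat"]) for p in places)
    metres_per_degree = math.radians(1) * EARTH_RADIUS_M
    # Cell edge in degrees, padded by 5% so that no cell is narrower than
    # distance_m. Assumes data well away from the poles.
    cell = 1.05 * distance_m / (metres_per_degree * math.cos(math.radians(max_abs_lat)))

    grid = defaultdict(list)
    for p in places:
        grid[(math.floor(p["lat"] / cell), math.floor(p["lon"] / cell))].append(p)

    pairs = set()
    for (gy, gx), bucket in grid.items():
        # Candidates can only sit in the same cell or one of its 8 neighbours.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                for a in bucket:
                    for b in grid.get((gy + dy, gx + dx), ()):
                        if a["id"] >= b["id"]:
                            continue  # skip self-pairs and mirrored pairs
                        if not a["name"] or not b["name"]:
                            continue  # unnamed places are never merged
                        if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) > distance_m:
                            continue
                        score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
                        if score >= name_thresh:
                            pairs.add((a["id"], b["id"]))
    return sorted(pairs)


# Made-up sample records, not real directory entries.
PLACES = [
    {"id": 1, "name": "Padma Grill", "lat": 23.75080, "lon": 90.38500},
    {"id": 2, "name": "Podma Grill", "lat": 23.75085, "lon": 90.38505},  # respelled, a few metres away
    {"id": 3, "name": "Padma Grill", "lat": 23.78000, "lon": 90.40000},  # same name, far away
    {"id": 4, "name": "Cafe Mango", "lat": 23.75081, "lon": 90.38501},   # nearby, different name
]

print(find_duplicate_pairs(PLACES))
```

Bucketing keeps the comparison count close to linear in the number of places, because each place is only compared against its immediate neighbourhood instead of every other record.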

06 — Results

What it shipped

| Metric | Value |
| --- | --- |
| Restaurants in directory | {{TODO}} |
| Duplicate rate before dedupe | {{TODO}}% |
| Cuisine categories | 14 |
| Thanas covered | {{TODO}} |
| Build time end-to-end | {{TODO}} sec |

Caveat: Coverage is uneven. Wealthier areas (Gulshan, Banani, Dhanmondi) are over-represented because OSM contribution density tracks tech adoption. Coverage of Old Dhaka is improving but still patchy.

07 — Lessons

What I learned

  • Deduplication was the bulk of the work here, as it tends to be in open-data projects. The interesting part shows up only after that.

  • Bengali ↔ English name reconciliation needs a transliteration step, not just string-matching.

  • OSM is a fantastic free starting point, but relying on it means verifying a sample against ground truth. I spot-checked 50 entries on foot.

  • Shipping nightly was the right call — keeps the directory fresh without me touching it.
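
The name-reconciliation lesson can be illustrated with a small pre-check that decides whether two names are comparable by plain string matching at all. This sketch does not transliterate anything. It only detects script via the Bengali Unicode block (U+0980 to U+09FF) and prefers same-script names drawn from the standard OSM keys `name:en`, `name:bn`, and `name`. The function names and the sample tags are hypothetical.

```python
NAME_KEYS = ("name:en", "name:bn", "name")


def script_of(text):
    """Classify a name as 'bengali', 'latin', 'mixed', or 'other' by its letters."""
    has_bengali = any("\u0980" <= ch <= "\u09ff" for ch in text)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in text)
    if has_bengali and has_latin:
        return "mixed"
    if has_bengali:
        return "bengali"
    if has_latin:
        return "latin"
    return "other"


def comparable_names(tags_a, tags_b):
    """Pick a same-script pair of names from two OSM tag dicts, or None.

    None means string matching would compare different scripts, so the
    pair needs a transliteration step before any fuzzy score is meaningful.
    """
    names_a = [tags_a[k] for k in NAME_KEYS if tags_a.get(k)]
    names_b = [tags_b[k] for k in NAME_KEYS if tags_b.get(k)]
    for a in names_a:
        script = script_of(a)
        if script not in ("latin", "bengali"):
            continue
        for b in names_b:
            if script_of(b) == script:
                return a, b
    return None


# Made-up tag sets for illustration.
bilingual = {"name": "পদ্মা গ্রিল", "name:en": "Padma Grill"}
latin_only = {"name": "Podma Grill"}
bengali_only = {"name": "পদ্মা গ্রিল"}

print(comparable_names(bilingual, latin_only))
print(comparable_names(bengali_only, latin_only))
```

Pairs that come back as `None` are the ones where a fuzzy score would be misleading, which is where a transliteration step would slot in.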
