Dhaka Restaurant Directory
Mapping every restaurant in Dhaka I could find — pulled from open geographic data, cleaned, and shipped as a live directory.
- Python
- SQL
- APIs
- Data Analytics
- Web Deployment

01 — Problem
Why this project
Dhaka has no central, searchable directory of restaurants. Google Maps is incomplete; Foodpanda only shows places it delivers from; restaurant-discovery apps are paywalled or limited to the wealthier neighbourhoods.
I wanted to know what was actually out there — by area, by cuisine, by price band. The directory had to be free, transparent about its sources, and fast.
02 — Approach
How I tackled it
- 01
Pull all `amenity=restaurant`, `amenity=cafe`, and `amenity=fast_food` nodes from OpenStreetMap inside a bounding box covering Greater Dhaka.
- 02
Deduplicate aggressively — OSM contributors often submit the same place under multiple node IDs, sometimes years apart, sometimes with different spellings of the same name in English and Bengali.
- 03
Normalise cuisine tags into a clean taxonomy (Bengali, Chinese, Continental, Thai, etc.) using a hand-curated mapping.
- 04
Enrich with neighbourhood and thana lookups using a separate OSM administrative-boundary query.
- 05
Ship as a static directory with search, filter, and a Folium-rendered map. Hosted on Vercel; rebuilds nightly.
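Step 01 can be sketched as a small query builder. This is only an illustration: `build_query` is a hypothetical helper, and the bounding-box coordinates are placeholder values, not the ones the directory actually uses. Overpass QL expects a bounding box in (south, west, north, east) order.

```python
# Sketch: build an Overpass QL query for amenity nodes inside a bounding box.
# ASSUMPTION: these coordinates are illustrative placeholders, not the project's real bbox.
DHAKA_BBOX = (23.66, 90.32, 23.90, 90.52)  # (south, west, north, east)


def build_query(bbox, amenities=("restaurant", "cafe", "fast_food"), timeout=60):
    """Return an Overpass QL string selecting amenity nodes inside bbox."""
    south, west, north, east = bbox
    # Catch swapped coordinates early (valid for boxes that do not cross the antimeridian).
    if not (south < north and west < east):
        raise ValueError("bbox must be (south, west, north, east)")
    pattern = "|".join(amenities)
    return (
        f"[out:json][timeout:{timeout}];\n"
        # Anchored regex so that only exact amenity values match.
        f'node["amenity"~"^({pattern})$"]({south},{west},{north},{east});\n'
        "out body;"
    )


print(build_query(DHAKA_BBOX))
```

The resulting string can be passed to any Overpass client. Keeping the bbox as a named constant makes it easy to record alongside the raw dump for reproducibility.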
03 — Data sources
Where the data came from
| Source | Via | Rows |
|---|---|---|
| OpenStreetMap nodes | Overpass API | {{TODO: real count}} |
| OSM admin boundaries (Dhaka thanas) | Overpass API | ~140 polygons |
| Cuisine taxonomy | Hand-curated CSV (50+ tags → 14 categories) | — |
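The cuisine taxonomy row can be illustrated with a minimal sketch. The column names (`raw_tag`, `category`), the sample rows, and the `Other` fallback are assumptions made for this example, not the contents of the real mapping file. OSM `cuisine` values are often semicolon-separated, so the sketch splits them before the lookup.

```python
import csv
import io

# ASSUMPTION: illustrative rows only; the real mapping file covers 50+ tags.
SAMPLE_CSV = """raw_tag,category
bengali,Bengali
bangladeshi,Bengali
chinese,Chinese
thai,Thai
"""


def load_mapping(fileobj):
    """Read raw_tag -> category pairs from a CSV file object."""
    return {
        row["raw_tag"].strip().lower(): row["category"].strip()
        for row in csv.DictReader(fileobj)
    }


def normalise_cuisine(raw, mapping, fallback="Other"):
    """Map an OSM cuisine value (possibly 'a;b;c') to sorted, unique categories."""
    tags = [t.strip().lower() for t in (raw or "").split(";") if t.strip()]
    categories = {mapping.get(t, fallback) for t in tags}
    return sorted(categories) if categories else [fallback]


mapping = load_mapping(io.StringIO(SAMPLE_CSV))
print(normalise_cuisine("Bengali;chinese", mapping))  # ['Bengali', 'Chinese']
print(normalise_cuisine(None, mapping))               # ['Other']
```

Routing unknown tags to a fallback category, rather than dropping them, keeps unmapped values visible so the CSV can be extended over time.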
04 — Pipeline
End-to-end flow
- 01
Overpass query
amenity=restaurant|cafe|fast_food in Dhaka bbox
- 02
Raw JSON dump
stored locally for reproducibility
- 03
Deduplicate
Fuzzy name match (similarity ratio ≥ 88) + geo proximity within a 25 m radius
- 04
Normalise cuisine tags
via curated mapping CSV
- 05
Reverse-geocode thanas
point-in-polygon against admin boundaries
- 06
SQLite + GeoJSON
single artefact per build
- 07
Static site build
Folium map, Jinja2 list pages
- 08
Deploy to Vercel
nightly rebuild via cron
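Step 05 of the flow, point-in-polygon against admin boundaries, can be sketched with a plain ray-casting test. This is a stand-in: the write-up does not say which library the pipeline uses, and a geometry library would be the more usual choice for real boundaries with holes or multiple parts. The thana name and the square polygon below are made up for the example.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test. ring is a list of (lon, lat) vertices; a closing vertex is optional."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can cross the horizontal ray.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def find_thana(lon, lat, thanas):
    """Return the name of the first polygon containing the point, or None."""
    for name, ring in thanas.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None


# ASSUMPTION: a hypothetical square boundary, not a real thana polygon.
THANAS = {
    "Example Thana": [(90.40, 23.78), (90.42, 23.78), (90.42, 23.80), (90.40, 23.80)],
}

print(find_thana(90.41, 23.79, THANAS))  # Example Thana
print(find_thana(90.50, 23.79, THANAS))  # None
```

With roughly 140 polygons a linear scan per restaurant is cheap; a spatial index only becomes worthwhile at much larger scales.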
05 — Code
A key snippet
Overpass query and dedupe heuristic
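import geopy.distance as gd
import overpy
from rapidfuzz import fuzz

api = overpy.Overpass()

# Fetch every restaurant, cafe and fast-food node inside the area named "Dhaka".
QUERY = """
[out:json][timeout:60];
area["name"="Dhaka"]->.dhaka;
(
  node["amenity"~"restaurant|cafe|fast_food"](area.dhaka);
);
out body;
"""

def is_duplicate(a, b, name_thresh=88, distance_m=25):
    """True if two OSM nodes have near-identical names and sit within distance_m metres."""
    name_a, name_b = a.tags.get("name", ""), b.tags.get("name", "")
    # Skip unnamed nodes: two empty names would otherwise count as a perfect match.
    if not name_a or not name_b:
        return False
    name_ok = fuzz.ratio(name_a, name_b) >= name_thresh
    # overpy returns lat/lon as Decimal; convert to float before measuring distance.
    geo_ok = gd.distance((float(a.lat), float(a.lon)),
                         (float(b.lat), float(b.lon))).m <= distance_m
    return name_ok and geo_ok

nodes = api.query(QUERY).nodes
print(f"raw count: {len(nodes)}")
# ... pairwise dedupe (kdtree-bucketed in the real script) ...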
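The elided pairwise step could look something like the sketch below. It is not the real script: it buckets points into a fixed grid instead of the k-d tree mentioned in the comment above, and it swaps in the standard library (`difflib.SequenceMatcher` and a haversine formula) for `rapidfuzz` and `geopy` so that it runs without dependencies. The `Place` type and the sample rows are hypothetical. Note that `SequenceMatcher.ratio()` is on a 0 to 1 scale, so the threshold is 0.88 rather than 88.

```python
import math
from collections import defaultdict
from difflib import SequenceMatcher
from typing import NamedTuple


class Place(NamedTuple):  # hypothetical record type for this sketch
    id: int
    name: str
    lat: float
    lon: float


def haversine_m(a, b):
    """Great-circle distance in metres between two places."""
    r = 6_371_000  # mean Earth radius in metres
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def is_duplicate(a, b, name_thresh=0.88, distance_m=25):
    if not a.name or not b.name:
        return False  # unnamed places cannot be matched by name
    similar = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio() >= name_thresh
    return similar and haversine_m(a, b) <= distance_m


def find_duplicates(places, cell_deg=0.0005):
    """Bucket places into grid cells, then compare only within neighbouring cells.

    0.0005 degrees is roughly 50 m at Dhaka's latitude, so any pair within
    25 m is guaranteed to fall in the same or an adjacent cell.
    """
    grid = defaultdict(list)
    for p in places:
        grid[(math.floor(p.lat / cell_deg), math.floor(p.lon / cell_deg))].append(p)
    pairs = set()
    for (cy, cx), bucket in grid.items():
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                for a in bucket:
                    for b in grid.get((cy + dy, cx + dx), ()):
                        # a.id < b.id reports each pair once and skips self-comparison.
                        if a.id < b.id and is_duplicate(a, b):
                            pairs.add((a.id, b.id))
    return sorted(pairs)


places = [
    Place(1, "Star Kabab", 23.7806, 90.4070),
    Place(2, "Star Kebab", 23.7807, 90.4070),  # spelling variant about 11 m away
    Place(3, "Pizza Hut", 23.7900, 90.4100),
    Place(4, "Star Kabab", 23.7850, 90.4070),  # same name, about 490 m away
]
print(find_duplicates(places))  # [(1, 2)]
```

Requiring both conditions keeps separate branches of the same chain apart, and the grid avoids comparing every node against every other node.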
06 — Results
What it shipped
| Metric | Value |
|---|---|
| Restaurants in directory | {{TODO}} |
| Duplicate rate before dedupe | {{TODO}}% |
| Cuisine categories | 14 |
| Thanas covered | {{TODO}} |
| Build time end-to-end | {{TODO}} sec |
Caveat: Coverage is uneven. Wealthier areas (Gulshan, Banani, Dhanmondi) are over-represented, likely because OSM contribution density tracks tech adoption. Coverage of Old Dhaka is improving but still patchy.
07 — Lessons
What I learned
Most of the work in an open-data project like this one is deduplication. The interesting part shows up only after that.
Bengali ↔ English name reconciliation needs a transliteration step, not just string-matching.
OSM is a fantastic free starting point, but you still need to verify a sample against ground truth. I spot-checked 50 entries on foot.
Shipping nightly was the right call — keeps the directory fresh without me touching it.
08 — Links