Finding Urban Locality data for India

Published

July 14, 2024

We needed to move away from Google Maps for locality data in India after an increase in Google Maps geocoding pricing. The Local Government Directory has very good village data. Its urban data, however, lists only the government bodies in urban areas, not the newer, popular names by which those areas are commonly known. For example, we may not find HSR Layout (Bangalore) in it.

Data Sources considered

This section needs to be expanded (WIP).

India postcode data

https://www.data.gov.in/resource/all-india-pincode-directory-till-last-month

This dataset contains only postal codes. If an urban area does not have a post office, we cannot find it in this dataset.

import pandas as pd
from pathlib import Path

postcodes = pd.read_csv(Path.home()/'Downloads/pincode.csv', low_memory=False)
print(f'postcodes shape - {postcodes.shape}')
print(f'number of unique postcodes - {postcodes["Pincode"].nunique()}')
postcodes.head(3)
postcodes shape - (157126, 11)
number of unique postcodes - 19300
|   | CircleName | RegionName | DivisionName | OfficeName | Pincode | OfficeType | Delivery | District | StateName | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Andhra Pradesh Circle | Kurnool Region | Hindupur Division | Peddakotla B.O | 515631 | BO | Delivery | ANANTAPUR | ANDHRA PRADESH | 14.5689 | 77.85624 |
| 1 | Andhra Pradesh Circle | Kurnool Region | Hindupur Division | Pinnadhari B.O | 515631 | BO | Delivery | ANANTAPUR | ANDHRA PRADESH | 14.5281 | 77.857014 |
| 2 | Andhra Pradesh Circle | Kurnool Region | Hindupur Division | Yerraguntapalle B.O | 515631 | BO | Delivery | ANANTAPUR | ANDHRA PRADESH | 14.561111 | 77.85715 |
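
To make the limitation concrete, here is a minimal sketch of a name lookup against the `OfficeName` column. It uses only the three preview rows above, written out as comma-separated text (an assumption about the file layout), and `find_offices` is a hypothetical helper, not part of the dataset or of any library. It sticks to the standard library so it runs without the downloaded file.

```python
import csv
import io

# Three rows copied from the preview above, assumed to be comma-separated
# like the full file. This is a tiny sample, not the real dataset.
SAMPLE_CSV = """CircleName,RegionName,DivisionName,OfficeName,Pincode,OfficeType,Delivery,District,StateName,Latitude,Longitude
Andhra Pradesh Circle,Kurnool Region,Hindupur Division,Peddakotla B.O,515631,BO,Delivery,ANANTAPUR,ANDHRA PRADESH,14.5689,77.85624
Andhra Pradesh Circle,Kurnool Region,Hindupur Division,Pinnadhari B.O,515631,BO,Delivery,ANANTAPUR,ANDHRA PRADESH,14.5281,77.857014
Andhra Pradesh Circle,Kurnool Region,Hindupur Division,Yerraguntapalle B.O,515631,BO,Delivery,ANANTAPUR,ANDHRA PRADESH,14.561111,77.85715
"""


def find_offices(rows, name):
    """Return the rows whose OfficeName contains `name`, ignoring case."""
    needle = name.casefold()
    return [row for row in rows if needle in row["OfficeName"].casefold()]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# A name that is a post office in the sample is found ...
print(len(find_offices(rows, "Peddakotla")))
# ... while a locality with no post office of that name in the sample is not.
print(len(find_offices(rows, "HSR Layout")))
```

On the full file, the same check could be done on the `postcodes` DataFrame with a case-insensitive `str.contains` on `OfficeName`. Either way, a locality is only discoverable if a post office carries its name.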

Google Maps vs Ola Maps

When we search for ‘Goregoan, Mumbai’ on the Google Maps geocoding page at https://developers.google.com/maps/documentation/geocoding/overview#geocoding-requests, the result is identified as a sublocality.

Geocoding Result

Let us run the same search on Ola Maps.

import json
import os

import requests

# The API key is read from the environment
api_key = os.environ.get('OLA_MAPS_API_KEY')

address = 'Goregoan, Mumbai'

url = "https://api.olamaps.io/places/v1/geocode"
params = {
    "address": address,
    "language": "EN",
    "api_key": api_key,
}

response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # stop early on an HTTP error
resp = response.json()

results = resp["geocodingResults"]
print(f'num results - {len(results)}')
for idx, result in enumerate(results, start=1):
    print(f'{idx} - {result["formatted_address"]}')

# print formatted json
# print(json.dumps(results[0], indent=4, sort_keys=True))
num results - 5
1 - Lupin, Mantri Park, Goregoan East, Lupin, IGIDR, Nagri Niwara CoOperative Housing Society, Goregaon, Mumbai, Maharashtra, 400065, India
2 - Goregoan West, Teen Dongari, Prem Nagar, Goregaon West, Mumbai, Maharashtra, 400104, India
3 - Goregoan East, Peru Baug, Churi Wadi, Goregaon, Mumbai, Maharashtra, 400063, India
4 - Parking Goregoan, Laxmi Rd, Ganesh Nagar, Goregaon, Mumbai, Maharashtra, 400065, India
5 - Goregoan Properties, 138, 4, Jawahar Nagar Rd, No 9, Goregaon West, Mumbai, Maharashtra, 400104, India

We can see that Ola Maps returns specific addresses with matching names, but not the sublocality itself. We can retrieve the sublocality from the components of the returned addresses, but if there are no matching addresses, we will miss such sublocalities.
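
Retrieving the sublocality from the returned components could look roughly like the sketch below. It assumes each result carries an `address_components` list whose entries have `types` and `long_name` fields, mirroring the Google Maps format. That layout is an assumption here and should be checked against a real Ola Maps response. `extract_sublocalities` is a hypothetical helper, and `mock_resp` is a hand-written stand-in, not actual API output.

```python
from collections import Counter


def extract_sublocalities(resp):
    """Count sublocality names across all results of a geocoding response.

    Assumes (unverified) that each result has an `address_components` list
    whose entries carry `types` and `long_name`.
    """
    counts = Counter()
    for result in resp.get("geocodingResults", []):
        for component in result.get("address_components", []):
            types = component.get("types", [])
            # Matches "sublocality" as well as "sublocality_level_1", etc.
            if any(t.startswith("sublocality") for t in types):
                counts[component["long_name"]] += 1
    return counts


# Hand-written mock response for illustration only, NOT real Ola Maps output.
mock_resp = {
    "geocodingResults": [
        {
            "formatted_address": "Example Society, Goregaon West, Mumbai",
            "address_components": [
                {"long_name": "Goregaon West", "types": ["sublocality"]},
                {"long_name": "Mumbai", "types": ["locality"]},
            ],
        },
        {
            "formatted_address": "Example Shop, Goregaon West, Mumbai",
            "address_components": [
                {"long_name": "Goregaon West", "types": ["sublocality"]},
            ],
        },
        {
            "formatted_address": "Example Park, Goregaon, Mumbai",
            "address_components": [
                {"long_name": "Goregaon", "types": ["sublocality_level_1"]},
            ],
        },
    ]
}

# The most frequent sublocality across results is a reasonable candidate.
print(extract_sublocalities(mock_resp).most_common())

# With no matching addresses there is nothing to extract from.
print(extract_sublocalities({"geocodingResults": []}))
```

The empty case at the end is the gap described above: the sublocality is only ever inferred from addresses that happen to match the query.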