News — 22 October, 2024
Will OpenStreetMap meet your needs? HOT’s move towards more data insights
In the spirit of FAIR data (findable, accessible, interoperable, and reusable) and with support from the H2H Network, HOT is making OpenStreetMap (OSM) data more reusable by improving the metadata of its datasets.
If a user needs a quick answer to the question “What is the best dataset in the Humanitarian Data Exchange (HDX)?”, HDX’s Data Grids recommend a source for each of six spatial datasets (administrative divisions, populated places, roads, airports, and health and education facilities) in every country with a Humanitarian Response Plan.
At Humanitarian OpenStreetMap Team (HOT), we are often asked, “How good is OpenStreetMap data in a given area?” With better information on a dataset’s spatial coverage and attribute completeness, users can decide more quickly and efficiently which data source to use.
The two most common data sources for spatial datasets are HOT and OCHA. HOT provides data exports directly from OpenStreetMap. OSM datasets are crowdsourced and community generated. OCHA data comes from a variety of sources and is usually created by a single entity. The table below shows the top recommended data source, and its completeness, per country and dataset, according to the Data Grids.
A dataset is considered “complete” by HDX if it:
- Is in a readable format
- Covers the whole country
- Follows an update cycle
Country | Administrative Divisions | Populated Places | Roads | Airports | Health Facilities | Education Facilities |
---|---|---|---|---|---|---|
Afghanistan | OCHA | HOT | HOT | HOT | OCHA | OCHA |
Burkina Faso | OCHA | OCHA | OCHA | HOT | HOT | HOT |
Cameroon | OCHA | OCHA | HOT | HOT | HDX | HOT |
CAR | OCHA | OCHA | HOT | OCHA | HDX | OCHA |
Chad | OCHA | OCHA | OCHA | OCHA | HDX | HOT |
Colombia | OCHA | OCHA | OCHA | OurAirports | OCHA | OCHA |
Democratic Congo | OCHA | HOT | HOT | HOT | HOT | HOT |
Ethiopia | OCHA | OCHA | OCHA | OCHA | HDX | OCHA |
Haiti | OCHA | HOT | HOT | HOT | OCHA | OCHA |
Mali | OCHA | OCHA | HOT | HOT | OCHA | OCHA |
Mozambique | OCHA | HOT | OCHA | HOT | WHO | OCHA |
Myanmar | MIMU | MIMU | MIMU | MIMU | HOT | MIMU |
Niger | OCHA | OCHA | OCHA | OCHA | OCHA | OCHA |
Nigeria | OCHA | OCHA | OCHA | HOT | HDX | HOT |
Somalia | OCHA | OCHA | HOT | OCHA | WHO | OCHA |
South Sudan | OCHA | OCHA | OCHA | OCHA | HDX | IOM |
State of Palestine | OCHA | OCHA | OCHA | OCHA | OCHA | OCHA |
Sudan | OCHA | OCHA | OCHA | OCHA | HDX | OCHA |
Syrian Arab Republic | OCHA | OCHA | WFP | HOT | HOT | HOT |
Ukraine | OCHA | OCHA | HOT | OCHA | | |
Venezuela | OCHA | OCHA | HOT | HOT | HOT | |
Yemen | OCHA | OCHA | OCHA | OCHA | OCHA | HOT |
■■ Complete, ■■ Incomplete
50% of the recommended spatial datasets above are considered “complete”. Of the 35 HOT datasets recommended in the Data Grids, 30 are identified by HDX as “incomplete”.
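The count of HOT recommendations can be reproduced directly from the table. The following is a minimal sketch, assuming the table rows are pasted in as pipe-separated text; the `TABLE` constant and the `count_sources` helper are illustrative names, not part of any HDX or HOT tool. It tallies how often each source is recommended and ignores empty cells.

```python
from collections import Counter

# Rows copied from the table above. Column order after the country name:
# administrative divisions, populated places, roads, airports,
# health facilities, education facilities. Empty cells = no recommendation.
TABLE = """
Afghanistan | OCHA | HOT | HOT | HOT | OCHA | OCHA
Burkina Faso | OCHA | OCHA | OCHA | HOT | HOT | HOT
Cameroon | OCHA | OCHA | HOT | HOT | HDX | HOT
CAR | OCHA | OCHA | HOT | OCHA | HDX | OCHA
Chad | OCHA | OCHA | OCHA | OCHA | HDX | HOT
Colombia | OCHA | OCHA | OCHA | OurAirports | OCHA | OCHA
Democratic Congo | OCHA | HOT | HOT | HOT | HOT | HOT
Ethiopia | OCHA | OCHA | OCHA | OCHA | HDX | OCHA
Haiti | OCHA | HOT | HOT | HOT | OCHA | OCHA
Mali | OCHA | OCHA | HOT | HOT | OCHA | OCHA
Mozambique | OCHA | HOT | OCHA | HOT | WHO | OCHA
Myanmar | MIMU | MIMU | MIMU | MIMU | HOT | MIMU
Niger | OCHA | OCHA | OCHA | OCHA | OCHA | OCHA
Nigeria | OCHA | OCHA | OCHA | HOT | HDX | HOT
Somalia | OCHA | OCHA | HOT | OCHA | WHO | OCHA
South Sudan | OCHA | OCHA | OCHA | OCHA | HDX | IOM
State of Palestine | OCHA | OCHA | OCHA | OCHA | OCHA | OCHA
Sudan | OCHA | OCHA | OCHA | OCHA | HDX | OCHA
Syrian Arab Republic | OCHA | OCHA | WFP | HOT | HOT | HOT
Ukraine | OCHA | OCHA | HOT | OCHA | |
Venezuela | OCHA | OCHA | HOT | HOT | HOT |
Yemen | OCHA | OCHA | OCHA | OCHA | OCHA | HOT
"""


def count_sources(table: str) -> Counter:
    """Count how many times each source is recommended across all rows."""
    counts = Counter()
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # cells[0] is the country name; skip empty cells.
        counts.update(cell for cell in cells[1:] if cell)
    return counts


counts = count_sources(TABLE)
print(counts["HOT"])  # 35
print(counts.most_common(3))
```

The completeness flag of each cell is not available in text form here, so this sketch only counts recommendations; it does not reproduce the complete/incomplete split.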
Is an “incomplete” dataset still usable? And what about its attributes (OSM tags)? It depends on the use case. Based on user feedback, we’ve developed resources to better understand OSM data before downloading.
Starting with roads and populated places datasets, we surveyed OSM data users about the most important factors they consider when determining whether a dataset is usable. We found that spatial coverage is the most important factor, especially for populated places, while precision and attributes also remain important.
In the same survey, we found feature categorization was the most important attribute (OSM tag) for both datasets.
How will this help data users?
HOT has created a data quality report as a prototype of how to automate the assessment of OSM data quality and completeness, using similar datasets in HDX or AI-generated ones as a benchmark.
In the report, we compare HOT/OSM and OCHA data sources (the two most popular data sources for spatial datasets in HDX) for greater context. In general, we found that HOT data has larger spatial coverage but less complete attributes. Here we share two country-specific examples of these findings, and we encourage you to check out the full report to view similar comparisons!
Example 1. Country: Somalia, Dataset: Populated Places
- Coverage: HOT has about five times as many populated places as OCHA (57,822 vs. 11,283); put differently, OCHA has about 80% fewer.
- Name: OCHA’s dataset has significantly more places with names. 13% of the places from HOT’s dataset have a name, vs. 100% in OCHA.
- “Type”: HOT’s OSM dataset categorizes points as isolated dwellings, hamlet, and village, while OCHA’s dataset is mostly classified as settlements, with very few nomadic settlements and IDP camps.
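Attribute completeness figures such as the share of places with a name can be computed as the fraction of features that carry a non-empty value for a given tag. The sketch below assumes features are GeoJSON-like dictionaries with a `properties` mapping; the `attribute_completeness` helper and the sample records are hypothetical and are not taken from HOT’s report or exports.

```python
def attribute_completeness(features, key):
    """Return the share (0.0 to 1.0) of features with a non-empty `key`."""
    if not features:
        return 0.0
    filled = sum(
        1
        for feature in features
        # Treat missing, None, and whitespace-only values as "not filled".
        if str(feature.get("properties", {}).get(key) or "").strip()
    )
    return filled / len(features)


# Hypothetical sample features (illustrative only, not real export data).
sample = [
    {"properties": {"place": "village", "name": "Example A"}},
    {"properties": {"place": "hamlet"}},
    {"properties": {"place": "isolated_dwelling", "name": ""}},
    {"properties": {"place": "hamlet", "name": None}},
]

print(f"name:  {attribute_completeness(sample, 'name'):.0%}")   # name:  25%
print(f"place: {attribute_completeness(sample, 'place'):.0%}")  # place: 100%
```

The same function could be run per tag over a full export to produce one completeness figure per attribute.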
Example 2. Country: South Sudan, Dataset: Roads
- Coverage
  - HOT has about 180 times as many features as OCHA (176,345 road features in OSM vs. 976 in OCHA); put differently, OCHA has about 99% fewer.
  - Despite having far fewer individual features, OCHA has decent spatial coverage, with about 31% of HOT’s total road length (43,164 km vs. 137,476 km), or 69% less.
- “Type”
  - OCHA only has river and road for transportation “type”, while HOT’s OSM dataset has path, track, residential, tertiary, primary, secondary, footway, etc.
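A percentage difference depends on which dataset is taken as the baseline, so it helps to spell out the arithmetic behind the figures quoted in the two examples. This is a minimal sketch using only the counts and lengths given above; the `compare` helper and its key names are illustrative.

```python
def compare(larger: float, smaller: float) -> dict:
    """Express the gap between two totals in three common ways."""
    return {
        # How many times bigger the larger total is.
        "times_as_many": larger / smaller,
        # Gap relative to the larger total ("X% fewer").
        "percent_fewer": (larger - smaller) / larger * 100,
        # Smaller total as a share of the larger one.
        "share_percent": smaller / larger * 100,
    }


somalia_places = compare(57_822, 11_283)       # HOT vs. OCHA populated places
south_sudan_roads = compare(176_345, 976)      # HOT vs. OCHA road features
south_sudan_km = compare(137_476, 43_164)      # HOT vs. OCHA road length (km)

for label, result in [
    ("Somalia places", somalia_places),
    ("South Sudan road features", south_sudan_roads),
    ("South Sudan road length", south_sudan_km),
]:
    print(
        f"{label}: {result['times_as_many']:.1f}x, "
        f"{result['percent_fewer']:.0f}% fewer, "
        f"{result['share_percent']:.0f}% of the larger total"
    )
```

With these inputs, the 80%, 99%, and 69% figures all correspond to the gap measured against the larger (HOT) total.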
What’s next?
Based on these insights, we are working on adding data quality statistics to the metadata of our OSM exports in HOT’s Export Tool and HDX. Feature names, spatial coverage, and how a “type” of feature (primary road, capital city, etc.) can be used for humanitarian purposes are some of the top considerations when deciding whether to use a dataset, so we strive to make that information easier to understand when reviewing an OSM data export.
With clearer OSM data insights, humanitarians can spend more time analyzing data and making informed decisions, and less time figuring out what dozens of data sources with poor metadata are actually about. Organizations planning aid delivery will be able to identify crisis-impacted population centers and the road transportation needed to reach an impacted population and deliver aid.
Excited for these new resources on OSM insights? Have a request? Don’t hesitate to contact us at data@hosm.org.