Accurate disaster data is crucial for saving lives and reducing economic losses, but global databases often contain messy, unstructured location information that hampers effective analysis. A new AI-driven workflow addresses this by automatically converting free-text disaster reports into precise geographic coordinates, making it easier for researchers and policymakers to understand and respond to hazards. By leveraging large language models, the approach eliminates the need for time-consuming manual geocoding, which has traditionally limited the scalability and consistency of disaster risk assessments.
The researchers developed a fully automated workflow that processes disaster location data from the EM-DAT database, covering events from 2000 to 2024. Using GPT-4o, the system interprets and structures raw text descriptions—such as 'Dhaka district, Bangladesh'—into a standardized hierarchical format, correcting typos and inconsistencies along the way. This structured data is then cross-referenced against three independent geographic sources: GADM for administrative boundaries, OpenStreetMap for detailed maps, and Wikidata for entity information. The workflow then assigns a reliability score based on how well these sources agree, ensuring that locations with multiple confirmations are flagged as highly trustworthy.
In terms of results, the workflow successfully geocoded 14,215 disaster events across 17,948 unique locations, achieving 92% coverage of records with location data. The analysis revealed that over 55% of events were mapped at the regional level (Admin1), 34% at the district level (Admin2), and nearly 11% at the municipal level (Admin3), with variations by disaster type—for instance, climatological events were mostly reported at broader scales, while industrial accidents had finer details. Spatial comparisons showed strong agreement with manually geocoded benchmarks, with over 97% of AI-assigned units intersecting the reference areas, and reliability scores indicated high cross-source consistency, such as an average 86.5% boundary overlap between GADM and OpenStreetMap.
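The boundary-overlap metric is simple to illustrate. As a minimal sketch, the function below computes what fraction of one area lies inside another, approximating administrative boundaries as axis-aligned bounding boxes; the paper's actual comparison uses real polygon geometries, which would require a geometry library such as shapely:

```python
def bbox_overlap_fraction(a, b):
    """Fraction of box `a` covered by box `b`.

    Boxes are (min_lon, min_lat, max_lon, max_lat). This is a crude
    stand-in for the polygon-intersection overlap computed between GADM
    and OpenStreetMap boundaries.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))  # intersection width
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))  # intersection height
    area_a = (ax1 - ax0) * (ay1 - ay0)
    return (iw * ih) / area_a if area_a else 0.0

# Two slightly shifted unit boxes: about 81% of the first lies in the second.
print(bbox_overlap_fraction((0, 0, 1, 1), (0.1, 0.1, 1.1, 1.1)))
```

With real boundary polygons the same idea applies: a high average overlap (86.5% in the paper) signals that two independent sources are describing the same administrative unit.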
The implications of this advancement are significant for disaster risk reduction, as it enables more accurate integration with other datasets like population maps or infrastructure layers, supporting targeted interventions under frameworks like the Sendai Framework. For example, identifying hotspots in regions like Uttar Pradesh or Sichuan can help prioritize resources for flood or earthquake preparedness. Moreover, the workflow's scalability allows for real-time updates and application to other databases, potentially improving global monitoring of compound hazards without the delays of manual processing.
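The integration benefit comes from standardized admin-unit identifiers: once every event carries one, joining it to another layer is a plain key lookup. The codes and population figures below are purely illustrative, not from the paper:

```python
# Illustrative events keyed by hypothetical GADM-style admin-unit codes.
events = [
    {"event": "flood",      "admin_unit": "IND.25"},  # e.g. Uttar Pradesh
    {"event": "earthquake", "admin_unit": "CHN.26"},  # e.g. Sichuan
]
# Illustrative population layer keyed by the same codes.
population = {"IND.25": 199_800_000, "CHN.26": 83_700_000}

def attach_population(events, population):
    """Enrich each event with the population of its admin unit, if known."""
    return [{**e, "population": population.get(e["admin_unit"])} for e in events]

for row in attach_population(events, population):
    print(row)
```

The same join pattern extends to infrastructure layers, exposure models, or any other dataset indexed by the same administrative codes.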
However, the approach has limitations, primarily due to inconsistencies in the original data sources, where reports may contain errors or ambiguities that even AI cannot fully resolve. The paper notes that human oversight remains essential for complex cases, and future improvements could involve combining multiple AI models or incorporating additional data streams to enhance accuracy. Despite these limitations, the workflow represents a major step forward in making disaster data more accessible and reliable for practical use.
Original Source
Read the complete research paper
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.