Data helps solve development challenges through multiple channels. It is the foundation of the research that informs the decisions of policy makers and development practitioners. It can serve as a spur to policy reform efforts by providing benchmarks and examples of best practice. And it is a critical input into the World Bank’s own lending and policy advice.
Development Research Group: The World Bank's Data Incubator
The Development Research Group is an incubator of many of the World Bank's most cited datasets. Below is a selection of the datasets produced or co-produced by the Group. Additional datasets can be found in the Microdata Catalog or by visiting the websites of our researchers.
DATASETS & ANALYTICAL TOOLS BY TOPIC
FINANCE AND PRIVATE SECTOR DEVELOPMENT
The Bank Regulation and Supervision Survey is a unique source of comparable data on how banks are regulated and supervised around the world.
This comprehensive database provides detailed information on deposit insurance schemes across the world.
Data collection for the entrepreneurship project was completed in June 2017. To measure entrepreneurial activity, annual data on the number of newly registered firms were collected directly from 143 company registrars.
This database of indicators of financial development and structure across countries and over time includes 31 indicators, starting from 1960, that measure the size, activity, and efficiency of financial intermediaries and markets.
The Global Financial Development Database is an extensive dataset of financial system characteristics for 203 economies.
The Global Findex database, the world’s most comprehensive database on financial inclusion, provides in-depth data on how individuals save, borrow, make payments, and manage risks.
Patterns of educational attainment vary greatly across countries, and across population groups within countries. In some countries, virtually all children complete basic education whereas in others large groups fall short. The primary purpose of this research is to document and analyze these differences using a compilation of a variety of household-based data sets.
Most health systems aspire to deliver health services to people who need them, without causing financial hardship for the families involved. The Health Equity and Financial Protection Indicators (HEFPI) dataset allows you to answer questions about how close health systems around the world come to achieving this goal of universal health coverage.
The Human Capital Index (HCI) database provides data at the country level for each of the components of the Human Capital Index as well as for the overall index, disaggregated by gender. The index measures the amount of human capital that a child born today can expect to attain by age 18, given the risks of poor health and poor education that prevail in the country where she lives.
The World Bank Human Capital Index (HCI) is based on the productivity gains of future workers from human capital accumulation. But in many developing countries, a sizeable fraction of people are not employed, or are in jobs in which they cannot fully use their skills and cognitive abilities to increase their productivity. The Utilization-adjusted Human Capital Indices (UHCIs) adjust the HCI for labor-market underutilization of human capital.
“A Toolkit for Informality Scenario Analysis” is a spreadsheet-based toolkit that allows researchers and practitioners to project the size of formal and informal labor in almost 100 countries yearly until 2030.
The Long Term Growth Model (LTGM) is an Excel-based tool to analyze long-term growth scenarios building on the celebrated Solow-Swan Growth Model. The tool can also be used to assess the implications of growth (and changes in inequality) for poverty rates.
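For readers unfamiliar with the Solow-Swan model that the LTGM builds on, the following is a minimal sketch of the textbook dynamics in discrete time. It is not the LTGM itself: the Cobb-Douglas production function, the function names, and all parameter values (saving rate `s`, depreciation `delta`, population growth `n`, productivity growth `g`, capital share `alpha`) are assumptions chosen for illustration.

```python
# Textbook Solow-Swan dynamics in discrete time.
# Illustrative only: this is NOT the LTGM, and every parameter value is hypothetical.


def next_capital(k, s, delta, n, g, alpha):
    """Advance capital per effective worker by one year."""
    output = k ** alpha  # assumed Cobb-Douglas output per effective worker
    # New capital = saved output plus undepreciated capital, spread over
    # a larger, more productive workforce.
    return (s * output + (1 - delta) * k) / ((1 + n) * (1 + g))


def simulate(k0, years, **params):
    """Return the path of capital per effective worker, starting at k0."""
    path = [k0]
    for _ in range(years):
        path.append(next_capital(path[-1], **params))
    return path


def steady_state(s, delta, n, g, alpha):
    """Closed-form fixed point of next_capital."""
    return (s / ((1 + n) * (1 + g) - (1 - delta))) ** (1 / (1 - alpha))


# Hypothetical parameter values, not taken from any World Bank calibration.
params = dict(s=0.25, delta=0.05, n=0.01, g=0.02, alpha=0.35)
path = simulate(k0=1.0, years=500, **params)
k_star = steady_state(**params)
```

Starting below the steady state, the simulated path rises and converges toward `k_star`; changing `s` or `g` in `params` gives a rough feel for the kind of long-term scenario comparison the tool is designed for.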
The Worldwide Governance Indicators (WGI) project reports aggregate and individual governance indicators for over 200 countries and territories over the period 1996–2016 for six dimensions of governance.
The GDIM provides estimates of intergenerational mobility covering 148 economies for cohorts between 1940 and 1989. This translates to a world population coverage of 96 percent.
The Poverty and Inequality Platform (PIP) is an interactive computational tool that offers users quick access to the World Bank’s estimates of poverty, inequality, and shared prosperity. PIP provides a comprehensive view of global, regional, and country-level trends for more than 160 economies around the world.
PovMap 2.0 provides computational solutions to all stages of poverty mapping activities. It uses a proprietary data engine to ensure the speedy processing of census data.
This database contains both a balanced and an unbalanced panel of country-decile groups covering the twenty-year period 1988-2008, expressed in a common currency and prices. The database allows comparisons of average incomes by decile both across time and across countries.
These data reproduce Figure III in the paper entitled “Subways and CO2 emissions: A global analysis with satellite data.” The table presents social rate of return results for future subways in three scenarios for 1,214 Functional Urban Areas. The three scenarios incorporate different assumptions about the Social Cost of Carbon (SCC) and per-km unit cost of subway installation (UC). The table labels the three scenarios “Pessimistic” (SCC, US$50 per ton; UC, US$280 million per km); “Mid-Range” (SCC, US$100 per ton; UC, US$200 million per km); and “Optimistic” (SCC, US$150 per ton; UC, US$140 million per km).
This dataset contains annual change estimates for Functional Urban Areas (FUAs) from Schiavina et al. (2019) by statistical significance class, providing information that links grid cells in the World Bank’s global XCO2 database to IDs for FUAs and national administrative units.
To better track carbon pledges and support mitigation finance, researchers have developed a new database and web facility that place carbon data at the user’s fingertips. The database meets the urgent need to track carbon pledges by providing objective, spatially-referenced, frequently-updated information for tracking CO2 trends in local areas and regions. With gridded information at 25 km resolution, the system can support analyses for states, provinces, urban areas and project areas.
This dataset and paper identify 775 high-priority areas for methane emissions reduction, using a GIS analysis of information from the Emissions Database for Global Atmospheric Research (EDGAR). They estimate recent emissions changes in those areas using atmospheric concentration data from the European Space Agency’s Sentinel-5P satellite platform.
This dataset contains methane emissions estimates for Functional Urban Areas (FUAs) from Schiavina et al. (2019) providing information that links grid cells in the World Bank’s global XCH4 database to IDs for FUAs and national administrative units.
The database includes monthly mean values for methane concentrations and concentration anomalies for the global 5x5 km grid along with the grid cell id, centroid coordinates, year and month.
The database includes monthly mean values for methane concentrations and concentration anomalies for the global 25 km grid along with the grid cell id, centroid coordinates, year and month. The database can support analyses for states, provinces, urban and rural areas, and project areas. It can incorporate user-defined area boundaries, as well as standard boundaries for administrative areas. The system can be a useful tool for World Bank studies, such as the Country Climate and Development Reports; priority area analysis for emissions reduction; and research by global stakeholders on CH4 emissions sources and changes over time.
The Global Species Database provides habitat and endangerment information for over 90,000 terrestrial vertebrates, aquatic vertebrates, plants and invertebrates. The database has three principal features for all ISO3-coded areas (termed “countries” below) in the World Bank database.
(1) All habitat countries for each species, along with percent of habitat in each country. This identifies endemic species by the traditional definition (over 38,000 species have 100% of habitat in one country), as well as providing information to define alerts for countries that have major habitat shares.
(2) Identification of over 2,000 species whose small habitats (less than 25 km2) may be jeopardized by rapid, large-scale development. The database includes total habitat areas, permitting other definitions of critical habitat scale.
(3) IUCN endangerment ratings for over 75,000 species.
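The first two features listed above can be sketched as simple filters over species records. The record layout, field names, and sample values below are hypothetical and do not reflect the database's actual schema; the 50 percent “major share” cutoff is likewise an assumption, while the 100 percent endemism rule and the 25 km2 threshold come from the description.

```python
from dataclasses import dataclass, field

# Hypothetical record layout; the real database schema may differ.


@dataclass
class SpeciesRecord:
    name: str
    habitat_km2: float                                   # total habitat area
    country_shares: dict = field(default_factory=dict)   # ISO3 code -> percent of habitat


def endemic_country(rec):
    """Return the ISO3 code holding 100% of the habitat, or None."""
    for iso3, share in rec.country_shares.items():
        if share >= 100.0:
            return iso3
    return None


def has_small_habitat(rec, threshold_km2=25.0):
    """True when total habitat is below the threshold (25 km2 by default)."""
    return rec.habitat_km2 < threshold_km2


def major_habitat_countries(rec, min_share=50.0):
    """Countries whose habitat share meets an assumed alert cutoff."""
    return sorted(c for c, share in rec.country_shares.items() if share >= min_share)


# Made-up sample records for illustration only.
records = [
    SpeciesRecord("species_a", 12.0, {"MDG": 100.0}),
    SpeciesRecord("species_b", 4800.0, {"BRA": 70.0, "PER": 30.0}),
]

endemics = [r.name for r in records if endemic_country(r) is not None]
at_risk = [r.name for r in records if has_small_habitat(r)]
```

Because total habitat area is stored per species, the `threshold_km2` argument can be changed to apply other definitions of critical habitat scale.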
The ad valorem equivalent (AVE) of non-tariff measures (NTMs) is the uniform tariff that would have the same impact on imports of a product as the NTMs in place. This analysis utilizes a reduced sample of NTMs data collected between 2012 and 2016. The data are transformed into a cross-section database spanning about 40 importing countries plus the European Union and about 200 exporting countries.
As the first public database to provide detailed comparable information on the microstructure of trade flows between countries, the Exporter Dynamics Database (EDD) has filled a significant gap in our understanding of the foundations of export growth. The database offers a comprehensive picture of exporter dynamics in both developed and developing countries.
Global matrices of bilateral migrant stocks spanning the period 1960-2000, disaggregated by gender and based primarily on the foreign-born concept are presented. For the first time, a comprehensive picture of bilateral global migration over the last half of the twentieth century emerges.
The Non-Tariff Measures (NTM) Database includes major economies like the EU, US, China, and many developing countries. Data on a total of 109 countries were collected by UNCTAD, the World Bank, and other partners.
Global value chains (GVCs) powered the surge of international trade after 1990 and now account for almost half of all trade. This module on GVC uses the data produced as part of The World Development Report 2020 exercise and subsequent updates, in the form of charts and tables to enable users to explore the data and derive meaningful results.
The Household Impacts of Tariffs (HIT) simulation tool enables users to simulate how changes in import tariffs impact the incomes of households across the income distribution.
The Overall Trade Restrictiveness Index (OTRI) summarizes the trade policy stance of a country by calculating the uniform tariff that will keep its overall imports at the current level when the country in fact has different tariffs for different goods. In a nutshell, the OTRI is a more sophisticated way to calculate the weighted average tariff of a given country, with the weights reflecting the composition of import volume and import demand elasticities of each imported product.
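To illustrate the weighting idea, here is a minimal sketch that compares a plain import-weighted average tariff with one whose weights also reflect import demand elasticities. It assumes each product's weight is its import value times the absolute value of its elasticity, which is one common partial-equilibrium formulation and may not match the exact method behind the published index. The function names and the two sample products are hypothetical.

```python
# Sketch of an elasticity-weighted average tariff.
# Assumption: weight = import value * |import demand elasticity|.
# The product data below are made up for illustration.


def import_weighted_tariff(products):
    """Average tariff weighted by import value only."""
    total = sum(p["imports"] for p in products)
    return sum(p["imports"] * p["tariff"] for p in products) / total


def elasticity_weighted_tariff(products):
    """Average tariff weighted by import value times |elasticity|."""
    weights = [p["imports"] * abs(p["elasticity"]) for p in products]
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(w * p["tariff"] for w, p in zip(weights, products)) / total


products = [
    {"name": "good_a", "imports": 100.0, "elasticity": -1.0, "tariff": 0.10},
    {"name": "good_b", "imports": 100.0, "elasticity": -3.0, "tariff": 0.20},
]

simple = import_weighted_tariff(products)          # equal import values -> 0.15
adjusted = elasticity_weighted_tariff(products)    # good_b counts three times as much -> 0.175
```

In this example the more elastic product carries the higher tariff, so the elasticity-weighted figure exceeds the simple import-weighted average: a tariff on a good whose demand responds strongly to price restricts trade more than the import share alone suggests.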
The World Bank’s Services Trade Restrictions Database aims to facilitate dialogue about, and analysis of, services trade policies. The database provides comparable information on services trade policy measures for 103 countries, five sectors (telecommunications, finance, transportation, retail and professional services), and key modes of delivery.
The possibility of interference between individuals has traditionally been seen as the Achilles heel of randomized experiments, because contamination of the control group by spillover effects generates impact estimates that are internally invalid. This software allows a researcher to explore the statistical power of experiments to identify estimands of treatment and spillover effects when there is interference between units.
Last Updated: Nov 14, 2023