We generate our Statistical Risk Assessment by identifying historical instances of mass killing, discerning patterns that distinguished countries that experienced mass killing from others, and then applying that model to the latest publicly available data to estimate the likelihood of a new mass killing in each of more than 160 countries.
Photo above: Bangladeshi policemen fire tear gas. Getty Images/Farjana K. Godhuly.
We generate our Statistical Risk Assessment with a statistical modeling approach involving five steps, described below.
As of the 2017–18 assessment, the “winning” algorithm in our tests, which we employ, is a logistic regression model with “elastic-net” regularization. The model is given a set of about 30 candidate variables and automatically selects among them, retaining a reduced set of fewer than 20. Based on the model, factors associated with greater risk of mass killing include a history of mass killing, large population size, high infant mortality, ethnic fractionalization, high battle-related deaths, a ban on opposition parties, politically motivated killings, lack of freedom of movement, repression of civil society, coup attempts within the last five years, and an anocratic regime type (i.e., neither full democracy nor full autocracy).
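The kind of model described above can be sketched in a few lines: elastic-net regularization mixes L1 and L2 penalties, and the L1 component can shrink some coefficients to exactly zero, which is what produces the automatic reduction from roughly 30 candidate variables to fewer than 20. The data, variable counts, and penalty settings below are illustrative assumptions, not the project's actual inputs or tuning.

```python
# Minimal sketch of a logistic regression with elastic-net regularization,
# using synthetic data. The L1 part of the penalty zeroes out weak
# coefficients, effectively selecting a reduced variable set.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_obs, n_vars = 500, 30                     # ~30 candidate risk factors
X = rng.normal(size=(n_obs, n_vars))

# Outcome (mass-killing onset = 1) driven by only a few of the variables
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.8 * X[:, 2] - 2.0
y = (rng.random(n_obs) < 1 / (1 + np.exp(-logits))).astype(int)

model = LogisticRegression(
    penalty="elasticnet",   # mix of L1 (sparsity) and L2 (shrinkage)
    l1_ratio=0.5,
    C=0.5,                  # stronger regularization -> more zeroed coefficients
    solver="saga",          # the solver that supports elastic-net
    max_iter=5000,
)
model.fit(X, y)

selected = np.flatnonzero(model.coef_[0])   # variables the penalty kept
print(f"kept {selected.size} of {n_vars} variables")
```

Note that the retained variable count is a byproduct of the penalty strength rather than a target set in advance.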
We assessed the accuracy of this model in ways that mimicked how we use its results: we built our model on data from a period of years and then tested its accuracy on data for later years (i.e., we conducted out-of-sample testing). Our results indicate that eight out of every ten countries that later experienced a new onset of mass killing had risk estimates of greater than 4 percent (which usually meant they were among the 30 top-ranked countries in a given year). We expect to release a technical paper later in 2018 that more fully characterizes the accuracy of our model based on the types of statistics usually used to assess such models.
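The out-of-sample testing described above can be sketched as follows: fit the model only on earlier years, then score only later years, and check what share of the later onsets land among the top-ranked cases. The synthetic data, cutoff year, and top-20% recall check are illustrative assumptions, not the project's actual data or thresholds.

```python
# Sketch of temporal out-of-sample testing: train on years up to a cutoff,
# evaluate on the years after it. Data are synthetic and illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
years = np.repeat(np.arange(2000, 2016), 160)   # 160 countries per year
X = rng.normal(size=(years.size, 5))
p = 1 / (1 + np.exp(-(2.0 * X[:, 0] - 4.0)))    # rare-event outcome
y = (rng.random(years.size) < p).astype(int)

train, test = years <= 2010, years > 2010        # strictly temporal split
model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
risk = model.predict_proba(X[test])[:, 1]

# Of the test-period onsets, how many fall among the top-ranked cases?
top = np.argsort(risk)[::-1][: int(0.2 * risk.size)]
captured = y[test][top].sum() / max(y[test].sum(), 1)
print(f"share of onsets among top-ranked 20%: {captured:.2f}")
```

The key point is that no information from the test years leaks into fitting, which mimics how the model is used in practice.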
Note: Because our data and methods changed in 2017, risk estimates and rankings from 2014 through 2016 should not be compared directly with results from 2017 onward.
Previous iterations of our Statistical Risk Assessment assessed risk of state-led mass killing only. Now we train our statistical model on episodes of state-led and non-state-led mass killing, meaning the results reflect the likelihood of mass killing by either type of perpetrator.
Our risk assessment relies on publicly available data on a variety of country characteristics. In 2017, we updated our data sources as new datasets became available—for example, measures of civil liberties and government repression. We also took special care to avoid using data that, in our judgment, could be susceptible to bias when coded or recoded retrospectively. This should help ensure that our model performs as well in practice as it does on historical data.
Our updated model forecasts events that could occur anytime in the two calendar years following the year in which our risk factors are measured. Previously, our forecasts covered only the following year. We believe the two-year forecasts better match the time required to conduct additional analysis and planning and to implement preventive actions while they are still timely. They are also statistically preferable: “rare events” make forecasting much more difficult, and onsets are less rare in two-year windows than in single-year windows.
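The statistical point above can be shown with a toy calculation: from the same yearly onset indicators, the outcome "an onset in either of the next two years" is mechanically at least as common as "an onset in the next year." The onset rate and country-year grid below are invented for illustration.

```python
# Toy illustration: two-year outcome windows make onsets less rare than
# one-year windows. Data are synthetic (160 countries x 20 years, ~1%/yr).
import numpy as np

rng = np.random.default_rng(2)
onset = (rng.random((160, 20)) < 0.01).astype(int)  # country x year grid

# For a forecast made in year t: one-year outcome is an onset in t+1;
# two-year outcome is an onset in t+1 or t+2 (aligned to the same t's).
one_year = onset[:, 1:-1]
two_year = np.maximum(onset[:, 1:-1], onset[:, 2:])

print(f"one-year positive rate: {one_year.mean():.3f}")
print(f"two-year positive rate: {two_year.mean():.3f}")
```

Because the two-year label is the element-wise maximum of two one-year labels, its positive rate can never be lower, which eases the rare-events problem.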
We sought to leverage statistical procedures that have shown promise on similar problems, using a process of model development and selection that follows best practices for political forecasting. This meant comparing different models on a task that closely mimicked true forecasting in practice. We identified one model whose accuracy compared well with alternatives (including an average of multiple models) and whose results could be interpreted with relative ease.
Previously, the project used an average of forecasts from three models representing different ideas about the origins of mass atrocities.
Drawing on work by Barbara Harff and the Political Instability Task Force, the first model emphasized characteristics of countries’ national politics that hint at a predilection to commit genocide or “politicide,” especially in the context of political instability. Key risk factors in Harff’s model include authoritarian rule, the political salience of elite ethnicity, evidence of an exclusionary elite ideology, and international isolation as measured by trade openness. We refer to this model as the “bad regime” model.
The second model took a more instrumental view of mass killing. It used statistical forecasts of future coup attempts and new civil wars as proxy measures of factors that could either spur incumbent rulers to lash out against threats to their power or usher in an insecure new regime that might do the same. We refer to this model as the “elite threat” model.
The third model was a machine-learning process called Random Forests applied to the risk factors identified by the other two models. The resulting algorithm was an amalgamation of theory and induction that took experts’ beliefs about the origins of mass killing as its jumping-off point. It also left more room for inductive discovery of contingent effects.
To get our single-best risk assessment, we averaged the forecasts from these three models. We preferred the average to a single model’s output because we knew from work in many fields—including meteorology and elections forecasting—that this “ensemble” approach generally produces more accurate assessments than we could expect to get from any one model alone. By averaging the forecasts, we learned from all three perspectives while hedging against the biases of any one of them.
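The averaging step described above is simple to state precisely: the ensemble forecast for each country is the element-wise mean of the three models' predicted probabilities. The model names match the text, but the probability values below are hypothetical.

```python
# Sketch of the former "ensemble" approach: average the per-country
# probabilities from the three models. Values are hypothetical.
import numpy as np

bad_regime    = np.array([0.10, 0.02, 0.30])   # Harff-style model
elite_threat  = np.array([0.20, 0.04, 0.10])   # coup/civil-war proxy model
random_forest = np.array([0.15, 0.03, 0.20])   # machine-learning model

ensemble = np.mean([bad_regime, elite_threat, random_forest], axis=0)
print(ensemble)   # element-wise average of the three forecasts
```

Averaging hedges against any single model's bias: an error in one model is diluted by the other two, which is why ensembles often beat their best member.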
from the Early Warning Project and the Simon-Skjodt Center for the Prevention of Genocide