Methodology: Statistical Model

We generate our Statistical Risk Assessment by identifying historical instances of mass killing, discerning patterns that distinguished countries that experienced mass killing from others, and then applying that model to the latest publicly available data to estimate the likelihood of a new mass killing in each of more than 160 countries.

Photo above: Bangladeshi policemen fire tear gas. Getty Images/Farjana K. Godhuly.



Methodology for Generating Our Statistical Risk Assessment

We generate our Statistical Risk Assessment with a statistical modeling approach involving five steps:

  1. Identifying historical episodes of state- and non-state-led mass killing (1945–present for state-led, 1989–present for non-state-led)
  2. Compiling data of potential “predictors” or “risk factors”—i.e., characteristics of countries that are thought to be associated with the likelihood of mass killing in the near future—from existing public sources
  3. Training different statistical algorithms on historical data (1945–2015) to identify a model that performs well at predicting the onset of mass killing within the training set
  4. Testing alternative models and selecting one that maximizes accuracy (as measured on a new dataset, not the one used for training the model), while still allowing for useful interpretation of the model
  5. Using current data on countries to make forecasts two years into the future (2016 data is used for the 2017–18 forecasts; 2017 data for the 2018–19 forecasts); this generates an estimated risk (as a percentage chance of onset of mass killing) for each country, and a corresponding ranking
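Steps 3–5 above can be sketched in a few lines of code. This is a toy illustration under invented data, not the project's actual pipeline: the feature counts, thresholds, and synthetic labels are assumptions made purely to show the train-on-history, score-the-latest-year, rank-by-risk pattern.

```python
# Minimal sketch of steps 3-5: fit a model on historical country-years,
# then score the most recent year's data and rank countries by estimated
# risk. All data here is synthetic and illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Step 3: train on historical data (a stand-in for 1945-2015 country-years).
X_train = rng.normal(size=(400, 5))                  # candidate risk factors
y_train = (X_train[:, 0] + rng.normal(size=400) > 1.5).astype(int)  # onset label
model = LogisticRegression().fit(X_train, y_train)

# Step 5: score the latest year's data, one row per country, and rank.
X_latest = rng.normal(size=(160, 5))
risk = model.predict_proba(X_latest)[:, 1]           # estimated chance of onset
ranking = np.argsort(-risk)                          # highest-risk countries first
```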

As of the 2017–18 assessment, the algorithm that performed best in our tests, and the one we now employ, is a logistic regression model with “elastic-net” regularization. The model is given about 30 candidate variables and automatically selects among them, retaining fewer than 20. Based on the model, factors associated with greater risk of mass killing include history of mass killing, large population size, high infant mortality, ethnic fractionalization, high battle-related deaths, ban on opposition parties, existence of politically motivated killings, lack of freedom of movement, repression of civil society, coup attempts within the last five years, and anocratic regime type (i.e., neither full democracy nor full autocracy).
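The automatic variable selection comes from the L1 component of the elastic-net penalty, which shrinks some coefficients exactly to zero. A minimal sketch, using synthetic data and assumed penalty settings (the project's actual tuning is not specified here):

```python
# Elastic-net logistic regression: the combined L1/L2 penalty zeroes out
# some coefficients, so the model "chooses among" its ~30 input variables
# automatically. Data and hyperparameters here are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 30))                       # ~30 candidate risk factors
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500) > 1).astype(int)

enet = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=0.05, max_iter=5000
).fit(X, y)

kept = int(np.sum(enet.coef_ != 0))  # variables surviving the L1 penalty
```

With a strong enough penalty, `kept` lands well below the 30 candidates, mirroring the reduced set described above.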

How accurate is the model?

We assessed the accuracy of this model in ways that mimicked how we use its results: we built our model on data from a period of years and then tested its accuracy on data for later years (i.e., we conducted out-of-sample testing). Our results indicate that eight out of every ten countries that later experienced a new onset of mass killing had risk estimates of greater than 4 percent (which usually meant they were among the 30 top-ranked countries in a given year). We expect to release a technical paper later in 2018 that more fully characterizes the accuracy of our model based on the types of statistics usually used to assess such models.
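The accuracy check described above amounts to asking what share of actual onsets fell among countries whose estimated risk exceeded 4 percent. A toy version, with invented country names and risk numbers:

```python
# Out-of-sample check: what fraction of countries that experienced an onset
# had been assigned a risk estimate above 4 percent? Values are invented.
risk = {"A": 0.12, "B": 0.06, "C": 0.03, "D": 0.30, "E": 0.01}
onsets = {"A", "D", "E"}  # countries that later experienced a new onset

flagged = {c for c, r in risk.items() if r > 0.04}
recall = len(onsets & flagged) / len(onsets)  # share of onsets above threshold
```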


Important Updates in 2017 Model Revision

Note: Because our data and methods changed in 2017, risk estimates and rankings from 2014 through 2016 should not be compared directly with results from 2017 onward.

Perpetrator Groups

We now estimate the risk of both state-led and non-state mass killing onsets.

Previous iterations of our Statistical Risk Assessment assessed risk of state-led mass killing only. Now we train our statistical model on episodes of state-led and non-state-led mass killing, meaning the results reflect the likelihood of mass killing by either type of perpetrator.

Data Sources

We systematically reviewed the data we use to generate forecasts to take advantage of new sources and guard against potential biases.

Our risk assessment relies on publicly available data on a variety of country characteristics. In 2017, we updated our data sources as new datasets became available—for example, measures of civil liberties and government repression. We also took special care to avoid using data that, in our judgment, could be susceptible to bias when coded or recoded retrospectively. This should help ensure that our model performs as well in practice as it does on historical data.

Timing

We now estimate the probability of a mass killing onset over a two-year window.

Our updated model forecasts events that could occur anytime in the two calendar years following the year in which our risk factors are measured. Previously, our forecasts covered only the following year. We believe two-year forecasts better match the time required to conduct additional analysis, plan, and implement preventive actions while they are still timely. They are also statistically preferable: “rare events” make forecasting much more difficult, and onsets are less rare in two-year windows than in single-year windows.
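Concretely, the two-year outcome label for risk factors measured in year t is 1 if an onset occurred in year t+1 or t+2. A small sketch with hypothetical onset years:

```python
# Building a two-year outcome label: for risk factors measured in year t,
# the label is 1 if any onset falls in year t+1 or t+2. Onset years below
# are hypothetical.
onset_years = {2013, 2016}

def label(t, onsets=onset_years):
    """Return 1 if any onset falls in the two calendar years after year t."""
    return int(any(y in onsets for y in (t + 1, t + 2)))

labels = {t: label(t) for t in range(2010, 2017)}
```

Note that each onset contributes a positive label to two measurement years, which is part of why onsets are less rare in two-year windows.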

Model

We tested several statistical algorithms and selected an approach that maximized forecasting accuracy and interpretability.

We sought to leverage statistical procedures that have shown promise on similar problems, using a process of model development and selection that follows best practices for political forecasting. This meant comparing different models on a task that closely mimicked true forecasting in practice. We identified one model whose accuracy compared well with alternatives (including an average of multiple models) and whose results could be interpreted with relative ease.


Previous Statistical Risk Assessment Methodology (2014–16)

Previously, the project used an average of forecasts from three models representing different ideas about the origins of mass atrocities.

Drawing on work by Barbara Harff and the Political Instability Task Force, the first model emphasized characteristics of countries’ national politics that hint at a predilection to commit genocide or “politicide,” especially in the context of political instability. Key risk factors in Harff’s model include authoritarian rule, the political salience of elite ethnicity, evidence of an exclusionary elite ideology, and international isolation as measured by trade openness. We refer to this model as the "bad regime" model.

The second model took a more instrumental view of mass killing. It used statistical forecasts of future coup attempts and new civil wars as proxy measures of factors that could either spur incumbent rulers to lash out against threats to their power or usher in an insecure new regime that might do the same. We refer to this model as the "elite threat" model.

The third model was a machine-learning process called Random Forests applied to the risk factors identified by the other two models. The resulting algorithm was an amalgamation of theory and induction that took experts’ beliefs about the origins of mass killing as its jumping-off point. It also left more room for inductive discovery of contingent effects.

To get our single-best risk assessment, we averaged the forecasts from these three models. We preferred the average to a single model’s output because we knew from work in many fields—including meteorology and elections forecasting—that this “ensemble” approach generally produces more accurate assessments than we could expect to get from any one model alone. By averaging the forecasts, we learned from all three perspectives while hedging against the biases of any one of them.
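Mechanically, the previous ensemble was a simple unweighted average of the three models' per-country forecasts. A sketch with invented numbers:

```python
# The 2014-16 ensemble: the final risk estimate for each country is the
# simple average of the three model forecasts. All numbers are illustrative.
forecasts = {
    "bad_regime":    {"A": 0.10, "B": 0.02},
    "elite_threat":  {"A": 0.06, "B": 0.04},
    "random_forest": {"A": 0.14, "B": 0.03},
}

ensemble = {
    country: sum(m[country] for m in forecasts.values()) / len(forecasts)
    for country in forecasts["bad_regime"]
}
```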

Read our in-depth academic paper on this topic

