Predicting civil conflict: what can machine learning tell us?
Computer programs can be used as early warning systems, allowing the global community to act before violence erupts.
Steven Pinker has famously suggested that violence is on the decline. While this appears to be generally true, a recent RAND report suggests that medium- and low-intensity intrastate conflict bucks this trend and has shown an uptick since 2008. People in countries afflicted with civil conflict suffer most, but this sort of conflict has globalised effects as well. Arguably, the refugee crisis emanating from the Syrian civil war has had political and economic effects in Europe and the United States, while devastating the lives of Syrians.
Policymakers everywhere should therefore care about developing early warning systems for detecting civil conflicts and identifying effective policy levers to end them. Academics studying this problem scientifically, myself included, tend instead to look for causal links between theoretically indicated correlates of war and civil conflict. Unfortunately, stable links are hard to come by. For example, in cross-country analyses of civil war, whether a variable such as trade shocks causes civil conflict depends on which other controls are included in the empirical model. This is obviously problematic as a matter of science. Further, from a policymaker’s perspective, such ephemeral causal links are useless. Even the more stable correlates of civil conflict, such as gross domestic product (GDP), are good news for the academic but useless to a policymaker. How does one stop civil conflict in this case? Parachute cash?
Despite much academic effort in studying conflict, stable causal links between variables that policymakers can use to end such conflict remain chimerical. So what can be done to deliver stable causal links to policymakers and academics? I suggest that machine learning can show a way out.
In my recent work with Jim Bang, Tinni Sen and John David, we explore how machine learning can further our knowledge of civil conflict and provide finer-grained policy responses. Our definition of civil conflict is the same as that of the Political Instability Task Force. We assembled 58 potential covariates of civil conflict, covering as many countries and years as cross-country data allowed. These variables were the set of political, social, institutional and economic variables identified in the literature as potential correlates of civil conflict. Our goal was to find and validate, out of sample, the variables that predicted civil conflict five years out. The machine learning algorithms identified these predictors by measuring how much prediction error increased when a variable was taken out.
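The idea of scoring a variable by how much the error rises once it is removed can be sketched in a few lines. This is only a toy illustration, not the study's code: the records, the variable names `prior_conflict` and `rainfall`, and the helper functions are all hypothetical, and a simple nearest-centroid classifier stands in for the actual algorithms so that the example runs with the Python standard library alone.

```python
# Toy "drop-variable" importance: refit the model without one variable and
# record how much the out-of-sample error rises. All data and names here are
# hypothetical; a nearest-centroid classifier stands in for the real models.

def fit_centroids(rows, labels, columns):
    """Return {label: mean vector over the chosen columns}."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(r[c] for r in members) / len(members)
                            for c in columns]
    return centroids

def predict(centroids, row, columns):
    """Assign the label of the closest centroid (ties go to the lowest label)."""
    def sq_dist(label):
        return sum((row[c] - m) ** 2
                   for c, m in zip(columns, centroids[label]))
    return min(sorted(centroids), key=sq_dist)

def holdout_error(train, y_train, test, y_test, columns):
    """Share of held-out records that the fitted model misclassifies."""
    centroids = fit_centroids(train, y_train, columns)
    wrong = sum(predict(centroids, r, columns) != y
                for r, y in zip(test, y_test))
    return wrong / len(test)

def drop_variable_importance(train, y_train, test, y_test, columns):
    """Importance = held-out error without the variable minus baseline error."""
    baseline = holdout_error(train, y_train, test, y_test, columns)
    importance = {}
    for dropped in columns:
        kept = [c for c in columns if c != dropped]
        importance[dropped] = (
            holdout_error(train, y_train, test, y_test, kept) - baseline
        )
    return importance

# Hypothetical country-period records; label 1 means conflict five years later.
train = [
    {"prior_conflict": 1, "rainfall": 0.25},
    {"prior_conflict": 1, "rainfall": 0.75},
    {"prior_conflict": 0, "rainfall": 0.25},
    {"prior_conflict": 0, "rainfall": 0.75},
]
y_train = [1, 1, 0, 0]
test = [
    {"prior_conflict": 1, "rainfall": 0.5},
    {"prior_conflict": 1, "rainfall": 1.0},
    {"prior_conflict": 0, "rainfall": 0.5},
    {"prior_conflict": 0, "rainfall": 1.0},
]
y_test = [1, 1, 0, 0]

importance = drop_variable_importance(
    train, y_train, test, y_test, ["prior_conflict", "rainfall"]
)
print(importance)
```

In this made-up data, removing `prior_conflict` raises the held-out error while removing `rainfall` leaves it unchanged, which is the kind of contrast an importance ranking is built from.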
The algorithms do not identify causal patterns; they merely identify predictive ones. This causal agnosticism is a limitation of predictive machine learning models. Nevertheless, we suggested that variables that could not predict could not be causal either. After all, causal variables should predict. Therefore, poor predictors should not be part of any model specification designed to identify causality. Thus, we developed a consistent pathway to identify which variables should be in a causal econometric model explaining civil conflict. We argued that the search for causality should start among predictive variables. Predictive variables, if found causal, can be good policy levers as well. In fact, since policy by definition affects future outcomes, a very important predictor, if also causal, becomes an obvious choice as a policy lever over a less important predictor.
Our algorithms scored the likelihood of civil conflict in different countries in the 2014 to 2019 period. The “random forest” method was one of the most accurate. The likelihood of civil conflict as estimated by that method appears below. My first point here is the most obvious. Machine learning algorithms can be used as an early warning system for civil conflict, allowing the global community to act before conflict starts.
Likelihood of civil conflict 2014-2019 by country:
What action should this be? This is where the most influential predictors become important as policy levers. Existing civil conflict is our top predictor of future conflict, which suggests that civil conflict is path dependent and makes identifying conflict-prone countries and preventing conflict all the more important. Once conflict starts, it becomes hard to stop. This may seem obvious in hindsight, but it is not so obvious to policymakers or, for that matter, academics.
We have heard how democratic elections, more economic growth, or even preventing climate change can help stop conflict. Machine learning suggests conflict needs to be stopped before any other intervention can become effective. It also suggests that conflict persistence needs to be studied more, since there is currently no universally accepted, validated theory explaining persistence, although I am working on it.
The top ten predictors of conflict are presented in the table below and give us a sense of the policy levers that may prevent conflict. First, though, I would like to point out variables that are not good predictors of conflict. Notice, for example, that variables such as rainfall, trade shocks, ethno-social differences and income inequality are missing. Thus, climate change or grievances arising from income inequality or ethnic differences are not likely overall to be immediate causes of civil conflict. If they were, they would be predicting civil conflict better than they do.
Top 10 predictors of civil conflict:
Machine learning has winnowed 58 variables down to about ten or fewer as the most important predictors of civil conflict, with some variables sorted through an extensive data preparation process using Exploratory Factor Analysis. In our findings, we suggest that the search for causality must involve these variables. If, for example, rainfall does matter in fomenting civil conflict, machine learning suggests it operates through one of the channels identified among these top ten. Incidentally, the top ten is not a magic number. It could be the top 15 or the top five. However, the predictive importance after the top seven drops about 0.5% and never quite recovers. At any rate, we now have a robustly validated set of correlates of civil conflict, based on thousands of different possibilities, that also counters the tyranny of the “p value” in quantitative science.
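One simple way to turn a ranking of importance scores into a shortlist is to cut where the score drops most sharply. The sketch below assumes that rule purely for illustration; it is not necessarily how the cutoff was chosen in the study. The scores are invented numbers, and the function name `rank_and_cut` and the variable names are hypothetical.

```python
# Rank variables by importance and keep those ahead of the largest drop.
# The scores are made-up illustrative numbers, not results from the study.

def rank_and_cut(importance):
    """Return the variable names ranked ahead of the largest score gap."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    # Gap between each ranked score and the one immediately below it.
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps)) + 1  # keep everything before the biggest gap
    return [name for name, _ in ranked[:cut]]

# Hypothetical importance scores (increase in prediction error, in percent).
scores = {
    "rainfall": 0.5,
    "existing_conflict": 5.0,
    "trade_shocks": 0.25,
    "gdp_per_capita": 4.0,
    "variable_c": 3.5,
}

kept = rank_and_cut(scores)
print(kept)
```

With these invented scores the largest drop falls after the third-ranked variable, so three variables are kept. A different rule, such as a fixed importance floor, would be just as easy to substitute, which is consistent with the point that ten is not a magic number.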
These findings could be a starting point for concentrated academic interest in robustly understanding the causes of civil conflict. Further, the search for appropriate policy levers must start among the most predictive variables, since changing the values of these variables, by definition, will have the most impact on the likelihood of future conflict.
And yes, we still think parachuting cash into a war-torn country to improve GDP per capita is a bad idea!
The opinions expressed throughout this article are the opinions of the individual author and do not necessarily reflect the opinions of Vision of Humanity or the Institute for Economics & Peace.