Projects & papers

R packages

spduration: implementation of Weibull and log-logistic split-population duration models. These models are designed for situations in which you want to model the time until some event, but the population is a mixture in which some proportion of cases will never experience the event at all. Examples include the age at which teenagers start smoking, criminal recidivism among former prisoners, and cancer relapse.
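As a rough illustration of the split-population idea, here is a minimal Python sketch of a Weibull split-population log-likelihood with no covariates. It is not the package's code: the function names, the constant at-risk probability `pi_at_risk`, and the Weibull parameterization S(t) = exp(-(lam * t)^alpha) are assumptions made for this example. Cases with an observed event contribute the at-risk probability times the density; censored cases are either never at risk or at risk but still surviving.

```python
import math


def weibull_survival(t, lam, alpha):
    """Weibull survival function S(t) = exp(-(lam * t) ** alpha)."""
    return math.exp(-((lam * t) ** alpha))


def weibull_density(t, lam, alpha):
    """Weibull density f(t) = alpha * lam * (lam * t) ** (alpha - 1) * S(t)."""
    return alpha * lam * (lam * t) ** (alpha - 1) * weibull_survival(t, lam, alpha)


def split_pop_loglik(durations, events, pi_at_risk, lam, alpha):
    """Log-likelihood of a split-population Weibull model without covariates.

    durations  : observed times (positive numbers)
    events     : 1 if the event was observed, 0 if the case was censored
    pi_at_risk : assumed constant probability that a case can fail at all
    """
    loglik = 0.0
    for t, observed in zip(durations, events):
        if observed:
            # The case failed, so it must have been at risk.
            loglik += math.log(pi_at_risk * weibull_density(t, lam, alpha))
        else:
            # Censored: either immune, or at risk and surviving past t.
            loglik += math.log(
                (1.0 - pi_at_risk) + pi_at_risk * weibull_survival(t, lam, alpha)
            )
    return loglik
```

Setting `pi_at_risk` to 1 recovers an ordinary Weibull duration model, which is one way to see what the extra mixture component adds.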

states: make country-year/month/day panel data that is consistent with the COW or Gleditsch & Ward lists of independent states. I use these bare-bones panels as templates to which other data have to conform when merging IR datasets.

Articles, technical reports

Peer-reviewed academic articles and one technical report:


Unpublished papers (which I am not working on publishing):

  • Precision-recall curves
    2016, on SSRN (pdf)
    For rare outcomes (*cough*, a lot of IR), ROC curves and the area under them are not a great measure of model fit. Look at (the area under) precision-recall curves as well.

  • Using front lines to predict deaths in the Bosnian civil war
    2012, (pdf)
    To be useful for forecasting and prediction, a statistical model needs to be feasible given the data it requires. This paper examines how front lines and other, time-invariant variables relate to killings during the Bosnian civil war from 1992 to 1995. It estimates a Bayesian spatial count model and compares its fit to that of other, more established conflict models. One of the dissertation papers.

  • Explaining and predicting interstate war deaths
    2012, on SSRN (pdf)
    This paper is about predicting interstate war battle deaths. Data on 89 interstate wars between 1815 and 1991 are used to estimate a truncated regression model that provides the basis for out-of-sample forecasts for two other wars. Also a dissertation paper.

  • Predicting the intensity and location of violence in war
    2012, (pdf)
    My three-papers-wrapped-together Ph.D. dissertation.

  • Simulating the Effects of Selection Bias in the Minorities at Risk Project
    2008, (pdf)
    How much of a problem is it that the Minorities at Risk project collects information only for ethnic groups that are “at risk”, i.e. selection on the dependent variable?
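
The point made in the precision-recall entry in the list above can be sketched with a small Python example. Everything here is made up for illustration: the helper names, the 1,000 toy cases with 10 positives, and the ranking in which 50 negatives outscore every positive. Average precision is used as a common step-wise summary of the area under the precision-recall curve; this is not the code or the data from the paper.

```python
def roc_auc(labels, scores):
    """Share of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy rare-event data (hypothetical): 10 positives among 1,000 cases.
# Ranked from highest to lowest score: 50 negatives, then the 10 positives,
# then the remaining 940 negatives.
labels = [0] * 50 + [1] * 10 + [0] * 940
scores = list(range(len(labels), 0, -1))  # distinct, strictly decreasing scores

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC: {auc:.3f}, average precision: {ap:.3f}")
```

In this toy setup the ROC AUC is close to 0.95, while average precision stays below 0.1, because even the best-ranked positive sits behind 50 false alarms.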