by Andrea Fontanari and Kees Oosterlee (CWI)
The increasing use of drones is likely to result in growing demand for insurance policies to hedge against damage to the drones themselves or to third parties. The current lack of accident data, however, makes it difficult for insurance companies to develop models. We provide a simple yet flexible model using an archetype of Bayesian neural networks, known as Bayesian generalised linear models, to predict the risk of drone accidents and the claim size.
Drones (formally known as unmanned aerial vehicles, Figure 1) are no longer purely recreational objects, but essential business tools, used for purposes ranging from real estate photography to the transportation of goods.
Figure 1: An unmanned aerial vehicle (drone) in flight. (Photo by Walter Baxter, Creative Commons licence.)
The market demand for drones is expected to rise, and increased drone use brings a concomitant demand for flight insurance. Additionally, the regulator, the European Union Aviation Safety Agency (EASA), is advocating per-flight insurance to cover damage that a drone may cause to a third party. For example, as of 1 July 2020, the guidelines issued by EASA stipulate per-flight insurance any time a drone is used beyond the line of sight [L1].
The goal of our project, together with our partners Bright Cape (Netherlands), Achmea (Netherlands), Universidad Politécnica de Madrid (Spain) and EURAPCO (Switzerland), is to build a simple but robust prediction model of the risk embedded in drone flights, to help design a per-flight drone insurance.
Insurance pricing is usually determined by a model based on a large pool of historical accident data, in which the size and the number of claims are expressed as a (often non-linear) function of the determining features. In the car insurance industry, for example, features may include the type of car, its age and its location. However, with drones being a recent innovation, insurance companies lack adequate data to design a fully data-driven model. Data will likely arrive sequentially as more policies are developed and more flights occur. A model therefore needs to be able to combine existing information and expert judgements to partially make up for the initial lack of data, and should adapt to new information as it becomes available.
To this end, we have adopted the flexible framework offered by Bayesian generalised linear models (BGLM). These models can be understood as simple Bayesian neural networks with no hidden layers, which can capture non-linear relationships between the features and the responses while maintaining a fair degree of mathematical tractability and explainability.
We chose a sigmoid output layer with a binomial loss function to model the risk of an accident (understood as a probability), and an exponential output layer together with a gamma loss function to model the claim size.
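The two output layers and their losses can be sketched in a few lines. This is only a minimal illustration, not our production model: the function names, the single linear score shared by both functions and the fixed gamma shape parameter are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Sigmoid output layer: maps a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(weights, bias, features):
    """Linear predictor shared by both models (no hidden layers)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def accident_nll(weights, bias, features, accident):
    """Binomial (single-trial) negative log-likelihood for one flight.

    `accident` is 1 if the flight ended in an accident, 0 otherwise.
    """
    p = sigmoid(linear_score(weights, bias, features))
    return -(accident * math.log(p) + (1 - accident) * math.log(1.0 - p))


def claim_nll(weights, bias, features, claim, shape=2.0):
    """Gamma negative log-likelihood with mean exp(score).

    The exponential output layer keeps the predicted mean claim positive.
    The shape parameter is held fixed here (an assumption of this sketch).
    """
    mean = math.exp(linear_score(weights, bias, features))
    rate = shape / mean
    log_density = (shape * math.log(rate) - math.lgamma(shape)
                   + (shape - 1.0) * math.log(claim) - rate * claim)
    return -log_density
```

Summing these terms over all observed flights and claims gives the loss functions that the calibration minimises, up to the prior terms contributed by the Bayesian part of the model.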
The simple structure of BGLM reduces the risk of over-parametrisation and over-fitting, especially in the initial phases of the training when we cannot expect to have a large batch of available data.
The Bayesian side of the model allows prior information and expert judgements to be incorporated in a simple and natural way: the model parameters are understood as random variables whose distributions, called priors, handle the uncertainty about their realisations.
In terms of model calibration, the prior acts as a regularisation term added to the loss function, penalising parameter values that tend to over-fit. The Bayesian setting also enables us to handle the sequential nature of the problem: once the model is calibrated, the posterior distribution can be used as a new and improved guess on the parameters, that is, as the prior for the next run of the model.
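Both points can be written compactly. The notation is introduced here for illustration only: \theta denotes the model parameters, D_1 and D_2 two successive batches of data, and L the loss (the negative log-likelihood). The second line assumes that the batches are conditionally independent given \theta.

```latex
\begin{align}
-\log p(\theta \mid D_1)
  &= \underbrace{-\log p(D_1 \mid \theta)}_{L(\theta;\, D_1)}
     \;\underbrace{-\,\log p(\theta)}_{\text{prior penalty}}
     \;+\; \text{const}, \\
p(\theta \mid D_1, D_2)
  &\propto p(D_2 \mid \theta)\, p(\theta \mid D_1).
\end{align}
```

For a Gaussian prior with mean m and standard deviation s on a single parameter, the penalty term is (\theta - m)^2 / (2 s^2) up to a constant, so a tighter prior pulls the calibrated value more strongly towards the expert guess m.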
Using interviews and questionnaires, we collected expert opinions in order to identify the features that are most likely relevant for our problem. We embedded this information into our model using the hyper-parameters of the prior distributions.
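As an illustration of how such an opinion might be turned into hyper-parameters, suppose an expert states a range that they believe contains the baseline accident probability. One simple option, assumed here and not necessarily the elicitation scheme used in the project, is to treat the range as a central 95% interval of a Gaussian prior on the logit scale. The function names are hypothetical.

```python
import math


def logit(p):
    """Inverse of the sigmoid: maps a probability to the real line."""
    return math.log(p / (1.0 - p))


def prior_from_interval(low, high, z=1.96):
    """Gaussian prior (mean, sd) on the logit scale from an expert range.

    `low` and `high` are the expert's bounds on a probability; z=1.96
    treats them as a central 95% interval (an assumption of this sketch).
    """
    lo, hi = logit(low), logit(high)
    mean = 0.5 * (lo + hi)
    sd = (hi - lo) / (2.0 * z)
    return mean, sd
```

A wide range yields a large standard deviation, so the data quickly dominate; a narrow range yields a small one, so the expert opinion carries more weight.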
Once data are available, the model is calibrated using a Markov Chain Monte Carlo algorithm (MCMC) and scenarios are generated to predict the risk profile of a drone flight and the size of possible claims that could be generated.
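The calibration step can be sketched with a random-walk Metropolis sampler, one of the simplest MCMC algorithms, chosen here for brevity and not necessarily the sampler used in the project. The sketch covers the accident model only, with a single feature and independent Gaussian priors; all names, the step size and the default settings are assumptions of the example.

```python
import math
import random


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_posterior(theta, data, prior_mean, prior_sd):
    """Unnormalised log posterior: binomial log-likelihood + Gaussian log prior."""
    b0, b1 = theta
    total = 0.0
    for x, y in data:
        z = b0 + b1 * x
        # log sigmoid(z) = -softplus(-z); log(1 - sigmoid(z)) = -softplus(z)
        total += -softplus(-z) if y == 1 else -softplus(z)
    for t, m, s in zip(theta, prior_mean, prior_sd):
        total += -0.5 * ((t - m) / s) ** 2
    return total


def metropolis(data, prior_mean, prior_sd, n_samples=5000, step=0.3, seed=0):
    """Random-walk Metropolis sampler started at the prior mean."""
    rng = random.Random(seed)
    theta = list(prior_mean)
    current = log_posterior(theta, data, prior_mean, prior_sd)
    samples = []
    for _ in range(n_samples):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_posterior(proposal, data, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(list(theta))
    return samples


def predictive_accident_probability(samples, x):
    """Average the accident probability over the posterior samples."""
    probs = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for b0, b1 in samples]
    return sum(probs) / len(probs)
```

Given a list of (feature, accident) pairs, `metropolis` returns parameter draws, and `predictive_accident_probability` turns them into a scenario-averaged risk for a new flight. In practice the first part of the chain would be discarded as burn-in and convergence would be checked before the draws are used.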
In the early stages, it is likely that the model results will closely resemble the initial guesses formulated by the experts. However, over time, as more data become available, the model will be able to deviate from the initial guesses towards a data-driven, explainable estimation by giving more and more weight to the empirical evidence. Figure 2 gives a stylised view of the main ingredients of our approach and their interactions.
Figure 2: A schematic view of the model.
Bearing in mind that “all models are wrong but some models are actually useful”, we should expect that the model described may not be able to learn very complex non-linear structures that may arise in the data. An interesting extension that may help to mitigate this issue is the Combined Actuarial Neural Network (CANN) approach. This approach allows a fairly simple model to be blended with a deep feed-forward neural network designed to handle more complex data structures. However, the price to pay for this flexibility is greater model complexity, a larger amount of data required for training and a loss of explainability.
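The blending idea can be illustrated as follows: the score of the simple model and the output of a neural network are added before the exponential output layer, so that the network only has to learn what the simple model misses. The sketch below uses a single tanh hidden layer for brevity, where the CANN approach would use a deeper network; the function names and the architecture are assumptions of the example.

```python
import math


def glm_score(beta0, beta, x):
    """Linear score of the simple (GLM) part."""
    return beta0 + sum(b * xi for b, xi in zip(beta, x))


def network_score(hidden_weights, hidden_bias, output_weights, x):
    """One tanh hidden layer followed by a linear output without bias."""
    hidden = [math.tanh(b + sum(w * xi for w, xi in zip(row, x)))
              for row, b in zip(hidden_weights, hidden_bias)]
    return sum(v * h for v, h in zip(output_weights, hidden))


def cann_mean(beta0, beta, hidden_weights, hidden_bias, output_weights, x):
    """Predicted mean: exponential of the GLM score plus the network score."""
    return math.exp(glm_score(beta0, beta, x)
                    + network_score(hidden_weights, hidden_bias,
                                    output_weights, x))
```

If the output weights of the network are initialised to zero, the blended model starts exactly at the simple model, and training moves away from it only where the data support doing so.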
C. Anderson: “The drone economy”, Harv. Bus. Rev., 5, 2017.
E. Ohlsson, B. Johansson: “Non-life insurance pricing with generalized linear models (Vol. 2)”, Springer, 2010.
M. V. Wüthrich, M. Merz: “Yes, we CANN!”, ASTIN Bulletin: The Journal of the IAA, 49(1), 1-3, 2019.
Andrea Fontanari, CWI, The Netherlands