On top of this, the use of quantification has increased significantly over recent decades, with an inflation of metrics, indicators, and scores used to rank and benchmark options (Muller, 2018). Energy policy making in the European Union is again an effective example. The European Union’s recent energy strategy has been underpinned by the Clean Energy for all Europeans package, which is in turn supported by a number of individual directives, each characterized by a series of quantitative goals (European Commission, 2023). Quantifying the impact of a new political measure (impact assessment) is customarily required to promote it successfully (European Commission, 2015a), and this quantification in turn often relies on mathematical models (Saltelli et al., 2023). The emphasis on producing exact figures to assess the contribution of a new technology or of a political or economic measure has put many models and their users into decision-making contexts that at times extend beyond their original intent (Saltelli, Bammer et al., 2020). At the same time, efforts to retrospectively assess the performance of energy models have been extremely limited, one example being the Energy Modeling Forum in the United States (Huntington et al., 1982). Yet retrospective assessments can be very helpful in understanding the sources of mismatch between a forecast and the actual figures reported a posteriori (Koomey et al., 2003). For example, long-range forecast models are typically based on the assumption of gradual structural change, which is at odds with the disruptive events and discontinuities occurring in the real world (Craig et al., 2002). This dimension is especially important in terms of the nature and pace of technology change (Bistline et al., 2023; Weyant & Olavson, 1999).
A further critical element in this approach is the cognitive bias in scenario analysis, which naturally leads to overconfidence in the option being explored and results in an underestimation of the range of possible outcomes (Morgan & Keith, 2008).
Additionally, in their quest to capture the features of the energy systems they represent, models have increased in complicatedness and/or complexity. In this context, appraising model uncertainty has become of paramount importance, especially the uncertainty due to the error propagation caused by model complexification (Puy et al., 2022). In ecology, this is known as the O’Neill conjecture, which posits a principle of decreasing returns for model complexity when uncertainties come to dominate the output (O’Neill, 1989; Turner & Gardner, 2015). Capturing and apportioning uncertainty is crucial for a healthy interaction at the science–policy interface, including energy policy making, because it promotes better informed decision-making. Yet Yue et al. (2018) found that only about 5% of the studies covering energy system optimization models included some form of assessment of stochastic uncertainty, which is the part of uncertainty that can be fully quantified (Walker et al., 2003). When it comes to adequately apportioning this uncertainty onto the input parameters and hypotheses through sensitivity analysis, the situation is even more critical: Only very few papers in the energy field have made use of state-of-the-art approaches (Lo Piano & Benini, 2022; Saltelli et al., 2019). Further to that, the epistemic part of uncertainty, which arises from imperfect knowledge and problem framing, has been largely ignored in the energy modeling literature (Pye et al., 2018). For instance, important sources of uncertainty associated with regulatory lag and public acceptance have typically been overlooked.
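To make the idea of apportioning output uncertainty onto input parameters more concrete, the following is a minimal sketch of one common global approach, a Monte Carlo estimate of first-order variance-based (Sobol') sensitivity indices. It is an illustration under stated assumptions and is not drawn from any of the studies cited here: the model `toy_demand_model`, its three inputs, their ranges, and the helper names are all hypothetical, and the inputs are assumed to be independent.

```python
import numpy as np


def first_order_sobol(model, sample_inputs, n=20_000, seed=0):
    """Estimate first-order variance-based sensitivity indices by Monte Carlo.

    model:         maps an (n, k) array of inputs to an (n,) array of outputs
    sample_inputs: maps (rng, n) to an (n, k) array of independent input draws
    """
    rng = np.random.default_rng(seed)
    a = sample_inputs(rng, n)  # first independent sample matrix
    b = sample_inputs(rng, n)  # second independent sample matrix
    y_a, y_b = model(a), model(b)

    # Centre the outputs on the pooled mean to reduce estimator noise.
    pooled = np.concatenate([y_a, y_b])
    mean_y, var_y = pooled.mean(), pooled.var()
    y_a, y_b = y_a - mean_y, y_b - mean_y

    k = a.shape[1]
    indices = np.empty(k)
    for i in range(k):
        # Matrix A with column i taken from B: only input i is shared with B.
        ab_i = a.copy()
        ab_i[:, i] = b[:, i]
        y_ab_i = model(ab_i) - mean_y
        # Share of output variance attributable to input i alone.
        indices[i] = np.mean(y_b * (y_ab_i - y_a)) / var_y
    return indices


def toy_demand_model(x):
    # Hypothetical model: demand = activity * intensity * (1 - savings).
    activity, intensity, savings = x[:, 0], x[:, 1], x[:, 2]
    return activity * intensity * (1.0 - savings)


def sample_toy_inputs(rng, n):
    # Made-up uniform ranges, chosen only for illustration.
    return np.column_stack([
        rng.uniform(80.0, 120.0, n),  # activity level
        rng.uniform(0.8, 1.2, n),     # energy intensity
        rng.uniform(0.0, 0.1, n),     # efficiency savings
    ])


indices = first_order_sobol(toy_demand_model, sample_toy_inputs)
for name, s in zip(["activity", "intensity", "savings"], indices):
    print(f"{name}: first-order index ~ {s:.2f}")
```

Each index approximates the fraction of output variance that would be removed if the corresponding input were known exactly. A sketch of this kind addresses only the stochastic, quantifiable part of uncertainty; it says nothing about the epistemic part linked to problem framing.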
In this contribution, we discuss three approaches to dealing with the challenges of non-neutrality and uncertainty in models: the numeral unit spread assessment pedigree (NUSAP) method, diagnostic diagrams, and sensitivity auditing (SAUD). These challenges are especially critical when only one model (or set of models) has been selected to contribute to decision-making. One practical case is used to showcase, in retrospect, the relevance of the issue and the associated problems: the International Institute for Applied Systems Analysis (IIASA) global modeling in the 1980s.