A Comment on “Development of a Quantitative Model for the Top-Down Estimation of GHG Emissions from Transportation and Distribution”
Robert J. Erhardt, PhD
Associate Chair and Associate Professor of Statistics, Department of Mathematics and Statistics, Wake Forest University
To quantify a company’s full greenhouse gas (GHG) emissions, life cycle analysis represents a valuable standard and an active area of research (Reijnders 2012). One must consider the full path from extraction through manufacture, transportation, use, and disposal to tease out the fixed but unknown proportion of GHG emissions attributable to any particular company or product. The degree to which a life cycle procedure correctly computes the “true” GHG emissions of a product or company is itself an important debate. But suppose one accepts that a life cycle analysis is essentially correct, yet too expensive or burdensome to conduct for all companies in some research setting. How should one evaluate other estimates of GHG emissions that might be cheaper and easier to produce?
The preceding paper by Kjaer is an effort to develop such a model. That effort, however, faces some substantial challenges. Evaluating prediction accuracy against a benchmark set of cases, for which GHG emissions are known and can be compared to predictions, is necessary but not sufficient. What is needed beyond this is (1) an assessment of the underlying sample data, and (2) consideration of the statistical properties of the quantitative methods themselves.
Assume we have: (1) a set of n observations (which could be companies, products, etc.) for which we wish to estimate total GHG emissions; (2) actual benchmark values of GHG emissions obtained from life cycle analysis, which we call yi, i = 1, …, n; (3) proxy data on each company xi, i = 1, …, n; and (4) estimated GHG emissions ŷi, i = 1, …, n, produced by some statistical procedure applied to the proxy data. Whether the prediction procedure has any value is a function of the prediction errors, the observed values minus the predicted values, yi − ŷi, i = 1, …, n. Minimizing some sensible function of these errors, such as the mean squared error (MSE), is often the basis for what we call the “best” model. Prior to this stage, however, are considerations of the data and methods themselves.
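As a concrete illustration, the two most common summaries of prediction error can be computed directly from a benchmark sample. A minimal sketch in Python follows; all emissions values are hypothetical, invented purely for illustration, not taken from the paper under discussion.

```python
# Hypothetical benchmark emissions y_i (from life cycle analysis) and
# predictions yhat_i (from a procedure applied to proxy data), in tons CO2e.
y = [1200.0, 45000.0, 800.0, 230000.0, 5600.0]
yhat = [1500.0, 39000.0, 950.0, 260000.0, 4900.0]

# Prediction errors: observed minus predicted, y_i - yhat_i.
errors = [yi - yhi for yi, yhi in zip(y, yhat)]

n = len(errors)
mse = sum(e ** 2 for e in errors) / n   # mean squared error
mae = sum(abs(e) for e in errors) / n   # mean absolute error

print(f"MSE: {mse:,.0f}")  # MSE: 187,320,500
print(f"MAE: {mae:,.0f}")  # MAE: 7,430
```

Note that the single large company dominates the MSE because squaring weights big errors heavily, while the MAE treats a ton of error the same regardless of company size; which summary is "sensible" depends on the application.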
Is the GHG estimation procedure intended to be applied widely across a larger population, but demonstrated only on a smaller sample? If so, is the sample representative of the full population, or might it be a convenience sample — easily obtained but ill-suited to stand in for the full population? The degree to which any value of the procedure extends to other samples depends precisely on the answer to this question.
Moving to the true benchmark GHG emissions yi and proxy data xi, one must consider whether explanation or extrapolation is also a goal in building the quantitative model. Explanation allows for the meaningful interpretation of model parameters and, loosely speaking, is the ability to explain to a person unfamiliar with quantitative modeling why the model arrives at certain conclusions or predictions. Put another way, explanation is the opposite of so-called black box techniques, which produce predictions, but often in an opaque manner. Extrapolation allows the quantitative model to be applied to new proxy data outside the range of the observed xi, i = 1, …, n. A linear model (Faraway 2016) often scores highly on explanation and can score highly on extrapolation in some cases, whereas a tree or random forest (James et al. 2013) often scores less highly on both counts. It is even possible that the model which scores highest for prediction accuracy has essentially no ability to be explained or extrapolated. Assessing predictive accuracy says very little about the value of a quantitative model for explanation or extrapolation.
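The contrast in extrapolation behavior can be made concrete. The sketch below, on invented toy data, fits an ordinary least squares line and a depth-1 regression tree (a "stump," standing in for tree-based methods generally): asked to predict far outside the training range, the line continues its trend, while the stump can only return an average of training values it has already seen.

```python
# Toy training data: proxy values x_i and benchmark emissions y_i,
# invented for illustration (roughly y = 2x plus noise).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

# Ordinary least squares fit of y = a + b * x.
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
     / sum((xi - xbar) ** 2 for xi in x))
a = ybar - b * xbar

def linear_predict(x_new):
    return a + b * x_new

def stump_predict(x_new, split=3.0):
    # Depth-1 regression tree: predict the mean training y on
    # whichever side of the split x_new falls.
    left = [yi for xi, yi in zip(x, y) if xi <= split]
    right = [yi for xi, yi in zip(x, y) if xi > split]
    leaf = left if x_new <= split else right
    return sum(leaf) / len(leaf)

# Extrapolating well beyond the training range x in [1, 5]:
print(linear_predict(20.0))  # ~39.9, continues the fitted trend
print(stump_predict(20.0))   # 9.0, stuck at the mean of the rightmost leaf
```

The stump's prediction for any x beyond the training range is flat, which is the sense in which tree-based methods "often score less highly" on extrapolation; whether the linear trend itself remains valid out there is, of course, a separate substantive question.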
Turning now to assessing predictive accuracy directly, a guiding question must be whether errors of overestimation and underestimation are to be treated equally. There is the practical question of how the GHG predictions are being used, but also the statistical question of how the model is fit. Consider a log-transformation of GHG emissions, log10(yi), often needed to combine small-scale and large-scale companies in the same analysis. Suppose a particular company has actual emissions of 10^4 tons of CO2 equivalent, or 4 on a log10 scale. Overestimating this company on the log-transformed scale by 1 is equivalent to overestimating by 90,000 tons (5 − 4 on the log scale is 100,000 − 10,000 on the base scale); underestimating by 1 is equivalent to underestimating by only 9,000 tons (3 − 4 on the log scale is 1,000 − 10,000 on the base scale). Symmetry in errors also underlies many of the usual metrics for assessing prediction accuracy, such as R-squared, mean squared prediction error, and mean absolute prediction error. This is all to say that even in applications for which errors have asymmetric practical consequences, a surprising number of statistical procedures default to fitting the quantitative model by treating errors symmetrically.
In summary, the value of a quantitative model for estimating GHG emissions should be determined partly by the predictive performance of that model on a benchmarked sample, and partly by considerations of the sample itself and of the properties of the modeling technique that are known to hold across numerous samples. The present paper pursues a worthy goal, but further development will be needed to realize its full, intended benefits.
Faraway, J. J. 2016. Linear Models with R. Boca Raton, Florida: A Chapman and Hall Book, CRC Press, Taylor and Francis Group.
James, G., D. Witten, T. Hastie, and R. Tibshirani. 2013. An Introduction to Statistical Learning. New York: Springer-Verlag.
Reijnders, L. 2012. “Life Cycle Assessment of Greenhouse Gas Emissions.” In Handbook of Climate Change Mitigation, edited by Wei-Yin Chen, J. Seiner, T. Suzuki, and M. Lackner. New York: Springer.