Using ESG Score Predictor

The provision of Environmental, Social, and Governance (ESG) assessment and carbon footprint data, especially for small- and medium-sized enterprises (SMEs), is a young and emerging field. Very few data providers service this space, and those that do maintain limited company coverage. This paper introduces Moody’s Analytics ESG Score Predictor models, designed to bridge these coverage gaps by generating predictions of a wide array of ESG and carbon footprint metrics. Based on a model derived from Moody’s proprietary ESG scoring methodology, the ESG Score Predictor uses location, sector, and size to provide comparable and standardized predicted metrics for public and private enterprises worldwide, spanning a wide variety of industries. These metrics can be used for portfolio and risk management, and they can help companies monitor ESG risk across global supply chains.

1. Introduction

Environmental, Social, Governance (ESG) issues and climate risk have become critical considerations for investors such as banks, insurers, commercial real estate managers, asset managers, and hedge funds, among others, to identify risks and opportunities within their portfolios. Firms are under increasing regulatory and market pressure worldwide to assess their ESG and climate risk management practices, an effort that requires consistent and comparable metrics. Despite rapid growth in available underlying data, the provision of ESG data, especially for small- and medium-sized enterprises (SMEs), remains a young and emerging field. Few data providers service this space, and those that do typically provide coverage for only a few thousand companies, a small sample of the hundreds of millions of firms worldwide.

This paper introduces Moody’s Analytics ESG Score Predictor methodology framework, a set of models designed to bridge coverage gaps by generating estimates of ESG and climate risk metrics for any company, including private, public, and large-, mid-, and small-cap firms. Analyzing how well a firm manages its ESG-related matters, including climate risk, is typically performed using a combination of company-level quantitative and qualitative information. This process involves directly engaging with companies, as well as reviewing publicly disclosed information. Sometimes this information is supplemented with alternative data to help measure and assign attribute weights during the assessment. However, carrying out a primary assessment is not always feasible, as data gaps often exist. Also, since a primary assessment is a lengthy and manual process, it cannot be done for a large number of companies. Coverage is particularly patchy for smaller companies and companies in less-regulated industries or emerging markets. These factors limit the number of companies covered by major ESG score providers, which typically ranges from only 1,000 to 10,000 firms, posing a challenge for organizations with larger portfolios. Moody’s ESG Score Predictor models help assess “unscored” firms, enabling full portfolio coverage.

ESG Score Predictor models provide estimates of 59 ESG and climate risk assessment metrics designed by Moody’s ESG Solutions. The ESG assessment follows ESG Solutions’ proprietary methodology and contains multiple layers of granular scores. Climate metrics include physical risk management scores, energy transition scores, and Scope 1 and Scope 2 carbon emissions.¹ The ESG Score Predictor models provide comparable and standardized metrics, allowing users to compare companies across industrial sectors, market cap size segments, and locations, while accounting for economic, social, natural, and human capital development indicators in the location(s) where a company operates.

Moody’s ESG Solutions

Moody's ESG Solutions' ESG scores follow a proprietary research methodology and measure the degree to which companies consider and manage material Environmental, Social, and Governance factors. Companies with higher ESG scores are stronger at managing relationships with their stakeholders. They are also less likely to experience business disruption or miss opportunities due to failed stakeholders’ expectations. This, in turn, can better position firms to mitigate risks and create sustainable value for shareholders over the medium to long term.

The research methodology is based on universally recognized norms and standards emanating from international organizations, such as the UN, ILO, and OECD. To generate the scores, Moody's ESG Solutions analyses and scores up to 30 distinct ESG criteria that are framed within over 50 industry-specific models. Criteria are examined using a managerial questioning framework based on three standardized pillars: leadership, implementation, and results. The leadership pillar contains questions that examine the visibility of a policy on a specific criterion, exhaustiveness of the policy, and the level of ownership. Implementation examines the means put in place to manage responsibilities within a criterion, the scope of the means, and the geographical coverage. Results show the trend of relevant key performance indicators, the frequency and severity of controversies, and the responsiveness of the company to the controversies. Criteria-level scores are aggregated, through a weighted average calculation, into six domain level scores, E, S, and G scores, and an ESG Global score for the company. In each industry framework, the criteria aggregation weight depends on the materiality to the sector and stakeholders.
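The weighted-average aggregation described above can be illustrated with a minimal sketch. The criterion names, scores, and materiality weights below are hypothetical and serve only to show the arithmetic; they are not taken from the Moody's ESG Solutions methodology.

```python
def weighted_average(scores, weights):
    """Weighted average of criterion scores; weights need not sum to one."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical criterion scores (0-100) and materiality weights for one domain.
criteria_scores = {"criterion_a": 60.0, "criterion_b": 40.0, "criterion_c": 80.0}
materiality_weights = {"criterion_a": 3, "criterion_b": 2, "criterion_c": 1}

# (60*3 + 40*2 + 80*1) / 6
domain_score = weighted_average(criteria_scores, materiality_weights)
```

The same function could be applied again one level up, to combine domain scores into E, S, and G scores and then into a global score, using a different set of weights at each level.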

Bridging coverage gaps using ESG Score Predictor

Currently, Moody's ESG Solutions' research covers approximately 5,000 issuers divided into over 50 sectors. Hence, the predicted metrics using the ESG Score Predictor models serve to expand company coverage and complement the existing universe.

To build the ESG Score Predictor models, we leverage the historical data of companies assessed by Moody’s ESG Solutions and its predecessor businesses from 2004–2020. The models are trained and calibrated on a dataset of more than 100,000 individual firms, covering 600+ industries across 220 countries and territories. The prediction model for each ESG metric is a combination of different individual regressions and alternative machine learning (ML) models, trained on the raw dataset, with predictions extracted and then combined into a final model using ensemble methods. The model predictions are further calibrated to facilitate the model’s extension to a wider universe of companies in countries not yet well covered by Moody’s ESG Solutions, to include medium- and small-sized companies, and to provide more granular metrics’ predictions by industry and location. The ESG Score Predictor models are used to calculate ESG scores and to produce interpretable, predicted metrics for any firm for which the size, location (within the 12,000 subnational locations), and industry (in NACE 4 list) are known.

Figure 1 illustrates the ESG Score Predictor models’ framework for the individual metrics. The target variables are the 59 ESG assessment scores and climate risk metrics, each modeled with a variety of drivers, including corporate disclosures and macroeconomic, socioeconomic, climate, and spatial data (see the full list in the Appendix).

In Step 1, a set of base learners (both regression and machine learning models) are trained on the training data sample. The predictions of these base models are fed into an ensemble model in Step 2. In Step 3, the model performance is evaluated on the test sample. Finally, in Step 4, the final scores are recalibrated to account for known bias in the modeling data and to further expand the location granularity of the ESG Score Predictor methodology to the subnational level and NACE 4 industry classifications. The bias in the modeling data stems from the fact that companies scored by Moody's ESG Solutions are typically larger (often publicly traded) companies that would typically have a more advanced ESG strategy. The relatively small size of the existing sample of scored companies also means gaps across geographic coverage.

We organize the remainder of this paper as follows. Section 2 discusses the data used in the model. Section 3 provides details on the methodology choices for the base learners and the ensemble model, as well as model performance. Section 4 covers the model calibration performed to overcome the known bias in the model training data. Section 5 presents the key conclusions.

2. Data

Figure 2 displays the flowchart for creating the modeling dataset. We obtain the data to build the prediction models from various sources (Appendix 7.2 lists the data types and sources to build the ESG Score Predictor models). The target variables are the 56 firm-level ESG scores and the three GHG emission indicators sourced from Moody’s ESG Solutions (Appendix 6.1 lists ESG Score Predictor target metrics). The ESG scores range from 0 (low sustainability performance) to 100 (high performance), while the greenhouse gas (GHG) emissions are expressed in CO2-equivalent tons. The drivers include firm-level corporate disclosures and country- and subnational-level variables. Firm-level corporate disclosures include company size (quantified by total assets, turnover, number of employees), industry,² and geographical location obtained from Moody’s Market Implied Ratings (MIR), Moody’s Analytics CreditEdge™ (CE), and Moody’s Default and Recovery Databases (DRD). Country-level climate, physical, and sovereign risk metrics are sourced from Moody’s ESG Solutions. Country and subnational macroeconomic and population indicators are obtained from Moody’s Data Buffet,³ which aggregates data from private and government data providers. Social, economic, and natural and human capital indicators are obtained from a variety of open sources, including government data providers and NGOs.

The datasets are carefully merged and cleaned to ensure that the data used to build the models are of very high quality. The goal of merging the datasets is to cross-source information from different repositories and to populate firm-level data for as many companies as possible.

We expand the target indicators since only a small fraction of the entities contained in the firm-level data have been rated by Moody's ESG Solutions, assuming:

» ESG scores for national or regional branches of corporates are equal to the values for the entire group, as Moody's ESG Solutions typically rates corporates as a whole. This assumption is not applied when scores are provided separately for the branch.

» ESG scores for subsidiaries are equal to the value for the parent companies, assuming that the latter exert enough influence on subsidiaries to shape their corporate sustainability policies. This assumption is discarded when subsidiaries are rated by Moody's ESG Solutions independently from parent companies.

These assumptions do not apply to the carbon footprint indicators (Scope 1, Scope 2, Scope 1+2), as the amount of GHG emissions strongly depends on company size.

The factors driving the changes in ESG ratings are typically linked to the improvement or the deterioration of corporate sustainability policies, their implementation (also requiring the investment of capital), developments in national or regional laws and regulations, and technological improvements. Similarly, for carbon footprint indicators, regulations, technological developments (especially for energy efficiency or emission capture), and increases or decreases in business activity can be responsible for variations in GHG emissions. All of these factors typically change on a time scale of several years. Therefore, we deem it reasonable to assume a time persistence for ESG scores and emissions, i.e., that they remain constant between two subsequent sustainability assessments by Moody's ESG Solutions.
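As an illustration of the time-persistence assumption, the following minimal sketch carries the most recent assessed value forward until the next assessment. The function name, the assessment years, and the score values are hypothetical.

```python
def carry_forward(assessments, years):
    """Fill each year with the most recent assessed value (None before the first one)."""
    filled = {}
    last_value = None
    for year in years:
        if year in assessments:
            last_value = assessments[year]
        filled[year] = last_value
    return filled


# Hypothetical firm assessed in 2015 and again in 2018.
assessed = {2015: 42.0, 2018: 47.0}
panel = carry_forward(assessed, range(2014, 2021))
# 2015-2017 keep the 2015 score; 2018-2020 keep the 2018 score; 2014 stays empty.
```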

For most of the country- and subnational-level macroeconomic and population variables, we consider the annual percentage growth rate rather than the level, so that the scale of the economies does not affect our results or make comparisons among countries unreliable.⁴ Furthermore, to avoid issues of endogeneity, all the macroeconomic variables are lagged by one year relative to the target variable and the remaining drivers.
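The two transformations in this paragraph (levels to annual growth rates, then a one-year lag) can be sketched as follows. The helper names and the GDP figures are hypothetical, and the sketch assumes annual data keyed by calendar year.

```python
def annual_growth_pct(levels):
    """Map {year: level} to {year: percent change versus the previous year}."""
    growth = {}
    for year in sorted(levels):
        previous = levels.get(year - 1)
        if previous:  # skip years with a missing or zero previous level
            growth[year] = 100.0 * (levels[year] - previous) / previous
    return growth


def lag_one_year(series):
    """Align the value observed in year t-1 with the target observed in year t."""
    return {year + 1: value for year, value in series.items()}


# Hypothetical GDP levels for one country.
gdp = {2017: 100.0, 2018: 110.0, 2019: 121.0}
growth = annual_growth_pct(gdp)   # growth rates for 2018 and 2019
lagged = lag_one_year(growth)     # the same rates, keyed by 2019 and 2020
```

Working with growth rates removes the scale of each economy from the driver, and the lag ensures that a target observed in year t is only paired with macroeconomic information from year t-1.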

The resulting dataset for modeling and subsequent calibrations contains corporate disclosures for more than 100,000 companies. Depending on the data availability for each target metric and corresponding drivers, the modeling datasets for each metric contain between 28,985 and 323,051 firm-level observations, span 2004–2020, and cover 96 countries.

3. Prediction Models

The ESG Score Predictor model for each of the 59 target metrics (56 ESG scores plus 3 carbon footprint indicators) consists of individual base models combined into one, using ensemble techniques to provide the best approximation of the target metrics using a variety of drivers. This framework allows for predictions that are more flexible, stable, and less data-sensitive than standalone models. Figure 3 illustrates the process used to build the prediction models for each metric.

We start by defining the list of potential drivers considered for the construction of individual models for each target metric. Driver choice is based on comparability, impact, data availability, and relevance for influencing the target metric. We also consider documented findings on the empirical determinants of the metrics in the literature. For example, firm size is often associated with “better” ESG scores, together with the economic and social development of the country where the firm operates. Firm-level drivers are company size, industrial sector, firm country location, and the firm’s participation in the UN Global Compact. To assess a company’s efforts to adhere to sustainability, we also incorporate national-level drivers, such as economic, social, and natural and human capital indicators in the location where the company operates.

The sample used for model building includes 19,000+ firms globally, for which data for target metrics are available. We split the modeling dataset into train and holdout in a standard proportion of 70:30. The models are developed using the training sample, and then their performance is tested using the holdout sample. To fit the relationship between the drivers and each target metric, we construct individual regression models (including linear, logit-transformed, and fractional response models) and individual, alternative machine learning (ML) models (including neural networks, random forest, gradient boosted, and regression tree models).
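A 70:30 split of this kind can be sketched as below. Splitting at the firm level (so that all observations of a firm fall on the same side) and the fixed random seed are assumptions made for the illustration; the paper does not state how the split is drawn.

```python
import random


def train_holdout_split(firm_ids, train_share=0.7, seed=0):
    """Shuffle unique firm identifiers and split them into train and holdout lists."""
    ids = sorted(set(firm_ids))
    random.Random(seed).shuffle(ids)
    cut = round(train_share * len(ids))
    return ids[:cut], ids[cut:]


# Hypothetical firm identifiers.
firm_ids = [f"firm_{i:03d}" for i in range(100)]
train_ids, holdout_ids = train_holdout_split(firm_ids)
```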

The regression models are trained for the 56 ESG scores. For the carbon emission metrics, we only use linear regression, since they are not bound to a finite interval but instead take any non-negative real value.

The regression models use sector, size, year, and country as drivers. We discard other drivers since these are country-level time series that encode information on location and time. Hence, their inclusion does not significantly improve the predictive power of the models, whereas they would introduce a severe degree of multicollinearity, reflected in very high variance inflation factors (VIF) of order ~100.
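To make the multicollinearity argument concrete, the sketch below computes a variance inflation factor in the simplest possible setting. In general, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others; with only two predictors, R² equals the squared Pearson correlation, which is the case shown here. The series are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(x, y):
    """VIF of either predictor when the model contains only x and y."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical country-level index that trends almost linearly with the year.
year = [2015, 2016, 2017, 2018, 2019, 2020]
gdp_index = [100.0, 102.1, 103.9, 106.2, 107.8, 110.1]
vif = vif_two_predictors(year, gdp_index)  # far above common rule-of-thumb limits
```

A country-level series that moves almost in lockstep with time adds little information beyond the year and country drivers, which is why its VIF becomes very large.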

The ML individual models for all metrics include gradient boosted, random forest regression trees, and neural networks to capture non-linear dependences in a non-parametric way and to boost model performance. The ML base models are trained and their hyperparameters tuned using three bootstrapped resamples (with replacement) of the training data. The hyperparameters are tuned via iterative grid search over various parameter combinations, to minimize the root mean square error. Separate models are constructed on each of the bootstrapped samples and then evaluated on a holdout validation set. The models are aggregated to arrive at one final set of optimized hyperparameters, before the entire training dataset is used to fit a final model.

The next stage of model selection relies on interpretability measures and variable importance plots, to further screen drivers with counterintuitive or negligible impact on the target metrics. This step allows us to eliminate drivers that do not comply with prior expectations about their contribution and impact.

The intuition check is conducted by computing the Accumulated Local Effects plot (ALE) for each driver to analyze their influence on the prediction in terms of directionality and magnitude. To evaluate the impact of predictors on the target variables, Variable Importance Plots (VIP) are created for every ML model. We conduct this process to arrive at a parsimonious and explainable model, from which we can draw clear directional relationships between target variables and predictors.

The individual ML models and the regressions are combined into an ensemble to increase the predictive power of the final models: the outputs of the individual base learners become the inputs of a regression model, which uses an Elastic Net algorithm to shrink the coefficients of irrelevant base models close, or all the way, to zero. The coefficients of the regression 𝛽 are determined through a minimization

$$\beta = \arg\min_{\gamma} \left\{ \lVert S - X\gamma \rVert_2^2 + \lambda_1 \lVert \gamma \rVert_1 + \lambda_2 \lVert \gamma \rVert_2^2 \right\}$$

where S represents the dependent variable (ESG score or GHG emissions), X contains the independent variables (the predictions of the base models), and the Lagrange multipliers λ₁ and λ₂ are the regularization parameters associated with the 𝐿¹ and 𝐿² norms of the regression coefficient vector, ||𝛾||₁ = ∑_𝑝 |𝛾_𝑝| and ||𝛾||₂² = ∑_𝑝 𝛾_𝑝², respectively. Individually, for each modeled metric, this algorithm tunes over a grid of L1 and L2 regularization parameters, incorporating Ridge and Lasso as limiting cases, and automatically selects the model that minimizes the Mean Square Error (MSE). The lower limit for the coefficient of each base learner is set to zero, so that weak and irrelevant signals from the base models are dropped and negative coefficients, which would be counterintuitive, are not allowed. The values of λ₁ and λ₂ are tuned via 10-fold cross-validation to ensure the final ensemble model generalizes well to an unseen population of data.

After selecting the ensemble model, we establish model validation requirements. We test the accuracy of the final models using measures including R-squared, Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), as well as consistency checks in terms of size, location, and industry. To arrive at the optimal model, the validation is performed in parallel and in an iterative manner instead of sequentially.
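The ensemble step can be illustrated with a small coordinate-descent sketch of an Elastic Net regression with non-negative coefficients. This is not the production implementation: the scaling of the objective (the 1/2n and 1/2 factors), the absence of an intercept, the fixed values of the two penalties, and the toy data are all assumptions, and the grid search with 10-fold cross-validation is not shown.

```python
def fit_nonneg_elastic_net(X, s, lam1, lam2, n_iter=200):
    """Coordinate descent for
    (1/2n)*||s - X g||^2 + lam1*||g||_1 + (lam2/2)*||g||_2^2, subject to g >= 0."""
    n, p = len(X), len(X[0])
    g = [0.0] * p
    resid = list(s)  # residual s - X g, with g = 0 at the start
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z + lam2 == 0:
                continue  # all-zero column without ridge penalty: leave at zero
            rho = sum(v * (r + v * g[j]) for v, r in zip(col, resid)) / n
            new = max(0.0, rho - lam1) / (z + lam2)  # thresholding with floor at 0
            delta = new - g[j]
            if delta:
                resid = [r - v * delta for r, v in zip(resid, col)]
                g[j] = new
    return g


# Hypothetical predictions from three base learners (columns) for five firms.
base_predictions = [
    [52.0, 48.0, 30.0],
    [58.0, 63.0, 20.0],
    [71.0, 68.0, 10.0],
    [41.0, 42.0, 45.0],
    [54.0, 57.0, 25.0],
]
actual_scores = [50.0, 60.0, 70.0, 40.0, 55.0]
weights = fit_nonneg_elastic_net(base_predictions, actual_scores, lam1=0.1, lam2=0.01)
```

The `max(0.0, ...)` in the update is what enforces the zero lower bound: a base learner whose signal does not exceed the L1 penalty receives a weight of exactly zero and drops out of the ensemble.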

The ensemble model performance for selected metrics is summarized in the following tables. The R2 indicates that the models explain above 90% of the variability in the training sample and above 70% in the testing sample. For the scores, the MAE 2.89 indicates that, on average, the model has a prediction error of only 1–7 points on a 0–100 scale, while the RMSE goes up to 9 points. For carbon emissions, the RMSE and MAE are below 1 logarithm point.
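For reference, the three accuracy measures quoted here can be computed as in the following minimal sketch. The actual and predicted scores are hypothetical and unrelated to the reported results.

```python
def regression_metrics(actual, predicted):
    """Return R-squared, RMSE, and MAE for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": (ss_res / n) ** 0.5,
        "mae": sum(abs(e) for e in errors) / n,
    }


# Hypothetical holdout scores on a 0-100 scale.
metrics = regression_metrics([40.0, 50.0, 60.0, 70.0], [42.0, 49.0, 63.0, 68.0])
```

Because the RMSE squares each error before averaging, it is never smaller than the MAE, which is consistent with the RMSE figures above being higher than the MAE figures.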

In addition, we look at the model accuracy by region, country, and industry to evaluate how close the predicted metrics are to the actual target metrics in each subgroup, and we examine the distribution of residuals to confirm the model choice is appropriate. In this paper, we display these results for the Overall ESG score for selected industries and regions.⁷ We observe that the R2 does not go below 75%, while the maximum MAE is 4.64 on a 0–100 scale. The distributions of actual and predicted scores have similar shapes and ranges. The predicted scores exhibit slightly higher concentration (indicated by higher peaks), which is expected: with much less detail and granularity in ESG disclosure for the targeted companies, the ESG Score Predictor outputs are less idiosyncratic than the scores assigned by Moody's ESG Solutions, which involve expert opinions on a case-by-case basis. The residuals between actual and predicted scores are highly concentrated around zero, indicating that the predicted score is an unbiased estimate of the actual score and captures the majority of its dynamics. The Q-Q plots also show good model performance, since the data points are closely clustered around the 45-degree diagonal line.

Thus, the robust model performance indicates we can confidently predict ESG assessment scores and climate metrics for unscored firms.

4. Calibration of Model Predictions: Avoiding Statistical Bias and Expanding Model Coverage

The ESG Score Predictor aims to provide estimates of ESG ratings and GHG emissions for the widest variety of firms. However, the predictions may suffer from statistical biases for companies in locations, sizes, or industries that have limited or no coverage in our modeling dataset. To address this challenge, we apply four types of calibrations to the model predictions to ensure the robustness of the metrics:

1. Out-of-geography calibration for countries and territories that are not covered or are relatively underrepresented in the modeling dataset. We identify country clusters to expand the country coverage in the modeling dataset from 96 countries to 220 countries and territories.

2. Small-size calibration to capture the specific features of micro-, small-, and medium-sized enterprises (SMEs). We adjust the predictions leveraging the relationship between the target metrics and company size from an auxiliary model.

3. Industry calibration to increase the granularity of model-driven predicted metrics from 272 NACE 3 industry groups to 615 NACE 4 classes. Under the assumption that companies’ financial performances contribute to discriminating among the NACE 4 categories in terms of target metrics, we use a best-fitted regression model on industry-specific financial ratios⁸ to quantify corrections to the predicted scores.

4. Subnational calibration to further increase location granularity from 220 countries and territories to 12,000 subnational locations. We apply an adjustment to differentiate each subnational region from the country as a whole. Up to three distinct levels of subnational granularity are addressed⁹ by leveraging established empirical relationships among the target metrics, economic indicators, and development indices¹⁰ using subnational data. We treat advanced economies and emerging countries separately to account for fundamental differences in their economic systems.

Out-of-Geography Calibration: Expanding Country Coverage

The modeling sample consists mostly of firms operating in developed countries, which contributes to higher baseline predicted scores. For developing countries, the firms rated by Moody's ESG Solutions are likely large and attentive to corporate sustainability issues, boosting the average scores for these countries. Therefore, without the out-of-geography calibration, predictions risk being overestimated.

To correct this bias, we devise this calibration to rescale the output of the ensemble models for entities located in out-of-geography locations, i.e., countries and territories not yet covered by Moody's ESG Solutions or for which fewer than 10 entities have been rated. Countries with solid coverage are not affected. The models for GHG emissions are not significantly affected by this potential bias, since emissions are mainly driven by the sector and size of the companies. In fact, the predictions are consistent with the country-level ratios of CO2 emissions (in kg) from fuel combustion over GDP (USD PPP).

The predicted metrics are rescaled by a country- and score-specific factor. The computation of the rescaling factor proceeds through several steps. First, a K-means clustering algorithm separates all the countries and territories into a handful of groups, based on economic, social, and natural and human capital country-level indicators, sovereign scores provided by Moody's ESG Solutions, and geography, so that every nation is related to a set of countries exhibiting similar features. K-means clustering is applied to all 220 countries and territories based on a collective set of up to 98 indicators, as per Table 4.

The selection of indicators considered for clustering is tailored to the ESG score to be predicted. The optimal number of clusters is chosen by combining the Elbow and Silhouette methods, ensuring that each cluster contains a reasonable number of countries covered and companies rated by Moody's ESG Solutions.
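The clustering step can be illustrated with a compact K-means sketch that selects the number of clusters by the average silhouette width. This is a simplified stand-in: the indicator values are hypothetical, the farthest-point initialization is chosen only to keep the example deterministic, and the Elbow criterion and the coverage constraints mentioned above are not shown.

```python
def dist(a, b):
    """Euclidean distance between two indicator vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Average silhouette width; singleton clusters contribute zero."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical standardized indicators for six countries (two indicators each).
indicators = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels_k2 = kmeans(indicators, 2)
silhouette_by_k = {k: silhouette(indicators, kmeans(indicators, k)) for k in (2, 3)}
best_k = max(silhouette_by_k, key=silhouette_by_k.get)
```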

The next step establishes the empirical relations between ESG scores on one side and socioeconomic, environmental, human development, macroeconomic indicators, and sovereign ratings on the other. To formulate these relations, an auxiliary linear model is created for each of the 56 ESG scores by regressing the country average of the target score against variables, such as the ones shown in Table 4. The auxiliary regressions are cluster-specific, in that for every cluster the linear model is fitted over all the data available, but weights for countries belonging to that cluster are doubled. The variable selection process is automated and relies on Moody’s Optimal Variable Selection Algorithm.¹¹ The outputs of the fitted auxiliary model are referred to as

The tendency of the models to overestimate/underestimate the predicted scores for a country is evaluated by computing the average of a synthetic portfolio of simulated companies. For each of the 220 countries, the representative sample contains one fictitious company for each of the 272 industries at NACE level 3, whose size is of the same magnitude as the median of the full modeling sample, over three years (2011, 2015, and 2019). The output of the Score Predictor for a given ESG metric S and a country c is aggregated into a country-level average, denoted as

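The ratio of these two country-level quantities (the output of the fitted auxiliary model and the Score Predictor country-level average) is used to correct the output of the ensemble model.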

To visualize the effect of the “out-of-geography” calibration, we compute the average Overall ESG score with Score Predictor for 220 samples (one for each country), each containing one simulated company for every combination of 615 NACE level 4 industries and 17 firm-size buckets, i.e., 10,455 simulated companies per sample. The Spearman correlation (i.e., rank order-based correlation) between the country-average Overall ESG scores and sovereign ESG scores increases from 12.3% to 73.3%, as shown in Table 5.
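The Spearman correlation used in this comparison is the Pearson correlation of the ranks of the two series. A minimal sketch follows; the inputs used to exercise it are hypothetical and unrelated to the figures in Table 5.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Hypothetical country-average predicted scores and sovereign scores.
rho = spearman([48.0, 52.0, 61.0, 39.0], [55.0, 60.0, 72.0, 41.0])
```

Because only the ordering of countries matters, the measure is insensitive to the different scales of the predicted and sovereign scores.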

Figure 7 illustrates the calibration steps for a sample Libyan company and a sample French company. Because France is adequately represented in the Moody's ESG Solutions database, the out-of-geography calibration is not performed. However, companies located in Libya are not well represented in our training dataset and, consequently, predictions are adjusted to correct the sample bias.

Small-Size Calibration: Increase Coverage to Micro, Small, and Medium Enterprises (SME)

Figure 8 shows the trend between the predicted Overall ESG scores and total assets (here used as a proxy for company size), represented by the green line. For companies with total assets lower than a certain threshold, the trend flattens, since smaller size firms have limited coverage in the modeling sample and, therefore, the model cannot differentiate between firm sizes with assets below a certain point. Empirically, we observe that the flattening trend starts developing for firms with assets less than 20 million USD for ESG scores, while the threshold is of the order ~100 million USD for carbon footprint metrics. For smaller size companies, it is thus appropriate to calibrate the raw output of the ESG Score Predictor model by understanding the relation between target metrics and size, and then extending it to the smallest businesses, overriding the raw model’s predictions.

To calculate a calibrated score to correct for this size bias, we use auxiliary fractional response regression models once more to allow ESG Score Predictor models to develop more accurate scores for smaller size companies, i.e., below a threshold 𝑨₀ = 20 (million USD),

where 𝑺(𝑨) refers to the target ESG score as a function of size (total assets), 𝑺(𝑨₀) is obtained as the output of the ESG Score Predictor model, and m is a regression parameter.

The three carbon footprint metrics (GHG emissions Scope 1, Scope 2, and their sum) require a slightly different approach, since their range spans the positive real axis. Therefore, the dependence of carbon footprint indicators on size is more suitably described by a linear regression. The calibrated benchmark GHG emissions, as a function of turnover 𝑺(𝑨), for companies below the empirical threshold 𝑨₀ = 100 million USD, can be formulated as:

where 𝑺(𝑨₀) represents the output of the ESG Score Predictor model and m is a regression parameter.
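The formula for this calibration is not reproduced in this text, so the sketch below rests on an assumed functional form: emissions below the threshold are extrapolated log-linearly in size and anchored so that the curve meets the model output at the threshold, i.e., log S(A) = log S(A₀) + m (log A - log A₀). The function name, the value of m, and the numbers are hypothetical.

```python
import math


def calibrated_emissions(size, threshold_size, emissions_at_threshold, m):
    """Assumed log-linear extrapolation of emissions below the size threshold."""
    if size <= 0 or size >= threshold_size:
        raise ValueError("calibration applies only to sizes in (0, threshold)")
    log_s = math.log(emissions_at_threshold) + m * (
        math.log(size) - math.log(threshold_size)
    )
    return math.exp(log_s)


# Hypothetical inputs: threshold of 100 (million USD), model output of 5,000 tons
# at the threshold, and an assumed slope m = 1 (emissions proportional to size).
small_firm_emissions = calibrated_emissions(10.0, 100.0, 5000.0, m=1.0)
```

Under this assumed form, the calibrated curve is continuous at the threshold, and m controls how quickly emissions fall as firm size decreases.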

Industry Calibration: Increase Granularity from NACE 3 to NACE 4

The goal of this additional calibration is to extend the granularity of the industry coverage from the 272 industries (“groups”) in NACE level 3 to the 615 (“classes”) in NACE level 4. For this step, the output of the ensemble model is adjusted by adding sector-specific corrective terms.¹² The corrective terms are calculated as:

where the X variables represent the median values of a number of financial ratios/indicators for companies in NACE 3 and NACE 4, respectively. Examples of financial indicators considered include asset turnover, asset volatility, working capital turnover, revenue per employee, and debt ratio. The weights 𝛽𝑗 are estimated through regression using a pool of 100K+ firms. The financial ratios used are selected using the OVS algorithm.
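Since the formula itself is not reproduced in this text, the following sketch assumes one form consistent with the description: the corrective term is a weighted sum of the differences between NACE 4 and NACE 3 median financial ratios, ∑ 𝛽𝑗 (X𝑗 at NACE 4 minus X𝑗 at NACE 3). The weights and medians are hypothetical.

```python
def industry_correction(betas, medians_nace4, medians_nace3):
    """Assumed corrective term: weighted sum of median-ratio differences."""
    return sum(
        beta * (medians_nace4[name] - medians_nace3[name])
        for name, beta in betas.items()
    )


# Hypothetical regression weights and median ratios for one NACE 4 class
# and its parent NACE 3 group.
betas = {"asset_turnover": 2.0, "debt_ratio": -4.0}
class_medians = {"asset_turnover": 1.2, "debt_ratio": 0.5}
group_medians = {"asset_turnover": 1.0, "debt_ratio": 0.6}

correction = industry_correction(betas, class_medians, group_medians)
calibrated_score = 55.0 + correction  # 55.0 stands in for the ensemble output
```

Under this form, a NACE 4 class whose median ratios equal those of its parent NACE 3 group receives no correction.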

Subnational Calibration: Increase Location Granularity to Subnational

The methodology used to calibrate scores to a subnational level is very similar to the one used to increase industry granularity. To ensure consistency between nationwide and subnational-level ESG scores, upper and lower bounds are imposed on the spread between national and subnational ∆𝑆 calculated as:

The X variables represent the geographical location at the subnational (SN) and national (Nat) levels. The 𝛽𝑗, once again, are estimated through regression analysis, where the ESG scores or carbon footprint indicators are modeled as functions of these geographic locations. We observe that, for most of the ESG metrics, around 90% of subnational scores differ by less than four points¹³ (in absolute value) from the country-level scores; hence, we impose |∆𝑆| ≤ 4. Similarly, the constraint |∆𝑆| ≤ log(3/2) is imposed on the logarithm of carbon footprint indicators, affecting around 10% of the subnational correction terms.
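The two bounds on the national-to-subnational spread amount to clipping the correction term, as the short sketch below shows. The function and constant names are hypothetical; the bound values are the ones stated above.

```python
import math


def bounded_spread(delta, bound):
    """Clip a subnational correction term to the interval [-bound, +bound]."""
    return max(-bound, min(bound, delta))


SCORE_BOUND = 4.0                      # points on the 0-100 score scale
LOG_EMISSIONS_BOUND = math.log(3 / 2)  # applied to the logarithm of emissions

clipped_score_spread = bounded_spread(6.2, SCORE_BOUND)
unchanged_score_spread = bounded_spread(-1.5, SCORE_BOUND)

# A bound of log(3/2) on the log scale caps the ratio of subnational to
# national emissions between 2/3 and 3/2.
max_emissions_ratio = math.exp(bounded_spread(1.0, LOG_EMISSIONS_BOUND))
```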

The names of the administrative divisions vary, depending on the country and the language, and there is no international standard. We borrow the terminology of the NUTS (Nomenclature of Territorial Units for Statistics) geocode from the European Union,¹⁴ which applies to all the EU countries, members of the European Free Trade Association, and several non-EU countries. This terminology is loosely extended to non-European countries, identifying the levels of granularity heuristically, by analogy with regions’ population or area.¹⁵ In total, the subnational dataset used for the subnational calibrations covers 150 countries and can extend coverage to 592 NUTS 1 areas, 6,638 NUTS 2 areas, and 4,849 NUTS 3 areas across the globe.

5. Summary

ESG Score Predictor models provide an analytical solution for generating a wide range of comparable and standardized metrics for assessing the potential degree to which companies take into account and manage ESG and climate risks. This is especially relevant for companies where a primary assessment is not possible, such as smaller-sized companies, less regulated industries, and companies in emerging markets. Using company size, location, and industry as inputs, the ESG Score Predictor models generate a set of prediction metrics by benchmarking to the historical ESG assessment scores by Moody’s ESG Solutions. The models are based not only on firm-level information; by analyzing the location where the company operates, they also link regional-level data such as physical risk, macroeconomic indicators, sustainability metrics, and developmental and freedom indicators to take into account the company’s operating environment.

Our statistical analysis shows that the models reproduce Moody’s ESG assessment scores with adequate accuracy, which allows coverage to expand to an unlimited number of companies while maintaining a consistent level of soundness. The ESG Score Predictor estimates are complementary to Moody’s ESG assessment metrics. Using both methodologies in combination allows institutions to evaluate companies across the entire book, identify pockets of ESG risk exposure (by size, location, and industry), flag companies likely to provide good returns, and support broader ESG roadmap efforts.

Appendix

ESG Score Predictor Target Metrics

Table 6 displays the list of metrics estimated by ESG Score Predictor models. Fifty-six metrics are scores ranging from 0 (low performance) to 100 (high performance), and three are carbon emissions metrics expressed in CO2-equivalent tons.

Initial Drivers for Ensemble Modeling

Table 7 displays the list of potential drivers initially considered to train the base learners, either machine learning models or regressions. The type of variable and source are also reported.

References

Chen, Tianqi and Carlos Guestrin, “XGBoost: A Scalable Tree Boosting System.” KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2016.

Crespi, Fabrizio and Milena Migliavacca, “The Determinants of ESG Rating in the Financial Industry: The Same Old Story or a Different Tale?” Sustainability, MDPI, Open Access Journal, Vol. 12(16), pages 1-20, August 2020.

Drempetic, Samuel, Christian Klein, and Bernhard Zwergel, “The Influence of Firm Size on the ESG Score: Corporate Sustainability Ratings Under Review.” Journal of Business Ethics, 167, 333–360, April 2019.

Licari, Juan, Olga Loiseau-Aslanidi, Simone Piscaglia, and Brenda Solis Gonzalez, “ESG Score Predictor: Applying a Quantitative Approach for Expanding Company Coverage.” Moody’s Analytics Whitepaper, June 2021.

Licari, Juan, Olga Loiseau-Aslanidi, and Dmytro Vikhrov, “Dynamic Model-Building: A Proposed Variable Selection Algorithm.” Moody's Analytics Risk Perspectives, Managing Disruption, Volume IX, July 2017.

Quéré, Bertrand P., Geneviève Nouyrigat, and C. Richard Baker, “A Bi-Directional Examination of the Relationship Between Corporate Social Responsibility Ratings and Company Financial Performance in the European Context.” Journal of Business Ethics, 148, 527–544, December 2015.

Reverte, Carmelo, “Determinants of Corporate Social Responsibility Disclosure Ratings by Spanish Listed Firms.” Journal of Business Ethics, 88 (2):351-366, 2009.

Zhou, Xiaoyan, Ben Caldecott, Elizabeth Harnett, and Kim Schumacher, “The Effect of Firm-level ESG Practices on Macroeconomic Performance.” Oxford Sustainable Finance Programme, Smith School of Enterprise and the Environment, University of Oxford, June 2020.

Footnotes

¹Scope 1 are direct emissions from sources that the company owns or controls. Scope 2 are indirect emissions from consumption of purchased electricity, heat, or steam. Scope 3 are all other indirect emissions that occur in a company’s value chain.

²The European Union-wide standard Nomenclature of Economic Activities (NACE) is selected as the reference industry classification. Its scope and granularity vary from NACE level 1 to 4, with NACE 4 offering the highest internationally harmonized industry sector granularity.

³Data Buffet is Moody's Analytics repository of international and subnational economic and demographic time series data.

⁴The exceptions are worldwide commodity prices, for which we take the first difference, since they apply to all countries, and unemployment, for which we consider both the level and the growth rate.

⁵For Scope 1 emissions, the performance metrics are computed on the logarithm of the emissions.

⁶For Scope 2 emissions, the performance metrics are computed on the logarithm of the emissions.

⁷We can provide these results for all 59 metrics upon request.

⁸We include asset turnover, working capital turnover, revenue per employee, debt ratio, and asset volatility measures.

⁹For the European Union and other European countries, the Nomenclature of Territorial Units for Statistics (NUTS) standard was adopted to define subnational territorial units, covering up to three levels of granularity. For the U.S., the three levels of territorial subdivision are states, counties, and townships. For other countries, territorial divisions are in line with Moody’s Analytics Global Subnational area definitions.

¹⁰The source for subnational development indices and indicators is Global Data Lab (https://globaldatalab.org/).

¹¹The Moody’s Optimal Variable Selection (OVS) algorithm is an automated procedure that finds the optimal combination of explanatory variables to predict a target metric. The OVS approach generates multiple models with all possible combinations of predicting variables, narrows down the model space by excluding model options based on defined selection criteria, and then sorts the resulting subset of models according to the specified ranking criteria, to determine the final adopted model.

¹²These additive terms do not depend on corporate size or location; they are independent of time.

¹³Due to the extreme volatility of macroeconomic variables in 2009 and 2020, subnational corrections computed from data relative to these years have been excluded from the analyses to determine the bound to subnational corrections.

¹⁴“Nomenclature des unités territoriales statistiques,” https://ec.europa.eu/eurostat/web/nuts/background.

¹⁵As an example, for the United States, NUTS 1 regions are assimilated to states, NUTS 2 to counties, and NUTS 3 to townships.
