EDF-X Early Warning System

Summary

An effective early warning of credit risk requires three essential elements: forward-looking risk measures that capture firm risk attributes and stages in the credit cycle, early warning decision rules, and an analytic framework that summarizes risk information from various sources and turns it into actionable insights. The EDF-X Early Warning System provides pre-calculated, point-in-time-oriented, forward-looking credit risk signals for more than 400 million public and private firms globally.

The EDF-X EWS combines two early warning decision rules into an actionable early warning framework. The first decision rule is distance to trigger, which helps you spot which firms stand out as relatively risky compared to their country/region/industry peer group. The second decision rule is the year-on-year change in implied rating, which measures the significance of recent change in credit risk. The EDF-X EWS synthesizes relative-to-peers and absolute credit risk information into a simple, actionable signal presented in the quadrant view.

1 Introduction

Early warning of material risk is one of the fundamental challenges of credit risk analysis. Whether it is a bank originating loans, a portfolio manager attempting to outperform a benchmark, or a corporate managing counterparty risk, extracting a signal of excessive risk for an exposure with enough time to take action is a problem shared by all. Credit risk measures and models that capture changes in risk over time have existed for decades. Metrics based on firm financials; the prices of traded financial assets, for example, equity shares, bonds, and credit-default swaps; behavioral data; macroeconomics; and, increasingly, alternative data, such as news, supply-chain resilience, etc., provide near real-time windows into the evolution of a company’s credit risk. These risk metrics provide important credit information, but deriving timely and actionable insights from these measures remains a real challenge.

Effective early warning of credit risk requires three essential elements. The first is timely, forward-looking, and point-in-time-oriented risk measures for every credit exposure in your portfolio. Moody’s Analytics probability of default (PD) models are ideally suited for this purpose and provide the foundation for the accurate measurement of credit risk. The second essential element of early warning is the set of decision rules that trigger alerts. The rules and alerts provide answers to two key questions: Which exposures should I worry about? And when should I get worried? With the answers to these questions, you can build an effective watchlist.

The final essential element of early warning is the analytic framework to efficiently implement an early warning system. Moody’s Analytics’ EDF-X¹ platform highlights the early warning risk measures and tools required, not just to manage risk, but also to screen potential borrowers and exposures, and sort out which are the stronger credits. Early and accurate identification of potentially problematic credits is certainly one of the primary goals of an effective early warning system (EWS), but so is efficiency. The early warning tools in EDF-X allow you to quickly screen your portfolio for excessively risky exposures, so that you can use scarce human resources to dive deeper into these names and gain greater insight.

This Model Methodology describes the elements of the EDF-X EWS and the technical features of the models and algorithms on which it is based. In Section 2, we briefly discuss the first element of an effective EWS, the PD measures themselves. Section 3 presents the early warning decision rules currently available in EDF-X. In Section 4, we put it all together and discuss the EWS framework. Section 5 is a detailed analysis of the early warning power of the EDF-X EWS and includes a case study that emphasizes the use and interpretation of the EWS. Section 6 concludes.

2 Moody’s Analytics’ PDs

An essential element of an effective early warning system is forward-looking, point-in-time-oriented measures of credit risk. To the extent possible, credit risk measures need to reflect all relevant factors that inform our future view of risk as of today, without filtering out transient signals: firm-specific information, industry-level trends, macroeconomic conditions, market views, etc. Early warning of material changes in credit risk must take into account information about the business and credit cycles, as credit events are often a result of individual firms’ lack of resiliency to broad-based downturns.

Additionally, effective early warning requires risk measures that are updated at least monthly. Annual and semiannual reviews are not frequent enough to provide sufficient early warning of changes in credit risk to take proactive steps to protect portfolio quality and performance.

The EDF-X EWS uses as key inputs the PDs from our suite of PD models: CreditEdge, for listed firms; RiskCalc, the financials-based model suite mainly used for private firms with sufficient financials; Trade Payment PD Model, for firms with payment behavior data; and RiskCalc Benchmark model, for private firms without sufficient financials.² Although these models are derived using different data inputs and different modeling methods, the models all yield forward-looking, point-in-time probabilities of default. They are updated monthly or daily and serve as a solid foundation for early warning of credit risk. Although we calculate a term structure of PD measures, throughout this methodology, we specifically refer to PD measures with a one-year forecast horizon. Table 1 summarizes the key features of Moody’s Analytics’ suite of PD models.

A powerful feature of the EDF-X EWS is the pre-calculation of point-in-time PDs for more than 400 million firms, public and private, globally. In practice, that means that the early warning system can screen for material changes in credit risk in a nearly automated way, without the need to input data or manually run models. The EDF-X EWS also provides the best estimate of the probability of default for a given exposure. For example, if a firm has publicly listed equity, the CreditEdge PD is used; if the exposure is a loan to a logistics company based in the U.K., the PD is calculated using the appropriate country RiskCalc model, in this case the U.K. model.
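
The model selection described above can be sketched as a simple waterfall. This is a minimal illustration, not the EDF-X implementation: the `Firm` record, its field names, and the function name are hypothetical, and the priority of financials-based PDs over payment-based PDs is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    """Hypothetical firm record; the field names are illustrative only."""
    name: str
    is_listed: bool = False
    has_sufficient_financials: bool = False
    has_payment_data: bool = False


def select_pd_model(firm: Firm) -> str:
    """Return the PD model family that would supply the firm's PD.

    The priority order is an assumption made for this sketch.
    """
    if firm.is_listed:
        return "CreditEdge"
    if firm.has_sufficient_financials:
        return "RiskCalc"
    if firm.has_payment_data:
        return "Trade Payment PD Model"
    # Fallback for private firms without sufficient financials or payment data.
    return "RiskCalc Benchmark"


uk_logistics = Firm("U.K. logistics borrower", has_sufficient_financials=True)
print(select_pd_model(uk_logistics))
```

In this sketch, the country-specific RiskCalc model (the U.K. model in the example above) would be chosen in a later step that is not shown.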

For listed firms, the PD modeling methodology is by its nature point-in-time.³ However, many credit portfolios have the majority of their exposures to private firms, making effective early warning for the entire portfolio a difficult challenge. Risk assessments for private companies that rely solely on backward-looking, and often stale, financial information are simply a nonstarter for early warning. Similarly, through-the-cycle-oriented ratings, either internal or agency, are also useless from an early warning perspective as they often lag changes in credit risk and credit spreads.

The EDF-X EWS utilizes Credit Cycle-Adjusted (CCA) PDs for private companies. CCA PDs start with an assessment based on a firm’s financial information or, for firms lacking sufficient financials and payment data, the Benchmark PD, and then add an adjustment factor that reflects the effect of the aggregate credit cycle appropriate for the firm’s industry sector and country. The result is point-in-time, forward-looking PDs for private firms that may be leveraged for early warning.

Figure 1 shows an example illustrating the difference between the Credit Cycle-Adjusted PD and the PD based only on financial statements, which is essentially a through-the-cycle (TTC) measure. In the first half of the time series, the credit cycle is neutral, neither adding nor subtracting risk, so the PD based on financials only and the CCA PD are almost the same. But in the second half of the time series, the CCA PD rises sooner, faster, and to a higher level than the TTC PD. That is the effect of the Credit Cycle Adjustment. The CCA PD brings forward risk information that would eventually be reflected in financial statements, and also reflects information about the wider credit environment in that firm’s industry sector and country.

3 Decision Rules for Early Warning

The decision rules that we use to determine which exposures are the ones that belong on our watchlist are arguably more important than the risk measures themselves. As accurate as they may be for measuring and rank ordering credit quality, risk measures like PDs lack a natural and intuitive interpretation. If a PD were to rise from, say, 0.2% to 0.4%, should you be concerned? It’s not obvious. But if the same exposure’s rating were to go from A to Baa, then you would clearly be concerned, because the thresholds between ratings, as fuzzy as their definitions might be, have meaning. Early warning rules are critical because they help us transform point-in-time PD measures into actionable information.

The EDF-X EWS currently relies on two key decision rules to identify firms most at risk: PD trigger level and change in PD-implied rating. The PD trigger level captures current point-in-time relative credit risk; if at any time we observe a given exposure’s PD rise above the early warning threshold for its peer group, such as its industry group, it should go on our watchlist. The PD-implied rating change captures transitions in absolute credit risk; PD measures are mapped to their Moody’s ratings equivalents, and changes in implied rating indicate improvement or worsening of an exposure’s risk level. In Section 4, we discuss how to combine the PD trigger level and the PD-implied rating change into a single, summary early warning signal.

3.1 PD trigger level

Diagnostic thresholds are commonly used in many scientific disciplines to dichotomize the result of a quantitative test into actionable, binary decisions. Diagnostic thresholds turn quantitative measurements, such as blood pressure, into informative assessments, like hypertension. In this section, we describe one of the cornerstones of the EDF-X EWS, the PD trigger level, which acts as the diagnostic threshold for early warning. When we observe a firm’s PD above the trigger level, it is exhibiting excessive risk relative to its peer group, and is more likely to experience financial distress or a negative credit event in the future. Figure 2 shows an illustrative example for Cineworld Group PLC, which filed for bankruptcy in September 2022. Its PD measure crossed the early warning trigger level for its industry peer group in March 2020.

Calibration of the PD trigger level is critical to its success as an early warning signal. A useful analogy is to think of the PD trigger level like a net. We intend to catch firms relatively more likely to become problem credits with our net, but if we make the netting too tight, we will catch too many firms to be useful—we will have a higher chance of catching all the potential defaulters, but they will be mixed in with many solid credits, and we will not have solved the original sorting problem we started with. However, if we make the netting holes too large, some risky firms will slip through, and our early warning process will not be as effective as it could be.

To state the calibration problem in statistical terms, the higher we set the PD trigger level, the fewer firms will trigger an alert, which leads to lower false positives, or Type I errors, but at the cost of potentially missing defaults below, but near, the trigger level, leading to a higher false negative rate. Conversely, a relatively lower PD trigger level will lead to a higher number of firms signaling elevated risk and thus reduce the false negative rate, or Type II errors. You will catch more potential defaulters, but increase the false positive rate. There is a level for the PD trigger that is optimal in the sense that it balances out the early warning prediction errors.

Ideally, the design of an effective PD trigger level is based on an optimal tradeoff between the forecast error types. An optimal trigger level can be derived using information from the receiver operating characteristic curve, such as Youden’s Index.⁴ Such methods usually assume that the costs of errors are equal. In the present context, for example, that would mean the cost of failing to flag a firm that eventually defaults is the same as the cost of spending more resources researching a longer watchlist. Given that EWS users’ risk appetites are likely to vary, it is desirable to have flexibility in the PD threshold design to accommodate potential customizations. It is conceptually difficult, however, to map an institution’s risk appetite to a precise and explicit preference expressed as true/false positive/negative rates. To operationalize the preferences different users may have for early warning forecast error classifications, we introduce a single user-defined parameter: the positive signal rate (PSR).⁵
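
For readers unfamiliar with Youden’s Index, the sketch below shows the equal-cost idea on a tiny hypothetical sample: it searches candidate thresholds for the one that maximizes J, the true positive rate minus the false positive rate. The PD values, default flags, and function name are invented for illustration, and this is not how the EDF-X triggers are calibrated.

```python
def youden_optimal_threshold(pds, defaulted):
    """Return (threshold, J) maximizing J = TPR - FPR.

    A firm is flagged when its PD is strictly above the threshold.
    The sample must contain at least one default and one non-default.
    """
    n_pos = sum(defaulted)
    n_neg = len(defaulted) - n_pos
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(pds)):
        tp = sum(1 for p, d in zip(pds, defaulted) if p > threshold and d)
        fp = sum(1 for p, d in zip(pds, defaulted) if p > threshold and not d)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical one-year PDs and default outcomes (1 = defaulted).
sample_pds = [0.001, 0.002, 0.004, 0.01, 0.02, 0.03, 0.05, 0.08]
sample_defaults = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, j_stat = youden_optimal_threshold(sample_pds, sample_defaults)
print(threshold, round(j_stat, 3))
```

Because J weights a missed default and a false alarm equally, a user with a different risk appetite would prefer a different threshold, which is the motivation for a tunable parameter such as the PSR.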

PSR is the percentage of firms from your portfolio of exposures that your risk tolerance will yield on your watchlist. For example, a PSR of 20% means that, at any given time, 20% of the riskiest exposures in your portfolio will appear on your watchlist. A given PSR will naturally have implications in terms of the expected true/false positive/negative rates, but will be specific to the composition of any given portfolio. A trigger level optimized to historical data will also be associated with a particular PSR, which may be higher or lower than an institution’s desired PSR.⁶ The advantage of a PSR as a governing choice parameter is that it intuitively reflects how much work you are willing to spend to avoid problem credits by tolerating a longer watchlist.
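
To make the PSR concrete, the following minimal sketch ranks a hypothetical ten-firm portfolio by PD and keeps the riskiest share. The function name, firm labels, PD values, and the rounding-down rule for fractional counts are assumptions for illustration and are not taken from EDF-X.

```python
def watchlist_from_psr(pd_by_firm, psr):
    """Return the riskiest `psr` share of firms, riskiest first.

    `pd_by_firm` maps a firm label to its one-year PD; `psr` is a
    fraction between 0 and 1. Fractional counts are rounded down
    (an assumption made for this sketch).
    """
    if not 0.0 <= psr <= 1.0:
        raise ValueError("psr must be between 0 and 1")
    # The small epsilon guards against float error in products like 0.2 * 10.
    n_flagged = int(psr * len(pd_by_firm) + 1e-9)
    ranked = sorted(pd_by_firm, key=pd_by_firm.get, reverse=True)
    return ranked[:n_flagged]


# Hypothetical portfolio of one-year PDs.
portfolio = {
    "A": 0.001, "B": 0.003, "C": 0.02, "D": 0.004, "E": 0.15,
    "F": 0.006, "G": 0.0005, "H": 0.045, "I": 0.008, "J": 0.012,
}

print(watchlist_from_psr(portfolio, 0.20))
```

Raising the PSR lengthens the watchlist by admitting the next-riskiest names, which is the workload trade-off the parameter is meant to express.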

PD trigger levels must be appropriately calibrated for a given exposure’s peer group. The most granular peer group includes private or public firms in the same country and industry sector. Across industry sectors and countries, we can observe material differences in the distribution of PDs. The technology sector, for example, exhibits much higher average risk and volatility than the banking sector. Applying a single PD trigger level across sectors would result in a perversely biased watchlist: risky firms from sectors with lower average default risk might not make the watchlist, while relatively safer firms from sectors with higher average PDs would be overrepresented in the watchlist. Hence, we must calibrate PD trigger levels for country and industry combinations.

PD trigger levels must also move over time with the credit cycle. When aggregate credit risk is low, it is relatively easier to spot the likely problem credits. However, when the credit cycle turns negative, it becomes much more challenging to sort the potential problem credits from the stronger ones. Almost all PDs rise when the credit cycle turns, because refinancing conditions are less favorable, rates and credit spreads often increase, and the business environment is simply more challenging. Hence, the trigger must adjust when the aggregate credit environment gets riskier, and vice-versa, in order to effectively identify the relatively risky names.

In the EDF-X EWS, PD trigger levels exhibit both of these characteristics. We calculate time-varying PD trigger levels for more than 300,000 peer groups based on country; region, or aggregation of countries; industry; and whether the firm is public or private. The calculation of the PD trigger levels generally proceeds in two steps. First, we select a PSR as the default setting: 20% for private firms, 15% for listed financial institutions, and 25% for public corporates.⁷ These PSRs imply PD percentiles on a through-the-cycle basis. We apply a credit cycle adjustment to obtain the point-in-time PD percentiles. In the second step, we map the point-in-time PD percentiles to PD levels for the sector/country triggers. We rely on data from granular peer groups when sufficiently representative, but resort to more broadly defined peer groups for information when needed, using a hierarchical Bayesian model.

3.1.1 Credit cycle adjustment to the PD trigger

Credit events tend to be clustered in time. In a stressed period, such as the COVID-19 pandemic, or the Global Financial Crisis, uncertainty and real economic difficulties raise the credit risk of many, if not most, firms. In a scenario where the median PD of an industry sector doubled, we would want to introduce a credit cycle adjustment that lowers the PD percentile to pick up firms that are excessively risky compared to their peers in the stressed scenario. First, in calculating the EDF-X EWS PD trigger level, we introduce a credit cycle adjustment (CCA). The through-the-cycle PD percentile implied by the preselected PSR provides an anchor point for the credit cycle-adjusted PD percentile.

The calculation of the credit cycle-adjusted PD percentile for peer group i at time t is shown in equation 1.

Where the TTC PD percentile is the PD percentile corresponding to peer group i’s specific PSR, and the credit cycle adjustment for the current time period, t, is calculated as

PD⁸⁰_i is the long-run median from all peer groups’ 80th percentile PD distribution. PD⁸⁰_i,t is the peer group i’s 80th percentile PD at time t. Hence, the credit cycle adjustment to the PD trigger level can be thought of as the current deviation from the long-run trend PD. We calibrate the CCA factor using the 80th percentile of a peer group’s PD as it is more responsive to changes in the general risk level for the group but is also robust to extreme values. To convert the log difference to a percentile used for the PD trigger adjustment, we multiply the log difference by 12.⁸ We also apply a lower bound of -15 and an upper bound of 15 to the CCA factor to avoid extreme values.
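
The adjustment described in this subsection can be sketched in code. The multiplier of 12 and the bounds of -15 and 15 come from the text; the use of natural logarithms and the sign convention (the CCA factor is subtracted from the TTC percentile, so the percentile falls when the current 80th percentile PD is above its long-run level) are assumptions inferred from the statement that the adjustment lowers the percentile in a stressed period. The function names are hypothetical.

```python
import math

CCA_SCALE = 12.0                      # multiplier stated in the text
CCA_LOWER, CCA_UPPER = -15.0, 15.0    # bounds stated in the text


def cca_factor(pd80_current, pd80_long_run):
    """Bounded credit cycle adjustment, in percentile points.

    Natural logs are an assumption of this sketch.
    """
    raw = CCA_SCALE * (math.log(pd80_current) - math.log(pd80_long_run))
    return max(CCA_LOWER, min(CCA_UPPER, raw))


def pit_percentile(ttc_percentile, pd80_current, pd80_long_run):
    """Point-in-time PD percentile for a peer group.

    Assumed sign convention: the factor is subtracted, so stress
    (current above long-run) lowers the percentile.
    """
    return ttc_percentile - cca_factor(pd80_current, pd80_long_run)


# Hypothetical peer group: a 20% PSR implies an 80th percentile TTC anchor,
# and the group's 80th percentile PD has doubled from 2% to 4%.
print(round(pit_percentile(80.0, 0.04, 0.02), 2))
```

Under these assumptions, a doubling of the peer group’s 80th percentile PD moves the percentile by 12 × ln 2, or about 8.3 points, and the bounds bind once the ratio of current to long-run PD exceeds roughly 3.5.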

The next step in calculating the PD trigger level entails applying the hierarchical Bayesian adjustment to the final set of credit cycle-adjusted PD triggers, which helps address small sample issues while keeping high granularity in the peer groups.

3.1.2 Hierarchical Bayesian-adjusted PD trigger

Because early warning of credit risk requires comparisons of exposures across economically relevant peers, we employ a hierarchical Bayesian approach that allows us to fully utilize the available data for calibration of the triggers while avoiding the problem of small sample sizes at granular peer group levels. In the calculation of the trigger PDs, we rely on data from granular peer groups when sufficiently representative, but resort to more broadly defined peer groups for information when needed. Another benefit of this approach is that by appropriately weighting lower and higher levels of peer aggregation—using more of the available data to calculate the PD triggers—the PD triggers will be less subject to volatility over time than if we calculated them directly for each peer group.

We define three levels of information for the hierarchical Bayesian-adjusted trigger calculation: ‘country+industry’ group (level 1), ‘region+industry’ group (level 2), and ‘global industry’ group (level 3). The Bayesian adjustment method allows us to start with the most credit-relevant information (level 1), then perform adjustments to achieve full coverage by appropriately weighting higher-level groups. We perform the weighting adjustments twice, until we reach a final estimate.

Table 2 illustrates the process of calculating the Bayesian-adjusted PD trigger level for a given peer group. For example, at a given point in time, a peer group in industry X and country Y in Western Europe consists of 32 firms at the country level, 226 at the regional level (Western Europe), and 1,556 globally. First, we compute the ‘country+industry’ trigger, obtained from the first step of the CCA adjustment. Then, we calculate an adjustment (Adj Trigger₁), which is a weighted average of the ‘region+industry’ and ‘global industry’ triggers, or Trigger_L2 and Trigger_L3, respectively.

In this example, the weight on Trigger_L2 is 92%, given the number of firms in Western Europe industry X (226). The weight on the global industry X is the remaining 8%. To obtain the final estimate, we combine the most granular estimate, Trigger_L1, with the weighted average from step two, as shown in equation 4:

As a result, we obtain the final set of Bayesian-adjusted PD triggers.⁹ In this example, the data allowed us to put substantial weight (62%) on the information directly from the country+industry group, while also enhancing the estimate with data from the higher-level peer groups.
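
The two weighting steps in this example can be sketched as follows. The weights (92% on the regional trigger, 62% on the country trigger) are the ones quoted in the example; the trigger values are hypothetical, and the rule that maps firm counts to weights is not reproduced here, so the weights are passed in as inputs. The function name is also an assumption.

```python
def bayesian_adjusted_trigger(trigger_l1, trigger_l2, trigger_l3,
                              weight_l2, weight_l1):
    """Combine three peer-group triggers in two weighting steps.

    Step 1 blends the regional (L2) and global (L3) triggers.
    Step 2 blends the country-level trigger (L1) with the step 1 result.
    Returns (adjusted_trigger, final_trigger).
    """
    adj_trigger = weight_l2 * trigger_l2 + (1.0 - weight_l2) * trigger_l3
    final_trigger = weight_l1 * trigger_l1 + (1.0 - weight_l1) * adj_trigger
    return adj_trigger, final_trigger


# Hypothetical trigger levels in percent; the weights are from the example.
adj, final = bayesian_adjusted_trigger(
    trigger_l1=3.0, trigger_l2=2.5, trigger_l3=2.0,
    weight_l2=0.92, weight_l1=0.62,
)
print(round(adj, 4), round(final, 4))
```

With a smaller country-level sample, the weight on level 1 would presumably be lower and the final trigger would sit closer to the regional and global estimates.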

Figure 3 shows an example of the final, cyclically adjusted and Bayesian-adjusted PD trigger level over 16 years at a monthly frequency for France’s pharmaceuticals peer group after performing the calculations illustrated above.

3.2 PD-Implied rating change

In many applications, particularly among banks and insurance companies, the management of credit portfolios must take into account the concept of significant increases in credit risk. This concept is distinct from the relative risk concept we discussed in the EWS PD trigger section, in that significant increase in credit risk is concerned with the absolute, not the relative, risk level. Whereas the PD trigger level is based on a peer-comparison approach, significant increase in credit risk essentially evaluates changes in firm risk on a stand-alone basis. Moreover, this aspect of risk is often communicated using the language of credit ratings. To address this risk concept, the second decision rule in the EDF-X EWS is the change in the PD-implied rating over the past 12 months.

PD metrics can be mapped to the Moody’s Investors Service rating scale using a PD-implied rating mapping, with the mapping determined by historical PD measures associated with each rating class. In EDF-X, the mapping applies to public and private firms.¹⁰ It should be clearly understood that PD-implied ratings, while expressed using the same ratings symbols as Moody’s Investors Service credit ratings, are not agency credit ratings.¹¹

To assess changes in the level of credit risk, we calculate the 12-month change in the implied rating. We map both the current one-year PD and the PD value 12 months ago to the numerical static-implied rating, using Table 3. The implied rating change is computed as the difference between the two. For example, if the PD today is 5% and it was 2% 12 months ago, the corresponding numerical implied rating is 15 (B2) at present and 13 (Ba3) one year ago. Consequently, the 12-month net change in implied rating is 15-13=2, meaning that the implied rating has deteriorated by two notches over the past 12 months.
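
The notch arithmetic can be sketched as below. The rating list is an assumption: it follows the standard Moody’s alphanumeric ordering with a single Caa-C bucket at the bottom, and Table 3 remains the authoritative mapping. The PD-to-rating boundaries are not reproduced, so the sketch starts from numerical ratings. The function names are hypothetical.

```python
# Assumed ordering of implied rating grades, best (1) to worst (17).
IMPLIED_RATING_SCALE = [
    "Aaa", "Aa1", "Aa2", "Aa3", "A1", "A2", "A3",
    "Baa1", "Baa2", "Baa3", "Ba1", "Ba2", "Ba3",
    "B1", "B2", "B3", "Caa-C",
]


def rating_to_number(symbol):
    """Return the 1-based position of a rating symbol on the assumed scale."""
    return IMPLIED_RATING_SCALE.index(symbol) + 1


def implied_rating_change(current_number, prior_number):
    """12-month change in notches; a positive value means deterioration."""
    return current_number - prior_number


# The example from the text: numerical rating 15 today versus 13 a year ago.
print(implied_rating_change(15, 13))
```

On this assumed scale, a move from Baa3 to Caa-C is `rating_to_number("Caa-C") - rating_to_number("Baa3")`, or seven notches.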

There is a noticeable difference in the distribution of the implied static rating between the public and private firm EDF-X EWS calibration sample, as shown in Figure 4. In particular, very few private firms have an implied rating better than A3. This result remains consistent with what we observe in practice: Proportionally, fewer private firms have very good ratings compared to their public counterparts. Private firms also tend to have relatively higher PDs on average, which leads to relatively worse implied ratings in general.

As we discussed, the language of ratings is often more intuitive for communicating risk than PD levels. In Figure 5, we show the PD measures for Ford Motor Co. and its PD-implied ratings, derived using the mapping table in Table 3, from 2019 to 2022. Ratings are exponential in default risk, as we can infer from Table 3, so relatively small changes in PD can translate into larger PD-implied rating notch movements when the starting PD is low, as is the case for Ford. Conversely, as a firm’s PD level increases and it moves to relatively lower PD-implied rating grades (below Baa, or investment grade), it takes larger movements in the PD level to trigger a migration to a lower PD-implied rating grade. Although this behavior is a historical artifact of the relationship between agency rating grades and default rates, not a modeled result, it has beneficial properties from an early warning perspective. For example, if we were managing a portfolio of high-credit-quality loans, we would be primarily concerned with initial changes in the PD that would have a relatively larger impact on implied ratings early on. If we had warning of that early, and most impactful, deterioration, we could take proactive steps to mitigate the potential consequences.

4 Early Warning Signal and Quadrant View

Multiple early warning signals provide different perspectives on credit risk and strengthen the early warning process. It is sometimes challenging, however, to draw inferences from different signals and combine them into a total assessment of risk. In this section, we describe how the EDF-X EWS combines the two early warning decision rules into an intuitive and actionable early warning signal.

The EDF-X EWS employs a quadrant framework where the x-axis represents the distance to the PD trigger and the y-axis measures the 12-month change in PD-implied rating. Conceptually, the x-axis represents the assessment of an exposure’s relative risk level, while the y-axis captures changes in the firm’s absolute risk ranking. There are four early warning categories: Severe, High, Medium and Low. We color-code them into red, orange, yellow and green, respectively (Figure 6).

The severe, or red, signal consists of exposures with PDs above their PD trigger levels and whose PD-implied ratings have deteriorated over the past 12 months. The high, or orange, bucket includes exposures with PDs above their PD trigger levels, but with a stable or improving PD-implied rating. Exposures that currently have PDs below their PD trigger levels but have experienced deteriorating PD-implied ratings over the last 12 months belong to the medium, or yellow, bucket. The low, or green, category includes exposures whose PDs are below their early warning trigger levels and whose PD-implied ratings have been stable or improving over the past year. Table 4 summarizes the EDF-X early warning signal categorization.
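
The four buckets reduce to two yes/no tests, as the following minimal sketch shows. The function and parameter names are hypothetical, and the treatment of boundary cases (a PD exactly equal to its trigger counts as not above it; a zero-notch change counts as stable) is an assumption.

```python
def early_warning_signal(pd_value, pd_trigger, rating_change_12m):
    """Classify an exposure into one of the four early warning buckets.

    `rating_change_12m` is the 12-month change in the numerical implied
    rating, where a positive value means deterioration.
    """
    above_trigger = pd_value > pd_trigger
    deteriorated = rating_change_12m > 0
    if above_trigger and deteriorated:
        return "Severe"   # red
    if above_trigger:
        return "High"     # orange
    if deteriorated:
        return "Medium"   # yellow
    return "Low"          # green


# Hypothetical exposure: PD of 5% against a 3% trigger, two notches worse.
print(early_warning_signal(0.05, 0.03, 2))
```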

The EDF-X EWS quadrant design has a few notable advantages. It is intuitive, accounting for a company’s current level of credit risk and the momentum of recent credit developments. It is transparent, with bucket classifications determined entirely by quantifiable credit metrics. It offers useful visualization; a firm’s movement within and across warning signal buckets can be easily traced across the two axes over time. Alternatively, an entire portfolio can be plotted on the quadrant at a particular point in time, cross-sectionally, and each firm’s relative credit ranking in the portfolio can be visualized.

The evolution of the early warning signals for J.C. Penney, Inc. and its eventual bankruptcy is an instructive example of the usage and power of the EDF-X EWS. J.C. Penney, Inc. is a department store company headquartered in the U.S. belonging to the consumer products retail/wholesale peer group (USA N18). Figure 7 shows its one-year PD (monthly frequency) against its peer group PD trigger level from 2010 up to its Chapter 11 bankruptcy filing in May 2020. The progression in the company’s PD measure over time shows how difficult it can be to interpret changes in credit risk directly from a PD metric. Over the year 2012, J.C. Penney’s PD level doubled, although it’s imperceptible given the scaling of the graph. Should one have been concerned at that time? In the moment, it would have been difficult to decide. However, once we overlay the PD trigger level, we know exactly when to get worried. In mid-2013, the company’s PD crossed the PD trigger level and remained above it, with the exception of a couple of months in 2016, until it eventually defaulted in 2020.

Similarly, PD-implied ratings provide an intuitive translation of the changes in PD levels for J.C. Penney. Figure 8 shows J.C. Penney’s PD-implied ratings and implied rating changes. From 2012 to 2013, the deterioration in the company’s PD translated into a seven-notch implied rating change, knocking the company’s implied rating down from Baa3, investment grade, to the lowest noninvestment-grade implied rating, Caa-C. From that time until it filed for bankruptcy, the PD-implied rating fluctuated in a five-notch range.

Figure 9 provides a visualization of J.C. Penney’s risk migration in the quadrant design of the EDF-X EWS. For illustration purposes, we plot the company’s EWS metrics for each January between 2010 and 2020. J.C. Penney was in the green bucket before 2012, because its PD level was lower than the trigger and its PD-implied rating had improved in the prior 12 months. From 2015 to 2017, the firm’s PD level was higher than its peer group PD trigger, while the implied rating had either improved or remained stable, placing the firm in the high-risk early warning category during that period. In 2018, the firm was flagged as a severe risk, given the two-notch deterioration in its PD-implied rating, as its PD moved further above its early warning trigger level. In 2019 and 2020, the firm remained in the severe-warning bucket but became riskier compared to 2018, as evidenced by a greater distance to trigger. Finally, J.C. Penney filed for bankruptcy in May 2020.

J.C. Penney is a telling example of the power of EDF-X EWS. Not only did the EWS signal flag J.C. Penney in the severe bucket at least three years before its default, the EWS quadrant design also provided an informative visualization of its migration between different warning buckets over time.

5 Early Warning System Performance Assessment

As we discussed above, early warning systems collapse one or more risk metrics into a smaller number of actionable assessments. On their own, the early warning trigger and the PD-implied rating change each classify risk into two categories, yielding the two-by-two EWS buckets. In making these forward-looking assessments, there are four potential outcomes: true positives and true negatives, or successful forecasts; and false positives and false negatives, or incorrect forecasts. In this section, we assess the early warning signaling power of the EDF-X EWS by evaluating these four types of outcomes with respect to default events and rating actions. We are also concerned with how far in advance the EDF-X EWS provides an early warning signal, as well as, from a portfolio perspective, how valuable an effective EWS is in terms of losses from an avoided default.

We begin by describing the data set we used to develop the EWS, and on which we conduct the performance assessment.¹²

5.1 EDF-X EWS development data set

The EDF-X EWS is calibrated and evaluated separately on a data set of public companies and private companies, as we saw in Figure 4 of section 3, due to the inherent differences in the risk profiles between the public and private firms. This section describes the underlying data used for calibration.

We construct a sample of global public firms from the CreditEdge data set from January 2004 to December 2020. In total, we cover 63,810 unique firms with 6,272,655 monthly observations. Table 5 details the top countries and industries present in the public universe.

Two data sources help provide extensive coverage for private firms. The Orbis¹³ data set provides substantial coverage for private companies across the globe. For U.S. companies, however, there is relatively limited data in Orbis. To expand and complement private-firm coverage for U.S.-based companies, we also utilize the Moody’s Analytics Data Alliance data set. We calculate credit cycle-adjusted PDs for the private-firm sample using RiskCalc models, leveraging the financial statements in Orbis and the Data Alliance. The CCA PDs are subsequently used in our metric development and testing.¹⁴ Tables 6 and 7 present a detailed description of the private-sample data.

5.2 Early warning assessment on default events

The first performance assessment of the EDF-X EWS consists of measuring the ability of the system to accurately identify default events¹⁵ within a year of an exposure being put on a watchlist. In section 3, we discussed the early warning rules in the EDF-X EWS. Forming a watchlist introduces an additional layer of decision-making, which entails mapping those early warning signals to your watchlist policy and aligning it with your institution’s risk appetite and workflow. A watchlist can be built by flagging firms in the severe warning bucket alone, or it can include firms in the high warning bucket.¹⁶ Firms on the watchlist are considered predicted positive; all other firms are categorized as predicted negative. If a predicted positive firm defaults within the next 12 months, it becomes a true positive. Conversely, if the firm does not default in the next 12 months, it is defined as a false positive. Table 8 lists the four possible outcomes.

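With the four possible outcomes, we derive two important metrics for performance evaluation: the Type I error, which rises with false positive events, and the Type II error, which rises with false negative events.

Previously, we discussed the meaning and interpretation of the Positive Signal Rate (or PSR), which we formally define here as the share of firms that are predicted positive, that is, placed on the watchlist.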

To evaluate how the EDF-X EWS performs in signaling defaults, we computed one-year PDs through 2020 and their associated early warning signals, and matched them against Moody’s Analytics’ historical default data set.¹⁷ Table 9 presents the performance analysis on the public sample from January 2004 to December 2020. Watchlist 1 (first row) consists of firms classified in the severe warning bucket only. The second row expands the watchlist to include firms in the ‘severe+high’ warning buckets. The following two rows add the two remaining buckets in turn, until the PSR reaches 100% in the fourth watchlist, where all four EWS buckets are included.

For all defaulted firms, we extract the last 12 months of observations leading up to default and construct a default sample. The percentage of default observations that the watchlist covers is a highly informative metric for the EWS performance assessment. Another metric used in the performance evaluation is the percentage of peer groups with a PSR of zero, meaning that no firms in these peer groups are flagged in the watchlist. This occasionally happens because certain peer groups tend to have relatively high credit quality.
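Both metrics reduce to simple counting once each observation carries a watchlist flag and a default-sample flag. The sketch below is a minimal illustration under that assumption; the record layout, peer-group names, and function names are hypothetical.

```python
from collections import defaultdict

# Made-up records: (peer_group, flagged_on_watchlist, in_default_sample).
# "in_default_sample" marks observations within 12 months before a default.
records = [
    ("US-Retail", True, True),
    ("US-Retail", False, True),
    ("US-Retail", True, False),
    ("JP-Utilities", False, False),
    ("JP-Utilities", False, False),
    ("DE-Autos", True, True),
    ("DE-Autos", False, False),
]


def default_coverage(rows):
    """Share of default-sample observations that the watchlist flags."""
    default_obs = [r for r in rows if r[2]]
    covered = [r for r in default_obs if r[1]]
    return len(covered) / len(default_obs)


def share_zero_psr_groups(rows):
    """Share of peer groups in which no observation is flagged."""
    flagged_by_group = defaultdict(int)
    for group, flagged, _ in rows:
        flagged_by_group[group] += int(flagged)
    zero_groups = [g for g, n in flagged_by_group.items() if n == 0]
    return len(zero_groups) / len(flagged_by_group)


coverage = default_coverage(records)
zero_share = share_zero_psr_groups(records)
```

Here two of the three default-sample observations are flagged, and one of the three peer groups has no flagged firms.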

As the results in Table 9 suggest, there is an apparent trade-off between the PSR and the combined signaling error rate, namely, the Type I and Type II errors. When the PSR increases, more firms are included in the watchlist, which in turn leads to a higher Type I error due to rising false positive events. Meanwhile, the Type II error drops because of fewer false negative events.
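The trade-off can be made concrete with the conventional confusion-matrix definitions. These definitions are an assumption of this sketch, not a restatement of the paper's own equations: Type I error is taken as the share of non-defaulters that are flagged, and Type II error as the share of defaulters that are missed. The counts are invented for illustration and are not taken from Table 9.

```python
def watchlist_metrics(tp, fp, fn, tn):
    """Compute PSR and error rates from outcome counts.

    Assumes the sample contains at least one defaulter and one non-defaulter.
    """
    total = tp + fp + fn + tn
    type_1 = fp / (fp + tn)  # non-defaulters that were flagged
    type_2 = fn / (fn + tp)  # defaulters that were missed
    return {
        "psr": (tp + fp) / total,
        "type_1": type_1,
        "type_2": type_2,
        "combined_error": type_1 + type_2,
        "default_coverage": tp / (tp + fn),
    }


# Two hypothetical watchlists over the same 1,000 observations
# (100 defaulters, 900 non-defaulters).
narrow = watchlist_metrics(tp=70, fp=130, fn=30, tn=770)
wide = watchlist_metrics(tp=85, fp=265, fn=15, tn=635)
```

Moving from the narrow to the wide watchlist raises the PSR and the Type I error while lowering the Type II error, which is the pattern described above.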

The power of the EDF-X EWS is clearly shown by the effectiveness of watchlists constructed using the early warning buckets. For example, Watchlist 1 successfully signals 72% of the default observations with a 17% PSR, while managing to limit the combined error rate to 44%. The portion of peer groups with no firms flagged in the watchlist is merely 5.4%. When we expand the watchlist to cover severe and high warning buckets, the PSR increases to 24%, but the watchlist can signal 81% of the default observations, with the combined error rate remaining at 43%.

Table 10 presents the composition of the buckets for the public-firm sample. The default rate monotonically decreases from severe to low risk, with the default rate for the severe bucket being 5%, highlighting the power of EWS in signaling the riskiest firms.

Tables 11 and 12 present similar analyses for a private-firm data sample. We use the same warning bucket classification as in Table 4, but with a different trigger level, anchored at the 80th PD percentile. We chose the 80th percentile to target a PSR similar to that in EWS 2019, which maintains consistency for current users and represents a transparent way to consider user preferences.

While it appears that the EDF-X EWS performs less effectively on the private-firm sample, we still see benefits of using the EDF-X EWS compared to using other techniques. One likely reason for the lower performance on private firms is missing defaults. Collecting default records is more challenging for private firms than for public firms, because reporting requirements and news coverage for public companies make credit events more transparent. Another potential reason is the difference in information sources. For private firms, we rely on financial statements for credit assessment, whereas for public firms we also have market signals as an additional information source.

Nevertheless, with a 15% PSR, the severe-warning bucket can signal 39% of the default observations, with only 1.9% of the peer groups having no firms flagged. In addition, firms classified in the severe- and high-warning buckets have substantially higher realized default rates than those in the medium- and low-warning buckets. Given that credit portfolios are often overweighted in exposure to private companies, and that early warning signals for private companies are often otherwise lacking, the CCA PDs and the early warning signals built around them in the EDF-X EWS represent a significant advancement in our ability to manage exposure to private firms.

5.3 Early warning assessment on credit rating changes

We also evaluate the performance of the EDF-X EWS on credit rating changes. After all, default events are rare. Firms could have substantial credit risk and merit increased scrutiny even when they do not default, because unanticipated rating changes impose costs and potential losses. To that end, we conduct an analysis of the credit rating changes associated with firms that are classified in different EWS buckets but do not default in the next 12 months. For this analysis, we study public firms that have either a Moody’s Investors Service (MIS) senior unsecured debt rating or, if that does not exist, an equivalent rating derived from the rated subordinated or secured debt, calculated using the Senior Ratings Algorithm.¹⁸

Table 13 presents the MIS rating distribution by EWS warning bucket for the rated public-firm data sample. As the table shows, riskier EWS buckets are associated with lower-quality credit ratings. For example, 74% of the firms in the severe EWS bucket are rated B or below, while for the low-warning bucket, 66% of the firms are rated Baa or above, that is, investment grade. The 1% in the severe bucket that are rated Aaa consist of Fannie Mae and Freddie Mac. This contrasting credit assessment stems from the fact that one of the key inputs to the rating decision is the level of government support, which is not a factor in the PD modeling.¹⁹

Table 14 further breaks down the observations in each warning bucket into various rating actions made by MIS within the following 12 months. For example, 16% of the firms in the severe-warning bucket actually default within 12 months, another 28% of the firms are downgraded, and yet another 23% receive a negative rating outlook. For safer warning buckets, the percentage of firms in default, downgraded, or with a negative rating outlook is lower. This result provides additional supporting evidence for the power of the EWS in flagging risky names: Most flagged companies experience significant credit deterioration during the upcoming 12 months, even if they do not end up defaulting.

5.4 Early Warning Signal as a leading indicator of firm defaults

Sections 5.2 and 5.3 measured the accuracy of the EDF-X EWS over a one-year time horizon. An additional important assessment is how far in advance of a default the EDF-X EWS provides an early warning signal. In this section, we analyze the EWS quadrant risk classification of defaulting firms over time horizons longer than one year.

Figure 10 presents an event study rendering of the percentage of defaulted firms in the severe or high EWS buckets up to 36 months ahead of default. We analyze the public and private data samples separately. On the public-firm data sample, 39% of firms that end up defaulting are identified in the severe risk category three years in advance. When we include exposures identified as high risk, the EWS captures 51% of eventual defaulters three years ahead. Two years before a default, 59% of firms are flagged as severe or high, and by the time we reach one month before default, 87% of the defaulting firms are classified as high or severe risk. On the private sample, the EDF-X EWS categorizes 32%, 36%, 43% and 51% of the firms in either severe or high-warning buckets 36 months, 24 months, 12 months, and one month before default, respectively. These results demonstrate the long early warning lead times the EDF-X EWS produces before defaults actually occur.

To understand if there are geographical differences in EWS performance, Table 15 presents the percentage of defaults flagged as either severe or high risk by region at different time horizons.²⁰ Overall, the EDF-X EWS is capable of effectively flagging defaulting firms at various horizons ahead of default in all regions. Similar to what we saw for the global sets of public and private firms, across regions, the EWS tends to identify around 51% of eventual defaulters 36 months before a credit event, with the hit rate increasing to 80%-90% by the time default is on the doorstep, or one month ahead. Similarly for private firms, across regions roughly one-third of eventual defaulters are identified as severe or high 36 months in advance, with the percentage increasing to 35%-50% one month before default.

For firms with at least one severe- or high-risk signal before the default, we assess to what extent such signals are sustained before the defaults. For example, if a defaulted firm’s EWS signals six months prior to default were high (six months), low (five months), low (four months), high (three months), high (two months), and severe (one month), then the first sustained severe or high signal would be at three months before default.
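One way to read this definition is as the length of the unbroken run of severe or high signals ending in the month just before default. The sketch below implements that reading, which is an interpretation of the example above rather than a documented algorithm; the function name and input layout are hypothetical.

```python
FLAGGED = {"severe", "high"}


def first_sustained_signal(signals):
    """Return the earliest months-before-default of the unbroken
    severe/high run that ends one month before default.

    `signals` maps months-before-default (1 = last month before default)
    to an EWS bucket label. Returns None if month 1 is not flagged.
    """
    first = None
    month = 1
    while signals.get(month) in FLAGGED:
        first = month
        month += 1
    return first


# The example from the text, keyed by months before default.
example = {
    6: "high",
    5: "low",
    4: "low",
    3: "high",
    2: "high",
    1: "severe",
}

result = first_sustained_signal(example)
```

For the example sequence, the high signal at six months is not counted because it is interrupted by two low signals, so the first sustained signal falls at three months before default.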

Table 16 presents the distribution of the first sustained severe or high signal by region, on public and private calibration samples. For the public sample, on average, the first sustained severe or high signal is observed 21 months before default. On the private sample, we tend to observe the first sustained signal 30 months before default. These results suggest that users, in general, have a considerable amount of time to take the proper action after observing the first sustained warning signal.

5.5 Testing the EDF-X EWS on a hypothetical credit portfolio

Performance statistics are helpful to understand the classification power of an early warning system, but they do not necessarily directly answer the question of how valuable an effective EWS is. To tackle that question, we tested the EDF-X EWS on a hypothetical portfolio of noninvestment-grade credit exposures and measured the portfolio loss rate when no EWS is used, when a random watchlist is used, and when the EDF-X EWS is used.

The hypothetical portfolio is composed of 773 noninvestment-grade public firms globally. We conducted the analysis for the year 2008 to assess how our EWS tool would have performed during the financial crisis. The default status of each exposure in the portfolio was recorded at the end of 2008, and we tracked whether the EDF-X EWS flagged the exposure as a high or severe risk prior to that time. All firms in the portfolio have unit exposure, and the industry composition was taken as given. For example, we did not try to match a benchmark portfolio’s sector allocation. For simplicity, we assumed a 50% loss-given-default rate, with losses immediately recognized and not discounted.

The results of the portfolio simulation are presented in Figure 11. Of the exposures in the portfolio, 7.8% were recorded as having defaulted, and the EDF-X EWS placed 34% of the 773 names on our watchlist, which is the PSR. The portfolio loss rates clearly demonstrate the value that the EDF-X EWS delivers. Without an early warning system, the loss rate on the portfolio is roughly eight times higher than with the EDF-X EWS. A random watchlist that samples 34% of the exposures, matching the PSR, reduces the portfolio loss rate to 2.6%, but that is still about five times higher than with the EDF-X EWS.
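The loss rates for the no-EWS and random-watchlist cases can be reproduced with simple arithmetic. The sketch below assumes that a watchlisted exposure is exited before default at no cost, and that a random watchlist captures defaulters in proportion to its size; both are simplifying assumptions of this illustration, and the function and variable names are hypothetical.

```python
DEFAULT_RATE = 0.078  # share of portfolio names that defaulted
LGD = 0.50            # assumed loss given default
PSR = 0.34            # share of names placed on the watchlist


def loss_rate(default_rate, lgd, share_of_defaults_avoided):
    """Portfolio loss rate with unit exposures.

    Assumes flagged defaulters are exited before default without loss.
    """
    return default_rate * lgd * (1.0 - share_of_defaults_avoided)


# No early warning system: every default produces a loss.
no_ews = loss_rate(DEFAULT_RATE, LGD, 0.0)

# Random watchlist: expected to catch defaulters in proportion to its size.
random_watchlist = loss_rate(DEFAULT_RATE, LGD, PSR)

# Ratios against the 0.5% loss rate reported for the EDF-X EWS.
EWS_LOSS = 0.005
ratio_no_ews = no_ews / EWS_LOSS
ratio_random = random_watchlist / EWS_LOSS
```

Under these assumptions the no-EWS loss rate is 3.9% and the random-watchlist loss rate is about 2.6%, giving ratios of roughly eight and five against the 0.5% figure.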

6 Conclusion

Early warning of credit risk will be one of the main challenges facing lenders, insurance companies, asset managers, and corporates going forward. Higher interest rates, asset volatility, supply-chain disruptions, and economic uncertainty require credit professionals to closely monitor their portfolios and identify potential problem credits with as much lead time as possible to take protective action. The EDF-X Early Warning System was designed specifically with this purpose in mind.

The EDF-X EWS provides pre-calculated, point-in-time-oriented, forward-looking credit risk signals for more than 400 million firms globally. The EDF-X EWS synthesizes relative-to-peers and absolute credit risk information into a simple, actionable signal that helps identify firms expected to undergo material credit deterioration.

The EDF-X EWS combines two early warning decision rules into an actionable early warning framework. The first decision rule is distance to trigger, which helps you spot which firms stand out as relatively risky compared to their country/region/industry peer group. The second decision rule is the PD-implied rating change, which measures the significance of the recent change in a firm’s credit risk. The quadrant design of the EDF-X EWS yields four actionable early warning risk assessments: severe, high, medium, and low.

Our performance tests of the EDF-X EWS signals show that the early warning framework effectively answers the two key questions any good early warning system needs to address: which exposures to worry about, and when to worry about them. For default events and rating changes, we showed that when the EDF-X EWS identifies a credit exposure as severe or high risk, it is indeed associated with a relatively higher risk of default, rating downgrade, or some other negative rating event. We also showed that the EDF-X EWS can provide early warning signals of a subsequent default as early as three years in advance of the credit event.

We also discussed the value that an effective early warning system can bring by helping you avoid losses from default events. In 2021, the size of the average corporate default, for bonds and loans, was almost $700 million²¹, so an effective early warning system pays for itself by avoiding even one default. In our model portfolio analysis, we showed that the EDF-X EWS helped reduce the loss rate on a model high-yield portfolio from 3.9% when no early warning system was used to just 0.5%, roughly an eightfold reduction.

Footnotes

¹ https://edfx.moodysanalytics.com

² Chaves and Pieschacon (2022), Pieschacon and Zeng (2022)

³ Nazeran and Dwyer (2015)

⁴ Youden (1950)

⁵ Customization of PSR will be available to users in future EDF-X releases.

⁶ Zweig, M.H. and Campbell, G. (1993)

⁷ We tested various early warning PD trigger levels and found that a 15% PSR for financial institutions and 25% PSR for corporates yielded the best results in terms of minimizing the sum of Type I and Type II forecast errors, as well as in yielding the highest hit rate for identifying historical defaulters in backtesting.

⁸ The multiplier 12 is calibrated so that 40% of the peer groups result in a CCA factor between -5 and 5, and 80% of the peer groups achieve a CCA factor between -10 and 10. This multiplier is not related to the number of months in a year.

⁹ The scalar 20 in equations 3 and 4 represents an adjustment that evenly weights the importance of the prior and the new information from a sample with N=20 observations. With N=1, we would assign a 95% weight to the prior and a 5% weight to the new sample (the single observation). With N=380, we would place a 5% weight on the prior and a 95% weight on the new sample. We have experimented with different values. With a scalar substantially lower than 20, we find that the estimate tends to jump over time when a few firms come in or drop out of the sample. For values substantially greater than 20, we observe a reduction in predictive accuracy due to overweighting the aggregate mean.

¹⁰ An alternative method for deriving PD-implied ratings would be based on a “dynamic” mapping approach, where the PD bands associated with each rating grade move over time as the PD distribution changes. This method can also yield useful insights because it extracts aggregate market movements and identifies over/underperformance. However, as the focus here is on significant changes in the level of credit risk, we do not explore this approach further.

¹¹ It should be understood that Moody’s Investors Service credit ratings are based on the agency’s own methodologies and do not rely on Moody’s Analytics’ PD measures.

¹² We have also conducted a performance comparison of the EDF-X EWS with the previous 2019 early warning system that is available upon request.

¹³ Orbis is the resource for entity data, with information on over 448 million companies worldwide, including 45 million private companies with detailed financials and 316 million companies with ESG predicted scores. It includes comparable information, extensive corporate ownership details and comprehensive coverage, combining data from hundreds of sources.

¹⁴ Not-for-profit entities are removed from the private-firm sample.

¹⁵ Default is defined as missed payment of interest or principal, bankruptcy, administration (and legal equivalents), or distressed restructuring.

¹⁶ A watchlist policy that includes firms that fall within the severe and high EWS risk levels is implicitly putting 100% weight on the PD trigger level signal and 0% weight on the PD-implied rating early warning signal.

¹⁷ Moody’s Analytics’ default data set includes approximately 790,000 credit default events for global private firms and public firms during 1992-2021. Due to default data collection lags, we decided to exclude 2022 data at the time of EDF-X EWS development.

¹⁸ These are the same ratings that are the basis for Moody’s Investors Service’s annual corporate default study.

¹⁹ Government support impacts loss given default. It should be further noted that some investors in Fannie Mae did, in fact, experience a loss when Fannie Mae was placed into administration. From an early warning perspective, one could argue that, despite their Aaa ratings, Fannie Mae and Freddie Mac should have been on a watchlist prior to the time that government support for senior creditors was made explicit.

²⁰ We only include regions with more than 50 default cases.

²¹ “Annual default study: After a sharp decline in 2021, defaults will rise modestly this year,” Moody’s Investors Service, February 2022.

References

Chaves, Leonardo S.S. and Anamaria Pieschacon, “Trade Payment Behavioral PD Model,” August 2022.

Malone, Samuel W. and Reginald White, “Weighted Optimal (W-OPT) EDF Triggers: A Screening Methodology for Corporate Debt Portfolios.” Moody’s Analytics Research Framework, August 2020.

Malone, Samuel W., Irina Baron, and Reginald White, “The Deterioration Probability (DP) Model: Methodology and Validation.” Moody’s Analytics Research Framework, March 2018.

Pieschacon, Anamaria and Michael Zeng, “The RiskCalc Benchmark Model,” June 2022.

Nazeran, Pooya and Douglas Dwyer, “Credit Risk Modeling of Public Firms: EDF9,” Moody’s Analytics White Paper, June 2015.

Sun, Ziyi, Janet Zhao, and Gustavo Jimenez, “Identifying At-Risk Names in Your Private Firm Portfolio - RiskCalc Early Warning Toolkit.” Moody’s Analytics Viewpoints, November 2018.

Youden, W.J., “Index for Rating Diagnostic Tests,” Cancer, Vol. 3, No. 1, pp. 32-35, January 1950.

Zweig, Mark H. and Gregory Campbell, “Receiver-Operating Characteristic (ROC) Plots: A Fundamental Evaluation Tool in Clinical Medicine,” Clinical Chemistry, Vol. 39, No. 4, pp. 561-577, 1993.