CECL – The Power of Vintage Analysis

I would argue that a critical step in getting ready for CECL is to review the vintage curves of the segments that have been identified. Not only do the resulting graphs provide useful information, but the process itself also requires careful thought about how to prepare the data.

Consider the following graph of auto loan losses for different vintages of Not-A-Real-Bank bank[1]:

[Figure: stylized vintage curves of auto loan losses by origination quarter]

While this is a highly-stylized depiction of vintage curves, its intent is to illustrate what information can be gleaned from such a graph. Consider the following:

  1. A clear end to the seasoning period can be determined (period 8)
  2. Outlier vintages can be identified (2015Q4)
  3. Visual confirmation that segmentation captures risk profiles (there is no substantial number of vintages behaving oddly)

But that’s not all! To get to this graph, some important questions need to be asked about the data. For example:

  1. Should prepayment behavior be captured when deriving the loss rates? If so, what’s the definition of prepayment?
  2. At what time period should the accumulation of losses be stopped (e.g., contractual term)?
  3. Is there enough loss[2] behavior to model on the loan level?
  4. How should accounts that renew be treated (e.g., put in new vintage)?
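To make the data-preparation work concrete, below is a minimal sketch of how cumulative vintage loss curves might be derived once the questions above have been answered. The table layout and column names (a loan-period panel with loan_id, vintage, period, orig_balance, and net_loss) are assumptions for illustration, not a prescription.

```python
import pandas as pd

def vintage_loss_curves(panel: pd.DataFrame) -> pd.DataFrame:
    """Cumulative loss rate by vintage and age, from a hypothetical loan-period panel."""
    # Denominator: balance originated in each vintage (count each loan once).
    originated = (panel.drop_duplicates("loan_id")
                       .groupby("vintage")["orig_balance"].sum())

    # Numerator: net losses per vintage and period, accumulated over the vintage's life.
    cum_losses = (panel.groupby(["vintage", "period"])["net_loss"].sum()
                       .groupby(level="vintage").cumsum())

    # Cumulative loss rate; pivot so each vintage becomes one curve.
    rates = (cum_losses.div(originated, level="vintage")
                       .rename("cum_loss_rate").reset_index())
    return rates.pivot(index="period", columns="vintage", values="cum_loss_rate")

# curves = vintage_loss_curves(panel)
# curves.plot()  # one line per vintage; a flattening curve marks the end of seasoning
```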

In conclusion, performing vintage analysis is more than just creating a picture with many different colors. It provides insight into the segments, makes one consider the data, and, if the data is appropriately constructed, positions one for subsequent analysis and/or modeling.

Jonathan Leonardelli, FRM, Director of Business Analytics for the Financial Risk Group, leads the group responsible for model development, data science, documentation, testing, and training. He has over 15 years’ experience in the area of financial risk.

 

[1] Originally I called this bank ACME Bank, but when I searched to see if one existed I got this, this, and this…so I changed the name. I then did a search of the new name and promptly fell into a search engine rabbit hole from which, after a while, I climbed out with the realization that for any one- or two-word combination I come up with, someone else has already done the same and then added “bank” to the end.

[2] You can also build vintage curves on defaults or prepayment.

 

RELATED:

CECL—Questions to Consider When Selecting Loss Methodologies

CECL—The Caterpillar to Butterfly Evolution of Data for Model Development

CECL – Data (As Usual) Drives Everything

CECL—Questions to Consider When Selecting Loss Methodologies

Paragraph 326-20-30-3 of the Financial Accounting Standards Board (FASB) standards update[1] states: “The allowance for credit losses may be determined using various methods”. I’m not sure if any statement, other than “We need to talk”, can be as fear-inducing. Why is it scary? Because in the world of details and accuracy, this statement is remarkably vague and not prescriptive.

Below are some questions to consider when determining the appropriate loss methodology approaches for a given segment.

How much history do you have?

If a financial institution (FI) has limited history[2] then the options available to it are, well, limited. To build a model one needs sufficient data to capture the behavior (e.g., performance or payment) of accounts. Without enough data the probability of successfully building a model is low. Worse yet, even if one builds a model, the likelihood of it being useful and robust is minimal. As a result, loss methodology approaches that do not need a lot of data should be considered (e.g., discounted cash flow or a qualitative factor approach based on industry information).

Have relevant business definitions been created?

The loss component approach (decomposing loss into PD, LGD, and EAD) is considered a leading practice at banks[3]. However, in order to use this approach, definitions of default and, arguably, paid-in-full need to be created for each segment being modeled. (Note: these definitions can be the same or different across segments.) Without these definitions, one does not know when an account has defaulted or paid off.

Is there a sufficient number of losses or defaults in the data?

Many of the loss methodologies available for consideration (e.g., loss component or vintage loss rates) require enough losses to discern a pattern. As a result, banks that are blessed with infrequent losses can feel cursed when they try to implement one of those approaches. While low losses do not necessarily rule out these approaches, they do make for a more challenging process.

Are loan level attributes available, accurate, and updated appropriately?

This question tackles the granularity at which an approach is applied rather than the approach itself. As mentioned in the post CECL – Data (As Usual) Drives Everything, there are three different data granularity levels a model can be built on. Typically, the decision is between loan level and segment level. Loan-level models are great for capturing sensitivities to loan characteristics and macroeconomic events, provided the loan characteristics are accurate and updated (if needed) on a regular interval.

Jonathan Leonardelli, FRM, Director of Business Analytics for the Financial Risk Group, leads the group responsible for model development, data science, documentation, testing, and training. He has over 15 years’ experience in the area of financial risk.

 

[1] The FASB accounting standards update can be found here

[2] There is no consistent rule, at least that I’m aware of, that defines “limited history”. That said, we typically look for clean data reaching back through an economic cycle.

[3] See: Capital Planning at Large Bank Holding Companies: Supervisory Expectations and Range of Current Practice, August 2013

RELATED:

CECL—The Caterpillar to Butterfly Evolution of Data for Model Development

CECL – Data (As Usual) Drives Everything

CECL—The Caterpillar to Butterfly Evolution of Data for Model Development

I don’t know about you, but I find caterpillars to be a bit creepy[1]. On the other hand, I find butterflies to be beautiful[2]. Oddly enough, this aligns to my views on the different stages of data in relation to model development.

As a financial institution (FI) prepares for CECL, it is strongly suggested (by me at least) to know which stage the data falls into. Knowing its stage provides one with guidance on how to proceed.

The Ugly

At FRG we use the term dirty data to describe data that is ugly. Dirty data typically has the following characteristics (the list is not comprehensive):

  • Unexplainable missing values: The key word is unexplainable. Missing values can mean something (e.g., a value has not been captured yet) but often they indicate a problem. See this article for more information.
  • Inconsistent values: For example, a character variable that holds values for state might have Missouri, MO, or MO. as values. A numeric variable for interest rate might have a value as a percent (7.5) or as a decimal (0.075).
  • Poor definitional consistency: This occurs when a rule that is used to classify some attribute of an account changes during history. For example, at one point in history a line of credit might be indicated by a nonzero original commitment amount, but at a different point it might be indicated by whether a revolving flag is non-missing.

The Transition

You should not model or perform analysis using dirty data. Therefore, the next step in the process is to transition dirty data into clean data.

Transitioning to clean data, as the name implies, requires scrubbing the information. The main purpose of this step is to address the issues identified in the dirty data. That is, one would want to fix missing values (e.g., through imputation), standardize variable values (e.g., all states are identified by a two-character code), and correct inconsistent definitions (e.g., a line indicator is always based on a nonzero original commitment amount).
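As a rough illustration (not FRG’s actual process), a cleaning step for the examples above might look something like this; the column names and mappings are hypothetical:

```python
import pandas as pd

def clean(dirty: pd.DataFrame) -> pd.DataFrame:
    df = dirty.copy()

    # Fix missing values (simple median imputation for interest rate, as one option)
    df["interest_rate"] = df["interest_rate"].fillna(df["interest_rate"].median())

    # Standardize variable values: states as two-character codes, rates as decimals
    df["state"] = df["state"].replace({"Missouri": "MO", "MO.": "MO"})
    df.loc[df["interest_rate"] > 1, "interest_rate"] /= 100

    # Correct inconsistent definitions: line indicator always based on a
    # nonzero original commitment amount
    df["line_flag"] = (df["orig_commitment"] > 0).astype(int)
    return df
```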

The Beautiful

A final step must be taken before data can be used for modeling. This step takes clean data and converts it to model-ready data.

At FRG we use the term model-ready to describe clean data with the application of relevant business definitions. An example of a relevant business definition would be how an FI defines default[3]. Once the definition has been created the corresponding logic needs to be applied to the clean data in order to create, say, a default indicator variable.
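For instance, if the chosen default definition were 90+ DPD, bankruptcy, or non-accrual (one of the options in footnote [3]), the model-ready step might apply it along these lines; the field names are assumed for illustration:

```python
import pandas as pd

def add_default_indicator(clean: pd.DataFrame) -> pd.DataFrame:
    model_ready = clean.copy()
    # Default if 90+ days past due, in bankruptcy, or in non-accrual
    model_ready["default_flag"] = (
        (model_ready["days_past_due"] >= 90)
        | (model_ready["bankruptcy_flag"] == 1)
        | (model_ready["nonaccrual_flag"] == 1)
    ).astype(int)
    return model_ready
```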

Just like a caterpillar metamorphosing to a butterfly, dirty data needs to morph to model-ready for an FI to enjoy its true beauty. And, only then, can an FI move forward on model development.

 

Jonathan Leonardelli, FRM, Director of Business Analytics for the Financial Risk Group, leads the group responsible for model development, data science, documentation, testing, and training. He has over 15 years’ experience in the area of financial risk.

 

[1] Yikes!

[2] Pretty!

[3] E.g., is it 90+ days past due (DPD) or 90+ DPD or in bankruptcy or in non-accrual or …?

 

RELATED:

CECL—Questions to Consider When Selecting Loss Methodologies

CECL – Data (As Usual) Drives Everything

CECL – Data (As Usual) Drives Everything

To appropriately prepare for CECL a financial institution (FI) must have a hard heart-to-heart with itself about its data. Almost always, simply collecting data in a worksheet, reviewing it for gaps, and then giving it the thumbs up is insufficient.

Data drives all parts of the CECL process. The sections below, by no means exhaustive, provide key areas where your data, simply by being your data, constrains your options.

Segmentation

Paragraph 326-20-30-2 of the Financial Accounting Standards Board (FASB) standards update[1] states: “An entity shall measure expected credit losses of financial assets on a collective (pool) basis when similar risk characteristic(s) exist.” It then points to paragraph 326-20-55-5 which provides examples of risk characteristics, some of which are: risk rating, financial asset type, and geographical location.

Suggestion: prior to reviewing your data consider what risk profiles are in your portfolio. After that, review your data to see if it can adequately capture those risk profiles. As part of that process consider reviewing:

  • Frequency of missing values in important variables
  • Consistency in values of variables
  • Definitional consistency[2]
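A quick sketch of how that review might start is shown below; the file and column names are placeholders for whatever your extract actually contains.

```python
import pandas as pd

data = pd.read_csv("portfolio_history.csv")  # hypothetical extract

# Frequency of missing values in important variables
print(data[["credit_score", "ltv", "orig_balance"]].isna().mean())

# Consistency in values of variables (e.g., how is state recorded?)
print(data["state"].value_counts(dropna=False).head(20))
```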

Methodology Selection

The FASB standards update does not provide guidance as to which methodologies to use[3]. That decision is entirely up to the FI[4]. However, the methodologies that are available to the FI are limited by the data it has. For example, if an FI has limited history then any of the methodologies that are rooted in historical behavior (e.g., vintage analysis or loss component) are likely out of the question.

Suggestion: review the historical data and ask yourself these questions: 1) do I have sufficient data to capture the behavior for a given risk profile?; 2) is my historical data of good quality?; 3) are there gaps in my history?

Granularity of Model

Expected credit loss can be determined on three different levels of granularity: loan, segment (i.e., risk profile), and portfolio. Each granularity level has a set of pros and cons but which level an FI can use depends on the data.

Suggestion: review variables that are account specific (e.g., loan-to-value, credit score, number of accounts with the institution) and ask yourself: are the sources of these variables reliable? Do they get refreshed often enough to capture changes in customer behavior or the macroeconomic environment?

Hopefully, this post has started you critically thinking about your data. While data review might seem daunting, I cannot stress enough—it’s needed, it’s critical, it’s worth the effort.

 

Jonathan Leonardelli, FRM, Director of Business Analytics for the Financial Risk Group, leads the group responsible for model development, data science, documentation, testing, and training. He has over 15 years’ experience in the area of financial risk.

 

[1] You can find the update here

[2] More on what these mean in a future blog post

[3] Paragraph 326-20-30-3

[4] A future blog post will cover some questions to ask to guide in this decision.

 

RELATED:

CECL—The Caterpillar to Butterfly Evolution of Data for Model Development

Avoiding Discrimination in Unstructured Data

An article published by the Wall Street Journal on Jan. 30, 2019  got me thinking about the challenges of using unstructured data in modeling. The article discusses how New York’s Department of Financial Services is allowing life insurers to use social media, as well as other nontraditional sources, to set premium rates. The crux: the data cannot unfairly discriminate.  

I finished the article with three questions on my mind. The first: How does a company convert unstructured data into something useful? The article mentions that insurers are leveraging public information – like motor vehicle records and bankruptcy documents – in addition to social media. Surely, though, this information is not in a structured format to facilitate querying and model builds.

Second: How does a company ensure the data is good quality? Quality here doesn’t only mean the data is clean and useful; it also means the data is complete and unbiased. A lot of effort will be required to take this information and make it model ready. Otherwise, the models will at best provide spurious output and at worst provide biased output.

The third: With all this data available what “new” modeling techniques can be leveraged? I suspect many people read that last sentence and thought AI. That is one option. However, the key is to make sure the model does not unfairly discriminate. Using a powerful machine learning algorithm right from the start might not be the best option. Just ask Amazon about its AI recruiting tool.[1]

The answers to these questions are not simple, and they do require a blend of technological aptitude and machine learning sophistication. Stay tuned for future blog posts as we provide answers to these questions.

 

[1] Amazon scraps secret AI recruiting tool that showed bias against women

 

Jonathan Leonardelli, FRM, Director of Business Analytics for the Financial Risk Group, leads the group responsible for model development, data science, documentation, testing, and training. He has over 15 years’ experience in the area of financial risk.

IFRS 9: Evaluating Changes in Credit Risk

Determining whether an unimpaired asset’s credit risk has meaningfully increased since the asset was initially recognized is one of the most consequential issues banks encounter in complying with IFRS 9. Recall the stakes:

  • The expected credit loss for Stage 1 assets is calculated using the 12-month PD
  • The ECL for Stage 2 assets (defined as assets whose credit risk has significantly increased since they were first recognized on the bank’s books) is calculated using the lifetime PD, just as it is for Stage 3 assets (which are in default).

To make the difference more concrete, consider the following:

  • A bank extends an interest-bearing five-year loan of $1 million to Richmond Tool, a hypothetical Virginia-based tool, die, and mold maker serving the defense industry.
  • At origination, the lender estimates the PD for the next 12 months at 1.5%, the PD for the rest of the loan term at 4%, and the loss that would result from default at $750,000.
  • In a subsequent reporting period, the bank updates those figures to 2.5%, 7.3%, and $675,000, respectively.

If the loan were still considered a Stage 1 asset at the later reporting date, the ECL would be $16,875. But if it is deemed a Stage 2 or Stage 3 asset, then the ECL is $66,150, nearly four times as great.
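As a quick check on the arithmetic (a simplified calculation that ignores discounting and takes the lifetime PD as the sum of the 12-month and remaining-term PDs, consistent with the figures above):

```python
lgd_amount = 675_000        # updated loss given default
pd_12m = 0.025              # updated 12-month PD
pd_remaining = 0.073        # updated PD over the remainder of the term

ecl_stage1 = pd_12m * lgd_amount                   # 12-month ECL
ecl_stage2 = (pd_12m + pd_remaining) * lgd_amount  # lifetime ECL

print(ecl_stage1)  # 16875.0
print(ecl_stage2)  # 66150.0
```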

Judging whether the credit risk underlying those PDs has materially increased is obviously important. But it is also difficult. There is a “rebuttable presumption” that an asset’s credit risk has increased materially when contractual payments are more than 30 days past due. In general, however, the bank cannot rely solely upon past-due information if forward-looking information is to be had, either on a loan-specific or a more general basis, without unwarranted trouble or expense.

The bank need not undertake an exhaustive search for information, but it should certainly take into account pertinent intelligence that is routinely gathered in the ordinary course of business.

For instance, Richmond Tool’s financial statements are readily available. Balance sheets are prepared as of a point in time; income and cash flow statements reflect periods that have already ended. Nonetheless, traditional ratio analysis serves to evaluate the company’s prospects as well as its current capital structure and historical operating results. With sufficient data, models can be built to forecast these ratios over the remaining life of the loan. Richmond Tool’s projected financial position and earning power can then be used to predict stage transitions.

Pertinent external information can also be gathered without undue cost or effort. For example, actual and expected changes in the general level of interest rates, mid-Atlantic unemployment, and defense spending are likely to affect Richmond Tool’s business prospects, and, therefore, the credit risk of the outstanding loan. The same holds true for regulatory and technological developments that affect the company’s operating environment or competitive position.

Finally, the combination of qualitative information and non-statistical quantitative information such as actual financial ratios may be enough to reach a conclusion. Often, however, it is appropriate to apply statistical models and internal credit rating processes, or to base the evaluation on both kinds of information. In addition to designing, populating, and testing mathematical models, FRG can help you integrate the statistical and non-statistical approaches into your IFRS 9 platform.

For more information about FRG’s modeling expertise, please click here.

IFRS 9: Modeling Challenges

Calculating expected credit losses under IFRS 9 is easy. It requires little more than high school algebra to determine the aggregate present value of future cash flows. But it is not easy to ascertain the key components that are used by the basic equation—regardless of whether the approach taken is “advanced” (i.e., where PD, LGD, and EAD are modeled) or “simplified” (also called “intermediate”). The forward-looking stance mandated by IFRS 9 makes the inherently difficult process of specifying these variables all the more complex.
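For reference, one common way of writing that basic equation under the advanced approach (an illustrative formulation; the standard does not prescribe a single formula) is

$$\mathrm{ECL} = \sum_{t=1}^{T} PD_t \times LGD_t \times EAD_t \times \frac{1}{(1+r)^t}$$

where $PD_t$ is the marginal probability of default in period $t$, $LGD_t$ and $EAD_t$ are the loss given default and exposure at default for that period, $r$ is the effective interest rate, and $T$ is the remaining life of the instrument.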

For the sake of brevity, let’s consider only the advanced approach for this discussion. There are two immediate impacts on PD model estimation: the point-in-time requirements and the length of the forecast horizon.

PD estimates need to reflect point-in-time (PIT) rather than through-the-cycle (TTC) values. What this means is that PDs are expected to represent the current period’s economic conditions instead of some average through an economic cycle. Bank risk managers will have to decide whether they can adapt a CCAR (or other regulatory) model to this purpose, determine a way to convert a TTC PD to a PIT PD, or build an entirely new model.
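One widely cited (but by no means mandated) way to convert a TTC PD to a PIT PD is a single-factor, Vasicek-style adjustment. In the sketch below, the asset correlation rho and the credit-cycle index z are illustrative assumptions:

```python
from scipy.stats import norm

def pit_pd(ttc_pd: float, z: float, rho: float = 0.10) -> float:
    """Condition a through-the-cycle PD on the current credit cycle (z > 0 = good times)."""
    return norm.cdf((norm.ppf(ttc_pd) - rho**0.5 * z) / (1 - rho)**0.5)

# Example: a 2% TTC PD in a mild downturn (z = -1) yields a higher PIT PD
# print(pit_pd(0.02, z=-1.0))
```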

The length of the forecast horizon has two repercussions. First, one must consider how many models to build for estimating PDs throughout the forecast. For example, it may be determined that a portfolio warrants one model for year 1, a second model for years 2 to 3, and a third model for years 3+. Second, one should consider how far into the forecast horizon to use models. Given the impacts of model risk, along with the onus of maintaining multiple models, perhaps PDs for a horizon greater than seven years would be better estimated by drawing a value from some percentile of an empirical distribution.

Comparatively speaking, bank risk managers may find it somewhat less difficult to estimate LGDs, especially if collateral values are routinely updated and historical recovery rates for comparable assets are readily available in the internal accounting systems. That said, IFRS 9 requires an accounting LGD, so models will need to be developed to accommodate this, or a process will have to be defined to convert an economic LGD into an accounting one.

Projecting EADs is similarly challenging. Loan amortization schedules generally provide a valid starting point, but unfortunately they are only useful for installment loans. How does one treat a revolving exposure? Can one leverage, and tweak, the same rules used for CCAR? In addition, embedded options have to be taken into account. There’s no avoiding it: estimating EADs calls for advanced financial modeling.
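One common starting point for a revolving exposure, offered here only as an illustration, is a credit conversion factor (CCF) applied to the undrawn portion of the line; the 40% CCF below is purely hypothetical:

```python
def ead_revolving(drawn: float, limit: float, ccf: float = 0.40) -> float:
    """EAD = drawn balance plus an assumed share (CCF) of the undrawn commitment."""
    undrawn = max(limit - drawn, 0.0)
    return drawn + ccf * undrawn

# print(ead_revolving(drawn=60_000, limit=100_000))  # 76000.0
```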

As mentioned above, there are differences between the requirements of IFRS 9 and those of other regulatory requirements (e.g., CCAR). As a result, the models that banks use for stress testing or other regulatory functions cannot be used as-is for IFRS 9 reporting. Bank risk managers will have to decide, then, whether their CCAR models can be adapted with relatively minor modifications. In many cases they may conclude that it makes more sense to develop new models. Then all the protocols and practices of sound model design and implementation come into play.

Of course, it is also important to explain the conceptual basis and present the supporting evidence for PD, LGD, and EAD estimates to senior management—and to have the documentation on hand in case independent auditors or regulatory authorities ask to see it.

In short, given PD, LGD, and EAD, it’s a trivial matter to calculate expected credit losses. But preparing to comply with the IFRS 9 standard is serious business. It’s time to marshal your resources.

IFRS 9: Classifying and Staging Financial Assets

Under IFRS 9, Financial Instruments, banks will have to estimate the present value of expected credit losses in a way that reflects not only past events but also current and prospective economic conditions. Clearly, complying with the 160-page standard will require advanced financial modeling skills. We’ll have much more to say about the modeling challenges in upcoming posts. For now, let’s consider the issues involved in classifying financial assets and liabilities.

The standard introduces a principles-based classification scheme that will require banks to look at financial instruments in a new way. Derivative assets are classified as “fair value through profit and loss” (FVTPL), but other financial assets have to be sorted according to their individual contractual cash flow characteristics and the business model under which they are held. Figure 1 summarizes the classification process for debt instruments. There are similar decisions to be made for equities.

The initial classification of financial liabilities is, if anything, more important because they cannot be reclassified. Figure 2 summarizes the simplest case.

That’s only the first step. Once all the bank’s financial assets have been classified they have to be sorted into stages reflecting their exposure to credit loss:

  • Stage 1 assets are performing
  • Stage 2 assets are underperforming (that is, there has been a significant increase in their credit risk since the time they were originally recognized)
  • Stage 3 assets are non-performing and therefore impaired

These crucial determinations have direct consequences for the period over which expected credit losses are estimated and the way in which effective interest is calculated. Mistakes in staging can have a very substantial impact on the bank’s credit loss provisions.

In addition to the professional judgment that any principles-based regulation or accounting standard demands, preparing data for the measurement of expected credit losses requires creating and maintaining both business rules and data transformation rules that may be unique for each portfolio or product. A moderately complex organization might have to manage hundreds of rules and data pertaining to thousands of financial instruments. Banks will need systems that make it easy to update the rules (and debug the updates); track data lineage; and extract both the rules and the data for regulators and auditors.
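As a toy example of what one such business rule might look like in code (the input flags are assumed to be produced by upstream rules and are not part of the standard itself):

```python
def assign_stage(sicr_flag: bool, impaired_flag: bool) -> int:
    """Map an asset to an IFRS 9 stage from two hypothetical upstream indicators."""
    if impaired_flag:   # non-performing / credit-impaired
        return 3
    if sicr_flag:       # significant increase in credit risk since initial recognition
        return 2
    return 1            # performing
```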

IFRS 9 is effective for annual periods beginning on or after January 2018. That’s only about 18 months from now. It’s time to get ready.

[IFRS 9 Figure 1: classification process for debt instruments]

[IFRS 9 Figure 2: initial classification of financial liabilities (simplest case)]

Managing Model Risk

The Federal Reserve and the OCC define model risk as “the potential for adverse consequences from decisions based on incorrect or misused model outputs and reports.”[1]  Statistical models are the core of stress testing and credit analysis, but banks are increasingly using them in strategic planning. And the more banks integrate model outputs into their decision making, the greater their exposure to model risk.

Regulators have singled out model risk for supervisory attention;[2] managers who have primary responsibility for their bank’s model development and implementation processes should be no less vigilant. This article summarizes the principles and procedures we follow to mitigate model risk on behalf of our clients.

The first source of model risk is basing decisions on incorrect output.  Sound judgment in the design stage and procedural discipline in the development phase are the best defenses against this eventuality. The key steps in designing a model to meet a given business need are determining the approach, settling on the model structure, and articulating the assumptions.

  • Selecting the approach means choosing the optimal level of granularity (for example, should the model be built at the loan or segment level).
  • Deciding on the structure means identifying the most suitable quantitative techniques (for example, should a decision tree, multinomial logistic, or deep learning model be used).
  • Stating the assumptions means describing both those that are related to the model structure (for instance, distribution of error terms) and those pertaining to the methodology (such as default expectations and the persistence of historical relationships over the forecast horizon).

Once the model is defined, the developers can progressively refine the model, critically subjecting it to rounds of robust testing both in and out of sample. They will make further adjustments until the model reliably produces plausible results.
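As a small, self-contained illustration of the in-sample versus out-of-sample comparison (synthetic data and a logistic regression are assumptions; any model type could be substituted):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a development data set
X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A large gap between the two suggests overfitting and another refinement cycle
in_sample = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
out_of_sample = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(in_sample, out_of_sample)
```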

Additionally, independent model validation teams provide a second opinion on the efficacy of the model.  Further model refinement might be required.  This helps to reduce the risk of confirmation bias on the part of the model developer.

This iterative design, development, and validation process reduces the first kind of risk by improving the likelihood that the final version will give decision makers solid information.

The second kind of model risk, misusing the outputs, can be addressed in the implementation phase. Risk managers learned the hard way in the financial crisis of 2007-2008 that it is vitally important for decision makers to understand—not just intellectually but viscerally—that mathematical modeling is an art and models are subject to limitations. The future may be unlike the past.  Understanding the limitations can help reduce the “unknown unknowns” and inhibit the misuse of model outputs.

Being aware of the potential for model risk is the first step. Acting to reduce it is the second. What hedges can you put in place to mitigate the risk?

First, design, develop, and test models in an open environment which welcomes objective opinions and rewards critical thinking.  Give yourself enough time to complete multiple cycles of the process to refine the model.

Second, describe each model’s inherent limitations, as well as the underlying assumptions and design choices, in plain language that makes sense to business executives and risk managers who may not be quantitatively or technologically sophisticated.

Finally, consider engaging an independent third party with the expertise to review your model documentation, audit your modeling process, and validate your models.

For information on how FRG can help you defend your firm against model risk, please click here.

[1] Federal Reserve and OCC, “Supervisory Guidance on Model Risk Management,” Attachment to SR Letter 11-07 (April 4, 2011), page 3. Emphasis added.

[2] See for example the Federal Reserve’s SR letters 15-8 and 12-17.

The Heroes of the Risk Quantification Process

What makes for a successful risk quantification process?  Prior to joining the firm I thought it was all about analytics (my own specialty).   I’ve come to realize that a happy marriage between data, analytics, and reporting needs to take place.  Each component brings a necessary piece to the risk puzzle of a portfolio.

But it goes beyond just that.  After working in all three areas I realized that the talented people who specialized in a particular domain were, in a sense, heroes.

The Unsung Hero

These are the people working with the data.  They are also, I believe, the lynchpin to the entire process.   The individuals who work in this area go through much effort (and frustration) to ensure the data being piped down the line is clean, coherent, relevant, and current.  This involves cool stuff like using data models and fancy acronyms like ETL.

What shocks me the most?  Few people truly recognize the importance of these individuals.  Especially if the data is clean and correct.  If it is dirty and incorrect, you know it in a hurry.

The Superhero

These are the people using statistics/mathematics to assess risk.  Much like superheroes with their utility belts or powers, people in this group have their own special tools.  These individuals use a host of nifty items to get a sense of the risk in the portfolio.  Value-at-Risk, regression models, time series analysis, copulas and other intimidating sounding, but extremely useful, tools are employed.

The Epic Hero

These are the people who take the data and analytics and build reports.  In literature, an epic hero is a person favored by the gods.  In this case, the “gods” can be one’s boss or upper management.  Individuals who do this well get praise upon praise upon praise.  The reason why: done correctly, nothing tells a better story than a picture…with pretty colors and nicely formatted numbers.

In Summary

There are three core pieces required to establish a solid risk quantification process.  And, if you are lucky enough to be working with heroes, then all sorts of insight about the risk in one’s portfolio can be obtained.

A parting suggestion: next time you want to praise the person who created your awesome report, do so.  Then follow the thumping sound – that will be the data person banging his head off the nearest wall.  Make sure you thank him too.
