Most financial institutions (FIs) find that data is the biggest hurdle in meeting regulatory requirements: they don’t have enough information, they have the wrong information, or the information is missing altogether. With the CECL accounting standard, the range of data required to estimate expected credit losses (e.g., to support reasonable and supportable forecasts) grew beyond what was previously required. While this is a good thing in the long run (the requirements gradually help FIs build up their inventory of clean, model-ready data), many FIs are finding it difficult to address data problems right now. In particular, how to handle missing data is a big concern.
Missing data is a harder problem than it first appears because not all missing data is the same. Classifications based on the root cause of the missing data guide the choice of an appropriate replacement method. The classifications are:
- Not missing at random (NMAR) – the cause of the missing data is related to the missing values themselves
  - For example, CLTV values are missing when previous values have exceeded 100%.
- Missing at random (MAR) – the cause of the missing data is related to observed values of other variables
  - For example, DTI values are missing when the number of borrowers is 2 or more.
- Missing completely at random (MCAR) – the cause of the missing data is unrelated to the values of the variable itself or of any other variable; data is missing due to an entirely random process
  - For example, LTV values are missing because a system outage caused recently loaded data to be reset to the default value of missing.
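One practical way to look for a MAR pattern like the DTI example is to compare the missing rate of a field across groups of another, fully observed field. The sketch below is a minimal illustration, not a prescribed diagnostic: the records, field names (`num_borrowers`, `dti`), and the helper `missing_rate_by_group` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical loan records; None marks a missing DTI value.
loans = [
    {"loan_id": 1, "num_borrowers": 1, "dti": 0.32},
    {"loan_id": 2, "num_borrowers": 1, "dti": 0.41},
    {"loan_id": 3, "num_borrowers": 2, "dti": None},
    {"loan_id": 4, "num_borrowers": 2, "dti": None},
    {"loan_id": 5, "num_borrowers": 1, "dti": 0.28},
    {"loan_id": 6, "num_borrowers": 3, "dti": None},
]


def missing_rate_by_group(records, target, group_key):
    """Return {group value: share of records in that group where `target` is missing}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec.get(target) is None:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


rates = missing_rate_by_group(loans, "dti", "num_borrowers")
for group, rate in sorted(rates.items()):
    print(f"num_borrowers={group}: {rate:.0%} of DTI values missing")
```

A missing rate that differs sharply between groups suggests the data is not MCAR. A check like this cannot by itself rule out NMAR, because that would require knowing the values that were never recorded.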
Once the missing data has been classified, it is easier to determine how to resolve it. For example, if the data is MCAR, there is no pattern to the missingness, so dropping the observations with missing values does not bias the results, although it does reduce the sample size. Unfortunately, data is rarely MCAR.
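The difference between dropping MCAR observations and dropping observations that are not missing at random can be seen in a small simulation. This is only a sketch under stated assumptions: the LTV values are simulated from a uniform distribution, the 20% missing rate and the cutoff of 90 are arbitrary, and the helper `drop_missing` is hypothetical.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated LTV values (percent); purely illustrative, not real loan data.
ltv_full = [random.uniform(50, 100) for _ in range(10_000)]

# MCAR: every value has the same 20% chance of going missing, whatever its size.
ltv_mcar = [v if random.random() > 0.20 else None for v in ltv_full]

# Not missing at random: values above 90 go missing because of their own size.
ltv_nmar = [v if v <= 90 else None for v in ltv_full]


def drop_missing(values):
    """Keep only the observed (non-None) values."""
    return [v for v in values if v is not None]


full_mean = mean(ltv_full)
mcar_shift = abs(mean(drop_missing(ltv_mcar)) - full_mean)
nmar_shift = abs(mean(drop_missing(ltv_nmar)) - full_mean)

print(f"shift in mean LTV after dropping MCAR rows: {mcar_shift:.2f}")
print(f"shift in mean LTV after dropping NMAR rows: {nmar_shift:.2f}")
```

In this setup the mean barely moves when MCAR rows are dropped, while dropping rows whose missingness depends on the value itself pulls the mean down noticeably.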
The following table lists some methods (not meant to be all-inclusive) an FI may use to handle the other, more common, types of missing data.
[table id=7 /]
Understanding why the data is missing is an important first step in resolving the issue. Using the imputation methods outlined above can provide a temporary solution for creating clean historical data for methodology development. However, in the long run, FIs will benefit from establishing a more permanent solution by constructing data standards and procedures and implementing a robust ongoing monitoring process to ensure the data is accurate, clean, and consistent.
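As one possible building block for ongoing monitoring, a recurring job could report the missing rate of each key field and flag any field that exceeds a tolerance. The sketch below assumes a hypothetical record layout and a hypothetical 30% tolerance; an FI would set field lists and thresholds according to its own data standards.

```python
def missing_rate_report(records, fields, max_missing_rate):
    """Return, per field, the missing rate and whether it breaches the tolerance."""
    n = len(records)
    report = {}
    for field in fields:
        n_missing = sum(1 for rec in records if rec.get(field) is None)
        rate = n_missing / n if n else 0.0
        report[field] = {"missing_rate": rate, "breach": rate > max_missing_rate}
    return report


# Hypothetical snapshot of a loan data load; None marks a missing value.
snapshot = [
    {"loan_id": 1, "ltv": 80.0, "dti": 0.35},
    {"loan_id": 2, "ltv": None, "dti": None},
    {"loan_id": 3, "ltv": 65.0, "dti": None},
    {"loan_id": 4, "ltv": 92.5, "dti": 0.40},
]

report = missing_rate_report(snapshot, ["ltv", "dti"], max_missing_rate=0.30)
for field, result in report.items():
    status = "BREACH" if result["breach"] else "ok"
    print(f"{field}: {result['missing_rate']:.0%} missing ({status})")
```

Running a check like this on every data load makes a sudden jump in missing values visible when it happens, rather than during model development.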
- FASB Accounting Standards Update, No. 2016-13, Financial Instruments – Credit Losses (Topic 326).
Samantha Zerger, business analytics consultant with FRG, is skilled in technical writing. Since graduating from North Carolina State University’s Financial Mathematics Master’s program in 2017 and joining FRG, she has taken on leadership roles in developing project documentation as well as improving internal documentation processes.