In order for the Consumer Financial Protection Bureau (CFPB) to protect millions of consumers from unsound lending, the agency must implement the public disclosure of the enhanced Home Mortgage Disclosure Act (HMDA) data in a rigorous manner that provides comprehensive and public information about loan terms and conditions.
NCRC has a mantra about the importance of data. You may hear one of us say at a conference plenary session that “Data drives the movement for economic justice.” The idea behind this mantra is that the public disclosure of data motivates lending institutions to lend in a fair and equitable manner because they know that members of the public can analyze the data to see if they are giving borrowers a fair deal. The stakes are large. Almost 50 million consumers have a mortgage and there is about $10 trillion of mortgage debt outstanding.
First, some background about the data.
For data to be effective, it must be robust. Congress passed HMDA in 1975 for the purposes of assessing whether financial institutions are meeting the credit needs of their communities and assisting public sector entities in directing government investments in a manner that would leverage private lending in areas of need. In 1989 after amendments to HMDA, Congress and the Federal Reserve Board recognized a third HMDA purpose of aiding in the identification and eradication of discrimination in lending.
For several years, HMDA effectively identified lenders that were doing a good job of meeting credit needs as well as those that lagged in serving communities of color and working class communities. However, as subprime and abusive lending proliferated from the late 1990s through the mid-2000s, HMDA data became less effective as a check on abusive and discriminatory lending. HMDA data lacked information on loan terms and conditions and thus could not spot and help deter the surge of deceptive and harmful terms being added to high-cost loans in the run-up to the financial crisis.
Congress recognized that incomplete data cannot achieve its statutory purpose of promoting a responsible and equitable lending marketplace. Accordingly, the Dodd-Frank Wall Street Reform and Consumer Protection Act of 2010 included a provision to add several data points to HMDA. These included data on loan terms and conditions and additional data on borrower characteristics such as more racial and ethnic subcategories and indicators of creditworthiness. This information was to be publicly disseminated in an effort to prevent future crises caused by abusive and discriminatory lending. At the same time, however, Congress recognized that enhanced data created privacy risks. In other words, better data created the possibility that bad actors (adversaries) could use the publicly available data to identify actual borrowers with the objectives of either embarrassing them, stealing their identities or peddling abusive products to vulnerable consumers.
When implementing the new public disclosure requirements, Congress mandated that the CFPB use a balancing test that weighs the benefits of data dissemination against the risks to borrower privacy. A risk occurs when a single data variable or combination of variables increases the chances that an individual borrower can be identified using HMDA data. The risk is amplified when the variable discloses information about the borrower that is not otherwise publicly available and may be harmful or sensitive.
Some industry stakeholders used the balancing test requirement to argue that the CFPB must err on the side of not disclosing the new elements in the HMDA data if there is “any chance” that the HMDA data could be used for privacy invasions and criminal behavior. These industry commentators used an academic paper to assert that in any given census tract, about 72 percent of the borrowers could be identified using HMDA data in combination with other publicly available databases such as real estate transaction records. However, NCRC previously reviewed this paper, concluding that the paper understated the difficulty of matching HMDA data and real estate transaction data. For starters, county level real estate transaction data cannot be effectively matched with HMDA data, because it can typically be viewed only a few records at a time instead of being downloaded in its entirety. The CFPB implicitly recognizes the NCRC analysis (which was shared with and published by the CFPB) by stating that “it appears that this exercise (described by the paper) was undertaken solely to demonstrate that such matching can be done” instead of taking the next step and actually affirming that the matching was executed.
In a new policy guidance issued in December 2018, the CFPB maintained that HMDA data does not substantially facilitate privacy invasions and criminal activity. The CFPB stated, “Even though some adversaries may have such incentives and loan-level HMDA data has been made available to the public since 1991, the Bureau is unaware of any instances of re-identification of the data for harmful purposes.” The CFPB also found that the data to be disclosed under its final policy guidance “will be of minimal value to an adversary seeking to perpetrate identity theft or financial fraud against applicants.” The CFPB noted that the HMDA data lacks personally identifiable information such as Social Security numbers, dates of birth or personal passwords that facilitate identity theft. While some privacy risk is possible with HMDA data, it is generally “low” and acceptable given the public benefits of disclosure in terms of the promotion of a fair lending marketplace, the CFPB concluded (bolding added by author).
Consistent with its analysis, the CFPB opted to disclose most of the Dodd-Frank data enhancements, starting with the 2018 HMDA data to be disseminated in the spring of 2019. These data variables include loan terms and conditions such as the presence (or absence) of prepayment penalties, several pricing variables including total points and fees, and additional racial and ethnic subcategories for Asian and Hispanic borrowers. (However, it must be noted that smaller institutions making 500 or fewer loans will not be disclosing the new Dodd-Frank data as a result of a 2018 law passed by Congress.)
The previous HMDA data available before Dodd-Frank, including information on borrower characteristics and action taken on the application, will continue to be publicly disclosed. The CFPB, however, did modify some variables out of concern about privacy risks. One such variable was the loan amount expressed in dollars. This variable will now be expressed as the midpoint of the $10,000 interval that contains the actual loan value. For example, if a loan amount is $117,000, the public HMDA data for that loan will be reported as $115,000, which is the midpoint of the $110,000 to $120,000 interval within which the actual loan value falls. The rationale for this is that other data, including real estate or taxation data maintained by counties, include loan amounts, so an exact amount in HMDA would make matching easier. Community groups did not object to modification of loan amounts but argued for smaller intervals for smaller loan amounts, particularly for home improvement lending. We did not prevail on this point. The new Dodd-Frank variable for property value will also be reported in this manner.
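The midpoint rule can be illustrated with a short Python sketch. This is only an illustration of the arithmetic described above, not the CFPB's code; the function name is hypothetical, and the treatment of an amount that sits exactly on a $10,000 boundary (here assigned to the interval that starts at that amount) is an assumption.

```python
INTERVAL = 10_000  # width of each disclosure interval, in dollars


def disclosed_amount(actual_amount: int) -> int:
    """Return the midpoint of the $10,000 interval containing actual_amount.

    Assumption: intervals are half-open, [110_000, 120_000), so an amount
    exactly on a boundary falls into the interval that starts there.
    """
    if actual_amount < 0:
        raise ValueError("amount must be non-negative")
    lower_bound = (actual_amount // INTERVAL) * INTERVAL
    return lower_bound + INTERVAL // 2


# The example from the text: a $117,000 loan is reported as $115,000.
print(disclosed_amount(117_000))
```

Every loan between $110,000 and $119,999 maps to the same public value, which is what makes matching against county records less precise.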
The CFPB also modified its proposed disclosure of data elements for multifamily lending (defined as loans to properties with five or more units). Because the HMDA database contains significantly fewer multifamily than single family loans, the CFPB determined that there was a risk of identifying the owners of multifamily properties at the census tract level. Accordingly, instead of disclosing the actual number of units, the CFPB will disclose the data in bins: 5 to 24 units; 25 to 49; 50 to 99; 100 to 149; and 150 and over. For similar reasons, the CFPB will also disclose less precise information for multifamily loans regarding the number of units subject to public sector program income restrictions; that is, whether the units are restricted to low- and moderate-income borrowers. The agency will report the percentage of units, instead of the number, that are income restricted. NCRC and our members opposed these modifications but appreciate that the modifications could have been more severe, as some in the industry were advocating.
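As a rough sketch of how the multifamily modifications work, the Python below maps a unit count to the bins listed above and converts a count of income-restricted units to a percentage. The function names and labels are hypothetical, and whether the published percentage is rounded or itself binned is not stated here, so the sketch simply returns the raw percentage.

```python
from bisect import bisect_right

# Lower bounds of the unit-count bins listed in the text.
BIN_STARTS = [5, 25, 50, 100, 150]
BIN_LABELS = ["5-24", "25-49", "50-99", "100-149", "150+"]


def unit_bin(total_units: int) -> str:
    """Return the disclosure bin for a multifamily property's unit count."""
    if total_units < 5:
        raise ValueError("multifamily loans cover properties with five or more units")
    # bisect_right counts how many bin starts are <= total_units.
    return BIN_LABELS[bisect_right(BIN_STARTS, total_units) - 1]


def restricted_share(restricted_units: int, total_units: int) -> float:
    """Percentage of units that are income restricted (unrounded; an assumption)."""
    if total_units <= 0 or not 0 <= restricted_units <= total_units:
        raise ValueError("unit counts are inconsistent")
    return 100.0 * restricted_units / total_units


# Hypothetical property: 120 units, 30 of them income restricted.
print(unit_bin(120), restricted_share(30, 120))
```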
While backtracking on multifamily data, the CFPB improved upon its 2017 proposal regarding the disclosure of debt-to-income (DTI) information in the HMDA data. DTI is a critical source of information about the affordability and sustainability of lending, since high DTIs pose significant risk that borrowers will fall behind on their loans, as regularly occurred with subprime loans before the financial crisis. The CFPB will report most DTI data in bins, such as whether the ratio is between 20 and 30 percent, but will report actual percentages between 36 and 50 percent. This range includes important benchmarks for underwriting purposes.
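The mixed treatment of DTI (bins for most values, exact figures in the 36 to 50 percent range) might be sketched as follows. Only the 20 to 30 percent bin and the 36 to 50 percent exact range come from the text above; every other bin edge, the labels, the handling of boundary values and the function name are assumptions made for illustration.

```python
# From the text: ratios in this range are reported as actual percentages.
EXACT_LOW, EXACT_HIGH = 36, 50

# Hypothetical bin edges. Only the 20-30 bin is named in the text;
# the others are assumptions chosen to cover the remaining values.
ASSUMED_BINS = [(0, 20), (20, 30), (30, 36), (50, 60)]


def disclosed_dti(dti_percent: float) -> str:
    """Return the public representation of a debt-to-income ratio."""
    if dti_percent < 0:
        raise ValueError("DTI cannot be negative")
    if EXACT_LOW <= dti_percent < EXACT_HIGH:
        return f"{dti_percent:g}%"  # exact value, no binning
    for low, high in ASSUMED_BINS:
        if low <= dti_percent < high:
            return f"{low}%-<{high}%"
    return "60% or more"  # assumed top bin


print(disclosed_dti(25))  # falls in the 20 to 30 percent bin
print(disclosed_dti(43))  # reported as the actual percentage
```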
Privacy risks prompted the CFPB to exclude certain variables from public disclosure. One of these variables is date of application, which has always been excluded due to the possibility of matching it with publicly available databases such as real estate transaction records. The CFPB also chose not to disclose the unique loan identification number of each HMDA reportable loan due to the increased possibilities of matching. To the disappointment of community advocates, the CFPB also declined to provide identification numbers for branches of loan companies, which advocates argued were needed for fair lending enforcement inquiries that would help identify rogue branches but not individual loan officers.
One of the largest losses for the consumer advocacy community was the CFPB’s decision to completely exclude credit scores in any form from the public data. As recognized by even one industry commentator, credit score information is essential for fair lending inquiries since it can help analysts determine if similarly situated borrowers of different race/ethnicity or gender are being treated differently (denied loans or offered loans with higher prices). Steering of people of color into higher cost loans when they qualified for lower cost loans was widespread during the era of subprime lending before the financial crisis.
The CFPB concluded that the risk of harm from revealing sensitive information about a borrower outweighed the benefits of credit score disclosure. NCRC and our allies had advocated for modifying disclosure by suppressing the specific score and instead reporting it as a percentile score or as a normalized “Z score.” Consumer advocates had argued that the sensitivity issue would be mitigated by Z scores since a Z score is harder to interpret than a low FICO credit score. Although the CFPB did not accept this disclosure method, it could have opted for “aggregate” instead of individual loan level disclosure of credit scores. For each census tract, it could have reported the distribution of loans by percentile ranges of scores for all lenders as a group and for each individual lender. This would have greatly facilitated fair lending analysis and identification of outlier lenders that treated applicants from census tracts with high numbers of people of color differently than their peers did. A failure to report information in some aggregate form instead of loan level disclosure in response to privacy concerns appears to violate a disclosure provision of Dodd-Frank, as the CFPB itself discussed.
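To make the two proposed transformations concrete, here is a minimal Python sketch of a Z score and a percentile rank. The sample scores are invented for illustration, the function names are hypothetical, and the choice of reference population (here, simply the list passed in) is an assumption; the comment letters do not specify one in this article.

```python
from statistics import mean, pstdev


def z_scores(scores: list[float]) -> list[float]:
    """Express each score as standard deviations from the group mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    if sigma == 0:
        raise ValueError("all scores are identical; Z scores are undefined")
    return [(s - mu) / sigma for s in scores]


def percentile_rank(scores: list[float], score: float) -> float:
    """Percentage of scores in the group that are at or below `score`."""
    return 100.0 * sum(1 for s in scores if s <= score) / len(scores)


# Invented credit scores, for illustration only.
sample = [620, 680, 700, 740, 760]
print([round(z, 2) for z in z_scores(sample)])
print(percentile_rank(sample, 700))
```

A Z score of 0 marks a borrower at the group average, and negative values mark below-average scores, so analysts can still compare similarly situated borrowers without seeing the raw score.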
Although the CFPB finalized its policy guidance for disclosure of the 2018 HMDA data, the CFPB indicated that it will reconsider HMDA disclosure in a rulemaking likely to be commenced in May 2019. It is a safe assumption that the current CFPB is less friendly to data disclosure and friendlier to industry arguments than the previous CFPB, whose director was appointed by the previous president. Yet even the current CFPB issued a policy guidance that, while not everything consumer advocates desired, was generally expansive regarding data disclosure. It conducted objective analysis and research and did not find a single instance of privacy invasion or criminal behavior facilitated by HMDA data. It would seem that privacy risks are more seriously associated with companies like Equifax (143 million consumers had their identities exposed) and Facebook than with a 1975 disclosure law that is aimed at rooting out discrimination.
In the spring, when the CFPB reconsiders HMDA data, it needs to remember its careful balancing test analysis. Because the current analysis was comprehensive and thoughtful, it should constrain extreme moves towards data exclusion. The CFPB should also consider an accommodation regarding credit scores and a few other excluded variables. If the agency ventures far in the direction of data exclusion, it will violate the fair lending purpose of HMDA and the Dodd-Frank enhancements to HMDA.
 CFPB, Disclosure of Loan Level HMDA Data: Final Policy Guidance, p.6, see https://s3.amazonaws.com/files.consumerfinance.gov/f/documents/HMDA_Disclosure_FPG_–_Final_12.21.2018_for_website_with_date.pdf
 CFPB, p.6.
 CFPB, p. 12.
 CFPB, p. 17
 CFPB, p. 16
 NCRC, Rebuttal to personal privacy of HMDA in a world of big data, January 23, 2018, https://ncrc.org/rebuttal-personal-privacy-hmda-world-big-data/
 CFPB, p. 23
 CFPB, p. 23
 CFPB, p. 21
 CFPB, p. 25
 CFPB, p. 11.
 CFPB, pp. 45-50
 CFPB, p. 67
 CFPB, pp. 63-65
 CFPB, p. 44.
 CFPB, pp. 69-70
 CFPB, pp. 59-60
 CFPB, p. 9