Added on May 3, 2023
Just wanted to follow up with all of you as we learn more about the address quality letter. RATA has not gotten a response to our letter yet, but one of our clients submitted a letter on their own and I think the response is worth sharing and will make many of you feel better. See below.
Question from RATA Client:
The email "HMDA Data Outreach: Review Data for Invalid Entries in the Street Address Field" leads me to believe that additional address components or special characters should not be included in the street address field.
If that is true: for LARs for years 2020 – 2022 that were submitted on time and that contain address components such as a comma, dash, or period in the street address field, will a financial institution be penalized for not correcting the street address if the geocoding is correct (and can be proven)?
Answer from HMDA Help:
Apologies for the delay in response, the HMDA Help inbox has been dealing with a high influx of cases.
Punctuation and including abbreviations of the street names and numbers are not the types of issues that our analysis was flagging.
Additionally, the guidance was sent out to assist filers with reviewing their data should they choose to do so. It was not a mandatory request for resubmission.
Thank you for reaching out for clarification!
Below, please find RATA's response to the CFPB regarding the HMDA Data Outreach: Review Data for Invalid Entries in the Street Address Field. Hopefully this letter (sent via email) will help guide you in your decision making on how you and your institutions will be handling the CFPB's recommendations.
RATA Response to CFPB RE: HMDA Data Outreach: Review Data for Invalid Entries in the Street Address Field
RATA Associates is an industry leader in software and geocoding for HMDA, CRA, and Fair Lending. Over the last 35 years, we have developed a geocoding methodology and system that consistently returns the highest precision geocode for every address processed to satisfy the Compliance-grade geocoding demanded by these regulations. We process address data for some of the country's largest lenders; in fact, our geocoding results represent over 35% of the overall volume submitted for HMDA and CRA.
The RATA geocoding system allows us to return high-precision geocode results despite the challenges associated with the addresses themselves, which can include addresses with lot numbers, apartments or units, multiple street numbers, no street numbers, cross streets, and many of the other variations you listed. Using CASS-certified USPS standardization tools, we are able to fix most of these issues and get a rooftop match or other very precise geocode result. Having our customers try to fix all of these issues would be a time-consuming endeavor with little to no benefit, as it would not improve most results at all.
We use multiple geocoders that draw on multiple data sources and proprietary processes to assign geocodes accurately. We double-verify results when we can and use fallback geocoding when necessary (on average, 1-2% of submitted addresses). We return a geocode precision code and, when the address is recognized by the USPS standardizer, a standardized address that includes a ZIP+4. Our customers can then audit their geocodes by filtering on the precision code to the applications that warrant review and visually verifying the results to find the exact Census Tract.
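As a rough illustration of the fallback approach described above, the following Python sketch tries a list of geocoders in order and flags lower-precision results for audit. It is only a toy under stated assumptions: the names `GeocodeResult`, `geocode_with_fallback`, `needs_review`, the precision labels, and the lookup tables are all hypothetical and do not represent RATA's actual system or data.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class GeocodeResult:
    census_tract: str
    precision_code: str  # hypothetical labels: "rooftop", "street"


# A geocoder takes an address string and returns a result, or None if no match.
Geocoder = Callable[[str], Optional[GeocodeResult]]


def geocode_with_fallback(
    address: str, geocoders: Sequence[Geocoder]
) -> Optional[GeocodeResult]:
    """Try each geocoder in order (most precise first); return the first match."""
    for geocode in geocoders:
        result = geocode(address)
        if result is not None:
            return result
    return None


def needs_review(result: Optional[GeocodeResult]) -> bool:
    """Flag anything that is not a rooftop match for a manual audit."""
    return result is None or result.precision_code != "rooftop"


# Toy lookups with made-up data, standing in for real geocoding services.
def rooftop_lookup(address: str) -> Optional[GeocodeResult]:
    known = {"123 MAIN ST": GeocodeResult("0001.00", "rooftop")}
    return known.get(address)


def street_lookup(address: str) -> Optional[GeocodeResult]:
    streets = {"MAIN ST": GeocodeResult("0001.00", "street")}
    for street, result in streets.items():
        if address.endswith(street):
            return result
    return None


GEOCODERS = [rooftop_lookup, street_lookup]

if __name__ == "__main__":
    for addr in ["123 MAIN ST", "LOT 7 MAIN ST", "NA"]:
        res = geocode_with_fallback(addr, GEOCODERS)
        print(addr, res, "review" if needs_review(res) else "ok")
```

In this toy setup, an incomplete address such as "LOT 7 MAIN ST" still resolves through the street-level fallback, while "NA" matches nothing and would have to fall back to something coarser.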
Over the last few weeks, we have had many emails and phone calls from our clients who are confused about how they should handle the address quality letter you sent to institutions. We are writing to you now to express our concern regarding the advice given to users to put "NA" as a placeholder for incomplete addresses. While we understand that this may seem like a reasonable solution for incomplete addresses, we would like to explain why this advice is not always the best course of action.
As you are aware, geocoding is the process of assigning geographic coordinates to a given address and further defining the demographics of the areas the applications came from. This process is essential for CRA and HMDA reporting, where results must meet compliance accuracy, as the geocodes significantly impact the accuracy and reliability of data used in decision-making processes for the banks when the information is publicly released.
The quality of geocoding results is directly influenced by the completeness and accuracy of the address information provided. While attempting to address this concern, your guidance makes several suggestions that, unfortunately, take geocode accuracy in the opposite direction and make the geocode results less accurate, specifically the suggestion to replace incomplete addresses (i.e., addresses that fail to meet USPS or 911 addressing standards) with "NA".
A couple of your examples are:
In each of the cases above, you can obtain accurate geocode results by focusing on the street(s). In over 90% of cases, the entire street lies within a single census tract. This yields a high-accuracy geocode with a 99% precision level for the loan application, whereas replacing the address with "NA" permits geocoding only to the center of a 5-digit ZIP Code, which has a 33% precision level on average. Even when the street crosses census tract boundaries, the geocoders know this and can geocode to the street's center, yielding, on average, a 75% precision level, which is still much higher than the 33% of the 5-digit ZIP Code centroid assigned to "NA".
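To make the comparison concrete, the figures quoted above can be combined into a single blended number. The short Python sketch below does that arithmetic. Treating the quoted precision levels as averages that can be weighted by the share of streets in each case is an assumption made here for illustration, and the function and constant names are hypothetical.

```python
# Figures quoted in the letter; weighting them into a blended average
# is an illustrative assumption, not a RATA calculation.
SINGLE_TRACT_SHARE = 0.90        # "over 90%" of streets lie in one tract
PRECISION_SINGLE_TRACT = 99.0    # percent, street within a single tract
PRECISION_STREET_CENTER = 75.0   # percent, street crossing tracts
PRECISION_ZIP5_CENTROID = 33.0   # percent, address replaced with "NA"


def blended_street_precision(single_tract_share: float = SINGLE_TRACT_SHARE) -> float:
    """Weighted average precision when geocoding by street."""
    crossing_share = 1.0 - single_tract_share
    return (
        single_tract_share * PRECISION_SINGLE_TRACT
        + crossing_share * PRECISION_STREET_CENTER
    )


if __name__ == "__main__":
    street = blended_street_precision()
    print(f"Street-based geocoding: about {street:.1f}%")
    print(f"ZIP centroid for NA:    {PRECISION_ZIP5_CENTROID:.1f}%")
    print(f"Difference:             about {street - PRECISION_ZIP5_CENTROID:.1f} points")
```

Under these assumptions, street-based geocoding averages roughly 96.6%, about 63.6 points above the 33% ZIP Code centroid. Because the single-tract share is stated as "over 90%", the blended figure would be somewhat higher in practice.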
As you can see, removing an address when the physical street is actually known has the opposite of the desired effect, which is accurate geocode results for all loan applications reported. Geocodes can often be accurately assigned even when a street name is unknown. For example, an address such as Lot 13 Pinewood Subdivision is common with new construction lenders. Even in these cases, the location of the subdivision is known, and geocodes can be accurately assigned. In many cases, the subdivision falls entirely within a single census tract.
Every address and every situation is unique, and even getting perfect addresses for new home builders or rural areas will prove problematic to most in-house geocoders, including the FFIEC geocoder.
We just had a client put "NA" for over 400 addresses based on your letter and most, if not all, of those geocodes are going from 99% precision to 33% precision for the 5-digit ZIP centroid based on the CFPB guidance.
This subject is like an iceberg, and the 10% of it above the water does not tell the complete story. Again, we have been doing this for over 35 years and have seen it all regarding geocoding and results. We would be happy to have a conference call and go into more detail about the other 90% of this iceberg with your team if you would like to discuss it further.
In conclusion, while the advice to use "NA" as a placeholder for incomplete addresses may seem reasonable, it is not usually the best course of action. Incomplete address information will most likely still yield accurate geocodes, and using "NA" will definitely reduce the accuracy and reliability of geocoding results. Since the overall purpose of the address is to provide the best geocode accuracy for all loan applications, providing as much information as possible when entering address information is preferable.
We hope that this information will be helpful in updating your guidance on geocoding and that it will be taken into consideration when providing advice to users. Thank you for your time and attention to this matter.
RATA Associates Compliance-grade Geocoding Team