Improving the Collection of Race, Ethnicity, and Language Data to Reduce Healthcare Disparities: A Case Study from an Academic Medical Center

Lee WC, Veeranki SP, Serag H, Eschbach K, Smith KD
Source: Perspect Health Inf Manag
Publication Year: 2016
Patient Need Addressed: Patient satisfaction/engagement
Population Focus: Medicaid beneficiaries
Demographic Group: Racial and ethnic minority groups
Intervention Type: Technology/innovation
Type of Literature: White

Well-designed electronic health records (EHRs) must integrate a variety of accurate information to support efforts to improve quality of care, particularly equity-in-care initiatives. This case study provides insight into the challenges those initiatives may face in collecting accurate race, ethnicity, and language (REAL) information in the EHR. We present the experience of an academic medical center strengthening its EHR for better collection of REAL data with funding from the EHR Incentive Programs for meaningful use of health information technology and the Texas Medicaid 1115 Waiver program. We also present a plan to address some of the challenges that arose during the course of the project. Our experience at an academic medical center can provide guidance about the likely challenges similar institutions may expect when they implement new initiatives to collect REAL data, particularly challenges regarding scope, personnel, and other resource needs.

Insights Results

Overview of article/project

  • This article describes the experience of an academic medical center, the University of Texas Medical Branch (UTMB), in strengthening its electronic health records (EHRs) to better collect race, ethnicity, and language (REAL) data. It also presents a plan to address some of the challenges that arose during the project
  • The REAL Data project was funded and implemented through a 5-year Delivery System Reform Incentive Payment (DSRIP) project. It established several specific achievement milestones related to the collection of patient demographics: 1) to establish or modify the registration screens and written materials in the EHR to collect accurate information; and 2) to develop a training module to guide staff in collecting the additional demographic data to be entered in the new system
  • The main purposes behind this project are: 1) to improve the UTMB health information system to report patient outcomes, diagnoses, and quality measures stratified by race, ethnicity, language, and billing/insurance status; and 2) to identify priority disparities and develop and disseminate intervention plans to address them through effective partnerships with relevant stakeholder groups

    Results


    • The REAL Data project found that complete and/or reliable REAL data were lacking in the health system prior to implementation
    • After one year of implementation, the percentage of valid REAL data collected in the EHR increased from 71.7% of 18,577 unique patients as of October 1, 2013, to 75.9% of 26,611 unique patients in the selected location by September 30, 2014
    • In response to the results of this first phase of implementation, the project team, employee education team, and Epic management team added a warning to the EHR system that automatically reminds staff to ask new patients their race, ethnicity, and language, or to ask existing patients for this information if it is missing. With this warning, the percentage of valid REAL data collected increased further to 84% of 27,491 unique patients as of September 30, 2015
    • Racial/ethnic groups differed in how they selected race and in whether they responded. For example, 5.9% of the 7,000 patients of Hispanic origin did not report their race (i.e., unknown), but less than 1% of the 14,798 non-Hispanic patients lacked race information (i.e., unknown). On the other hand, 19.5% of the 20,351 white patients did not provide information about their Hispanic origin status (i.e., unknown). This share was even higher in other groups: 24.1% of 5,862 black patients and 32.3% of 640 Asian patients did not provide information about Hispanic origin status (i.e., unknown)
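
The reported percentages can be turned into approximate patient counts and percentage-point changes. The following is a minimal sketch that uses only the figures quoted in the bullets above; the function and variable names are hypothetical, and the counts are rounded estimates because the article summary reports percentages rather than raw counts.

```python
# Figures quoted in the Results bullets: (snapshot date, % valid REAL data, unique patients).
REPORTED = [
    ("2013-10-01", 71.7, 18577),
    ("2014-09-30", 75.9, 26611),
    ("2015-09-30", 84.0, 27491),
]


def approx_valid_count(percent, total):
    """Estimate how many patients had valid REAL data (rounded, not an exact count)."""
    return round(total * percent / 100)


def point_changes(rows):
    """Percentage-point change between consecutive snapshots."""
    return [round(later[1] - earlier[1], 1) for earlier, later in zip(rows, rows[1:])]


if __name__ == "__main__":
    for date, percent, total in REPORTED:
        print(date, approx_valid_count(percent, total), "of", total)
    print("percentage-point changes:", point_changes(REPORTED))
```

Note that the patient population also grew between snapshots, so the percentage-point changes compare different denominators.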

    Key takeaways/implications

    • The REAL program increased the percentage of valid REAL data collected in the EHR over 1 year of implementation by training staff in the knowledge, skills, and attitudes needed to collect REAL data, and by adding reminders to ask patients about their race or ethnicity on discharge claims with “unknown” for race or ethnicity
    • Separating race and ethnicity into 2 data collection questions should be considered, as it may better capture missing or unknown data
    • Challenges faced in the project include the non-inclusion of socioeconomic predictors, limited complete and precise data from all patients, patient discomfort in disclosing race/ethnicity, the need to ask patients repeatedly for their REAL data, and limited financial and human resources for the project
    • The ABC plan offers recommendations to further improve REAL data collection in EHR systems: A, adjust the EHR system (e.g., record “refused/don’t know” instead of “unknown”) to allow better capacity for racial/ethnic disparities analyses, which could ultimately lead to more informed clinical and health policy decisions and to interventions tailored to specific racial/ethnic groups; B, build awareness among all professionals and patients (e.g., education/training for staff; a warning in the EHR prompting registration staff to enter a patient’s race/ethnicity if the status is “unknown”; collaborations across departments to raise awareness among patients and providers, with a script for providers that outlines the rationale for data collection and information about training, and flyers distributed in waiting rooms for patients); and C, collaborate and share lessons learned with other health systems so that programs’ beneficiaries can identify useful approaches among the many possible solutions to potential challenges
    • Future research and data collection efforts should focus on facilitating REAL data collection and its meaningful use to address racial, ethnic, and language disparities
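
As an illustration of the "A" and "B" recommendations, the sketch below models a registration record with separate race and Hispanic-origin fields, distinguishes an explicit "refused" or "don't know" answer from a question that was never asked, and lists the fields a registration reminder would prompt for. This is a hypothetical data model, not the Epic configuration used at UTMB; all names are assumptions. Skipping the prompt once a refusal is recorded is also an assumption, made here to reflect the reported challenge of repeatedly asking patients.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical explicit non-response values, recorded instead of a generic "unknown".
REFUSED = "refused"
DONT_KNOW = "don't know"


@dataclass
class RealRecord:
    """Hypothetical REAL fields; None means the question has not been asked yet."""
    race: Optional[str] = None
    hispanic_origin: Optional[str] = None  # asked separately from race
    language: Optional[str] = None


def fields_needing_prompt(record: RealRecord) -> List[str]:
    """Return the fields staff should still ask about.

    Only fields that were never asked (None) are returned. A recorded
    "refused" or "don't know" counts as an answer, so the patient is not
    asked again (an assumption of this sketch).
    """
    return [name for name, value in vars(record).items() if value is None]


if __name__ == "__main__":
    patient = RealRecord(race="white", hispanic_origin=REFUSED)
    print(fields_needing_prompt(patient))
```

Keeping "refused" and "don't know" distinct from a never-asked field lets an analysis separate patient non-response from gaps in staff data entry.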