Too Many Scientists Still Say Caucasian

Seven of the ten US clinical genetics laboratories that share the most data with researchers include Caucasian as a multiple-choice category for patients' racial and ethnic identity, even though the term has no scientific basis. Since 2010, nearly 5,000 biomedical papers have used Caucasian to describe European populations. This suggests that too many scientists use the term without recognizing its roots in racist taxonomies. Those taxonomies were used to justify slavery and other forms of racism, and to support pseudoscientific claims of white biological superiority.
My research lies at the intersection of statistics, evolutionary genetics, and bioethics. Since 2017, I have co-led a diverse, multidisciplinary working group, funded by the US National Institutes of Health, that investigates diversity measures in clinical genetics and genomics (go.nature.com/3su2t8n).

Many people working in genomics understand these issues deeply and want to get things right. Yet I have been dismayed at how many clinicians and academics refuse to acknowledge racism in science. Numerous studies have shown that racial groups are defined by society, not by genetics, and this has been evident for decades. Only the privileged can afford to overlook this problem. As a white woman, I too need to examine my blind spots constantly.

Pioneering works of social science, such as Dorothy Roberts's Fatal Invention (2012) and Kim TallBear's Native American DNA (2013), have pointed out many of the flawed assumptions and approaches in human genomics.

This scholarship has a common theme: racial groupings depend more on the dominant culture than on ancestry. The government of Singapore, for example, requires that all individuals be identified as Chinese, Malay or Indian, which significantly affects where they can live and study. In the United States, people with Indian and Chinese ancestry are combined into a single racial group called Asian. The term Hispanic likewise erases many cultural and ancestral identities, particularly those of Indigenous peoples in the Americas.

Erroneous beliefs about genetic races persist in broad, ambiguous continental ancestry groups such as Black, African, or African American, which are also used in the US Census. These groupings erase cultural and ancestral identities and collapse vast amounts of diversity. Study participants who do not fit into these crude buckets are often excluded from analyses, even as fewer and fewer people identify with a single population of origin.

Moving away from requiring people to identify themselves by ticking checkboxes is a practical step in the right direction. I am not calling for an end to research on genetic ancestry or on sociocultural categories such as self-identified race and ethnicity. These categories can be used to track and study equity in justice, education, and health care. But scientists and clinicians must stop conflating the two, because doing so leads them to attribute health differences to innate biology rather than to poverty and social inequality.

It is not genetics that creates health disparities, but systemic racism. It shouldn't have taken the unjust ravages of a pandemic to show this. Every physician and researcher should be aware that there is racial bias in medical practice. Some pulse oximeters are more accurate for people with light skin than for those with darker skin. Black Americans are often undertreated for pain. Historical biases in the data used to train algorithms for medical decisions can also lead to poor outcomes for vulnerable populations. This is why the American Medical Association's Manual of Style continues to revise its section on race and ethnicity. It also explains why medical schools are examining how their curricula perpetuate harmful stereotypes about race.

Researchers are now collecting more self-reported data about geographical family origins, languages spoken at home, and cultural affiliations. I would prefer data-collection forms with open-ended questions over forms that force a choice from fixed options or reduce a person's identity to a box labeled Other. These self-reported indicators can be combined with genetic data to improve current methods of mapping the diversity within our populations.

Because so much data is missing, it is difficult to determine genetic ancestry using only known reference populations. I work with the Human Pangenome Reference Consortium, which seeks to create a better and more inclusive resource for global genetic diversity. It will involve communities, particularly Indigenous peoples, in developing protocols for data collection, storage, and use. This will respect Indigenous data sovereignty and allow for more inclusive and accurate studies.

If we can accurately measure the genetic and non-genetic factors that contribute to health and disease, researchers will rely less on biologically meaningless labels that reinforce incorrect assumptions and cause harm. For example, using sequence data in clinical care might allow drug-dosage recommendations to be based on genotype rather than race.

Choosing another word to replace Caucasian will not be enough to eradicate racism in medicine and research. Still, everyone should be aware that the word can cause harm.

This article was published with permission on August 24, 2021.