About four years ago, researchers at Stanford University showed in a study that an artificial intelligence (AI) was better than humans at determining a person’s sexual orientation from portrait photos. The AI could identify whether someone was homosexual with up to 81 percent accuracy; with five images per person as source material, the reliability rose as high as 91 percent. The result sent a chill through the LGBTQ community and among human rights activists even then. It is easy to imagine that states which criminalize homosexuality could use such a procedure against their own citizens, a prospect that brings back painful memories of the Third Reich’s racial doctrine.
In the Stanford study, the researchers could still tell fairly well which criteria the AI used to make the distinction: certain facial features as depicted in the test photos. A new study, however, leaves its own authors puzzled as to how the AI arrives at its conclusions.
A study by an American research group shows that an AI can distinguish Asian, Black, and white patients from X-ray and computed tomography images. The surprising thing is that human experts are unable to make such a classification themselves; there are not even known criteria for such a distinction.
In this work, AI models pretrained on ImageNet were trained on datasets of chest, limb, breast, and spine images, each labeled with the patient’s self-reported race. The models were then able to classify between 80 and 97 percent of the images correctly.
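To make the setup concrete, here is a minimal sketch of the general idea: train a classifier on feature data labeled with self-reported categories, then measure held-out accuracy. This is emphatically not the study’s pipeline — a tiny logistic regression on synthetic vectors stands in for a deep ImageNet-pretrained network, and random numbers stand in for medical images; every name and value below is illustrative.

```python
import math
import random

random.seed(0)

# Toy stand-in for the study's setup: feature vectors play the role of
# images, and a linear classifier plays the role of the deep model.
# The label attached to each vector corresponds to the self-reported
# category; the small mean shift between groups is purely synthetic.
DIM = 20

def sample(group, n):
    shift = 0.3 if group == 1 else -0.3
    return [([random.gauss(shift, 1.0) for _ in range(DIM)], group)
            for _ in range(n)]

train = sample(0, 400) + sample(1, 400)
test = sample(0, 100) + sample(1, 100)
random.shuffle(train)

# Logistic regression trained by stochastic gradient descent.
w = [0.0] * DIM
b = 0.0
lr = 0.05
for _ in range(20):                      # epochs
    for x, y in train:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of group 1
        g = p - y                        # gradient of the log loss
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

def predict(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0

accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is only the shape of the pipeline — features in, self-reported labels as the training target, accuracy out — not any claim about what signal a real model picks up from medical images.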
The study authors tried to explain the AI’s results through parameters such as age, gender, body mass, or tissue density, but these turned out to have little explanatory power. The AI even reached correct results when the images were of poor quality.
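The logic of that confounder check can be illustrated in a toy way: if a parameter such as age explained the model’s output, then age alone should already predict the self-reported label well. In the synthetic sketch below (all distributions and numbers are invented for illustration), it barely beats chance.

```python
import random

random.seed(1)

# Toy illustration of a confounder check: give the two groups only a
# tiny difference in a covariate ("age") and ask how well that
# covariate alone predicts the label. All numbers are synthetic.
n = 2000
data = []
for _ in range(n):
    y = random.randint(0, 1)
    age = random.gauss(50 + 2 * y, 12)   # small group difference in age
    data.append((age, y))

# Best single-threshold classifier on age (either direction).
best = 0.0
for t in range(20, 80):
    acc = sum((age > t) == y for age, y in data) / n
    best = max(best, acc, 1 - acc)
print(f"best accuracy from age alone: {best:.2f}")
```

An accuracy near 50 percent from the candidate confounder, against 80 to 97 percent from the images themselves, is the pattern that led the authors to conclude the parameters they examined could not account for the model’s performance.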
The study authors themselves sounded the alarm. Without understanding how the AI reaches its conclusions, there is a danger that it could use a patient’s race to make suggestions and decisions that disadvantage certain groups — and human experts would not even notice, let alone be able to trace it.
Here’s what the researchers themselves said:
We emphasize that model ability to predict self-reported race is itself not the issue of importance. However, our findings that AI can trivially predict self-reported race — even from corrupted, cropped, and noised medical images — in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to.
We now recognize that neural networks can interpret information, such as that found in X-rays and computed tomography scans, in ways that we as humans have not seen before. Understanding how a neural network does this offers us the opportunity to identify, mitigate, or entirely prevent the potential biases and the discrimination they can cause.