…and as soon as they could be heard behind the curtain, the conversations ended, the sheet music rustled briefly once more, and after some clearing of throats behind held-out hands, it became quiet. The sound of a cello broke the silence and the audience listened intently. There was nothing to see, because the curtain remained closed and only the music could be heard. This went on for a few minutes, with those present seeming absent and far away, for some had closed their eyes and leaned far back. An observer might have thought they had fallen asleep had their hands not been following the melody. When the playing ended, the usual applause did not follow. A rustling of clothes and a bow clattering against the cello could be heard, and then the footsteps faded into the distance behind the curtain. This would repeat itself throughout the day like a ritual: footsteps, throat clearing, nose-blowing, rustling, music, then clattering and fading footsteps.
We were witnessing a selection process for new orchestra members, and the audition behind the curtain was meant to ensure a fair process. The major symphony orchestras had begun to reorganize their search for musicians fifty years ago. Until then, the orchestras, most of which were exclusively male, had been handpicked by their equally male conductors or music directors. But times had changed. Not only had the orchestra members demanded and won a voice, they also wanted to bring female musicians into their ranks. The previous selection method, however, in which candidates auditioned on an open stage, sifted women out almost completely. The male bias against female musicians could not simply be switched off by decree. That is why auditioning behind a curtain became standard repertoire for most orchestras, and the number of female candidates selected rose by a quarter. What sounded like a lot at first was not, because too many women still fell through the sieve.
Were the women still not good enough? No, the all-male juries had unconsciously picked up on other clues to the candidates' gender. The clatter of a woman's high heels or an overly soft tread in flat shoes unmistakably gave the gender away. Since then, candidates have taken off their shoes backstage so that, barefoot, they remain unrecognizable.
If we humans already perceive, consciously or unconsciously, criteria that reinforce our prejudices, how many more such criteria will an artificial intelligence have access to, inspired by the human brain and fed with human-generated data, and what patterns will it find in them that implicitly reflect those prejudices?
Online retailer Amazon struggled for nearly two years with its AI, which was tasked with working through a rising tide of thousands of applications per week and filtering out the right candidates. The system soon aroused suspicion, however, because it nominated almost no female candidates. An investigation showed that the application records of the hundreds of thousands of previous hires had favored men, and the AI had simply learned that criteria such as women's college volleyball or female first names matched no pattern of success in the training data, so it discarded those candidates. Amazon began removing these criteria, but the AI blithely found others. The fight proved futile and Amazon gave up. Today, humans are looking at job applications again, and the AI has been fired.
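What the AI did can be illustrated in a few lines of code. The following is a minimal, purely illustrative sketch, not Amazon's actual system; the data and feature names (such as womens_volleyball) are invented. A toy classifier is trained on historically biased hiring decisions and learns to penalize a proxy feature even though gender itself is never shown to it.

```python
# Illustrative sketch only (not Amazon's system): a toy model trained on
# historically biased hiring decisions learns to punish a proxy feature
# ("women's volleyball") even when the gender column is withheld.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical features: a skill score, plus a proxy strongly tied to gender.
gender_female = rng.integers(0, 2, n)          # 1 = female (never given to the model)
skill = rng.normal(0, 1, n)                    # equally distributed across genders
womens_volleyball = (gender_female & (rng.random(n) < 0.6)).astype(int)

# Historical labels: past recruiters hired on skill, but also against women.
hired = (skill + 1.5 * (1 - gender_female) + rng.normal(0, 1, n)) > 1.0

# Train WITHOUT the gender column; only skill and the proxy remain.
X = np.column_stack([skill, womens_volleyball])
model = LogisticRegression().fit(X, hired)

print("coefficients [skill, womens_volleyball]:", model.coef_[0])
# The volleyball coefficient comes out clearly negative: the historical bias
# survives even though gender was never an input.
```

Deleting the proxy column only shifts the problem, because the model then latches onto the next feature that happens to correlate with the biased historical decisions.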
The bias of AI will keep us even busier, as will many other opportunities, possibilities and risks. To deal with them, we need ways to test and evaluate AIs. There are many tests; most were developed specifically for humans, others for machines.
This is a small excerpt from my book Creative Intelligence: How ChatGPT and Co will change the world, which will be published in the fall. It can already be pre-ordered here.
