Method Mitigating Gender Bias Of AI Models Claimed By Salesforce Researchers

Researchers at Salesforce and the University of Virginia have proposed a new way to mitigate gender bias in word embeddings, the word representations used to train AI models to translate languages, summarize text, and perform other prediction tasks.

According to the team, correcting for certain regularities, such as word frequency in large data sets, permits their method to “purify” the embeddings before the gender direction is inferred and removed.

Salesforce’s proposed method, Double-Hard Debias, transforms the embedding space into an ostensibly genderless one. It projects word embeddings into a “subspace” that can be used to find the dimension encoding word-frequency information, which would otherwise distract from the encoded gender signal.
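As a rough sketch of how such a frequency dimension might be located (this is an illustration in NumPy, not the authors’ released code; the function name and the choice of the first principal component are assumptions of this sketch):

```python
import numpy as np

def find_frequency_direction(embeddings: np.ndarray) -> np.ndarray:
    """Approximate a frequency-related direction as a top principal
    component of the mean-centered embedding matrix (n_words x dim)."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Assumption: take the first principal component; the paper instead
    # tests several top components and keeps the one whose removal most
    # reduces measurable bias.
    return vt[0]
```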

Salesforce’s method then “projects away” the gender component along this direction to obtain revised embeddings, before executing a second debiasing step. The researchers tested the approach against the WinoBias data set, which consists of pro-gender-stereotype and anti-gender-stereotype sentences.
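The “project away” operation itself is the standard hard-debias projection: subtract from each vector its component along a unit direction d, i.e. v′ = v − (v · d)d. A minimal NumPy sketch, where the direction variables in the usage comments are hypothetical names for the frequency and gender directions:

```python
import numpy as np

def project_away(vectors: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove each vector's component along a (normalized) direction:
    v' = v - (v . d) d."""
    d = direction / np.linalg.norm(direction)
    return vectors - np.outer(vectors @ d, d)

# Hypothetical usage mirroring the two-step structure described above:
# purified = project_away(embeddings, frequency_direction)  # drop the frequency component
# debiased = project_away(purified, gender_direction)       # then hard-debias along gender
```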

The Salesforce and University of Virginia team believes the technique measurably reduces the gender bias present in embeddings.

“We found that simple changes in word frequency statistics can have an undesirable impact on the debiasing methods used to remove gender bias from word embeddings,” wrote the coauthors of the Double-Hard Debias paper. “[Our method] mitigates the negative effects that word frequency features can have on debiasing algorithms. We believe it is important to deliver fair and useful word embeddings, and we hope that this work inspires further research in this direction.”
