CeADAR and UCD use machine learning to quantify gender bias in the Irish Constitution
Materials for this article were originally provided by Robbie Brennan, StoryLab
Researchers from CeADAR, Ireland's centre for artificial intelligence, and UCD College of Business have called for more inclusive language in the Constitution after using AI and machine learning to reveal the extent to which gender bias is endemic in the text.
“The close association between certain words and gender is important to highlight because ultimately that is how humans learn,” said UCD College of Business Associate Professor Paula Carroll. “If the word ‘president’ appears alongside the word ‘he’, an association is automatically formed.
“We might hear in the discourse that the Constitution is sexist and biased, but this research quantifies it for the first time.”
In their research, ‘Uncovering gender dimensions in energy policy using Natural Language Processing’, Associate Professor Paula Carroll, CeADAR researcher Bhawna Singh, and Prof Eleni Mangina of UCD School of Computer Science developed a methodology to quantify the gender association of words in energy policy documents. They applied the same method to the Constitution and found that male-gendered terms appeared more frequently than female-gendered terms. They also found that key authoritative words were much more closely associated with men than family-orientated words were.
The study comes after the two proposed constitutional amendments on family and care were overwhelmingly rejected in the referendum on the 8th of March 2024.
As part of the process, researchers conducted a simple frequency count of gendered terms in the Constitution and found that ‘he’, ‘him’, ‘his’, ‘man’, and ‘father’ appear collectively 109 times, with ‘his’ alone appearing 63 times.
The word ‘her’ appears twice, ‘woman’ three times, and ‘mother’ twice. The word ‘she’ does not appear in the text.
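A frequency count of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the tokenisation rule, the function name `count_gendered_terms`, the inclusion of ‘she’ in the female list, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

# Term lists taken from the article; 'she' is included so that a zero count is reported.
MALE_TERMS = {"he", "him", "his", "man", "father"}
FEMALE_TERMS = {"she", "her", "woman", "mother"}

def count_gendered_terms(text):
    """Return (male_counts, female_counts) as dicts of term -> occurrences."""
    # Assumed tokenisation: lowercase the text and keep runs of letters only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    male = {term: counts[term] for term in sorted(MALE_TERMS)}
    female = {term: counts[term] for term in sorted(FEMALE_TERMS)}
    return male, female

# Invented sample sentence, not a quotation from the Constitution.
sample = "He shall appoint his successor, and her mother shall be heard."
male, female = count_gendered_terms(sample)
print("male terms:", male, "total:", sum(male.values()))
print("female terms:", female, "total:", sum(female.values()))
```

Run against the full text of a document, the two totals give the kind of side-by-side comparison reported above.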
They then counted the document’s 1,283 unique words and organised those most often repeated into two separate lists.
The ‘authoritative’ list contained words like ‘successor’, ‘genius’, ‘chief’, ‘command’, ‘head’, ‘authority’ and ‘ownership’, while a ‘family’ list contained words like ‘birth’, ‘diversity’, ‘guardianship’, ‘parent’, ‘child’, and ‘spouse’.
Using machine learning and natural language processing (NLP) tools, the project leaders ran a Word Embedding Association Test (WEAT) against the target list of gendered terms. The AI tools then quantified the high level of male gender bias in the document.
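For readers unfamiliar with the test, a WEAT effect size can be sketched as follows. This is a rough illustration under stated assumptions, not the study's implementation: the two-dimensional vectors are invented toy values (the researchers say their models were trained on Google News), the assignment of word lists to the test's target and attribute sets is assumed, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word, attr_a, attr_b, emb):
    """Mean similarity of `word` to set A minus its mean similarity to set B."""
    sim_a = mean(cosine(emb[word], emb[a]) for a in attr_a)
    sim_b = mean(cosine(emb[word], emb[b]) for b in attr_b)
    return sim_a - sim_b

def weat_effect_size(targets_x, targets_y, attr_a, attr_b, emb):
    """Difference in mean association of X and Y, scaled by the spread over X and Y."""
    s_x = [association(x, attr_a, attr_b, emb) for x in targets_x]
    s_y = [association(y, attr_a, attr_b, emb) for y in targets_y]
    # Sample standard deviation is used here; this is an implementation choice.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)

# Invented toy embeddings for illustration only.
emb = {
    "he": [1.0, 0.1], "his": [0.9, 0.2],
    "she": [0.1, 1.0], "her": [0.2, 0.9],
    "chief": [0.8, 0.3], "command": [0.9, 0.1],
    "parent": [0.3, 0.8], "child": [0.2, 0.7],
}
authoritative, family = ["chief", "command"], ["parent", "child"]
male, female = ["he", "his"], ["she", "her"]

effect = weat_effect_size(authoritative, family, male, female, emb)
print(f"effect size: {effect:.2f}")
```

A positive value means the first target list (here, the authoritative words) sits closer to the male terms than the second list does; swapping the male and female sets flips the sign.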
“The fact that the words in the authoritative list were much more closely associated with the male gender highlights how much of the language in the Irish Constitution is outdated and the need for language in future policy documents to be inclusive,” said Singh.
“When we have the opportunity to choose words, we should choose the gender-neutral option when we can.”
Associate Professor Carroll added that the research further demonstrates the need to increase the visibility of women across society, to ensure a more cohesive society not only now but in the future as well.
“Our models are trained on Google News. Those articles are historical data so gender bias is already encoded in it,” she said.
“Machine learning has no option but to take that as an input. If we increase representation of women in policy, media and everywhere else, it’s more likely that we can train the machines on more balanced datasets in the future.”