Detecting gender bias in children's books
Disciplines
Computer Sciences (40%); Psychology (10%); Sociology (10%); Linguistics and Literature (40%)
Keywords
Gender Bias, Natural Language Processing, Children's Literature, Content Analysis
When asked to draw a mathematician, girls are twice as likely to draw a man as a woman, while boys almost universally draw a man. A similar tendency to associate professions such as firefighters, surgeons and fighter pilots with the masculine gender has been observed in children as young as 5 years old. Gender stereotypes form early in the child's development and are carried over throughout adolescence into adulthood, leaving long-lasting effects on emotional and cognitive development, while shaping activity and career choices as well as impacting academic performance. In this work, we propose a solution for addressing gender under- and misrepresentation in textual literature for pre- and primary-school children. In children's books, a crucial element in the child development process, male characters outnumber female characters, non-binary characters are virtually absent, and gender roles and stereotypes are reinforced. The goal of the project is twofold. Firstly, we want to identify and measure different aspects of gender under- and misrepresentation. For example, the proportion of male vs. female characters, gender-assuming pronouns and language that reinforces stereotypical gender roles could all be relevant in this context. Once reliable measurements are obtained, they will be combined into a gender representation score. The score should be easily interpretable to increase public awareness and serve as an aid to parents, educators and decision-makers. Secondly, after computing this score we want to develop best-practice guidelines for its validation in order to ensure the transparency and accuracy of the methodology. In this step we will rely primarily on the opinions of gender experts and linguists. The innovative character of this project lies in the integration of the following quantitative and qualitative research techniques.
On the one hand, the measurement procedure will build on modern artificial intelligence (AI) algorithms for text analysis. Recent advances in this field allow algorithms to be aware of the context in which words appear, rather than analyzing words in isolation. Context-awareness makes such algorithms promising tools for measuring more complex components of gender bias in textual data. However, as AI techniques are known to present drawbacks in terms of transparency and interpretability, we do not plan to rely solely on them in our analysis. In particular, we will complement them with state-of-the-art qualitative methods for literature review, data collection and validation.
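One of the simpler interpretable measurements mentioned above, the proportion of male vs. female references, can be sketched by counting gendered pronouns. This is only an illustrative toy sketch, not the project's actual pipeline; the `pronoun_counts` helper, its lexicons and the sample sentence are all assumptions made for this example:

```python
import re
from collections import Counter

# Illustrative pronoun lexicons; a real measure would also resolve
# character names and coreference, not just surface pronouns.
MALE_PRONOUNS = {"he", "him", "his", "himself"}
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}

def pronoun_counts(text: str) -> Counter:
    """Count male vs. female pronoun occurrences in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        if tok in MALE_PRONOUNS:
            counts["male"] += 1
        elif tok in FEMALE_PRONOUNS:
            counts["female"] += 1
    return counts

story = "He drew his sword while she watched; he smiled at her."
print(pronoun_counts(story))  # Counter({'male': 3, 'female': 2})
```

Such raw counts are easy to interpret but context-blind, which is precisely the limitation that the context-aware algorithms discussed above are meant to address.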
Gender stereotypes form early in the child's development and are carried over throughout adolescence into adulthood, leaving long-lasting effects which may impact activity and career choices, as well as academic performance. Books, in particular, can have considerable influence, as their characters serve to shape role models of femininity and masculinity for young children. Thus, gender under- and misrepresentation in children's textual literature can contribute to the internalization and reinforcement of negative stereotypes. In this project we aimed to leverage natural language processing tools to automatically measure different aspects of gender bias in children's literature. To this end, we first reviewed the existing literature in psychology and the social sciences and identified relevant dimensions of gender bias in children's books. The representation of gender among the characters, their centrality to the story, stereotypical portrayals related to occupations, appearance, brilliance bias, emotions, toys and interests, physical attributes and strength, as well as the agency vs. passivity of the characters should all be taken into account when aiming to measure the gender bias of a text. Moreover, the presence of stereotypical language should also be detected. As part of the project, we employed natural language processing tools to automatically build interpretable gender-bias measures for an extensive collection of the identified dimensions. We furthermore proposed a "data-driven" method to measure bias by utilizing word embeddings, which are representations learned from a large collection of text. This allows us to compute an overall (albeit less interpretable) bias response for a whole story. Finally, to improve interpretability, we propose to use the collection of interpretable measures to explain this rather black-box bias measure, deriving a scoring function that makes it possible to understand which dimensions contribute most to the bias.
We illustrate the approach on a collection of 30 classical fairy tales.
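The word-embedding idea behind the "data-driven" measure can be sketched as follows. The tiny hand-crafted vectors below stand in for embeddings learned from a large corpus (e.g. with word2vec); the numbers, the vocabulary and the `gender_association` helper are all hypothetical assumptions for this illustration, not the project's actual model:

```python
import math

# Toy "embeddings": in practice these vectors would be learned from
# a large text corpus, not written by hand.
EMB = {
    "he":     [1.0, 0.1, 0.0],
    "she":    [0.1, 1.0, 0.0],
    "doctor": [0.8, 0.3, 0.2],
    "nurse":  [0.2, 0.9, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def gender_association(word: str) -> float:
    """Positive: closer to 'he'; negative: closer to 'she'."""
    return cosine(EMB[word], EMB["he"]) - cosine(EMB[word], EMB["she"])

print(gender_association("doctor"))  # > 0: skewed toward 'he'
print(gender_association("nurse"))   # < 0: skewed toward 'she'
```

Averaging such association scores over the words of a story gives a single, corpus-driven bias response for the whole text, which is exactly the quantity that the interpretable dimension-level measures are then used to explain.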
- Technische Universität Wien - 100%
Research Output
- 1 Publication
- 1 Datasets & models
2024
Title: Detecting Gender Bias in Fairy Tales; Type: Other; Author: Camilla Damian; Link: Publication