Text classification of gender-biased language in archival documentation / Lucy Havens.
- Format:
- Book
- Author/Creator:
- Havens, Lucy, author.
- Series:
- SAGE Research methods: diversifying and decolonizing research.
- Language:
- English
- Subjects (All):
- Computational linguistics.
- Data sets.
- Physical Description:
- 1 online resource
- Place of Publication:
- London : SAGE Publications Ltd, 2024.
- Summary:
- This dataset is designed for teaching a supervised learning approach to creating natural language processing models for classifying gender biases in a text corpus. The dataset contains catalogue metadata descriptions from the University of Edinburgh's Heritage Collections' Archives catalogue, which were manually annotated for gendered and gender-biased language by Lucy Havens, Suzanne Black, Ashlyn Cudney, Anna Kuslits, and Iona Walker. The sample dataset contains examples of text that were manually annotated with all available labels from the Taxonomy of Gendered and Gender Biased Language, providing a dataset that was then used to train several gender-biased text classification models. Here, we use a subset of that dataset to focus on two labels only, "Omission" and "Stereotype," and on one type of classification task, document classification. The dataset file is accompanied by a Teaching Guide and a Student Guide, which explain how to create a text classification model with the data and evaluate the model's performance both quantitatively and qualitatively.
- Notes:
- Description based on XML content.
- ISBN:
- 1-5296-9260-1
- 9781529692600
- OCLC:
- 1428169646