Harmonizing language data : Standards for linguistic resources / Piotr Bański.

De Gruyter DG Plus DeG Package 2025 Part 1 Available online

Walter De Gruyter: Open Access eBooks Available online

Format:
Book
Author/Creator:
Bański, Piotr.
Contributor:
Bański, Piotr, editor.
Heid, Ulrich, editor.
Herzberg, Laura, editor.
Volkswagenstiftung, funder.
Series:
Digital Linguistics, 4.
Language:
English
Physical Description:
1 online resource
Place of Publication:
LaVergne : De Gruyter, 2025.
Berlin ; Boston : De Gruyter, [2025]
Language Note:
In English.
Biography/History:
Ulrich Heid, Univ. of Hildesheim; Piotr Bański and Laura Herzberg, Leibniz Institute for the German Language, Mannheim, Germany.
Summary:
Standards function as safeguards to ensure that data remains interpretable, uniformly queryable, and archivable over time – a critical challenge for digital humanists working with complex linguistic resources. This book provides an overview of essential standards for ensuring the sustainability of data in the Digital Humanities (DH). It addresses the selection of data encoding formats, methods of annotating primary data, and approaches to making resources findable and accessible. The focus is on various forms of linguistic data, such as texts, lexicons, or parallel arrangements (e.g., translations or transcribed recordings). The work explains the role of annotations and metadata in structuring and contextualizing data and examines the influence of diverse data formats, shaped by local academic or industrial practices. In contrast to neural language models, which often yield impressive but opaque results, DH projects aim for transparency, reproducibility, and sustainability. Achieving these goals requires interoperability – the seamless interaction between data and tools. The book demonstrates how clear guidelines and best practices help ensure the long-term usability of data. It offers digital humanists practical approaches and well-founded standards to sustainably archive and efficiently utilize their data, making it an indispensable resource for the field.
Contents:
Frontmatter
Contents
Towards an optimum degree of order in the field of language resources / Piotr Bański, Ulrich Heid, Laura Herzberg
Character encoding and its importance for text resources / Christian Wartena
International standards for the identification and the description of languages and their varieties / Laurent Romary
Part-of-speech tagging and related annotation / Nikola Ljubešić, Tomaž Erjavec
Named entity recognition and entity linking / Pia Schwarz
Annotated audiovisual language data: data quality and data maturity / Vera Ferreira, Hanna Hedeland, Kelsey Neely
From spoken language data to TEI-based ISO standard / Antonina Werthmann
Dealing with multiple annotations / Piotr Bański, Nils Diewald
Standards and practices for long-term digital archiving / Ines Pisetta, Thorsten Trippel
Conversion into the archival format I5 / Harald Lüngen, Ines Pisetta
Metadata for research data / Thorsten Trippel
Linguistic linked (open) data / Anas Fahad Khan
Data exploitation: corpus queries / Stephanie Evert, Timm Weber, Steffen Bothe, Philipp Heinrich, Alexander Piperski
Querying spoken language data / Elena Frick, Thomas Schmidt
Accessing linguistic content in distributed research environments / Erik Körner, Thomas Eckart
Taxonomy of legal and ethical metadata for language resources / Paweł Kamocki
The life of an ISO standard / Annette Preissner, Ulrich Heid
Notes:
Title from eBook information screen.
This eBook is made available Open Access under a Creative Commons Attribution (CC BY) license: https://creativecommons.org/licenses/by/4.0
Description based on online resource; title from PDF title page (publisher's Web site, viewed January 15, 2026).
ISBN:
3-11-220953-2
3-11-220821-8
9783112208212
