Collaborative annotation for reliable natural language processing : technical and sociological aspects / Karën Fort.

Available online:
Ebook Central Academic Complete
Ebook Central College Complete
O'Reilly Online Learning: Academic/Public Library Edition
Format:
Book
Author/Creator:
Fort, Karën, author.
Series:
Cognitive Science Series, 2051-249X
THEi Wiley ebooks
Language:
English
Subjects (All):
Natural language processing (Computer science).
Physical Description:
1 online resource (196 p.)
Edition:
1st edition
Publication:
London, England ; Hoboken, New Jersey : ISTE : Wiley, 2016.
System Details:
Access using campus network via VPN at home (THEi Users Only).
text file
Summary:
This book presents a unique opportunity to construct a consistent picture of collaborative manual annotation for Natural Language Processing (NLP). NLP has seen two major evolutions over the past 25 years: first, the extraordinary success of machine learning, which is now, for better or for worse, overwhelmingly dominant in the field; and second, the multiplication of evaluation campaigns and shared tasks. Both rely on manually annotated corpora, for training the systems and for evaluating them. These corpora have progressively become the hidden pillars of the domain, providing food for hungry machine learning algorithms and a reference for evaluation. Annotation is now the place where linguistics hides in NLP. Yet manual annotation was largely ignored for a long time, and it took a while even for annotation guidelines to be recognized as essential. Although some recent efforts have begun to address the issues raised by manual annotation, little research has been done on the subject. Manual corpus annotation is now at the heart of NLP, yet it remains largely unexplored, and there is a need for manual annotation engineering (in the sense of a precisely formalized process). This book aims to provide a first step toward a holistic methodology, with a global view of annotation.
Contents:
Cover; Title Page; Copyright; Contents; Preface; List of Acronyms
Introduction
  I.1. Natural Language Processing and manual annotation: Dr Jekyll and Mr Hyde?
    I.1.1. Where linguistics hides
    I.1.2. What is annotation?
    I.1.3. New forms, old issues
  I.2. Rediscovering annotation
    I.2.1. A rise in diversity and complexity
    I.2.2. Redefining manual annotation costs
1: Annotating Collaboratively
  1.1. The annotation process (re)visited
    1.1.1. Building consensus
    1.1.2. Existing methodologies
    1.1.3. Preparatory work
      1.1.3.1. Identifying the actors
      1.1.3.2. Taking the corpus into account
      1.1.3.3. Creating and modifying the annotation guide
    1.1.4. Pre-campaign
      1.1.4.1. Building the mini-reference
      1.1.4.2. Training the annotators
    1.1.5. Annotation
      1.1.5.1. Breaking-in
      1.1.5.2. Annotating
      1.1.5.3. Updating
    1.1.6. Finalization
      1.1.6.1. Failure
      1.1.6.2. Adjudication
      1.1.6.3. Reviewing
      1.1.6.4. Publication
  1.2. Annotation complexity
    1.2.1. Example overview
      1.2.1.1. Example 1: POS
      1.2.1.2. Example 2: gene renaming
    1.2.2. What to annotate?
    1.2.3. How to annotate?
    1.2.4. The weight of the context
    1.2.5. Visualization
    1.2.6. Elementary annotation tasks
  1.3. Annotation tools
    1.3.1. To be or not to be an annotation tool
    1.3.2. Much more than prototypes
    1.3.3. Addressing the new annotation challenges
    1.3.4. The impossible dream tool
  1.4. Evaluating the annotation quality
    1.4.1. What is annotation quality?
    1.4.2. Understanding the basics
    1.4.3. Beyond kappas
    1.4.4. Giving meaning to the metrics
  1.5. Conclusion
2: Crowdsourcing Annotation
  2.1. What is crowdsourcing and why should we be interested in it?
    2.1.1. A moving target
    2.1.2. A massive success
  2.2. Deconstructing the myths
    2.2.1. Crowdsourcing is a recent phenomenon
    2.2.2. Crowdsourcing involves a crowd (of non-experts)
    2.2.3. "Crowdsourcing involves (a crowd of) non-experts"
  2.3. Playing with a purpose
    2.3.1. Using the players' innate capabilities and world knowledge
    2.3.2. Using the players' school knowledge
    2.3.3. Using the players' learning capacities
  2.4. Acknowledging crowdsourcing specifics
    2.4.1. Motivating the participants
    2.4.2. Producing quality data
  2.5. Ethical issues
    2.5.1. Game ethics
    2.5.2. What's wrong with Amazon Mechanical Turk?
    2.5.3. A charter to rule them all
Conclusion
Appendix: (Some) Annotation Tools
  A.1. Generic tools
    A.1.1. Cadixe; A.1.2. Callisto; A.1.3. Amazon Mechanical Turk; A.1.4. Knowtator; A.1.5. MMAX2; A.1.6. UAM CorpusTool; A.1.7. Glozz; A.1.8. CCASH; A.1.9. brat
  A.2. Task-oriented tools
    A.2.1. LDC tools; A.2.2. EasyRef; A.2.3. Phrase Detectives; A.2.4. ZombiLingo
  A.3. NLP annotation platforms
    A.3.1. GATE; A.3.2. EULIA; A.3.3. UIMA; A.3.4. SYNC3
  A.4. Annotation management tools
    A.4.1. Slate; A.4.2. Djangology; A.4.3. GATE Teamware; A.4.4. WebAnno
  A.5. (Many) Other tools
Glossary; Bibliography; Index; Other titles from ISTE in Cognitive Science and Knowledge Management; EULA
Notes:
Description based on print version of record.
Includes bibliographical references and index.
ISBN:
9781119307648
1119307643
9781119306696
1119306698
OCLC:
951809856
