Human-in-the-loop machine learning : active learning and annotation for human-centered AI / Robert Monarch ; foreword by Christopher D. Manning.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Monarch, Robert, author.
Contributor:
Manning, Christopher D., writer of foreword.
Language:
English
Subjects (All):
Machine learning.
Human-computer interaction.
Physical Description:
1 online resource (390 pages)
Place of Publication:
Shelter Island, New York : Manning Publications, [2021]
Summary:
Human-in-the-Loop Machine Learning lays out methods for humans and machines to work together effectively.

Summary: Most machine learning systems deployed in the world today learn from human feedback. However, most machine learning courses focus almost exclusively on the algorithms, not on the human-computer interaction part of the systems. This can leave a big knowledge gap for data scientists working in real-world machine learning, where they spend more time on data management than on building algorithms. Human-in-the-Loop Machine Learning is a practical guide to optimizing the entire machine learning process, including techniques for annotation, active learning, transfer learning, and using machine learning to optimize every step of the process. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology: Machine learning applications perform better with human feedback. Keeping the right people in the loop improves the accuracy of models, reduces errors in data, lowers costs, and helps you ship models faster.

About the book: You'll find best practices on selecting sample data for human feedback, quality control for human annotations, and designing annotation interfaces. You'll learn to create training data for labeling, object detection, semantic segmentation, sequence labeling, and more. The book starts with the basics and progresses to advanced techniques like transfer learning and self-supervision within annotation workflows.

What's inside:
Identifying the right training and evaluation data
Finding and managing people to annotate data
Selecting annotation quality control strategies
Designing interfaces to improve accuracy and efficiency

About the author: Robert (Munro) Monarch is a data scientist and engineer who has built machine learning data for companies such as Apple, Amazon, Google, and IBM. He holds a PhD from Stanford focused on human-in-the-loop machine learning for healthcare and disaster response, and is a disaster response professional in addition to being a machine learning professional. A worked example throughout this text is classifying disaster-related messages from real disasters that Robert has helped respond to in the past.

Table of Contents:
PART 1 - FIRST STEPS
1 Introduction to human-in-the-loop machine learning
2 Getting started with human-in-the-loop machine learning
PART 2 - ACTIVE LEARNING
3 Uncertainty sampling
4 Diversity sampling
5 Advanced active learning
6 Applying active learning to different machine learning tasks
PART 3 - ANNOTATION
7 Working with the people annotating your data
8 Quality control for data annotation
9 Advanced data annotation and augmentation
10 Annotation quality for different machine learning tasks
PART 4 - HUMAN-COMPUTER INTERACTION FOR MACHINE LEARNING
11 Interfaces for data annotation
12 Human-in-the-loop machine learning products
Contents:
Intro
inside front cover
Human-in-the-Loop Machine Learning
Copyright
brief contents
contents
front matter
foreword
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
Other online resources
about the author
Part 1 First steps
1 Introduction to human-in-the-loop machine learning
1.1 The basic principles of human-in-the-loop machine learning
1.2 Introducing annotation
1.2.1 Simple and more complicated annotation strategies
1.2.2 Plugging the gap in data science knowledge
1.2.3 Quality human annotation: Why is it hard?
1.3 Introducing active learning: Improving the speed and reducing the cost of training data
1.3.1 Three broad active learning sampling strategies: Uncertainty, diversity, and random
1.3.2 What is a random selection of evaluation data?
1.3.3 When to use active learning
1.4 Machine learning and human-computer interaction
1.4.1 User interfaces: How do you create training data?
1.4.2 Priming: What can influence human perception?
1.4.3 The pros and cons of creating labels by evaluating machine learning predictions
1.4.4 Basic principles for designing annotation interfaces
1.5 Machine-learning-assisted humans vs. human-assisted machine learning
1.6 Transfer learning to kick-start your models
1.6.1 Transfer learning in computer vision
1.6.2 Transfer learning in NLP
1.7 What to expect in this text
Summary
2 Getting started with human-in-the-loop machine learning
2.1 Beyond hacktive learning: Your first active learning algorithm
2.2 The architecture of your first system
2.3 Interpreting model predictions and data to support active learning
2.3.1 Confidence ranking
2.3.2 Identifying outliers
2.3.3 What to expect as you iterate
2.4 Building an interface to get human labels
2.4.1 A simple interface for labeling text
2.4.2 Managing machine learning data
2.5 Deploying your first human-in-the-loop machine learning system
2.5.1 Always get your evaluation data first
2.5.2 Every data point gets a chance
2.5.3 Select the right strategies for your data
2.5.4 Retrain the model and iterate
Part 2 Active learning
3 Uncertainty sampling
3.1 Interpreting uncertainty in a machine learning model
3.1.1 Why look for uncertainty in your model?
3.1.2 Softmax and probability distributions
3.1.3 Interpreting the success of active learning
3.2 Algorithms for uncertainty sampling
3.2.1 Least confidence sampling
3.2.2 Margin of confidence sampling
3.2.3 Ratio sampling
3.2.4 Entropy (classification entropy)
3.2.5 A deep dive on entropy
3.3 Identifying when different types of models are confused
3.3.1 Uncertainty sampling with logistic regression and MaxEnt models
3.3.2 Uncertainty sampling with SVMs
3.3.3 Uncertainty sampling with Bayesian models
3.3.4 Uncertainty sampling with decision trees and random forests
3.4 Measuring uncertainty across multiple predictions
3.4.1 Uncertainty sampling with ensemble models
3.4.2 Query by Committee and dropouts
3.4.3 The difference between aleatoric and epistemic uncertainty
3.4.4 Multilabeled and continuous value classification
3.5 Selecting the right number of items for human review
3.5.1 Budget-constrained uncertainty sampling
3.5.2 Time-constrained uncertainty sampling
3.5.3 When do I stop if I'm not time- or budget-constrained?
3.6 Evaluating the success of active learning
3.6.1 Do I need new test data?
3.6.2 Do I need new validation data?
3.7 Uncertainty sampling cheat sheet
3.8 Further reading
3.8.1 Further reading for least confidence sampling
3.8.2 Further reading for margin of confidence sampling
3.8.3 Further reading for ratio of confidence sampling
3.8.4 Further reading for entropy-based sampling
3.8.5 Further reading for other machine learning models
3.8.6 Further reading for ensemble-based uncertainty sampling
4 Diversity sampling
4.1 Knowing what you don't know: Identifying gaps in your model's knowledge
4.1.1 Example data for diversity sampling
4.1.2 Interpreting neural models for diversity sampling
4.1.3 Getting information from hidden layers in PyTorch
4.2 Model-based outlier sampling
4.2.1 Use validation data to rank activations
4.2.2 Which layers should I use to calculate model-based outliers?
4.2.3 The limitations of model-based outliers
4.3 Cluster-based sampling
4.3.1 Cluster members, centroids, and outliers
4.3.2 Any clustering algorithm in the universe
4.3.3 K-means clustering with cosine similarity
4.3.4 Reduced feature dimensions via embeddings or PCA
4.3.5 Other clustering algorithms
4.4 Representative sampling
4.4.1 Representative sampling is rarely used in isolation
4.4.2 Simple representative sampling
4.4.3 Adaptive representative sampling
4.5 Sampling for real-world diversity
4.5.1 Common problems in training data diversity
4.5.2 Stratified sampling to ensure diversity of demographics
4.5.3 Represented and representative: Which matters?
4.5.4 Per-demographic accuracy
4.5.5 Limitations of sampling for real-world diversity
4.6 Diversity sampling with different types of models
4.6.1 Model-based outliers with different types of models
4.6.2 Clustering with different types of models
4.6.3 Representative sampling with different types of models
4.6.4 Sampling for real-world diversity with different types of models
4.7 Diversity sampling cheat sheet
4.8 Further reading
4.8.1 Further reading for model-based outliers
4.8.2 Further reading for cluster-based sampling
4.8.3 Further reading for representative sampling
4.8.4 Further reading for sampling for real-world diversity
5 Advanced active learning
5.1 Combining uncertainty sampling and diversity sampling
5.1.1 Least confidence sampling with cluster-based sampling
5.1.2 Uncertainty sampling with model-based outliers
5.1.3 Uncertainty sampling with model-based outliers and clustering
5.1.4 Representative sampling cluster-based sampling
5.1.5 Sampling from the highest-entropy cluster
5.1.6 Other combinations of active learning strategies
5.1.7 Combining active learning scores
5.1.8 Expected error reduction sampling
5.2 Active transfer learning for uncertainty sampling
5.2.1 Making your model predict its own errors
5.2.2 Implementing active transfer learning
5.2.3 Active transfer learning with more layers
5.2.4 The pros and cons of active transfer learning
5.3 Applying active transfer learning to representative sampling
5.3.1 Making your model predict what it doesn't know
5.3.2 Active transfer learning for adaptive representative sampling
5.3.3 The pros and cons of active transfer learning for representative sampling
5.4 Active transfer learning for adaptive sampling
5.4.1 Making uncertainty sampling adaptive by predicting uncertainty
5.4.2 The pros and cons of ATLAS
5.5 Advanced active learning cheat sheets
5.6 Further reading for active transfer learning
6 Applying active learning to different machine learning tasks
6.1 Applying active learning to object detection
6.1.1 Accuracy for object detection: Label confidence and localization
6.1.2 Uncertainty sampling for label confidence and localization in object detection
6.1.3 Diversity sampling for label confidence and localization in object detection
6.1.4 Active transfer learning for object detection
6.1.5 Setting a low object detection threshold to avoid perpetuating bias
6.1.6 Creating training data samples for representative sampling that are similar to your predictions
6.1.7 Sampling for image-level diversity in object detection
6.1.8 Considering tighter masks when using polygons
6.2 Applying active learning to semantic segmentation
6.2.1 Accuracy for semantic segmentation
6.2.2 Uncertainty sampling for semantic segmentation
6.2.3 Diversity sampling for semantic segmentation
6.2.4 Active transfer learning for semantic segmentation
6.2.5 Sampling for image-level diversity in semantic segmentation
6.3 Applying active learning to sequence labeling
6.3.1 Accuracy for sequence labeling
6.3.2 Uncertainty sampling for sequence labeling
6.3.3 Diversity sampling for sequence labeling
6.3.4 Active transfer learning for sequence labeling
6.3.5 Stratified sampling by confidence and tokens
6.3.6 Create training data samples for representative sampling that are similar to your predictions
6.3.7 Full-sequence labeling
6.3.8 Sampling for document-level diversity in sequence labeling
6.4 Applying active learning to language generation
6.4.1 Calculating accuracy for language generation systems
6.4.2 Uncertainty sampling for language generation
6.4.3 Diversity sampling for language generation
6.4.4 Active transfer learning for language generation
6.5 Applying active learning to other machine learning tasks
6.5.1 Active learning for information retrieval
6.5.2 Active learning for video
6.5.3 Active learning for speech
Notes:
Description based on print version record.
Includes index.
ISBN:
9781638351030
1638351031
OCLC:
1261363936
