
Scaling up machine learning : parallel and distributed approaches / edited by Ron Bekkerman, Mikhail Bilenko, John Langford.

EBSCOhost Academic eBook Collection (North America) Available online

Format:
Book
Contributor:
Bekkerman, Ron, editor.
Bilenko, Mikhail, 1978- editor.
Langford, John, 1975- editor.
Language:
English
Subjects (All):
Machine learning.
Data mining.
Parallel algorithms.
Parallel programs (Computer programs).
Physical Description:
1 online resource (xvi, 475 pages) : digital, PDF file(s).
Publication:
Cambridge : Cambridge University Press, 2012.
Language Note:
English
Summary:
This book presents an integrated collection of representative approaches for scaling up machine learning and data mining methods on parallel and distributed computing platforms. Demand for parallelizing learning algorithms is highly task-specific: in some settings it is driven by the enormous dataset sizes, in others by model complexity or by real-time performance requirements. Making task-appropriate algorithm and platform choices for large-scale machine learning requires understanding the benefits, trade-offs and constraints of the available options. Solutions presented in the book cover a range of parallelization platforms from FPGAs and GPUs to multi-core systems and commodity clusters, concurrent programming frameworks including CUDA, MPI, MapReduce and DryadLINQ, and learning settings (supervised, unsupervised, semi-supervised and online learning). Extensive coverage of parallelization of boosted trees, SVMs, spectral clustering, belief propagation and other popular learning algorithms and deep dives into several applications make the book equally useful for researchers, students and practitioners.
Contents:
Cover; Scaling Up Machine Learning; Title; Copyright; Contents; Contributors; Preface; CHAPTER 1 Scaling Up Machine Learning: Introduction; 1.1 Machine Learning Basics; 1.2 Reasons for Scaling Up Machine Learning; 1.2.1 Large Number of Data Instances; 1.2.2 High Input Dimensionality; 1.2.3 Model and Algorithm Complexity; 1.2.4 Inference Time Constraints; 1.2.5 Prediction Cascades; 1.2.6 Model Selection and Parameter Sweeps; 1.3 Key Concepts in Parallel and Distributed Computing; 1.3.1 Data Parallelism; 1.3.2 Task Parallelism; 1.4 Platform Choices and Trade-Offs; 1.5 Thinking about Performance; 1.6 Organization of the Book; 1.6.1 Part I: Frameworks for Scaling Up Machine Learning; 1.6.2 Part II: Supervised and Unsupervised Learning Algorithms; 1.6.3 Part III: Alternative Learning Settings; 1.6.4 Part IV: Applications; 1.7 Bibliographic Notes; References; Acknowledgments;
PART ONE: Frameworks for Scaling Up Machine Learning;
CHAPTER 2 MapReduce and Its Application to Massively Parallel Learning; 2.1 Preliminaries; 2.1.1 MapReduce; MapReduce Example: Word Histogram; MapReduce Example: k-means Clustering; 2.1.2 Tree Models; 2.1.3 Learning Tree Models; Scalability Challenge; 2.1.4 Regression Trees; 2.2 Example of PLANET; 2.2.1 Components; 2.2.2 Walkthrough; 2.3 Technical Details; 2.3.1 MR_ExpandNodes: Expanding a Single Node; 2.3.2 MR_INMEMORY: In-Memory Tree Induction; 2.3.3 Controller Design; 2.4 Learning Ensembles; 2.5 Engineering Issues; 2.5.1 Forward Scheduling; 2.5.2 Fingerprinting; 2.5.3 Reliability; 2.6 Experiments; 2.6.1 Setup; 2.6.2 Results; 2.7 Related Work; 2.8 Conclusions; Acknowledgments; References;
CHAPTER 3 Large-Scale Machine Learning Using DryadLINQ; 3.1 Manipulating Datasets with LINQ; 3.2 k-Means in LINQ; 3.3 Running LINQ on a Cluster with DryadLINQ; 3.3.1 Dryad; 3.3.2 DryadLINQ; 3.3.3 MapReduce and DryadLINQ; 3.3.4 k-means Clustering in DryadLINQ; Measurements; 3.3.5 Decision Tree Induction in DryadLINQ; Measurements; 3.3.6 Example: Singular Value Decomposition; Measurements; 3.4 Lessons Learned; 3.4.1 Strengths; 3.4.2 Weaknesses; 3.4.3 A Real Application; 3.4.4 Availability; References;
CHAPTER 4 IBM Parallel Machine Learning Toolbox; 4.1 Data-Parallel Associative-Commutative Computation; 4.2 API and Control Layer; 4.3 API Extensions for Distributed-State Algorithms; 4.4 Control Layer Implementation and Optimizations; 4.5 Parallel Kernel k-Means; 4.6 Parallel Decision Tree; 4.7 Parallel Frequent Pattern Mining; 4.8 Summary; References;
CHAPTER 5 Uniformly Fine-Grained Data-Parallel Computing; 5.1 Overview of a GP-GPU; 5.2 Uniformly Fine-Grained Data-Parallel Computing on a GPU; 5.2.1 Data-Parallel Computing; 5.2.2 Uniformly Fine-Grained Data-Parallel Design; 5.3 The k-Means Clustering Algorithm; 5.3.1 Uniformly Fine-Grained Data Parallelism in k-means; 5.4 The k-Means Regression Clustering Algorithm; 5.4.1 Fine-Grained Data-Parallel Structures in k-means RC on a GPU
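
Illustration: Chapter 2's contents list a word-histogram MapReduce example. As a rough single-process sketch of that map/shuffle/reduce pattern only (not code from the book; all function names here are hypothetical):

from collections import defaultdict

def map_phase(document):
    # Map step: emit a (word, 1) pair for every word in the document.
    for word in document.split():
        yield (word.lower(), 1)

def reduce_phase(word, counts):
    # Reduce step: sum all partial counts for a single word.
    return (word, sum(counts))

def word_histogram(documents):
    # Shuffle step: group intermediate (word, count) pairs by key.
    grouped = defaultdict(list)
    for doc in documents:
        for word, count in map_phase(doc):
            grouped[word].append(count)
    # One reducer invocation per distinct word.
    return dict(reduce_phase(w, c) for w, c in grouped.items())

if __name__ == "__main__":
    docs = ["the quick brown fox", "the quick red fox"]
    print(word_histogram(docs))  # {'the': 2, 'quick': 2, 'brown': 1, 'fox': 2, 'red': 1}

In a real MapReduce deployment the map and reduce calls run on different machines and the shuffle is performed by the framework; this sketch only simulates the data flow in memory.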
Notes:
Title from publisher's bibliographic system (viewed on 05 Oct 2015).
Includes bibliographical references and index.
ISBN:
1-108-46174-3
1-107-22310-5
1-280-48475-6
1-139-22175-2
9786613579737
1-139-21693-7
1-139-21386-5
1-139-22346-1
1-139-04291-2
1-139-22002-0
OCLC:
775869713
