Parallel R / Q. Ethan McCallum and Stephen Weston ; editors, Mike Loukides and Meghan Blanchette ; illustrator, Robert Romano.
- Format:
- Book
- Author/Creator:
- McCallum, Q. Ethan.
- Language:
- English
- Subjects (All):
- R (Computer program language).
- Mathematical statistics--Data processing.
- Mathematical statistics.
- Open source software.
- Physical Description:
- 1 online resource (122 p.)
- Edition:
- First edition.
- Place of Publication:
- Sebastopol, CA : O'Reilly, 2011.
- Language Note:
- English
- System Details:
- text file
- Summary:
- It's tough to argue with R as a high-quality, cross-platform, open source statistical software product, unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't. With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memo
- Contents:
- Table of Contents; Preface; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments; Q. Ethan McCallum; Stephen Weston; Chapter 1. Getting Started; Why R?; Why Not R?; The Solution: Parallel Execution; A Road Map for This Book; What We'll Cover; Looking Forward...; What We'll Assume You Already Know; In a Hurry?; snow; multicore; parallel; R+Hadoop; RHIPE; Segue; Summary; Chapter 2. snow; Quick Look; How It Works; Setting Up; Working with It; Creating Clusters with makeCluster; Parallel K-Means; Initializing Workers
- Load Balancing with clusterApplyLB; Task Chunking with parLapply; Vectorizing with clusterSplit; Load Balancing Redux; Functions and Environments; Random Number Generation; snow Configuration; Installing Rmpi; Executing snow Programs on a Cluster with Rmpi; Executing snow Programs with a Batch Queueing System; Troubleshooting snow Programs; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 3. multicore; Quick Look; How It Works; Setting Up; Working with It; The mclapply Function; The mc.cores Option; The mc.set.seed Option; Load Balancing with mclapply; The pvec Function
- The parallel and collect Functions; Using collect Options; Parallel Random Number Generation; The Low-Level API; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 4. parallel; Quick Look; How It Works; Setting Up; Working with It; Getting Started; Creating Clusters with makeCluster; Parallel Random Number Generation; Summary of Differences; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 5. A Primer on MapReduce and Hadoop; Hadoop at Cruising Altitude; A MapReduce Primer; Thinking in MapReduce: Some Pseudocode Examples; Calculate Average Call Length for Each Date
- Number of Calls by Each User, on Each Date; Run a Special Algorithm on Each Record; Binary and Whole-File Data: SequenceFiles; No Cluster? No Problem! Look to the Clouds...; The Wrap-up; Chapter 6. R+Hadoop; Quick Look; How It Works; Setting Up; Working with It; Simple Hadoop Streaming (All Text); Streaming, Redux: Indirectly Working with Binary Data; The Java API: Binary Input and Output; Processing Related Groups (the Full Map and Reduce Phases); When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 7. RHIPE; Quick Look; How It Works; Setting Up; Working with It; Phone Call Records, Redux
- Tweet Brevity; More Complex Tweet Analysis; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 8. Segue; Quick Look; How It Works; Setting Up; Working with It; Model Testing: Parameter Sweep; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 9. New and Upcoming; doRedis; RevoScale R and RevoConnectR (RHadoop); cloudNumbers.com
- Notes:
- Description based upon print version of record.
- Includes bibliographical references.
- Description based on online resource; title from PDF title page (ebrary, viewed September 24, 2013).
- ISBN:
- 9781306813648
- 1306813646
- 9781449320331
- 1449320333
- 9781449320348
- 1449320341
- OCLC:
- 767502408