Data algorithms / Mahmoud Parsian.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:: Book
Author/Creator:: Parsian, Mahmoud, author.
Language:: English
Subjects (All):: MapReduce (Computer file).; Apache Hadoop.; Electronic data processing.
Physical Description:: 1 online resource (778 p.)
Edition:: 1st edition
Other Title:: Recipes for scaling up with Hadoop and Spark
Place of Publication:: Beijing, China : O'Reilly, 2015.
System Details:: text file
Summary:: If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as building a recommendation system. You’ll learn how to implement the appropriate MapReduce solution with code that you can use in your projects. Dr. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. This book also includes an overview of MapReduce, Hadoop, and Spark.
Contents:: Copyright; Table of Contents; Foreword; Preface; What Is MapReduce?; Simple Explanation of MapReduce; When to Use MapReduce; What MapReduce Isn't; Why Use MapReduce?; Hadoop and Spark; What Is in This Book?; What Is the Focus of This Book?; Who Is This Book For?; Online Resources; What Software Is Used in This Book?; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments; Comments and Questions for This Book; Chapter 1. Secondary Sort: Introduction; Solutions to the Secondary Sort Problem; Implementation Details; Data Flow Using Plug-in Classes; MapReduce/Hadoop Solution to Secondary Sort; Input; Expected Output; map() Function; reduce() Function; Hadoop Implementation Classes; Sample Run of Hadoop Implementation; How to Sort in Ascending or Descending Order; Spark Solution to Secondary Sort; Time Series as Input; Expected Output; Option 1: Secondary Sorting in Memory; Spark Sample Run; Option #2: Secondary Sorting Using the Spark Framework; Further Reading on Secondary Sorting; Chapter 2. Secondary Sort: A Detailed Example; Secondary Sorting Technique; Complete Example of Secondary Sorting; Input Format; Output Format; Composite Key; Sample Run-Old Hadoop API; Input; Running the MapReduce Job; Output; Sample Run-New Hadoop API; Input; Running the MapReduce Job; Output; Chapter 3. Top 10 List; Top N, Formalized; MapReduce/Hadoop Implementation: Unique Keys; Implementation Classes in MapReduce/Hadoop; Top 10 Sample Run; Finding the Top 5; Finding the Bottom 10; Spark Implementation: Unique Keys; RDD Refresher; Spark's Function Classes; Review of the Top N Pattern for Spark; Complete Spark Top 10 Solution; Sample Run: Finding the Top 10; Parameterizing Top N; Finding the Bottom N; Spark Implementation: Nonunique Keys; Complete Spark Top 10 Solution; Sample Run; Spark Top 10 Solution Using takeOrdered(); Complete Spark Implementation; Finding the Bottom N; Alternative to Using takeOrdered(); MapReduce/Hadoop Top 10 Solution: Nonunique Keys; Sample Run; Chapter 4. Left Outer Join; Left Outer Join Example; Example Queries; Implementation of Left Outer Join in MapReduce; MapReduce Phase 1: Finding Product Locations; MapReduce Phase 2: Counting Unique Locations; Implementation Classes in Hadoop; Sample Run; Spark Implementation of Left Outer Join; Spark Program; Running the Spark Solution; Running Spark on YARN; Spark Implementation with leftOuterJoin(); Spark Program; Sample Run on YARN; Chapter 5. Order Inversion; Example of the Order Inversion Pattern; MapReduce/Hadoop Implementation of the Order Inversion Pattern
Notes:: Description based upon print version of record.; Includes bibliographical references and index.; Description based on online resource; title from PDF title page (ebrary, viewed July 29, 2015).
ISBN:: 9781491906132; 1491906138; 9781491906170; 1491906170; 9781491906156; 1491906154; 9781491906187; 1491906189
OCLC:: 903658436

