Learning Cascading : build reliable, robust, and high-performance big data applications using the Cascading application development platform efficiently / Michael Covert, Victoria Loewengart.

Format:
Book
Author/Creator:
Covert, Michael, author.
Loewengart, Victoria, author.
Series:
Community experience distilled.
Language:
English
Subjects (All):
Apache Hadoop.
Web site development.
Cascading style sheets.
Physical Description:
1 online resource (276 p.)
Place of Publication:
Birmingham, [England] ; Mumbai, [India] : Packt Publishing, 2015.
Language Note:
English
Summary:
This book is intended for software developers, system architects and analysts, big data project managers, and data scientists who wish to deploy big data solutions using the Cascading framework. You must have a basic understanding of the big data paradigm and should be familiar with Java development techniques.
Contents:
Cover; Copyright; Credits; Foreword; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
Chapter 1: The Big Data Core Technology Stack; Reviewing Hadoop; Hadoop architecture; HDFS - the Hadoop Distributed File System; The NameNode; The secondary NameNode; DataNodes; MapReduce execution framework; The JobTracker; The TaskTracker; Hadoop jobs; Distributed cache; Counters; YARN - MapReduce version 2; A simple MapReduce job; Beyond MapReduce; The Cascading framework; The execution graph and flow planner; How Cascading produces MapReduce jobs; Summary
Chapter 2: Cascading Basics in Detail; Understanding common Cascading themes; Data flows as processes; Understanding how Cascading represents records; Using tuples and defining fields; Using a Fields object, named field groups and selectors; Data typing and coercion; Defining schemes; Schemes in detail; TupleEntry; Understanding how Cascading controls data flow; Using pipes; Creating and chaining; Pipe operations; Each; Splitting; GroupBy and sorting; Every; Merging and joining; The Merge pipe; The join pipes - CoGroup and HashJoin; CoGroup; HashJoin; Default output selectors; Using taps; Flow; FlowConnector; Cascades; Local and Hadoop modes; Common errors; Putting it all together; Summary
Chapter 3: Understanding Custom Operations; Understanding operations; Operations and fields; The Operation class and interface hierarchy; The basic operation lifecycle; Contexts; FlowProcess; OperationCall; An operation processing sequence and its methods; Operation types; Each operations; Every operations; Buffers; Assertions; Summary
Chapter 4: Creating Custom Operations; Writing custom operations; Writing a filter; Writing a function; Writing an aggregator; Writing a custom assertion; Writing a buffer; Identifying common use cases for custom operations; Putting it all together; Summary
Chapter 5: Code Reuse and Integration; Creating and using subassemblies; Built-in subassemblies; Creating a new custom subassembly; Using custom subassemblies; Using cascades; Building a complex workflow using cascades; Skipping a flow in a cascade; Intermediate file management; Dynamically controlling flows; Instrumentation and counters; Using counters to control flow; Using existing MapReduce jobs; The FlowDef fluent interface; Integrating external components; Flow and cascade events; Using external JAR files; Using Cascading as insulation from big data migrations and upgrades; Summary
Chapter 6: Testing a Cascading Application; Debugging a Cascading application; Getting your environment ready for debugging; Using Cascading local mode debugging; Setting up Eclipse; Remote debugging; Using assertions; The Debug() filter; Managing exceptions with traps; Checkpoints; Managing bad data; Viewing flow sequencing using DOT files; Testing strategies; Unit testing and JUnit; Mocking; Integration testing; Load and performance testing; Summary
Chapter 7: Optimizing the Performance of a Cascading Application
Notes:
Includes index.
Description based on online resource; title from PDF title page (ebrary, viewed June 18, 2015).