Hadoop : the definitive guide / by Tom White.
- Format:
- Book
- Author/Creator:
- White, Tom (Tom E.)
- Language:
- English
- Subjects (All):
- Apache Hadoop.
- Computer software.
- Physical Description:
- 1 online resource (526 p.)
- Edition:
- First edition.
- Place of Publication:
- Sebastopol, California : O'Reilly Media, Inc., 2009.
- Language Note:
- English
- System Details:
- text file
- Summary:
- Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:
- Contents:
- Table of Contents; Foreword; Preface; Administrative Notes; What's in This Book?; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments; Chapter 1. Meet Hadoop; Data!; Data Storage and Analysis; Comparison with Other Systems; RDBMS; Grid Computing; Volunteer Computing; A Brief History of Hadoop; The Apache Hadoop Project; Chapter 2. MapReduce; A Weather Dataset; Data Format; Analyzing the Data with Unix Tools; Analyzing the Data with Hadoop; Map and Reduce; Java MapReduce; A test run; The new Java MapReduce API; Scaling Out; Data Flow
- Combiner Functions; Specifying a combiner function; Running a Distributed MapReduce Job; Hadoop Streaming; Ruby; Python; Hadoop Pipes; Compiling and Running; Chapter 3. The Hadoop Distributed Filesystem; The Design of HDFS; HDFS Concepts; Blocks; Namenodes and Datanodes; The Command-Line Interface; Basic Filesystem Operations; Hadoop Filesystems; Interfaces; Thrift; C; FUSE; WebDAV; Other HDFS Interfaces; The Java Interface; Reading Data from a Hadoop URL; Reading Data Using the FileSystem API; FSDataInputStream; Writing Data; FSDataOutputStream; Directories; Querying the Filesystem
- File metadata: FileStatus; Listing files; File patterns; PathFilter; Deleting Data; Data Flow; Anatomy of a File Read; Anatomy of a File Write; Coherency Model; Consequences for application design; Parallel Copying with distcp; Keeping an HDFS Cluster Balanced; Hadoop Archives; Using Hadoop Archives; Limitations; Chapter 4. Hadoop I/O; Data Integrity; Data Integrity in HDFS; LocalFileSystem; ChecksumFileSystem; Compression; Codecs; Compressing and decompressing streams with CompressionCodec; Inferring CompressionCodecs using CompressionCodecFactory; Native libraries
- Compression and Input Splits; Using Compression in MapReduce; Compressing map output; Serialization; The Writable Interface; WritableComparable and comparators; Writable Classes; Writable wrappers for Java primitives; Text; BytesWritable; NullWritable; ObjectWritable and GenericWritable; Writable collections; Implementing a Custom Writable; Implementing a RawComparator for speed; Custom comparators; Serialization Frameworks; Serialization IDL; File-Based Data Structures; SequenceFile; Writing a SequenceFile; Reading a SequenceFile; Displaying a SequenceFile with the command-line interface
- Sorting and merging SequenceFiles; The SequenceFile Format; MapFile; Writing a MapFile; Reading a MapFile; Converting a SequenceFile to a MapFile; Chapter 5. Developing a MapReduce Application; The Configuration API; Combining Resources; Variable Expansion; Configuring the Development Environment; Managing Configuration; GenericOptionsParser, Tool, and ToolRunner; Writing a Unit Test; Mapper; Reducer; Running Locally on Test Data; Running a Job in a Local Job Runner; Fixing the mapper; Testing the Driver; Running on a Cluster; Packaging; Launching a Job; The MapReduce Web UI
- The jobtracker page
- Notes:
- Description based upon print version of record.
- Description based on online resource; title from PDF title page (ebrary, viewed October 1, 2013).
- ISBN:
- 9781306817462
- 1306817463
- 9780596551360
- 0596551363
- 9780596551179
- 0596551177
- OCLC:
- 317877866