Pig design patterns : simplify hadoop programming to create complex end-to-end enterprise big data solutions with pig / Pradeep Pasupuleti ; Srinivas Uppuluri, foreword.
- Format:
- Book
- Author/Creator:
- Pradeep Pasupuleti, author.
- Uppuluri, Srinivas, author of introduction, etc.
- Series:
- Community experience distilled
- Language:
- English
- Subjects (All):
- Apache Hadoop.
- Open source software.
- Programming languages (Electronic computers).
- Physical Description:
- 1 online resource (310 p.)
- Edition:
- 1st edition
- Place of Publication:
- Birmingham, England : Packt Publishing, 2014.
- Language Note:
- English
- System Details:
- text file
- Biography/History:
- Pasupuleti Pradeep: Pradeep Pasupuleti has over 17 years of experience in architecting and developing distributed and real-time data-driven systems. Currently, he focuses on developing robust data platforms and data products that are fuelled by scalable machine-learning algorithms, and delivering value to customers in addressing business problems by applying his deep technical insights. Pradeep founded Datatma expressly to humanize Big Data, simplify it, and unravel new value on a previously unimaginable scale in economy and scope. He has created COE (Centers of Excellence) to provide quick wins with data products that analyze high-dimensional multistructured data using scalable natural language processing and deep learning techniques. He has performed roles in technology consulting and advising Fortune 500 companies.
- Summary:
- Pig makes Hadoop programming simple, intuitive, and fun to work with. It removes the complexity from MapReduce programming by giving the programmer immense power through its flexibility. What used to be extremely lengthy and intricate code written in other high-level languages can now be written in almost one-tenth of the size using its easy-to-understand constructs. Pig has proven to be the easiest way to learn how to program Hadoop clusters, as evidenced by its widespread adoption. This comprehensive guide enables readers to readily use design patterns to simplify the creation of complex da
- Contents:
- Cover; Copyright; Credits; Foreword; About the Author; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Setting the Context for Design Patterns in Pig; Understanding design patterns; The scope of design patterns in Pig; Hadoop demystified - a quick reckoner; The enterprise context; Common challenges of distributed systems; The advent of Hadoop; Hadoop under the covers; Understanding the Hadoop Distributed File System; HDFS design goals; Working of HDFS; Understanding MapReduce; Understanding how MapReduce works; The MapReduce internals
- Pig - a quick intro; Understanding the rationale of Pig; Understanding the relevance of Pig in the enterprise; Working of Pig - an overview; Firing up Pig; The use case; Code listing; The dataset; Understanding Pig through the code; Pig's extensibility; Operators used in code; The EXPLAIN operator; Understanding Pig's data model; Primitive types; Complex types; Summary; Chapter 2: Data Ingest and Egress Patterns; The context of data ingest and egress; Types of data in the enterprise; Ingest and egress patterns for multistructured data; Considerations for log ingestion
- The Apache log ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The Custom log ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The image ingress and egress pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for the NoSQL data; MongoDB ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results
- Additional information; The HBase ingress and egress pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for structured data; The Hive ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for semi-structured data; The mainframe ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; XML ingest and egress patterns
- Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; JSON ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; Summary; Chapter 3: Data Profiling Patterns; Data profiling for Big Data; Big Data profiling dimensions; Sampling considerations for profiling Big Data; Sampling support in Pig; Rationale for using Pig in data profiling; The data type inference pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Pig script; Java UDF
- Results
- Notes:
- Description based upon print version of record.
- Description based on online resource; title from PDF title page (ebrary, viewed April 29, 2014).
- ISBN:
- 9781783285563
- 1783285567
- OCLC:
- 880637473