Pig design patterns : simplify Hadoop programming to create complex end-to-end enterprise big data solutions with Pig / Pradeep Pasupuleti ; Srinivas Uppuluri, foreword.

EBSCOhost Academic eBook Collection (North America) Available online

Ebook Central Academic Complete Available online

Ebook Central College Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Pradeep Pasupuleti, author.
Uppuluri, Srinivas, author of introduction, etc.
Series:
Community experience distilled
Language:
English
Subjects (All):
Apache Hadoop.
Open source software.
Programming languages (Electronic computers).
Physical Description:
1 online resource (310 p.)
Edition:
1st edition
Place of Publication:
Birmingham, England : Packt Publishing, 2014.
Language Note:
English
System Details:
text file
Biography/History:
Pasupuleti Pradeep: Pradeep Pasupuleti has over 17 years of experience in architecting and developing distributed and real-time data-driven systems. Currently, he focuses on developing robust data platforms and data products that are fuelled by scalable machine-learning algorithms, and on delivering value to customers by applying his deep technical insights to their business problems. Pradeep founded Datatma expressly to humanize Big Data, simplify it, and unravel new value on a previously unimaginable scale in economy and scope. He has created Centers of Excellence (COEs) to provide quick wins with data products that analyze high-dimensional multistructured data using scalable natural language processing and deep learning techniques. He has held roles in technology consulting, advising Fortune 500 companies.
Summary:
Pig makes Hadoop programming simple, intuitive, and fun to work with. It removes the complexity from MapReduce programming by giving the programmer immense power through its flexibility. What used to be extremely lengthy and intricate code written in other high-level languages can now be written in almost one-tenth of the size using its easy-to-understand constructs. Pig has proven to be the easiest way to learn how to program Hadoop clusters, as evidenced by its widespread adoption. This comprehensive guide enables readers to readily use design patterns to simplify the creation of complex da
Contents:
Cover; Copyright; Credits; Foreword; About the Author; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Setting the Context for Design Patterns in Pig; Understanding design patterns; The scope of design patterns in Pig; Hadoop demystified - a quick reckoner; The enterprise context; Common challenges of distributed systems; The advent of Hadoop; Hadoop under the covers; Understanding the Hadoop Distributed File System; HDFS design goals; Working of HDFS; Understanding MapReduce; Understanding how MapReduce works; The MapReduce internals;
Pig - a quick intro; Understanding the rationale of Pig; Understanding the relevance of Pig in the enterprise; Working of Pig - an overview; Firing up Pig; The use case; Code listing; The dataset; Understanding Pig through the code; Pig's extensibility; Operators used in code; The EXPLAIN operator; Understanding Pig's data model; Primitive types; Complex types; Summary; Chapter 2: Data Ingest and Egress Patterns; The context of data ingest and egress; Types of data in the enterprise; Ingest and egress patterns for multistructured data; Considerations for log ingestion;
The Apache log ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The Custom log ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The image ingress and egress pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for the NoSQL data; MongoDB ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results;
Additional information; The HBase ingress and egress pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for structured data; The Hive ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for semi-structured data; The mainframe ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; XML ingest and egress patterns;
Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; JSON ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; Summary; Chapter 3: Data Profiling Patterns; Data profiling for Big Data; Big Data profiling dimensions; Sampling considerations for profiling Big Data; Sampling support in Pig; Rationale for using Pig in data profiling; The data type inference pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Pig script; Java UDF;
Results
Notes:
Description based upon print version of record.
Description based on online resource; title from PDF title page (ebrary, viewed April 29, 2014).
ISBN:
9781783285563
1783285567
OCLC:
880637473
