Big Data Processing Using Spark in Cloud / edited by Mamta Mittal, Valentina E. Balas, Lalit Mohan Goyal, Raghvendra Kumar.

Springer Nature - Springer Engineering eBooks 2019 English International Available online

Format:
Book
Contributor:
Mittal, Mamta, editor.
Balas, Valentina Emilia, editor.
Goyal, Lalit Mohan, editor.
Kumar, Raghvendra, 1987- editor.
SpringerLink (Online service)
Series:
Engineering (Springer-11647)
Studies in big data 2197-6503 ; 43.
Studies in Big Data, 2197-6503 ; 43
Language:
English
Subjects (All):
Big data.
Computer security.
Big Data.
Systems and Data Security.
Big Data/Analytics.
Local Subjects:
Big Data.
Systems and Data Security.
Big Data/Analytics.
Physical Description:
1 online resource (XIII, 264 pages) : 89 illustrations, 62 illustrations in color.
Edition:
First edition 2019.
Contained In:
Springer eBooks
Place of Publication:
Singapore : Springer Singapore : Imprint: Springer, 2019.
System Details:
text file PDF
Summary:
The book describes the emergence of big data technologies and the role of Spark in the big data stack. It compares Spark and Hadoop and identifies the shortcomings of Hadoop that Spark overcomes. The book focuses mainly on the in-depth architecture of Spark, on Spark RDDs, and on how RDDs complement the immutable nature of big data through lazy evaluation, caching, and type inference. It also addresses advanced topics in Spark, starting with the basics of Scala and the core Spark framework, and exploring Spark data frames, machine learning using MLlib, graph analytics using GraphX, and real-time processing with Apache Kafka, AWS Kinesis, and Azure Event Hubs. It then goes on to investigate Spark using PySpark and R. Focusing on the current big data stack, the book examines the interaction with current big data tools, with Spark being the core processing layer for all types of data. The book is intended for data engineers and scientists working on massive datasets and big data technologies in the cloud. In addition to industry professionals, it is helpful for aspiring data processing professionals and students working in big data processing and cloud computing environments.
Contents:
Concepts of Big Data and Apache Spark
Big Data Analysis in Cloud and Machine Learning
Security Issues and Challenges related to Big Data
Big Data Security Solutions in Cloud
Data Science and Analytics
Big Data Technologies
Data Analysis with Cassandra and Spark
Spin up the Spark Cluster
Learn Scala
IO for Spark
Processing with Spark
Spark Data Frames and Spark SQL
Machine Learning and Advanced Analytics
Parallel Programming with Spark
Distributed Graph Processing with Spark
Real Time Processing with Spark
Spark in Real World
Case Studies.
Other Format:
Printed edition:
ISBN:
978-981-13-0550-4
9789811305504
OCLC:
1041108960
Access Restriction:
Restricted for use by site license.
