Big data architect's handbook : a guide to build proficiency in tools and systems used by leading big data experts / Syed Muhammad Fahad Akhtar.

EBSCOhost Academic eBook Collection (North America) Available online

Ebook Central Academic Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Fahad Akhtar, Syed Muhammad, author.
Language:
English
Subjects (All):
Big data--Handbooks, manuals, etc.
Big data.
Physical Description:
1 online resource (1 volume) : illustrations
Edition:
1st edition
Place of Publication:
Birmingham, England : Packt Publishing, 2018.
System Details:
text file
Biography/History:
Syed Muhammad Fahad Akhtar has 12+ years of industry experience in analyzing, designing, developing, integrating, and managing large applications across different industries. He has worked extensively in the UAE, Pakistan, and Malaysia, and currently works at ASIT Solutions as a solution architect. He received his master's degree from Torrens University, Australia, and a Bachelor of Science in computer engineering from the National University of Computer and Emerging Sciences (FAST), Pakistan.
Summary:
A comprehensive end-to-end guide that gives hands-on practice in big data and Artificial Intelligence

About This Book
Learn to build and run a big data application with sample code
Explore examples to implement activities that a big data architect performs
Use Machine Learning and AI for structured and unstructured data

Who This Book Is For
Big Data Architect's Handbook is for you if you are an aspiring data professional, developer, or IT enthusiast who aims to be an all-round architect in big data. This book is your one-stop solution to enhance your knowledge and carry out easy to complex activities required to become a big data architect.

What You Will Learn
Learn Hadoop Ecosystem and Apache projects
Understand, compare NoSQL database and essential software architecture
Cloud infrastructure design considerations for big data
Explore application scenario of big data tools for daily activities
Learn to analyze and visualize results to uncover valuable insights
Build and run a big data application with sample code from end to end
Apply Machine Learning and AI to perform big data intelligence
Practice the daily activities performed by big data architects

In Detail
The big data architects are the "masters" of data, and hold high value in today's market. Handling big data, be it of good or bad quality, is not an easy task. The prime job for any big data architect is to build an end-to-end big data solution that integrates data from different sources and analyzes it to find useful, hidden insights. Big Data Architect's Handbook takes you through developing a complete, end-to-end big data pipeline, which will lay the foundation for you and provide the necessary knowledge required to be an architect in big data. Right from understanding the design considerations to implementing a solid, efficient, and scalable data pipeline, this book walks you through all the essential aspects of big data. It also gives you an overview of how you can leverage the power of various big data tools such as Apache Hadoop and ElasticSearch in order to bring them together and build an efficient big data solution. By the end of this book, you will be able to build your own design system which integrates, maintains, visualizes, and monitors your data. In addition, you will have a smooth design flow in each process, putting insights in action.

Style and approach
Comprehensive guide with a perfect blend of theory, examples and implementation of real-world use-cases
Contents:
Cover
Title Page
Copyright and Credits
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Why Big Data?
What is big data?
Characteristics of big data
Volume
Velocity
Variety
Veracity
Variability
Value
Solution-based approach for data
Data - the most valuable asset
Traditional approaches to data storage
Clustered computing
High availability
Resource pooling
Easy scalability
Big data - how does it make a difference?
Big data solutions - cloud versus on-premises infrastructure
Cost
Security
Current capabilities
Scalability
Big data glossary
Big data
Batch processing
Cluster computing
Data warehouse
Data lake
Data mining
ETL
Hadoop
In-memory computing
Machine learning
MapReduce
NoSQL
Stream processing
Summary
Chapter 2: Big Data Environment Setup
Oracle VM VirtualBox installation
Ubuntu installation
Hadoop prerequisite installation
Java installation
SSH installation and configuration
Hadoop system user
Apache Hadoop installation
Hadoop configuration
Path configuration for Hadoop commands
Hadoop server start and stop
Chapter 3: Hadoop Ecosystem
Apache Hadoop
Hadoop Distributed File System
HDFS hands-on
Creating a directory in HDFS
Copying files from a local file system to HDFS
Copying files from HDFS to a local file system
Deleting files and folders in HDFS
Hadoop MapReduce
Job Tracker and Task Tracker
The execution flow of MapReduce
Mapper
Shuffle and Sort
Reducer
Example program
Preparing the data file for analysis
Program code
Driver program
Mapper program
Reducer program
Observations and results
YARN
Resource Manager
Node Manager
Container
Application Master
Apache Projects related to big data
Apache Zookeeper
Apache Kafka
Apache Flume
Apache Cassandra
Apache HBase
Apache Spark
Chapter 4: NoSQL Database
What is NoSQL?
Benefits of NoSQL databases
NoSQL versus RDBMS
The CAP theorem
The ACID properties
Data models in NoSQL
Key-value data stores
Document store
Column stores
Graph stores
Installation
Starting Cassandra
The Cassandra Query Language - CQL
The help command
Basic commands
Data manipulation
Creating, altering, and deleting a keyspace
Creating, altering, and deleting tables
Inserting, updating, and deleting data
The MongoDB database
Installing MongoDB
Starting MongoDB
Working on MongoDB
Creating and deleting databases
Creating and deleting collections
The create, retrieve, update, delete operations
Neo4j database
Installing Neo4j
Starting Neo4j
The cypher query language
Help
Basic operations in Cypher
Creating nodes, relationships, and properties
Updating nodes, relationships, and properties
Deleting nodes, relationships, and properties
Reading nodes, relationships, and properties
Chapter 5: Off-the-Shelf Commercial Tools
Microsoft Azure
Building a practical application
Microsoft Azure account
The Azure Event Hub
IoT simulation application
Setting up an Azure Stream Analytics job
Input
Query
Output
Dashboard in Power BI
Chapter 6: Containerization
Virtualization
Hypervisors
Hardware-based hypervisors
Software-based hypervisors
What is containerization?
Benefits of containers
Docker
Docker workflow
Docker images
Building a Docker image
Running and verifying Docker images
Importing and exporting Docker images
Docker Swarm
Setting up Docker Swarm
Creating service containers
Replicating containers
Removing container services
Kubernetes
Key components
Pods
ReplicaSets
Deployments
PetSets
Deployment
Kubernetes Dashboard
Chapter 7: Network Infrastructure
Network
Local area networks
Metropolitan area networks
Wide area networks
Network connectivity
Wired
Wireless
Network visualization
Gephi
First run
Practical example
Chapter 8: Cloud Infrastructure
Companies moving to cloud
Driving factors
Infrastructure
Locality of data
Requirements
Design considerations
Open source versus commercial
Commodity hardware versus purpose build
Cloud versus on-premises
Scale up and down
Application architecture
Cost decision
Chapter 9: Security and Monitoring
Simple Network Management Protocol
Benefits of SNMP
Agents and Traps
Netflow
Nagios
Key benefits
Security Onion
Deployment scenarios
The Standalone model
The Server-Sensor model
Hybrid model
Preconfigured tools
Wireshark
Key features
Chapter 10: Frontend Architecture
React JS
Key concepts
Node.js
JSX
Unidirectional dataflow
Getting started with ReactJS
Single page application
React application project
React app directory structure
Components
Properties
Event handling
State
Redux
Architecture of Redux
Single store
Action
Reducers
Guestbook application
Create a store
Setting up Reducer
Setting up Dispatcher
Connect function
Setting up Subscribers
Final output
Summary
Chapter 11: Backend Architecture
API
RESTful API
HTTP request methods
GET
POST
PUT
DELETE
Authentication
Basic authentication
JSON Web Token
Header
Payload
Signature
Practical
RESTful web service
Java client
Redis
Redis server
Redis client
Working with Redis
Redis data types and structures
String
HashMap
List
Set
Redis Publish/Subscribe
Common key operations
Chapter 12: Machine Learning
Types of algorithms
Parametric algorithms
Non-parametric algorithms
Supervised learning
The classification model
Binary classification
Multi-class classification
The regression model
Linear regression
Polynomial regression
Unsupervised learning
Clustering, k-means
Neural networks
Feedforward neural network
Recurrent neural network
Symmetrically connected neural network
Deep neural networks
Decision tree classifiers
Chapter 13: Artificial Intelligence
Artificial intelligence
Convolutional neural networks
Deep learning using TensorFlow
TensorFlow
TensorFlow program
Uninstalling TensorFlow
TensorBoard
Program
Launching TensorBoard
TensorBoard graph
Object detection using YOLO
Compiling YOLO library
Trained weights
Detecting objects in an image
Chapter 14: Elasticsearch
Installing Elasticsearch
Starting the Elasticsearch server
Auto starting the Elasticsearch service
Stopping the Elasticsearch server
Uninstalling Elasticsearch
Kibana
Starting Kibana
Uninstalling Kibana
Securing Elasticsearch
Securing Kibana
Understanding queries - CRUD commands
Creating
Reading
Updating
Deleting
Chapter 15: Structured Data
Data analysis
Installing MySQL
Importing data
Analyzing the data model
HBase
Starting an HBase instance
Stopping a HBase instance
Preparing an HBase for migration
Sqoop
Verifying the installation
MySQL JDBC driver
Verifying the imported data
Chapter 16: Unstructured Data
Moving data into Hadoop
Downloading Flume
Environment configuration
Configuring agent and sink
Running Apache Flume
Transferring a log file
Converting images into text for analysis
Tesseract OCR
Installing Tesseract
Complete code
Program execution
Chapter 17: Data Visualization
Matplotlib
Installing Matplotlib
Line chart
Bar charts
Stack charts
Scatter charts
Pie charts
Geographic projections
D3.js
Chapter 18: Financial Trading System
What is algorithmic trading?
Benefits of algorithmic trading
Big data in the financial market
Algorithmic trading strategies
Building an Expert Advisor
MetaTrader
Downloading and setting up MetaTrader
MetaQuotes language
Trading bot objective
Trading pattern - moving average
Decision time: buy or sell
Complete program
Backtesting in MetaTrader 4
Chapter 19: Retail Recommendation System
Types of recommendation system
Collaborative filtering
Content-based filtering
Demographic-based system
Utility-based system
Knowledge-based system
Commercial tools
Barilliance
Softcube
Strands
Monetate
Nosto
Book recommendation system
Dataset
Directory structure
Code
Reading the dataset
Verifying the dataset
Age group
Commutative rating
Algorithms
Notes:
Includes index.
Description based on print version record.
ISBN:
9781788836388
1788836383
OCLC:
1044741262
