Big data architect's handbook : a guide to build proficiency in tools and systems used by leading big data experts / Syed Muhammad Fahad Akhtar.
- Format:
- Book
- Author/Creator:
- Fahad Akhtar, Syed Muhammad, author.
- Language:
- English
- Subjects (All):
- Big data--Handbooks, manuals, etc.
- Big data.
- Physical Description:
- 1 online resource (1 volume) : illustrations
- Edition:
- 1st edition
- Place of Publication:
- Birmingham, England : Packt Publishing, 2018.
- System Details:
- text file
- Biography/History:
- Akhtar Syed Muhammad Fahad: Syed Muhammad Fahad Akhtar has 12+ years of industry experience in analyzing, designing, developing, integrating, and managing large applications in different industries. He has worked extensively in the UAE, Pakistan, and Malaysia and currently works at ASIT Solutions as a solution architect. He received his master's degree from Torrens University, Australia, and his bachelor of science in computer engineering from the National University of Computer and Emerging Sciences (FAST), Pakistan.
- Summary:
- A comprehensive end-to-end guide that gives hands-on practice in big data and artificial intelligence.
- About This Book: Learn to build and run a big data application with sample code. Explore examples to implement activities that a big data architect performs. Use machine learning and AI for structured and unstructured data.
- Who This Book Is For: Big Data Architect's Handbook is for you if you are an aspiring data professional, developer, or IT enthusiast who aims to be an all-round architect in big data. This book is your one-stop solution to enhance your knowledge and carry out the easy to complex activities required to become a big data architect.
- What You Will Learn: Learn the Hadoop ecosystem and Apache projects. Understand and compare NoSQL databases and essential software architecture. Consider cloud infrastructure design for big data. Explore application scenarios of big data tools for daily activities. Learn to analyze and visualize results to uncover valuable insights. Build and run a big data application with sample code from end to end. Apply machine learning and AI to perform big data intelligence. Practice the daily activities performed by big data architects.
- In Detail: Big data architects are the “masters” of data and hold high value in today's market. Handling big data, be it of good or bad quality, is not an easy task. The prime job for any big data architect is to build an end-to-end big data solution that integrates data from different sources and analyzes it to find useful, hidden insights. Big Data Architect's Handbook takes you through developing a complete, end-to-end big data pipeline, which will lay the foundation for you and provide the knowledge required to be an architect in big data. From understanding the design considerations to implementing a solid, efficient, and scalable data pipeline, this book walks you through all the essential aspects of big data. It also gives you an overview of how you can leverage the power of various big data tools, such as Apache Hadoop and Elasticsearch, and bring them together to build an efficient big data solution. By the end of this book, you will be able to build your own design system which integrates, maintains, visualizes, and monitors your data. In addition, you will have a smooth design flow in each process, putting insights into action.
- Style and approach: Comprehensive guide with a blend of theory, examples, and implementation of real-world use cases.
- Contents:
- Cover
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributors
- Table of Contents
- Preface
- Chapter 1: Why Big Data?
- What is big data?
- Characteristics of big data
- Volume
- Velocity
- Variety
- Veracity
- Variability
- Value
- Solution-based approach for data
- Data - the most valuable asset
- Traditional approaches to data storage
- Clustered computing
- High availability
- Resource pooling
- Easy scalability
- Big data - how does it make a difference?
- Big data solutions - cloud versus on-premises infrastructure
- Cost
- Security
- Current capabilities
- Scalability
- Big data glossary
- Big data
- Batch processing
- Cluster computing
- Data warehouse
- Data lake
- Data mining
- ETL
- Hadoop
- In-memory computing
- Machine learning
- MapReduce
- NoSQL
- Stream processing
- Summary
- Chapter 2: Big Data Environment Setup
- Oracle VM VirtualBox installation
- Ubuntu installation
- Hadoop prerequisite installation
- Java installation
- SSH installation and configuration
- Hadoop system user
- Apache Hadoop installation
- Hadoop configuration
- Path configuration for Hadoop commands
- Hadoop server start and stop
- Chapter 3: Hadoop Ecosystem
- Apache Hadoop
- Hadoop Distributed File System
- HDFS hands-on
- Creating a directory in HDFS
- Copying files from a local file system to HDFS
- Copying files from HDFS to a local file system
- Deleting files and folders in HDFS
- Hadoop MapReduce
- Job Tracker and Task Tracker
- The execution flow of MapReduce
- Mapper
- Shuffle and Sort
- Reducer
- Example program
- Preparing the data file for analysis
- Program code
- Driver program
- Mapper program
- Reducer program
- Observations and results
- YARN
- Resource Manager
- Node Manager
- Container
- Application Master
- Apache Projects related to big data
- Apache Zookeeper
- Apache Kafka
- Apache Flume
- Apache Cassandra
- Apache HBase
- Apache Spark
- Chapter 4: NoSQL Database
- What is NoSQL?
- Benefits of NoSQL databases
- NoSQL versus RDBMS
- The CAP theorem
- The ACID properties
- Data models in NoSQL
- Key-value data stores
- Document store
- Column stores
- Graph stores
- Installation
- Starting Cassandra
- The Cassandra Query Language - CQL
- The help command
- Basic commands
- Data manipulation
- Creating, altering, and deleting a keyspace
- Creating, altering, and deleting tables
- Inserting, updating, and deleting data
- The MongoDB database
- Installing MongoDB
- Starting MongoDB
- Working on MongoDB
- Creating and deleting databases
- Creating and deleting collections
- The create, retrieve, update, delete operations
- Neo4j database
- Installing Neo4j
- Starting Neo4j
- The cypher query language
- Help
- Basic operations in Cypher
- Creating nodes, relationships, and properties
- Updating nodes, relationships, and properties
- Deleting nodes, relationships, and properties
- Reading nodes, relationships, and properties
- Chapter 5: Off-the-Shelf Commercial Tools
- Microsoft Azure
- Building a practical application
- Microsoft Azure account
- The Azure Event Hub
- IoT simulation application
- Setting up an Azure Stream Analytics job
- Input
- Query
- Output
- Dashboard in Power BI
- Chapter 6: Containerization
- Virtualization
- Hypervisors
- Hardware-based hypervisors
- Software-based hypervisors
- What is containerization?
- Benefits of containers
- Docker
- Docker workflow
- Docker images
- Building a Docker image
- Running and verifying Docker images
- Importing and exporting Docker images
- Docker Swarm
- Setting up Docker Swarm
- Creating service containers
- Replicating containers
- Removing container services
- Kubernetes
- Key components
- Pods
- ReplicaSets
- Deployments
- PetSets
- Deployment
- Kubernetes Dashboard
- Chapter 7: Network Infrastructure
- Network
- Local area networks
- Metropolitan area networks
- Wide area networks
- Network connectivity
- Wired
- Wireless
- Network visualization
- Gephi
- First run
- Practical example
- Chapter 8: Cloud Infrastructure
- Companies moving to cloud
- Driving factors
- Infrastructure
- Locality of data
- Requirements
- Design considerations
- Open source versus commercial
- Commodity hardware versus purpose build
- Cloud versus on-premises
- Scale up and down
- Application architecture
- Cost decision
- Chapter 9: Security and Monitoring
- Simple Network Management Protocol
- Benefits of SNMP
- Agents and Traps
- Netflow
- Nagios
- Key benefits
- Security Onion
- Deployment scenarios
- The Standalone model
- The Server-Sensor model
- Hybrid model
- Preconfigured tools
- Wireshark
- Key features
- Chapter 10: Frontend Architecture
- React JS
- Key concepts
- Node.js
- JSX
- Unidirectional dataflow
- Getting started with ReactJS
- Single page application
- React application project
- React app directory structure
- Components
- Properties
- Event handling
- State
- Redux
- Architecture of Redux
- Single store
- Action
- Reducers
- Guestbook application
- Create a store
- Setting up Reducer
- Setting up Dispatcher
- Connect function
- Setting up Subscribers
- Final output
- Summary
- Chapter 11: Backend Architecture
- API
- RESTful API
- HTTP request methods
- GET
- POST
- PUT
- DELETE
- Authentication
- Basic authentication
- JSON Web Token
- Header
- Payload
- Signature
- Practical
- RESTful web service
- Java client
- Redis
- Redis server
- Redis client
- Working with Redis
- Redis data types and structures
- String
- HashMap
- List
- Set
- Redis Publish/Subscribe
- Common key operations
- Chapter 12: Machine Learning
- Types of algorithms
- Parametric algorithms
- Non-parametric algorithms
- Supervised learning
- The classification model
- Binary classification
- Multi-class classification
- The regression model
- Linear regression
- Polynomial regression
- Unsupervised learning
- Clustering, k-means
- Neural networks
- Feedforward neural network
- Recurrent neural network
- Symmetrically connected neural network
- Deep neural networks
- Decision tree classifiers
- Chapter 13: Artificial Intelligence
- Artificial intelligence
- Convolutional neural networks
- Deep learning using TensorFlow
- TensorFlow
- TensorFlow program
- Uninstalling TensorFlow
- TensorBoard
- Program
- Launching TensorBoard
- TensorBoard graph
- Object detection using YOLO
- Compiling YOLO library
- Trained weights
- Detecting objects in an image
- Chapter 14: Elasticsearch
- Installing Elasticsearch
- Starting the Elasticsearch server
- Auto starting the Elasticsearch service
- Stopping the Elasticsearch server
- Uninstalling Elasticsearch
- Kibana
- Starting Kibana
- Uninstalling Kibana
- Securing Elasticsearch
- Securing Kibana
- Understanding queries - CRUD commands
- Creating
- Reading
- Updating
- Deleting
- Chapter 15: Structured Data
- Data analysis
- Installing MySQL
- Importing data
- Analyzing the data model
- HBase
- Starting an HBase instance
- Stopping a HBase instance
- Preparing an HBase for migration
- Sqoop
- Verifying the installation
- MySQL JDBC driver
- Verifying the imported data
- Chapter 16: Unstructured Data
- Moving data into Hadoop
- Downloading Flume
- Environment configuration
- Configuring agent and sink
- Running Apache Flume
- Transferring a log file
- Converting images into text for analysis
- Tesseract OCR
- Installing Tesseract
- Complete code
- Program execution
- Chapter 17: Data Visualization
- Matplotlib
- Installing Matplotlib
- Line chart
- Bar charts
- Stack charts
- Scatter charts
- Pie charts
- Geographic projections
- D3.js
- Chapter 18: Financial Trading System
- What is algorithmic trading?
- Benefits of algorithmic trading
- Big data in the financial market
- Algorithmic trading strategies
- Building an Expert Advisor
- MetaTrader
- Downloading and setting up MetaTrader
- MetaQuotes language
- Trading bot objective
- Trading pattern - moving average
- Decision time: buy or sell
- Complete program
- Backtesting in MetaTrader 4
- Chapter 19: Retail Recommendation System
- Types of recommendation system
- Collaborative filtering
- Content-based filtering
- Demographic-based system
- Utility-based system
- Knowledge-based system
- Commercial tools
- Barilliance
- Softcube
- Strands
- Monetate
- Nosto
- Book recommendation system
- Dataset
- Directory structure
- Code
- Reading the dataset
- Verifying the dataset
- Age group
- Commutative rating
- Algorithms
- Notes:
- Includes index.
- Description based on print version record.
- ISBN:
- 9781788836388
- 1788836383
- OCLC:
- 1044741262