Heron streaming : fundamentals, applications, operations, and insights / Huijun Wu, Maosong Fu.
Springer Nature - Springer Mathematics and Statistics eBooks 2021 English International Available online
- Format:
- Book
- Author/Creator:
- Wu, Huijun (Writer on cloud computing), author.
- Fu, Maosong, author.
- Language:
- English
- Subjects (All):
- Information retrieval.
- Data mining.
- Big data.
- Information organization.
- Physical Description:
- 1 online resource (211 pages) : illustrations
- Place of Publication:
- Cham, Switzerland : Springer, [2021]
- Summary:
- This book provides both a basic understanding of stream processing in general and practical guidance for development and research with Apache Heron in particular. It gives developers of streaming applications basic, systematic knowledge about Heron, which today is scattered across project documents, technical blogs, and code snippets on the Web. The book is organized in four parts. Part I describes basic knowledge about stream processing, Apache Storm, and Apache Heron (Incubating), and also introduces the Heron source repository. Part II goes into detail, describing two data models for writing Heron topologies and frequently used topology features, including stateful processing. This part is especially targeted at software developers who write topologies using Heron APIs. Part III describes the Heron tools, including the command-line interface and the user interface, needed to manage a single topology or multiple topologies in a data center. This part is particularly aimed at operators who deploy and manage running jobs. Finally, Part IV describes the Heron source code and how to customize or extend Heron. This part is especially suggested for software engineers who would like to contribute code to the Heron repository and who are curious about Heron internals. Overall, this book is aimed at professionals who want to process streaming data with Apache Heron. A basic knowledge of Java and Bash commands for Linux is assumed.
- Contents:
- Intro
- Foreword
- Preface
- Who This Book Is For
- How This Book Is Organized
- What You Need for This Book
- Typographical Conventions
- Acknowledgments
- Contents
- About the Authors
- Part I Heron Fundamentals
- 1 Stream Processing
- 1.1 Big Data Processing
- 1.1.1 Lambda Architecture
- 1.1.1.1 Batch Processing Layer
- 1.1.1.2 Stream Processing Layer
- 1.1.1.3 Serving Layer
- 1.1.2 Kappa Architecture
- 1.2 Big Data Stream Processing
- 1.3 From Apache Storm to Apache Heron (Incubating)
- 1.3.1 Motivation for Heron
- 1.3.2 Heron Design Goals
- 1.3.3 Join the Apache Heron (Incubating) Community
- 1.4 Stream Processing Tools
- 1.5 Summary
- References
- 2 Heron Basics
- 2.1 Topology Data Model
- 2.1.1 Topology
- 2.1.2 Spout
- 2.1.3 Bolt
- 2.1.4 Grouping
- 2.2 Heron Architecture and Components
- 2.2.1 Cluster-Level Components (Six Components)
- 2.2.1.1 Scheduler
- 2.2.1.2 State Manager
- 2.2.1.3 Uploader
- 2.2.1.4 Heron CLI
- 2.2.1.5 Heron Tracker
- 2.2.1.6 Heron UI
- 2.2.2 Topology-Level Components (Four Components)
- 2.2.2.1 Heron Instance
- 2.2.2.2 Stream Manager
- 2.2.2.3 Topology Master
- 2.2.2.4 Metrics Manager
- 2.3 Submission Process and Failure Handling
- 2.4 Submit the First Topology
- 2.4.1 Preparation
- 2.4.2 Install the Heron Client
- 2.4.3 Heron Example Topologies
- 2.4.4 Submit the Topology JAR File
- 2.4.5 Observe the Running Topology
- 2.5 Summary
- 3 Study Heron Code
- 3.1 Code Languages
- 3.2 Requirements for Compiling
- 3.3 Prepare the Compiling Environment
- 3.4 Source Organization
- 3.4.1 Directory Organization
- 3.4.2 Bazel Perspective
- 3.5 Compile Heron
- 3.6 Examine Compiling Results
- 3.6.1 Examine the API
- 3.6.2 Examine Packages
- 3.7 Run Tests
- 3.7.1 Unit Test
- 3.7.2 Integration Test
- 3.8 Summary
- References
- Part II Write Heron Topologies
- 4 Migrate Storm Topology to Heron
- 4.1 Prepare the Storm Topology Code
- 4.1.1 Examine the Storm Topology Code
- 4.1.2 Examine the Storm Flux Code
- 4.2 Migrate the Storm Topology Code to a HeronTopology Project
- 4.2.1 Adjust the Topology Java Code
- 4.2.2 Adjust the Project File pom.xml
- 4.2.2.1 Add Dependency
- 4.2.2.2 Build with Dependencies
- 4.2.3 Compile the Topology JAR File
- 4.3 Migrate Storm Flux to Heron ECO
- 4.4 Summary
- 5 Write Topology Code
- 5.1 Before Writing Code
- 5.1.1 Design Topology
- 5.1.2 Choose a Heron API
- 5.2 Write Topology in Java
- 5.2.1 Code the Topology
- 5.2.1.1 Code Main
- 5.2.1.2 Code Spout
- 5.2.1.3 Code Bolt
- 5.2.2 Understand Tuple Flow
- 5.2.2.1 How Tuple Is Constructed
- 5.2.2.2 How Tuple Is Routed
- 5.3 Write Topology in Python
- 5.3.1 Code Main
- 5.3.2 Code Spout
- 5.3.3 Code Bolt
- 5.3.4 Compile and Run
- 5.4 Summary
- Reference
- 6 Heron Topology Features
- 6.1 Delivery Semantics
- 6.1.1 At-Least-Once
- 6.1.2 Effectively-Once
- 6.1.2.1 Requirements for Effectively-Once
- 6.1.2.2 Exactly-Once Versus Effectively-Once
- 6.1.2.3 Stateful Topologies
- 6.1.2.4 Implement Effectively-Once
- 6.2 Windowing
- 6.2.1 Windowing Concepts
- 6.2.2 Windowing Example
- 6.3 Summary
- 7 Heron Streamlet API
- 7.1 Streamlet API Concepts
- 7.1.1 Streamlets
- 7.1.2 Operations
- 7.2 Write a Processing Graph with the Java Streamlet API
- 7.2.1 Sources
- 7.2.2 Sinks
- 7.2.3 Transform: Filter, Map, FlatMap
- 7.2.4 Partitioning
- 7.2.5 Clone and Union
- 7.2.6 Reduce by Key and Window
- 7.2.7 Join
- 7.2.8 Configuration
- 7.3 Write a Processing Graph with the Python Streamlet API
- 7.3.1 Source Generator
- 7.3.2 Processing Graph Construction
- 7.4 Write a Processing Graph with the Scala Streamlet API
- 7.4.1 Install sbt
- 7.4.2 Source Directory
- 7.4.3 Compose Processing Graph
- 7.4.4 Examine the JAR File
- 7.5 Summary
- Part III Operate Heron Clusters
- 8 Manage a Topology
- 8.1 Install Heron Client
- 8.1.1 What Is Inside heron-core.tar.gz
- 8.1.2 YAML Configuration
- 8.2 Run Topology
- 8.2.1 Topology Life Cycle
- 8.2.2 Submit Topology
- 8.2.3 Observe the Topology Running Status
- 8.3 Explore the "heron" Command
- 8.3.1 Common Arguments and Optional Flags
- 8.3.2 Explore "heron submit" Options
- 8.3.3 Kill Topology
- 8.3.4 Activate and Deactivate Topology
- 8.3.5 Restart Topology
- 8.3.6 Update Topology
- 8.4 Summary
- 9 Manage Multiple Topologies
- 9.1 Install Heron Tools
- 9.2 Heron Tracker
- 9.3 Heron UI
- 9.4 Heron Explorer
- 9.5 Summary
- Part IV Heron Insights
- 10 Explore Heron
- 10.1 Heron Processes
- 10.1.1 Java Processes
- 10.1.2 C++ Processes
- 10.1.3 Python Processes
- 10.2 State Manager
- 10.3 Heron Scheduler
- 10.3.1 Three Plans: Logical, Packing, and Physical
- 10.3.2 Restart Dead Processes
- 10.4 Data Flow
- 10.4.1 Capture Packets
- 10.4.2 Communication Primitive
- 10.5 Metrics System
- 10.5.1 File Sink
- 10.5.2 MetricsCache Manager Sink and TopologyMaster Sink
- 10.6 Summary
- 11 Extending the Heron Metrics Sink
- 11.1 Time Series
- 11.2 Metrics Category
- 11.3 Customize a Metrics Sink
- 11.3.1 Metrics Sink SPI
- 11.3.2 Metrics Sink Configuration
- 11.4 MySQL Metrics Sink
- 11.4.1 Implement IMetricsSink
- 11.4.2 Configure the Metrics Sink
- 11.4.3 Observe Metrics
- 11.5 Summary
- 12 Extending Heron Scheduler
- 12.1 Scheduler SPI
- 12.1.1 ILauncher and IScheduler Work Together
- 12.1.2 ILauncher
- 12.1.3 IScheduler
- 12.2 Timeout Scheduler
- 12.2.1 Timeout Launcher
- 12.2.1.1 Prepare Container
- 12.2.1.2 Launch by Service
- 12.2.1.3 Launch by Library
- 12.2.2 Timeout Scheduler
- 12.2.2.1 Abstract Parent Class
- 12.2.2.2 Service Mode
- 12.2.2.3 Library Mode
- 12.2.2.4 Service Mode Versus Library Mode
- 12.3 Summary
- 13 Heron Is Evolving
- 13.1 Dhalion and Health Manager
- 13.1.1 Dhalion
- 13.1.2 Health Manager
- 13.2 Deploy Mode (API Server)
- 13.3 Cloud-Native Heron
- 13.4 Summary
- Index
- Notes:
- Includes bibliographical references and index.
- Description based on print version record.
- ISBN:
- 3-030-60094-7
- OCLC:
- 1243551051