Kubernetes Autoscaling : Build Efficient, Cost-Optimized Clusters with KEDA and Karpenter.
- Format:
- Book
- Author/Creator:
- Melendez, Christian.
- Language:
- English
- Subjects (All):
- Kubernetes.
- Cloud computing.
- Virtual computer systems.
- Physical Description:
- 1 online resource (421 pages)
- Edition:
- 1st ed.
- Place of Publication:
- Birmingham : Packt Publishing, Limited, 2025.
- Summary:
- Learn to scale workloads efficiently, reduce cloud costs, and optimize performance with real-world strategies for event-driven and infrastructure-level scaling. Key Features: Autoscale Kubernetes workloads and infrastructure using KEDA and Karpenter. Improve performance, reduce cloud costs, and eliminate resource waste with smarter scaling. Work with.
- Contents:
- Intro
- FM
- Copyright
- Forewords
- Contributors
- Table of Contents
- Preface
- Free Benefits with Your Book
- Part 1: Getting started with Kubernetes Autoscaling
- Chapter 1: Introduction to Kubernetes Autoscaling
- Technical requirements
- Scalability foundations
- A bit of history
- Horizontal and vertical scaling
- Vertical scaling
- Horizontal scaling
- Kubernetes architecture
- Efficient Kubernetes data planes
- What do I mean by efficiency?
- Challenges and considerations
- Kubernetes autoscaling categories
- Application workloads
- Data plane nodes
- Hands-on lab: Creating a Kubernetes cluster
- Local Kubernetes cluster with Kind
- Installing Kind
- Creating a Kind cluster
- Cloud Kubernetes cluster in AWS
- Creating an Amazon EKS cluster
- Removing the Amazon EKS cluster
- Summary
- Get This Book's PDF Version and Exclusive Extras
- Chapter 2: Workload Autoscaling Overview
- Challenges of autoscaling workloads
- How does the Kubernetes scheduler work?
- Configuring requests
- Configuring limits
- What if you don't specify resource requests or limits?
- Pod configuration example
- What if the pod exceeds the resource limits?
- Recommendations for configuring resources and limits
- Workload rightsizing
- Monitoring
- Prometheus and Grafana
- Hands-on lab: Setting up Prometheus and Grafana
- Hands-on lab: Determining the right size of an application
- Establishing defaults
- Establishing default requests and limits
- Hands-on lab: Setting up default requests for CPU and memory
- Workload autoscalers
- HorizontalPodAutoscaler (HPA)
- VerticalPodAutoscaler (VPA)
- Kubernetes Event-driven Autoscaling (KEDA)
- Chapter 3: Workload Autoscaling with HPA and VPA
- Technical requirements
- The Kubernetes Metrics Server
- The Metrics Server: The what and the why
- Hands-on lab: Setting up the Metrics Server
- Hands-on lab: Using the Metrics Server
- HPA basics
- How HPA scales resources
- Defining HPA scaling policies
- Hands-on lab: Scaling using basic metrics with HPA
- Deploy the sample application
- Create the HPA autoscaling policy
- Run load tests
- Watch autoscaling working
- HPA and custom metrics
- How does HPA work with custom metrics?
- Hands-on lab: Scaling using custom metrics with HPA
- Deploy the Prometheus Adapter
- Deploy the Service monitor
- Run load tests and see HPA in action
- VPA basics
- How VPA scales resources
- Defining VPA scaling policies
- Hands-on lab: Automatic vertical scaling with VPA
- Deploy VPA components
- Run load tests and see VPA in action
- How to work with HPA and VPA together
- Part 2: Workload Autoscaling and KEDA
- Chapter 4: Kubernetes Event-Driven Autoscaling - Part 1
- KEDA: What it is and why you need it
- KEDA's architecture
- KEDA Operator
- KEDA Metrics Server
- Admission Webhooks
- What Kubernetes objects can KEDA scale?
- KEDA scalers
- KEDA CRDs
- ScaledObject
- ScaledJob
- TriggerAuthentication and ClusterTriggerAuthentication
- Hands-on lab: Installing KEDA
- Scaling Deployments
- Hands-on lab: Scaling using latency
- Controlling autoscaling speed
- Scaling to zero
- Hands-on lab: Scaling from/to zero
- Deploying a RabbitMQ queue
- Deploying the sample application
- Deploying a message producer
- Watching KEDA in action
- Cleanup
- Scaling Jobs
- Hands-on lab: Scaling jobs
- Exploring the ScaledJob rule
- Deploying the sample application
- Chapter 5: Kubernetes Event-Driven Autoscaling - Part 2
- Autoscaling in KEDA continued
- Scaling based on schedule
- Hands-on lab: Scaling to zero during non-working hours
- KEDA's HTTP add-on
- What if a KEDA scaler is not available?
- Caching metrics
- Pausing autoscaling
- Fallback scaling actions
- Advanced autoscaling features
- Scaling modifiers
- Important considerations with scalingModifiers
- Hands-on lab: Pausing autoscaling when resources are constrained
- Extending KEDA with external scalers
- KEDA with cloud providers
- KEDA on Amazon EKS
- KEDA on Azure Kubernetes Service
- KEDA on Google Kubernetes Engine
- Chapter 6: Workload Autoscaling Operations
- Workload autoscaling operations
- Troubleshooting workload autoscaling
- Troubleshooting HPA
- Reviewing HPA conditions and events
- Checking Metrics Server logs
- Troubleshooting VPA
- Reviewing the events and status of the VPA objects
- Reading the logs from VPA components
- Verifying why VPA can't apply recommendations
- Troubleshooting KEDA
- Reviewing events from ScaledObject or ScaledJob objects
- Reading the logs from KEDA's components
- Monitoring KEDA
- Hands-on lab: Deploying KEDA's Grafana dashboard
- Upgrading KEDA
- Best practices for workload efficiency
- Part 3: Node Autoscaling and Karpenter
- Chapter 7: Data Plane Autoscaling Overview
- What is data plane autoscaling?
- Data plane autoscalers
- CAS
- Karpenter
- CAS on AWS
- Hands-on lab: CAS on AWS
- Step 1: Deploy CAS
- Step 2: Remove VPA rules that could resize the sample application
- Step 3: Deploy a sample application to see how CAS adds new nodes
- Step 4: Remove the sample application to see how CAS removes unnecessary nodes
- Step 5: Uninstall CAS
- CAS best practices
- Use homogeneous instance types
- Limit the number of node groups
- Configure a sane scaling speed
- Use a priority expander with multiple nodes
- Use a single Availability Zone for persistent volumes
- Relevant autoscalers
- Descheduler
- CPA
- CPVA
- Chapter 8: Node Autoscaling with Karpenter - Part 1
- What is Karpenter?
- How does Karpenter work in AWS?
- History of Karpenter
- Karpenter resources
- NodeClass
- NodePool
- NodePool requirements
- General recommendations
- NodeClaim
- Launching nodes
- 1. Scheduling
- 2. Batching
- 3. Bin-packing
- 4. Launching the request
- Deploying Karpenter
- Hands-on lab: Autoscaling with Karpenter
- Creating a fresh EKS cluster
- Deploying a sample application
- Deploying a default NodePool and EC2NodeClass
- Removing the sample application
- Workload scheduling constraints
- Node selector
- Node affinity
- Pod affinity and pod anti-affinity
- Tolerations and taints
- Topology spread constraints
- Scheduling nuances and minDomains for topology spread
- PersistentVolume topology and zonal placement
- Requesting specific hardware
- minValues
- NodePool weights
- Chapter 9: Node Autoscaling with Karpenter - Part 2
- Removing nodes
- Control flow architecture
- Disruption controller
- Termination controller
- Disruption
- Consolidation
- What happens when Karpenter skips consolidation?
- Hands-on lab: Consolidating nodes after scaling events
- Drift
- What causes drift?
- How drift is detected
- Managing drift carefully
- Hands-on lab: Drifted nodes after updating AMI
- Expiring nodes
- Graceful node termination
- NodePool disruption budgets
- Disruptions by schedule
- Disruption by reasons
- Multiple disruption budgets
- Hands-on lab: Disruption budgets
- Working with PDBs
- Karpenter best practices
- NodePools
- Workloads
- Chapter 10: Karpenter Management Operations
- Advanced operations and troubleshooting
- Troubleshooting Karpenter
- Getting Karpenter events
- Events reference table
- Getting Karpenter logs
- Exploring status conditions
- Common operational challenges
- Cost spikes
- Consolidation churn
- Having extra capacity due to scheduling preferences
- DaemonSet taints cause the launch of extra nodes
- Node provisioning challenges
- Subnet and security group selection
- Invalid launch templates or block device mappings
- IP address exhaustion in subnets
- Insufficient Instance Capacity error (ICE)
- Quota limits
- Consolidation failures
- Pods remain pending
- Node Auto Repair for unhealthy node handling
- Preventing cascading failures
- Enabling NAR
- Hands-on lab: NAR in action
- 1. Enable the NodeRepair feature
- 2. Check that the Node Monitoring Agent is running
- 3. Inject a failure into the node
- 4. Wait for Karpenter to replace the node
- Upgrading Karpenter
- 1. Pre-upgrade checks
- 2. Stage the upgrade
- 3. Upgrade process
- 4. Post-upgrade verification
- Observability and metrics
- Cluster resource efficiency metrics
- Provisioning reliability and performance metrics
- Hands-on lab: Deploy the Karpenter Grafana dashboard
- Summary
- Get This Book's PDF Version and Exclusive Extras
- Notes:
- Description based on publisher supplied metadata and other sources.
- ISBN:
- 1-83664-382-9
- 9781836643821
- OCLC:
- 1550453854
- Publisher Number:
- CIPO000303198
- CIPO000303199