Using Stable Diffusion with Python : Leverage Python to Control and Automate High-Quality AI Image Generation Using Stable Diffusion / Andrew Zhu (Shudong Zhu) and Matthew Fisher.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Zhu, Andrew (Shudong Zhu), author.
Fisher, Matthew, author.
Language:
English
Subjects (All):
Python (Computer program language).
Artificial intelligence.
Physical Description:
1 online resource (352 pages)
Edition:
First edition.
Place of Publication:
Birmingham, England : Packt Publishing, [2024]
Biography/History:
Zhu (Shudong Zhu) Andrew: Andrew Zhu is an experienced Microsoft Applied Data Scientist with over 15 years of experience in the tech field. He is a highly regarded writer known for his ability to explain complex concepts in machine learning and AI in an engaging and informative manner. Andrew frequently contributes articles to Towards Data Science and other prominent tech publishers. He has authored the book "Microsoft Workflow Foundation 4.0 Cookbook," which has received a 4.5-star review. Andrew has a strong command of programming languages such as C/C++, Java, C#, and JavaScript, with his current focus primarily on Python. With a passion for AI and automation, Andrew resides in WA, US, with his family, which includes two boys.
Summary:
Master AI image generation by leveraging GenAI tools and techniques such as diffusers, LoRA, textual inversion, ControlNet, and prompt design.
Key Features
Master the art of generating stunning AI artwork with the help of expert guidance and ready-to-run Python code
Get instant access to emerging extensions and open-source models
Leverage the power of community-shared models and LoRA to produce high-quality images that captivate audiences
Purchase of the print or Kindle book includes a free PDF eBook
Book Description
Stable Diffusion is a game-changing AI tool for image generation, enabling you to create stunning artwork with code. However, mastering it requires an understanding of the underlying concepts and techniques. This book guides you through unlocking the full potential of Stable Diffusion with Python. Starting with an introduction to Stable Diffusion, you'll explore the theory behind diffusion models, set up your environment, and generate your first image using diffusers. You'll learn how to optimize performance, leverage custom models, and integrate community-shared resources like LoRAs, textual inversion, and ControlNet to enhance your creations. After covering techniques such as face restoration, image upscaling, and image restoration, you'll focus on unlocking prompt limitations, scheduled prompt parsing, and weighted prompts to create a fully customized and industry-level Stable Diffusion application. This book also delves into real-world applications in medical imaging, remote sensing, and photo enhancement. Finally, you'll gain insights into extracting generation data, ensuring data persistence, and leveraging AI models like BLIP for image description extraction. By the end of this book, you'll be able to use Python to generate and edit images and leverage solutions to build Stable Diffusion apps for your business and users.
What you will learn
Explore core concepts and applications of Stable Diffusion and set up your environment for success
Refine performance, manage VRAM usage, and leverage community-driven resources like LoRAs and textual inversion
Harness the power of ControlNet, IP-Adapter, and other methodologies to generate images with unprecedented control and quality
Explore developments in Stable Diffusion such as video generation using AnimateDiff
Write effective prompts and leverage LLMs to automate the process
Discover how to train a Stable Diffusion LoRA from scratch
Who this book is for
If you're looking to gain control over AI image generation, particularly through the diffusion model, this book is for you. Moreover, data scientists, ML engineers, researchers, and Python application developers seeking to create AI image generation applications based on the Stable Diffusion framework can benefit from the insights provided in the book.
Contents:
Intro
Title Page
Copyright and Credits
Dedication
Foreword
Contributors
Table of Contents
Preface
Part 1 - A Whirlwind of Stable Diffusion
Chapter 1: Introducing Stable Diffusion
Evolution of the Diffusion model
Before Transformer and Attention
Transformer transforms machine learning
CLIP from OpenAI makes a big difference
Generate images
DALL-E 2 and Stable Diffusion
Why Stable Diffusion
Which Stable Diffusion to use
Why this book
References
Chapter 2: Setting Up the Environment for Stable Diffusion
Hardware requirements to run Stable Diffusion
GPU
System memory
Storage
Software requirements
CUDA installation
Installing Python for Windows, Linux, and macOS
Installing PyTorch
Running a Stable Diffusion pipeline
Using Google Colaboratory
Using Google Colab to run a Stable Diffusion pipeline
Summary
Chapter 3: Generating Images Using Stable Diffusion
Logging in to Hugging Face
Generating an image
Generation seed
Sampling scheduler
Changing a model
Guidance scale
Chapter 4: Understanding the Theory Behind Diffusion Models
Understanding the image-to-noise process
A more efficient forward diffusion process
The noise-to-image training process
The noise-to-image sampling process
Understanding Classifier Guidance denoising
Chapter 5: Understanding How Stable Diffusion Works
Stable Diffusion in latent space
Generating latent vectors using diffusers
Generating text embeddings using CLIP
Initializing time step embeddings
Initializing the Stable Diffusion UNet
Implementing a text-to-image Stable Diffusion inference pipeline
Implementing a text-guided image-to-image Stable Diffusion inference pipeline
References
Additional reading
Chapter 6: Using Stable Diffusion Models
Technical requirements
Loading the Diffusers model
Loading model checkpoints from safetensors and ckpt files
Using ckpt and safetensors files with Diffusers
Turning off the model safety checker
Converting the checkpoint model file to the Diffusers format
Using Stable Diffusion XL
Part 2 - Improving Diffusers with Custom Features
Chapter 7: Optimizing Performance and VRAM Usage
Setting the baseline
Optimization solution 1 - using the float16 or bfloat16 data type
Optimization solution 2 - enabling VAE tiling
Optimization solution 3 - enabling Xformers or using PyTorch 2.0
Optimization solution 4 - enabling sequential CPU offload
Optimization solution 5 - enabling model CPU offload
Optimization solution 6 - Token Merging (ToMe)
Chapter 8: Using Community-Shared LoRAs
How does LoRA work?
Using LoRA with Diffusers
Applying a LoRA weight during loading
Diving into the internal structure of LoRA
Finding the A and B weight matrix from the LoRA file
Finding the corresponding checkpoint model layer name
Updating the checkpoint model weights
Making a function to load LoRA
Why LoRA works
Chapter 9: Using Textual Inversion
Diffusers inference using TI
How TI works
Building a custom TI loader
TI in the pt file format
TI in bin file format
Detailed steps to build a TI loader
Putting all of the code together
Chapter 10: Overcoming 77-Token Limitations and Enabling Prompt Weighting
Understanding the 77-token limitation
Overcoming the 77-token limitation
Putting all the code together into a function
Enabling long prompts with weighting
Verifying the work
Overcoming the 77-token limitation using community pipelines
Chapter 11: Image Restore and Super-Resolution
Understanding the terminologies
Upscaling images using Img2img diffusion
One-step super-resolution
Multiple-step super-resolution
A super-resolution result comparison
Img-to-Img limitations
ControlNet Tile image upscaling
Steps to use ControlNet Tile to upscale an image
The ControlNet Tile upscaling result
Additional ControlNet Tile upscaling samples
Chapter 12: Scheduled Prompt Parsing
Using the Compel package
Building a custom scheduled prompt pipeline
A scheduled prompt parser
Filling in the missing steps
A Stable Diffusion pipeline supporting scheduled prompts
Part 3 - Advanced Topics
Chapter 13: Generating Images with ControlNet
What is ControlNet and how is it different?
Usage of ControlNet
Using multiple ControlNets in one pipeline
How ControlNet works
Further usage
More ControlNets with SD
SDXL ControlNets
Chapter 14: Generating Video Using Stable Diffusion
The principles of text-to-video generation
Practical applications of AnimateDiff
Utilizing Motion LoRA to control animation motion
Chapter 15: Generating Image Descriptions Using BLIP-2 and LLaVA
BLIP-2 - Bootstrapping Language-Image Pre-training
How BLIP-2 works
Using BLIP-2 to generate descriptions
LLaVA - Large Language and Vision Assistant
How LLaVA works
Installing LLaVA
Using LLaVA to generate image descriptions
Chapter 16: Exploring Stable Diffusion XL
What's new in SDXL?
The VAE of the SDXL
The UNet of SDXL
Two text encoders in SDXL
The two-stage design
Using SDXL
Use SDXL community models
Using SDXL image-to-image to enhance an image
Using SDXL LoRA models
Using SDXL with an unlimited prompt
Chapter 17: Building Optimized Prompts for Stable Diffusion
What makes a good prompt?
Be clear and specific
Be descriptive
Using consistent terminology
Reference artworks and styles
Incorporate negative prompts
Iterate and refine
Using LLMs to generate better prompts
Part 4 - Building Stable Diffusion into an Application
Chapter 18: Applications - Object Editing and Style Transferring
Editing images using Stable Diffusion
Replacing image background content
Removing the image background
Object and style transferring
Loading up a Stable Diffusion pipeline with IP-Adapter
Transferring style
Chapter 19: Generation Data Persistence
Exploring and understanding the PNG file structure
Saving extra text data in a PNG image file
PNG extra data storage limitation
Chapter 20: Creating Interactive User Interfaces
Introducing Gradio
Getting started with Gradio
Gradio fundamentals
Gradio Blocks
Inputs and outputs
Building a progress bar
Building a Stable Diffusion text-to-image pipeline with Gradio
Chapter 21: Diffusion Model Transfer Learning
Training a neural network model with PyTorch
Preparing the training data
Preparing for training
Training a model
Training a model with Hugging Face's Accelerate
Applying Hugging Face's Accelerate
Putting code together
Training a model with multiple GPUs using Accelerate
Training a Stable Diffusion V1.5 LoRA
Defining training hyperparameters
Preparing the Stable Diffusion components
Loading the training data
Defining the training components
Kicking off the training
Verifying the result
Chapter 22: Exploring Beyond Stable Diffusion
What sets this AI wave apart
The enduring value of mathematics and programming
Staying current with AI innovations
Cultivating responsible, ethical, private, and secure AI
Our evolving relationship with AI
Index
Other Books You May Enjoy
Notes:
Includes bibliographical references and index.
Description based on publisher supplied metadata and other sources.
Description based on print version record.
ISBN:
9781835084311
1835084311
OCLC:
1435803070
