Fundamentals of Cost-Efficient AI : In Healthcare and Biomedicine.

Elsevier ScienceDirect eBook - Biomedical Science 2025 Available online

Format:
Book
Author/Creator:
Kumar, Rohit.
Language:
English
Subjects (All):
Artificial intelligence.
Medical informatics.
Physical Description:
1 online resource (670 pages)
Edition:
1st ed.
Place of Publication:
Chantilly : Elsevier Science & Technology, 2025.
Summary:
Fundamentals of Cost-Efficient AI: In Healthcare and Biomedicine provides a comprehensive yet accessible introduction to the principles of designing, training, and deploying efficient artificial intelligence systems.
Contents:
Front Cover
Fundamentals of Cost-Efficient AI
Fundamentals of Cost-Efficient AI: In Healthcare and Biomedicine
Copyright
Contents
1 - Introduction
Why does efficient AI matter?
Model training pains
Model serving pains
Making AI accessible
Modern machine learning
Bringing simplicity to complexity
Book organization
Transformer Architectures
Model Fine-Tuning
Model Compression Techniques
Efficient Reinforcement Learning
Efficient Graph Techniques
Training Data
Training Data Augmentation
Training Data Generation
Efficient Mixture of Expert Models
Hardware awareness
GPU fundamentals and model inference
Fast Matrix Multiplications
Running Models Locally
Expert Interviews and Use Cases
Relevance to healthcare and biomedicine
Book audience
History of machine learning
Future of AI in healthcare and biomedicine
Types of machine learning
All the best!
References
2 - Efficient transformer architectures
Introduction
Attention and optimization
Introducing attention
Optimizing attention
Key-value cache
Without key-value cache
With key-value cache
Efficient sequence modeling
Order matters
Learning position encodings
Sinusoidal positional encoding
Sparse and localized attention
Sparse attention
Localized Receptive Fields
Sliding Window Attention
Top-K Attention
Random feature attention
Specialized architectures
Linformer
Longformer
Routing Transformer
Kronecker factorization
Nystromformer
Compressive transformer
Perceiver
Set Transformer
TokenLearner
Sinkhorn Transformer
Funnel Transformer
BigBird
Swin transformer
Long-short transformer
CharFormer
Axial Transformer
Progressive sparsification
Efficient activation functions
ReLU linear attention
Polynomial attention
Cosine-based attention
Conclusion
AI disclosure
3 - Efficient model fine-tuning
The need for fine-tuning
Challenges and how to overcome them
Data challenges
Computational constraints
Catastrophic forgetting in fine-tuning
Parameter selection strategies
Specification-based selection
Heuristic-based approaches
Learning-based approaches
Masking techniques for parameter selection
Reparameterization techniques
Intrinsic dimensionality
Low-rank adaptation and variants
LoRA: Low-rank adaptation
LoRA-FA: LoRA with fine-tuning activation
MoELoRA: Mixture of Experts with LoRA
Layer-wise LoRA
LoRAPrune
Incremental LoRA
Dynamic LoRA
Adaptive LoRA
QLoRA: Quantized LoRA
QA-LoRA: Quantization-aware LoRA
Low-rank fine-tuning via quantization
Laplace-LoRA
KronA: Kronecker-factored adaptation
Hybrid fine-tuning techniques
MAM Adapter
Universal adaptation subspace
Gradual unfreezing
Adapter tuning methods
Pfeiffer adapter
Houlsby adapter
Parallel Adapter
Adapter fusion
Sparse Adapter
Compacter
Prefix tuning
Optimization techniques
Layer-wise learning rate scheduling
Training dynamics
Domain-adaptive pretraining
Memory-efficient fine-tuning (MeZO)
Key advantages of MeZO
How MeZO works
Tips for maximizing memory efficiency
Tokenizer optimization
Caching frequent tokens
Batch tokenization strategies
Sharpness-aware minimization
Research context and future directions
4 - Model compression techniques
Quantization
Representing real numbers
Floating point operations per second (FLOPS)
Why floating point?
Quantization types
Downcasting
Uniform quantization
Symmetric versus asymmetric quantization
Nonuniform quantization
Power-of-two quantizer
Quantization decisions
Granularity
Other layers
Activation functions
Calibration methods
KL Divergence Calibration
Entropy-based calibration
Activation-aware Weight Quantization (AWQ)
Selection and preservation of weights
Scaling as an alternative
Generative pretrained transformers quantization (GPTQ)
Arbitrary order
Hessian matrix
Lazy update
Numerical stability
Weight grouping
Quantization aware training (QAT)
Quantization-aware training for Graph Neural Networks
XNOR nets
Mixed precision training
Low-rank and pruning
Low rank adaptation (LoRA)
Why LLMs have low-rank structure
Low rank and quantization
Pruning types
Pruning by structure
Unstructured pruning
Structured pruning
Pruning granularity
Temporal pruning
Static pruning
L2 norm-based pruning
Geometric median based pruning
Dynamic pruning
Nuclear norm-based pruning
Connection-based pruning
Penalty-based pruning
Sensitivity estimation algorithm
Sparse training from scratch
Efficiency under compute constraints
Cramming
Gradient checkpointing
DeepSpeed
Scaling principles
Depth
Sequence length
Curriculum learning
Reducing memory waste with batch shuffling
Tucker decomposition
Asynchronous updates
Hyperparameters
Model checkpointing
Layer wise checkpointing
Uniform checkpointing
Nonuniform checkpointing
Weight sharing
K-means weight sharing
HashNet style parameter hashing
5 - Efficient reinforcement learning
What is RL?
Core concepts
Environment
State
State space approaches
State space representation
Dimensionality reduction
Tile coding
Hierarchical states
Temporal dependencies
Actions
Action granularity
Action space representation
Discrete Actions
Continuous actions
Composite actions
Action space simplification
State-dependent action
Action masking
Imitation learning or expert-guided actions
Rewards
Dense rewards
Sparse rewards
Shaped rewards
Multiobjective rewards
Intrinsic rewards
Inverse reinforcement learning (IRL) rewards
Risk sensitive rewards
Return
Transitions
Deterministic transitions
Stochastic transitions
Sparse transitions
Dense transitions
Simulated transitions
Storing transition models
Tabular representation
Parametric models
Learning Methods
Model-free RL
Value function
Q-value
Advantage functions
State value function V (s)
Action value function Q (s,a)
Estimating the value function
Temporal difference (TD) methods
Estimating the Q-function and advantage
Policy
Value-based methods
Policy-based methods
Proximal Policy Optimization (PPO)
Cost efficiency
Deploying RL
Constrained Policy Optimization (CPO)
Safe policy improvement with baseline bootstrapping (SPIBB)
Risk-sensitive RL using Conditional Value at Risk (CVaR)
Robust adversarial RL (RARL)
SafeOpt (Gaussian process safe exploration)
Offline RL algorithms (CQL, BCQ)
Interactive learning
Robust Adversarial Imitation Learning
Exploration-exploitation trade-off
ε-greedy exploration
Boltzmann exploration
Upper Confidence Bound (UCB)
Optimistic initialization
Count-based exploration/directed exploration techniques
Stochastic policies
Thompson sampling
Intrinsic motivation/curiosity (prediction error)
NoisyNets
Regret minimization
Contextual bandits
Optimal stopping
Adaptive decision-making in dynamic environments
Belief updates
Planning horizon
Replay and experience methods
Hindsight Experience Replay (HER)
Prioritized Experience Replay (PER)
Improving value-based stability
Overestimation bias/double DQN
Categorical DQN/C51
Sim2Real
Population and search-based methods
Population-based training (PBT)
Monte Carlo Tree Search (MCTS)
Factorization techniques
Accelerated and efficient training strategies
Accelerated learning/curriculum learning
Efficient hyper-parameter search
Debugging RL training
Reward design pitfalls
6 - Efficient graph algorithms
Graph sampling
Edge sampling
Node sampling
Snowball sampling
Random walk sampling
Metropolis-Hastings (MH)
Graph representation
Edge list
Dynamic graphs
Knowledge graphs
Co-reference resolution
Entity extraction
Relationship classification
End-to-end autoregressive models
Training
Embeddings
Dependency-based embeddings
n-Gram language models
Graph representation learning
Node similarity metrics
Betweenness centrality
Closeness centrality
PageRank
Degree centrality
Structural Equivalence
Eigenvector Centrality
Graphlets
Clustering coefficient
SimRank
Jaccard Similarity
Efficiency
Node embedding algorithms
DeepWalk
Node2Vec
LINE (large-scale information network embedding)
High-Order Proximity preserved Embedding (HOPE)
GraRep
Graph neural network-based approaches
Graph convolutional network
GraphSAGE
FastGCN
Graph Attention Networks (GAT)
Graph autoencoders (GAE)
Variational GAE (VGAE)
Deep Graph Infomax (DGI)
Scalable Inception Graph Neural Network (SIGN)
Graph Transformers
Metapath2Vec
Community detection
Louvain algorithm
Notes:
Description based on publisher supplied metadata and other sources.
Part of the metadata in this record was created by AI, based on the text of the resource.
ISBN:
0-443-33363-7
9780443333637
OCLC:
1561172765
