Fundamentals of Cost-Efficient AI : In Healthcare and Biomedicine.
- Format:
- Book
- Author/Creator:
- Kumar, Rohit.
- Language:
- English
- Subjects (All):
- Artificial intelligence.
- Medical informatics.
- Physical Description:
- 1 online resource (670 pages)
- Edition:
- 1st ed.
- Place of Publication:
- Chantilly : Elsevier Science & Technology, 2025.
- Summary:
- Fundamentals of Cost-Efficient AI: In Healthcare and Biomedicine provides a comprehensive yet accessible introduction to the principles of designing, training, and deploying efficient artificial intelligence systems.
- Contents:
- Front Cover
- Fundamentals of Cost-Efficient AI
- Fundamentals of Cost-Efficient AI: In Healthcare and Biomedicine
- Copyright
- Contents
- 1 - Introduction
- Why does efficient AI matter?
- Model training pains
- Model serving pains
- Making AI accessible
- Modern machine learning
- Bringing simplicity to complexity
- Book organization
- Transformer Architectures
- Model Fine-Tuning
- Model Compression Techniques
- Efficient Reinforcement Learning
- Efficient Graph Techniques
- Training Data
- Training Data Augmentation
- Training Data Generation
- Efficient Mixture of Expert Models
- Hardware awareness
- GPU fundamentals and model inference
- Fast Matrix Multiplications
- Running Models Locally
- Expert Interviews and Use Cases
- Relevance to healthcare and biomedicine
- Book audience
- History of machine learning
- Future of AI in healthcare and biomedicine
- Types of machine learning
- All the best!
- References
- 2 - Efficient transformer architectures
- Introduction
- Attention and optimization
- Introducing attention
- Optimizing attention
- Key-value cache
- Without key-value cache
- With key-value cache
- Efficient sequence modeling
- Order matters
- Learning position encodings
- Sinusoidal positional encoding
- Sparse and localized attention
- Sparse attention
- Localized Receptive Fields
- Sliding Window Attention
- Top-K Attention
- Random feature attention
- Specialized architectures
- Linformer
- Longformer
- Routing Transformer
- Kronecker factorization
- Nystromformer
- Compressive transformer
- Perceiver
- Set Transformer
- TokenLearner
- Sinkhorn Transformer
- Funnel Transformer
- BigBird
- Swin transformer
- Long-short transformer
- CharFormer
- Axial Transformer
- Progressive sparsification
- Efficient activation functions
- ReLU linear attention
- Polynomial attention
- Cosine-based attention
- Conclusion
- AI disclosure
- 3 - Efficient model fine-tuning
- The need for fine-tuning
- Challenges and how to overcome them
- Data challenges
- Computational constraints
- Catastrophic forgetting in fine-tuning
- Parameter selection strategies
- Specification-based selection
- Heuristic-based approaches
- Learning-based approaches
- Masking techniques for parameter selection
- Reparameterization techniques
- Intrinsic dimensionality
- Low-rank adaptation and variants
- LoRA: Low-rank adaptation
- LoRA-FA: LoRA with fine-tuning activation
- MoELoRA: Mixture of Experts with LoRA
- Layer-wise LoRA
- LoRAPrune
- Incremental LoRA
- Dynamic LoRA
- Adaptive LoRA
- QLoRA: Quantized LoRA
- QA-LoRA: Quantization-aware LoRA
- Low-rank fine-tuning via quantization
- Laplace-LoRA
- KronA: Kronecker-factored adaptation
- Hybrid fine-tuning techniques
- MAM Adapter
- Universal adaptation subspace
- Gradual unfreezing
- Adapter tuning methods
- Pfeiffer adapter
- Houlsby adapter
- Parallel Adapter
- Adapter fusion
- Sparse Adapter
- Compacter
- Prefix tuning
- Optimization techniques
- Layer-wise learning rate scheduling
- Training dynamics
- Domain-adaptive pretraining
- Memory-efficient fine-tuning (MeZO)
- Key advantages of MeZO
- How MeZO works
- Tips for maximizing memory efficiency
- Tokenizer optimization
- Caching frequent tokens
- Batch tokenization strategies
- Sharpness-aware minimization
- Research context and future directions
- 4 - Model compression techniques
- Quantization
- Representing real numbers
- Floating point operations per second (FLOPS)
- Why floating point?
- Quantization types
- Downcasting
- Uniform quantization
- Symmetric versus asymmetric quantization
- Nonuniform quantization
- Power-of-two quantizer
- Quantization decisions
- Granularity
- Other layers
- Activation functions
- Calibration methods
- KL Divergence Calibration
- Entropy-based calibration
- Activation-aware Weight Quantization (AWQ)
- Selection and preservation of weights
- Scaling as an alternative
- Generative pretrained transformers quantization (GPTQ)
- Arbitrary order
- Hessian matrix
- Lazy update
- Numerical stability
- Weight grouping
- Quantization aware training (QAT)
- Quantization-aware training for Graph Neural Networks
- XNOR nets
- Mixed precision training
- Low-rank and pruning
- Low rank adaptation (LoRA)
- Why LLMs have low-rank structure
- Low rank and quantization
- Pruning types
- Pruning by structure
- Unstructured pruning
- Structured pruning
- Pruning granularity
- Temporal pruning
- Static pruning
- L2 norm-based pruning
- Geometric median based pruning
- Dynamic pruning
- Nuclear norm-based pruning
- Connection-based pruning
- Penalty-based pruning
- Sensitivity estimation algorithm
- Sparse training from scratch
- Efficiency under compute constraints
- Cramming
- Gradient checkpointing
- DeepSpeed
- Scaling principles
- Depth
- Sequence length
- Curriculum learning
- Reducing memory waste with batch shuffling
- Tucker decomposition
- Asynchronous updates
- Hyperparameters
- Model checkpointing
- Layer wise checkpointing
- Uniform checkpointing
- Nonuniform checkpointing
- Weight sharing
- K-means weight sharing
- HashNet style parameter hashing
- 5 - Efficient reinforcement learning
- What is RL?
- Core concepts
- Environment
- State
- State space approaches
- State space representation
- Dimensionality reduction
- Tile coding
- Hierarchical states
- Temporal dependencies
- Actions
- Action granularity
- Action space representation
- Discrete Actions
- Continuous actions
- Composite actions
- Action space simplification
- State-dependent action
- Action masking
- Imitation learning or expert-guided actions
- Rewards
- Dense rewards
- Sparse rewards
- Shaped rewards
- Multiobjective rewards
- Intrinsic rewards
- Inverse reinforcement learning (IRL) rewards
- Risk sensitive rewards
- Return
- Transitions
- Deterministic transitions
- Stochastic transitions
- Sparse transitions
- Dense transitions
- Simulated transitions
- Storing transition models
- Tabular representation
- Parametric models
- Learning Methods
- Model-free RL
- Value function
- Q-value
- Advantage functions
- State value function V (s)
- Action value function Q (s,a)
- Estimating the value function
- Temporal difference (TD) methods
- Estimating the Q-function and advantage
- Policy
- Value-based methods
- Policy-based methods
- Proximal Policy Optimization (PPO)
- Cost efficiency
- Deploying RL
- Constrained Policy Optimization (CPO)
- Safe policy improvement with baseline bootstrapping (SPIBB)
- Risk-sensitive RL using Conditional Value at Risk (CVaR)
- Robust adversarial RL (RARL)
- SafeOpt (Gaussian process safe exploration)
- Offline RL algorithms (CQL, BCQ)
- Interactive learning
- Robust Adversarial Imitation Learning
- Exploration-exploitation trade-off
- ε-greedy exploration
- Boltzmann exploration
- Upper Confidence Bound (UCB)
- Optimistic initialization
- Count-based exploration/directed exploration techniques
- Stochastic policies
- Thompson sampling
- Intrinsic motivation/curiosity (prediction error)
- NoisyNets
- Regret minimization
- Contextual bandits
- Optimal stopping
- Adaptive decision-making in dynamic environments
- Belief updates
- Planning horizon
- Replay and experience methods
- Hindsight Experience Replay (HER)
- Prioritized Experience Replay (PER)
- Improving value-based stability
- Overestimation bias/double DQN
- Categorical DQN/C51
- Sim2Real
- Population and search-based methods
- Population-based training (PBT)
- Monte Carlo Tree Search (MCTS)
- Factorization techniques
- Accelerated and efficient training strategies
- Accelerated learning/curriculum learning
- Efficient hyperparameter search
- Debugging RL training
- Reward design pitfalls
- 6 - Efficient graph algorithms
- Graph sampling
- Edge sampling
- Node sampling
- Snowball sampling
- Random walk sampling
- Metropolis-Hastings (MH)
- Graph representation
- Edge list
- Dynamic graphs
- Knowledge graphs
- Co-reference resolution
- Entity extraction
- Relationship classification
- End-to-end autoregressive models
- Training
- Embeddings
- Dependency-based embeddings
- n-Gram language models
- Graph representation learning
- Node similarity metrics
- Betweenness centrality
- Closeness centrality
- PageRank
- Degree centrality
- Structural Equivalence
- Eigenvector Centrality
- Graphlets
- Clustering coefficient
- SimRank
- Jaccard Similarity
- Efficiency
- Node embedding algorithms
- DeepWalk
- Node2Vec
- LINE (large-scale information network embedding)
- High-Order Proximity preserved Embedding (HOPE)
- GraRep
- Graph neural network-based approaches
- Graph convolutional network
- GraphSAGE
- FastGCN
- Graph Attention Networks (GAT)
- Graph autoencoders (GAE)
- Variational GAE (VGAE)
- Deep Graph Infomax (DGI)
- Scalable Inception Graph Neural Network (SIGN)
- Graph Transformers
- Metapath2Vec
- Community detection
- Louvain algorithm
- Notes:
- Description based on publisher supplied metadata and other sources.
- Part of the metadata in this record was created by AI, based on the text of the resource.
- ISBN:
- 0-443-33363-7
- 9780443333637
- OCLC:
- 1561172765