The art of writing efficient programs : an advanced programmer's guide to efficient hardware utilization and compiler optimizations using C++ examples / Fedor G. Pikus.
- Format:
- Book
- Author/Creator:
- Pikus, Fedor G., author.
- Language:
- English
- Subjects (All):
- Computer programming.
- Physical Description:
- 1 online resource (465 pages)
- Place of Publication:
- Birmingham, England ; Mumbai : Packt Publishing, [2021]
- Biography/History:
- Pikus, Fedor G.: Fedor G. Pikus is a Technical Fellow and head of the Advanced Projects Team in Siemens Digital Industries Software. His responsibilities include planning the long-term technical direction of Calibre products, directing and training the engineers who work on these products, the design and architecture of the software, and researching new design and software technologies. His earlier positions included Chief Scientist at Mentor Graphics (acquired by Siemens Software), Senior Software Engineer at Google, and Chief Software Architect for Calibre Design Solutions at Mentor Graphics. He joined Mentor Graphics in 1998, when he switched from academic research in computational physics to the software industry. Fedor is a recognized expert in high-performance computing and C++. He is the author of two books on C++ and software design, has presented his work at CPPNow, CPPCon, SD West, DesignCon, and in software development journals, and is also an O'Reilly author. Fedor has over 30 patents and over 100 papers and conference presentations on physics, EDA, software design, and the C++ language.
- Summary:
- Become a better programmer with performance improvement techniques such as concurrency, lock-free programming, atomic operations, parallelism, and memory management.
- Key Features:
- Learn proven techniques from a heavyweight and recognized expert in C++ and high-performance computing
- Understand the limitations of modern CPUs and their performance impact
- Find out how you can avoid writing inefficient code and get the best optimizations from the compiler
- Learn the tradeoffs and costs of writing high-performance programs
- Book Description:
- The great free lunch of "performance taking care of itself" is over. Until recently, programs got faster by themselves as CPUs were upgraded, but that doesn't happen anymore. The clock frequency of new processors has almost peaked, and new architectures bring only small improvements to existing programs. To write efficient software, you now have to know how to make good use of the available computing resources, and this book will teach you how to do that. The Art of Writing Efficient Programs covers all the major aspects of writing efficient programs, such as using CPU resources and memory efficiently, avoiding unnecessary computations, measuring performance, and putting concurrency and multithreading to good use. You'll also learn about compiler optimizations and how to use the programming language (C++) more efficiently. Finally, you'll understand how design decisions impact performance. By the end of this book, you'll not only have enough knowledge of processors and compilers to write efficient programs, but you'll also be able to understand which techniques to use and what to measure while improving performance. At its core, this book is about learning how to learn.
- What you will learn:
- Discover how to use the hardware computing resources in your programs effectively
- Understand the relationship between memory order and memory barriers
- Familiarize yourself with the performance implications of different data structures and organizations
- Assess the performance impact of concurrent memory access and how to minimize it
- Discover when to use and when not to use lock-free programming techniques
- Explore different ways to improve the effectiveness of compiler optimizations
- Design APIs for concurrent data structures and high-performance data structures to avoid inefficiencies
- Who this book is for:
- This book is for experienced developers and programmers who work on performance-critical projects and want to learn new techniques to improve the performance of their code. Programmers in algorithmic trading, gaming, bioinformatics, computational genomics, or computational fluid dynamics communities will get the most out of the examples in this book, but the techniques are fairly universal. Although this book uses the C++ language, the concepts demonstrated in the book can be easily transferred or applied to other compiled languages such as C, Java, Rust, Go, and more.
- Contents:
- Cover
- Title Page
- Copyright and Credits
- Contributors
- Table of Contents
- Preface
- Section 1 - Performance Fundamentals
- Chapter 1: Introduction to Performance and Concurrency
- Why focus on performance?
- Why performance matters
- What is performance?
- Performance as throughput
- Performance as power consumption
- Performance for real-time applications
- Performance as dependent on context
- Evaluating, estimating, and predicting performance
- Learning about high performance
- Summary
- Questions
- Chapter 2: Performance Measurements
- Technical requirements
- Performance measurements by example
- Performance benchmarking
- C++ chrono timers
- High-resolution timers
- Performance profiling
- The perf profiler
- Detailed profiling with perf
- The Google Performance profiler
- Profiling with call graphs
- Optimization and inlining
- Practical profiling
- Micro-benchmarking
- Basics of micro-benchmarking
- Micro-benchmarking and compiler optimizations
- Google Benchmark
- Micro-benchmarks are lies
- Chapter 3: CPU Architecture, Resources, and Performance
- The performance begins with the CPU
- Probing performance with micro-benchmarks
- Visualizing instruction-level parallelism
- Data dependencies and pipelining
- Pipelining and branches
- Branch prediction
- Profiling for branch mispredictions
- Speculative execution
- Optimization of complex conditions
- Branchless computing
- Loop unrolling
- Branchless selection
- Branchless computing examples
- Chapter 4: Memory Architecture and Performance
- The performance begins with the CPU but does not end there
- Measuring memory access speed
- Memory architecture
- Measuring memory and cache speeds
- The speed of memory: the numbers
- The speed of random memory access
- The speed of sequential memory access
- Memory performance optimizations in hardware
- Optimizing memory performance
- Memory-efficient data structures
- Profiling memory performance
- Optimizing algorithms for memory performance
- The ghost in the machine
- What is Spectre?
- Spectre by example
- Spectre, unleashed
- Chapter 5: Threads, Memory, and Concurrency
- Understanding threads and concurrency
- What is a thread?
- Symmetric multi-threading
- Threads and memory
- Memory-bound programs and concurrency
- Understanding the cost of memory synchronization
- Why data sharing is expensive
- Learning about concurrency and order
- The need for order
- Memory order and memory barriers
- Memory order in C++
- Memory model
- Section 2 - Advanced Concurrency
- Chapter 6: Concurrency and Performance
- What is needed to use concurrency effectively?
- Locks, alternatives, and their performance
- Lock-based, lock-free, and wait-free programs
- Different locks for different problems
- Lock-based versus lock-free, what is the real difference?
- Building blocks for concurrent programming
- The basics of concurrent data structures
- Counters and accumulators
- Publishing protocol
- Smart pointers for concurrent programming
- Chapter 7: Data Structures for Concurrency
- What is a thread-safe data structure?
- The best kind of thread safety
- The real thread safety
- The thread-safe stack
- Interface design for thread safety
- Performance of mutex-guarded data structures
- Performance requirements for different uses
- Stack performance in detail
- Performance estimates for synchronization schemes
- Lock-free stack
- The thread-safe queue
- Lock-free queue
- Non-sequentially consistent data structures
- Memory management for concurrent data structures
- The thread-safe list
- Lock-free list
- Chapter 8: Concurrency in C++
- Concurrency support in C++11
- Concurrency support in C++17
- Concurrency support in C++20
- The foundations of coroutines
- Coroutine C++ syntax
- Coroutine examples
- Section 3 - Designing and Coding High-Performance Programs
- Chapter 9: High-Performance C++
- What is the efficiency of a programming language?
- Unnecessary copying
- Copying and argument passing
- Copying as an implementation technique
- Copying to store data
- Copying of return values
- Using pointers to avoid copying
- How to avoid unnecessary copying
- Inefficient memory management
- Unnecessary memory allocations
- Memory management in concurrent programs
- Avoiding memory fragmentation
- Optimization of conditional execution
- Chapter 10: Compiler Optimizations in C++
- Compilers optimizing code
- Basics of compiler optimizations
- Function inlining
- What does the compiler really know?
- Lifting knowledge from runtime to compile time
- Chapter 11: Undefined Behavior and Performance
- What is undefined behavior?
- Why have undefined behavior?
- Undefined behavior and C++ optimization
- Using undefined behavior for efficient design
- Chapter 12: Design for Performance
- Interaction between the design and performance
- Design for performance
- The minimum information principle
- The maximum information principle
- API design considerations
- API design for concurrency
- Copying and sending data
- Design for optimal data access
- Performance trade-offs
- Interface design
- Component design
- Errors and undefined behavior
- Making informed design decisions
- Assessments
- Chapter 1:
- Chapter 2:
- Chapter 3:
- Chapter 4:
- Chapter 5:
- Chapter 6:
- Chapter 7:
- Chapter 8:
- Chapter 9:
- Chapter 10:
- Chapter 11:
- Chapter 12:
- Other Books You May Enjoy
- Index
- Notes:
- Includes index.
- Description based on print version record.
- ISBN:
- 1-80020-274-1
- OCLC:
- 1276852437