Parallel MATLAB for multicore and multinode computers / Jeremy Kepner.
- Format:
- Book
- Author/Creator:
- Kepner, Jeremy V., 1969-
- Series:
- Software, environments, tools ; 21.
- Software, environments, tools ; SE21
- Language:
- English
- Subjects (All):
- MATLAB.
- Parallel processing (Electronic computers).
- Multiprocessors.
- Physical Description:
- 1 electronic text (xxv, 253 p. : ill. (some col.)) : digital file.
- Place of Publication:
- Philadelphia, Pa. : Society for Industrial and Applied Mathematics (SIAM, 3600 Market Street, Floor 6, Philadelphia, PA 19104), 2009.
- Language Note:
- English
- System Details:
- Mode of access: World Wide Web.
- System requirements: Adobe Acrobat Reader.
- Summary:
- Parallel MATLAB for Multicore and Multinode Computers is the first book on parallel MATLAB and the first parallel computing book focused on the design, code, debug, and test techniques required to quickly produce well-performing parallel programs. MATLAB is currently the dominant language of technical computing with one million users worldwide, many of whom can benefit from the increased power offered by inexpensive multicore and multinode parallel computers. MATLAB is an ideal environment for learning about parallel computing, allowing the user to focus on parallel algorithms instead of the details of implementation. This book covers more parallel algorithms and parallel programming models than any other parallel programming book due to the succinctness of MATLAB and presents a "hands-on" approach with numerous example programs. Wherever possible, the examples are drawn from widely known and well-documented parallel benchmark codes representative of many real applications.
- Contents:
- I. Fundamentals
- 1. Primer: notation and interfaces
- 1.1. Algorithm notation: 1.1.1. Distributed array notation; 1.1.2. Distributed data access
- 1.2. Parallel function interfaces: 1.2.1. Map-based programming; 1.2.2. Parallel execution
- 2. Introduction to pMatlab
- 2.1. Program: Mandelbrot (fine-grained embarrassingly parallel): 2.1.1. Getting started; 2.1.2. Parallel design; 2.1.3. Code; 2.1.4. Debug; 2.1.5. Test
- 2.2. Program: ZoomImage (coarse-grained embarrassingly parallel): 2.2.1. Getting started; 2.2.2. Parallel design; 2.2.3. Code; 2.2.4. Debug; 2.2.5. Test
- 2.3. Program: ParallelIO: 2.3.1. Getting started; 2.3.2. Parallel design; 2.3.3. Code; 2.3.4. Debug; 2.3.5. Test
- 2.4. Why these worked
- 3. Interacting with distributed arrays
- 3.1. Getting started
- 3.2. Parallel design
- 3.3. Code
- 3.4. Interactive debug and test: 3.4.1. Serial code correct; 3.4.2. Parallel code correct; 3.4.3. Local communication correct; 3.4.4. Remote communication correct; 3.4.5. Measuring performance
- 3.5. Advanced topic: Introduction to parallel pipelines
- II. Advanced techniques. 4. Parallel programming models. 4.1. Design: Knowing when to "go parallel": 4.1.1. Memory and performance profiling; 4.1.2. Parallel programming patterns; 4.1.3. Blurimage parallel design
- 4.2. Coding: 4.2.1. Manager/worker; 4.2.2. Message passing; 4.2.3. Distributed arrays
- 4.3. Debug
- 4.4. Testing
- 4.5. Performance summary
- 4.6. Summary
- 5. Advanced distributed array programming. 5.1. Introduction
- 5.2. Pure versus fragmented distributed arrays
- 5.3. MATLAB distributed array interface and architecture design
- 5.4. Maps and distributions
- 5.5. Parallel support functions: 5.5.1. Cyclic distributions and parallel load balancing
- 5.6. Concurrency versus locality
- 5.7. Ease of implementation
- 5.8. Implementation specifics
- 5.8.1. Program execution
- 5.9. Array redistribution
- 5.10. Message passing layer
- 5.11. Performance concepts: 5.11.1. Performance, performance, performance; 5.11.2. Benchmarks; 5.11.3. Minimize overhead; 5.11.4. Embarrassingly parallel implies linear speedup; 5.11.5. Algorithm and mapping are orthogonal; 5.11.6. Do no harm; 5.11.7. Watch the SLOC; 5.11.8. No free lunch; 5.11.9. Four easy steps
- 5.12. User results
- 6. Performance metrics and software architecture. 6.1. Introduction
- 6.2. Characterizing a parallel application: 6.2.1. Characteristics of the example programs; 6.2.2. Serial performance metrics; 6.2.3. Degrees of parallelism; 6.2.4. Parallel performance metrics (no communication); 6.2.5. Parallel performance metrics (with communication); 6.2.6. Amdahl's Law; 6.2.7. Characterizing speedup; 6.2.8. Spatial and temporal locality
- 6.3. Standard parallel computer: 6.3.1. Network model; 6.3.2. Kuck hierarchy
- 6.4. Parallel programming models
- 6.5. System metrics: 6.5.1. Performance; 6.5.2. Form factor; 6.5.3. Efficiency; 6.5.4. Software cost; 6.5.5. Software productivity
- III. Case studies. 7. Parallel application analysis
- 7.1. Historical overview
- 7.2. Characterizing the application space: 7.2.1. Physical memory hierarchy; 7.2.2. Spatial/temporal locality; 7.2.3. Logical memory hierarchy
- 7.3. HPC Challenge: Spanning the application space: 7.3.1. Stream; 7.3.2. FFT; 7.3.3. RandomAccess; 7.3.4. HPL
- 7.4. Intrinsic algorithm performance: 7.4.1. Data structures; 7.4.2. Computational complexity; 7.4.3. Degrees of parallelism; 7.4.4. Communication complexity
- 7.5. Hardware performance: 7.5.1. Spatial and temporal locality; 7.5.2. Performance efficiency; 7.5.3. Estimating performance; 7.5.4. Analysis results; 7.5.5. Performance implications
- 7.6. Software performance: 7.6.1. Stream; 7.6.2. RandomAccess; 7.6.3. FFT; 7.6.4. HPL
- 7.7. Performance versus effort
- 8. Stream
- 8.1. Getting started
- 8.2. Parallel design
- 8.3. Code
- 8.4. Debug
- 8.5. Test
- 9. RandomAccess
- 9.1. Getting started
- 9.2. Parallel design: 9.2.1. Spray algorithm; 9.2.2. Tree algorithm
- 9.3. Code: 9.3.1. Spray code; 9.3.2. Tree code; 9.3.3. Coding summary
- 9.4. Debug
- 9.5. Test: 9.5.1. Multicore performance; 9.5.2. Multinode performance
- 10. Fast Fourier Transform. 10.1. Getting started
- 10.2. Parallel design
- 10.3. Code
- 10.4. Debug
- 10.5. Test: 10.5.1. Multicore performance; 10.5.2. Multinode performance
- 11. High Performance Linpack
- 11.1. Getting started
- 11.2. Parallel design: 11.2.1. Parallel LU; 11.2.2. Critical path analysis
- 11.3. Code
- 11.4. Debug
- 11.5. Test: 11.5.1. Multicore performance; 11.5.2. Multinode performance
- Appendix. Notation for hierarchical parallel multicore algorithms. A.1. Introduction
- A.2. Data parallelism: A.2.1. Serial algorithm; A.2.2. Parallel algorithm; A.2.3. Block parallel algorithm; A.2.4. Hierarchical parallel algorithm; A.2.5. Hierarchical block parallel algorithm
- A.3. Pipeline parallelism: A.3.1. Implicit pipeline parallel; A.3.2. Task pipeline parallel; A.3.3. Fine-grained task pipeline parallel; A.3.4. Generic hierarchical block parallel algorithm
- Index.
- Notes:
- Bibliographic Level Mode of Issuance: Monograph
- Includes bibliographical references and index.
- Description based on title page of print version.
- ISBN:
- 0-89871-812-0
- Publisher Number:
- SE21 SIAM