Networks-on-chip : from implementations to programming paradigms / Sheng Ma [and three others] ; editor-in-chief Zhiying Wang.

Ebook Central Academic Complete Available online


O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Ma, Sheng, author.
Contributor:
Wang, Zhiying, editor.
Language:
English
Subjects (All):
Networks on a chip--Design and construction.
Networks on a chip.
Networks on a chip--Reliability.
Physical Description:
1 online resource (383 p.)
Edition:
First edition.
Place of Publication:
Waltham, Massachusetts : Morgan Kaufmann, 2015.
Language Note:
English
System Details:
text file
Summary:
Networks-on-Chip: From Implementations to Programming Paradigms provides a thorough, bottom-up exploration of the whole NoC design space in a coherent and uniform fashion, from low-level router, buffer and topology implementations, to routing and flow control schemes, to co-optimizations of NoCs and high-level programming paradigms. This textbook is intended for an advanced course on computer architecture, suitable for graduate students or senior undergraduates who want to specialize in computer architecture and networks-on-chip. It is also intended for industry practitioners in microprocessor design, especially many-core processor design with a network-on-chip. Graduates can learn many practical and theoretical lessons from this course, and can also be motivated to delve further into the ideas and designs proposed in this book. Industrial engineers can refer to this book to make practical tradeoffs as well. Graduates and engineers who focus on off-chip network design can also refer to this book to achieve deadlock-free routing algorithm designs.
Provides a thorough and insightful exploration of the NoC design space, with descriptions ranging from low-level logic implementations to co-optimizations of high-level programming paradigms and NoCs.
The coherent and uniform format offers readers a clear, quick and efficient exploration of the NoC design space.
Covers many novel and exciting research ideas, which encourage researchers to delve further into these topics.
Presents both engineering and theoretical contributions. The detailed descriptions, comparisons and analyses of the router, buffer and topology implementations are of high engineering value.
Contents:
Front Cover
Networks-on-Chip: From Implementations to Programming Paradigms
Copyright
Contents in Brief
Contents
Preface
About the Editor-in-Chief and Authors
Editor-in-Chief
Authors
Part I: Prologue
Chapter 1: Introduction
1.1 The dawn of the many-core era
1.2 Communication-centric cross-layer optimizations
1.3 A baseline design space exploration of NoCs
1.3.1 Topology
1.3.2 Routing algorithm
1.3.3 Flow control
1.3.4 Router microarchitecture
1.3.5 Performance metric
1.4 Review of NoC research
1.4.1 Research on topologies
1.4.2 Research on unicast routing
1.4.3 Research on supporting collective communications
1.4.4 Research on flow control
1.4.5 Research on router microarchitecture
1.5 Trends of real processors
1.5.1 The MIT Raw processor
1.5.2 The Tilera TILE64 processor
1.5.3 The Sony/Toshiba/IBM Cell processor
1.5.4 The U.T. Austin TRIPS processor
1.5.5 The Intel Teraflops processor
1.5.6 The Intel SCC processor
1.5.7 The Intel Larrabee processor
1.5.8 The Intel Knights Corner processor
1.5.9 Summary of real processors
1.6 Overview of the book
References
Part II: Logic implementations
Chapter 2: A single-cycle router with wing channels
2.1 Introduction
2.2 The router architecture
2.2.1 The overall architecture
2.2.2 Wing channels
2.3 Microarchitecture designs
2.3.1 Channel dispensers
2.3.2 Fast arbiter components
2.3.3 SIG managers and SIG controllers
2.4 Experimental results
2.4.1 Simulation infrastructures
2.4.2 Pipeline delay analysis
2.4.3 Latency and throughput
2.4.4 Area and power consumption
2.5 Chapter summary
Chapter 3: Dynamic virtual channel routers with congestion awareness
3.1 Introduction
3.2 DVC with congestion awareness
3.2.1 DVC scheme
3.2.2 Congestion avoidance scheme
3.3 Multiple-port shared buffer with congestion awareness
3.3.1 DVC scheme among multiple ports
3.3.2 Congestion avoidance scheme
3.4 DVC router microarchitecture
3.4.1 VC control module
3.4.2 Metric aggregation and congestion avoidance
3.4.3 VC allocation module
3.5 HiBB router microarchitecture
3.5.1 VC control module
3.5.2 VC allocation and output port allocation
3.5.3 VC regulation
3.6 Evaluation
3.6.1 DVC router evaluation
3.6.2 HiBB router evaluation
3.7 Chapter summary
Chapter 4: Virtual bus structure-based network-on-chip topologies
4.1 Introduction
4.2 Background
4.3 Motivation
4.3.1 Baseline on-chip communication networks
4.3.1.1 Transaction-based bus
4.3.1.2 Packet-based NoC
4.3.2 Analysis of NoC problems
4.3.2.1 Multihop problem
4.3.2.2 Multicast problem
4.3.3 Advantages of a transaction-based bus
4.4 The VBON
4.4.1 Interconnect structures
4.4.1.1 Wire delay consideration
4.4.2 The VB mechanism
4.4.2.1 The VB construction
4.4.2.2 VB arbitration
4.4.2.3 Packet format
4.4.2.4 VB operation
4.4.2.5 A simple example for VB communication
4.4.3 Starvation and deadlock avoidance
4.4.4 The VBON router microarchitecture
4.5 Evaluation
4.5.1 Simulation infrastructures
4.5.1.1 Router choices for comparison
4.5.1.2 Network configuration
4.5.1.3 Traffic generation
4.5.2 Synthetic traffic evaluations
4.5.2.1 Single-level 4 × 4 VBON
4.5.2.2 Hierarchical 8 × 8 VBON
4.5.3 Real application evaluations
4.5.4 Power consumption analysis
4.5.5 Overhead analysis
4.6 Chapter summary
Part III: Routing and flow control
Chapter 5: Routing algorithms for workload consolidation
5.1 Introduction
5.2 Background
5.3 Motivation
5.3.1 Insufficient information
5.3.2 Intraregion interference
5.3.3 Inter-region interference
5.4 Destination-based adaptive routing
5.4.1 Destination-based selection strategy
5.4.1.1 Congestion information propagation network
5.4.1.2 DBSS router microarchitecture
5.4.2 Routing function design
5.4.2.1 Offered path diversity
5.4.2.2 VC reallocation scheme
5.5 Evaluation
5.5.1 Evaluation of routing functions
5.5.2 Single-region performance
5.5.2.1 Synthetic traffic results
5.5.2.2 Application results
5.5.3 Multiple-region performance
5.5.3.1 Results for a small regular region
5.5.3.2 Irregular-region results
5.5.3.3 Summary
5.5.4 CMesh evaluation
5.5.4.1 Configuration
5.5.4.2 Performance
5.5.5 Hardware overhead
5.5.5.1 Wiring overhead
5.5.5.2 Router overhead
5.5.5.3 Power consumption
5.6 Analysis and discussion
5.6.1 In-depth analysis of interference
5.6.2 Design space exploration
5.6.2.1 Number of propagation wires
5.6.2.2 DBSS scalability
5.6.2.3 Congestion propagation delay
5.7 Chapter summary
Chapter 6: Flow control for fully adaptive routing
6.1 Introduction
6.2 Background
6.2.1 Deadlock avoidance theories
6.2.2 Fully adaptive routing algorithms
6.3 Motivation
6.3.1 VC reallocation
6.3.2 Routing flexibility
6.4 Flow control and routing designs
6.4.1 Whole packet forwarding
6.4.2 Aggressive VC reallocation for EVCs
6.4.3 Maintain routing flexibility
6.4.4 Router microarchitecture
6.5 Evaluation on synthetic traffic
6.5.1 Performance of synthetic workloads
6.5.2 Buffer utilization of routing algorithms
6.5.3 Sensitivity to network design
6.5.3.1 SFP ratio
6.5.3.2 VC depth
6.5.3.3 VC count
6.5.3.4 Network size
6.6 Evaluation of PARSEC workloads
6.6.1 Methodology and configuration
6.6.2 Performance
6.7 Detailed analysis of flow control
6.7.1 The detailed buffer utilization
6.7.1.1 Allowable EVCs
6.7.1.2 Performance analysis
6.7.2 The effect of flow control on fairness
6.8 Further discussion
6.8.1 Packet length
6.8.2 Dynamically allocated multiqueue and hybrid flow controls
6.9 Chapter summary
Appendix: Logical Equivalence of Alg and Alg + WPF
Chapter 7: Deadlock-free flow control for torus networks-on-chip
7.1 Introduction
7.2 Limitations of existing designs
7.2.1 Dateline
7.2.2 Localized bubble scheme
7.2.3 Critical bubble scheme
7.2.4 Inefficiency with variable-size packets
7.3 Flit bubble flow control
7.3.1 Theoretical description
7.3.2 FBFC-localized
7.3.3 FBFC-critical
7.3.4 Starvation
7.4 Router microarchitecture
7.4.1 FBFC routers
7.4.2 VCT routers
7.5 Methodology
7.6 Evaluation on 1D tori (rings)
7.6.1 Performance
7.6.2 Buffer utilization
7.6.3 Latency of short and long packets
7.7 Evaluation on 2D tori
7.7.1 Performance for a 4 × 4 torus
7.7.2 Sensitivity to SFP ratios
7.7.3 Sensitivity to buffer size
7.7.4 Scalability for an 8 × 8 torus
7.7.5 Effect of starvation
7.7.6 Real application performance
7.7.7 Large-scale systems and message passing
7.8 Overheads: Power and area
7.8.1 Methodology
7.8.2 Power efficiency
7.8.3 Area
7.8.4 Comparison with meshes
7.9 Discussion and related work
7.9.1 Discussion
7.9.2 Related work
7.10 Chapter summary
Part IV: Programming paradigms
Chapter 8: Supporting cache-coherent collective communications
8.1 Introduction
8.2 Message combination framework
8.2.1 MCT format
8.2.2 Message combination example
8.2.3 Insufficient MCT entries
8.3 BAM routing
8.4 Router pipeline and microarchitecture
8.5 Evaluation
8.5.1 Performance
8.5.1.1 Overall network performance
8.5.1.2 Multicast transaction performance
8.5.1.3 Real application performance
8.5.2 Comparing multicast VN configurations
8.5.2.1 Unicast performance
8.5.2.2 Multicast performance
8.5.3 MCT size
8.5.4 Sensitivity to network design
8.5.4.1 VC count
8.5.4.2 Multicast ratio
8.5.4.3 Destinations per multicast
8.5.4.4 Network size
8.6 Power analysis
8.7 Related work
8.7.1 Message combination
8.7.2 NoC multicast routing
8.8 Chapter summary
Chapter 9: Network-on-chip customizations for message passing interface primitives
9.1 Introduction
9.2 Background
9.3 Motivation
9.3.1 MPI adaption in NoC designs
9.3.2 Optimizations of MPI functions
9.4 Communication customization architectures
9.4.1 Architecture overview
9.4.2 The customized NoC design: VBON
9.4.3 The MPI primitive implementation: MU
9.4.3.1 The architecture of the MU
9.4.3.2 MPI processing unit
9.4.3.3 The collective operation implementation
9.4.3.4 Communication protocols
9.5 Evaluation
9.5.1 Methodology
9.5.2 Experimental results
9.5.2.1 The effect of point-to-point communication: Bandwidth
9.5.2.2 The effect of collective communication: Broadcast operations
9.5.2.3 The effect of collective communication: Barrier operations
9.5.2.4 The effect of collective communication: Reduce operation
9.5.2.5 The effect of application communication: Performance
9.5.2.6 The effect of application communication: Power and scalability
9.5.2.7 Implementation overheads
9.6 Chapter summary
Chapter 10: Message passing interface communication protocol optimizations
10.1 Introduction
10.2 Background
10.2.1 Communication protocols in MPI
10.2.2 Existing problems
Notes:
Bibliographic Level Mode of Issuance: Monograph
Includes bibliographical references and index.
Description based on print version record.
ISBN:
9780128009796
0128009799
9780128011782
0128011785
OCLC:
894609116
