Abstract:
This paper presents the SPIN micro-network, a generic, scalable interconnect architecture for systems on chip. The SPIN architecture relies on packet switching and point-to-point bidirectional links between the routers implementing the micro-network. SPIN gives the system designer the simple view of a single shared address space and provides a variable number of VCI-compliant communication interfaces for both initiators (masters) and targets (slaves). Performance comparisons between a classical PI-bus-based interconnect and the SPIN micro-network are analyzed.
Abstract:
In this paper, we discuss chip-package integration and miniaturization for managing the high power supply demand of a microprocessor. To deliver a high-performance system with granular features, significant improvements to the physical design of the power supply network were implemented. The hierarchical power management was also redesigned, including clock gating, voltage pre-core regulation and dynamic frequency scaling. The optimized performance metrics of the wafer integrations are presented with measured implications and simulations. The new-generation microprocessor is demonstrated with a highly configurable power solution of high power density and fast response that increases energy efficiency and peak performance.
Abstract:
Communication is predicted to overtake computation as the limiting factor in the performance of complex digital circuits. The most common communication medium is a shared bus. Contemporary buses have evolved as the requirements for communication have increased. The new properties of the buses also affect the arbitration schemes. In this paper, we present a study on distributed arbitration with an advanced on-chip bus, HIBI. An MPEG-4 video encoder is used as a test case. The compared arbitration algorithms are round-robin, priority, their combination, and random, all with varying parameters. They are compared at bus utilizations ranging from 3% to 75% and with limited transfer lengths. Results show that the arbitration algorithm may account for up to a 60% increase in performance, and that different transfer lengths may increase performance by 350%.
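As a rough illustration of how two of the compared policies differ, the following minimal sketch contrasts a round-robin arbiter with a fixed-priority arbiter under full contention. It is not HIBI's distributed implementation; the function names, the three-agent setup, and the assumption that every agent requests the bus in every cycle are all hypothetical choices made for the example.

```python
from collections import Counter


def round_robin_arbiter(requests, last_grant):
    """Grant the first requesting agent after last_grant, wrapping around."""
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_grant + offset) % n
        if requests[candidate]:
            return candidate
    return None  # nobody is requesting the bus


def priority_arbiter(requests):
    """Grant the lowest-indexed requester (index 0 has the highest priority)."""
    for agent, wants_bus in enumerate(requests):
        if wants_bus:
            return agent
    return None


def simulate(arbiter_name, cycles=12, n_agents=3):
    """Assumed workload: all agents request every cycle. Returns grants per agent."""
    grants = Counter()
    last = n_agents - 1  # start so that agent 0 is first in round-robin order
    for _ in range(cycles):
        requests = [True] * n_agents
        if arbiter_name == "round_robin":
            winner = round_robin_arbiter(requests, last)
        else:
            winner = priority_arbiter(requests)
        grants[winner] += 1
        last = winner
    return [grants[agent] for agent in range(n_agents)]


rr_grants = simulate("round_robin")
prio_grants = simulate("priority")
print(rr_grants)    # [4, 4, 4]
print(prio_grants)  # [12, 0, 0]
```

Under this saturated workload, round-robin shares the bus evenly while fixed priority starves the lower-priority agents, which is one reason a combination of the two policies can be of interest.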
Abstract:
Chip MultiProcessors (CMPs) will have dark silicon, or frequently deactivated areas in a chip, as technology continues to scale down, due to power dissipation. In this work we estimate the influence of deactivated cores on the performance of networks-on-chip (NoCs). Even when a chip has a two-dimensional mesh topology, a deactivated core that includes an on-chip router makes the topology irregular. We thus assume that topology-agnostic deadlock-free routing is used with a moderate number of virtual channels in such CMPs. Through cycle-accurate network simulations of a 2-D mesh NoC, we found that (1) a deactivated core does degrade performance to some extent in terms of throughput, but (2) latency is not increased, or is even reduced, when the deactivated core is located in the corner of a mesh. Hence, we recommend choosing a corner core for deactivation to maintain the performance of NoCs.
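The intuition behind the corner recommendation can be sketched with a simple graph model. The example below is not the cycle-accurate simulation or the topology-agnostic routing used in the work; it only computes the average shortest-path hop count between active routers as a crude proxy for zero-load latency. The 4x4 mesh size, the function name `average_hops`, and the chosen node coordinates are assumptions made for illustration.

```python
from collections import deque
from itertools import product


def average_hops(width, height, disabled=frozenset()):
    """Mean shortest-path hop count over all ordered pairs of active routers.

    Assumes the remaining mesh stays connected after removing `disabled`.
    """
    active = [n for n in product(range(width), range(height)) if n not in disabled]
    active_set = set(active)
    total = 0
    pairs = 0
    for src in active:
        # Breadth-first search from src over active routers only.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            x, y = queue.popleft()
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nxt in active_set and nxt not in dist:
                    dist[nxt] = dist[(x, y)] + 1
                    queue.append(nxt)
        for dst in active:
            if dst != src:
                total += dist[dst]
                pairs += 1
    return total / pairs


full_mesh = average_hops(4, 4)
corner_off = average_hops(4, 4, disabled=frozenset({(0, 0)}))
center_off = average_hops(4, 4, disabled=frozenset({(1, 1)}))
print(f"full: {full_mesh:.3f}  corner off: {corner_off:.3f}  center off: {center_off:.3f}")
```

In this model, removing a corner router forces no detours and drops the most distant endpoints, so the average hop count does not rise, whereas removing an interior router forces some paths to go around it.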
Abstract:
The building block of a Network-on-Chip (NoC) is its router. It is responsible for switching the channels which forward the messages exchanged by the cores attached to the NoC, and the costs and performance of the NoC strongly depend on the router architecture. In this paper, we present RASoC, a router architecture intended for building low-area-overhead NoCs for embedded systems. The difference between RASoC and current routers lies in its implementation as a parameterized VHDL model, which improves the reuse of RASoC in the synthesis of NoCs of different sizes and allows the NoC parameters to be tuned to meet the requirements of the target application. The paper presents details of the RASoC architecture, the structure of the VHDL model, and some experimental results which show the scalability of the soft-core and its costs.
Abstract:
In this paper, we examine the design process of a Network-on-Chip (NoC) for a high-end commercial System-on-Chip (SoC) application. We present several design choices and focus on the power optimization of the NoC while achieving the required performance. Our design steps include module mapping and allocation of customized capacities to links. Unlike previous studies, in which point-to-point, per-flow timing constraints were used, we demonstrate the importance of using the application's end-to-end traversal latency requirements during the optimization process. In order to evaluate the different alternatives, we report the synthesis results of a design that meets the actual throughput and timing requirements of the commercial SoC. According to our findings, the proposed technique offers up to 40% savings in the total router area and a reduction of up to 49% in the inter-router wiring area.