Efficient Power Estimation Using DSENT for 3D-Mesh on Chip Optic Communication Network

Mushtaq Ahmed* Bhavna Ambudkar Akash Yadav

Department of Computer Science and Engineering, Malaviya National Institute of Technology Jaipur 302017, Rajasthan, India

Department of Electronics and Tele-Communication Engineering, Symbiosis Institute of Technology, Pune 412115, Maharashtra, India

Corresponding Author Email: mahmed.cse@mnit.ac.in

Page: 27-35 | DOI: https://doi.org/10.18280/isi.290104

Received: 6 November 2023 | Revised: 30 November 2023 | Accepted: 7 December 2023 | Available online: 27 February 2024

© 2024 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

The Network-on-Chip (NoC) is a potent solution for achieving higher performance, efficient communication, and dependability, compared with conventional interconnection networks, as Very-Large-Scale Integration (VLSI) architectures migrate toward deep submicron technology. Considerable research has been devoted to diverse facets of NoCs, encompassing topology, routing algorithms, traffic behaviours, power management, and mapping. This paper explores the power consumption efficiency of the Parameterized Path-Based, Randomized, Oblivious, Minimal for 3D Mesh (PROM3D) and ZXY routing algorithms for transpose, bit-shuffle, and random traffic patterns with the help of the integrated DSENT network model. The PROM3D routing algorithm selects a path randomly from all possible minimal paths between the source and destination, whereas ZXY is a layer-based routing method. The Design Space Exploration of Network (DSENT) tool is used with the NoC Interconnect Routing and Applications Modeling (NIRGAM) simulator in the experiments to measure power consumption. The findings indicate that, within the 3D-Mesh environment, the power consumption of the ZXY routing algorithm at saturation varies by 0.02% relative to the PROM3D algorithm under varying loads and traffic patterns.

Keywords: 

DSENT, power consumption, 3D-mesh, optical communication, deterministic routing, adaptive routing, oblivious routing

1. Introduction

In response to the challenges posed by System-on-Chip (SoC) architectures, a novel solution known as network-on-chip (NoC) has emerged as a groundbreaking standard for facilitating inter-chip communication within extensive VLSI systems, as highlighted in the work by Bhaskar [1]. This innovative concept of NoC adopts a multi-layered approach that effectively mitigates the complexities associated with design while concurrently enabling the seamless distribution of data. Within the realm of NoC, the arrangement of nodes follows a meticulously planned topology, affording the capability for direct or indirect communication between any two nodes, irrespective of their physical proximity.

These nodes encompass a diverse array of IP-core functionalities, encompassing entities like Digital Signal Processors (DSPs), microprocessors, memory components, and Application-Specific Integrated Circuits (ASICs). Accompanying these IP-core contents, each node is furnished with a router, an instrumental component responsible for the efficient transmission of data packets to neighbouring nodes, thereby fostering optimal communication pathways [2]. The integration of NoC marks a paradigm shift in the way inter-chip communication is conceived, addressing complexities while enhancing the effectiveness of data exchange in modern VLSI systems.

By using on-chip networks in place of conventional global ad-hoc high-density wiring, Network-on-Chip (NoC) technology enables modular construction and obviates the need for extensive wiring. The systematic approach to wiring inherent in NoCs affords close control over power consumption. This, in turn, reduces redundancy and enables efficient circuits that lower latency and increase bandwidth, as articulated by Nain et al. [3].

Inter-chip communication in NoC (Network-on-Chip) has various issues, including bandwidth allocation, reliability, latency management, power efficiency, and routing complexity. Low latency and high throughput are important qualities of a NoC design from a performance standpoint. The energy dissipation profile of the interconnect topologies is crucial because this can account for a large amount of the overall energy [4].

The proliferation of NoC research underscores the growing significance of this paradigm shift. Investigations span many dimensions of NoC technology, including topology design, routing algorithms, the characterisation of traffic profiles, power management, and mapping techniques. NoCs suffer from power consumption caused by leakage power and switching activity in multi-core circuitry. Therefore, estimating the power consumed by the different elements of a NoC helps determine the power required to achieve the needed speed and accuracy. Many research works have contributed to this body of knowledge, providing insights that collectively pave the way for the continued refinement of NoC architectures [5, 6].

As the NoC framework steadily solidifies its position as a pivotal innovation in chip-level communication, the focus on modularization, efficiency optimization, and meticulous research endeavours underscores its pivotal role in shaping the landscape of cutting-edge integrated systems.

By estimating the power of hundreds of network configurations, inefficient or infeasible networks can be detected and eliminated before a thorough analysis is performed [7]. NoCs face a power issue known as leakage power, which becomes more prevalent as technology scales down: smaller feature sizes operate at lower voltages, which increases leakage and hence overall power consumption. The NoC facilitates communication between the components within a chip, but in current chips it can consume as much as 30% of the overall power. Therefore, reducing the power consumption of NoCs is critical to meeting the increasing demand for efficient chips [8, 9].

This paper addresses the challenges of power consumption in the construction of NoC architectures and focuses on power estimation for various deterministic and non-deterministic routing algorithms. The Design Space Exploration of Network (DSENT) model is integrated into the simulator framework to measure power consumption accurately.

The remaining sections of the paper are as follows: Section 2 discusses different routing algorithms. The significance of power estimation in NoC design and the various tools for estimating power consumption are discussed in Section 3. In Section 4, we discuss the experimental design and the analysis of the results, and then in Section 5, conclusions are drawn.

2. Literature Survey

Routing constitutes a cornerstone of data management, orchestrating the seamless traversal of information from source to destination while concurrently fine-tuning performance metrics for optimal outcomes [10, 11]. The intricate realm of routing algorithms can be broadly categorised into deterministic routing and adaptive routing paradigms, each bearing distinct attributes and implications. Deterministic routing algorithms, with their straightforward logic, bestow the advantage of lower latency, catering to expedited data transmission. Nevertheless, challenges surface when confronted with intricate traffic patterns, particularly when data flows exhibit varying bandwidth requisites; this is chiefly due to the unchanging nature of routes employed for diverse applications, as evidenced by Chen et al. [12].

On the other hand, adaptive routing strategies embody a concerted effort to curtail network congestion and traffic bottlenecks, thereby elevating the overall efficiency of data dissemination. By dynamically adapting routes based on real-time network conditions, adaptive and oblivious routing endeavours yield pronounced dividends. Hot spots are mitigated, and the network's throughput is notably enhanced. Crafting such adaptive routing algorithms is an intricate endeavour, poised at the nexus of complexity and adaptability. Striking a delicate equilibrium between these two facets emerges as a central challenge in this domain, as underscored by Hu et al. [13].

When delivering data, deterministic routing usually chooses the shortest path between two nodes. When source routing is used, network congestion is not taken into account; the source alone determines the path. This may cause some paths to become congested and slow. On the other hand, based on traffic volume, adaptive and oblivious (like PROM) routing lets routers select a path that is both relatively short and less congested. Adaptive routing can be fully or partially adaptive. The former allows uniform traffic distribution but may result in deadlock, whereas the latter does not allow adaptivity in all directions [14, 15].

In essence, the realm of routing algorithms embodies a duality of deterministic efficiency and adaptive responsiveness. As the digital landscape continues to evolve, the judicious selection and refinement of routing algorithms play a pivotal role in orchestrating optimal communication pathways, thereby cementing their stature as linchpins of contemporary network architecture. The ongoing exploration of this intricate interplay between efficiency and adaptability underscores the dynamic evolution of routing strategies in the face of ever-evolving data demands and network conditions.

2.1 Parameterized path-based, randomized, oblivious, minimal routing (PROM3D) algorithm

The PROM selects a path at random from the set of all possible minimal ladder paths that connect the source nodes and the destination nodes. The algorithm chooses the next available node locally at each hop and continues doing so until it reaches its final destination [16]. Since the path selection decisions are made locally and randomly throughout the pool of possible minimal paths, there are many possible PROM variants [17].

PROM3D is a 3D-Mesh Network-on-Chip (NoC) routing algorithm that extends the parameterized PROM algorithm for 3D-Mesh. It concentrates on reducing network congestion, thereby enhancing performance in terms of latency, albeit at the expense of energy consumption due to the overhead computation for the probabilistic path selection.

The PROM3D routing strategy known as uniform PROM3D assigns an equal probability of selection to each of the possible paths originating at the source. The PROM3D algorithm includes a parameter denoted as f; uniform PROM3D is the parameterized variant in which f is set to 0. The ratio f+x:f+y:f+z, or more simply x:y:z, is used to determine which node will be the next hop from the source node. However, as the packet advances to the next hop, the selection ratio is determined by the packet's ingress axis, whether x, y, or z. For example, if the packet is in the x-ingress, the next-hop selection is determined by the ratio x+f:y:z. Similar ratio calculations are used for packets in the y- or z-ingress: for packets entering at the y-ingress node, the ratio is x:y:z+f, while for packets entering at the z-ingress node, it is x:z:y+f. The following calculations are performed to estimate the value of parameter f. Consider an n×n×n 3D-Mesh NoC in which:

  • The source node coordinates are $(Src_x, Src_y, Src_z)$,
  • The destination node coordinates are $(Dest_x, Dest_y, Dest_z)$,
  • The value of $f_{\max}$ is 1,

and

$\alpha=\left|Dest_x-Src_x\right|$,        (1)

$\beta=\left|Dest_y-Src_y\right|$,        (2)

$\gamma=\left|Dest_z-Src_z\right|$        (3)

The parameter function f values for source node and intermediary nodes can then be determined as follows:

$f=\frac{(\alpha+1) \times(\beta+1) \times(\gamma+1)}{Num_{rows} \times Num_{cols} \times Num_{slices}} \times f_{\max}$          (4)
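As a rough illustration of Eqs. (1)-(4), the following Python sketch computes f for a source-destination pair and draws the first-hop axis with the source ratio f+x:f+y:f+z. It is not the simulator's implementation. The function names are hypothetical, and it assumes that x, y, and z denote the remaining hop counts on each axis and that an axis with no remaining distance is excluded so that the path stays minimal.

```python
import random

def prom3d_f(src, dest, dims, f_max=1.0):
    """Eq. (1)-(4): scale f_max by the fraction of the mesh covered by the
    minimal bounding box between source and destination."""
    alpha = abs(dest[0] - src[0])
    beta = abs(dest[1] - src[1])
    gamma = abs(dest[2] - src[2])
    num_rows, num_cols, num_slices = dims
    box_volume = (alpha + 1) * (beta + 1) * (gamma + 1)
    return box_volume / (num_rows * num_cols * num_slices) * f_max

def source_hop_weights(src, dest, f):
    """Weights f+x : f+y : f+z for the first hop.
    Assumption: x, y, z are the remaining hop counts on each axis, and an
    axis with no remaining distance is dropped to keep the path minimal."""
    remaining = {axis: abs(d - s) for axis, s, d in zip("xyz", src, dest)}
    return {axis: f + hops for axis, hops in remaining.items() if hops > 0}

def pick_first_axis(src, dest, dims, rng=random):
    """Randomly pick the axis of the first hop according to the weights."""
    weights = source_hop_weights(src, dest, prom3d_f(src, dest, dims))
    axes = list(weights)
    return rng.choices(axes, weights=[weights[a] for a in axes], k=1)[0]

# 3x3x3 mesh, source (0,0,0), destination (1,0,2): f = (2*1*3)/27
f = prom3d_f((0, 0, 0), (1, 0, 2), (3, 3, 3))
print(f, source_hop_weights((0, 0, 0), (1, 0, 2), f))
print(pick_first_axis((0, 0, 0), (1, 0, 2), (3, 3, 3), random.Random(0)))
```

With f = 0 the weights reduce to the remaining distances alone, which corresponds to the uniform variant described above.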

Figure 1. Example of (a) Deadlock in routing without VC, and (b) Use of virtual channel for deadlock free routing between Source (S) and Destination (D)

The PROM3D routing algorithm has many advantages over Dimension Order Routing (DOR) algorithms. In particular, under random and transpose traffic, PROM3D exhibits improved average latency and throughput at a higher percentage of offered load. A disadvantage of PROM3D is that, despite its better latency, its computational overhead makes it less power-efficient than other routing algorithms. As shown in Figure 1, two virtual channels are used in PROM3D to prevent deadlocks [18, 19].

2.2 ZXY routing algorithm

Addressing the challenge of thermal emissions within three-dimensional Network-on-Chip (NoC) architectures, the ZXY routing technique emerges as an effective solution, as discussed in the research by An et al. [20]. This innovative approach, known as the ZXY method, presents significant enhancements over the traditional XY routing technique, as mentioned by Cai et al. [21]. The ZXY method introduces a layer-based routing scheme that successfully mitigates thermal issues, all the while minimizing the need for additional virtual channels (VCs) or introducing the risk of packet misrouting.

There are substantial thermal problems with 3D NoCs due to the high power density and restricted heat dissipation capabilities. Packets are directed by the routing algorithm to routers with better cooling or less traffic. It accomplishes this by employing techniques that evenly distribute generated heat across the chip area, utilizing both horizontal and vertical traffic distribution methods [22].

The pivotal strength of the ZXY method is its capacity to establish adaptable routing paths that handle heat dissipation effectively while enhancing overall performance. This is achieved by integrating layer-based routing strategies that exploit the three-dimensional nature of the architecture. Experiments show lower latency and better throughput at low packet injection rates.

The evaluation of the ZXY routing's efficacy in ensuring deadlock-free operation is conducted through a rigorous assessment employing a restricted turn model. A remarkable facet of this evaluation is its independence from the requirement for hardware support for virtual channels within NoC routers [21]. In essence, the ZXY routing method aligns with the evolving demands of 3D NoC design, offering a holistic and effective solution to thermal challenges without the imposition of additional hardware complexities.

Figure 2 illustrates the restricted turns in the Z, X, and Y dimensions, which ensure that no cycle can occur along any dimensional path in the stacked architecture. Notably, ZXY routing, characterized by its Z-first approach, moves data packets first through layers (slices) to reach the destination slice and then along rows and columns to arrive at the intended tile (node). This reordering of dimensions enables the exploitation of alternate shortest paths within each layer of a 3D NoC, as depicted in Figure 3.
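The Z-first ordering can be sketched in a few lines of Python. This is a simplified, purely deterministic illustration: the function names are hypothetical, and taking X before Y inside the destination slice is an assumption, whereas the scheme discussed here may choose among alternate minimal paths within a layer.

```python
def zxy_next_hop(cur, dest):
    """Return the next node on a Z-first, then X, then Y minimal path.
    Coordinates are (x, y, z); z indexes the slice (layer)."""
    x, y, z = cur
    dx, dy, dz = dest
    if z != dz:                       # first reach the destination slice
        return (x, y, z + (1 if dz > z else -1))
    if x != dx:                       # then move along X (assumed order)
        return (x + (1 if dx > x else -1), y, z)
    if y != dy:                       # finally move along Y
        return (x, y + (1 if dy > y else -1), z)
    return cur                        # already at the destination

def zxy_path(src, dest):
    """List every node visited from src to dest, both inclusive."""
    path = [src]
    while path[-1] != dest:
        path.append(zxy_next_hop(path[-1], dest))
    return path

print(zxy_path((0, 0, 0), (1, 2, 2)))
```

Because the dimensions are always visited in the same order, this simplified variant cannot form a cyclic channel dependency, which is the property the turn restrictions are meant to guarantee.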

Through this dimension reordering strategy, ZXY routing optimizes path utilization, improving the overall routing efficiency. This approach not only holds promise in mitigating thermal issues but also enhances overall routing performance within the 3D NoC architecture.

Figure 2. The turn model of the ZXY routing algorithm. Dotted Arrows are restricted turns [23]

Figure 3. Multiple available minimal paths in XYZ-ZXY routing (a) selection between X and Y, and (b) selection between Z and XY [23]

3. Power Estimation

Communication dependability is regarded as a major challenge, and recent studies have examined its dependence on energy consumption. One important factor affecting efficiency is power dissipation in the stacked architecture. Differential power allocation is a crucial design constraint and has been considered necessary in the development of cores in stacked System-on-Chip (SoC) architectures using NoC. Deep-submicron technology allows high gate density, and the associated communication costs have led to the NoC emerging as an effective on-chip communication solution.

Initially, high performance (high throughput and low latency) was the major goal. For this reason, most previous NoC work focuses on performance parameters only [24-26]. However, with the growing demand for network bandwidth and higher throughput, the power used by the interconnection network is also becoming a major concern.

With very high-density integration, on-chip networks consume a large portion of the system's total power. Power dissipation has been identified as a critical obstacle in SoC design. It is important to obtain detailed information on energy efficiency early in the design cycle. The total power consumption consists of two main elements, namely dynamic and static (leakage) power. The total power consumed by a CMOS-based circuit [27] is determined as

$P=\alpha \times C \times V_{D D}^2 \times f_{c l k}+P_{\text {static }}$            (5)

where,

  • $f_{c l k}$ is the clock frequency,

  • α is the switching factor,

  • C is the capacitance,

  • $V_{D D}$ is the core voltage,

  • $P_{\text {static }}$ is the static power
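A short numerical sketch of Eq. (5) may help show how the dynamic term scales. The switching factor and capacitance below are illustrative assumptions rather than values from the experiments; the 1.2 V supply and 1 GHz clock match the experimental setup, and the function name is hypothetical.

```python
def cmos_power(alpha, capacitance, v_dd, f_clk, p_static=0.0):
    """Eq. (5): dynamic power alpha * C * VDD^2 * f_clk plus static power."""
    return alpha * capacitance * v_dd ** 2 * f_clk + p_static

# Assumed values: alpha = 0.5 and C = 1 nF are illustrative only.
base = cmos_power(alpha=0.5, capacitance=1e-9, v_dd=1.2, f_clk=1e9)

# Scale one parameter at a time by 0.7 (a 30% reduction).
lower_c = cmos_power(alpha=0.5, capacitance=0.7e-9, v_dd=1.2, f_clk=1e9)
lower_v = cmos_power(alpha=0.5, capacitance=1e-9, v_dd=0.7 * 1.2, f_clk=1e9)

print(f"baseline dynamic power: {base:.3f} W")
print(f"30% lower capacitance : {lower_c / base:.2f} of baseline")
print(f"30% lower voltage     : {lower_v / base:.2f} of baseline")
```

The dynamic term is linear in capacitance and quadratic in supply voltage, so lowering the voltage is the more effective of the two levers.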

From Eq. (5), the dynamic power of a CMOS circuit scales linearly with capacitance and quadratically with supply voltage: a 30% reduction in capacitance reduces the dynamic power by 30%, whereas a 30% reduction in voltage reduces it by roughly 50% (0.7² ≈ 0.49). Static power is the power consumed continuously by the gates due to leakage current from the supply to ground, irrespective of the state of the gates and of switching activity [28]. The static power (Pstatic) dissipated in the router and links can be stated as:

$P_{\text {static }}=P_{\text {buffer }}+P_{\text {arbiter }}+P_{\text {crossbar }}+P_{\text {link }}$          (6)

where,

  • $P_{\text {buffer }}$ is the sum of buffer read and buffer write energy,

  • $P_{\text {arbiter }}$ is arbitration energy,

  • $P_{\text {crossbar }}$ is crossbar traversal energy

  • $P_{\text {link }}$ is link traversal energy.

Limited scalability, high power consumption, and perceptible delays are among the challenges faced by the conventional electrical network-on-chip (NoC). As a solution, optical NoCs have emerged as a next-generation option, with higher speeds and reduced power requirements. Overall, the optical NoC outperforms the electrical version in terms of latency, energy efficiency, and Bit Error Rate (BER) [29].

Signal propagation in Optical NoCs is carried out through waveguiding, i.e., the optical signals are guided or propagated through the optical interconnects within the chip. This typically involves the use of integrated optical waveguides made of transparent materials such as silicon or silicon dioxide. These waveguides are designed to have specific dimensions and refractive index profiles that enable the propagation of optical signals with minimal attenuation and dispersion and also offer higher bandwidth, low power consumption, and potential for parallel communication. The waveguides used are planar in nature and can be fabricated using thin film deposition and etching techniques, allowing for compact integration with other components. The optical signals in the waveguides can carry data encoded in the form of light pulses or modulated optical signals, as shown in Figure 4. Optical NoCs can achieve faster and more efficient data transfer compared to traditional electronic interconnects, which are limited by the speed and capacity of electrical signals.

Figure 4. Example of (a) Basic optical data transmission and (b) Optical transmission steps [30]

3.1 Design Space Exploration of Network (DSENT)

DSENT is a system library for photonics and electronics that lets you rapidly evaluate the area and power of optoelectronic on-chip interconnects across different levels of hierarchy [31]. The overall modeling accuracy is always traded-off against the amount of user input required. The DSENT framework enables high modeling flexibility by employing circuit and logic-level approaches to reduce the number of input data points without risking modeling precision [29]. SPICE modulo3-based current and voltage for the saturated and unsaturated gates are used for power estimation.

4. Experimental Evaluations

A small 3×3×3 3D-Mesh topology is used in the experiment. The size of a flow control unit (flit) is five bytes, with one byte serving as the head and the other four as the payload. Two VCs are employed in our simulation to prevent deadlock when flits are transmitted from different sources to their corresponding destinations under the adaptive PROM3D routing. In general, any adaptive routing that may lead to deadlock in the NoC requires $P_i-1$ virtual channels, where $P_i$ is the number of ports at node i, to ensure deadlock-free routing.

The NoC Interconnect Routing and Applications Modeling (NIRGAM) simulator is built on the modular and extensible SystemC library and is used in experimental setup for performance evaluation. This simulator provides significant assistance for experimenting with many elements of NoC architecture, allowing alterations at each level, including topology, switching techniques, virtual channels, buffer settings, routing mechanisms, and traffic modeling for various applications [32].

In our simulation, we used optical communication with compatible switches at the nodes, which are placed at a uniform distance and communicate with each other using waveguided optical communication (OC). The global clock frequency is set at 1 GHz, and 90 nm gate technology is considered in the experiments. VDD is kept at 1.2 V. The DSENT framework is used with the simulator to measure performance for different routing algorithms and traffic patterns.

In simulation, various statistics like the number of buffer reads/writes, arbitrations, VC writes/reads, crossbar traversal, and links traversal in each clock cycle are collected. The power dissipation is calculated by passing the above-collected statistics along with the various architectural parameters like flit size, the number of input and output ports, etc., to DSENT power models for buffers, crossbars, arbiters, and links using Eq. (6). Optical communication is used for robust transmission [33].

To create a real-time, cycle-accurate scenario, bursty traffic without any faults is used for evaluating the power consumption in watts. Results are obtained for bursty data with a burst length of four and an interval of three, for loads ranging from 20% to 100% in steps of 10%, using the bit-shuffle, random, and transpose patterns. The generalized working of the bit-shuffle and transpose traffic patterns is shown in Figure 5. Each simulation was run ten times, and the average was taken as the final result to minimize error.
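For readers unfamiliar with these synthetic patterns, the sketch below gives their common textbook definitions on a b-bit node address: bit-shuffle rotates the address left by one bit, and transpose swaps the upper and lower halves of the address. The function names are hypothetical. Note that a 3×3×3 mesh has 27 nodes, which is not a power of two, so the mapping used inside the simulator may differ from this illustration.

```python
def bit_shuffle(src, bits):
    """Destination = source address rotated left by one bit."""
    mask = (1 << bits) - 1
    return ((src << 1) | (src >> (bits - 1))) & mask

def transpose(src, bits):
    """Destination = source address with its two halves swapped.
    Assumes an even number of address bits."""
    half = bits // 2
    low_mask = (1 << half) - 1
    return ((src & low_mask) << half) | (src >> half)

# Example on a 16-node network (4-bit addresses).
for node in (0b0110, 0b1001, 0b0111):
    print(f"{node:04b} -> shuffle {bit_shuffle(node, 4):04b}, "
          f"transpose {transpose(node, 4):04b}")
```

On a 2D grid whose address is the concatenation of the row and column indices, the transpose pattern sends node (i, j) to node (j, i).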

Figure 5. Generalized example of (a) bit-shuffle traffic pattern, and (b) transpose traffic pattern Mesh NoC

Figure 6. Integration of DSENT model with the simulator

Figure 7. Two VCs are used to avoid deadlock in adaptive routing, and power is estimated at each stage as shown

Figure 6 demonstrates the overall structure of the experimental setup of the simulation. We have integrated DSENT models with the simulator and finally evaluated the dynamic and leakage power for specified configurations. Architectural parameters are defined at the initial stage, and performance stats are provided at the run-time to DSENT models to measure power consumption.

Figure 7 shows the traversal of the flit along the simulation route and the stages at which power is dissipated. The following steps are used in the overall process for adaptive routing:

(1) The journey of a flit in the simulator begins with its entry in the input channel of a tile. Every flit has a VCid associated with it and is stored at the end of the First in First Out (FIFO) buffer of that VC (adaptive routing). At this point, the energy associated with buffer write is consumed. To keep track of this, the number of buffer writes is increased by one whenever a flit is stored in the FIFO.

(2) The next step consists of selecting one of the VCs from all the requesting ones associated with each input channel and then reading the flit at the front of the FIFO for sending the route request to the routing logic circuit. An OC arbiter is needed at every input port/channel, each having inputs equal to the number of VCs at each input channel.

(3) The routing logic sets the output direction of the flit. After this, again, a VC is selected from all at each input channel, and there is a need for a matrix arbiter with each input channel having inputs equal to the number of VCs. The number of arbitrations is increased by one over here.

(4) There is no need for the second stage of switch allocation in the simulation as it ensures that there will not be any conflict among the input channels for the same optical communication (OC) by maintaining a separate register r0 or r1 for each input channel at each OC. After the first stage of the switch allocation, the flit moves for VC allocation. It maintains a queue with entries for available VCs of the output direction tile. Energy is dissipated in reading and writing the queue.

(5) Next, the flit is removed from the input buffer. This corresponds to the buffer read count.

(6) Then, the flit moves to the output channel via crossbar traversal, which accounts for the power dissipation at the crossbars. The number of crossbar traversals is increased by one.

(7) Finally, the flit moves from one tile to another, and the link traversal counter is increased by one.
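The bookkeeping in the steps above can be pictured as a set of per-router event counters that are later converted into power. The Python sketch below is only an illustration of that idea: it does not use the DSENT or NIRGAM interfaces, the class and function names are hypothetical, and the per-event energies are placeholder values, whereas in the experiments they come from the DSENT models. The VC-allocation queue accesses of step (4) are left out to keep to the four terms of Eq. (6).

```python
from dataclasses import dataclass

@dataclass
class RouterEvents:
    buffer_writes: int = 0
    buffer_reads: int = 0
    arbitrations: int = 0
    crossbar_traversals: int = 0
    link_traversals: int = 0

    def record_flit_hop(self):
        """Count the events one flit triggers while crossing one router."""
        self.buffer_writes += 1        # step (1): flit stored in VC FIFO
        self.arbitrations += 1         # step (3): arbitration after routing
        self.buffer_reads += 1         # step (5): flit leaves input buffer
        self.crossbar_traversals += 1  # step (6): crossbar traversal
        self.link_traversals += 1      # step (7): move to the next tile

# Placeholder energies in joules per event (assumed, not DSENT output).
ENERGY_PER_EVENT_J = {
    "buffer_write": 1.0e-12, "buffer_read": 1.0e-12,
    "arbitration": 0.2e-12, "crossbar": 1.5e-12, "link": 2.0e-12,
}

def average_power_watts(events, cycles, f_clk_hz, energy=ENERGY_PER_EVENT_J):
    """Total event energy divided by the simulated time (cycles / f_clk)."""
    total_j = (
        events.buffer_writes * energy["buffer_write"]
        + events.buffer_reads * energy["buffer_read"]
        + events.arbitrations * energy["arbitration"]
        + events.crossbar_traversals * energy["crossbar"]
        + events.link_traversals * energy["link"]
    )
    return total_j / (cycles / f_clk_hz)

events = RouterEvents()
for _ in range(1000):                  # 1000 flit hops through this router
    events.record_flit_hop()
print(average_power_watts(events, cycles=10_000, f_clk_hz=1e9))
```

Keeping the counters separate from the energy table mirrors the split between the architectural simulator, which supplies event counts, and the power model, which supplies the cost of each event.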

In the simulation, VCs are not used for non-adaptive routing algorithms. The overall power consumption is broken down into the power usage of individual resources, as depicted in Eq. (6). The architectural simulator provides event counts at the network or router level, such as router and link traversals, from which DSENT calculates the power consumption. We use different traffic patterns with varying loads to measure the power consumption efficiency of the routing algorithms. This performance analysis is discussed in the next section.

4.1 Performance analysis

We increase the load with a variation of 10% for different traffic patterns for adaptive and deterministic routing algorithms. Figures 8, 9, and Table 1 show the comparison of both algorithms for transpose, bit-shuffle and random traffic patterns, respectively.

It is clear from the results that both algorithms follow a similar structure for all three traffic patterns. The ZXY algorithm performs better in power dissipation at various load percentages. The efficiency of the PROM3D method relies on the Mesh's size and the number of VCs.

Figure 8. Power vs Load comparison of algorithms for transpose traffic pattern

Figure 9. Power vs Load comparison of algorithms for bit-shuffle traffic pattern

Table 1. Power vs Load comparison of algorithms for random traffic patterns

| Load (%) | PROM3D power (W) | ZXY power (W) |
|---|---|---|
| 20 | 0.125438078 | 0.124258993 |
| 30 | 0.181525089 | 0.181982512 |
| 40 | 0.230093148 | 0.229821634 |
| 50 | 0.282071504 | 0.278701311 |
| 60 | 0.338838129 | 0.339066277 |
| 70 | 0.401450965 | 0.396807164 |
| 80 | 0.43545976 | 0.432890392 |
| 100 | 0.446600616 | 0.452374018 |
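The per-load gap between the two algorithms in Table 1 can be checked with a few lines of arithmetic. The sketch below copies the values from Table 1 and reports the difference of PROM3D relative to ZXY; the variable and function names are hypothetical.

```python
# Load (%) -> (PROM3D power in W, ZXY power in W), copied from Table 1.
TABLE_1 = {
    20: (0.125438078, 0.124258993),
    30: (0.181525089, 0.181982512),
    40: (0.230093148, 0.229821634),
    50: (0.282071504, 0.278701311),
    60: (0.338838129, 0.339066277),
    70: (0.401450965, 0.396807164),
    80: (0.43545976, 0.432890392),
    100: (0.446600616, 0.452374018),
}

def relative_difference_percent(prom3d, zxy):
    """Power of PROM3D relative to ZXY; positive means PROM3D uses more."""
    return (prom3d - zxy) / zxy * 100.0

for load, (prom3d, zxy) in TABLE_1.items():
    diff = relative_difference_percent(prom3d, zxy)
    print(f"load {load:3d}%: PROM3D vs ZXY = {diff:+.3f}%")
```

A positive value means PROM3D dissipates more power than ZXY at that load, and a negative value means the opposite.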

At each hop, the buffer capacity of the intermediate nodes is measured, and, based on buffer availability and congestion status, the packets (flits) move forward so that an optimal path is taken through congested routes. PROM3D is efficient in latency and throughput [34] because it always follows minimal paths, choosing at each hop, with higher probability, paths with less congestion and larger buffers, at the cost of increased power consumption due to computational overhead and the use of VCs. Because it needs a minimum of two VCs, the number of buffer read/write operations is higher.

Unlike ZXY routing, PROM3D routing is expected to dissipate more power because it requires at least two virtual channels to avoid deadlock. An increase of 0.02% in power dissipation is observed for the transpose and bit-shuffle traffic patterns. Power dissipation for random traffic is similar in both algorithms. However, traffic congestion is reflected in the transpose and bit-shuffle traffic patterns. There is no variation in power at reduced loads up to 30%. The network saturates after 60% load. In ZXY routing, packets are transferred directly from one slice to another, which results in low power dissipation in the 3D-Mesh NoC.

For random traffic patterns, where the source and destination are not fixed and change with every packet injection into the NoC, adaptive routing gives more path diversity but, owing to higher congestion, gives power consumption similar to dimension-order routing. It is observed that the overall power dissipation under random traffic is higher than under both transpose and bit-shuffle traffic.

5. Conclusion and Future Work

Power consumption and traffic are the two critical concerns in the design of NoC architecture. In this paper, we compared the performance of deterministic and non-deterministic routing algorithms by analyzing the power consumption in a 3D-Mesh NoC architecture for transpose, bit-shuffle, and random traffic patterns in a bursty application with varying load (%). The DSENT power model is incorporated in the simulation for power estimation. It forms an essential element for the accurate analysis of NoC routing algorithms, topologies, traffic patterns, etc. The results show that the ZXY algorithm is more power-efficient than the PROM3D algorithm. In the simulation results, PROM3D shows up to 0.02% more power consumption than the ZXY routing algorithm. In a future extension of this work, other traffic patterns such as butterfly and bit-reversal can be evaluated. The approach can also be used to test the power consumption of different network topologies under various load conditions.

  References

[1] Bhaskar, A.V. (2022). A new method of power analysis of network-on-chip using analytical modelling. In 2022 Seventh International Conference on Parallel, Distributed and Grid Computing (PDGC), IEEE, Solan, Himachal Pradesh, India, pp. 222-227. https://doi.org/10.1109/PDGC56933.2022.10053136

[2] Alimi, I.A., Patel, R.K., Aboderin, O., Abdalla, A.M., Gbadamosi, R.A., Muga, N.J., Pinto, A.N., Teixeira, A.L. (2021). Network-on-chip topologies: Potentials, technical challenges, recent advances and research direction. Network-on-Chip-Architecture, Optimization, and Design Explorations. https://doi.org/10.5772/intechopen.97262

[3] Nain, Z., Ali, R., Anjum, S., Afzal, M.K., Kim, S.W. (2020). A network adaptive fault-tolerant routing algorithm for demanding latency and throughput applications of network-on-a-chip designs. Electronics, 9(7): 1076. https://doi.org/10.3390/electronics9071076

[4] Paramasivam, K. (2015). Network on-chip and its research challenges. ICTACT Journal on Microelectronnics, 1(02): https://doi.org/10.21917/ijme.2015.0015

[5] Tomita, T., Kurokawa, Y., Fukushi, M. (2021). A fault-tolerant routing method using bus functions in two-dimensional torus network-on-chips. In 2021 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), Penghu, Taiwan, pp. 1-2. https://doi.org/10.1109/ICCE-TW52618.2021.9602956

[6] Tatas, K., Sawa, S., Kyriacou, C. (2014). Low-cost fault-tolerant routing for regular topology nocs. In 2014 21st IEEE International Conference on Electronics, Circuits and Systems (ICECS), Marseille, France, pp. 566-569. https://doi.org/10.1109/ICECS.2014.7050048

[7] Zhang, H., Chen, Y., Huang, Z., Xia, C., Liang, J., Gu, H. (2021). Comparative analysis of simulators for optical network-on-chip (ONoC). In 2021 12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), IEEE, Xi'an, China, pp. 19-23. https://doi.org/10.1109/PAAP54281.2021.9720307

[8] Farrokhbakht, H., Hessabi, S., Jerger, N.E. (2022). Power-gating in NoCs. In Advances in Computers. Elsevier, 124: 319-356. https://doi.org/10.1016/bs.adcom.2021.11.013

[9] Bijapur, A., Shirahatti, S.S., Jayagowri, R. (2020). Power optimization techniques for NOC. International Journal of Engineering Research & Technology (IJERT), 9(7): 127-132.

[10] Fernandes, R., Marcon, C., Cataldo, R., Sepúlveda, J. (2020). Using smart routing for secure and dependable NoC-based MPSoCs. IEEE/ACM Transactions on Networking, 28(3): 1158-1171. https://doi.org/10.1109/TNET.2020.2979372

[11] Azad, S.P., Niazmand, B., Janson, K., Kogge, T., Raik, J., Jervan, G., Hollstein, T. (2017). Comprehensive performance and robustness analysis of 2D turn models for network-on-chips. In 2017 IEEE International Symposium on Circuits and Systems (ISCAS), Baltimore, MD, USA, pp. 1-4. https://doi.org/10.1109/ISCAS.2017.8050634

[12] Chen, Z., Zhang, Y., Peng, Z., Jiang, J. (2019). A deterministic-path routing algorithm for tolerating many faults on wafer-level NoC. In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), IEEE, Florence, Italy, pp. 1337-1342. https://doi.org/10.23919/DATE.2019.8714948

[13] Hu, C., Meyer, M.C., Jiang, X., Watanabe, T. (2020). A fault-tolerant hamiltonian-based odd-even routing algorithm for network-on-chip. In 2020 35th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC), IEEE, Nagoya, Japan, pp. 217-222.

[14] Puthal, M.K., Singh, V., Gaur, M.S., Laxmi, V. (2011). C-Routing: An adaptive hierarchical NoC routing methodology. In 2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip, Hong Kong, China, pp. 392-397. https://doi.org/10.1109/VLSISoC.2011.6081616

[15] Zhang, W., Ye, Y. (2019). An approximate thermal-aware Q-routing for optical NoCs. In 2019 IEEE/ACM Workshop on Photonics-Optics Technology Oriented Networking, Information and Computing Systems (PHOTONICS), Denver, CO, USA, pp. 22-27. https://doi.org/10.1109/PHOTONICS49561.2019.00009

[16] Cho, M.H., Lis, M., Shim, K.S., Kinsy, M., Devadas, S. (2009). Path-based, randomized, oblivious, minimal routing. In Proceedings of the 2nd International Workshop on Network on Chip Architectures, pp. 23-28. https://doi.org/10.1145/1645213.1645220

[17] El Sayed, M.S., Salem, S.A., Awadalla, M.H., Mostafa, A.M. (2012). A power efficient, oblivious, path-diverse, minimal routing for mesh-based networks-on-chip. International Journal of Computer Science Issues (IJCSI), 9(2): 339-347.

[18] Sadat-Mehrizi, H., Sadat-Mehrizi, M., Zeinali, E. (2018,). An algorithm for tolerating multiple faulty channels in 2D NoCs. In 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, Canada, pp. 406-414. https://doi.org/10.1109/IEMCON.2018.8614964

[19] Xiang, D., Luo, W. (2011). An efficient adaptive deadlock-free routing algorithm for torus networks. IEEE Transactions on Parallel and Distributed Systems, 23(5): 800-808. https://doi.org/10.1109/TPDS.2011.145

[20] An, J., You, H., Sun, J., Cao, J. (2021). Fault tolerant XY-YX routing algorithm supporting backtracking strategy for NoC. In 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, pp. 632-635. https://doi.org/10.1109/ISPA-BDCloud-SocialCom-SustainCom52081.2021.00092

[21] Cai, Y., Xiang, D., Ji, X. (2018). Deadlock-free adaptive routing based on the repetitive turn model for 3d network-on-chip. In 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), Melbourne, VIC, Australia, pp. 722-728. https://doi.org/10.1109/BDCloud.2018.00109

[22] Ebrahimi, M., Chang, X., Daneshtalab, M., Plosila, J., Liljeberg, P., Tenhunen, H. (2013). DyXYZ: Fully adaptive routing algorithm for 3D NoCs. In 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, IEEE, Belfast, UK, pp. 499-503. https://doi.org/10.1109/PDP.2013.80

[23] Taheri, E., Patooghy, A., Mohammadi, K. (2016). XYZ-ZXY: A minimal routing algorithm for dynamic thermal management in 3D NoCs. In 2016 24th Iranian Conference on Electrical Engineering (ICEE), IEEE, Shiraz, Iran, pp. 1539-1544. https://doi.org/10.1109/IranianCEE.2016.7585766

[24] Reddy, B.N.K., Kishan, D., Vani, B.V. (2019). Performance constrained multi-application network on chip core mapping. International Journal of Speech Technology, 22: 927-936. https://doi.org/10.1007/s10772-019-09636-3

[25] Johari, S., Sehgal, V.K. (2015). Master-based routing algorithm and communication-based cluster topology for 2D NoC. The Journal of Supercomputing, 71(11): 4260-4286. https://doi.org/10.1007/s11227-015-1521-x

[26] Reddy, T.N.K., Swain, A.K., Singh, J.K., Mahapatra, K.K. (2014). Performance assessment of different network-on-chip topologies. In 2014 2nd International Conference on Devices, Circuits and Systems (ICDCS), IEEE, Coimbatore, India, pp. 1-5. https://doi.org/10.1109/ICDCSyst.2014.6926188

[27] Jordan, M.G., Korol, G., Knorst, T., Rutzig, M.B., Beck, A.C.S. (2023). Energy-aware fully-adaptive resource provisioning in collaborative CPU-FPGA cloud environments. Journal of Parallel and Distributed Computing, 176: 55-69. https://doi.org/10.1016/j.jpdc.2023.02.009

[28] Chowdhury, Z.I., Rahaman, M.I., Islam, S.M.M., Kiber, M.A. (2014). Effect of technology scaling on leakage power consumption in on-chip switches. In 2014 International Conference on Informatics, Electronics & Vision (ICIEV), IEEE, Dhaka, Bangladesh, pp. 1-5. https://doi.org/10.1109/ICIEV.2014.6850689

[29] Balti, M., Jemai, A. (2021). Performance survey of classic and Optic network‐on‐chip. IET Circuits, Devices & Systems, 15(4): 393-402. https://doi.org/10.1049/cds2.12025

[30] Bergman, K., Carloni, L.P., Biberman, A., Chan, J., Hendry, G. (2014). Photonic network-on-chip design. Springer, New York. https://doi.org/10.1007/978-1-4419-9335-9

[31] Sun, C., Chen, C.H.O., Kurian, G., Wei, L., Miller, J., Agarwal, A., Peh, L.S., Stojanovic, V. (2012). DSENT-a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, Lyngby, Denmark, pp. 201-210. https://doi.org/10.1109/NOCS.2012.31

[32] Khichar, J., Choudhary, S., Mahar, R. (2017). Fault tolerant dynamic XY-YX routing algorithm for network on-chip architecture. In 2017 International Conference on Intelligent Computing and Control (I2C2), IEEE, Coimbatore, India, pp. 1-6. https://doi.org/10.1109/I2C2.2017.8321939

[33] Papamichael, M.K., Cakir, C., Chia-Hsin, C.S., Cheny, O., Ho, J.C., Mai, K., Pehy, L.S., Stojanovic, V. (2015). Delphi: A framework for rtl-based architecture design evaluation using dsent models. In 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Philadelphia, PA, USA, pp. 11-20. https://doi.org/10.1109/ISPASS.2015.7095780

[34] Ahmed, M., Kumar, R. (2012). Parameterized path-based, randomized, oblivious, minimal routing in 3D mesh NoC. In TENCON 2012 IEEE Region 10 Conference, Cebu, Philippines, pp. 1-6. https://doi.org/10.1109/TENCON.2012.6412341