Time-to-Digital Converter-Based Maximum Delay Sensor for On-Line Timing Error Detection in Logic Block of Very Large Scale Integration Circuits

In this paper, we present a time-to-digital converter (TDC)-based maximum delay sensor (MDS) for on-line timing error detection in the logic block of very large scale integration (VLSI) circuits. The MDS captured the maximum propagation delay of the target end point for on-line timing error detection. Because the MDS was TDC-based, the resolution was high (on the order of 10 ps in simulation). In addition, the periodic on-line capture of the maximum delay using an MDS did not interrupt normal operation. Because the MDS was a small digital circuit, it could be easily inserted into the logic blocks of high-speed and low-power processors and systems-on-chip (SOCs). The behavior of the proposed sensor was confirmed by LTSPICE simulation using the 45 nm metal gate/high-K/strained-Si predictive technology model. The results showed that the area overhead was 34.9% on average.


Introduction
Smarter infrastructures, cloud computing, and large data analyses require very large scale integration (VLSI) circuits with a fast clock frequency and low power. (1,2) High-speed and low-power VLSI circuits require further scaling of complementary metal oxide semiconductor (CMOS) technology. However, in scaled process technology, problems due to aging, such as negative bias temperature instability (NBTI), become serious. (3) Aging causes timing-related circuit failure during normal operation.
Because the effect of aging strongly depends on the degradation of paths inside a circuit, monitoring the delay degradation obtained from periodic on-line delay measurements is useful for predicting failure due to aging. (4) Additionally, periodic on-line delay measurement significantly improves performance by enabling behavior close to the best margin, or significantly lowers power through adaptive voltage scaling. (5) Furthermore, on-line delay measurements can record the history of delays along internal paths. Therefore, they are also useful for chip diagnostics.
In this paper, we present a time-to-digital converter (TDC)-based maximum delay sensor (MDS) for on-line timing error detection in the logic block of VLSI circuits. An MDS captures the maximum propagation delay, which is an important parameter for timing error detection. Because an MDS is TDC-based, the resolution is high. In addition, the periodic on-line capture of the maximum delay using an MDS does not interrupt normal operation. Because an MDS is a small digital circuit, it can be easily inserted into the logic blocks of high-speed and low-power processors and systems-on-chip (SOCs).
The rest of the paper is organized as follows. In § 2, we discuss the basics of TDCs. In § 3, we explain the target materials and the details of MDSs. In § 4, we describe the results of evaluations. After the discussion in § 5, we present our conclusions in § 6.

Materials and Methods
The target block of an MDS is a logic block. Therefore, the target materials are those used for CMOS technologies or fin-shaped field-effect transistor (FinFET) CMOS technologies. (6,7) In this section, we explain the details of MDSs. An MDS is based on an on-chip delay measurement circuit. In § 2.1, we give a brief explanation of the on-chip delay measurement circuit. In § 2.2, we explain the timing error detection with the collected maximum delay sequence for an MDS.

Basics of TDC
The MDS was based on the monotonic TDC, which is a popular on-chip delay measurement circuit. (8) Figure 1(a) shows an example of the monotonic four-stage TDC architecture. This TDC was composed of four positive edge-triggered D-type flip-flops, an upper delay line, and a lower clock line. The delay line was composed of three buffers with uniform delays. Each stage of the TDC was composed of a flip-flop and a buffer. Two input transition signals were launched from the start and stop inputs. The TDC measured the time interval between a positive transition on start and a positive transition on stop. The time resolution was equal to the delay of a buffer. The thermometer code Q 0 Q 1 Q 2 Q 3 indicated the time interval. Figure 1(b) shows the timing chart of the basic TDC when the time interval between a transition signal on start and a transition signal on stop was 1. In this timing chart, τ 0 = τ 1 = τ 2 = 1. Table 1 shows the relationship between the time interval ∆t and Q 0 Q 1 Q 2 Q 3 when all buffer delays were 1. The relationship between ∆t and Q 0 Q 1 Q 2 Q 3 was linear when the thermometer code was smaller than 3 and all the buffers had a uniform delay.
The range of measurement of this four-stage TDC was 3. In general, the range of measurement of an N-stage TDC is N − 1.
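The behavior of the monotonic TDC described above can be illustrated with a minimal behavioral sketch. It is an idealized model, not the circuit itself: setup and hold times are ignored, the function name tdc_thermometer_code is hypothetical, and a start transition that arrives exactly on the stop edge is assumed to be captured.

```python
def tdc_thermometer_code(delta_t, buffer_delays):
    """Return the thermometer code [Q0, Q1, ...] of an idealized monotonic TDC.

    buffer_delays holds the N-1 buffer delays (tau_0 ... tau_{N-2}) of an
    N-stage TDC. Stage i sees the start transition after the sum of the
    first i buffer delays and samples it on the stop edge at time delta_t.
    """
    code = []
    arrival = 0.0  # arrival time of the start transition at stage 0
    for i in range(len(buffer_delays) + 1):
        # Assumption: a transition arriving exactly on the stop edge is captured.
        code.append(1 if arrival <= delta_t else 0)
        if i < len(buffer_delays):
            arrival += buffer_delays[i]
    return code


if __name__ == "__main__":
    # Four-stage TDC with uniform unit buffer delays, as in the example above.
    for dt in (0.5, 1.5, 2.5, 3.5):
        print(dt, tdc_thermometer_code(dt, [1, 1, 1]))
```

With three unit buffers, intervals beyond 3 all map to the same all-ones code, which is why the measurement range of the four-stage example is 3.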

Concept of timing error detection with analysis of maximum delay
We explain the idea of failure prediction using on-line maximum delay measurements. To predict failure, the maximum delay was obtained with the proposed delay sensor. As shown on the left side of Fig. 2, the input of the MDS was connected to an end point of paths (line ep in this example). The MDS captured the maximum delay value during normal operation. As a circuit suffers from aging, the maximum delay value increases. The circuit gives a warning when the value of the captured maximum delay is more than a predefined threshold value. The right side of Fig. 2 shows an example. The horizontal axis is the sampling time. The vertical axis is the maximum delay value. The threshold value is 2 in this example. When the sampling time is 3, the circuit gives a warning.
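The warning rule can be sketched in a few lines. The sample values below are hypothetical and were chosen only so that, with a threshold of 2, the first warning occurs at sampling time 3 as in the example; the function name and the convention that sampling times start at 0 are also assumptions.

```python
def first_warning(max_delay_samples, threshold):
    """Return the first sampling time at which the captured maximum delay
    exceeds the threshold, or None if it never does."""
    for sample_time, max_delay in enumerate(max_delay_samples):
        if max_delay > threshold:
            return sample_time
    return None


if __name__ == "__main__":
    samples = [1, 1, 2, 3, 3]  # hypothetical captured maximum delay values
    print(first_warning(samples, threshold=2))
```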

Basics
The basics of the maximum on-line delay measurement using an MDS are illustrated in Fig. 3. In this example, the four sensitizable paths p 0 -p 3 were included in the input logic cone whose root was connected to ep.
The MDS was composed of the embedded monotonic TDC and extra digital elements. The embedded monotonic TDC had two inputs: start and stop. The TDC measured the time interval between two transitions launched on the two inputs: (8)

$$\Delta t = t_{\mathrm{stop}} - t_{\mathrm{start}}, \qquad (1)$$

where t start is the arrival time of the transition on start and t stop is the arrival time of the transition on stop.
Table 1 Relationship between ∆t and thermometer code.
The MDS had two inputs, ep and clk. The input ep was connected to the target end point. The input clk was connected to the clock through the delay element DL. During on-line maximum delay measurements, ep was connected to stop and clk was connected to start.
The MDS had two modes, an initializing mode and a delay measurement mode. During the initializing mode, ep was connected to start and clk was connected to stop. During the delay measurement mode, when a path p 0 was sensitized during normal operation, a transition synchronized to a positive transition of the clock signal was launched at the start point of p 0 .
The positive transition of the clock signal propagated to the start input of the TDC through the delay element DL and the clk input of the MDS. The delay element DL was inserted to reduce the absolute value of the measured time interval. Because the area cost of a monotonic TDC is proportional to the absolute value of the measured time interval, a shorter interval requires fewer stages and thus less extra area.
The transition launched to the start point of p 0 arrived on the stop input of TDC through the end point of p 0 and the ep input of MDS. The MDS measured the time interval ∆t = tp 0 − t clk , where tp 0 is the data arrival time of the transition launched to the start point of p 0 on stop, and t clk is the data arrival time of the positive transition of the clock on start. If Δt, which is the measured delay, was larger than Δt max , which is the maximum delay value kept in the MDS, the value was updated to Δt.
We assumed that the fine tunable delay of DL was realized using a delay-locked loop (DLL), which can adjust the delay value with a fine, constant resolution. (9,10) From eq. (1), the measurement range of a path delay t p using the MDS is given in terms of t DL , the propagation delay of DL. Furthermore, N and τ i must at least satisfy eq. (2) for a path delay t p to be measurable, where t res is the resolution of the DLL.
We assumed that standard cells were used for the buffers of the MDS; these are vulnerable to process variation. The DLL, on the other hand, is more robust to process variation than standard cells. (10,11) Therefore, t DL was approximated as a multiple of t res . By adjusting t DL and N, an arbitrary path delay could be measured over an arbitrary range. Under the condition of eq. (2), the minimum N could be chosen by adjusting t DL in consideration of the required measurement resolution and the variation of the path delay due to aging. The DLL required extra area. However, because the DLL was located outside the target logic block, it did not directly degrade the performance of the target digital circuit.
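The trade-off between t DL and N can be illustrated with a small sketch. It rests on assumptions that are not stated in this form in the text: the measurable window is taken to be t DL ≤ t p ≤ t DL + (N − 1)τ for uniform buffer delay τ (consistent with the N − 1 range of an N-stage TDC), t DL is restricted to integer multiples of t res, and all values are integer picoseconds. The function name and the numeric values are hypothetical.

```python
def choose_dl_and_stages(t_nominal, t_aging, t_res, tau):
    """Pick t_DL (a multiple of t_res) and the smallest stage count N so that
    the assumed window [t_DL, t_DL + (N - 1) * tau] covers path delays from
    t_nominal up to t_nominal + t_aging. All arguments are integer picoseconds."""
    # Largest multiple of t_res that does not exceed the nominal delay.
    t_dl = (t_nominal // t_res) * t_res
    span = t_nominal + t_aging - t_dl
    n_buffers = max(1, -(-span // tau))  # ceiling division, at least one buffer
    return t_dl, n_buffers + 1


if __name__ == "__main__":
    # Hypothetical numbers: 800 ps nominal delay, 80 ps aging margin, 10 ps buffers.
    print(choose_dl_and_stages(800, 80, t_res=50, tau=10))   # fine DLL step
    print(choose_dl_and_stages(800, 80, t_res=39, tau=10))   # coarser alignment
    print(choose_dl_and_stages(800, 80, t_res=1000, tau=10)) # t_DL = 0, no offset
```

Under these assumptions, removing the offset (t DL = 0) would require the delay line to span the whole path delay, which is the area cost that DL is meant to avoid.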
The buffer delays (τ i ) varied with process variation. The variation was not negligible because we assumed the use of standard cells for the buffers. To compensate for the variation, an on-line calibration, such as those proposed in Refs. 12 and 13, was used to obtain accurate absolute delay values.

Gate level architecture
The buffer delays of the 1st, 2nd, and 3rd stages were τ 0 , τ 1 , and τ 2 , respectively. Each output of a flip-flop was fed back to its clock input through a 2-input OR gate. The other input of each OR gate was connected to stop. The input start of the TDC was connected to the 2-to-1 multiplexer S 0 . The input stop of the TDC was connected to the 2-to-1 multiplexer S 1 . Both S 0 and S 1 were controlled by the mode input.
The input ep was connected to a 2-input XOR gate. The other input of the XOR gate was connected to ntrn, which controlled the polarity of the transition from ep. When we measured the delays of positive paths, ntrn was set to 0. When we measured the delays of negative paths, ntrn was set to 1.
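The input routing of the MDS, that is, the XOR polarity control on ep and the two multiplexers selected by mode, can be sketched as a pure function on logic values. The function name route_inputs is hypothetical, and the mode encoding (0 for initialization, 1 for measurement) is taken from the description of the two modes.

```python
def route_inputs(mode, ep, clk, ntrn):
    """Return the (start, stop) logic values seen by the embedded TDC.

    mode = 0: initializing mode, XOR output drives start and clk drives stop.
    mode = 1: measurement mode, clk drives start and XOR output drives stop.
    """
    xor_out = ep ^ ntrn  # ntrn = 1 inverts ep so falling edges appear as rising
    if mode == 0:
        return xor_out, clk
    return clk, xor_out


if __name__ == "__main__":
    print(route_inputs(mode=1, ep=1, clk=0, ntrn=0))  # measurement, positive path
    print(route_inputs(mode=1, ep=0, clk=0, ntrn=1))  # measurement, negative path
    print(route_inputs(mode=0, ep=1, clk=0, ntrn=0))  # initialization
```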
When on-line delay measurement was carried out, the value of mode was 1. When the MDS was initialized before measurement, the value of mode was 0. When mode was 0, the output of the XOR gate was connected to start and clk was connected to stop; under that condition, the output of the sensor was initialized. When mode was 1, clk was connected to start and the output of the XOR gate was connected to stop.
In this example, ep was connected to the end point of a path p T through the redundant line p R . The start point of p T was FF S ; the end point of p T was FF E .

On-line maximum delay measurement with MDS
Figure 5 shows how the MDS captures the maximum delay value during normal operation. In this example, the number of stages of the embedded monotonic TDC inside the MDS was four, and the delay of all the buffers inside the TDC was 1. First, both mode and rst were set to 0. The outputs of the flip-flops of the TDC were reset to 0, and the clock signal was provided to clk. Then, the outputs of the MDS were initialized to 0.
Figure 5 depicts two cases, (a) and (b). In both cases, the thermometer code of the MDS was Q 0 Q 1 Q 2 Q 3 = 1100 when t = n − 1. This means that the maximum time interval arriving before t = n was 2. Note that the clock inputs of FF 0 and FF 1 were fixed to 1 because the output values of these flip-flops were 1.
In case (a), the time interval 0.5 was applied to the inputs when t = n. Then, the values 1, 0, 0, 0 arrived on the inputs of the flip-flops FF 0 , FF 1 , FF 2 , and FF 3 , respectively. The flip-flops FF 0 and FF 1 kept their previous values because their clock inputs were fixed at a static 1. The flip-flops FF 2 and FF 3 captured the input values. The captured values were the same as the previous ones. Consequently, the MDS kept the previous thermometer code.
In case (b), the time interval 2 was applied to the inputs when t = n. Then, the values 1, 1, 1, 0 arrived on the inputs of the flip-flops FF 0 , FF 1 , FF 2 , and FF 3 , respectively. The flip-flops FF 0 and FF 1 kept their previous values because their clock inputs were fixed at a static 1. The flip-flops FF 2 and FF 3 captured the input values. The value of FF 2 was updated to 1, while the value of FF 3 remained 0. Accordingly, the thermometer code of the MDS was updated to Q 0 Q 1 Q 2 Q 3 = 1110, which was the thermometer code when ∆t was more than 2 and less than 3.
In this way, the current thermometer code of the sensor was kept when the arriving time interval was shorter than the current maximum delay. The current thermometer code was updated when the arriving time interval was longer than the current maximum delay.
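The hold-or-update behavior of cases (a) and (b) can be reproduced with a short behavioral sketch. It models only the logical effect of the OR-gated clocks (a flip-flop whose output is already 1 no longer sees the stop edge) and ignores all analog timing. The class name is hypothetical, and, following case (b), a start transition that arrives exactly on the stop edge is assumed to be captured.

```python
class MaxDelaySensor:
    """Behavioral sketch of an MDS built on an N-stage monotonic TDC."""

    def __init__(self, buffer_delays):
        self.buffer_delays = list(buffer_delays)
        self.q = [0] * (len(self.buffer_delays) + 1)

    def reset(self):
        """Initializing step: clear every flip-flop output to 0."""
        self.q = [0] * len(self.q)

    def measure(self, delta_t):
        """Apply one time interval and return the thermometer code as a string."""
        arrival = 0.0  # arrival time of the start transition at stage 0
        for i in range(len(self.q)):
            d = 1 if arrival <= delta_t else 0
            # Clock of FF_i is (stop OR Q_i): once Q_i is 1 the clock is stuck
            # at 1, so only flip-flops that still hold 0 capture a new value.
            if self.q[i] == 0:
                self.q[i] = d
            if i < len(self.buffer_delays):
                arrival += self.buffer_delays[i]
        return "".join(str(bit) for bit in self.q)


if __name__ == "__main__":
    mds = MaxDelaySensor([1, 1, 1])
    print(mds.measure(1.5))  # brings the sensor to the state 1100
    print(mds.measure(0.5))  # case (a): shorter interval, code is kept
    print(mds.measure(2))    # case (b): longer interval, code is updated
```

Because each bit can only change from 0 to 1 between resets, the stored code is monotonically non-decreasing, which is what makes the sensor a maximum detector.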

Results
In this section, we present the evaluation of the MDS. First, we confirmed the circuit behavior using LTSPICE. (14) In all the simulations, the VDD supply voltage was 1.0 V and the temperature was 27 °C. We implemented the proposed four-stage delay sensor using SPICE models of the standard cell libraries based on the 45 nm metal gate/high-K/strained-Si predictive technology model. (15)
We first obtained the input-output characteristic of the four-stage MDS. The delay of DL was fixed to 30 ps. The time interval was swept from 4 to 40 ps, and the thermometer code was recorded every 2 ps. Figure 6 plots the result. The horizontal axis shows the time interval, and the vertical axis shows the thermometer code. According to the result, the measurement resolution was on the order of 10 ps.
Figure 7 shows an LTSPICE simulation result. This figure illustrates the waveforms of rst, mode, clk, ep, Q 3 , Q 2 , Q 1 , and Q 0 from top to bottom.
In the first clock of clk, the sensor was initialized. In the succeeding three clocks, the time interval between clk and ep was measured three times. The time intervals of the 2nd, 3rd, and 4th clocks were 20, 10, and 40 ps, respectively. As shown in Fig. 6, the measurements when ∆t = 20, 10, and 40 ps were 1100, 1000, and 1111, respectively. In the 2nd clock, when t = 15 ns, the measurement result 1100 was captured in the flip-flops. In the 3rd clock, the launched time interval was shorter than the previous measurement, and thus the previous measurement was preserved in the flip-flops. In the 4th clock, the time interval was longer than the preserved one. Accordingly, the measurement result 1111 was captured.
The area overhead of the sensor was evaluated. The MDS was described in the Verilog hardware description language (HDL). Each Verilog description was synthesized using Synopsys Design Compiler with Rohm 0.18 μm process technology. The area and area overhead of each circuit were estimated from the synthesis results. Table 2 summarizes the results; the area overhead was 34.9% on average.
The four-stage MDS was also implemented on the Virtex-5 FPGA ML501 evaluation platform. (16) Figure 8 shows the experimental circuit. The clock frequency was 100 MHz. The upper part was the reconfigurable delay line controlled by the 6-bit control signal S 5 S 4 S 3 S 2 S 1 S 0 . The delay was minimum when S 5 S 4 S 3 S 2 S 1 S 0 = 000000 and maximum when S 5 S 4 S 3 S 2 S 1 S 0 = 111111. The maximum delay was measured using the four-stage MDS at the bottom left. Note that rst was the negative reset line. The signal ntrn was fixed to 0 in this experiment.
The extra circuit at the bottom right was used to measure the buffer delays of the four-stage MDS. When the control line cal was 1, a 100 MHz clock signal was applied to start and stop. The clock phase of stop was swept in steps of 39 ps with the digital clock manager (DCM). (9) The MDS measured the applied time intervals one by one. From the results, we obtained the delays τ 0 , τ 1 , and τ 2 , which are shown in Table 3.
When cal was 0, the MDS worked as a delay sensor. We varied the delay of the reconfigurable delay line by setting S 5 S 4 S 3 S 2 S 1 S 0 = 101101 (Step 1), S 5 S 4 S 3 S 2 S 1 S 0 = 101100 (Step 2), and S 5 S 4 S 3 S 2 S 1 S 0 = 101110 (Step 3) sequentially. We observed the thermometer code in each step using the Xilinx on-chip logic analyzer ChipScope Pro 14.7. The observation results confirmed that the MDS worked correctly.
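One way the buffer delays could be extracted from the calibration sweep (cal = 1) is sketched below. This is an assumed post-processing step, not the procedure reported for the experiment: each bit Q i is taken to switch at the first swept interval for which it reads 1, the sensor is assumed to be reset before every sweep point, and the sweep data are hypothetical. The estimates are quantized to the 39 ps sweep step.

```python
def estimate_buffer_delays(sweep):
    """Estimate tau_0 ... tau_{N-2} from a phase sweep.

    sweep is a list of (delta_t_ps, code) pairs, where code is the
    thermometer code string Q0Q1...Q{N-1} read after applying delta_t_ps
    to a freshly reset sensor.
    """
    n_stages = len(sweep[0][1])
    switch_points = []
    for i in range(n_stages):
        hits = [dt for dt, code in sweep if code[i] == "1"]
        if not hits:
            raise ValueError(f"Q{i} never switched within the sweep")
        switch_points.append(min(hits))
    # tau_i is the distance between the switching points of Q_i and Q_{i+1}.
    return [b - a for a, b in zip(switch_points, switch_points[1:])]


if __name__ == "__main__":
    # Hypothetical sweep in 39 ps steps from 0 to 390 ps.
    codes = ["1000"] * 3 + ["1100"] * 4 + ["1110"] * 3 + ["1111"]
    sweep = [(39 * k, code) for k, code in enumerate(codes)]
    print(estimate_buffer_delays(sweep))
```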

Discussion
The proposed MDSs were placed at the end points of the critical paths to monitor whether the delays at these end points degraded with aging. Let the clock frequency of a circuit be 1 GHz, which corresponds to a clock period of 1000 ps. When the delay of the critical path was 80% of the clock period, the nominal maximum delay was 800 ps. Let the increase in path delay due to aging be 10% of the nominal delay. Then the variation of the path delay of the critical path was 80 ps. The resolution of the MDS was on the order of 10 ps, so an 80 ps increase spanned about eight resolution steps. A 10 ps resolution was therefore sufficient to monitor the variation of the maximum delay for timing error detection.
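The arithmetic of this example can be checked with a short sketch. It takes the aging increase as 10% of the nominal critical-path delay, which is the reading that reproduces the 80 ps figure; the function name and the integer-percentage interface are assumptions made to keep the calculation exact.

```python
def aging_budget(clock_hz, critical_percent, aging_percent, resolution_ps):
    """Return (period_ps, nominal_ps, increase_ps, resolution_steps)."""
    period_ps = 10**12 // clock_hz
    nominal_ps = period_ps * critical_percent // 100
    # Assumption: the aging increase is a percentage of the nominal path delay.
    increase_ps = nominal_ps * aging_percent // 100
    steps = increase_ps // resolution_ps
    return period_ps, nominal_ps, increase_ps, steps


if __name__ == "__main__":
    print(aging_budget(10**9, critical_percent=80, aging_percent=10, resolution_ps=10))
```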
Because an MDS was an on-chip circuit, it also suffered from aging and process variation. The resulting error could be compensated using on-chip delay calibration. (12,13)

Conclusions
In this paper, we presented an MDS for on-line timing error detection in the logic block of VLSI circuits. The MDS captured the maximum propagation delay of the target end point for on-line timing error detection. Because the MDS was TDC-based, the resolution was high (on the order of 10 ps in simulation). In addition, the MDS did not interrupt normal operation. Because the MDS was a small digital circuit, it could be easily inserted into the logic blocks of high-speed and low-power processors and SOCs. The behavior of the proposed sensor was confirmed by LTSPICE simulation using the 45 nm metal gate/high-K/strained-Si predictive technology model. The results showed that the area overhead was 34.9% on average.