### UNIVERSITY OF BRITISH COLUMBIA Faculty of Applied Sciences Department of Electrical and Computer Engineering

Prof. R. Saleh

EECE579 Spring 2006

Study Hints (See end of this page) and Sample Midterm Solutions

(open book, calculator permitted)

I hereby agree not to receive or provide help to any other student during this examination.

Signed\_\_\_\_\_(sign name)

| Problem 1 | 10pts                               |
|-----------|-------------------------------------|
| Problem 2 | 15pts                               |
| Problem 3 | 20pts (not covered in 2008 midterm) |
| Totals    | (45pts)                             |

| Useful Parameters (90nm) | NMOS (Lmin=0.1um) | PMOS (Lmin=0.1um) |
|--------------------------|-------------------|-------------------|
| 1X inverter (W/L)        | 0.2um/0.1um       | 0.4um/0.1um       |
| Req                      | 12.5KΩ/sq         | 30KΩ/sq           |
| Rsq (copper) = 0.054Ω/sq | N/A               | N/A               |
| Cg (gate cap per unit W) | 2fF/um            | 2fF/um            |
| Ceff (drain/source cap)  | 1fF/um            | 1fF/um            |
| Cwire = 0.2fF/um         | N/A               | N/A               |
| Vdd = 1V                 |                   |                   |

## PRINT NAME\_\_\_\_\_\_ STUDENT NO.\_\_\_\_\_

## **STUDY HINTS FOR MIDTERM ON MARCH 26, 2008.**

3 problems (40min. each), from Lectures 1-6, HW1, HW2 and HW3, and text book Chapters 10 and 11. Main topics for the three problems are: Interconnect, Power Grid, and Low Power Design.

# 1. Transistor Models and Scaling (10 pts.)

a) (4pts) Explain drain-induced barrier lowering (DIBL) using an NMOS device cross-sectional diagram. Why is it a concern today and what can be done to reduce its effect? Show this using a circuit diagram.

**Ans.** In the subthreshold region, DIBL is the lowering of $V_T$ caused by a high value of $V_{DS}$ on the transistor. When a large $V_{DS}$ is applied, part of the drain depletion region extends into the channel and assists the inversion process, thereby lowering $V_T$.



b) (2pts) As technology is scaling, V<sub>DD</sub> is scaling faster than V<sub>T</sub>. Explain why V<sub>DD</sub> is scaling by a factor of 0.7 while V<sub>T</sub> is scaling by 0.9, and the associated tradeoffs. What are process engineers doing to help designers?

**Ans.** $V_{DD}$ was scaling by 0.7 to reduce power and to keep the fields constant. $V_T$ is only scaling by 0.9 because lowering it further would raise subthreshold leakage current to unacceptable levels. Process engineers are providing multiple $V_T$'s in a process (HVT, SVT, LVT) to help designers cope with the need for both high speed and low leakage.

c) (4 pts) Write the equations for dynamic power and static power (separately). In the plot of  $V_{DD}$  vs.  $V_T$ , sketch the contour lines of constant dynamic power and constant static (subthreshold leakage) power. (note: calculations are not required)

**Ans.**

$$P_{dyn} = \alpha C_L V_{DD}^2 f_{clk} = k_o \quad\Rightarrow\quad V_{DD} = \sqrt{\frac{k_o}{\alpha C_L f_{clk}}}$$

Therefore, the contours of constant dynamic power are horizontal lines (independent of $V_T$), assuming $f_{clk}$ is fixed.

$$P_{static} = I_o e^{-V_T / nV_{th}} V_{DD} = k_1 \quad\Rightarrow\quad V_{DD} = \frac{k_1}{I_o} e^{V_T / nV_{th}}$$

Therefore, the contours of constant static power have an exponential characteristic in $V_T$.
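The two contour equations above can be evaluated numerically to see their shapes. The following is a minimal sketch: the function names and all numeric values ($\alpha$, $C_L$, $f_{clk}$, $I_o$, $n$, $V_{th}$ = 26mV, $k_o$, $k_1$) are illustrative assumptions, not values from the exam.

```python
import math

def vdd_const_dynamic(k0, alpha, c_load, f_clk):
    """V_DD on a contour of constant dynamic power k0 (no V_T dependence)."""
    return math.sqrt(k0 / (alpha * c_load * f_clk))

def vdd_const_static(k1, i0, v_t, n=1.5, v_th=0.026):
    """V_DD on a contour of constant subthreshold leakage power k1."""
    return k1 * math.exp(v_t / (n * v_th)) / i0

# Illustrative (assumed) operating point
alpha, c_load, f_clk = 0.1, 1e-12, 1e9   # activity, F, Hz
k0, k1, i0 = 50e-6, 3e-9, 1e-6           # W, W, A

for v_t in (0.10, 0.20, 0.30):
    v_dyn = vdd_const_dynamic(k0, alpha, c_load, f_clk)
    v_stat = vdd_const_static(k1, i0, v_t)
    print(f"V_T={v_t:.2f} V  dynamic contour V_DD={v_dyn:.3f} V  "
          f"static contour V_DD={v_stat:.3f} V")
```

The dynamic-power column stays the same for every $V_T$ (a horizontal line), while the static-power column grows by a fixed factor for each equal step in $V_T$ (an exponential).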



# 2. Wires and Wire Models (15 pts.)

a) (8pts) The graph shown below shows the FO4 delay of a gate and the delay of three different wire lengths as a function of technology scaling. Show how you would obtain the circled values. That is, show the details of how you would compute the FO4 delay, and then the wire delay for a 3mm wire in a 90nm technology node. What assumption do you have to make about the wire width? Is this a reasonable assumption?



**Ans.**

The FO4 delay is the delay of a gate driving four identical gates. Assume L=0.1um in 90nm CMOS and use the NMOS delay, although the PMOS delay is larger. Each inverter has a total device width of 3W (NMOS width W plus PMOS width 2W), so W cancels out of the result:

$$FO4 = RC = \frac{12.5K\Omega}{W/L} \times (C_{eff} + 4C_g)\, 3W$$

$$FO4 = \frac{12.5K\Omega}{1/0.1um} \times ((1fF/um) + 4(2fF/um)) \times 3 \approx 34\ ps$$
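As a quick check of the arithmetic above, the calculation can be scripted. This is a minimal sketch using the values from the parameter table; the function name `fanout_delay` and its `beta` argument (PMOS/NMOS width ratio, taken as 2 from the 1X inverter sizes) are illustrative choices.

```python
R_EQ_N = 12.5e3   # ohms per square, NMOS (parameter table)
C_G = 2e-15       # F per um of gate width
C_EFF = 1e-15     # F per um of drain/source width
L_MIN = 0.1       # um

def fanout_delay(fanout, w_n=0.2, beta=2.0):
    """RC delay of an inverter driving `fanout` identical inverters."""
    r_drive = R_EQ_N / (w_n / L_MIN)           # NMOS pull-down resistance
    w_total = (1 + beta) * w_n                 # NMOS + PMOS width = 3W
    c_load = (C_EFF + fanout * C_G) * w_total  # self-load + fanout gates
    return r_drive * c_load

print(f"FO4 = {fanout_delay(4) * 1e12:.2f} ps")
print(f"FO1 = {fanout_delay(1) * 1e12:.2f} ps")
```

Because the drive resistance scales as 1/W and the load scales as W, the result does not depend on the choice of `w_n`.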

For the 3mm wire delay, use Elmore delay:

$$t_{wire} = \frac{R_{wire} C_{wire}}{2} = \frac{(0.054\Omega / sq)(3000um / W)(0.2 fF / um \times 3000um)}{2} = 486\ ps \quad (\text{with } W = 0.1um)$$

To get this delay, W must be set to the minimum value. This value of W is not reasonable for top level metal running 3mm in length. It should be at least 0.3um, but usually bigger than that depending on what we are routing and the buffer insertion.
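The sensitivity to wire width can be seen by repeating the Elmore calculation for a wider wire. This is a minimal sketch; the function name is illustrative, and it assumes (as the parameter table does) that the wire capacitance per unit length stays at 0.2fF/um regardless of width.

```python
R_SQ = 0.054      # ohms per square, copper (parameter table)
C_WIRE = 0.2e-15  # F per um of wire length (assumed independent of width)

def wire_delay(length_um, width_um):
    """Distributed-RC (Elmore) delay of an unbuffered wire: R*C/2."""
    r_wire = R_SQ * length_um / width_um  # squares = length / width
    c_wire = C_WIRE * length_um
    return r_wire * c_wire / 2

for width in (0.1, 0.3):
    print(f"3mm wire, W={width}um: {wire_delay(3000, width) * 1e12:.0f} ps")
```

Under that fixed-capacitance assumption, tripling the width to 0.3um cuts the resistance, and therefore the delay, by a factor of three.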

b) (7pts) A clock cycle of 10 FO4 delays is used for a processor design in a 90nm copper technology. However, the chip is so large that it is impossible to get a global signal across the chip in a single clock cycle.

i. (1pt) Estimate the maximum possible wire length that one can have in the design, assuming that it is unbuffered, based on the graph on the previous page.

**Ans.** 10FO4 delays would be about 350ps. This means that the wire would have to be about 2.5mm or shorter.
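The graph-based estimate can be cross-checked by inverting the Elmore wire-delay formula $t = R_{sq}(L/W)\,C_{wire}L/2$ for the length $L$. This is a minimal sketch; the function name is illustrative, and it assumes a minimum-width wire (W = 0.1um) and a budget of 10 times the unrounded FO4 value of 33.75ps.

```python
import math

R_SQ = 0.054      # ohms per square, copper (parameter table)
C_WIRE = 0.2e-15  # F per um of wire length

def max_unbuffered_length(t_budget, width_um=0.1):
    """Longest wire (um) whose R*C/2 delay fits within t_budget seconds."""
    return math.sqrt(2 * t_budget * width_um / (R_SQ * C_WIRE))

budget = 10 * 33.75e-12  # 10 FO4 delays
print(f"max length = {max_unbuffered_length(budget) / 1000:.2f} mm")
```

Since the delay grows with the square of the length, doubling the time budget only extends the reachable length by a factor of $\sqrt{2}$.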

ii. (6pts) Estimate how long a wire can be, assuming that it is buffered, before we have to insert a flip-flop in the signal path? That is, what is the maximum buffered wire length possible? (hint: take a long wire, perform buffer insertion, determine the delay of each section, and finally how many sections fit into a 10FO4 clock cycle)

**Ans.** The length would be greater than 2.5mm. Try 3mm to see what happens, since we already have some of those numbers.

$$\begin{split} FO1 &= RC = \frac{12.5K\Omega}{W/L} \times (C_{eff} + C_g)\, 3W \\ FO1 &= \frac{12.5K\Omega}{1/0.1um} \times ((1fF/um) + (2fF/um)) \times 3 \approx 11\ ps \\ N &= \sqrt{\frac{t_{wire}}{FO1}} = \sqrt{\frac{486}{11}} = 6.6 \ stages \approx use \ 7 \ stages \\ M &= \sqrt{\frac{R_{eqn}C_{int}}{C_gW(1+\beta)R_{int}}} = 88 \approx use \ 100X \\ t_{stage} &= \frac{R_{eqn}}{(100W/L)} (100(3W)C_{eff} + C_{wire}/2) + (\frac{R_{eqn}}{(100W/L)} + R_{wire})(C_{wire}/2 + 100(3W)C_g) \\ &= 125\Omega\,(30\ fF + 43\ fF) + (125\Omega + 230\Omega)(43\ fF + 60\ fF) \approx 46\ ps \\ t_{total} &= 7 \times 46\ ps \approx 320\ ps \end{split}$$

Here $R_{wire}$ and $C_{wire}$ in the $t_{stage}$ expression are the per-segment values (the 3mm totals divided by 7), and W = L = 0.1um for the unit device.

With 7 buffered stages on a 3mm wire, the total delay is about 320ps, which fits within the 10 FO4 budget of roughly 340-350ps. So a buffered 3mm wire stays within 10 FO4 delays even though an unbuffered one does not.
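The buffer-insertion steps above can be sketched in code to check the numbers. The function names are illustrative, and the sketch assumes the same setup as the worked answer: a unit NMOS with W = L = 0.1um, a PMOS twice as wide ($\beta$ = 2), a minimum-width 0.1um wire, and buffers rounded up from the ideal size to 100X.

```python
import math

# Values from the parameter table (90nm)
R_EQ_N = 12.5e3   # ohms per square, NMOS
C_G = 2e-15       # F/um, gate cap per unit width
C_EFF = 1e-15     # F/um, drain/source cap per unit width
R_SQ = 0.054      # ohms per square, copper
C_WIRE = 0.2e-15  # F/um of wire length
L_MIN = 0.1       # um

# Assumptions matching the worked answer
W_UNIT = 0.1                   # um, unit NMOS width (W = L)
BETA = 2.0                     # PMOS/NMOS width ratio
W_TOTAL = (1 + BETA) * W_UNIT  # 3W per unit inverter

def plan_buffers(length_um, wire_w_um=0.1):
    """Return (stage count N, ideal buffer size M) for a wire."""
    r_int = R_SQ * length_um / wire_w_um
    c_int = C_WIRE * length_um
    t_wire = r_int * c_int / 2
    r_unit = R_EQ_N / (W_UNIT / L_MIN)
    fo1 = r_unit * (C_EFF + C_G) * W_TOTAL
    n = math.ceil(math.sqrt(t_wire / fo1))
    m = math.sqrt(r_unit * c_int / (C_G * W_TOTAL * r_int))
    return n, m

def stage_delay(n, m, length_um, wire_w_um=0.1):
    """Elmore delay of one buffer plus its wire segment and next buffer."""
    r_buf = R_EQ_N / (m * W_UNIT / L_MIN)
    r_seg = R_SQ * length_um / wire_w_um / n
    c_seg = C_WIRE * length_um / n
    c_self = m * W_TOTAL * C_EFF   # driver's own drain cap
    c_next = m * W_TOTAL * C_G     # gate cap of the next buffer
    return r_buf * (c_self + c_seg / 2) + (r_buf + r_seg) * (c_seg / 2 + c_next)

n, m_ideal = plan_buffers(3000)
t_stage = stage_delay(n, 100, 3000)
print(f"N = {n}, ideal M = {m_ideal:.0f}, t_stage = {t_stage * 1e12:.1f} ps, "
      f"t_total = {n * t_stage * 1e12:.0f} ps")
```

To probe the maximum buffered length, one could call `plan_buffers` and `stage_delay` for longer wires and compare `n * t_stage` against the 10 FO4 budget.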

# 3. Various Topics in SoC/DSM (20 pts) (not on midterm)

- a) (2pts) When designing a reusable IP block, what portion of the design cycle takes the longest to complete? For a platform-based design, what is the most time-consuming portion?
  Ans. IP Verification takes the longest. For a platform-based design, it is system validation.
- b) (2pts) Why is the MIPS processor more popular than the ARM processor in the settop box application?
   Ans. It is a high-speed application so it requires MIPS. The ARM processor is for low-power applications such as cell phones.
- c) (4pts) List 4 major issues that are expected at the 65nm technology node.
   Ans.
  - Thin-oxide gate leakage
  - Subthreshold leakage
  - Intra-die process variations
  - Photolithographic problems
- d) (2pts) Which is the bigger concern today in power supply noise, IR drop or L di/dt variations? Why?
   Ans. IR drop is a bigger concern for the power grid itself. L di/dt is due to the package pin inductance and is increasing due to larger di/dt values.

e) (3pts) What is the typical cost of a mask set at 90nm? What is the overall design cost for a 50M transistor chip? In that case, why are designers so worried about mask costs?

**Ans.** Around \$1M for a mask set. For a chip, the cost is around \$25M-\$30M. Designers are worried about mask sets since each respin requires a new mask set.

- f) (4pts) Briefly describe scan-based testing using a diagram. Why is scan-based testing such a popular technique in the industry? What are the limitations and overhead associated with the approach.
   Ans. See EECE578 (not covered in our course this year since 578 is offered)
- g) (3pts) In a two-column format, compare and contrast EEPROMs with FLASH memories.
  Ans.

| EEPROM                   | FLASH                                   |
|--------------------------|-----------------------------------------|
| Two-transistor cell      | One-transistor cell                     |
| Write using FN tunneling | Write: hot-carrier, Erase: FN tunneling |
| Selective erase          | Block erase                             |
| Larger area              | Smaller area                            |