

ICTP Microprocessor Laboratory African Regional Course on Advanced VLSI Design Techniques Kwame Nkrumah University of Science and Technology, Kumasi, Ghana 24 November – 12 December 2003





# Noise and Matching in CMOS (Analog) Circuits



Giovanni Anelli CERN - European Organization for Nuclear Research Experimental Physics Division Microelectronics Group CH-1211 Geneva 23 – Switzerland Giovanni.Anelli@cern.ch









- Noise in CMOS IC components
  - Definitions and important formulas
  - Thermal, shot and 1/f noise
  - Noise sources in an MOS transistor
  - Some measurement examples
- Are identically designed IC components really identical?
  - Matching: definitions
  - Causes of mismatch
  - Matching characterization
  - Measurement examples
  - Matching golden rules



*Noise* can be defined as any unwanted disturbance that obscures or interferes with a desired signal.

Noise is an extremely important parameter in analog design: it determines the resolution of a sensor, the smallest detectable signal in a circuit and the dynamic range of a system.

Noise is a totally random signal. It consists of frequency components that are random in both amplitude and phase. The exact noise amplitude at any instant cannot be predicted. We can only measure its long-term root mean square (rms) value, or predict its "randomness". The average power of noise is also predictable.

Most noise sources of interest for us have a Gaussian distribution of instantaneous amplitudes versus time.

C. D. Motchenbacher and J. A. Connelly, Low Noise Electronic System Design, John Wiley and Sons, 1993.



## Gaussian (Normal) distribution


$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

p(x) is a Probability Density Function. The area under the curve represents the probability that a particular event will occur.

$\mu$: mean value

$\sigma$: standard deviation of the variable x (equal to its root mean square value when $\mu = 0$)

$\sigma^2$: variance (equal to the mean square value when $\mu = 0$)

68% of the events occur within $\pm\sigma$ of the mean

99.7% of the events occur within $\pm 3\sigma$ of the mean
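The two percentages above can be checked directly: for a Gaussian variable, the probability of falling within $\pm k\sigma$ of the mean is $\mathrm{erf}(k/\sqrt{2})$. A minimal sketch (the function name `prob_within` is only illustrative):

```python
import math

def prob_within(k: float) -> float:
    """Probability that a Gaussian variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within +/-{k} sigma: {100.0 * prob_within(k):.2f} %")
```

The values for k = 1 and k = 3 round to the 68% and 99.7% quoted above.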



John R. Taylor, *An Introduction to Error Analysis*, University Science Books, 2<sup>nd</sup> Edition, 1997, Chapter 5.



**Average noise power:**

$$P_{av} = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\, dt \quad [V^2]$$

**Power spectral density (PSD):** shows how much power the signal (noise in our case) carries at each frequency. It is generally indicated by S(f), and measured in V<sup>2</sup>/Hz.

If a signal (noise) with PSD  $S_{IN}(f)$  is applied to a linear time-invariant system with transfer function H(f), then the output spectrum  $S_{OUT}(f)$  is given by:  $S_{OUT}(f) = S_{IN}(f) \cdot |H(f)|^2$ 

The average noise power of the sum of two separate noise sources is:

$$P_{av_{tot}} = P_{av_{1}} + P_{av_{2}} + \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} 2x_{1}(t)x_{2}(t)dt$$

The last term expresses the "correlation" between the two noise sources. For uncorrelated sources it vanishes, and the average powers simply add.
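The role of the correlation term can be illustrated numerically with sampled signals, replacing the time integrals by averages over the samples. This is only a sketch: the two sources are synthetic Gaussian sequences, and all names are illustrative.

```python
import math
import random

def average_power(x):
    """Discrete-time estimate of the average power: mean of x(t)^2."""
    return sum(v * v for v in x) / len(x)

def correlation_term(x1, x2):
    """Discrete-time estimate of the last term: mean of 2*x1(t)*x2(t)."""
    return sum(2.0 * a * b for a, b in zip(x1, x2)) / len(x1)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20_000
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # generated independently of x1

p1, p2 = average_power(x1), average_power(x2)
cross = correlation_term(x1, x2)
p_tot = average_power([a + b for a, b in zip(x1, x2)])

print(f"P1 + P2          = {p1 + p2:.4f}")
print(f"correlation term = {cross:.4f}")
print(f"P_tot            = {p_tot:.4f}")

# Fully correlated case (x2 = x1): the last term equals 2*P1, so P_tot = 4*P1.
p_tot_corr = average_power([2.0 * a for a in x1])
print(f"P_tot, x2 = x1   = {p_tot_corr:.4f} (4*P1 = {4.0 * p1:.4f})")
```

For independent sources the correlation term is small compared with P1 + P2, which is why the rms values of uncorrelated noise sources add in quadrature.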

Kumasi, December 2003

Giovanni Anelli, CERN



Thermal noise is caused by the random thermally excited vibration of the charge carriers in a conductor.

Thermal noise was first observed by J. B. Johnson in 1927, and a theoretical analysis was provided by H. Nyquist in 1928. Thermal noise is therefore also called Johnson or Nyquist noise.



$$S_v(f) = 4kTR, \quad f \ge 0 \quad [V^2/Hz] \qquad\qquad \overline{v_n^2} = 4kTR \cdot \Delta f \quad [V^2]$$

$k = 1.38 \cdot 10^{-23}$ J/K (Boltzmann's constant)



Thermal noise does not depend on frequency (up to ~100 THz), and it is therefore called "white": just as white light is made up of many colors, white noise is made up of many frequency components.
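As a worked example, the sketch below evaluates the thermal noise density of a resistor and then applies $S_{OUT}(f) = S_{IN}(f) \cdot |H(f)|^2$ to a first-order RC low-pass filter. The component values (1 kΩ, 1 pF, 300 K) and the function names are assumptions chosen for illustration. Integrating the output spectrum numerically should approach the well-known closed-form result kT/C.

```python
import math

K_B = 1.38e-23  # Boltzmann's constant in J/K, as given in the text

def thermal_noise_density(r_ohm, temp_k=300.0):
    """Thermal noise voltage PSD of a resistor, S_v = 4kTR, in V^2/Hz."""
    return 4.0 * K_B * temp_k * r_ohm

def rc_output_noise_power(r_ohm, c_farad, temp_k=300.0, steps=200_000, span=1.0e4):
    """Integrate S_out(f) = S_in * |H(f)|^2 for an RC low-pass filter.

    Midpoint rule from 0 to span * f_c, where f_c is the -3 dB frequency.
    """
    f_c = 1.0 / (2.0 * math.pi * r_ohm * c_farad)
    s_in = thermal_noise_density(r_ohm, temp_k)
    df = span * f_c / steps
    total = 0.0
    for i in range(steps):
        f = (i + 0.5) * df
        h_squared = 1.0 / (1.0 + (f / f_c) ** 2)  # |H(f)|^2 of the RC filter
        total += s_in * h_squared * df
    return total

r, c, t = 1.0e3, 1.0e-12, 300.0  # illustrative values
density = math.sqrt(thermal_noise_density(r, t))
numeric = rc_output_noise_power(r, c, t)
closed_form = K_B * t / c

print(f"noise density       : {density * 1e9:.2f} nV/sqrt(Hz)")
print(f"integrated (numeric): {math.sqrt(numeric) * 1e6:.1f} uV rms")
print(f"sqrt(kT/C)          : {math.sqrt(closed_form) * 1e6:.1f} uV rms")
```

The integrated noise does not depend on R: a larger resistor gives a higher noise density but a proportionally smaller bandwidth.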





Shot noise is *always* associated with a direct current flow. It is present in diodes, MOS transistors and bipolar transistors. Shot noise arises from the "granularity" of the current, *and* it is associated with current flow across a potential barrier.
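For a quantitative feel, the sketch below assumes the standard shot noise expression $S_i(f) = 2qI$ [A²/Hz], where q is the electron charge. It compares this with the thermal current noise of a resistor, $4kT/R$, which is the current-source form of $4kTR$. Setting the two equal gives $I \cdot R = 2kT/q$. Function names and values are illustrative.

```python
import math

Q_E = 1.602e-19  # electron charge in C
K_B = 1.38e-23   # Boltzmann's constant in J/K

def shot_noise_density(i_dc):
    """Shot noise current PSD, 2*q*I, in A^2/Hz (standard expression, assumed here)."""
    return 2.0 * Q_E * i_dc

def thermal_current_noise_density(r_ohm, temp_k=300.0):
    """Thermal noise of a resistor expressed as a current PSD, 4kT/R, in A^2/Hz."""
    return 4.0 * K_B * temp_k / r_ohm

def crossover_voltage(temp_k=300.0):
    """Voltage drop I*R at which 2qI equals 4kT/R."""
    return 2.0 * K_B * temp_k / Q_E

i_dc = 1.0e-3  # 1 mA, illustrative
r_equal = crossover_voltage() / i_dc
print(f"shot noise at 1 mA  : {math.sqrt(shot_noise_density(i_dc)) * 1e12:.1f} pA/sqrt(Hz)")
print(f"crossover I*R       : {crossover_voltage() * 1e3:.1f} mV")
print(f"R with equal thermal: {r_equal:.1f} ohm")
```

Under these assumptions, a resistor carrying a current shows as much thermal noise as a shot-noise source with the same current when the voltage drop across it is about 2kT/q, roughly 52 mV at 300 K.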







1/f noise (also called flicker noise, excess noise, pink noise, low-frequency noise and semiconductor noise) is found in all active devices as well as in some discrete passive components such as carbon resistors. It exists only in association with a direct current. The origins of flicker noise are varied, but it is caused mainly by traps associated with contamination and crystal defects.

$$S_{1/f}(f) = K \frac{I^{b}}{f^{\alpha}}, \quad f \ge 0 \quad [A^2/Hz] \qquad\qquad \overline{i_n^2} = K \frac{I^{b}}{f^{\alpha}} \cdot \Delta f \quad [A^2]$$

 $\alpha$  is a constant  $\approx$  1, b is a constant in the range 0.5 to 2.
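Integrating the spectral density above over a band from $f_1$ to $f_2$ gives $K I^b \ln(f_2/f_1)$ for $\alpha = 1$, so every frequency decade carries the same noise power. A minimal sketch, in which the values of K, I and b are hypothetical placeholders rather than data for any real device:

```python
import math

def flicker_noise_power(k, i_dc, b, f_low, f_high, alpha=1.0):
    """Integral of K * I^b / f^alpha from f_low to f_high, in A^2."""
    if alpha == 1.0:
        return k * i_dc ** b * math.log(f_high / f_low)
    exponent = 1.0 - alpha
    return k * i_dc ** b * (f_high ** exponent - f_low ** exponent) / exponent

K_HYP = 1.0e-15  # hypothetical 1/f coefficient
I_DC = 1.0e-3    # 1 mA bias current, illustrative
B = 1.0          # within the 0.5 to 2 range given in the text

low_decade = flicker_noise_power(K_HYP, I_DC, B, 1.0, 10.0)
high_decade = flicker_noise_power(K_HYP, I_DC, B, 1.0e4, 1.0e5)
print(f"1 Hz to 10 Hz     : {low_decade:.3e} A^2")
print(f"10 kHz to 100 kHz : {high_decade:.3e} A^2")
```

This is in contrast with white noise, where the power in a band grows linearly with the bandwidth.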



P. R. Gray et al., Analysis and design of analog integrated circuits, John Wiley and Sons, 4<sup>th</sup> Edition, 2002, Chapter 11.



#### Noise in passive components



N.B. Discrete resistors made with carbon granules also show 1/f noise

There are no sources of noise in ideal capacitors or inductors. In practice, real components have parasitic resistance that does display thermal noise!



#### Thermal noise vs shot noise





To be independent from the gain of a given system, we use the concept of input-referred noise. This makes it easy to compare the noise performance of different circuits (with different gains) and to calculate the Signal-to-Noise Ratio (SNR).

At the input of our linear two-port circuit, we use two noise generators (one noise voltage source and one noise current source) to represent the noise of the system regardless of the impedance at the input of the circuit and of the source driving the circuit.





Channel thermal noise: due to the random thermal motion of the carriers in the channel

1/f noise: due to the random trapping and detrapping of mobile carriers in the traps located at the Si-SiO<sub>2</sub> interface and within the gate oxide.

Bulk resistance thermal noise: due to the distributed substrate resistance.

Gate resistance thermal noise: due to the resistance of the polysilicon gate and of the interconnections.

Gate leakage current shot noise: due to the gate leakage current, which is generally very small (not shown in the following slides).



### Noise generators in the equivalent circuit



Z.Y. Chang and W.M.C. Sansen, "Low-noise wide-band amplifiers in bipolar and CMOS technologies", Kluwer Academic Publishers, 1991.



#### Input-referred noise generators



Z.Y. Chang and W.M.C. Sansen, "Low-noise wide-band amplifiers in bipolar and CMOS technologies", Kluwer Academic Publishers, 1991.





I.C. = Inversion coefficient

K<sub>a</sub> = 1/f noise parameter



## F(I.C.) from weak inversion (W.I.) to strong inversion (S.I.)



Y. Tsividis, Operation and Modeling of The MOS Transistor, 2nd edition, McGraw-Hill, 1999, pp. 426-427



### Bulk resistance noise evaluation



S. Tedja et al., "Noise Spectral Density Measurements of a Radiation Hardened CMOS Process in the Weak and Moderate Inversion", *IEEE Transactions on Nuclear Science (IEEE TNS)*, vol. 39, no. 4, 1992, pp. 804-808.

S. Tedja et al., "Analytical and Experimental Studies of Thermal Noise in MOSFET's", *IEEE Transactions on Electron Devices (IEEE TED)*, vol. 41, no. 11, November 1994, pp. 2069-2075.




#### Noise measurement system





#### $W = 2 mm, L = 0.5 \mu m$





 $W = 2 \text{ mm}, I_{DS} = 0.5 \text{ mA}, V_{DS} = 0.8 \text{ V}, V_{BS} = 0 \text{ V}$ 





 $W = 2 \text{ mm}, I_{DS} = 0.5 \text{ mA}, V_{DS} = 0.8 \text{ V}, V_{BS} = 0 \text{ V}$ 





NMOS, W = 2 mm, L = 0.5  $\mu$ m, V<sub>DS</sub> = 0.8 V, V<sub>BS</sub> = 0 V





## 1/f Noise parameter K<sub>a</sub>

#### **0.25 μm CMOS technology**





#### Excess noise factor $\Gamma$

#### **0.25 μm CMOS technology**





### White noise "visual estimation"



Look at the noise with an oscilloscope. If you recognize that it is white, take the peak-to-peak value and divide it by 6. This will give you a fairly good estimate of the rms noise value.
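The divide-by-6 rule follows from the fact that 99.7% of Gaussian samples lie within $\pm 3\sigma$, so the visible peak-to-peak excursion is roughly $6\sigma$. The sketch below tries it on a synthetic trace. The 1000-sample trace length and the 1 mV rms amplitude are arbitrary assumptions.

```python
import math
import random

def rms(samples):
    """Root mean square value of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def rms_from_peak_to_peak(samples):
    """Oscilloscope-style estimate: peak-to-peak value divided by 6."""
    return (max(samples) - min(samples)) / 6.0

rng = random.Random(2003)  # fixed seed so the run is repeatable
trace = [rng.gauss(0.0, 1.0e-3) for _ in range(1000)]  # zero-mean white Gaussian noise

print(f"true rms        : {rms(trace) * 1e3:.3f} mV")
print(f"peak-to-peak / 6: {rms_from_peak_to_peak(trace) * 1e3:.3f} mV")
```

The estimate depends on how many points are observed: a longer record catches rarer peaks, so the peak-to-peak value grows slowly with the record length and the rule tends to overestimate.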





- Noise in CMOS IC components
  - Definitions and important formulas
  - Thermal, shot and 1/f noise
  - Noise sources in an MOS transistor
  - Some measurement examples
- Are identically designed IC components really identical?
  - Matching: definitions
  - Causes of mismatch
  - Matching characterization
  - Measurement examples
  - Matching golden rules



## The importance of matching

#### Yield of an N-bit converter as a function of the comparator mismatch



M.J.M. Pelgrom et al., "Matching Properties of MOS Transistors", *IEEE Journal of Solid-State Circuits (IEEE JSSC)*, vol. 24, no. 10, 1989, p. 1433.

M.J.M. Pelgrom et al., "A 25-Ms/s 8-bit CMOS A/D Converter for Embedded Application", *IEEE JSSC*, vol. 29, no. 8, Aug. 1994, pp. 879-886.



### What is "matching"?



#### **DEFINITION:**

Matching is the statistical study of the differences between identically designed components placed at a small distance from each other, in an identical environment, and used with the same bias conditions.



## Relative & absolute mismatch

Mismatch occurs for all IC components (resistors, capacitors, bipolar and MOS transistors).



#### **Relative mismatch**

**Absolute mismatch** 



### Mismatch in MOS transistors





Mismatch is given by differences in physical parameters (such as  $t_{ox}$ ,  $N_a$ ,  $\mu$ ) as well as layout dimensions (W, L)



## Mismatch in MOS transistors

Mismatch in physical parameters (N<sub>a</sub>,  $\mu$, t<sub>ox</sub>) and layout dimensions (W, L) gives rise to mismatch in electrical parameters (V<sub>T</sub>,  $\beta$  and therefore I<sub>D</sub>)





### What causes mismatch?

Differences between (supposedly) identically designed components can be attributed to two classes of effects. We will see that true mismatch is caused by stochastic effects, while systematic effects give rise to offsets.

#### **Stochastic effects:**

- Ion implantation
- Dopant diffusion
- Dopant clustering
- Interface states
- Edge roughness
- Polysilicon grain effects

#### **Systematic effects:**

- Dimensional errors
- Device orientation
- Mechanical stress
- Temperature differences
- Different DC bias
- Parasitic components



#### Variations across a wafer





#### Variations within a chip





#### Matched transistors




The short distance between the two devices eliminates the systematic effects and leaves only the stochastic effects

$$\Delta V_T \approx 0.1 - 10 \text{ mV}$$
$$\frac{\Delta \beta}{\beta} \approx 0.1 - 5\ \%$$



H. P. Tuinhout, "Design of Matching Test Structures", *Proceedings IEEE 1994 International Conference on Microelectronic Test Structures*, vol. 7, March 1994, pp. 21-27.




### Electrical systematic effects



Mind the parasitics (for example, line resistances). Timing offsets must also be considered, in both analog and digital circuits.




## Environmental systematic effects

Temperature effects:  $V_T$  and  $\beta$  are very sensitive to temperature. Blocks dissipating a lot of power on a chip can induce differences in devices which are not on the same isothermal curve. These effects depend on:

- Type of package
- Bonding wires
- Chip attachment (glue)

$$\Delta V_T \approx 1 \text{ to } 3 \text{ mV}/^{\circ}\text{C} \qquad\qquad \frac{\Delta \beta}{\beta} \approx -0.5\ \%/^{\circ}\text{C}$$

**Mechanical stress effects:** mainly given by packaging. Depend on:

- Type of package
- Mounting of the chip
- Materials used
- Chip coating

Mechanical stresses can affect the mobility through the piezoresistive effect.

J. Bastos, M. Steyaert, B. Graindourze and W. Sansen, "Influence of die attachment on MOS matching", *Proceedings of the IEEE 1996 International Conference on Microelectronic Test Structures*, vol. 9, March 1996, pp. 27-31.




## Technological systematic effects

- Proximity effects: plasma etching is sensitive to pattern density. Use dummy structures.
- Device orientation: ion implantation is done with a tilted beam. This can create different parasitics. The current flow direction is also important.
- Metal coverage effects: do not cover one of the devices with metal!
- Internal mechanical stresses: caused by LOCOS or STI and by strained layers in the chip
- Charging damage: do not introduce "antennas"
- Contact and via non-uniformities

H. P. Tuinhout, M. Pelgrom, R. P. de Vries and M. Vertregt, "Effects of metal coverage on MOSFET matching", *Technical Digest of the IEEE International Electron Device Meeting 1996*, pp. 735-738.

R. W. Gregor, "On the Relationship Between Topography and Transistor Matching in an Analog CMOS Technology", *IEEE Transactions on Electron Devices*, vol. 39, no. 2, February 1992, pp. 275-282.

H. P. Tuinhout and M. Vertregt, "Test Structures for Investigation of Metal Coverage Effects on MOSFET Matching", *Proceedings IEEE 1997 International Conference on Microelectronic Test Structures*, vol. 10, March 1997, pp. 179-183.

H. P. Tuinhout and W. C. M. Peters, "Measurement of Lithographical Proximity Effects on Matching of Bipolar Transistors", *Proceedings IEEE 1998 International Conference on Microelectronic Test Structures*, vol. 11, March 1998, pp. 7-12.



## Stochastic effects: matching!

True stochastic mismatch is caused by random fluctuations of device properties. Obtaining zero offset is a matter of good engineering, but stochastic mismatch is something we cannot get rid of!



The main causes are:

- Random fluctuation of the number of dopant atoms in the channel
- Fluctuations in the number of oxide charges and interface states
- Polysilicon gate granularity
- Dimensional effects
- Series resistances

H. P. Tuinhout, A. H. Montree, J. Schmitz and P. A. Stolk, "Effects of Gate Depletion and Boron Penetration on Matching of Deep Submicron CMOS Transistors", *Technical Digest of the IEEE International Electron Device Meeting 1997*, pp. 631-634.

P. A. Stolk, F. P. Widdershoven and D. B. M. Klaassen, "Modeling Statistical Dopant Fluctuations in MOS Transistors", *IEEE Transactions on Electron Devices*, vol. 45, no. 9, September 1998, pp. 1960-1971.



Strong gradients across a wafer can be a source of mismatch. This can appear as a random effect if we do not know where on the wafer the chips were taken from.

This effect, which is important only for large devices or large distances, can be accounted for in the following way:

$$\sigma_{\Delta P}^2 = \frac{A_P^2}{WL} + S_P^2 \cdot D^2$$

D is the distance between the components.  $S_{Vth}$  and  $S_{\beta}$  can be of the order of 1  $\mu$ V /  $\mu$ m and 1 ppm /  $\mu$ m respectively, i.e. for a distance of 1 cm we have an offset in threshold voltages of 10 mV or a mismatch in currents of 1%.

 $S_P$  also depends strongly on the "maturity" of the process. Processes used for mass production (very mature) generally have lower  $S_P$  values.
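The model above can be evaluated directly. In the sketch below, $S_{Vth}$ = 1 µV/µm is the order of magnitude quoted in the text, while the area coefficient $A_{Vth}$ = 5 mV·µm and the 10 µm × 10 µm device size are hypothetical values chosen only to show when the distance term starts to dominate.

```python
import math

def sigma_mismatch(a_p, w_um, l_um, s_p=0.0, d_um=0.0):
    """sigma of Delta P from sigma^2 = A_P^2 / (W * L) + S_P^2 * D^2 (lengths in um)."""
    return math.sqrt(a_p ** 2 / (w_um * l_um) + (s_p * d_um) ** 2)

A_VTH = 5.0     # mV*um, hypothetical area coefficient
S_VTH = 1.0e-3  # mV/um, i.e. 1 uV/um as quoted in the text
W_UM, L_UM = 10.0, 10.0  # hypothetical device size

for d_um in (10.0, 100.0, 1.0e3, 1.0e4):
    sigma = sigma_mismatch(A_VTH, W_UM, L_UM, S_VTH, d_um)
    print(f"D = {d_um:8.0f} um -> sigma(Delta Vth) = {sigma:.3f} mV")
```

With these numbers the area term (0.5 mV) dominates up to distances of a few hundred µm. At D = 1 cm the gradient term alone contributes 10 mV, as stated above.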

M. J. M. Pelgrom, A. C. J. Duinmaijer, A. P. G. Welbers, "Matching Properties of MOS Transistors", *IEEE JSSC*, vol. 24, Oct. 1989, pp. 1433-1440.



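Random effects "average out" better if the area is bigger. Therefore we expect the mismatch to scale with the inverse square root of the device area, something like $\sigma_{\Delta P} \propto 1/\sqrt{WL}$.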



K. R. Lakshmikumar, R. A. Hadaway and M. A. Copeland, "Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design", *IEEE Journal of Solid-State Circuits*, vol. 21, no. 6, December 1986, pp. 1057-1066.




## Matching characterization



- Transistor pairs of different dimensions are designed and repeated in several positions on the wafer
- For each transistor pair dimension:
  - $V_T$ and $\beta$ are measured for each transistor of each pair
  - $\Delta V_T$ and $\Delta\beta/\beta$ are calculated for each pair measured
  - $\sigma_{\Delta V_{th}}$ and $\sigma_{\Delta\beta/\beta}$ are extracted from the two distributions
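The extraction steps listed above can be sketched as follows. The measured values are made up for illustration, and normalizing $\Delta\beta$ by the average $\beta$ of the pair is an assumption (the slides do not say which normalization is used).

```python
import statistics

def extract_mismatch(pairs):
    """Extract mismatch standard deviations for one pair geometry.

    pairs: list of (vt_a, vt_b, beta_a, beta_b) tuples, one per measured pair.
    Returns (sigma of Delta V_T, sigma of Delta beta / beta).
    """
    delta_vt = [vt_a - vt_b for vt_a, vt_b, _, _ in pairs]
    # Assumption: relative beta difference is referred to the pair average.
    delta_beta_rel = [(b_a - b_b) / ((b_a + b_b) / 2.0) for _, _, b_a, b_b in pairs]
    return statistics.stdev(delta_vt), statistics.stdev(delta_beta_rel)

# Hypothetical measurements: V_T in volts, beta in A/V^2.
measured = [
    (0.502, 0.500, 101e-6, 100e-6),
    (0.499, 0.501, 99e-6, 100e-6),
    (0.500, 0.500, 100e-6, 100e-6),
    (0.501, 0.500, 100e-6, 101e-6),
]

sigma_dvt, sigma_dbeta = extract_mismatch(measured)
print(f"sigma(Delta V_T)       = {sigma_dvt * 1e3:.2f} mV")
print(f"sigma(Delta beta/beta) = {sigma_dbeta * 100:.2f} %")
```

In a real characterization the list would hold many more pairs per geometry, and the procedure would be repeated for each transistor pair dimension.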



### Matching characterization



Matching is a statistical study. Therefore, for a given transistor pair, we need to measure a statistically significant number N of matched pairs. The fractional uncertainty in  $\sigma$  is:
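Assuming the standard result for N Gaussian-distributed samples, $\delta\sigma/\sigma = 1/\sqrt{2(N-1)}$, a short sketch shows how slowly this uncertainty shrinks as more pairs are measured. The function name is illustrative.

```python
import math

def fractional_uncertainty_in_sigma(n):
    """Fractional uncertainty of a standard deviation estimated from n samples.

    Assumes the standard Gaussian-sample result 1 / sqrt(2 * (n - 1)).
    """
    if n < 2:
        raise ValueError("at least two samples are needed")
    return 1.0 / math.sqrt(2.0 * (n - 1))

for n in (10, 50, 180, 1000):
    print(f"N = {n:5d} -> uncertainty in sigma = {100.0 * fractional_uncertainty_in_sigma(n):.1f} %")
```

Under this assumption, about 50 pairs are needed for a 10% uncertainty, and halving the uncertainty requires roughly four times as many measurements.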



John R. Taylor, An Introduction to Error Analysis, University Science Books, 2<sup>nd</sup> Edition, 1997, p. 140.




- The measurement system must be able to measure many devices quickly and with high precision (< 10  $\mu$V for threshold voltages and < 0.1 % for currents).
- It is generally composed of a computer-controlled wafer prober connected to a High Precision Semiconductor Parameter Analyzer (which is made up of several Source Monitor Units).
- All the connections are made with triaxial cables (between the external grounded shield and the internal wire carrying the signal there is a driven guard connected to a separate low noise amplifier).
- The extraction algorithm must be optimized to obtain the required resolution in the shortest possible time.



## Data analysis procedure

- Sort values low-high
- Disregard "outliers"
- Cumulative distribution plots
- x-y coordinates scatter plots
- The statistical uncertainties are estimated using bootstrap analysis
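The bootstrap step can be sketched as follows: the measured $\Delta V_T$ values are resampled with replacement many times, $\sigma$ is recomputed for each resample, and the spread of those values is taken as the uncertainty of $\sigma$. The data here are synthetic (180 values with a true $\sigma$ of 2 mV), and the number of resamples is an arbitrary choice.

```python
import math
import random

def sample_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def bootstrap_sigma(values, n_resamples=1000, seed=0):
    """Return (sigma, bootstrap standard error of sigma)."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=n)  # draw n values with replacement
        estimates.append(sample_std(resample))
    return sample_std(values), sample_std(estimates)

# Synthetic Delta V_T data in mV (hypothetical, not measured values).
data_rng = random.Random(42)
delta_vt = [data_rng.gauss(0.0, 2.0) for _ in range(180)]

sigma, sigma_err = bootstrap_sigma(delta_vt)
print(f"sigma(Delta V_T) = {sigma:.2f} mV +/- {sigma_err:.2f} mV")
```

Unlike a closed-form error estimate, the bootstrap makes no assumption about the shape of the distribution, which is useful when a few outliers remain in the data.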

#### **Cumulative distribution plot**



P. Diaconis and B. Efron, "Computer-Intensive methods in Statistics", Scientific American, vol. 248, no. 5, May 1983, pp. 96-108.



#### Expected mismatch

$$\sigma_{\Delta V_{th}} = \frac{A_{V_{th}}}{\sqrt{W L}} \qquad A_{V_{th}} = \sqrt{A_N^2 + A_{IT}^2 + \dots} \qquad A_N = \sqrt{2} \cdot C \cdot \frac{t_{ox}}{\varepsilon_{ox}} \cdot \sqrt{N_{IT}}$$

$$\sigma_{\Delta \beta / \beta}^2 = \frac{A_{\mu}^2}{W L} + \frac{A_{C_{ox}}^2}{W L} + \frac{A_{W}^2}{W^2 L} + \frac{A_{L}^2}{W L^2} \quad \longrightarrow \quad \sigma_{\Delta \beta / \beta} = \frac{A_{\beta}}{\sqrt{W L}}$$
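Inverting $\sigma_{\Delta V_{th}} = A_{V_{th}}/\sqrt{WL}$ gives the gate area needed for a target mismatch, $WL = (A_{V_{th}}/\sigma_{target})^2$. A minimal sketch, in which $A_{V_{th}}$ = 5 mV·µm is a hypothetical coefficient and not a value taken from the measurements:

```python
def required_gate_area(a_vth_mv_um, sigma_target_mv):
    """Gate area W*L in um^2 needed to reach a target sigma of Delta V_th."""
    return (a_vth_mv_um / sigma_target_mv) ** 2

A_VTH = 5.0  # mV*um, hypothetical
for target_mv in (2.0, 1.0, 0.5):
    area = required_gate_area(A_VTH, target_mv)
    print(f"target sigma = {target_mv:.1f} mV -> W*L = {area:.2f} um^2")
```

Halving the target mismatch costs a factor of four in area.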



## MOST $V_T$ mismatch: example

0.25  $\mu$ m technology – t<sub>ox</sub> = 5.5 nm



"Behind" each point of this plot there are 180 measurements

The same behavior is found for the β mismatch



### Common centroid transistor pair





Low supply voltages and small devices increase the impact of transistor property variations on chip performance, and not only for analog circuits.

Examples of digital circuits sensitive to mismatch are memory cells and clock distribution circuits.

Mismatch in general reduces the immunity to noise of digital circuits.

In the case of memories, the difference in the threshold voltages of two transistors of the same memory can be of the order of 100 mV or more.

As we will see in the next lecture, this can become a problem especially in more advanced technologies.



#### What to take into account when designing matched components

- Same dimensions, shape, interconnections, orientation and temperature
- Small distance
- Same bias conditions, currents in the same directions
- Do not use minimum sizes
- Mind voltage drops in wiring
- No metal wiring over components
- Use cross-coupled structures in the presence of strong gradients
- Use dummy structures (up to 20  $\mu\text{m}$ )
- Use star connections for power and for delicate timing
- Do not create large current loops
- Keep the power of different blocks separated
- Stay away (40  $\mu$ m) from other blocks and (200  $\mu$ m) from the chip edges
- Mind packaging effects