Proceedings of the ASME 2011 Pacific Rim Technical Conference & Exposition on Packaging and Integration of Electronic and Photonic Systems InterPACK2011 July 6-8, 2011, Portland, Oregon, USA

# InterPACK2011-52240

# **IC-PACKAGE THERMAL CO-ANALYSIS IN 3D IC ENVIRONMENT**

Stephen H. Pan Apache Design Solutions San Jose, CA, USA Norman Chang Apache Design Solutions San Jose, CA, USA Ji Zheng Apache Design Solutions San Jose, CA, USA

## ABSTRACT

Power on chip is highly temperature dependent in deep sub-micron VLSI. With increasing power density in modern 3D-IC and SiP designs, thermally induced reliability and performance issues such as leakage power and electromigration must be taken into consideration in system-level design. This paper presents a new methodology and its applications to accurately and efficiently predict power and temperature distribution for 3D ICs.

## INTRODUCTION

As VLSI technology scales, thermal issues are becoming a critical factor in determining the performance, reliability and cost of high performance ICs. It is well known that excessively high temperature can significantly degrade the reliability of interconnects and devices, and even cause functional failures through electro-thermal coupling [1, 2, 4, 9].

In [2], temperature contours within a conventional 2D SoC and 3D (stacked-die with TSV) ICs were obtained through a three-dimensional FE (Finite Element) thermal simulation covering both power dissipation in the devices and Joule heating in the wires. The thermal boundary conditions were simplified as adiabatic, and assumed heat transfer coefficients were used on the heat sink side. In addition, the temperature dependency of IC power was not discussed. In [3], an iterative process using the finite difference method was introduced for a 2D IC to obtain a converged temperature with thermal-aware power. Here "converged" means that a steady temperature-power state on chip is reached in the package/board environment when power on chip is temperature-dependent. As in [2], the effects of package/board were over-simplified. Other studies used equivalent thermal resistors to construct thermal models for the chip [6, 7], also with simplified thermal boundaries for the package/board environment. In [8], a full 3D IC-package-board system was modeled by the finite element method, using smeared metal and dielectric layers with a constant IC power density map on chip; the IC power density map had no temperature dependency.

Since the package substrate is the most important thermal passage to dissipate heat to the PCB in a BGA design, the desired IC thermal analysis should first accurately capture the heat conduction in the package: it must model metal/via on interconnect layers in sufficient detail. Second, the thermal analysis must be able to solve for the converged IC power and temperature distributions simultaneously. In addition, the thermal analyzer should be 3D IC and stacked-die package capable.

A new methodology for 3D IC-package thermal co-analysis including the thermal-power coupling was proposed in [13]. This paper summarizes the co-analysis methodology and presents case studies and numerical results on the factors affecting thermal results in co-analysis of the 3D-IC environment.

## NOMENCLATURE

- $\rho$ = mass density (gm/mm<sup>3</sup>)
- $C_p$ = specific heat (Joule/gm-°K)
- $T$ = temperature (°C)
- $t$ = time (sec)
- $r$ = location (mm)
- $\kappa$ = thermal conductivity (W/°K-mm)
- $p$ = heat generation in chip (W)
- $P_{avg}$ = average power dissipation on chip at a frequency
- $P_{dynamic}$ = power dissipation due to switching
- $P_{short-circuit}$ = power dissipation due to the direct path short-circuit current
- $P_{static}$ = power dissipation due to static current, including leakage current
- $q$ = heat flux (W)
- $h$ = heat transfer coefficient (W/mm<sup>2</sup>-°K)
- $I$ = electric current (Amp)
- $R$ = electric resistance (Ohm)

# THERMAL ANALYSIS METHODOLOGY

### A. MODELING SYSTEM HEAT GENERATION AND DISSIPATION

Fig. 1 shows the important components in a typical IC-Package/Board system. In general, there could be multiple chips in the package.

The governing equation of heat conduction is:

$$\rho C_p \frac{\partial T(\vec{r},t)}{\partial t} = \nabla \cdot [\kappa(\vec{r},t) \nabla T(\vec{r},t)] + p(\vec{r},t) \tag{1}$$

subject to the thermal boundary conditions:

$$q_i = -\kappa(\vec{r}, t) \frac{\partial T(\vec{r}, t)}{\partial n_i} = h_i T(\vec{r}, t) \tag{2}$$

where *T* is the time dependent temperature at any point,  $\rho$  is the density of the material,  $C_p$  is the specific heat,  $\kappa$  is the thermal conductivity as a function of temperature and position, *p* is the heat energy generation or power dissipation rate,  $h_i$  is the heat transfer coefficient on the boundary surface,  $q_i$  is the surface heat flux, and  $\partial/\partial n_i$  is the differentiation along the outward direction normal to the boundary surface  $S_i$ .
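As a minimal illustration of Eqs. (1) and (2), the sketch below solves the steady-state, one-dimensional special case (the time derivative set to zero) by finite differences and compares it with the closed-form solution. It is not the finite element model used in this paper. The slab geometry, the uniform conductivity, the adiabatic face at x = 0, the material values, and the function names `solve_slab` and `analytic_slab` are all assumptions made for illustration, and T is taken as the temperature rise over ambient so that the boundary flux is h·T as in Eq. (2).

```python
import numpy as np

def solve_slab(n=101, length=1.0, k=0.1, h=0.01, q_vol=0.5):
    """Steady 1D conduction k*T'' + q_vol = 0 on [0, length].

    Assumed setup (illustrative only): adiabatic face at x = 0,
    convective face at x = length with flux h*T, uniform k and q_vol.
    Units follow the nomenclature: mm, W/(K*mm), W/(mm^2*K), W/mm^3.
    """
    dx = length / (n - 1)
    a = np.zeros((n, n))
    b = np.zeros(n)
    # Interior nodes: central second difference of the conduction term.
    for i in range(1, n - 1):
        a[i, i - 1] = k / dx**2
        a[i, i] = -2.0 * k / dx**2
        a[i, i + 1] = k / dx**2
        b[i] = -q_vol
    # x = 0: energy balance on a half cell with zero flux through the face.
    a[0, 0] = -k / dx
    a[0, 1] = k / dx
    b[0] = -q_vol * dx / 2.0
    # x = length: half-cell balance with convective loss h*T at the face.
    a[-1, -2] = k / dx
    a[-1, -1] = -k / dx - h
    b[-1] = -q_vol * dx / 2.0
    x = np.linspace(0.0, length, n)
    return x, np.linalg.solve(a, b)

def analytic_slab(x, length=1.0, k=0.1, h=0.01, q_vol=0.5):
    """Closed-form solution of the same boundary value problem."""
    return q_vol * (length**2 - x**2) / (2.0 * k) + q_vol * length / h

x, temperature = solve_slab()
print(np.max(np.abs(temperature - analytic_slab(x))))
```

In the real problem the conductivity varies in space and the source term depends on temperature, which is why the iterative co-analysis described later is needed.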



Fig. 1 Chip and surrounding cooling environment.

Heat is generated mostly in the chips. There is also Joule heating, or self-heating, in package and PCB metal traces. Self-heating power is calculated by $I^2R$. Because the traces in the package and PCB are physically much larger than those in the IC, self-heating power from the package and PCB is usually ignored for the purpose of IC temperature prediction.

The major heat source in a VLSI chip is the power generated by transistors or devices in a layer attached to the silicon substrate. The total average power dissipation of a VLSI circuit consists of dynamic power, short-circuit power, and static power [1, 5].

$$P_{avg} = P_{dynamic} + P_{short-circuit} + P_{static}$$

For deep sub-micron technologies, temperature dependent leakage currents are a major source of static power consumption. Current in on-chip metal interconnections also generates heat through resistive loss, resulting in self-heating power. In addition, heat from neighboring chips affects the final temperature distribution, which is particularly important for 3D stacked-die thermal analysis. Therefore, simplified thermal boundary conditions, such as a thermal resistance model of the package or adiabatic assumptions, cannot model the complex thermal environment of 3D IC and SiP.

Due to the extreme feature size differences among IC, package, and PCB, it is not realistic to include all the details of the components in one analysis model, using either the finite element method or other numerical schemes. Previously, e.g., in [3, 9], a chip-level model with detailed interconnection layers was analyzed alone, separately from the package and PCB. An accurate IC thermal solution needs detailed spatial modeling of the package and PCB thermal dissipation channels.

### B. CTM-BASED THERMAL CO-ANALYSIS

To overcome the difficulty in coupling IC and Package/PCB in power-thermal co-analysis, a new methodology [13] was developed to perform IC-package thermal co-analysis on a unified platform using a Chip Thermal Model (CTM), which is generated by a chip-level power integrity analysis environment (Fig. 2).

The Chip Thermal Model (CTM) is primarily a temperature-dependent IC power consumption model for tiles on chip layers, including $P_{avg}$ power on the transistor layer and self-heating power on the interconnection layers. In Fig. 2, "P" stands for Power and "T" stands for Temperature. An IC power integrity analysis tool [11] is customized to generate a CTM for each chip. To generate the CTM, the IC power tool calculates instance power at multiple temperatures, e.g., 25C, 45C, 65C, 85C and 105C, and then maps the instance power to small, uniform tiles, e.g., 10um by 10um, across the chip area. The power includes leakage power, which is highly temperature dependent, as well as the self-heating power. The CTMs are then imported into a Package/PCB thermal analyzer [12] to perform power-thermal analysis.
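The mapping of instance power onto uniform tiles can be sketched as follows. This is only an illustration of the idea, not the CTM generator in [11]: the function name `bin_power_to_tiles`, the use of instance centre coordinates to choose a tile, and the sample values are assumptions. In a CTM the same binning would be repeated once per characterization temperature.

```python
import numpy as np

def bin_power_to_tiles(x_um, y_um, power_w, chip_w_um, chip_h_um, tile_um=10.0):
    """Accumulate instance power (W) into a uniform tile grid.

    Assumption: each instance is assigned to the tile that contains its
    centre point (x_um, y_um); real tools may split power by overlap area.
    """
    nx = int(np.ceil(chip_w_um / tile_um))
    ny = int(np.ceil(chip_h_um / tile_um))
    ix = np.clip((np.asarray(x_um) // tile_um).astype(int), 0, nx - 1)
    iy = np.clip((np.asarray(y_um) // tile_um).astype(int), 0, ny - 1)
    tiles = np.zeros((ny, nx))
    # np.add.at accumulates correctly when several instances share a tile.
    np.add.at(tiles, (iy, ix), np.asarray(power_w))
    return tiles

# Hypothetical instances on a 20um x 20um region, at one temperature.
tiles = bin_power_to_tiles(
    x_um=[5.0, 7.0, 15.0], y_um=[5.0, 3.0, 5.0],
    power_w=[1e-6, 2e-6, 4e-6], chip_w_um=20.0, chip_h_um=20.0)
print(tiles)
```

The total power is conserved by construction, which is a useful sanity check when the tile size is changed.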

For any temperature distribution on chip, each tile has its own P/T table from which the power at a given temperature is looked up. Fig. 3 shows an example of the exponential power variation with temperature in two tiles of a chip. Fig. 12 in the Case Study section of this paper shows that the total power on the transistor layer and the total power of the whole chip both grow exponentially with temperature.
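A per-tile P/T lookup could look like the sketch below. The source states only that each tile has a table indexed by temperature; the interpolation rule between table points is an assumption here. Because the power is described as growing roughly exponentially with temperature, the sketch interpolates the logarithm of power linearly, and `np.interp` holds the end values outside the tabulated range. The name `tile_power` and the sample table are hypothetical.

```python
import numpy as np

# Characterization temperatures quoted in the text (degrees C).
TEMPS_C = np.array([25.0, 45.0, 65.0, 85.0, 105.0])

def tile_power(table_w, temp_c):
    """Look up tile power (W) at temp_c from a 5-point P/T table.

    Assumption: log-linear interpolation between table points, which is
    exact when power is a pure exponential function of temperature.
    """
    log_power = np.interp(temp_c, TEMPS_C, np.log(table_w))
    return float(np.exp(log_power))

# Hypothetical tile: 1 uW at 25C, growing by 2% per degree (illustrative).
table = 1e-6 * np.exp(0.02 * (TEMPS_C - 25.0))
print(tile_power(table, 55.0))
```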

As an extension of the basic definition, the CTM can also contain the multi-layer metal distribution (power grid) in the IC, which is sufficient to construct a detailed multi-layer thermal model of the chip as a sub-model to be solved in the thermal analyzer. Fig. 4 shows the metal distribution on four interconnect metal/via layers in a typical chip cross-section. The coloured contours represent the metal percentage in tiles from high (red) to low (blue). TSV (Through-Silicon-Via in 3D-IC) metal, if present, is included in the CTM.

The proposed algorithm of the CTM-based thermal analyzer for 3D-IC and SiP is:

- 1) CTMs are first generated for all the chips in the 3D IC or SiP.
- 2) Package/PCB physical designs, along with appropriate convection and radiation boundary conditions, are imported into the thermal analyzer, which also calculates the initial power map from the CTMs, based on an assumed temperature on chip. The thermal analyzer then simulates and updates the temperature on each of the chips.
- 3) The CTMs are looked up again for updated power maps and the temperatures are recalculated, until the total power on each chip has converged, e.g., until its change between iterations is less than a predefined threshold.
- 4) Detailed full chip sub-models are then constructed from the metal layer distributions in the CTM and the physical chip geometry. With the converged power map and chip thermal boundary conditions obtained from analysis step 3), the temperature maps on the device and metal/via layers in each chip are simulated and made available in the IC power integrity analysis environment for IC level IR drop and reliability (such as EM) analysis.
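Steps 2) and 3) above amount to a fixed-point iteration between a power lookup and a thermal solve. The sketch below shows that loop on a toy problem. The thermal solve is replaced by a hypothetical lumped-resistance model (`toy_thermal_solve`), which stands in for the package/PCB finite element analysis and is not the analyzer in [12]; the resistance values, the two-tile tables, the log-linear lookup and the threshold are all assumptions.

```python
import numpy as np

TEMPS_C = np.array([25.0, 45.0, 65.0, 85.0, 105.0])

def lookup_power(tables_w, temps_c):
    """Per-tile power from P/T tables (one row per tile), log-linear lookup."""
    return np.array([np.exp(np.interp(t, TEMPS_C, np.log(row)))
                     for row, t in zip(tables_w, temps_c)])

def toy_thermal_solve(power_w, t_ambient_c=25.0, theta_shared=8.0, theta_local=4.0):
    """Hypothetical stand-in for the thermal analyzer (NOT a finite element solve).

    Each tile sees a shared rise from total power plus a local rise from
    its own power; both resistances (C/W) are made-up numbers.
    """
    return t_ambient_c + theta_shared * power_w.sum() + theta_local * power_w

def converge(tables_w, t_init_c=25.0, tol_w=1e-6, max_iter=100):
    """Iterate power lookup and thermal solve until total power settles."""
    temps = np.full(len(tables_w), t_init_c)
    power = lookup_power(tables_w, temps)        # initial power map
    for iteration in range(1, max_iter + 1):
        temps = toy_thermal_solve(power)         # update temperatures
        new_power = lookup_power(tables_w, temps)
        if abs(new_power.sum() - power.sum()) < tol_w:
            return new_power, temps, iteration
        power = new_power
    raise RuntimeError("power did not converge (possible thermal runaway)")

# Two hypothetical tiles: 1.0 W and 0.5 W at 25C, growing 1% per degree.
tables = np.array([1.0, 0.5])[:, None] * np.exp(0.01 * (TEMPS_C - 25.0))
power, temps, iterations = converge(tables)
print(power.sum(), temps, iterations)
```

The loop converges only when the feedback from temperature to power is weak enough; a strongly temperature-dependent leakage combined with a high thermal resistance would make the iteration diverge, which the sketch reports as an error.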

The implementation of the proposed approach for 3D ICs and SiP is straightforward. Through metal density mapping, TSVs are naturally included in the Chip Thermal Model. The TSV design is also represented in the die and die attach of the package model, so that the material distributions are compatible with the IC sub-model.



Fig. 2 New methodology of CTM-based thermal analysis



Fig. 3 CTM power on tiles grows exponentially as temperature rises. (Typical)



Fig. 4 Example of metal density map on metal and via layers (16 layers) in CTM.

# **CASE STUDIES**

## A. Detail Mesh for Converged Solution

As the package substrate is the most important thermal passage to dissipate heat to the PCB, a study of the effects of substrate modeling details was conducted. Substrate layers in five packages of different sizes and features (Table I) were modelled in two different ways, one with smeared thermal conductivity and the other with exact trace/via outlines. The exact outline model has an accurate metal distribution and should be the most accurate FE model for converged solutions. Vias in the package substrate are modelled as solid cylinders with the exact outline geometry from CAD, but with the conductivity in the axial direction adjusted to represent the equivalent copper plating thickness in each via hole. Note that the smeared model also has a metal conductivity distribution based on the CAD design using area percentage mapping; the difference is that its mesh is coarser than what is required for the exact geometry. As a result, some areas of traces are effectively shorted by the equivalent thermal conductivity based on metal area percentage. This is already a closer approximation than a model using uniform equivalent conductivity in the substrate layers. The results for uniform power on chip show that the percentage errors of the smeared models in predicting the thermal resistance (Theta-JA) of the package are significant, up to 20% in PKG4 (Fig. 5). If the mesh in the smeared model is sufficiently refined, the solutions will approach those of the exact trace models, i.e., the solution converged in both mesh size and material distribution. These results demonstrate that the modelling details have a significant impact on package thermal resistance prediction.
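The two conductivity adjustments mentioned above can be illustrated with simple area-weighted rules. The source does not give the formulas used, so both functions below are assumptions: an area-fraction (parallel) mix for the smeared in-plane conductivity of an element, and an annulus-to-cylinder area ratio for the axial conductivity of a plated via modelled as a solid cylinder. The material values are typical handbook-order numbers chosen for illustration, not data from this study.

```python
K_COPPER = 0.39        # W/(K*mm), assumed typical value for copper
K_DIELECTRIC = 0.0003  # W/(K*mm), assumed typical substrate dielectric

def smeared_conductivity(metal_fraction, k_metal=K_COPPER, k_diel=K_DIELECTRIC):
    """Area-weighted (parallel) in-plane conductivity of a mixed element.

    Assumption: metal and dielectric conduct side by side, so their
    conductivities are weighted by area fraction.
    """
    return metal_fraction * k_metal + (1.0 - metal_fraction) * k_diel

def via_axial_conductivity(hole_diameter, plating_thickness,
                           k_metal=K_COPPER, k_fill=0.0):
    """Equivalent axial conductivity of a plated via meshed as a solid cylinder.

    The plated annulus carries the heat; scaling k_metal by the ratio of
    annulus area to full hole area preserves the axial conductance.
    """
    inner = hole_diameter - 2.0 * plating_thickness
    metal_area_ratio = 1.0 - (inner / hole_diameter) ** 2
    return metal_area_ratio * k_metal + (1.0 - metal_area_ratio) * k_fill

print(smeared_conductivity(0.5))           # element half covered by copper
print(via_axial_conductivity(0.2, 0.02))   # 0.2 mm hole, 0.02 mm plating
```

The smearing rule also shows why a coarse mesh can short traces together: an element that straddles two separate traces receives a single high conductivity that bridges the gap between them.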

TABLE I. PACKAGES FOR TRACE/VIA MODELING STUDY

| Package | Size (mm) | Layers | Solders | Chip Type | Chip No. |
|---------|-----------|--------|---------|-----------|----------|
| PKG1    | 12x9.4    | 2      | 44      | Wire      | 1        |
| PKG2    | 10x10     | 4      | 144     | FC        | 1        |
| PKG3    | 36x36     | 4      | 256     | Wire      | 4        |
| PKG4    | 35x35     | 4      | 388     | Wire      | 1        |
| PKG5    | 16x16     | 6      | 225     | Wire      | 2        |



Fig. 5 Effect of modeling details on thermal resistance of package

## B. CTM for Converged Solution

Further comparisons were performed between uniform power and CTM power for PKG2 in Table I, an FCBGA, using a mesh with accurate trace/via geometry. The total power level of the uniform power case was adjusted to match the converged CTM power level. There are five interconnection layers in this chip, with active elements in a region of about 2.2mm by 2.2mm around the centre of the chip. Fig. 6 shows up to 12.8% difference in the calculated package thermal resistance. Fig. 7 compares the thermal profiles on package substrate metals for uniform power and CTM power. CTM power has localized hot spots, which lead to higher temperatures in the package. Fig. 8 shows the power density (W/mm<sup>2</sup>) map on the transistor/device layer of the chip at 50um resolution. As expected, the transistor layer dissipates the majority of the total power. CTM also allocates self-heating power on the interconnection layers of the chip. Fig. 9 and 10 show the metal distribution of the first and second metal layers on chip, METAL1 and METAL2, and the power density due to self-heating, which is about two orders of magnitude lower than that on the transistor layer. Note that both the temperature and power maps displayed are converged ones in a realistic IC-Package-PCB environment.

Figures 11 to 13 reveal more details of the power distribution in CTM. CTM includes the temperature-power table of each tile, e.g., 10um x 10um, on chip layers. The total power on each layer is used for comparison. Fig. 11 shows that the power on the device layer of the chip accounts for 84% to 95% of the total power on chip. The rest is self-heating power in the interconnection layers. The trend of the power distribution across temperatures seems irregular, but the temperature dependency of power on the device and interconnection layers is clear in Fig. 12 and 13. The device power increase is dominated by leakage power, which rises exponentially with temperature (Fig. 3 and 12), while the trend of self-heating power is linear (Fig. 13) due to the linear behaviour of electrical resistivity in copper wires. The power in the plots is normalized by the device power at 25C.
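The linear trend of self-heating power follows from $I^2R$ with a resistance that rises linearly with temperature. The sketch below illustrates this under two assumptions that are not stated in the source: the wire current is held fixed as temperature changes, and the temperature coefficient of resistance for copper is taken as a typical textbook value of about 0.0039 per degree. The function name and sample numbers are hypothetical.

```python
ALPHA_CU = 0.0039  # 1/degC, assumed typical temperature coefficient for copper

def self_heating_power(current_a, r_ref_ohm, temp_c, t_ref_c=25.0, alpha=ALPHA_CU):
    """Joule heating I^2 * R(T) with R(T) = R_ref * (1 + alpha * (T - T_ref)).

    Assumption: current_a does not change with temperature.
    """
    resistance = r_ref_ohm * (1.0 + alpha * (temp_c - t_ref_c))
    return current_a ** 2 * resistance

# Hypothetical wire segment: 10 mA through 2 ohm (at 25C).
p25 = self_heating_power(0.01, 2.0, 25.0)
for temp in (25.0, 45.0, 65.0, 85.0, 105.0):
    print(temp, self_heating_power(0.01, 2.0, temp) / p25)
```

Under these assumptions the power rises by equal increments for equal temperature steps, in contrast to the exponential growth of leakage-dominated device power.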



Fig. 6 Thermal resistance difference using CTM or uniform power



Fig. 7 Temperature distributions on package substrate metals due to uniform and converged CTM power (panels: uniform power, CTM power).







Fig. 9 Metal (left) and self-heating power density (right) distribution on the  $1^{st}$  interconnection layer on chip



Fig. 10 Metal (left) and self-heating power density (right) distribution on the  $2^{nd}$  interconnection layer on chip



Fig. 11 Percentage to total power on chip layers at different temperature



Fig. 12 Temperature dependency of power on device layer (DEVICE) and in whole chip (TOTAL). (The power in the plot is normalized by the device power at 25C.)



Fig. 13 Temperature dependency of self-heating power on interconnection layers. METAL1 and METAL2 are the names of the interconnection layers shown in Fig. 11. (The power in the plot is normalized by the device power at 25C)

## C. 3D-IC Environments

A stacked wire-bond SiP test case (Fig. 14) in TSMC Reference Flow 10.0 [10] was used to study CTM based thermal co-analysis. The SRAM chip is on the top and the SoC chip is at the bottom, separated by a silicon spacer. TSVs could be included in chips and the spacer to study the thermal effects in 3D IC.

Thermal results for two cases, with and without metal trace details in the FE models, are compared in Table II, together with their percentage differences. Theta-JA and Psi-JT [14] for the package were calculated with the total power in the package, i.e., the sum of the powers on the SRAM and SoC chips. The differences in Tmax, Theta-JA, and total power on the chips are not significant between the two cases, possibly because fine meshes, about 50um in element size, were used on the chips. In contrast, the temperature at the package top-center, T\_pkg\_top, and Psi-JT differ by more than 10%. Heat dissipation (Q) through the package top and package sides is also quite different. This is caused by differences in the heat flow paths, i.e., trace details, on the substrate layers in the package model. In the case with trace details, the DOF (degrees of freedom) of the solution exceed 2.5 million, while the other case has only about 117k. This is reflected in the peak memory usage and elapsed time listed in Table II, obtained on a Windows x64 platform.
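As a cross-check of how the quantities in Table II relate, the sketch below recomputes Theta-JA and Psi-JT for the case with trace details from the tabulated temperatures and powers, using the usual definitions of junction-to-ambient resistance and junction-to-top characterization parameter [14]. The ambient temperature is not stated in the text; 20 °C is assumed here because it reproduces the tabulated Theta-JA values. The function names are hypothetical.

```python
def theta_ja(t_junction_c, t_ambient_c, power_w):
    """Junction-to-ambient thermal resistance (C/W)."""
    return (t_junction_c - t_ambient_c) / power_w

def psi_jt(t_junction_c, t_top_c, power_w):
    """Junction-to-top thermal characterization parameter (C/W)."""
    return (t_junction_c - t_top_c) / power_w

# Values from Table II, case with trace details.
P_SRAM, P_SOC = 0.035809, 2.574604
P_TOTAL = P_SRAM + P_SOC          # total package power, as used in the text
T_PKG_TOP = 64.31
T_AMBIENT = 20.0                  # assumption: not given in the source

print(theta_ja(106.44, T_AMBIENT, P_TOTAL))  # SoC, table lists 33.11
print(psi_jt(106.44, T_PKG_TOP, P_TOTAL))    # SoC, table lists 16.14
print(theta_ja(103.82, T_AMBIENT, P_TOTAL))  # SRAM, table lists 32.11
```

The same definitions explain why Psi-JT is far more sensitive to trace detail than Theta-JA: it depends on the package-top temperature, which changes by about 10% between the two models, whereas the junction temperatures barely move.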

The converged power and temperature maps on the SoC chip are plotted in Fig. 15, and those on the SRAM chip in Fig. 16. As the total power on the SRAM chip is about 70X smaller (~ 2.57W/0.036W), its temperature map is overshadowed by that of the SoC.

Fig. 17 is a 3D view of the temperature profiles in SRAM + TSVs + SoC. There are 100 TSVs of 0.1mm in diameter connecting the two chips. The TSVs help reduce the peak temperature on both chips, but the effect is not significant because the silicon spacer is itself very conductive and the upward heat flow is almost saturated even without TSVs. With the silicon spacer, the TSVs change the peak temperature in the SoC by only 0.42%. The effect increases to 5% if a less conductive spacer is used.

CTM based thermal analysis was also performed for a 42.5x42.5mm 8-layer FCBGA package with 1265 solder joints. Because of the high power dissipation on the 18x18mm chip, the package uses a heat spreader and heat sink to extract the majority of the heat from the top of the package. Fig. 18 shows the converged thermal responses on physical layers in the 16-layer chip, which is at the 90nm node and has ~30M instances. The thermal responses include the converged power map, temperature profile and heat flux distribution on each layer. The heat flux maps reflect the wire/via distribution in the interconnect layers. The converged total power is around 76W, with a peak temperature of 86 C on the device layer and a temperature gradient across the chip of more than 13 C. The chip temperature profile is helpful for identifying the difference between the temperatures at the on-chip temperature sensors and the actual peak temperature on chip.



Fig. 14 Test case of 2-die stacked SiP with 3D-IC connected by TSVs

TABLE II. TRACE MODELING COMPARISON (THERMAL RESULT COMPARISON FOR 3D-IC CASE)

| Group     | Quantity        | With Trace Details | W/O Trace Details | Diff %  |
|-----------|-----------------|--------------------|-------------------|---------|
|           | T_pkg_top (°C)  | 64.31              | 71.07             | 10.51%  |
| SRAM      | Tmax (°C)       | 103.82             | 103.16            | -0.64%  |
|           | Power (W)       | 0.035809           | 0.035761          | -0.13%  |
|           | Theta-JA (°C/W) | 32.11              | 31.97             | -0.44%  |
|           | Psi-JT (°C/W)   | 15.13              | 12.34             | -18.44% |
| SoC       | Tmax (°C)       | 106.44             | 105.64            | -0.75%  |
|           | Power (W)       | 2.574604           | 2.56544           | -0.36%  |
|           | Theta-JA (°C/W) | 33.11              | 32.92             | -0.57%  |
|           | Psi-JT (°C/W)   | 16.14              | 13.29             | -17.66% |
| Heat Flow | Q_pkg_top (%)   | 10.41%             | 8.91%             | -14.41% |
|           | Q_pkg_sides (%) | 2.77%              | 1.71%             | -38.27% |
|           | Q_board (%)     | 86.82%             | 89.39%            | 2.96%   |
|           | DOF             | 2,526,864          | 117,372           |         |
|           | Memory (MB)     | 15,878             | 1,757             |         |
|           | Elapsed Time    | 9 hrs 38 mins      | 9 mins            |         |



Fig. 15 Converged power map compared with device layout on SoC, and converged temperature map with hot spots consistent with high power density zones



Fig. 16 Converged power and temperature map in SRAM.



Fig. 17 Temperature on SRAM+TSVs+SoC



Fig. 18 Example of power density map on device layer, layer temperature profiles, and layer heat flux distribution.

## CONCLUSIONS

A new IC thermal analysis methodology is proposed in this paper to perform power-thermal coupled simulation for 3D ICs in the package/PCB, where the multi-layer IC thermal analysis uses an accurate system environment. It is 3D IC and SiP capable, including TSVs in the IC and package analysis models. The converged temperature map can then be used for IC power, electromigration, and voltage drop evaluations.

The studies showed that the smeared metal layer approximation in both IC and package metal modeling is not accurate compared to models with exact trace outlines. For accurate IC temperature prediction, both IC and package metal trace details need to be included in the analysis model. The power allocation in a logic chip design, as captured in a Chip Thermal Model (CTM), was also analyzed and presented. About 90% of the total power is from the transistor layer of the chip.

## ACKNOWLEDGMENTS

The authors are thankful to Mark Qi Ma and Jianhua Xu of the Apache/Shanghai office for the implementation of self-heating power on the interconnection layers of the chip model in CTM-based thermal analysis.

## REFERENCES

[1] K. Banerjee, M. Pedram, and A. Ajami, Analysis and optimization of thermal issues in high-performance VLSI, ACM/SIGDA International Symposium on Physical Design (ISPD), pages 230-237, April 2001.

[2] S. Im and K. Banerjee, Full chip thermal analysis of planar (2-D) and vertically integrated (3-D) high performance ICs, Tech. Dig. IEDM, 2000, pp. 727-730.

[3] T. Wang and C. Chen, Thermal-ADI: A linear-time chip-level dynamic thermal simulation algorithm based on alternating-direction-implicit (ADI) method, IEEE Trans. on Very Large Scale Integration (VLSI) Systems, pp. 691-700, 2003.

[4] T. Wang, J. Tsai, and C. Chen, "Thermal and Power Integrity based Power/Ground Networks Optimization," Design, Automation and Test in Europe Conference and Exhibition (DATE), 2004.

[5] A. P. Chandrakasan and R. W. Brodersen, Low Power Digital CMOS Design, Kluwer Academic Publishers, 1995.

[6] M. R. Stan, K. Skadron, M. Barcella, W. Huang, K. Sankaranarayanan, and S. Velusamy, Hotspot: a dynamic compact thermal model at the processor-architecture level, Microelectronics Journal, pp. 1153-1165, 2003.

[7] H. Yu, Y. Shi, L. He, and T. Karnik, Thermal via allocation for 3D ICs considering temporally and spatially variant thermal power, in ACM/IEEE ISLPED, ISLPED-2006.

[8] B. Black, et al., Die Stacking (3D) Microarchitecture, The 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[9] T. Wang and M. Schmitt, Accurate Thermal Analysis of Chip/Package Systems, http://networksystemsdesignline.com/howto/198001556

[10] Application Note: Chip-Package Thermal Analysis for SiP using Sentinel-TI, TSMC Reference Flow Release 10.0, July 2009.

[11] Redhawk user manual, version 10.1, Apache Design Solutions, 2010.

[12] Sentinel-TI user manual, version 9.2, Apache Design Solutions, 2010.

[13] S. Pan, N. Chang, and J. Zheng, A New Methodology for IC-Package Thermal Co-Analysis in 3D IC Environment, EDAPS2010, Dec. 2010.

[14] JESD 51-12, "Guidelines for Reporting and Using Electronic Package Thermal Information", JEDEC standard, May 2005.