# A 10-Gb/s –18.8 dBm Sensitivity 5.7 mW Fully-Integrated Optoelectronic Receiver With Avalanche Photodetector in 0.13-μm CMOS

Spoorthi Nayak, Abdelrahman H. Ahmed, *Student Member, IEEE*, Ahmad Sharkia, *Student Member, IEEE*, Ajith Sivadhasan Ramani, *Student Member, IEEE*, Shahriar Mirabbasi, *Member, IEEE*, and Sudip Shekhar, *Senior Member, IEEE*

Abstract—Avalanche photodetectors (APDs) improve the sensitivity of optoelectronic (O/E) receivers (RXs) due to their high multiplication gain and responsivity. When implemented monolithically with a CMOS transimpedance amplifier (TIA) on the same chip, they provide further advantages, such as low cost and reduced parasitics. However, APDs require high bias voltages, are sensitive to variations in operating conditions, and have a limited gain-bandwidth product. This paper presents an 850 nm APD implemented in a standard 0.13-μm CMOS process with a responsivity of 3.92 A/W and a large-signal -3 dB bandwidth of 3.5 GHz. Advantages of on-chip integration with a TIA are described and a noise-canceling active balun following the single-ended TIA is presented. The O/E-RX front-end achieves a measured sensitivity of -18.8 dBm, the best reported among 10 Gb/s linear CMOS TIAs operating at 850 nm. The energy/bit is 0.57 pJ/b. An on-chip voltage booster is described and implemented to generate a large APD bias using nominal CMOS voltage supplies. A modified hill-climbing algorithm is also presented that can enable bias stabilization for the voltage booster and the optoelectronic front-end for a complete all-bulk-CMOS implementation.

Index Terms—Active balun, silicon avalanche photodetector, slope detection, transimpedance amplifiers, tuning and stabilization, voltage booster.

# I. INTRODUCTION

WITH the rapid growth of datacenters, there is a significant demand to leverage the high volume manufacturing capabilities of the silicon industry and reduce the overall hardware cost of the optical interconnects connecting different servers and switches. With the optoelectronic transceiver market poised towards a significant annual growth, it is imperative to reduce the bill of materials and lower the cost of the transceivers. As most of these transceivers utilize an

Manuscript received October 25, 2018; revised February 12, 2019; accepted March 19, 2019. Date of publication May 6, 2019; date of current version July 3, 2019. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada and in part by the Intel Corporation. This paper was recommended by Associate Editor N. Krishnapura. (Corresponding author: Spoorthi Nayak.)

S. Nayak and A. S. Ramani are with Semtech, Burlington, ON L7L5M4, Canada (e-mail: spoorthi@ece.ubc.ca; ajithsr@ece.ubc.ca).

A. H. Ahmed, A. Sharkia, S. Mirabbasi, and S. Shekhar are with The University of British Columbia, Vancouver, BC V6T1Z4, Canada (e-mail: abdelrahman@ece.ubc.ca; sharkia@ece.ubc.ca; sudip@ece.ubc.ca).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSI.2019.2909284

off-chip photodetector (PD), a fully-monolithic implementation of the PD will reduce the cost for both the component and the associated packaging. From the link perspective, there are also requirements to increase the data rate from 10 Gb/s to 25 Gb/s and higher, and reduce the overall power consumption. A single-chip optoelectronic (O/E) receiver (RX) with a monolithic PD and the associated CMOS circuits can significantly reduce the parasitics and ease the overall O/E-RX design. Beyond the transceiver market, a fully-integrated O/E-RX is also desirable for other applications in high-performance computing and sensors [1].

A significant majority of the optical interconnects in the existing datacenters today span a distance of less than 300 m, and operate at 850 nm with multi-mode fibers (MMFs), vertical-cavity surface emitting lasers (VCSELs) and PDs [2]. The goal of this paper is to propose a fully-integrated, single-chip CMOS O/E-RX incorporating a CMOS avalanche PD (APD) for 850 nm applications. Section II of the paper describes the motivation behind a CMOS APD O/E-RX design. The benefits of using on-chip PDs in comparison to off-chip PDs are presented, and the advantages and challenges of using APDs are discussed. A noise-canceling active balun is described to generate differential swings from the single-ended transimpedance amplifier (TIA). Measurement results of a proof-of-concept prototype 0.13-µm CMOS O/E RX front-end (RXFE) operating at 10 Gb/s with -18.8 dBm sensitivity are presented in Section III. As APDs require large bias voltage, an on-chip voltage booster circuit using nominal CMOS voltage supplies and its measurement results are presented in Section IV. Section V proposes a complete APD O/E-RX system with bias stabilization loop. Finally, Section VI draws the main conclusions of this work.

# II. INTEGRATING APDS WITH CMOS RXS

## A. APD Responsivity and Link Budget

PDs convert the light into an electrical current, and therefore are used at the front-end of every O/E-RX, as shown in Fig. 1(a). PD responsivity, R, defined as the ratio of the output current  $I_{PD}$  to the input optical power  $P_{in}$ , is a measure of its gain. The responsivity of the PD is given as [3]:

$$R = \frac{I_{PD}}{P_{in}} = \frac{\eta q}{hc/\lambda} \tag{1}$$

Fig. 1. (a) External PD or APD with a CMOS electronic RXFE, (b) fully-integrated O/E-RXFE with APD in a modified CMOS process and external biasing, and (c) fully-integrated APD O/E-RX in standard CMOS process with bias generation and stabilization.

where $\eta$ is the quantum efficiency of the PD, $q$ is the electron charge, $\lambda$ is the wavelength of the incident light, $h$ is Planck's constant and $c$ is the velocity of light. The relation between the optical sensitivity, $P_{op-sen}$, of the O/E-RXFE and the electrical sensitivity, $i_{sen}^{pp}$, of the electronic RXFE can be shown as [3]:

$$P_{op-sen} = \frac{i_{sen}^{pp}}{2R} \tag{2}$$

Higher responsivity of the PD therefore significantly improves the O/E-RXFE optical sensitivity. Traditionally, external PDs implemented in III-V technologies offer *R* of up to 1 A/W.

One approach to boost the responsivity is to leverage the avalanche effect [4] and design an APD. In an APD, the photoelectric effect first converts the incident photons into electrons, similar to regular PDs. Then, by applying a high reverse bias voltage, these electrons are accelerated to create impact ionization and generate many more carriers. This "avalanche" effect further increases the current gain, boosting the effective responsivity,  $R_{eff}$ , of the APD by a multiplication factor, M.

$$I_{APD} = MI_{PD} = MRP_{in} \tag{3}$$

$$R_{eff} = MR = \frac{i_{sen}^{pp}}{2P_{op-sen}} \tag{4}$$
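As a rough numerical illustration of (1) to (4), the sketch below evaluates the ideal responsivity at 850 nm and the optical sensitivity for a given electrical sensitivity. The 20 µA peak-to-peak electrical sensitivity and the gain of 100 are hypothetical values chosen only for illustration; the function names are likewise assumptions and not part of the original work.

```python
import math

Q = 1.602176634e-19  # electron charge (C)
H = 6.62607015e-34   # Planck's constant (J*s)
C = 299792458.0      # speed of light (m/s)

def responsivity(eta, wavelength_m):
    """Ideal PD responsivity in A/W: eta * q / (h * c / lambda)."""
    return eta * Q / (H * C / wavelength_m)

def optical_sensitivity_dbm(i_sen_pp, r_eff):
    """Average optical sensitivity in dBm: P = i_sen_pp / (2 * R_eff)."""
    p_watt = i_sen_pp / (2.0 * r_eff)
    return 10.0 * math.log10(p_watt / 1e-3)

# Upper bound on responsivity at 850 nm (quantum efficiency of 1).
r_ideal_850 = responsivity(1.0, 850e-9)

# Hypothetical electrical sensitivity of 20 uA peak-to-peak.
i_sen_pp = 20e-6
p_pd_dbm = optical_sensitivity_dbm(i_sen_pp, 0.02)           # M = 1
p_apd_dbm = optical_sensitivity_dbm(i_sen_pp, 0.02 * 100.0)  # M = 100 (assumed)

print(f"ideal R at 850 nm: {r_ideal_850:.3f} A/W")
print(f"sensitivity improvement: {p_pd_dbm - p_apd_dbm:.1f} dB")
```

Under these assumptions, a gain of M improves the optical sensitivity by 10 log10(M) dB for the same electronic front-end, before any excess APD noise is considered.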

The multiplication gain of the APD has a huge impact on the overall link budget. Consider a typical VCSEL and MMF link in a datacenter. The link budget of such a link can be described by the following equation [5]:

$$P_{RX} = P_{TX} - 2P_{MMF-CPL} - P_{MMF-att} - P_{MMF-dis} - P_{pen} - P_{margin} = \frac{i_{sen}^{pp}}{2MR} \tag{5}$$

where,  $P_{TX}$  represents the optical power output of the VCSEL when directly modulated by the CMOS driver,  $P_{MMF-CPL}$  represents the coupling loss of the optical signal into and out of the MMF,  $P_{MMF-att}$  and  $P_{MMF-dis}$  represent the attenuation and dispersion losses, respectively,  $P_{pen}$  includes the link penalty due to crosstalk, ISI and relative intensity noise,



Fig. 2. Impact of APD effective responsivity,  $R_{eff}$ , on the VCSEL power consumption.

and $P_{RX}$ represents the received optical power at the PD [5]. Assuming a 300 m length of OM4 MMF with 3.5 dB/km of $P_{MMF-att}$, $P_{MMF-CPL}$ of 1.1 dB each, $P_{MMF-dis}$ of 5 dB, $P_{pen}$ of 4.8 dB, a margin of 3 dB, and RX electrical sensitivities of -15 dBm and -10 dBm at 10 Gb/s and 25 Gb/s, respectively, the VCSEL wall-plug power required for different values of MR is shown in Fig. 2. A wall-plug efficiency of 17% for the VCSEL and a baseline R of 0.02 A/W are assumed for the APD. Clearly, the use of APDs, and improving M for the APDs, can significantly reduce the overall power consumption of the link. Conversely, for the same laser power, the design of the O/E-RXFE can be considerably relaxed.
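The link budget in (5) can be sketched numerically as follows. This is a minimal illustration, not the procedure used to generate Fig. 2: it assumes that the quoted -15 dBm figure is the optical power required at the PD for M = 1, and that the required power scales as 1/M per (5). The function names are hypothetical.

```python
import math

def link_loss_db(length_km=0.3, att_db_per_km=3.5, cpl_db=1.1,
                 dis_db=5.0, pen_db=4.8, margin_db=3.0):
    """Total loss between P_TX and P_RX in (5), in dB."""
    return (2.0 * cpl_db + att_db_per_km * length_km
            + dis_db + pen_db + margin_db)

def vcsel_wallplug_mw(p_rx_dbm, m=1.0, wallplug_eff=0.17):
    """VCSEL wall-plug power (mW) for a given required P_RX at M = 1.

    Assumption: the required received power drops by 10*log10(M) dB,
    since P_RX is proportional to 1/(M*R) in (5).
    """
    p_rx_needed_dbm = p_rx_dbm - 10.0 * math.log10(m)
    p_tx_dbm = p_rx_needed_dbm + link_loss_db()
    return 10.0 ** (p_tx_dbm / 10.0) / wallplug_eff

loss_db = link_loss_db()                      # sum of all loss terms
p_wall_m1 = vcsel_wallplug_mw(-15.0, m=1.0)   # no avalanche gain
p_wall_m10 = vcsel_wallplug_mw(-15.0, m=10.0) # hypothetical M = 10

print(f"link loss: {loss_db:.2f} dB")
print(f"wall-plug power: {p_wall_m1:.2f} mW (M=1), {p_wall_m10:.2f} mW (M=10)")
```

In this simplified model, every tenfold increase in M reduces the required VCSEL wall-plug power by a factor of ten.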

## B. On-Chip PD vs. Off-Chip PD

Traditionally, PDs are implemented in a separate process compared to the CMOS electronic RX to increase the PD performance, namely, -3 dB bandwidth (BW) of its frequency response, and responsivity. PDs are generally made in expensive technologies such as Ge [6], GaAs [7], [8] or InP-InGaAs [9] to enhance their performance. However, connecting the external PD to a CMOS electronic RXFE (Fig. 1(a)) using wirebonding assembly results in several issues: increase in manufacturing and packaging cost, possible decrease in yield, crosstalk between the bondwires degrading RXFE performance especially when implemented as arrays of PDs connecting multiple RXFEs, requirement for ESD devices, additional packaging parasitics degrading the sensitivity of the RXFE, etc. A flip-chip package reduces, but does not eliminate, parasitics and crosstalk. Consider the system shown in Fig. 1(a) where an external PD (or APD) is followed by a TIA used as the gain stage to convert the PD current,  $I_{PD}$ , to voltages that can be further amplified by the main amplifiers (MAs). Considering an inverter-based TIA, the input BW of the TIA,  $BW_{in}$ , is given by [5]:

$$BW_{in} = \frac{1}{2\pi C_T \left(\frac{R_f}{1 + (g_{mn} + g_{mp})(r_{dsn} || r_{dsp})}\right)} \tag{6}$$

where,  $R_f$  is the feedback resistor,  $g_{mn}$  ( $g_{mp}$ ) is the transconductance and  $r_{dsn}$  ( $r_{dsp}$ ) is the output impedance of the NMOS (PMOS) transistor of the inverter, and  $C_T$  is the total input capacitance at the TIA input,  $C_T = C_{PD} + C_{ESD} + 2C_{PAD} + C_{TIA,in}$ .  $BW_{in}$  is therefore inversely proportional to  $C_T$ , and additional capacitance from the pads and



Fig. 3. A survey of -3 dB bandwidth of APD vs. its reverse bias voltage.

ESD significantly limit the maximum achievable data rate. For example, in a 0.13-$\mu$m CMOS process, typical values of $C_{TIA,in} = 100$ fF, $2C_{PAD} = 160$ fF, $C_{ESD} = 200$ fF, and $C_{PD} = 100$ fF imply that the pads and ESD constitute about 64% of the total input capacitance (360 fF out of 560 fF).
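The capacitance breakdown quoted above, and its effect on $BW_{in}$ through (6), can be checked with a few lines. Since $BW_{in}$ is inversely proportional to $C_T$ for a fixed TIA, only capacitance ratios are needed; the variable and function names are illustrative assumptions.

```python
# Typical 0.13-um CMOS values quoted in the text (farads).
c_tia_in = 100e-15
c_pad_x2 = 160e-15   # 2 * C_PAD
c_esd = 200e-15
c_pd = 100e-15

c_total = c_pd + c_esd + c_pad_x2 + c_tia_in
pad_esd_fraction = (c_pad_x2 + c_esd) / c_total

def bw_ratio(c_before, c_after):
    """BW_in scales as 1/C_T in (6) when the TIA itself is unchanged."""
    return c_before / c_after

# Pads removed by monolithic integration, ESD unchanged.
bw_gain_no_pads = bw_ratio(c_total, c_total - c_pad_x2)
# Upper bound: pads and ESD both removed entirely.
bw_gain_no_pads_esd = bw_ratio(c_total, c_pd + c_tia_in)

print(f"pad + ESD share of C_T: {100 * pad_esd_fraction:.1f}%")
print(f"BW_in gain: {bw_gain_no_pads:.2f}x to {bw_gain_no_pads_esd:.2f}x")
```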

To overcome the aforementioned problems with a discrete PD, a fully monolithic CMOS PD can be implemented by using additional modifications to the CMOS process [10] (Fig. 1(b)). However, adding Ge to CMOS process to improve the PD performance increases manufacturing cost and complexity of fabrication, and is detrimental to the performance of CMOS transistors [11]. A Ge-based APD has high optical absorption only in 1.3 to 1.55  $\mu$ m wavelength range, and is therefore not suitable for 850 nm applications. Fully-integrated bulk CMOS PDs [12], [13] have low responsivity, and are not very attractive for high-speed links.

## C. On-Chip CMOS APD

Fig. 1(c) shows the proposed system in which the APD is designed in the same bulk CMOS process as the electronic RXFE circuits in order to reduce cost, manufacturing complexity, bondwire crosstalk, and parasitics. Due to monolithic integration, no external signal pads are needed ($C_{PAD} = 0$ fF), and the ESD requirement is also considerably reduced. On-chip APDs thus improve O/E-RXFE sensitivity and BW and have inspired several recent research efforts in the design of on-chip CMOS APDs [4].

Despite the advantages of CMOS APDs, there are certain drawbacks that have limited their practical use. Next, we describe these drawbacks, and propose solutions to overcome them.

Need for large bias voltage: APDs in bulk CMOS process require bias voltage of up to 10 V [4]. Even the Ge-based APDs require high bias voltages. Fig. 3 shows the bias voltage requirement for different APDs. As the nominal voltage supplies on bulk CMOS processes are typically limited to 3.5 V (for I/Os), and the breakdown voltages of the substrate diode are also limited (10 to 12 V), this has resulted in APDs being used only as external components. In Section V, we present an on-chip voltage booster to generate the required APD voltage in a standard CMOS process.

APD Noise and RXFE Sensitivity: The APD noise current consists of two main components: shot noise ($\overline{I_S^2}$) and thermal noise ($\overline{I_T^2}$). Shot noise is white and is a function of the PD current and the noise BW, $BW_{noise}$. The shot noise can be expressed as [3], [14]:

$$\overline{I_S^2} = 2qM^2F(RP_{in} + I_D)BW_{noise} \tag{7}$$

where,  $I_D$  is the dark current of APD and F is the excess noise factor. The relationship between F and M can be shown to be [3], [14]:

$$F = kM + (1 - k)\left(2 - \frac{1}{M}\right) \tag{8}$$

where $k$ is the ratio of the hole to electron ionization coefficients. For silicon APDs, $k$ is approximately 0.02 to 0.05 [3]. Assuming $R$ is 0.02 A/W, for $P_{in}$ of -15 dBm, the desired PD signal current is $RP_{in}M \approx 0.6M\ \mu\text{A}$, whereas the dark current component is only in the range of $5M$ nA. Thus, in (7), the dark current component can be ignored.
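The excess noise factor in (8) and the shot noise in (7) can be evaluated as in the sketch below. The gain of 100, the 5 GHz noise bandwidth and the 5 nA unmultiplied dark current are illustrative assumptions; the helper names are hypothetical.

```python
import math

Q = 1.602176634e-19  # electron charge (C)

def excess_noise_factor(m, k=0.02):
    """Excess noise factor F from (8)."""
    return k * m + (1.0 - k) * (2.0 - 1.0 / m)

def shot_noise_rms(m, r, p_in, bw_noise, i_dark=0.0, k=0.02):
    """RMS shot noise current (A) from (7)."""
    f = excess_noise_factor(m, k)
    return math.sqrt(2.0 * Q * m**2 * f * (r * p_in + i_dark) * bw_noise)

p_in = 1e-3 * 10.0 ** (-15.0 / 10.0)  # -15 dBm in watts
i_signal_m1 = 0.02 * p_in             # signal current per unit gain (A)

f_unity = excess_noise_factor(1.0)    # no avalanche gain
f_100 = excess_noise_factor(100.0)    # assumed M = 100, k = 0.02

# Effect of including an assumed 5 nA dark current at M = 100.
i_s_no_dark = shot_noise_rms(100.0, 0.02, p_in, 5e9)
i_s_dark = shot_noise_rms(100.0, 0.02, p_in, 5e9, i_dark=5e-9)

print(f"signal current per unit gain: {i_signal_m1 * 1e6:.3f} uA")
print(f"F(1) = {f_unity:.2f}, F(100) = {f_100:.2f}")
print(f"shot noise change due to dark current: "
      f"{100 * (i_s_dark / i_s_no_dark - 1):.2f}%")
```

With these assumed values, the dark current changes the RMS shot noise by well under 1%, which is in line with neglecting it in (7).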

Thermal noise is a function of temperature, T, and is inversely proportional to the TIA input resistance ( $R_{SH}$ ) which acts as the shunt load resistance for the PD. The thermal noise can be written as [3], [14]:

$$\overline{I_T^2} = \left(\frac{4kTBW_{noise}}{R_{SH}}\right) \tag{9}$$

where $k$ is Boltzmann's constant ($1.38 \times 10^{-23}$ J/K). Assuming $R_{SH} = 200\ \Omega$ and $BW_{noise} = 5$ GHz for a 10 Gb/s RX, the RMS thermal noise is 0.64 $\mu$A and can be reduced by increasing the input resistance of the TIA at the cost of RXFE input BW. On the other hand, the input-referred RMS noise of a CMOS TIA, $i_{n,TIA}^{rms}$, for 10 Gb/s is usually in the range of a few $\mu$A [3]. Hence, the overall input-referred noise of an O/E-RXFE is often dictated by $i_{n,TIA}^{rms}$ rather than by the thermal noise of the APD.
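The thermal noise figure quoted above follows directly from (9). The short check below assumes a temperature of 300 K, which is not stated explicitly alongside (9); the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant (J/K)

def thermal_noise_rms(r_sh, bw_noise, temp_k=300.0):
    """RMS thermal noise current (A) from (9); 300 K is an assumption."""
    return math.sqrt(4.0 * K_B * temp_k * bw_noise / r_sh)

i_t_200 = thermal_noise_rms(200.0, 5e9)  # R_SH = 200 ohm, 5 GHz
i_t_800 = thermal_noise_rms(800.0, 5e9)  # 4x larger R_SH halves the noise

print(f"thermal noise: {i_t_200 * 1e6:.2f} uA (200 ohm), "
      f"{i_t_800 * 1e6:.2f} uA (800 ohm)")
```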

Signal-to-noise ratio (SNR) is defined as the ratio of mean-free average signal power to the average noise power [3]. For a DC balanced signal, mean-free power is  $(i_s^{pp}/2)^2$  [3]. Assuming equal number of ones and zeros, the noise power is calculated as  $(i_{n,0}^2 + i_{n,1}^2)/2$  [3], [14], where:

$$\overline{i_{n,1}^2} = \overline{I_S^2} + (i_{n,TIA}^{rms})^2 = 2qM^2F(RP_{in})BW_{noise} + (i_{n,TIA}^{rms})^2 \tag{10}$$

$$\overline{i_{n,0}^2} = \left(i_{n,TIA}^{rms}\right)^2 \tag{11}$$

$$\begin{aligned} SNR &= \frac{\left(i_s^{pp}/2\right)^2}{\left(\overline{i_{n,0}^2} + \overline{i_{n,1}^2}\right)/2} \\ &= \frac{2(MRP_{in})^2}{2qM^2F(RP_{in})BW_{noise} + 2\left(i_{n,TIA}^{rms}\right)^2} \end{aligned} \tag{12}$$

At very low bias voltages, the avalanche gain is negligible and hence the signal and the shot noise are minimal. The total noise at this lower bias voltage is dictated by the TIA noise. As the reverse bias increases, APD gain increases, so does the signal strength and shot noise. After avalanche breakdown, shot noise increases significantly as the gain is very high. There exists an optimal region for the bias voltage near the avalanche region where the SNR reaches its maximum, as shown in Fig. 4.



Fig. 4. (a) APD signal and noise current, and (b) SNR vs. APD gain.

The optimum gain,  $M_{OPT}$ , defined as the M which gives the maximum SNR, can be computed as [14]:

$$\frac{d(SNR)}{d(M)} = 0 \tag{13}$$

$$M^{3} + \frac{(1-k)}{k}M = \frac{2\left(i_{n,TIA}^{rms}\right)^{2}}{qkRP_{in}BW_{noise}} \tag{14}$$

An approximate solution can be shown to be [14]:

$$M_{OPT} \approx \sqrt[3]{\frac{2\left(i_{n,TIA}^{rms}\right)^2}{qkRP_{in}BW_{noise}}} \tag{15}$$

Fig. 4(b) shows the SNR based on (15), assuming k = 0.02 for silicon APD [3] with a baseline responsivity of 0.02 A/W and an incident optical power of -18 dBm. The RMS TIA noise is assumed to be 3.7  $\mu$ A at 10 Gb/s [3].
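To make the optimum-gain calculation concrete, the sketch below evaluates the SNR as defined before (10) (mean-free signal power over the average of the zero-level and one-level noise powers), solves the cubic in (14) by bisection, and compares it with the approximation in (15). It uses the values quoted for Fig. 4(b) and additionally assumes $BW_{noise} = 5$ GHz, which is not stated for that figure. All function names are hypothetical.

```python
import math

Q = 1.602176634e-19  # electron charge (C)

def excess_noise_factor(m, k):
    return k * m + (1.0 - k) * (2.0 - 1.0 / m)

def snr(m, r, p_in, i_tia, bw, k):
    """Mean-free signal power over average noise power."""
    signal = (m * r * p_in) ** 2
    shot = 2.0 * Q * m**2 * excess_noise_factor(m, k) * r * p_in * bw
    noise = (shot + 2.0 * i_tia**2) / 2.0  # average of i_n,0^2 and i_n,1^2
    return signal / noise

def m_opt_rhs(r, p_in, i_tia, bw, k):
    """Right-hand side of (14)."""
    return 2.0 * i_tia**2 / (Q * k * r * p_in * bw)

def m_opt_approx(r, p_in, i_tia, bw, k):
    """Cube-root approximation from (15)."""
    return m_opt_rhs(r, p_in, i_tia, bw, k) ** (1.0 / 3.0)

def m_opt_exact(r, p_in, i_tia, bw, k):
    """Solve M^3 + ((1-k)/k) M = rhs by bisection on [1, 1e4]."""
    rhs = m_opt_rhs(r, p_in, i_tia, bw, k)
    a = (1.0 - k) / k
    lo, hi = 1.0, 1e4
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid**3 + a * mid < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

R, K, I_TIA = 0.02, 0.02, 3.7e-6       # values quoted for Fig. 4(b)
P_IN = 1e-3 * 10.0 ** (-18.0 / 10.0)   # -18 dBm in watts
BW = 5e9                               # assumed noise bandwidth (Hz)

m_approx = m_opt_approx(R, P_IN, I_TIA, BW, K)
m_exact = m_opt_exact(R, P_IN, I_TIA, BW, K)
# Brute-force check: integer gain that maximizes the SNR.
m_grid = max(range(1, 401), key=lambda m: snr(m, R, P_IN, I_TIA, BW, K))

print(f"M_opt: approx {m_approx:.1f}, exact {m_exact:.1f}, grid {m_grid}")
```

Because the linear term in (14) is small compared with $M^3$ for these values, the approximation in (15) slightly overestimates the exact root.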

Sensitivity to reverse bias voltage and temperature: The gain of the APD fluctuates with changes in the chip temperature and the applied reverse bias [15]-[18]. Performance of the APD is also subject to random process variations during its manufacturing. Several ideas have been proposed in the prior art to maintain steady gain and BW of the APD using temperature monitoring and compensation techniques. These techniques sense the temperature and compensate for the variations either by changing the bias voltage of the APD or by heating or cooling the APD. In [19], the cooling characteristic of a Peltier cell is used in conjunction with a thermistor to maintain a constant low temperature for the APD, as shown in Fig. 5(a). This implementation is, however, difficult to realize in standard CMOS technology. In [20], [21], a temperature sensor is used in the vicinity of the APD to maintain a stable bias voltage with pre-tabulated temperature vs. APD bias voltage data, as shown in Fig. 5(b). The pre-tabulated data is obtained with a standalone APD. However, generating accurate pre-tabulated data for a monolithic implementation where the temperature of the APD and the electronic RX changes together over time is difficult. In [19], [22], a thermistor-based logic is used to compensate for the change in temperature by changing the APD bias voltage. In [23], clock and data recovery (CDR) logic is used to make decisions based on eye quality. This technique increases complexity and is only applicable to cases where the CDR is implemented on the same chip. In [24], [25], matched APDs, one biased in the unity-gain region and another biased in high-gain mode through a feedback loop, are used to attain constant multiplication gain, as shown in Fig. 5(c). The main difficulties



Fig. 5. (a) Heating/cooling an APD, (b) temperature sensors to monitor APD's temperature and alter bias, and (c) dummy APD to stabilize APD bias



Fig. 6. A 10 Gb/s APD O/E-RX in 0.13- $\mu$ m CMOS process. The blocks enclosed with the dotted line are implemented on the chip.

in these implementations are the required matching of APDs, and defining and maintaining unity-gain bias. In Section V, we propose a fully-integrated O/E-RX with on-chip biasing, tuning and stabilization of APD in a standard CMOS process using nominal voltage supplies.

# III. PROPOSED 10 Gb/s CMOS APD O/E-RXFE

In this Section, we describe the details of the high-speed path of the 10 Gb/s APD O/E-RXFE implemented in a 0.13- $\mu$ m CMOS process, as shown in Fig. 6. The CMOS APD is followed by a TIA to convert the electric current to an electric voltage with gain ( $A_{TIA}$ ), and 4 stages of MAs for further voltage amplification. An offset cancellation loop consisting of a low-pass filter (LPF), error amplifier and a current source is also shown in Fig. 6.

An APD, simplistically, is a reverse biased PN junction diode. Fig. 7 shows the cross-section of the APD implemented in a 0.13- $\mu$ m CMOS process as N+/P-well junction, and is based on the device in [4]. No additional masks or



Fig. 7. Cross-section of the  $0.13-\mu m$  CMOS APD.



Fig. 8. (a) Die micrograph showing the CMOS APDs in 0.13- $\mu$ m process and (b) measurement setup for APD bandwidth.

anti-reflective coatings are employed to enhance the performance of the APD. At the junction edges, high concentration of electric field will result in premature breakdown. This would result in a lower breakdown voltage and hence lower multiplication gain. Shallow trench isolation (STI) guard rings are used to enhance breakdown voltage and thus provide higher avalanche gain [26].

When light penetrates the depletion region, it creates electron-hole pairs leading to drift current. The penetration depth of light in silicon is around 20 $\mu$m, which is more than the depth of the depletion region (around 2 $\mu$m). Thus, there is a high chance for the light to enter the P-substrate and produce electron-hole pairs, which results in a slow diffusion current. When this slow diffusion current reaches the depletion region, it increases the total current and hence improves the responsivity. On the other hand, due to the slow diffusion, the speed and hence the BW of the APD is reduced. An N+/Pwell junction based APD suffers from slow diffusion current at the Pwell-Psubstrate junction. Minimizing this slow diffusion current enhances the BW of the APD. We use a Deep N-well to shield the Pwell from the Psubstrate in order to prevent the slow diffusion current from reaching the depletion region and boost the BW of the APD.

Salicide and passivation layers are avoided above the active region of the APD to ensure most of the light is absorbed by the N+/P-well region. The Deep-N-well and P-substrate are connected to ground to shield their effect on photodetection. With an active area of 30 $\mu$m × 30 $\mu$m, the APD needs a reverse bias voltage of < 10 V at an operating temperature of 300 K. APD dimensions are kept small to minimize the capacitance, $C_{APD}$. The lower limit on the dimensions is dictated by the APD-to-fiber alignment.

Fig. 8(a) shows the die micrograph of the prototype, and Fig. 8(b) shows a setup used for APD measurement. Three N+/Pwell-Deep-Nwell based APDs with core area of 40 $\mu$m $\times$ 40 $\mu$m, 30 $\mu$m $\times$ 30 $\mu$m and 10 $\mu$m $\times$ 10 $\mu$m



Fig. 9. Measured APD (a) gain M vs. reverse bias voltage, and (b) large-signal frequency response.

and one P+/Nwell-Deep-Nwell APD with core area of $30~\mu m \times 30~\mu m$ were fabricated. Measurement results of N+/Pwell-Deep-Nwell with a core area of $10~\mu m \times 10~\mu m$ are shown. An 850 nm VCSEL die is used as an optical source and its output is coupled to the fiber using a collimated lens set-up with a coupling efficiency of 50%, where the optical power of the VCSEL output is measured using an optical spectrum analyzer (Agilent 86146B). This coupled light is flashed on the APD using a 50/125 $\mu$m multimode lensed fiber, where the coupling loss between the fiber and the APD interface is approximately 2 dB. Next, the VCSEL is modulated with an alternating data pattern (1010) from a pattern generator (Anritsu MP1800A), and the output of the APD is directly connected to an electrical spectrum analyzer (Rohde & Schwarz 26.5 GHz FSW).

The measured multiplication gain M of the APD vs. its reverse bias voltage is shown in Fig. 9(a). At low bias the gain is almost unity. The gain of the APD is calculated as the ratio of the photocurrent at a given bias to the average photocurrent at lower voltage (0 V to 1.0 V) [4]. The gain and overall responsivity of the APD near the avalanche breakdown of 9.51 V are measured to be 191.2 and 3.92 A/W, respectively. Fig. 9(b) shows the normalized large-signal frequency response of the APD. Without an external pre-calibrated TIA in our measurement setup, the small-signal frequency response of the APD could not be measured because of the low output current levels. This also restricted the use of a pulse response, which is often an alternative method to characterize the small-signal frequency response of a PD. Instead, the VCSEL is modulated with an alternating data pattern from 0 to 0.8 V, and the output electrical current from the APD is observed on a 50-$\Omega$ spectrum analyzer. As the frequency of the alternating data pattern is swept from a few hundred MHz into the GHz range, the output power is recorded, and thus, a large-signal frequency response is obtained.
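The gain extraction described above (photocurrent at a given bias divided by the average photocurrent between 0 V and 1.0 V) can be sketched as follows. The bias and photocurrent samples are hypothetical placeholders, not measured data; only the 191.2 gain and 3.92 A/W responsivity come from the text.

```python
def multiplication_gain(bias_v, photocurrent_a, ref_max_v=1.0):
    """Gain M at each bias point, referenced to the mean low-bias current."""
    ref = [i for v, i in zip(bias_v, photocurrent_a) if 0.0 <= v <= ref_max_v]
    i_ref = sum(ref) / len(ref)
    return [i / i_ref for i in photocurrent_a]

# Hypothetical samples for illustration only (not measured values).
bias_v = [0.0, 0.5, 1.0, 5.0, 9.0]
photocurrent_a = [1.0e-7, 1.0e-7, 1.0e-7, 2.0e-7, 50.0e-7]
gains = multiplication_gain(bias_v, photocurrent_a)

# Unity-gain responsivity implied by the reported gain and responsivity.
baseline_r = 3.92 / 191.2

print("gain per bias point:", [round(g, 2) for g in gains])
print(f"implied baseline responsivity: {baseline_r:.4f} A/W")
```

The implied unity-gain responsivity of about 0.02 A/W agrees with the baseline value assumed in the link-budget and noise discussions of Section II.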

# IV. TIA, ACTIVE BALUN AND MAS

The output of a PD is single-ended. As differential designs are preferred in an RXFE, a differential TIA driven by a PD and a dummy PD can be implemented [27], where the dummy PD does not have any light incident on it. Another popular topology is a single-ended TIA followed by a differential amplifier, with the other input of the differential amplifier connected to a replica TIA, as shown in Fig. 10(a).



Fig. 10. (a) A differential amplifier or (b) a noise-canceling active balun to convert single-ended TIA output to differential signals for the MAs. (c) Comparison of their simulated noise referred to TIA input, when both have the same gain and bandwidth.



Fig. 11. Simplified schematic of (a) the main amplifier (MA) and (b) the error amplifier (EA). (c) Power breakdown (%) based on post-layout simulations (with measured power consumption of 5.7 mW in the RXFE).

However, these methods cause mismatch in gain and phase of the differential output, resulting in asymmetric signals. Moreover, a dummy TIA also increases the power consumption of the RXFE.

Let us consider the differential amplifier in Fig. 10(a). Transistor M1 is connected to the TIA and M2 is connected to the dummy TIA for better matching. The transconductance ($g_m$) and the load ($R$) of both transistors M1 and M2 are considered to be matched. The output-referred noise, $\overline{V_{o,n}^2}$, and the input-referred noise, $\overline{V_{i,n}^2}$, of the differential amplifier can be calculated as follows:

$$\overline{V_{o,n}^2} = 2\left[4kTR + 4kT\gamma g_m R^2\right] = 8kTR\left[1 + \gamma g_m R\right] \tag{16}$$

$$\overline{V_{i,n}^2} = \frac{8kTR\left[1 + \gamma g_m R\right]}{\left(g_m R\right)^2} = \frac{8kT\left[1 + \gamma g_m R\right]}{g_m^2 R} \tag{17}$$

In this work, we implement a single-ended inverter-based push-pull TIA followed by a self-noise-canceling active balun to convert the single-ended TIA output to differential signals (Fig. 6). A detailed discussion of inverter-based TIA design is given in [28]. Fig. 10(b) shows the active balun implementation, inspired by noise-canceling low-noise amplifiers [29]. $V_{in}$ represents the single-ended signal from the TIA, $R_s$ is the output resistance of the TIA, $V_B$ is the gate bias for the common-gate transistor M2, and $V_P$ and $V_N$ are the differential signal outputs. As shown in Fig. 10(b), the TIA signal, $V_{in}$, undergoes amplification by M2 in phase, and amplification at M1 out of phase, and thus the two signals add up differentially at the output. The gain of the balun, $A_{Balun}$, can be shown to be:

$$A_{Balun} = \frac{g_{m1}R_1 + g_{m2}R_2}{1 + g_{m2}R_s} = g_{m1}R_1 \tag{18}$$

Here,  $g_{m1}$  and  $g_{m2}$  are transconductance of transistors M1 and M2, respectively.  $R_1$  and  $R_2$  are the loads seen by M1 and M2 transistors and can be approximated to  $1/g_{m4}$  and  $1/g_{m3}$ , respectively. For the output to have matched swings at  $V_P$  and  $V_N$ , the gain of the two paths should be designed such that  $g_{m1}R_1 = g_{m2}R_2$ .

On the other hand, the gate-referred equivalent noise of M2 has two paths to the differential outputs. The noise of M2, $V_{n2}$, with M2 acting as a common-source device, is inverted at P, $V_{P,n2}$. $V_{n2}$ is also sensed at node A in-phase, $V_{A,n2}$, and then inverted by M1 to appear at the node N as $V_{N,n2}$.

$$V_{P,n2} = V_{n2} \left( \frac{-g_{m2} R_2}{1 + g_{m2} R_s} \right) \tag{19}$$

$$V_{A,n2} = V_{n2} \left( \frac{g_{m2} R_s}{1 + g_{m2} R_s} \right) \tag{20}$$

$$V_{N,n2} = V_{A,n2} \left( -g_{m1} R_1 \right) = V_{n2} \left( \frac{-g_{m1} R_1 g_{m2} R_s}{1 + g_{m2} R_s} \right) \tag{21}$$

For the differential outputs, if  $R_s = 1/g_{m2}$ , then the effective noise of M2 is canceled, as per the following equation:

$$V_{out,n2} = V_{n2} \left( \frac{g_{m1} R_1 g_{m2} R_s - g_{m2} R_2}{1 + g_{m2} R_s} \right) \tag{22}$$
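The balance and cancellation conditions in (18) to (22) can be verified numerically. In the sketch below, the transconductance and load values are hypothetical and chosen only so that $g_{m1}R_1 = g_{m2}R_2$ and $R_s = 1/g_{m2}$ hold; the function names are assumptions.

```python
def balun_gain(gm1, r1, gm2, r2, rs):
    """Differential gain of the active balun, per (18)."""
    return (gm1 * r1 + gm2 * r2) / (1.0 + gm2 * rs)

def m2_noise_gain(gm1, r1, gm2, r2, rs):
    """Differential transfer of the M2 gate-referred noise, per (22)."""
    return (gm1 * r1 * gm2 * rs - gm2 * r2) / (1.0 + gm2 * rs)

# Hypothetical design values (siemens, ohms).
gm1, r1 = 20e-3, 100.0
gm2 = 10e-3
r2 = gm1 * r1 / gm2   # enforce gm1*R1 = gm2*R2 for matched swings
rs = 1.0 / gm2        # enforce Rs = 1/gm2 for noise cancellation

gain_matched = balun_gain(gm1, r1, gm2, r2, rs)
noise_matched = m2_noise_gain(gm1, r1, gm2, r2, rs)
# A 20% error in Rs leaves a residual M2 noise contribution.
noise_mismatched = m2_noise_gain(gm1, r1, gm2, r2, 1.2 * rs)

print(f"gain = {gain_matched:.2f} (gm1*R1 = {gm1 * r1:.2f})")
print(f"M2 noise gain: matched {noise_matched:.3f}, "
      f"20% Rs error {noise_mismatched:.3f}")
```

The residual term in the mismatched case indicates how sensitive the cancellation is to the TIA output resistance deviating from $1/g_{m2}$.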

Considering noise from other sources, M1, $R_1$ and $R_2$, we can calculate $\overline{V_{o,n}^2}$ and $\overline{V_{i,n}^2}$ as follows:

$$\overline{V_{o,n}^2} = 4kTR \left[ 2 + \gamma g_m R \right] \tag{23}$$

$$\overline{V_{i,n}^2} = \frac{8kT \left[ 1 + \gamma g_m R/2 \right]}{g_m^2 R} \tag{24}$$
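As a quick numerical comparison of (17) and (24), the sketch below evaluates both input-referred noise expressions for the same $g_m$ and $R$. The values $g_m = 20$ mS, $R = 100\ \Omega$, $\gamma = 1$ and $T = 300$ K are hypothetical and serve only to illustrate the ratio; they are not taken from the design.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant (J/K)

def vin2_diff_amp(gm, r, gamma=1.0, temp_k=300.0):
    """Input-referred noise PSD of the differential amplifier, per (17)."""
    return 8.0 * K_B * temp_k * (1.0 + gamma * gm * r) / (gm**2 * r)

def vin2_balun(gm, r, gamma=1.0, temp_k=300.0):
    """Input-referred noise PSD of the active balun, per (24)."""
    return 8.0 * K_B * temp_k * (1.0 + gamma * gm * r / 2.0) / (gm**2 * r)

gm, r = 20e-3, 100.0   # hypothetical values, gm*R = 2
power_ratio = vin2_balun(gm, r) / vin2_diff_amp(gm, r)
rms_ratio = math.sqrt(power_ratio)

print(f"balun / diff-amp noise: power {power_ratio:.3f}, RMS {rms_ratio:.3f}")
```

The ratio reduces to $(1 + \gamma g_m R/2)/(1 + \gamma g_m R)$, which is always below one and approaches 1/2 in power for large $\gamma g_m R$.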

Comparing (24) with (17), the input-referred noise of the active balun is smaller than that of a differential amplifier with dummy TIA. Fig. 10(c) compares the noise, when referred to the TIA input, of the active balun with a differential amplifier based implementation when both are designed and simulated for same gain and BW. The total RMS input-referred noise of the TIA with active balun and differential amplifier is  $2.2 \mu A$  and  $3.1 \mu A$  RMS, respectively. Apart from minimizing the mismatches in gain and delay as compared to a dummy TIA, the active balun provides gain ( $A_{Balun}$ ), thereby reducing the input referred noise from the MAs. The loads  $R_1$  and  $R_2$  are implemented as active shunt-peaking inductors for BW



Fig. 12. (a) Die micrograph of the APD O/E-RXFE in  $0.13-\mu m$  CMOS. (b) Measured optical eye diagram at 10 Gb/s. (c) Simulated (post-layout) magnitude and phase response, and (d) optical eye diagram.

extension [30]–[32] at an expense of additional noise. The active balun is followed by 4 stages of MAs, with each stage implemented as a differential amplifier employing shunt-peaking active inductors. MAs are provided with two different power supplies, 1.24 V for the transistors and 1.54 V for resistors of active inductor. The amplified signal is finally buffered to the output using 50- $\Omega$  drivers for measurement purposes.

The APD O/E-RXFE, as shown in Fig. 6, is implemented as a proof-of-concept prototype. Fig. 12(a) shows the die micrograph. The core area of the APD is 10 $\mu$m $\times$ 10 $\mu$m and that of the TIA with MAs, buffers and offset cancellation loop is 170 $\mu$m $\times$ 140 $\mu$m (without pads). For measurement purposes, the chip is wirebonded to a CQFP80 package and soldered onto a PCB. The power consumption for the TIA is 1.74 mW and for the balun and MA stages is 3.96 mW. A more detailed breakdown of the power consumed in various RXFE circuits is shown in Fig. 11(c) based on post-layout simulations. The optical eye diagram measurement for the APD O/E-RXFE at $P_{avg} = -18.8$ dBm is shown in Fig. 12(b), for a 10 Gb/s PRBS7 signal generated using a VCSEL rated at 25 Gb/s equivalent BW at a 6 dB extinction ratio. For the sake of comparison, a post-layout simulated optical eye diagram is also shown in Fig. 12(d). Fig. 12(c) shows the magnitude and phase response of the RXFE. A further redesign of the RXFE considering peak-distortion analysis [38], time-domain response [32] or phase response [39] should reduce the data-dependent jitter in the output eye. Table I provides the performance summary and comparison to prior-art CMOS linear (non-clocked) TIA based 10 Gb/s receivers operating at 850 nm with CMOS APD [36] and n-well based PD [37], along with the state-of-the-art designs with external PDs [33]-[35]. To the authors' knowledge, this work achieves the best sensitivity for a 10 Gb/s CMOS linear RXFE at 850 nm. The design also compares favorably to the state of the art in area, power and energy efficiency. BER measurements are carried out using an SHF 11125 analyzer and a bathtub plot is shown in Fig. 13. Further improvements in sensitivity or data rate can be achieved by leveraging linear equalization techniques [37], [40], [41] or inductive-peaking [31], [32].
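The reported power figures and the energy efficiency quoted in the abstract are related by simple arithmetic, shown below as a short check. The variable names are illustrative.

```python
p_tia_mw = 1.74        # TIA power (mW), from the text
p_balun_ma_mw = 3.96   # balun and MA stages (mW), from the text
data_rate_bps = 10e9   # 10 Gb/s

p_total_mw = p_tia_mw + p_balun_ma_mw
# Energy per bit in pJ/b: (W) / (b/s) = J/b, then scaled to pJ.
energy_pj_per_bit = p_total_mw * 1e-3 / data_rate_bps * 1e12

print(f"total RXFE power: {p_total_mw:.2f} mW")
print(f"energy efficiency: {energy_pj_per_bit:.2f} pJ/b")
```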

# V. VOLTAGE BOOSTER FOR APD BIAS GENERATION

In order to provide the large reverse-bias voltage needed for the APD  $(VDD_{HI})$  from an external supply of 2.5 V, a fully integrated voltage booster is implemented. Fig. 14(a) shows the simplified schematic of the proposed voltage booster. The power for the voltage booster is provided through  $V_{REF}$ , which can be swept from 0.5 V to 2.5 V.  $V_{REF}$  can also be controlled from the output of a DAC, as described later in Section V. A bias voltage,  $V_{bias}$ , controls the dropout across the transistor M1 to provide a variable supply,  $V_{DDX}$ , to a pair of inverters driven by differential clock phases, CLK and CLKB. Provided externally in the prototype, these differential signals can be easily generated by an on-chip ring oscillator. The output of the inverters swings from 0 to  $V_{DDX}$ , and can therefore be varied as needed.  $V_{DDX}$  also provides the input voltage to the core of the voltage booster.

The core of the voltage booster is modified from a Dickson voltage multiplier [42]. Fig. 14(b) shows the conventional NMOS-based Dickson voltage booster, in which diode-connected NMOS transistors are connected in series and a capacitor is attached to each intermediate node. The bottom plates of these capacitors are connected to differential phases of a clock in an alternating fashion. A disadvantage of the traditional Dickson architecture is that the diode-connected NMOS transistors turn off when the gate-source bias falls below the threshold voltage  $V_{TN}$. The threshold drop can be compensated by using a back-compensated voltage booster implemented using PMOS transistors, as proposed in [43]. A PMOS transistor requires a negative gate-source voltage below its threshold to remain ON. As shown in Fig. 14(c), a negative gate-source bias voltage is provided by a diode connecting the PMOS gate to the source of the previous stage instead of the traditional diode connection. The voltage of the intermediate nodes, where the top plates of the capacitors are connected, increases across the stack. As the bottom plates of the capacitors are fed with differential signals of the same swing, these charge pump architectures impose a high voltage stress on the capacitors of the last few stages.

A modified version of the threshold-compensated PMOS rectifier based voltage booster core is shown in Fig. 14(d), where voltage drops are limited to no more than 2.5 V (maximum value of  $V_{REF}$ ) between any two nodes of any of the capacitors or transistors. Furthermore, thick-oxide I/O transistors and MIM capacitors are used. The capacitor connected to the OUT node is implemented as a series of multiple capacitors to reduce the voltage drop and ensure reliability of each capacitor. This allows the proposed circuit to achieve high voltages in a standard CMOS process without reliability issues. As the breakdown voltage for the substrate wells in this process is limited to the  $\sim 10$ to 12 V range, the operation of the APD at a bias voltage of < 10 V is permissible. If larger bias voltages are needed, or if the system is to be designed in a scaled CMOS process, substrate isolation and field oxide isolation techniques as proposed recently in [44] can be adopted.

The proposed voltage booster has 11 cascaded stages. Ideally, each stage should increase the voltage by  $V_{REF} - V_{thp}$,

|                         | [33]    | [34]           | [35]      | [36]    | [37]            | This Work  |
|-------------------------|---------|----------------|-----------|---------|-----------------|------------|
| CMOS Tech. (nm)         | 120     | 130            | 65        | 65      | 65              | 130        |
| Arch.                   | TIA+MA  | TIA+MA         | TIA+MA    | APD+TIA | Nwell-PD+TIA+MA | APD+TIA+MA |
| Gain (dBΩ)              | 81.1*   | 87             | 63.2      | 60      | 102             | 71         |
| BW (GHz)                | NA      | 6.6            | NA        | 6       | 12.5/0.5        | NA/3.5     |
| Resp. (A/W)             | NA      | 0.67           | 0.55      | NA      | NA              | 3.92       |
| Data Rate (Gb/s)        | 10      | 10             | 10        | 10      | 9               | 10         |
| BER                     | 1e-12   | 1e-12          | 1e-12     | 1e-12   | 1e-12           | 1e-12      |
| PRBS                    | 7       | 7              | 7         | 7       | 15              | 7          |
| VDD (V)                 | 1.0/1.7 | 1.8            | NA        | 1.2     | 1/1.2           | 1.24/1.54  |
| VDD-PD (V)              | 3.5     | 2.5            | NA        | 10.7    | 0.5             | 9.51       |
| Power (mW)              | 8*      | 44* **         | 68.2      | 13.7*   | 48* ***         | 5.7*       |
| Energy/bit (pJ/b)       | 0.8     | 3.52**         | 6.82      | 1.37    | 5.33***         | 0.57       |
| Area (mm<sup>2</sup>)   | 0.043   | NA             | 0.004**** | 0.024   | 0.23            | 0.024      |
| Pavg (dBm)              | -13.1   | -12.3 to -12.7 | -16.1     | -6.5    | -11.5           | -18.8      |
| OMA (dBm)               | -13.1   | -11.5          | -15.6     | NA      | NA              | -18        |

TABLE I
PERFORMANCE SUMMARY AND COMPARISON TO 850 NM LINEAR CMOS TIAS

<sup>\*</sup> Excluding 50 $\Omega$ buffer, \*\* @ 12.5 Gb/s, \*\*\* @ 9 Gb/s, \*\*\*\* Excluding offset cancellation



Fig. 13. Measured BER vs. RXFE sensitivity at 10 Gb/s.

however, due to leakage and parasitics, the amount by which the voltage is increased in each stage diminishes as more stages are added. The voltage booster works with a wide range of clock frequencies, from 120 MHz to 2.4 GHz. Higher frequencies are preferred to minimize the output ripple and the overall circuit area. To further reduce the ripple, two booster cores can be used in parallel with opposite clock phases. The output range of the voltage booster can be varied from 0 V to 10 V through  $V_{bias}$ ( $V_{DDX}$). Across PVT corners, the output of the voltage booster varies by up to approximately 1 V based on simulations, which can be mitigated by the bias stabilization loop.
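For reference, the trends described above follow from the textbook first-order model of an $N$-stage Dickson multiplier [42]. It is reproduced here only as a sketch: the pump capacitance $C$, the stray capacitance per node $C_s$, the load current $I_{out}$, the output capacitance $C_{out}$, the clock amplitude $V_{\phi}$, the clock frequency $f$ and the per-stage device drop $V_d$ are generic symbols and are not values taken from this design.

$$V_{out} \approx V_{in} - V_d + N\left[\frac{C}{C + C_s}\,V_{\phi} - V_d - \frac{I_{out}}{(C + C_s)\,f}\right], \qquad V_{ripple} \approx \frac{I_{out}}{f\,C_{out}}$$

With $C_s = 0$ and $I_{out} = 0$, the bracket reduces to the ideal per-stage gain $V_{\phi} - V_d$, which has the same form as the $V_{REF} - V_{thp}$ increment quoted above. Both the load-dependent loss term and the ripple scale as $1/f$, so a higher clock frequency allows smaller capacitors for the same ripple, in line with the stated preference for higher frequencies.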

Fig. 15(a) shows the die micrograph of the voltage booster. It occupies an area of $400\ \mu\text{m} \times 140\ \mu\text{m}$, mostly limited by the area of the capacitors in the voltage booster core. The measured voltage booster output voltage,  $VDD_{HI}$, as a function of  $V_{REF}$, is plotted in Fig. 15(b). With a nominal I/O supply of 2.5 V in this process, the voltage booster output varies from 0 to 7.42 V. This is about 3 V smaller than the design target, and the discrepancy is attributed to additional leakage not accounted for in the simulations.



Fig. 14. (a) Proposed voltage booster circuit, (b) conventional Dickson voltage multiplier, (c) back-compensated charge multiplier [43], and (d) proposed charge multiplier.

This prevented the measurement of the system with the voltage booster and APD together. Assuming the leakage scales linearly with the number of stages, we estimate that three stages must be added in future work to provide the necessary bias voltage ( $\approx 10 \text{ V}$) for the APD.
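One simple way to sanity-check this estimate is to assume that the average gain per stage measured on the 11-stage prototype stays the same when stages are added. This proportional scaling is an assumption made here for illustration and is likely optimistic, since the per-stage increase diminishes as stages are added; the authors' own leakage model is not given in the text. The Python sketch below uses only the measured 7.42 V output and the stage counts stated above.

```python
MEASURED_OUT_V = 7.42   # measured maximum booster output with 11 stages
STAGES_BUILT = 11       # stages in the fabricated prototype
ADDED_STAGES = 3        # additional stages suggested for future work

def projected_output_v(total_stages: int) -> float:
    """Scale the measured output by stage count (assumes unchanged average gain per stage)."""
    return MEASURED_OUT_V * total_stages / STAGES_BUILT

gain_per_stage_v = MEASURED_OUT_V / STAGES_BUILT
projected_v = projected_output_v(STAGES_BUILT + ADDED_STAGES)

print(f"Average gain per stage: {gain_per_stage_v:.3f} V")
print(f"Projected output with {STAGES_BUILT + ADDED_STAGES} stages: {projected_v:.2f} V")
```

Under this assumption, 14 stages give roughly 9.4 V, which is close to the 9.51 V APD bias listed in Table I.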

# VI. CMOS APD O/E-RX SYSTEM WITH BIAS STABILIZATION

As described in Section II, the performance of the APD is sensitive to the reverse bias voltage and temperature, and all prior-art bias stabilization circuits have been limited to off-chip implementations. Typically, a shift in temperature by 1°C changes the APD bias by 0.05% [10] for a Si APD and 0.2% [3] for a Ge APD. To maintain stable performance,



Fig. 15. (a) Die micrograph of voltage booster in 0.13  $\mu$ m CMOS, and (b) its measured output vs. input voltage characteristic.

either a constant operating temperature can be maintained for the APD, or the bias voltage of the APD can be tuned. Heating or cooling the APD is power-inefficient, bulky, and not easily compatible with monolithic CMOS applications.
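To give a sense of scale for the bias tuning option, the Python sketch below applies the 0.05%/°C coefficient quoted above for a Si APD to the 9.51 V APD bias listed in Table I. The 50 °C temperature excursion and the linear (non-compounded) scaling are assumptions chosen here for illustration only.

```python
def bias_shift_v(v_bias: float, coeff_pct_per_c: float, delta_t_c: float) -> float:
    """Required change in APD bias for a temperature change, using a linear approximation."""
    return v_bias * (coeff_pct_per_c / 100.0) * delta_t_c

V_BIAS = 9.51          # APD bias of this work (Table I), in volts
SI_COEFF = 0.05        # percent per degree C for a Si APD, as quoted in the text
DELTA_T = 50.0         # hypothetical temperature excursion in degrees C

shift_v = bias_shift_v(V_BIAS, SI_COEFF, DELTA_T)
print(f"Bias correction over {DELTA_T:.0f} C: {shift_v * 1000:.0f} mV")
```

Under these assumptions the required correction is about 0.24 V, which is small compared with the tuning range of the voltage booster.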

Based on the fact that any change in temperature leads to a change in the APD I-V characteristics [4], [15], [18], temperature sensors can be eliminated in a fully integrated system (Fig. 6). A change in the biasing of the APD due to temperature variations leads to a change in its responsivity, and therefore, the overall gain of the RXFE is also affected. The average of the signal extracted at the output of the MA using an LPF in Fig. 6 carries the information of the varying responsivity of the APD. Because the high-speed signal path inherently has an LPF in the offset correction loop so as to effectively AC-couple the incoming high-speed current from the APD, the output of this LPF can be tapped by both the error amplifier of the TIA and an ADC. The digitized output from the ADC is then processed by the control logic to generate a control voltage,  $V_{REF}$, from a DAC. In our system simulations, we assume 7 bits for the DAC and ADC. A simple successive-approximation-register (SAR) ADC is sufficient for such an implementation.

The output signal current of the APD shows a steady rise with increasing reverse-bias voltage until the breakdown voltage. Beyond the breakdown voltage, there is an exponential increase in the noise current, leading to saturation of the RXFE. If we consider the low-pass envelope output as a function of the reverse-bias voltage, we see that there is a steady increase in the derivative of the slope until the PD reaches the avalanche region. Near the avalanche region there is a sudden increase in the derivative of the slope as a result of the avalanche effect, and beyond the avalanche region the derivative of the slope reduces as the current becomes nearly linear. Thus, the plot for the second derivative of the LPF output  $(\Delta^2 V_{LPF})$  vs. the reverse-bias voltage peaks near the avalanche region, as shown in Fig. 16(b).
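The peak in the second derivative can be illustrated numerically. The Python sketch below uses a purely synthetic LPF-output curve (a small linear term plus a softplus knee) that only mimics the qualitative shape described above; the knee position, slopes and bias grid are hypothetical values and are not measured data from this work. It computes the discrete second difference $\Delta^2 V_{LPF}$ and locates its maximum.

```python
import math

def lpf_output(v_bias, v_knee=9.5, softness=0.15, base_slope=0.01, aval_slope=0.5):
    """Synthetic LPF output (arbitrary units) vs. reverse bias; NOT measured data."""
    z = (v_bias - v_knee) / softness
    # Numerically stable softplus: gentle slope below the knee, nearly linear above it.
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return base_slope * v_bias + aval_slope * softness * softplus

def second_difference(samples):
    """Discrete second derivative for uniformly spaced samples (interior points only)."""
    return [samples[i + 1] - 2.0 * samples[i] + samples[i - 1]
            for i in range(1, len(samples) - 1)]

bias = [8.0 + 0.05 * k for k in range(61)]       # 8.0 V to 11.0 V in 50 mV steps
v_lpf = [lpf_output(v) for v in bias]
d2 = second_difference(v_lpf)

peak_index = max(range(len(d2)), key=d2.__getitem__)
peak_bias = bias[peak_index + 1]                 # d2[0] corresponds to bias[1]
print(f"Second difference peaks near {peak_bias:.2f} V")
```

For this synthetic curve the peak falls at the assumed knee voltage, which is the point a peak-tracking loop would lock to.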

A modified hill-climbing algorithm that can be easily implemented using digital logic is adopted to track the peak in the second derivative of the LPF output, which is a function of the bias voltage of the APD. In Fig. 16(a), the onset of the avalanche region is encircled. The reverse-bias voltage of the APD should be increased optimally beyond this point, as a large excursion can lead to the diode being permanently damaged due to



Fig. 16. (a) APD I-V curve shows APD breakdown due to excessive reverse bias voltage, and (b) derivative of the average TIA output vs. the reverse bias voltage.



Fig. 17. Control logic FSM for APD bias stabilization.

very high currents flowing through the APD (Fig. 16(a)). Therefore, a major difference between conventional hill-climbing algorithms and the proposed algorithm is that the algorithm takes the slope at avalanche,  $\Delta x_{ref}$ , as input and does not allow the system to overshoot this value. The reference slope,  $\Delta x_{ref}$ , is fed to the control logic based on the characteristic calibration of the APD. The control algorithm also takes into account shifts in I-V curve because of temperature effects while setting the bias of the APD.

The logic for the control algorithm has four states: START, UP, DOWN and WAIT. Fig. 17 shows the finite-state-machine (FSM) of the control logic. The system is reset at the START state. In the UP or DOWN state, the control voltage is incremented or decremented, respectively. In the WAIT state, the control voltage is kept steady. The control voltage may be incremented or decremented in steps of one or two to achieve faster settling.

At every clock cycle, the present ADC value, x(n), is compared with the previous value, x(n-1), and the difference between them,  $\Delta x$, is computed. This difference is then compared with the reference slope shown in Fig. 18(a):  $\Delta ref = \Delta x_{ref} - |\Delta x|$. The control remains in the UP state until it reaches the avalanche region. Once it reaches avalanche, it either moves to the WAIT state or dithers at the top based on the value of  $\Delta x_{ref}$, as shown in Fig. 18(a) and Fig. 18(b). The cases for positive and negative temperature drift are highlighted in Fig. 18(c) and Fig. 18(d), respectively. As shown in Fig. 18(c), if there is a rise in temperature, the bias voltage



Fig. 18. Control logic cases: (a) Locking at the avalanche region, (b) dithering at the top, (c) positive temperature drift, and (d) negative temperature drift.



Fig. 19. Transient response of APD O/E-RX system. (a) APD current, (b) LPF output and (c) APD bias voltage.

is increased to lock to the avalanche region of the shifted curve. Similarly, the loop adjusts the bias voltage in case of a temperature fall.
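The control flow described above can be sketched in software. The following Python model is a minimal illustration under stated assumptions, not the authors' Verilog: the exact transition conditions of Fig. 17 are not reproduced in the text, so the rules in `step` are an assumed reading of it; `lpf_adc` is a synthetic, unquantized plant whose `shift` argument mimics a temperature-induced shift of the I-V curve; and `DX_REF` and `TOL` are hypothetical values. Unlike the described algorithm, this sketch detects the reference slope by exceeding it once and then backing off.

```python
import math
from enum import Enum

class State(Enum):
    START = "START"
    UP = "UP"
    DOWN = "DOWN"
    WAIT = "WAIT"

CODE_MAX = 127   # 7-bit DAC, as assumed in the system simulations
DX_REF = 4.0     # hypothetical reference slope (ADC units per DAC step)
TOL = 0.5        # hypothetical dead band used while in WAIT

def lpf_adc(code, shift):
    """Synthetic LPF reading vs. DAC code; `shift` mimics a temperature drift."""
    return 5.0 + math.exp((code - 40 - shift) / 12.0)

def step(state, code, x_now, x_prev):
    """One clock cycle of the assumed FSM; returns (next_state, next_code)."""
    dx = x_now - x_prev
    d_ref = DX_REF - abs(dx)                 # delta_ref = dx_ref - |dx|
    if state is State.START:
        return State.UP, min(code + 1, CODE_MAX)
    if state is State.UP:
        if d_ref > 0:                        # slope below reference: keep climbing
            return State.UP, min(code + 1, CODE_MAX)
        if d_ref == 0:                       # slope equals reference: lock
            return State.WAIT, code
        return State.DOWN, max(code - 1, 0)  # slope above reference: back off
    if state is State.DOWN:
        if d_ref < 0:                        # still steeper than reference
            return State.DOWN, max(code - 1, 0)
        return State.WAIT, code
    # WAIT: hold the code unless the reading drifts outside the dead band.
    if dx < -TOL:                            # reading fell, e.g. temperature rise
        return State.UP, min(code + 1, CODE_MAX)
    if dx > TOL:                             # reading rose, e.g. temperature fall
        return State.DOWN, max(code - 1, 0)
    return State.WAIT, code

def run(cycles, shift, state=State.START, code=0, x_prev=None):
    """Clock the FSM for a number of cycles against the synthetic plant."""
    if x_prev is None:
        x_prev = lpf_adc(code, shift)
    for _ in range(cycles):
        x_now = lpf_adc(code, shift)
        state, code = step(state, code, x_now, x_prev)
        x_prev = x_now
    return state, code, x_prev

state, code, x_prev = run(200, shift=0)                       # initial lock
hot = run(200, shift=5, state=state, code=code, x_prev=x_prev)
cold = run(200, shift=-5, state=state, code=code, x_prev=x_prev)
print(state.value, code, hot[1], cold[1])
```

In this toy model the loop settles in WAIT, moves to a higher code when the curve shifts toward higher voltages (temperature rise) and to a lower code when it shifts the other way, mirroring the cases of Fig. 18(c) and Fig. 18(d).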

To verify the algorithm, the control logic is written in Verilog. Simulation results for the entire APD O/E-RX system of Fig. 6, with Verilog implementations for the off-chip ADC, control logic, DAC, and on-chip post-layout implementations for the rest of the circuits, are shown in Fig. 19. Fig. 19(a) shows the high-speed output current waveform of the APD. Fig. 19(b) plots the corresponding LPF output, which is equivalent to the average value of the APD current. As seen in Fig. 19(c), the control loop ensures that the bias voltage of the APD is slowly increased until it reaches the avalanche region and is then kept steady. The ripple after settling is less than 5 mV and does not impact the RX eye diagram. This technique of bias stabilization can also be applied in external-APD-based RXs and Ge-doped CMOS processes to enhance the RX performance.

# VII. CONCLUSION

An APD-based O/E-RX greatly relaxes the sensitivity requirements of the electronic RXFE because of the inherent avalanche gain of the APD. However, due to the high reverse bias requirement and temperature sensitivity of the APD, APD-based RXs have been traditionally implemented as multi-die solutions. This work proposes the first monolithic solution of a CMOS-based O/E-RX with APD. Fully-monolithic implementation further improves the bandwidth at the input of the electronic RXFE by eliminating package parasitics. A simple solution for on-chip biasing and bias stabilization of the APD is also described. The APD O/E-RXFE achieves the best-reported sensitivity among 850 nm linear CMOS TIAs at 10 Gb/s.

#### ACKNOWLEDGMENT

The authors would like to thank R. Mehrabadi for computer aided design (CAD) tools assistance, Dr. R. Rosales and M. Al-Taha for technical help, and Prof. L. Chrostowski, Prof. N. Jaeger, Dr. H. Jayatilleka, Dr. G. Polovy, H. Shoman, and J. Schmidt of UBC for providing measurement help. Access to CAD tools was facilitated by CMC Microsystems.

#### REFERENCES

- [1] A. N. Tait et al., "Feedback control for microring weight banks," Opt. Express, vol. 26, no. 20, pp. 26422–26443, Oct. 2018.
- [2] D. Mahgerefteh et al., "Techno-economic comparison of silicon photonics and multimode VCSELs," J. Lightw. Technol., vol. 34, no. 2, pp. 233–242, Jan. 15, 2016.
- [3] E. Sackinger, Broadband Circuits for Optical Fiber Communication. Hoboken, NJ, USA: Wiley, 2005.
- [4] M.-J. Lee and W.-Y. Choi, "A silicon avalanche photodetector fabricated with standard CMOS technology with over 1 THz gain-bandwidth product," Opt. Express, vol. 18, no. 23, pp. 24189–24194, Nov. 2010.
- [5] A. H. Ahmed, A. Sharkia, B. Casper, S. Mirabbasi, and S. Shekhar, "Silicon-photonics microring links for datacenters—Challenges and opportunities," *IEEE J. Sel. Topics Quantum Electron.*, vol. 22, no. 6, pp. 194–203, Nov./Dec. 2016.
- [6] N. Duan, T.-Y. Liow, A. E.-J. Lim, L. Ding, and G. Q. Lo, "310 GHz gain-bandwidth product Ge/Si avalanche photodetector for 1550 nm light detection," *Opt. Express*, vol. 20, no. 10, pp. 11031–11036, May 2012.
- [7] J. Choi, B. J. Sheu, and O. T.-C. Chen, "A monolithic GaAs receiver for optical interconnect systems," *IEEE J. Solid-State Circuits*, vol. 29, no. 3, pp. 328–331, Mar. 1994.
- [8] C. Takano, K. Tanaka, A. Okubora, and J. Kasahara, "Monolithic integration of 5-Gb/s optical receiver block for short distance communication," *IEEE J. Solid-State Circuits*, vol. 27, no. 10, pp. 1431–1433, Oct. 1992.
- [9] J. H. Jang, G. Cueva, D. C. Dumka, W. E. Hoke, P. J. Lemonias, and I. Adesida, "Long-wavelength In<sub>0.53</sub> Ga<sub>0.47</sub>As metamorphic p-i-n photodiodes on GaAs substrates," *IEEE Photon. Technol. Lett.*, vol. 13, no. 2, pp. 151–153, Feb. 2001.
- [10] J. E. Bowers, D. Dai, Y. Kang, and M. Morse, "High-gain high-sensitivity resonant Ge/Si APD photodetectors," *Proc. SPIE*, vol. 7660, p. 76603H, May 2010.
- [11] J. Wang and S. Lee, "Ge-photodetectors for Si-based optoelectronic integration," *Sensors*, vol. 11, no. 1, pp. 696–718, 2011.
- [12] F. Tavernier and M. S. J. Steyaert, "High-speed optical receivers with integrated photodiode in 130 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 10, pp. 2856–2867, Oct. 2009.
- [13] T. S. C. Kao, F. A. Musa, and A. C. Carusone, "A 5-Gbit/s CMOS optical receiver with integrated spatially modulated light detector and equalization," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 11, pp. 2844–2857, Nov. 2010.
- [14] G. P. Agrawal, Fiber-Optic Communication Systems. Hoboken, NJ, USA: Wiley, 2002.
- [15] K. K. Hamamatsu-Photonics. (Nov. 2011). Si APD. Accessed: Jul. 2018. [Online]. Available: https://www.hamamatsu.com/resources/pdf/ssd/siapdkapd0001e.pdf
- [16] W. S. Zaoui et al., "Frequency response and bandwidth enhancement in Ge/Si avalanche photodiodes with over 840 GHz gain-bandwidth-product," Opt. Express, vol. 17, no. 15, pp. 12641–12649, Jul. 2009.

- [17] T. Ikagawa et al., "Performance of large-area avalanche photodiode for low-energy X-rays and γ-rays scintillation detection," Nucl. Instrum. Methods Phys. Res. Sect. A, Accel. Spectrometers, Detect. Assoc. Equip., vol. 515, no. 3, pp. 671–679, Dec. 2003.
- [18] H. T. Chen et al., "High sensitivity 10 Gb/s Si photonic receiver based on a low-voltage waveguide-coupled Ge avalanche photodetector," Opt. Express, vol. 23, no. 2, pp. 815–822, Jan. 2015.
- [19] M. A. P. Garcia, J. C. C. Rodriguez, N. A. B. Mendoza, and J. C. Anton Alvarez, "Low-cost temperature stabilization in APD photo sensors by means a high frequency switching DC/T converter," in *Proc.* 19th IEEE Instrum. Meas. Technol. Conf., May 2002, pp. 1733–1737.
- [20] J. Kataoka et al., "An active gain-control system for Avalanche photodiodes under moderate temperature variations," Nucl. Instrum. Methods Phys. Res. Sect A, Accel. Spectrometers, Detect. Assoc. Equip., vol. 564, no. 1, pp. 300–307, Aug. 2006.
- [21] N. Zhang, J. Camp, and M. Schmand, "Temperature compensation schemes for APD detectors in PET," in *Proc. IEEE Nucl. Sci. Symp. Conf. Rec.*, Oct. 2011, pp. 2995–2996.
- [22] L. Tian, "Bias voltage compensating circuit," U.S. Patent 103940507A, Jul. 23, 2014.
- [23] W. Wang, "Dynamic control of photodiode bias voltage," U.S. Patent 7 103 288 B2, Jun. 9, 2006.
- [24] S. Deng, "Control circuits for avalanche photodiodes," Ph.D. dissertation, Dept. Elect. Electron. Eng., Univ. College Cork, Cork, Ireland, 2013.
- [25] D. O. Connell, A. P. Morrison, K. G. Mccarthy, P. Angove, and B. O. Flynn, "Miniature gain and bias control circuit for avalanche photodiodes," *Electron. Lett.*, vol. 43, no. 5, pp. 67–68, Mar. 2007.
- [26] M. Lee, H. Rucker, and W.-Y. Choi, "Effects of guard-ring structures on the performance of silicon avalanche photodetectors fabricated with standard CMOS technology," *IEEE Electron Device Lett.*, vol. 33, no. 1, pp. 80–82, Jan. 2012.
- [27] J. S. Youn, H.-S. Kang, M.-J. Lee, K.-Y. Park, and W.-Y. Choi, "High-speed CMOS integrated optical receiver with an avalanche photodetector," *IEEE Photon. Technol. Lett.*, vol. 21, no. 20, pp. 1553–1555, Oct. 15, 2009.
- [28] F. Y. Liu et al., "10-Gbps, 5.3-mW optical transmitter and receiver circuits in 40-nm CMOS," IEEE J. Solid-State Circuits, vol. 47, no. 9, pp. 2049–2067, Sep. 2012.
- [29] F. Bruccoleri, E. A. M. Klumperink, and B. Nauta, "Wide-band CMOS low-noise amplifier exploiting thermal noise canceling," *IEEE J. Solid-State Circuits*, vol. 39, no. 2, pp. 275–282, Feb. 2004.
- [30] E. Sackinger and W. C. Fischer, "A 3-GHz 32-dB CMOS limiting amplifier for SONET OC-48 receivers," *IEEE J. Solid-State Circuits*, vol. 35, no. 12, pp. 1884–1888, Dec. 2000.
- [31] S. Shekhar, J. S. Walling, and D. J. Allstot, "Bandwidth extension techniques for CMOS amplifiers," *IEEE J. Solid-State Circuits*, vol. 41, no. 11, pp. 2424–2439, Nov. 2006.
- [32] J. S. Walling, S. Shekhar, and D. J. Allstot, "Wideband CMOS amplifier design: Time-domain considerations," *IEEE Trans. Circuits Syst. I, Regular Papers*, vol. 55, no. 7, pp. 1781–1793, Aug. 2008.
- [33] D. Guckenberger, J. D. Schaub, D. Kucharski, and K. T. Kornegay, "1 V, 10 mW, 10 Gb/s CMOS optical receiver front-end," in *IEEE Radio Freq. Integr. Circuit Symp. Dig. Papers*, Jun. 2005, pp. 309–312.
- [34] C. L. Schow, F. E. Doany, C. W. Baks, Y. H. Kwark, and D. M. Kuchta, "A single-chip CMOS-based parallel optical transceiver capable of 240-Gb/s bidirectional data rates," *J. Lightw. Technol.*, vol. 27, no. 7, pp. 915–929, Apr. 1, 2009.
- [35] H. Morita et al., "A 12×5 two-dimensional optical I/O array for 600 Gb/s chip-to-chip interconnect in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 140–141.
- [36] H.-Y. Jung, J.-M. Lee, and W.-Y. Choi, "A high-speed CMOS integrated optical receiver with an under-damped TIA," *IEEE Photon. Technol. Lett.*, vol. 27, no. 13, pp. 1367–1370, Jul. 1, 2015.
- [37] Q. Pan, Y. Wang, Y. Lu, and C. P. Yue, "An 18-Gb/s fully integrated optical receiver with adaptive cascaded equalizer," *IEEE J. Quantum Electron.*, vol. 22, no. 6, Nov./Dec. 2016, Art. no. 6100509.
- [38] B. K. Casper, M. Haycock, and R. Mooney, "An accurate and efficient analysis method for multi-Gb/s chip-to-chip signaling schemes," in Symp. VLSI Circuits Dig. Tech. Papers, Jun. 2002, pp. 54–57.
- [39] Y. Chen, P.-I. Mak, H. Yu, C. C. Boon, and R. P. Martins, "An area-efficient and tunable bandwidth-extension technique for a wideband CMOS amplifier handling 50+ Gb/s signaling," *IEEE Trans. Microw. Theory Tech.*, vol. 65, no. 12, pp. 4960–4975, Dec. 2017.

- [40] S. Shekhar, J. E. Jaussi, F. O. Mahony, M. Mansuri, and B. Casper, "Design considerations for low-power receiver front-end in high-speed data links," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2013, pp. 1–8.
- [41] T. Musah et al., "A 4–32 Gb/s bidirectional link with 3-Tap FFE/6-Tap DFE and collaborative CDR in 22 nm CMOS," IEEE J. Solid-State Circuits, vol. 49, no. 12, pp. 3079–3090, Dec. 2014.
- [42] J. F. Dickson, "On-chip high-voltage generation in MNOS integrated circuits using an improved voltage multiplier technique," *IEEE J. Solid-State Circuits*, vol. JSSC-11, no. 3, pp. 374–378, Jun. 1976.
- [43] Z. Hameed and K. Moez, "Fully-integrated passive threshold-compensated PMOS rectifier for RF energy harvesting," in *Proc. IEEE 56th Int. Midwest Symp. Circuit Syst.*, Aug. 2013, pp. 129–132.
- [44] Y. Ismail, H. Lee, S. Pamarti, and C.-K. K. Yang, "A 36-V 49% efficient hybrid charge pump in nanometer-scale bulk CMOS technology," *IEEE J. Solid-State Circuits*, vol. 52, no. 3, pp. 781–798, Mar. 2017.



Spoorthi Nayak received the B.Tech. degree in electronics and communication engineering from the National Institute of Technology Karnataka, India, in 2014, and the M.A.Sc. degree in electrical and computer engineering from The University of British Columbia, Vancouver, BC, Canada, in 2017. From 2016 to 2018, she was with Microsemi, Vancouver, Canada. She is currently with the Signal Integrity Product Group at Semtech, Burlington, ON, Canada, where she is involved in high-performance mixed-signal integrated circuit design for high-speed I/O applications.



Abdelrahman H. Ahmed (S'14) received the B.Sc. degree (Hons.) in electronics and communication engineering from Alexandria University, Egypt, in 2012, and the M.Sc. degree in electronics engineering from American University, Cairo, Egypt, in 2014. He is currently pursuing the Ph.D. degree in electrical and computer engineering with The University of British Columbia, Vancouver, BC, Canada.

From 2012 to 2014, he was a Research Assistant with the Center for Nanoelectronics and Devices, Zewail City for Science and Technology, Cairo, Egypt. From 2017 to 2018, he was an Intern with Elenion Tech., New York, NY, USA. His research interests include circuits for high-speed electrical and optical I/O interfaces.



Ahmad Sharkia (S'15) received the B.A.Sc. and M.A.Sc. degrees in electrical and computer engineering from The University of British Columbia, Vancouver, BC, Canada, in 2013 and 2015, respectively, where he is currently pursuing the Ph.D. degree. In 2012, he was a Research Assistant with the UBC Microsystems and Nanotechnology Group, where he developed and tested high-sensitivity capacitive readout circuits. In 2014, he joined Qualcomm Inc., San Jose, CA, USA, as an Engineering Intern, where he developed an automated stroboscope-based system for characterizing digital microshutter (DMS) displays. His current research interests include RF, analog, and mixed-signal circuit design. He was a recipient of the NSERC Postgraduate Scholarship and the UBC Four Year Fellowship (FYF).



Ajith Sivadhasan Ramani (S'11) received the B.Tech. degree in electronics and communication engineering from the National Institute of Technology Karnataka, India, in 2014, and the M.A.Sc. degree in electrical and computer engineering from The University of British Columbia, Vancouver, BC, Canada, in 2017. He was with the Samsung Research Institute, Bangalore, India, from 2014 to 2015. In 2018, he was with Maxlinear, Vancouver, Canada. He is currently with the Signal Integrity Product Group at Semtech, Burlington, ON, Canada, where he is involved in high-performance mixed-signal integrated circuit design for high-speed I/O applications.



Shahriar Mirabbasi (S'95–M'02) received the B.Sc. degree in electrical engineering from the Sharif University of Technology, Tehran, Iran, in 1990, and the M.A.Sc. and Ph.D. degrees in electrical and computer engineering from The University of Toronto, Toronto, ON, Canada, in 1997 and 2002, respectively. Since 2002, he has been with the Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC, Canada, where he is currently a Professor. His current research interests include analog, mixed-signal, RF, and mm-wave integrated circuit and system design with an emphasis on communication, sensor interface, and biomedical applications.



**Sudip Shekhar** (S'00–M'10–SM'14) received the B.Tech. degree from the IIT Kharagpur, and the Ph.D. degree from the University of Washington, Seattle, in 2003 and 2008, respectively.

From 2008 to 2013, he was with the Circuits Research Laboratory, Intel Corporation, Hillsboro, OR, USA, where he was involved in the design of high-speed I/O architectures. He is currently an Associate Professor of electrical and computer engineering with The University of British Columbia. His current research interests include circuits for high-speed electrical and optical I/O interfaces, frequency synthesizers, and wireless transceivers.

Dr. Shekhar was a recipient of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS Darlington Best Paper Award in 2010 and a co-recipient of IEEE Radio-Frequency IC Symposium Best Student Paper Award in 2015.