# A Simple All-digital PET System

Qingguo Xie<sup>1,2</sup>, Chien-Min Kao<sup>2</sup>, Rongsheng Xia<sup>3</sup>, Xi Wang<sup>3</sup>, Na Li<sup>3</sup>, Xin Jiang<sup>3</sup>, Li Zhi<sup>2</sup>, Zhi Zhang<sup>2</sup>, Zhonghua Deng<sup>3</sup>, Chin-Tu Chen<sup>1</sup>

<sup>1</sup>Department of Radiology, The University of Chicago, Chicago, Illinois, USA <sup>2</sup>Department of Biomedical Engineering, Huazhong University of Science and Technology, Wuhan, China <sup>3</sup>Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan, China

## ABSTRACT

Positron emission tomography (PET) systems employ mixed-signal front-ends to carry out relatively simple, *ad hoc* processing of the charge pulses generated upon event detection. Obtaining, and maintaining over time, proper calibration of the mixed-signal circuitry so that it generates accurate event information is a challenging task because of the simplicity of the event processing and the huge number of channels and multiplexing of input signals found in modern PET systems. It is also difficult to modify or extend the event-processing technologies when needs arise because doing so would involve changing the circuitry. These limitations can be circumvented by applying digital signal-processing technologies to analyze the event pulses generated in PET. With digital technologies, optimized event-processing algorithms can be implemented, and they can be modified or extended with ease when needed. The resulting PET data-acquisition (DAQ) system is easier to calibrate and maintain, can generate more accurate event information, and has better extensibility. In this paper, we present our work toward developing a scalable all-digital DAQ system for PET, built upon a personal-computer platform to reduce cost. We will present the overall architecture of this digital DAQ system, and describe our implementations of

Keywords: Positron Emission Tomography, Front-end Electronics, Time-to-Digital Converter, Peripheral Component Interconnect

#### 1. INTRODUCTION

Positron Emission Tomography (PET) is a proven molecular imaging technology for providing quantitative, functional information of biological systems *in vivo*.<sup>1,2</sup> Clinically, PET has been widely used for cancer diagnosis<sup>1,2</sup> and has demonstrated value in the early detection and evaluation of other diseases, such as Parkinson's disease<sup>3</sup> and Alzheimer's disease.<sup>4,5</sup> In biomedical research, in combination with the use of animal models of human biology, PET is a powerful tool for studying normal biology and disease mechanisms, for assisting the development of drugs and disease treatments, and in translational research.

Medical Imaging 2007: Physics of Medical Imaging, edited by Jiang Hsieh, Michael J. Flynn, Proc. of SPIE Vol. 6510, 651041, (2007) · 1605-7422/07/\$18 · doi: 10.1117/12.713846

Further author information:

Qingguo Xie: E-mail: qgxie@uchicago.edu / qgxie@mail.hust.edu.cn, Telephone: 1 773 834 8061 Chien-Min Kao: E-mail: c-kao@uchicago.edu, Telephone: 1 773 702 6273

With the proven value and growing utilization of PET, there arises the need for improving the performance characteristics of the technology. One area that can substantially affect the performance characteristics of a PET system is the data-acquisition (DAQ) system.<sup>6,7</sup> Pulse processing in PET remains fundamentally the same as when the technology was first developed several decades ago. Modern PET systems typically employ a mixed-signal front-end that pre-amplifies and shapes the charge pulse generated upon the interaction of an annihilation photon and the detector.<sup>8</sup> After shaping, the pulse is sampled and digitized at a single point, often at the peak, to obtain an estimate of the total charge contained in the event pulse, i.e., the detected energy of the annihilation photon. In detector designs that employ signal multiplexing to reduce the number of electronics channels needed (such as the block detector design and depth-of-interaction detector designs that employ segmented crystal layers), one sums the relevant digitized readings to yield the detected photon energy and applies simple arithmetic calculations to these readings (e.g., the Anger logic) to obtain the decoding information. The event time, on the other hand, is obtained by passing the pre-amplified charge pulse to a constant fraction discriminator (CFD) to generate a logic pulse indicating an event detection. A time-to-digital converter (TDC) then converts the logic pulse to a digital time stamp. To obtain correct decoding and energy readings, the signal channels need to be precisely calibrated. Inadequate calibration leads to degraded spatial, energy, and timing resolutions. The huge number of channels and complex signal multiplexing found in modern PET systems,<sup>9</sup> and the simplicity of the above-described event processing, make the calibration a tedious and challenging task.
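The summing and Anger-logic arithmetic mentioned above can be illustrated with a minimal Python sketch. The assignment of the four readings `a`, `b`, `c`, `d` to the corners of a block detector is an assumption made for illustration only; actual detectors may wire and weight their outputs differently.

```python
def anger_decode(a, b, c, d):
    """Return (energy, x, y) from four digitized block-detector readings.

    Assumed (hypothetical) layout: a = top-left, b = top-right,
    c = bottom-left, d = bottom-right.
    """
    energy = a + b + c + d                  # detected energy: sum of readings
    if energy <= 0:
        raise ValueError("total signal must be positive")
    x = ((b + d) - (a + c)) / energy        # right minus left, normalized
    y = ((a + b) - (c + d)) / energy        # top minus bottom, normalized
    return energy, x, y
```

Because the position estimates are ratios of the readings, a gain error in any one channel shifts both the decoded position and the energy, which is one reason the channels must be calibrated carefully.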

One promising approach to deal with the above-observed limitations of conventional PET event processing is to develop a digital DAQ system for PET. The superiority of digital signal-processing technologies over their analog counterparts is well known. As described above, current PET event processing is rather simple and *ad hoc*. Although mixed-signal front-ends are employed, in terms of signal processing the technology is essentially analog because the event pulses are not digitally processed to produce relevant event information.<sup>10</sup> By employing digital technologies, it becomes possible to apply sophisticated algorithms for optimally extracting relevant information from the event pulses, in accordance with the characteristics of the pulse and noise. Accurate calibrations will be easier to achieve because corrections to the detected signals can be easily applied. Digital processing also offers the possibility of adapting the algorithms to changes in the imaging conditions, such as reducing errors due to event pileups at high event rates. Furthermore, when needs arise and when better event-processing algorithms become available, digital DAQ systems can be readily modified and upgraded by updating the firmware. Generally, the design and production of digital chips are also easier and cheaper than those of analog chips. Therefore, there is also a potential cost advantage in employing digital technologies.

In this paper, we present our work toward developing a digital PET DAQ system. We propose a scalable architecture that employs low-cost personal computers (PCs) for achieving cost attractiveness. In Section 2, we will discuss the architecture of the system and describe our current implementations of several components of the system. Initial results of these components are presented in Section 3. Discussion and conclusions are given in Section 4.

#### 2. METHODS

The overall architecture of the scalable digital DAQ system is illustrated in Fig. 1. The system consists of three types of units: the SAMPLING UNITS (SUs), the SINGLES PROCESSING UNITS (SPUs), and the COINCIDENCE PROCESSING PCs (CPs). The SUs sample the event pulses at an appropriate sampling rate and digitally process the sampled pulses to derive relevant information for PET imaging, including the event energy, event time, and parameters characterizing the pulse shapes. The SPUs receive event information from multiple SUs and perform additional processing, including digital synchronization of the time stamps of the associated SUs. When signal multiplexing is employed, such as with the popular block detector design in which four (4) event pulses are generated by one 8×8 crystal block, position decoding is also performed at the SPUs. Multiple SPUs are connected to a CP for coincidence determination based on the event time stamps. Depending on the number of channels, multiple CPs are employed when needed.

Figure 1. Architecture of the scalable, digital DAQ system for PET.

SAMPLING UNIT: The input charge pulses are converted to voltage pulses by a 50 $\Omega$ resistor. When needed, the pulses are pre-amplified to reduce signal noise. The voltage pulses are sampled by a high-speed analog-to-digital converter (ADC), and the samples are temporarily stored in a FIFO (first-in-first-out). The TDC determines the time of the sampling with respect to the system time (the main clock) with high precision, and the time values are temporarily stored in another FIFO. To achieve sub-nanosecond timing precision, a 1.5 GHz local sampling clock is employed. Both the ADC and the TDC operate in free-run mode; they are enabled only when the voltage pulses are above a certain small reference voltage, for rejecting noisy pulses (due to, for example, dark currents). A digital signal processing unit (DSPU0) analyzes the digitized samples of a voltage pulse, and the associated sampling time determined by the TDC, to obtain event parameters that are relevant for PET imaging, including the event energy, event time, and pulse-shape parameters. Sophisticated signal-processing algorithms can be implemented to provide optimal estimation of such event parameters, in accordance with the *a priori* characteristics of the pulse shape and of the noise. Algorithms can also be implemented to, for example, detect pile-up events and apply corrections to reduce pile-up errors. We note that the event time can be estimated from the ADC samples and the TDC times, independent of the pulse amplitude. Therefore, the CFD is eliminated from the SU in our design.
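To illustrate how an amplitude-independent event time can be obtained from sampled data, the following Python sketch applies a simple fraction-of-peak (digital constant-fraction) rule with linear interpolation, and estimates the energy as the sum of the samples. This is a generic textbook-style technique chosen for illustration, not the algorithm of Refs. 11 and 12; the function name, the default fraction, and the assumption of baseline-subtracted positive pulses are all hypothetical.

```python
def estimate_event(samples, dt_ns, t0_ns=0.0, fraction=0.2):
    """Return (energy, time_ns) for one baseline-subtracted, positive pulse.

    samples  : digitized pulse amplitudes (ADC readings)
    dt_ns    : sampling period in ns
    t0_ns    : time of the first sample, e.g. as supplied by a TDC
    fraction : fraction of the peak used as the timing threshold (assumed)
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    peak = max(samples)
    if peak <= 0:
        raise ValueError("pulse must have a positive peak")
    threshold = fraction * peak
    energy = sum(samples) * dt_ns           # area under the sampled pulse
    # Find the first sample at or above the threshold.
    for i, s in enumerate(samples):
        if s >= threshold:
            break
    if i == 0:
        return energy, t0_ns
    prev = samples[i - 1]                   # last sample below the threshold
    frac = (threshold - prev) / (s - prev)  # linear interpolation
    return energy, t0_ns + (i - 1 + frac) * dt_ns
```

Because the threshold scales with the peak, multiplying every sample by a constant leaves the estimated time unchanged; with a 1.5 GHz sampling clock, `dt_ns` would be 1/1.5 ns.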

SINGLES PROCESSING UNIT: A PCI interface provides the connection between an SPU and multiple SUs. The SPU is an embedded system; it takes event information generated by the SUs, contains a DSP to apply further processing, and sends the results to a CP through a communication channel (wired or wireless). In our design, an SPU will be able to handle up to 60 million events per second, and will connect with 4 to 32 SUs. The SPU will provide calibration of the event energy and synchronization of the time stamps generated by the connecting SUs. When block-detector configurations are employed, the input event information provided by the connecting SUs will also be processed for position decoding, and to generate DOI measurements when relevant. From the timing and pulse-shape information produced by the connecting SUs, it may also be possible to identify detected events that involve multiple photon-detector interactions, to provide better estimates of the first interaction positions, and to produce more accurate time-of-flight measurements. The DSP component of the SPU will provide computing power for carrying out the sophisticated calculations needed in these tasks.
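As a rough illustration of the calibration and synchronization role described for the SPU, the Python sketch below applies a per-SU energy gain and a per-SU time offset, then merges the corrected streams into one time-ordered stream of singles. The correction model (one constant gain and one constant offset per SU) and all names are assumptions for illustration; the paper does not specify the actual procedure.

```python
import heapq
from typing import NamedTuple


class Single(NamedTuple):
    time_ns: float   # synchronized time stamp
    energy: float    # calibrated energy
    su_id: int       # index of the originating sampling unit


def calibrate(events, su_id, gain, time_offset_ns):
    """Yield corrected singles from one SU's (time_ns, energy) stream."""
    for t, e in events:
        yield Single(t - time_offset_ns, e * gain, su_id)


def merge_singles(streams):
    """Merge time-ordered streams from several SUs into one ordered list."""
    # heapq.merge assumes every input stream is already sorted by time.
    return list(heapq.merge(*streams))
```

A constant offset preserves the ordering within each stream, so the merged output remains sorted by time stamp, which is what a downstream coincidence search would expect.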

COINCIDENCE PROCESSING PC: To ease the performance requirements on the communication channel and the computing power, in our current design a CP will connect with only two SPUs. Coincidence events will be determined based on the time stamps of the input single-event streams generated by the two connecting SPUs. The detected coincidence data will be stored as list-mode data with all the event information retained, such that reconstruction algorithms can exploit the information for achieving optimum image reconstruction. The resulting PET DAQ system will consist of multiple low-cost PC units arranged in a clustered architecture, communicating with each other through a high-speed TCP/IP interface. The system is thus scalable, depending on the number of SPUs needed by a PET scanner design.
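Coincidence determination from two time-stamped single-event streams can be sketched as a two-pointer scan in Python. This is a simplified illustration under stated assumptions: both streams are sorted by time, each single is used in at most one pair, and multiple coincidences and random-coincidence estimation are ignored. The function name and window parameter are hypothetical.

```python
def find_coincidences(times_a, times_b, window_ns):
    """Pair events from two time-sorted streams within a timing window.

    Returns a list of (index_a, index_b) pairs whose time stamps differ
    by at most window_ns.
    """
    pairs = []
    i = j = 0
    while i < len(times_a) and j < len(times_b):
        dt = times_a[i] - times_b[j]
        if abs(dt) <= window_ns:
            pairs.append((i, j))    # coincidence found; consume both singles
            i += 1
            j += 1
        elif dt < 0:
            i += 1                  # event in stream A is too early
        else:
            j += 1                  # event in stream B is too early
    return pairs
```

The scan is linear in the total number of singles, which is consistent with the goal of keeping the computing requirements on each CP modest.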

We note that the DSP capability available with the SUs and SPUs in our DAQ architecture will permit the development and implementation of novel event-processing algorithms and data-correction methods having complexities beyond what we described above. Below, we present our implementations of several components of the proposed DAQ system.



Figure 2. Top: The TDC structure based on the Nutt interpolation method. Bottom: The delay cells for the TDC.

## 2.1. Sampling Unit

We have selected the 8-bit ADC MAX108 chip, which provides a 1500 MSPS sampling rate, for the SU. The MAX108 integrates a high-performance track/hold (T/H) amplifier and supports either differential or single-ended use with a $\pm 250$ mV input voltage range, which matches well with the typical pulse amplitude generated by a scintillator/photomultiplier unit. We will employ a 1.5 GHz local clock for the SU and a 75 MHz main clock. The 1.5 GSPS sampling rate should be adequate for sampling the event pulses generated by fast scintillators such as LSO, whose scintillation pulse has a decay constant of 40-50 ns. We are also implementing a new digital pulse-processing algorithm that we previously presented in Refs. 11 and 12. In this work, we employ the HSPICE (Simulation Program with Integrated Circuit Emphasis) package to simulate the operation of the electronics implementation proposed in Refs. 11 and 12 for carrying out the proposed event processing.

## 2.2. Time-to-Digital Converter

The TDC unit is implemented on an Altera Stratix FPGA (field programmable gate array) by applying the technology developed by Kalisz et al.<sup>13–15</sup> The timing accuracy of the resulting implementation depends critically on the specific electronics layout inside the FPGA: we measured a 26 ps timing standard deviation among the flip-flop (FF) cells at different locations inside the FPGA. We use ModelSim for simulation and Quartus II for the design and synthesis of the TDC. Our current implementation achieves a timing accuracy of about 250 ps. By strategically selecting the FF cells to use, however, we believe that a timing resolution of about 100 ps can be achieved. Figure 2a shows the TDC structure based on the Nutt interpolation method. The TDC consists of two delay lines and a main clock (counter). The time intervals between the Start/Stop edges and the main clock are measured by the TDC. The delay line structure is shown in Fig. 2b.
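The arithmetic behind the Nutt interpolation method can be sketched in Python as follows. In this method, a coarse counter counts main-clock periods, while the two delay lines measure the fine intervals from the Start and Stop edges to their next clock edges. The sketch assumes an idealized, uniform delay per cell and hypothetical parameter names; a real FPGA delay line has non-uniform cells and needs calibration.

```python
def nutt_interval_ps(coarse_count, start_cells, stop_cells,
                     clock_period_ps, cell_delay_ps):
    """Combine coarse and fine measurements into a time interval (ps).

    coarse_count : clock periods counted between the clock edges that
                   follow the Start and Stop edges
    start_cells  : delay cells traversed from Start to the next clock edge
    stop_cells   : delay cells traversed from Stop to the next clock edge
    cell_delay_ps: assumed uniform delay of one cell
    """
    t_start = start_cells * cell_delay_ps   # fine time of the Start edge
    t_stop = stop_cells * cell_delay_ps     # fine time of the Stop edge
    return coarse_count * clock_period_ps + t_start - t_stop
```

For a 75 MHz main clock, `clock_period_ps` would be about 13333 ps.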

## 2.3. PCI

In addition to the FPGA implementation of the TDC, we have also designed and developed the PCI interface of the SPU. Figure 3 shows the architecture of the PCI interface board and a photograph of our current prototype, which can connect to two SUs. The performance of the prototype PCI interface board is under testing. In our implementation, a PCI9054 is employed as the PCI bridge, an EPM7128SLC84 CPLD as the controller, and a $32K \times 36$ IDT72V3690 as the FIFO.

**Figure 3.** Left: The architecture of the PCI interface board. Right: A photograph of our current prototype. Two PCBs are shown in the photograph. The PCB on the left contains the CPLD, the FIFO, and peripheral circuits; it receives and transfers event information output by DSPU0 of the SU. The PCB on the right contains the PCI bridge (PCI9054), a 40 MHz local clock, and some peripheral circuits. The PCI protocol is implemented with the PCI end of the PCI9054 (near the gold fingers) on the right-side PCB. The local end of the PCI9054 (near the ribbon cable) receives data transferred from the FIFO on the left-side PCB via the ribbon cable. The CPLD, on the left-side PCB, controls data transfer between the PCI9054 and the FIFO via the ribbon cable. Our current design is divided into two PCBs to ease debugging during prototype development. In the future we will implement all the components on a single PCB.

#### 3. RESULTS

## 3.1. Sampling Unit

In the HSPICE simulation for the SU, we employ the MAX9602 to implement the comparators in the previously proposed circuitry.<sup>11</sup> The MAX9602 is an ultra-high-speed quad comparator with extremely low propagation delay. Its outputs are complementary digital signals, compatible with PECL (Positive Emitter Coupled Logic) systems. Figure 4 shows the histograms of the calculated event energy obtained from digitized event pulses (at a 5 GHz sampling rate) detected by a phoswich detector made of LSO/LYSO scintillators with Na-22 and Cs-137 sources. In addition to the event energy shown, we are also able to estimate the timing and DOI information from the pulse samples. The above-described processing algorithm shows one possible digital signal-processing technique that can be implemented on the SU with DSPU0.

## 3.2. TDC

To estimate the accuracy of the TDC implemented on the Stratix FPGA, we use 70,000 time intervals generated randomly in the range [0, T], where T is the clock period. Let K denote the index of the first active delay cell and P that of the last among the 127 delay cells in a delay line. We then have $1 \le K < P < 127$ and $M = P - K + 1$, where M denotes the number of active cells. The error of the TDC estimate is then given by

$$E_{Ti} = \sum_{j=K}^{i} \left(\frac{n_j}{N} - \frac{1}{M}\right) \tag{1}$$

where N is the total number of random inputs and $n_j$ denotes the number of times delay cell j is fired. Figure 5 shows the estimation errors of the Start and Stop delay lines of the TDC. This experiment was conducted with a 75 MHz clock, 68 active delay cells for the Start delay line, and 52 active delay cells for the Stop delay line. Table 1 summarizes the mean TDC resolutions and the standard deviations of the error.
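Equation (1) can be evaluated directly from the per-cell hit counts of such a random-input test. The Python sketch below is a minimal illustration; it assumes that the counts are supplied only for the active cells K through P, in order, and that N equals the sum of those counts. The helper `mean_cell_width_ps`, which divides the clock period by the number of active cells, is likewise an assumed reading of how a mean resolution could be obtained.

```python
from itertools import accumulate


def cumulative_tdc_error(active_hits):
    """Return [E_TK, ..., E_TP] of Eq. (1) as fractions of the clock period.

    active_hits : hit counts n_j for the active cells j = K..P, in order
    """
    m = len(active_hits)            # M: number of active cells
    n = sum(active_hits)            # N: total number of random inputs (assumed)
    if m == 0 or n == 0:
        raise ValueError("need at least one active cell and one hit")
    # Running sum of the deviation of each cell from the ideal share 1/M.
    return list(accumulate(h / n - 1 / m for h in active_hits))


def mean_cell_width_ps(clock_mhz, active_cells):
    """Mean delay per active cell, assuming the cells span one clock period."""
    return 1e6 / clock_mhz / active_cells
```

Under these assumptions, a 75 MHz clock with 68 and 52 active cells gives mean cell widths of about 196 ps and 256 ps, which appears consistent with the resolutions reported in Table 1.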



Figure 4. Histograms of the generated energy obtained for the point sources. Left: Na-22; Right: Cs-137. Stars: The readings of the histogram drawn directly from the recorded data. Diamonds: The histogram of estimated energy generated by the HSPICE simulation.



Figure 5. The estimation errors of the delay lines of the TDC. Left: The Start delay line. The last two delay cells are inactive. Right: The Stop delay line. The first 6 and last 12 delay cells are inactive.

| Dialog field                      | Value                                                    |
|-----------------------------------|----------------------------------------------------------|
| Window title                      | PCI data-acquisition card, single DMA transfer (SglDMA)  |
| Device selection                  | Pci9054-0 at Slot 0                                      |
| Data volume to transfer (MB)      | 350                                                      |
| Data volume per DMA transfer (MB) | 2                                                        |
| Transfer time (ms)                | 8141                                                     |
| Transfer speed (MB/s)             | 42.9923                                                  |

Figure 6. The dialog of the Windows application program running with our PCI board. The buffer for a single DMA transfer is 2 MB. It takes 8141 ms to transfer 350 MB of data, corresponding to a data transfer rate of 43 MB/s.

Table 1. Estimated Timing Errors for TDC delay lines.

| Delay line | Timing resolution (ps) |
|------------|------------------------|
| Start Line | $196 \pm 15.68$        |
| Stop Line  | $256 \pm 7.68$         |

## 3.3. PCI

We have developed the PCI interface board, as well as drivers for the Windows and Linux operating systems. Figure 6 shows the dialog of the running application program for data transfer on a Windows PC. It shows that a transfer rate of 43 MB/s is achieved with our PCI interface when transferring a total of 350 MB of data with a 2 MB transfer buffer.
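The reported rate follows directly from the numbers shown in the dialog. The short Python sketch below reproduces the arithmetic and also derives the number of DMA transfers and the mean time per transfer; the function names are illustrative only.

```python
def transfer_rate_mb_per_s(total_mb, elapsed_ms):
    """Average transfer rate in MB/s from a data volume and elapsed time."""
    return total_mb / (elapsed_ms / 1000.0)


def dma_transfer_stats(total_mb, buffer_mb, elapsed_ms):
    """Return (number of DMA transfers, mean time per transfer in ms)."""
    n_transfers = total_mb // buffer_mb     # whole buffers only
    return n_transfers, elapsed_ms / n_transfers


rate = transfer_rate_mb_per_s(350, 8141)            # about 42.99 MB/s
n_transfers, ms_each = dma_transfer_stats(350, 2, 8141)
```

With the values in Fig. 6, the 350 MB are moved in 175 transfers of 2 MB each, or roughly 46.5 ms per transfer on average.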

#### 4. DISCUSSION AND CONCLUSIONS

We have proposed a digital DAQ architecture for PET. The architecture employs low-cost PCs and adopts a scalable configuration for improving its cost attractiveness. We are implementing and testing several key components of the proposed architecture, including an FPGA implementation of the TDC cell of the SU and the PCI interface of the SPU. We are also investigating the implementation of an advanced event signal processing algorithm<sup>11</sup> on the SPU. We are in the process of implementing a complete SPU and the technique for dispatching a high-accuracy main clock to multiple SPUs. We will test the resulting SPUs with event pulses generated by different scintillators (shown in Fig. 7) attached to photomultipliers.

Figure 7. Scintillator samples. From left to right: $2 \times 4$ LYSO block consisting of $2\,\mathrm{mm} \times 2\,\mathrm{mm} \times 10\,\mathrm{mm}$ LYSO crystals; $1\,\mathrm{mm} \times 1\,\mathrm{mm} \times 10\,\mathrm{mm}$ LYSO crystals; $2\,\mathrm{mm} \times 2\,\mathrm{mm} \times 10\,\mathrm{mm}$ LYSO crystal; $2\,\mathrm{mm} \times 2\,\mathrm{mm} \times 15\,\mathrm{mm}$ LYSO crystal; $3\,\mathrm{mm} \times 3\,\mathrm{mm} \times 25\,\mathrm{mm}$ GSO crystals; $4\,\mathrm{mm} \times 4\,\mathrm{mm} \times 30\,\mathrm{mm}$ BGO crystals.

We believe that the adoption of all-digital signal processing in PET can generate more accurate event information by allowing the use of sophisticated signal processing. In addition, when needs arise, modifications and extensions to the event-processing algorithm can be implemented with ease. Potentially, the calibration process can also be made easier and more accurate. The development of all-digital event-processing technologies for PET event detection, as proposed in this work, is therefore of great significance.

#### ACKNOWLEDGMENTS

This work is supported in part by Grant 60602028 provided by the National Natural Science Foundation of China and by the National High Technology Research and Development Program 2006AA02Z333 provided by the Ministry of Science and Technology of China. Its contents are solely the responsibility of the authors and do not necessarily represent the official view of the National Natural Science Foundation of China or the Ministry of Science and Technology of China.

#### REFERENCES

- 1. H. Varmus, "The new era in cancer research," *Science* **312**, pp. 1162–1165, 2006.
- 2. R. Weissleder, "Molecular imaging in cancer," *Science* **312**, pp. 1168–1171, 2006.
- 3. R. d. l. Fuente-Fernandez, T. J. Ruth, V. Sossi, M. Schulzer, D. B. Calne, and A. J. Stoessl, "Expectation and dopamine release: Mechanism of the placebo effect in Parkinson's disease," *Science* **293**, pp. 1164–1166, 2001.
- 4. K. Herholz, D. Perani, E. Salmon, G. Franck, F. Fazio, W. D. Heiss, and C. D., "Comparability of FDG PET studies in probable Alzheimer's disease," *The Journal of Nuclear Medicine* **34**, pp. 1460–1466, 1993.
- 5. L. Mosconi, A. Pupi, T. R. D. Cristofaro, M. Fayyaz, S. Sorbi, and K. Herholz, "Functional interactions of the entorhinal cortex: An 18F-FDG PET study on normal aging and Alzheimer's disease," *The Journal of Nuclear Medicine* **45**, pp. 382–392, 2004.
- 6. M. Streun, G. Brandenburg, H. Larue, C. Parl, and K. Ziemons, "The data acquisition system of ClearPET Neuro - a small animal PET scanner," *IEEE Transactions on Nuclear Science* **53**, pp. 700–703, 2006.
- 7. A. Mann, B. Grube, I. Konorov, S. Paul, L. Schmitt, D. P. McElroy, and S. I. Ziegler, "A sampling ADC data acquisition system for positron emission tomography," *IEEE Transactions on Nuclear Science* **53**, pp. 297–303, 2006.
- 8. M. Streun, G. Brandenburg, H. Larue, E. Zimmermann, K. Ziemons, and H. Halling, "Pulse recording by free-running sampling," *IEEE Transactions on Nuclear Science* **48**, pp. 524–526, 2001.
- 9. J. W. Young, J. C. Moyers, and M. Lenox, "FPGA based front-end electronics for a high resolution PET scanner," *IEEE Transactions on Nuclear Science* **47**, pp. 1676–1680, 2000.
- 10. R. Fontaine, M. A. Tetrault, F. Belanger, N. Viscogliosi, R. Himmich, J. B. Michaud, S. Robert, J. D. Leroux, H. Semmaoui, P. Berard, J. Cadorette, C. M. Pepin, and R. Lecomte, "Real time digital signal processing implementation for an APD-based PET scanner with phoswich detectors," *IEEE Transactions on Nuclear Science* **53**, pp. 784–788, 2006.
- 11. Q. Xie, C. M. Kao, Z. Hsiau, and C. T. Chen, "A new approach for pulse processing in positron emission tomography," *IEEE Transactions on Nuclear Science* **52**, pp. 988–995, 2005.
- 12. Q. Xie, C.-M. Kao, and C.-T. Chen, "A preliminary experimental validation of a new approach for pulse processing in PET," in *2004 IEEE Nuclear Science Symposium Conference Record*, **3**, pp. 1428–1431, 2004.
- 13. J. Kalisz, R. Szplet, J. Pasierbinski, and A. Poniecki, "Field-programmable-gate-array-based time-to-digital converter with 200-ps resolution," *IEEE Transactions on Instrumentation and Measurement* **46**, pp. 51–55, 1997.
- 14. R. Szplet, J. Kalisz, and R. Szymanowski, "Interpolating time counter with 100 ps resolution on a single FPGA device," *IEEE Transactions on Instrumentation and Measurement* **49**, pp. 879–883, 2000.
- 15. R. Pelka, J. Kalisz, and R. Szplet, "Nonlinearity correction of the integrated time-to-digital converter with direct coding," in *1996 Conference on Precision Electromagnetic Measurements Digest*, pp. 548–549, 1996.