Defining the dimensions of circulating tumor cells in a large series of breast, prostate, colon, and bladder cancer patients

Molecular characterization of circulating tumor cells is of high clinical relevance. Since circulating tumor cell (CTC) detection and isolation often rely on cell dimensions, we determined the size of 71 612 CellSearch-detected CTCs using the ACCEPT software. Strikingly, CTC size differs between tumor types and deviates significantly from the size of cultured tumor cells, which is currently used as the reference in the development of CTC isolation methods.


Abbreviations
ACCEPT, Automated CTC Classification, Enumeration, and PhenoTyping software; BC, breast cancer; BLC, bladder cancer; CD, computed diameter; CEL, cultured tumor cell (cell line); CK, cytokeratin; CRC, colorectal cancer; CTC-L, circulating tumor cells derived from cerebrospinal fluid (liquor); CTCs, circulating tumor cells; DAPI, 4′,6-diamidino-2-phenylindole; EMT, epithelial-mesenchymal transition; EpCAM, epithelial cell adhesion molecule; IQR, interquartile range; KW test, Kruskal-Wallis test; MWU test, Mann-Whitney U test; NCR, nucleus/cytoplasm ratio; P2A, perimeter to area; PC, prostate cancer; TIF, Tagged Image Format files; TXT, text file; µm, micrometer; µm², square micrometers.

Introduction
Circulating tumor cells (CTCs) are disseminated from solid malignancies and are present in the peripheral circulation of patients with multiple tumor types [1]. They are thought to originate from all tumor lesions present in the body. Therefore, CTCs could reflect a sample of the molecular landscape of the disease in real time when successfully captured and characterized. Because the median concentration in patients with metastatic disease is estimated as 1 CTC per billion white blood cells, capturing them is technically challenging. This emphasizes the need for sensitive and specific CTC detection and isolation methods. The CellSearch® system is the only FDA-cleared CTC enumeration platform [2]. This method uses antibodies recognizing the epithelial cell adhesion molecule (EpCAM) coupled to ferrofluids to enrich peripheral blood for CTCs. The immunomagnetically enriched cells are stained with the nucleic acid dye 4′,6-diamidino-2-phenylindole (DAPI) and labeled with fluorescent antibodies against cytokeratins 8, 18+, and/or 19+ (CK) and CD45. The processed blood fraction is transferred into a cartridge of which images are taken. An expert user of the CellSearch® method assesses all presented events and identifies CTCs manually.
CTCs are defined as CK+/DAPI+/CD45− events, whereas leukocytes are CK−/DAPI+/CD45+ events. CellSearch-based CTC enumeration has prognostic value in multiple tumor types [3][4][5]. However, since the method relies on EpCAM positivity, it will only detect CTCs which express EpCAM. Epithelial-mesenchymal transition (EMT) is thought to play a key role in dissemination of cancer [6,7]. To a certain degree, epithelial markers are downregulated in CTCs which have undergone EMT. This can lead to a detection failure using the CellSearch® method if EpCAM expression is too low or absent. Hence, EpCAM-based detection methods may introduce a selection bias in the CTCs that can be studied.
Therefore, antibody-independent CTC detection and isolation methods are being developed, and many of them are based on physical characteristics [8][9][10][11][12][13]. Size-based methods frequently use cultured cell line cells to characterize the performance of their system. Although evidence is lacking, CTCs are considered to be larger and less deformable than blood cells. Additionally, it is suggested that morphological differences of CTCs exist between tumor types [14,15]. This implies the need for improved knowledge of CTC size in different tumor types. Therefore, we determined the median CTC size, computed the approximate median diameter (CD) of the cytoplasm and nucleus per cell, and determined the nucleus/cytoplasm ratio (NCR) of CTCs from breast (BC), prostate (PC), colorectal (CRC), and bladder (BLC) cancer patients. For this purpose, a large series of images of CellSearch® cartridges was re-analyzed with the 'Automated CTC Classification, Enumeration and PhenoTyping software package' (ACCEPT) (https://github.com/LeonieZ/ACCEPT) [16]. In addition, we compared the CTC results to data derived from lymphocytes and well-known cancer cell lines.

Data collection/CellSearch®
CellSearch® CTC enumeration is part of ongoing clinical research of the medical oncology department at the Erasmus Medical Center, Rotterdam, the Netherlands. Here, we retrospectively studied the images of CellSearch® cartridges of BC, PC, CRC, and BLC patients. All patients provided written informed consent and were included in clinical trials designed in accordance with the Helsinki Declaration and approved by the local ethics board of the Erasmus MC University Medical Center (Table S1). Subject numbers, cartridge numbers, and CellSearch®-appointed CTC counts were collected for all patients. To acquire images of cultured tumor cells, we studied cartridges of experiments in which three BC cell lines (MCF-7, SKBR3, and TD47D) and one PC cell line (LNCAP) were spiked into tubes of blood of healthy donors. Leukocyte analysis was enabled through the selection of CD45+/DAPI+/CK− events on one melanoma patient-derived cartridge. Per cartridge, the CellSearch®-generated data were acquired, consisting of a maximum of 175 Tagged Image Format files (.TIF) per sample and an Extensible Markup Language file (.xml) containing the coordinates of the manually marked CTCs. The images were annotated by tumor type, patient, and material of origin (blood vs liquor). Data collection took place between January 2017 and January 2019.

Data processing/ACCEPT and SPSS STATISTICS 25
ACCEPT is an open source program developed at the University of Twente, which enables the re-analysis of images of CTCs acquired through the CellSearch® method [16]. Here, collected CellSearch® images were re-analyzed using the 'marker characterization mode' of ACCEPT. This software feature enables the analysis of only those events which were originally defined as CTCs by an expert user of the CellSearch® platform. ACCEPT used the coordinates of marked events present in the CellSearch®-generated .xml file to trace back CTCs and selected spiked tumor cells and leukocytes. The program automatically detects the immunofluorescent signals present in the DAPI, CK, and CD45 channels and marks the borders of these events. Per individual event, ACCEPT reports multiple parameters of roundness, signal intensity, and size for the DAPI, CK, and CD45 channels per cartridge in an extended Excel file (Microsoft Excel v.10, 365 Office; Microsoft, Redmond, WA, USA). Subject numbers, tumor type, and type of material (blood vs liquor) were added manually per included study. Subsequently, the Excel files containing ACCEPT data of all cartridges were combined into one IBM SPSS STATISTICS 25 file [International Business Machines Corporation (IBM), Armonk, NY, USA].

Data analysis
Descriptive statistics on the origin of the cartridges and the CTC counts were collected from different data files. CellSearch®-selected CTC counts were derived from prior clinical study files (Excel). To determine the ability of ACCEPT to represent the CellSearch®-marked events, ACCEPT enumeration results were compared to the known CellSearch® CTC counts. All subsequent analyses were performed using the SPSS file containing multiple variables per event.

Selection of cohort for CK and DAPI size analysis
To examine the extent to which the reported ACCEPT events met the CellSearch® CTC criteria, CK and DAPI positivity were assessed per event. All criteria mentioned were applied to patient-derived CTCs as well as cell line cells. To adequately determine the size of CTCs per tumor type, it was important to analyze only those events containing confirmed single CTCs. Therefore, events which entered the size analysis had to meet two selection criteria to correct for any inaccuracies introduced by the ACCEPT processing. First, ACCEPT-reported events had to have a positive CK signal. Importantly, the CellSearch®-marked CTC coordinates in the .xml file could contain images of multiple CK-positive CTCs. Second, therefore, a previously described method based on the 'perimeter to area' (P2A) ratio of the CK channel was used to discriminate between single CTCs, doublets, and small and large clusters, enabling the selection of single CTCs for further analyses [17]. Only single, CK-positive events were included in the size analysis. To enable the analysis of leukocytes, the CK selection criteria were applied to the CD45 signal.
Single CK-positive events had to meet three additional criteria before the nucleus/cytoplasm ratio (NCR) of each CTC was analyzed. First, as ACCEPT did not detect a DAPI signal in all single CK-positive CTCs, the DAPI signal had to be positive. This was defined as a DAPI size > 0 µm². Second, the P2A measurement of the DAPI signal was assessed to select only those CTCs with a single nuclear signal. Third, the nucleus had to be intact, which was defined as the DAPI signal (µm²) being smaller than the CK signal (µm²).
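The two-stage selection described above can be sketched as a pair of predicates. This is only an illustration, not the authors' SPSS workflow: the `Event` record and its field names are hypothetical, the P2A-based single/doublet/cluster classification is assumed to have been computed beforehand and stored as a boolean (the thresholds are described in [17] and are not reproduced here), and CK positivity is assumed to mean a CK size > 0 µm², by analogy with the DAPI criterion.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One ACCEPT-reported event (hypothetical field names)."""
    ck_area_um2: float     # size of the CK signal in µm²
    dapi_area_um2: float   # size of the DAPI signal in µm²
    ck_is_single: bool     # P2A-based call on the CK channel (assumed precomputed)
    dapi_is_single: bool   # P2A-based call on the DAPI channel (assumed precomputed)


def in_size_analysis(e: Event) -> bool:
    """Stage 1: CK positive (assumed: area > 0) and a single CTC by P2A."""
    return e.ck_area_um2 > 0 and e.ck_is_single


def in_ncr_analysis(e: Event) -> bool:
    """Stage 2: additionally DAPI positive, single nucleus, intact nucleus."""
    return (
        in_size_analysis(e)
        and e.dapi_area_um2 > 0                # DAPI positive
        and e.dapi_is_single                   # single nuclear signal
        and e.dapi_area_um2 < e.ck_area_um2    # nucleus smaller than cytoplasm signal
    )


# Illustrative events only; the values are not taken from the study data.
events = [
    Event(120.8, 56.1, True, True),    # passes both stages
    Event(90.0, 0.0, True, True),      # size analysis only (no DAPI signal)
    Event(300.0, 60.0, False, True),   # excluded: not a single CTC
]
size_cohort = [e for e in events if in_size_analysis(e)]
ncr_cohort = [e for e in events if in_ncr_analysis(e)]
print(len(size_cohort), len(ncr_cohort))
```

Keeping the NCR predicate as a strict extension of the size predicate mirrors the text, in which the NCR cohort is a subset of the size cohort.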

CK and DAPI size determination and cytoplasm/nucleus assessment
The size of the CK and DAPI signals in square micrometers (µm²) as reported by ACCEPT was used to compute an approximate diameter of the cytoplasm and nucleus per CTC. The median size of the predefined CK-positive single CTCs was calculated per tumor type. To assess whether a difference in CK and/or DAPI size existed between the four tumor types, the Kruskal-Wallis test (KW test) was used. Subsequently, when a significant size difference existed, the Mann-Whitney U test (MWU test) was applied pairwise to assess the size differences between the tumor types. Within the BC CTCs, the MWU test was also used to assess the size difference between blood- and liquor-derived CTCs. Additionally, for BC and PC, the size difference between patient-derived CTCs and tumor cell line cells was assessed using the MWU test. Finally, this test was used to compare the size of leukocytes to the size of blood-derived CTCs of all tumor types. The computed DAPI and CK diameters were used to determine the NCR in CK-positive single CTCs which had an intact, single, positive DAPI signal. The nuclear size and NCR differences between tumor types were assessed through the same nonparametric testing as mentioned above.

All reference cell line cells were spiked in healthy donor blood and processed by the CellSearch® method. Regarding the BC cell line data, three cartridges containing SKBR3 cells, two cartridges with MCF-7 cells, and one cartridge with TD47D cells resulted in 530, 1616, and 1249 evaluable events, respectively. One PC cartridge resulted in 1054 LNCAP cells. Selection of leukocytes in one cartridge of a melanoma patient resulted in 130 evaluable events (Table S2).
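The conversion from a reported signal area to an approximate diameter can be illustrated as follows. The text does not state the formula, so this sketch assumes the signal is treated as a circle, giving d = 2·√(A/π); the function name and the example areas are hypothetical and serve only to show the arithmetic.

```python
import math
from statistics import median


def computed_diameter(area_um2: float) -> float:
    """Diameter (µm) of a circle with the given area (µm²).

    Assumption: the signal is approximated as a circle, so d = 2 * sqrt(A / pi).
    """
    if area_um2 < 0:
        raise ValueError("area must be non-negative")
    return 2.0 * math.sqrt(area_um2 / math.pi)


# Hypothetical CK areas (µm²) of single CTCs, for illustration only.
ck_areas_um2 = [78.5, 95.0, 120.8, 150.2, 201.1]

median_area = median(ck_areas_um2)            # median size in µm²
median_cd = computed_diameter(median_area)    # approximate diameter in µm
print(f"median area: {median_area:.1f} µm², computed diameter: {median_cd:.1f} µm")
```

Under this circular approximation, an area of about 121 µm² corresponds to a diameter of about 12.4 µm, which is the order of magnitude reported for BC CTCs in the Discussion.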

ACCEPT output-data description and optimization cohort for CK and DAPI size analysis
ACCEPT detected a total of 72 840 events distributed over the four tumor types. In BC, 45 695 events were detected in blood-derived cartridges and 5327 in liquor-derived cartridges. As summarized in Fig. S3 and Table S3, 99.5% of the events detected in blood and 99.9% of the events detected in liquor were CK positive, while selecting for both CK and DAPI positivity resulted in 86.2% and 99.1% of the events in blood and liquor, respectively. In PC, 20 624 events were detected, of which 97.5% were CK positive and 89.3% were CK and DAPI positive. As seen in Fig. S4, […] single signal, 4 times double, and 2 times a clustered signal (Table S4).

[…] The independent samples median test (P < 0.001) and the KW test (P < 0.001) showed a size difference among the patient-derived CTCs of the different tumor types. When testing the size differences between two tumor types at a time, the MWU test resulted in P < 0.001 for every pair of tumor types. MWU testing also showed that the median size difference of 20.9 µm² between blood-derived and liquor-derived BC CTCs was significant (P < 0.001). Additionally, the median size differences between cell line cells and patient-derived CTCs of 145 µm² in BC and 254.3 µm² in PC were both significant (P < 0.001). Finally, the MWU test was applied to assess the statistical significance of the median size differences between leukocytes and CTCs of BC (−50.8 µm², P < 0.001), PC (−14.0 µm², P < 0.001), CRC (+25 µm², P < 0.001), and BLC (−11.8 µm², P = 0.31).

Figure 4 shows a median size of 56.1 µm² (IQR 41.8) for the nucleus of the blood-derived BC CTCs, resulting in an NCR of 0.47. The median nucleus size of BC cell line cells was 94.2 µm² (IQR 50, NCR 0.37). Both nucleus size and NCR differed significantly between patient-derived CTCs and BC cell line cells (P < 0.001). In liquor-derived BC CTCs, the median size of the nucleus was 64.3 µm² (IQR 36.9, NCR = 0.46).
For PC CTCs, the median size of the nucleus was 52.0 µm² (IQR 35.2, NCR = 0.62), while PC cell line cells had a median nucleus size of 161.0 µm² (IQR 2.85, NCR = 0.48). Again, both the nucleus size and the NCR differed significantly between CTCs and cultured cells (MWU P < 0.001). The median nucleus size was 26.83 µm² in CRC CTCs (IQR 29.0, NCR = 0.60) and 43.4 µm² in BLC CTCs (IQR 33.6, NCR = 0.75). Nonparametric testing between the different tumor types was applied as stated above; the independent samples median test resulted in P < 0.001 and the KW test in P < 0.001. When testing the nuclear size differences between two tumor types at a time, the MWU test resulted in P < 0.001 for every pair of tumor types except for PC vs BLC (P = 0.073).

Discussion
In this study, we used ACCEPT to assess the median size of CTCs in a large cohort of BC, PC, CRC, and BLC patients. We hypothesized that CTC size might differ across tumor types and between patient-derived CTCs and cell line cells, which would imply the need for tumor-specific CTC isolation methods. We indeed found different median CTC sizes between the four tumor types, with BC CTCs being the largest and CRC CTCs the smallest. Furthermore, we found that liquor-derived BC CTCs are larger than blood-derived CTCs in the same population of patients, suggesting a morphological difference between cells from the two types of origin. Finally, the median size of the nucleus also differed between the tumor types, although the variance was smaller than within the distribution of the CK signal.
Two prevailing dogmas regarding CTC size, which are highly important with respect to the future development of size-based isolation methods, are addressed by the analysis of our patient-derived CTC data in comparison with our reference data. The first widely accepted hypothesis states that CTCs are generally larger than white blood cells. The literature describes a median size of 7.1-10.5 µm for lymphocytes and 8.7-9.9 µm for granulocytes [18], which implies that the CTC size as computed in this study is much more similar to the known size of white blood cells than postulated. Our reference data are in line with this observation, as the leukocytes we analyzed did not differ significantly in size from BLC CTCs and were even significantly larger than CRC CTCs. Second, the development of size-based isolation methods heavily relies on spike experiments using cell line tumor cells, based on the assumption that their median size is comparable to that of CTCs. Our data show that patient-derived CTCs are generally smaller than the cultured cell line cells of the respective tumor type as described in the literature. In BC, the CD of patient-derived CTCs was 12.4 µm, compared to 15-17 µm (SKBR-3) and 16.5 µm (MCF-7) for the larger, cultured cells. In prostate cancer, the median CD of 10.3 µm is also considerably smaller than the size of cultured cells: 18-21 µm (PC3-9). This is also the case in colorectal cancer, where our computed median diameter is 7.5 µm vs 11 µm in cultured cells (SW480) [19]. The results of our own analysis of three BC cell lines (median diameter of 18.4 µm) and one PC cell line (20.7 µm) were comparable to these data. With the results of this study, we show that cell line cells are indeed significantly larger than patient-derived CTCs in both BC and PC.
When the results of our analysis were compared to the expected input data retrieved from the CellSearch® data files, the efficiency of ACCEPT in analyzing events previously marked by CellSearch® was high. However, the increased number of ACCEPT-detected events compared to the CellSearch®-marked cell count in CRC was striking. A possible explanation is that the cartridges of the specific clinical trial in which this occurred contained images of the buffy coat of 30 mL blood instead of the usual 7.5 mL whole blood. Within this technique, 30 mL of blood is divided over 2 × 15 mL tubes, after which plasma separation and the creation of a buffy coat are achieved through a centrifugation step. After pipetting off the blood plasma, the two buffy coats are pooled into one CellSave tube. After these preparation steps, the CellSearch® technique is applied in the same manner as on 7.5 mL blood. It is possible that the high background of leukocytes in these samples made it more difficult for ACCEPT to recognize a CTC as a single event.
The main limitations of this study concern the introduction of a selection bias in the analyzed data. First, its retrospective nature led to the inclusion of a cohort of patients with a wide range of disease characteristics. The cartridge data of all patients included in one of the historical clinical trials were re-analyzed, irrespective of the disease stage, metastatic status, and the treatment line or choice. This could lead to an under- or overestimation of the CTC counts within the current cohort. Second, as we made use of images acquired through the CellSearch® method, the analysis is limited to EpCAM-positive CTCs. Although this type of CTC is of prognostic value in all studied tumor types, it is increasingly suggested that cells which have undergone epithelial-mesenchymal transition (EMT) have a higher metastatic potential and are therefore of interest regarding their characterization. These EMT cells could not be studied through the current method. The third limitation regards the marker characterization mode of ACCEPT. We chose to analyze only those events which were previously marked as CTCs and therefore had to meet the CellSearch® criteria of CK positivity, a CK signal ≥ 4 µm², a DAPI signal which overlaps the CK signal for ≥ 50%, and the morphological appearance of an intact cell. It is possible that the studied CTC cohort reflects a certain population of cells due to this selection.

Conclusion
Although the re-analyzed data in this study were subject to a certain degree of selection bias, to the best of our knowledge they reflect the largest cohort of morphologically studied, CellSearch®-depicted CTCs which are directly compared to reference data. In addition, one could suggest that the size difference found in this cohort would be even more pronounced when corrected for this possible selection bias. The differences shown in CTC size between tumor types, and between CTCs and our reference data of cultured tumor cells and patient-derived leukocytes, are striking. In conclusion, we therefore suggest that the size of CTCs does matter and should be kept in mind when designing and optimizing size-based isolation methods.
contributed to the statistical design. PAJM and EOH analyzed and interpreted the data. PAJM, JK, JWMM, and SS wrote, reviewed and revised the manuscript.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Fig. S1. Number of CellSearch cartridges per tumor type.
Fig. S2. Number of patients per tumor type.
Fig. S3. ACCEPT efficiency vs. CellSearch and ACCEPT results in breast and prostate cancer.
Fig. S4. ACCEPT efficiency vs. CellSearch and ACCEPT results in colorectal and bladder cancer.
Fig. S5. Cell diameter (µm) and size (µm²) per cell line.
Fig. S6. Nucleus diameter (µm) and size (µm²) per cell line.
Table S1. Trial details of included cartridges.