Gene expression profiles of breast cancer metastasis according to organ site

In advanced breast cancer, biomarker identification and patient selection using a metastatic tumor biopsy is becoming more necessary. However, the biology of metastasis according to the organ site is largely unknown. Here, we evaluated the expression of 771 genes in 184 metastatic samples across 11 organs, including liver, lung, brain, and bone, and made the following observations. First, all PAM50 molecular intrinsic subtypes were represented across organs and within immunohistochemistry‐based groups. Second, HER2‐low disease was identified across all organ sites, including bone, and HER2 expression significantly correlated with ERBB2 expression. Third, the majority of expression variation was explained by intrinsic subtype and not organ of metastasis. Fourth, subtypes and individual subtype‐related genes/signatures were significantly associated with overall survival. Fifth, we identified 74 genes whose expression was organ‐specific and subtype‐independent. Finally, immune profiles were found more expressed in lung compared to brain or liver metastasis. Our results suggest that relevant tumor biology can be captured in metastatic tissues across a variety of organ sites; however, unique biological features according to organ site were also identified and future studies should explore their implications in diagnostic and therapeutic interventions.

In advanced breast cancer, biomarker identification and patient selection using a metastatic tumor biopsy is becoming more necessary. However, the biology of metastasis according to the organ site is largely unknown. Here, we evaluated the expression of 771 genes in 184 metastatic samples across 11 organs, including liver, lung, brain, and bone, and made the following observations. First, all PAM50 molecular intrinsic subtypes were represented across organs and within immunohistochemistry-based groups. Second, HER2-low disease was identified across all organ sites, including bone, and HER2 expression significantly correlated with ERBB2 expression. Third, the majority of expression variation was explained by intrinsic subtype and not organ of metastasis. Fourth, subtypes and individual subtype-related genes/signatures were significantly associated with overall survival. Fifth, we identified 74 genes whose expression was organ-specific and subtype-independent. Finally, immune profiles were found more expressed in lung compared to brain or liver metastasis. Our results suggest that relevant tumor biology can be captured in metastatic tissues across a variety of organ sites; however, unique biological features according to organ site were also identified and future studies should explore their implications in diagnostic and therapeutic interventions.

Introduction
Advanced or metastatic breast cancer affects multiple organs and is a main cause of cancer death [1]. Common metastatic sites include bone, liver, lung, brain, lymph node, pleura, and skin [1][2][3][4][5][6]. Interestingly, the different breast cancer intrinsic subtypes (i.e., luminal A and B, HER2-enriched, and basal-like) have distinct preferred metastatic sites [7], and both the tumor cell and the metastatic microenvironment might contribute to this organ specificity [8]. To date, however, the biology of breast cancer metastasis according to organ site remains largely unknown.
In advanced breast cancer, biomarker identification and patient selection using a biopsy from a metastatic lesion is becoming a clinical need. On one hand, it confirms the breast origin of the disease. On the other hand, it allows the identification of predictive biomarkers such as PIK3CA mutations or the expression of PD-L1, HER2, and hormone receptors (HR). Although tumor tissue of primary disease obtained years before remains of value to identify these biomarkers when available [9,10], significant biological differences exist between primary and metastatic disease [11]. For instance, loss of estrogen receptor (ER) has been reported in about 20% of cases [12], while 3-10% discordance of HER2 gene amplification exists in primary versus metastatic tissue [13]. Moreover, advanced disease is enriched with new genetic alterations such as ESR1 mutations or the APOBEC genetic signature [14] and with phenotypic changes such as the acquisition of the HER2-enriched subtype in HR-positive (HR+)/HER2-negative disease [11]. Importantly, many of these biological alterations during metastatic disease might lead to resistance and treatment failure [15][16][17][18]. In this direction, clinical trials with novel agents are mandating a metastatic tumor biopsy to select patients based on their tumor's genomic profile.
One critical question that patients, clinicians, and researchers face is which metastatic lesion is better to biopsy or analyze. In certain cases, the most accessible metastatic lesion is chosen. In other circumstances, different options such as liquid biopsies may be available. A better understanding of the molecular profiles of the different metastatic sites might be of value. For example, PD-L1 expression in immune cells in triplenegative breast cancer (TNBC) is not recommended in liver biopsies due to the general lack of immune cells in this organ [19,20]. Another example is determination of HER2 in bone metastasis, which is generally not recommended for technical reasons due to the decalcification procedures. To improve our understanding of the biology of breast cancer metastasis according to the organ site, we performed a phenotypic and molecular characterization of HER2 and 771 genes in 184 metastatic samples across 11 organs, including liver, lung, brain, and bone.

Study population
This retrospective and exploratory study included 184 metastatic tumor samples from 176 patients over the age of 18 years with a histologic diagnosis of metastatic breast cancer detected at the time of diagnosis, at first relapse or after disease progression. Tissues were collected from Hospital Clinic of Barcelona (n = 161) and Hospital Universitario 12 de Octubre (n = 23) in Madrid between years 2000 and 2019. To be included, patients were required to have a formalinfixed paraffin-embedded (FFPE) tissue sample from a locoregional or a distant metastatic lesion. Primary tumor biopsies were allowed if the biopsy was obtained in the context of de novo metastatic disease (n = 7). Core biopsies were performed according to the routine clinical practice, and HR and HER2 receptor statuses were determined locally in the metastatic biopsy according to the American Society of Clinical Oncologists (ASCO)/College of American Pathologists (CAP) guidelines [21,22]

Gene expression analysis
RNA was extracted using the High Pure FFPET RNA isolation kit (Roche, Indianapolis, IN, USA) following the manufacturer's protocol. One to five 10 lm FFPE slides depending on tumor cellularity were used for each tumor sample, and macrodissection was performed, when needed, to avoid normal tissue contamination. A minimum of 100 ng of total RNA was analyzed at the nCounter platform (NanoString Technologies, Seattle, WA, USA) using the Breast Cancer 360 Panel, which measures the expression of 771 breast cancer-related genes and 5 housekeeping genes (ACTB, MRPL19, PSMC4, RPLP0, and SF3A1) [24]. Expression counts were then normalized using custom scripts in R 3.6.3.

PAM50 molecular subtypes and gene signatures
All tumors were assigned to an intrinsic molecular subtype of breast cancer (luminal A, luminal B, HER2enriched, basal-like, and normal-like) using the previously reported PAM50 subtype predictor [25]. For each sample, we calculated scores for 9 signatures including the 5 PAM50 signatures (luminal A, luminal B, HER2-enriched, basal-like, and normal-like) [11], the proliferation signature [26], two risk of recurrence (ROR) signatures at 10 years: ROR score based on subtype (ROR-S) and based on subtype and proliferation (ROR-P) as described previously [26], and the previously reported PAM50MET signature, which is based on 17 variables [27]. Gene expression data will be deposited in the Gene Expression Omnibus under the accession number GSE175692.

Statistical analysis
Chi-square tests were performed to determine the differences in the distribution of variables. Data were subjected to unsupervised hierarchical clustering and principal component analysis (PCA) to identify patterns of expression and to clean the dataset from outliers, 3 out of 184 samples were excluded. Unpaired and multiclass significance analysis of microarrays (SAM) [28], using false discovery rate (FDR), was used to identify differential gene expression across metastatic sites (n = 181). Due to the low number of metastasis from ovary (n = 4), muscle (n = 2), and peritoneum sites (n = 2), these samples were excluded from the multiclass SAM analyses. Logistic regressions were used to identify organ-specific genes.
Overall survival (OS) was defined as the period of time of first diagnosis of metastatic disease to death or last follow-up. Censoring was done at 120 months. Estimates of survival were from the Kaplan-Meier curves and tests of differences by the log-rank test. Univariate and multivariable Cox models were used to test the prognostic significance of each variable. The Bonferroni correction method was used to control the family-wise error rate in case of multiple comparisons [29]. All differences were considered significant at Pvalue < 0.05. All statistical computations were carried out in R 3.6.3 (http://cran.r-project.org).

Functional and pathway enrichment analyses
Gene ontology (GO) annotation analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were analyzed by the Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/) online tool [30]. The list of 771 available genes was used as the background or reference gene list. A P < 0.05 was considered statistically significant.

Effect of organ site in gene expression profiling
The influence of organ site in gene expression profiling of metastatic breast cancer has not been formally addressed. To start approaching it, we first assessed the expression of 771 breast cancer-related genes across the 184 metastatic tumor samples and 11 organ sites. Three out of 184 samples were identified as outliers and were excluded from the gene expression analysis (Fig S2). Unsupervised hierarchical clustering (Fig. S3) and principal component analysis (Fig. 4) revealed that the intrinsic subtypes explain a greater amount of gene expression variability than organ of metastasis. Secondly, we combined gene expression data from 186 patients with early-stage breast cancer representative of all subtypes in the 181 metastatic dataset. The combined dataset (n = 367) of tumors obtained from early-stage and metastatic breast cancer revealed that the 2 main principal components (i.e., PC1 and PC2) are also explained by intrinsic subtype (Fig. S4).

Genes and biological processes associated with organ site
To explore differences in gene expression across metastatic sites, we performed a multiclass SAM analysis. Using a FDR < 5%, we identified a total of 631 differentially expressed genes (81.1%) across organs (Table S3). Some examples were IBSP, which was highly expressed in bone metastasis; FGF1, which was highly expressed in brain metastasis; PCK1, which was highly expressed in liver metastasis; CAV1, which was highly expressed in lung metastasis; or KRT14, which was highly expressed in skin metastasis. Next, we performed a two-class unpaired SAM analysis between each organ versus the rest of samples to identify genes whose expression is associated with each organ of metastasis. Using a FDR < 5%, we identified a total of 518 upregulated genes (67.2%) across organs (i.e., 204 in bone, 201 in skin, 109 in brain, 91 in liver, 29 in lung, and 7 in breast) (Table S4 and Fig. S5A). We did not identify any upregulated gene in lymph node or pleural metastases, nor any common gene upregulated in all metastatic sites (Fig. S5B). We also compared the bone samples of patients with bone-only metastasis vs patients with metastasis in bone and other sites, and we could not find any significant differential expressed gene (Table S4). We then carried out functional enrichment GO analyses using the upregulated gene lists, and although these analyses were limited by a minority of genes in each gene list, they revealed biological processes and pathways significantly enriched (P < 0.05) in each metastatic site (Table S5). In bone metastasis, ossification, the bone morphogenetic protein (BMP), the TGF-beta, and the Hippo signaling pathways were enriched. In brain metastases, enriched GO and pathways included regulation of transcription and GTPase activity and cell migration and also brain-related processes such as nervous system development, chemical synaptic transmission, adult behavior, dopamine synapse, or amphetamine addiction. In liver metastases, we identified enrichment in processes such as oxidation-reduction, glucose metabolism, chromatin remodeling, cholesterol esterification or vasodilation, and the AMPK and calcium signaling pathway. In lung metastasis, enriched GO and pathways included regulation of transcription, immune response, regulation of nitric oxide regulation, regulation of IL6 production, or regulation of vasoconstriction. Finally, enriched GO and pathways in skin metastasis included cell adhesion, angiogenesis, extracellular matrix organization, proteolysis, wound healing, epidermis development, collagen-related processes and ERK and Notch pathways and lipid metabolism (Fig. S6).
The previous gene expression results could be confounded by differences in subtype distribution across organs. To identify genes whose high expression was specific of metastatic site and independent of subtype, we performed adjusted logistic regression analysis for each individual gene. A total of 74 genes were identified (P < 0.05): 36 bone-specific genes, 18 liver-specific genes, 12 brain-specific genes, and 8 skin-specific genes ( Table 2). Of note, we identified known organ-specific genes such as the integrin-binding sialoprotein (IBSP) for bone, the crystallin alpha B (CRYAB) for brain, the aldehyde dehydrogenase 1 family member A1 (ALDH1A1) for liver, or KRT14 for skin. In addition, we identified 3 genes found in the PAM50 gene list to be associated with bone (FOXC1) and skin (KRT14 and KRT5) metastasis.
Finally, we interrogated the 74 genes in 390 breast primary tumors from Lawler et al. [23] publicly available dataset, where 3 types of metastatic spread have been identified: bone and visceral metasynchronous spread, bone-only spread, and visceral-only metastasis. Among the 74 genes, 26 genes (35.1%) were found significantly associated with the type of metastatic spread, including 5 bone-specific genes (CHAD, EYA1, TGFB1, BAX, and HOXA9) whose high expression was associated with bone-only metastasis, 4 bone-specific genes (WIF1, VIT, FOXC2, and MME) whose high expression was associated with bone and visceral metastasis, 2 brain-specific genes (FGF1 and SOX2) whose high expression was associated with bone and visceral metastasis, 2 brain-specific genes (RASGRF1 and CHI3L1) whose high expression was associated with visceral-only metastasis, and 2 liverspecific genes (GGH and MARCO) whose high expression was associated with visceral-only metastasis ( Table 2). This result suggests that particular metastatic organ-specific genes might also be indicative of the type of metastatic spread when analyzed in primary tumors.

Immune expression profiles across organ sites
We then investigated differences in the expression of 95 immune genes across metastatic sites. The expression of 89 genes was found significantly different across metastatic sites (FDR < 5%) (Table S6 and Fig. S7A). Among them, we identified 18 genes of the tumor inflammation signature (TIS) (CCL5, CD27, CD276, CD274, CD8A, CMKLR1, CXCL9, CXCR6, HLA-DQA1, HLA-DRB1, HLA-E, IDO1, LAG3, NKG7, PDCD1LG2, PSMB10, STAT1, and TIGIT) which has been previously associated with anti-PD-1/PD-L1 response [28,29], 4 genes associated with CD8 T cells (CD8A, GZMM, CD8B, and PRF1), a marker of functional regulatory T cells (Treg) (FOXP3), a biomarker for B cells (CD19), and 4 macrophage-related genes (C163, CD84, CD68, and CYBB) (Fig. 5). Lung and pleura were the sites with higher expression of immune genes, while brain had the lowest expression of immune genes. Moreover, we found 86 immune genes differentially expressed across the molecular subtypes (FDR < 5%) (Table S7). Basal-like was the subtype with the highest expression of immune genes (Fig. S7B). Finally, to identify immune genes whose expression was specific of metastatic site and independent of subtype, we performed adjusted logistic regression analysis for each individual immune gene. Regardless of molecular subtype, we identified 27 highly expressed genes in lung, 23 highly expressed genes in pleura, 18 highly expressed and 9 lowly expressed genes in bone, 10 highly expressed and 21 lowly expressed genes in liver, or 7 highly expressed and 39 lowly expressed genes in brain (Table S8).

Associations with overall survival
We evaluated the prognostic ability of the PAM50 subtypes and the site of metastasis. PAM50 molecular subtypes were associated with OS (P < 0.001) and better discriminated prognosis than site of metastasis (Fig. 6A) We then explored the association of 771 individual genes and 9 signatures with OS. We identified 1 signature score (i.e., luminal A signature score) and 51 genes whose high expression was significantly associated with better OS, and 2 signature scores (i.e., basal-like signature score and PAM50MET signature score [27]) and 25 genes whose high expression was significantly associated with worse OS (Table S9). When adjusting for PAM50 subtype, 45 of the 771 genes (5.8%) were significantly associated with OS ( Fig. S8 and Table S10), of which the high    CDCA5, FAM83D, ANLN, MMP7, CRYAB, FOXC1, E2F1, and PDCD1) including the bone-specific gene FOXC1, the brain-specific genes CRYAB, or the immune-related gene PDCD1 (Fig. 6B). Finally, we performed a multivariate analysis adjusting for clinicopathological variables including menopausal status, type of metastasis (de novo or relapsed), number of metastatic sites, metastatic site of the biopsy, PAM50 molecular subtype, and number of lines of therapy and found that 5 of the 9 genes were still significantly associated with worse OS (FAM83D, ANLN, CRYAB, FOXC1, and E2F1) (Table S10).

Discussion
To our knowledge, this is the first study to evaluate gene expression profiles of breast cancer across metastatic organs. In particular, we explored genomic differences between sites of metastatic disease and made the following observations: (a) All intrinsic molecular subtypes are identified within IHC groups; (b) HER2-low disease is identified in all metastatic sites; (c) intrinsic molecular subtypes determined in the metastatic site are associated with OS regardless of where biopsy was performed; (d) lung and pleural metastases have the highest expression of immune genes, while brain and liver have the lowest; and (e) the expression of individual genes is organ-specific and is associated with OS. Previously, we reported that approximately 15% of primary luminal A and B HR+/HER2-negative tumors become HER2-enriched once they metastasize, regardless of HER2 status [11,17]. Concordant with this observation, here we observed a higher frequency of HER2enriched and basal-like subtypes in HR+/HER2negative metastasis compared to primary tumors [17].  of TNBC primary tumors have been reported to be basal-like and 9% HER2-enriched [32,33] and our results showed similar distribution of molecular subtypes in metastatic TNBC. The acquisition of more aggressive molecular subtypes in the metastatic setting, such as HER2-enriched and basal-like [34], especially in HR+/HER2-negative disease, may be due to patient selection, changes in the tumor biology due to its inherent evolution, the effects of therapies, or a combination of all. Recently, comprehensive genomic studies of metastatic breast cancers linked an increase in APOBEC genetic signatures with metastatic HR+/HER2-negative breast cancer [14,35]. Interestingly, high frequency of APOBEC3Bassociated mutations occurs in HER2-enriched subtype [36] which is consistent with the increase in this subtype observed in the metastatic setting.
Here, we also report HER2-low disease in all metastatic sites, including bone. Bone metastases are usually not accepted for inclusion in clinical trials due to decalcification procedures related to IHC. Here, we show that ERBB2 mRNA is highly correlated with HER2 IHC in bone metastasis, suggesting that bone metastasis might be a reliable organ to detect HER2 expression. Nonetheless, alternative quantitative measurements of HER2 (i.e., ERBB2 mRNA) may help better identify patients who might benefit from potent anti-HER2 antibody-drug conjugates, like T-DXd [31,37].
Interestingly, we have previously observed in a wide population of almost 1600 HER2-negative tumors that HER2-low disease was enriched in luminal molecular subtypes (about 80%), especially when compared to HER2-0 (about 50%) [29]. In the present study on a smaller sample size (80 HER2-low specimens), luminal subtypes accounted for roughly half of the total and no difference in subtypes distribution was observed between HER2-low and HER2-0 tumors. However, in  our previous study, only 2.4% of HER2-low tumors were metastatic. A potential shift in molecular subtype distribution between primary and metastatic tumors might thus merit a more careful evaluation, along with its potential prognostic and therapeutic implications. Our study identified particular genes differentially expressed across metastatic sites, suggesting a potential role of the tumor microenvironment. Indeed, our functional enrichment analysis of upregulated genes in each metastatic site identified biological processes and pathways related with the organ where metastasis was seeded. Moreover, we have validated some previously reported overexpressed genes such as TGFB1, IBSP, MMP9, or ITGB3 in bone metastasis [4,38,39]; CRYAB, NRCAM, and SOX2 in brain metastasis [40][41][42]; VEGF and IL6 in lung metastasis [4,43]; and CYP4F3 in liver metastasis [44]. ALDH1A1, PCK1, and APOE were previously described to be upregulated in liver metastases of colorectal cancer [45][46][47]. Further studies are required to understand whether these genes could be used as therapeutic targets or biomarkers of response.
Our data indicate that lung and pleura are the sites of metastasis with higher expression of immune genes, while brain and liver have the lowest expression of immune genes. This is consistent with the findings of a recently published study of over 400 metastatic samples which indicate that lung metastasis has the highest TIS compared to other metastatic sites regardless of the cancer of origin [48]. Notably, high TIS is a biomarker of response to immunotherapy [49,50]. Taken together, these data suggest that patients with lung and pleural metastasis might benefit from immune checkpoint blockade, while other treatment approaches could be more suitable for liver and brain metastases.
Concordant with early-stage breast cancer [51], PAM50 subtypes in metastatic tissues were found highly prognostic. At the same time, we identified 45 genes whose expression provides prognostic information beyond PAM50 subtypes. For example, we identified high PD1 expression as being associated with poor prognosis. Functional studies are needed to better understand whether the genes associated with worse OS could also be therapeutic targets. Indeed, high PD1 mRNA might be a tumor-agnostic biomarker of benefit from anti-PD1 therapy [52].
Our study has some limitations worth noting. First, this is a retrospective study using the available metastatic tumor samples at Hospital Clinic of Barcelona and a set of 23 TNBC samples from Hospital Universitario 12 de Octubre; therefore, selection bias is likely. For instance, we had a very small HER2+ sample size. However, the distribution of IHC groups was very similar to the seminal work by Bertucci and colleagues [14]. On the other hand, the distribution of the organs of the selected biopsies may not reflect the actual frequency of breast cancer metastatic sites due to the accessibility to the organs of metastasis. Second, our cohort is very heterogeneous in terms of systemic therapies received. Thus, we could not link the biological findings with treatment benefit. Third, our dataset surprisingly had longer median OS than expected, possibly because those patients that have biopsies are the more likely to survive longer. Indeed, having a biopsy upon recurrence has been associated with longer survival [53]. Fourth, our analyses are limited to 771 genes; whether different results might be obtained with more genes is unknown. Fifth, we did not explore the biological differences across metastatic sites within a single patient.

Conclusion
In summary, although main molecular features from primary tumors are known to be maintained in advanced disease [9,10], here we report higher proportion of aggressive molecular subtypes in the metastatic setting, especially in HR+/HER2-negative disease, and unique biological features of each metastatic site, indicating a role of the tumor microenvironment and the need to biopsy metastatic disease in patients with advanced breast cancer to better select the treatment strategy for each patient. Understanding the biology of each metastatic site can potentially impact the design of new therapies and ultimately improve patient outcomes. Finally, our study provides a precious dataset of cancer metastasis that can be further exploited.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Fig. S1. Correlation between ESR1 mRNA and % ER protein expression. Pearson correlation between ESR1 mRNA and ER protein expression across all metastatic sites (n = 148) and across bone metastasis (n = 29) with coloring of PAM50 molecular subtype.      .  Table S1. Distribution of PAM50 subtypes and IHC groups across metastatic sites. Table S2. Distribution of PAM50 subtypes in each metastatic site according to IHC group. Table S3. Multiclass SAM of 771 genes between metastatic sites. Table S4. Unpaired SAM between each metastatic site and others. Table S5. Functional enrichment GO analyses of the up-regulated genes in each metastatic site. Table S6. Multiclass SAM of 95 immune genes between metastatic sites. Table S7. Multiclass SAM of 95 immune genes between molecular subtypes. Table S8. Adjusted logistic regression analysis for 95 immune genes. Table S9. Univariate analysis to investigate the association of 771 individual genes and 9 signatures with OS.