Systems therapeutics analyses identify genomic signatures defining responsiveness to allopurinol and combination therapy for lung cancer

The ability to predict responsiveness to drugs in individual patients is limited. We hypothesized that integrating molecular information from databases would yield predictions that could be experimentally tested to develop genomic signatures for sensitivity or resistance to specific drugs. We analyzed TCGA data for lung adenocarcinoma (LUAD) patients and identified a subset where xanthine dehydrogenase expression correlated with decreased survival. We tested allopurinol, a FDA approved drug that inhibits xanthine dehydrogenase on a library of human Non Small Cell Lung Cancer (NSCLC) cell lines from CCLE and identified sensitive and resistant cell lines. We utilized the gene expression profiles of these cell lines to identify six-gene signatures for allopurinol sensitive and resistant cell lines. Network building and analyses identified JAK2 as an additional target in allopurinol-resistant lines. Treatment of resistant cell lines with allopurinol and CEP-33779 (a JAK2 inhibitor) resulted in cell death. The effectiveness of allopurinol alone or allopurinol and CEP-33779 were verified in vivo using tumor formation in NCR-nude mice. We utilized the six-gene signatures to predict five additional allopurinolsensitive NSCLC lines, and four allopurinol-resistant lines susceptible to combination therapy. We found that drug treatment of all cell lines yielded responses as predicted by the genomic signatures. We searched the library of patient derived NSCLC tumors from Jackson Laboratory to identify tumors that would be predicted to be sensitive or resistant to allopurinol treatment. Both patient derived tumors predicted to be allopurinol sensitive showed the predicted sensitivity, and the predicted resistant tumors were sensitive to combination therapy. These data indicate that we can use integrated molecular information from cancer databases to predict drug responsiveness in individual patients and thus enable precision medicine.


Introduction
Lung cancers are the most common cause of death related to cancers worldwide (1). Non-small cell lung cancer (NSCLC) is a widely occurring lung cancer that includes three main subtypes: adenocarcinoma, squamous cell carcinoma and large cell carcinoma (2). Targeted treatment for lung cancer based on attacking the major mutational characteristics and responsiveness to immunotherapy has significantly increased life span (3)(4)(5). Often, many of the mutated gene products that are drivers of the cancers are part of, and controlled by, complex networks of cellular components within cancer cells. Such cellular regulatory networks give rise to the biological capabilities that are characteristic of cancer cells (6). In spite of the steady advances in the treatment of lung cancers, a targeted therapy often works only on a subset of patients with the target driver mutation. One approach to search for other possible therapies, rests with the possibility that many pathways are uniquely dysregulated in individual patients, and these pathways can be used to find targets for potential efficacious drugs. Systems level analyses that consider different types of omics data can provide both the breadth and depth needed to identify pathways that can be targeted therapeutically. Such analyses can also enable the discovery of prognostic genomic biomarker sets associated with the therapeutic targets, and thus represent an important step in precision medicine.
We used a combination of cancer databases for data integration to identify specific drugs that are effective in a predictable manner in individuals. We started with The Cancer Genome Atlas (TCGA) (7) to test our hypothesis that integrated consideration of the molecular characteristics of individual patient tumors will allow us to identify actionable drug targets. TCGA contains both clinical and molecular data from individual patients for different types of cancers including lung cancer. These data have led to reclassification of many cancers based on molecular characteristics (8)(9)(10). We focused on lung adenocarcinoma (LUAD). We explored TCGA data from LUAD patients to find new pathways and targets which had not previously been used for drug therapy of lung cancer. Our strategy was to focus on targets that are not well-known mutations or that have protein kinase activity, to be able to explore unidentified potential cancer genes (11). We found the xanthine dehydrogenase (XDH) gene highly expressed in a subset of patients with lower survival rates in TCGA LUAD data. XDH and its interconvertible form xanthine oxidase have been known as drug targets for over fifty years. In fact, an inhibitor of XDH, allopurinol, was synthesized and tested as an early potential anti-cancer agent. Although allopurinol was a not a successful anti-leukemic drug (12), it has been used successfully to treat gout for over fifty years and is also used to prevent kidney stones associated with hyperuricemia caused by cancer chemotherapy (13,14). We then experimentally analyzed LUAD cell lines in the Cancer Cell Line Encyclopedia (CCLE) (15) to identify cell lines that are either sensitive or resistant to allopurinol. We used molecular data associated with these cell lines to identify the transcriptomic signatures that predict sensitivity or resistance to allopurinol. We also used network analysis to predict that cell lines resistant to allopurinol alone could be successfully treated with combination therapy of allopurinol with a JAK2 inhibitor. We tested this prediction experimentally and found it to be valid. We then used the molecular signatures from integration of TCGA and CCLE data to analyze transcriptomic data from patient-derived tumors (PDX) from Jackson laboratory and were able to identify tumors that had allopurinol sensitivity indicating that molecular signatures for allopurinol sensitivity can be identified.

Analysis of TCGA and identification of XDH as a target in NSCLC
We analyzed methylation and gene expression data (12905 genes) from patients with lung adenocarcinoma (LUAD) in TCGA. All differentially expressed genes and methylation markers were identified. The genes which were common to both the set of differentially expressed genes and the set of differentially expressed methylation markers were selected (25 genes). These 25 genes then were evaluated for correlation with "days to death" for individual patients. Among these 25 genes, we found 16 genes which had either positive correlations between "days to death" and DNA methylation signatures or negative correlations between "days to death" and gene expression signatures. Of these 16 genes, 4 of them were selected as novel and druggable:

Allopurinol-sensitive and allopurinol-resistant phenotypes in NSCLC
We selected XDH for subsequent investigations, because there is already an FDA-approved drug, allopurinol, that inhibits the enzyme, and that failed as an anti-neoplastic drug when it was first synthesized and tested. (12). Based on our correlation analysis of TCGA data, XDH was among those genes whose higher level of expression correlated with lower survival rates for a subset of patients. Hence, we reasoned that inhibiting XDH could potentially change the course of cancer cell progression. We treated our panel of twelve NSCLC cell lines with allopurinol and calculated the IC50 of allopurinol for cell viability. Basal level of expression of XDH protein in these 12 cell lines negatively correlated with IC50 for allopurinol (Spearman r=-0.8667, P=0.002) which implies an addiction to XDH protein in sensitive cell lines (Figure 1-S4).
We then tested two allopurinol sensitive (NCI-H358 and NCI-H460) and one allopurinol-

Genomic signatures of responsiveness to allopurinol
A key mutation differentiating allopurinol-sensitive and allopurinol-resistant cell lines appears to be KRAS. However, it is neither necessary nor sufficient. KRAS mutations were found in 4 out of 5 sensitive cell lines (the remaining one had an NRAS mutation), but in only 1 out of 7 resistant cell lines (one other of which had an NRAS mutation). Given this variability, we reasoned that additional molecular determinants could provide a more predictive signature. To find the genomic determinants of responses of these cell lines to allopurinol, we used CCLE transcriptomic data to find differentially expressed genes in allopurinol-sensitive and allopurinolresistant cells. We used a t-test (p < 0.001) to identify the genes with the highest expression  Figure  3-F shows the CI for 6 combinatory doses of these drugs in these 2 cell lines indicating synergism, and Figure 3-S1 shows the CI for the NCI-H2170 cell line.

Gene signatures and the allopurinol sensitivity in PDX models of NSCLC
To test the efficacy of allopurinol and combination therapy with CEP-33779 in patient-derived xenograft (PDX) models of NSCLC, and to evaluate the power of gene signatures to assign tumors for best treatment options, we used the gene signatures of sensitivity and resistance to allopurinol in the PDX models of NSCLC provided by The Jackson Laboratory. We analyzed the gene signatures in two sets of data available from the Jackson Laboratory NSCLC PDX models: RNAseq data was used to select one PDX model as allopurinol sensitive and Affymetrix gene expression data was used to select one PDX model as allopurinol-sensitive and one as allopurinol-resistant. We used an algorithm similar to cell line selection by defining a score for sensitivity and resistance based on gene signatures, and we also considered the RAS mutation status (Figure 4-A). In this algorithm, carrying a KRAS or NRAS mutation was a boolean function to select sensitive models, as RAS mutation was previously observed in most sensitive cell lines. These PDX models were grafted as subcutaneous tumors in NSG mice, and allopurinol  Taken together, these data indicate that genomic signatures derived from TCGA can correctly predict allopurinol and allopurinol/CEP-33779 responsiveness in patient derived tumors.

Discussion
TCGA has been very useful in developing molecular classifications of cancer subtypes that underlie key concepts of precision medicine. In addition, as TCGA contains both molecular and clinical data, it is possible to analyze the relationship between these classes of data to develop predictive signatures for progression of cancers in individuals. Additionally, as our study shows, TCGA data sets are a gold mine for identifying targets for new drugs, drug repurposing and combination therapy. A fundamental premise for such mining is that a particular drug therapy is likely to be effective in only a subset of patients. The transcriptomic signature provides a clear way to identify these patients. While our study was being completed, several papers have Although superficially similar, our approach differs from these studies in the following important ways. In our initial search of TCGA we considered all molecular changes including genomic and epigenomic variations individually, not just transcriptomic changes, for predictions (Figure 1-A).
We subsequently focused on gene expression levels, as this was a facile way to integrate TCGA and CCLE data, using the signatures to identify individual NSCLCs, both in the CCLE cell lines and the PDX tumors. We used network building and analyses to identify relationships between drug targets. The two approaches we have used should be broadly useful in identifying additional drug targets and predicting responsiveness for these drugs in other cancers as well. Combining identification of targets for a drug with specification of which patients might benefit from treatment with that drug, can be a substantive step forward in precision medicine We focused on allopurinol both for historical and practical reasons. As described by Elion in her Nobel Prize essay (28)  drug which could readily be tested in the clinic for patients whose molecular profiling indicates that it could be effective.
Our data indicate that two characteristics predict effectiveness: a RAS mutation and a precision transcriptomic signature are both necessary for treatment of NSCLCs with allopurinol and for combination therapy with allopurinol and a JAK2 inhibitor The role of RAS mutation in NSCLC has been shown to be related to heterogeneity in metabolic dependencies and metabolic reprogramming (29,30). It has been shown that transcriptomic profiles of tumors present a metabolic heterogeneity among individual patients which needs to be taken into account for designing precision therapeutics (31). In our cell line studies, most. but not all, sensitive cell lines had a RAS mutation and most, but not all, resistant cells lacked a RAS mutation. This fact indicates the need for a dual signature set including both RAS mutation and gene expression pattern for assignment for treatment with allopurinol or allopurinol plus a JAK2 inhibitor.
Although the histology of all cells and tumors in this study was NSCLC, the TCGA data were from adenocarcinoma tumors and all but one of the cell lines used to find the signatures were also adenocarcinoma while some of the predicted cell lines and PDX models were not adenocarcinoma. Interestingly, one of the cell lines predicted as sensitive with squamous cell carcinoma histology was at the border of sensitivity and resistance compared to other predicted cell lines which were adenocarcinoma.
This study starts with data from individual patients and ends with predictive treatment of tumors from individual patients. Although cost considerations prevented us from testing a large number of PDX tumors, the two tumors we predicted would be allopurinol sensitive were shown to be so.

Cell Cycle and Cell Viability Assay and Calculation of IC50 and CI
For cell cycle analysis, cells were fixed in 70% ethanol. Fixed cells were treated with RNase for 20 minutes before addition of 5µg/mL Propidium Iodide (PI) and analyzed by FACS. Cell viability was detected by luminescent cell viability dye (CellTiterGlo, Promega Corporation, USA). Cells were seeded in triplicate into 96-well plates in full growth media. After 24 hours, drugs of interest (allopurinol and/or CEP-33779) were added in 12 different concentrations (varying from 0 to 4mM) and after 48 hours of drug treatment, 20µL of dye was added to each well containing 100µL of treated media. Cell viability was calculated by dividing each luminescent reading by the average of the luminescent readings obtained for vehicle-control. Concentration-response curves were generated and fitted in Prism 7.0 (GraphPad Software, Inc., USA). The IC50 values were generated using the log inhibitor-normalized response variable slope function: Y=100/(1+10^((X-LogIC50))). IC50 values are shown with 95% confidence interval from at least three independent experiments. To evaluate synergism, CI values were calculated based on the method proposed by Chou and Talalay (24) using CompuSyn software (32). The following single doses of allopurinol were used: 400µM, 800µM and 1000µM. The following single doses of CEP-33779 were used: 1.6µM, 3.2µM and 16µM. The following combination doses were used: Allopurinol=400µM combined with CEP-33779 (1.6µM, 3.2µM and 16µM) and Allopurinol=800µM combined with CEP-33779 (1.6µM and 3.2µM and 16µM).

Xenograft Cell Line in vivo Experiment
NCR-nude female athymic mice were purchased from Taconic Farms, Inc. Mice were injected in the flank region with 1.5*10 6 cells, while anesthetized with a combination of ketamine and xylazine. Size of tumors was measured in three dimensions using a caliper and tumor volume was calculated by this formula: V=0.5*length*width*height. When tumors reached a minimum size of 100mm 3 , mice were randomly assigned to treatment groups and drug treatment was administered by oral gavage. Allopurinol (200mg/kg three times a week) and CEP-33779 (10 mg/kg three times a week) were diluted in PBS for treatment groups and PBS was given to the control group as placebo. Tumors and weights of the mice were measured 3 times a week.

Colony Formation in Soft Agar
Cells (1×10 5 to 2×10 5 per plate) were suspended in soft agar containing 5% serum, dosed with vehicle and drugs and allowed to grow for 2 to 3 weeks with periodic dosing to keep the dosing media fresh and the agar hydrated. Viable colonies were stained with Iodonitrotetrazolium Chloride at 0.5 mg/mL overnight. Colonies larger than 0.3 mm in each field were manually scored using a light microscope.

Immunofluorescent and Western Blot analysis of Tumor Tissue
Mice bearing subcutaneous tumors were sacrificed after the treatment course and tumors were resected. These resected tumors were snap-frozen in isopentane, submerged in liquid nitrogen and sectioned onto positive slides. Unstained frozen sections were fixed for 15 minutes in icecold acetone, dried, rehydrated in PBS and blocked in Tris-buffered saline (TBS) containing 1% Bovine Serum Albumin (BSA), 10% goat serum followed by overnight (4°C) incubation with primary antibodies for caspase 3 and CD31. After washing, Alexafluor 568-Goat anti-rabbit secondary antibodies (Fisher Scientific Company) were incubated with the tissue for 1 hour at room temperature, followed by 4',6-diamidino-2-phenylindole (DAPI) (Molecular Probes) staining. Staining was visualized using an Olympus MVX10 Macroview microscope with a 2X Apochromat lens with 5× zoom. For western blot analysis, a 2 to 3 mm cross-sectional slice of the tumor was lysed in RIPA buffer by sonication and the resulting lysates were analyzed by western blot following standard methods.

PDX Models in vivo Experiments
PDX models were purchased from the Jackson Laboratory and they were received as a single tumor engrafted subcutaneously in a NSG mouse (The NOD.Cg-Prkdc scid Il2rg tm1Wjl /SzJ). This original mouse was sacrificed and the tumor was divided and engrafted in 5 other NSG mice subcutaneously and allowed to grow. Then each of the new tumors was engrafted in 5-10 more mice. Drug treatments were started at passage 4 when enough tumor samples were available. All NSG mice were purchased from the Jackson Laboratory. When tumor sizes were between 50-150 mm 3 , mice bearing tumors were randomly assigned to treatment groups. Each group had at least 8 mice at the beginning of the experiments. Drug preparation, administration and tumor measurements were the same as in the xenograft cell line in vivo experiment, but the allopurinol, CEP-33779, and combination therapy were applied at the following doses: allopurinol (70 mg/kg daily), CEP-33779 (10mg/kg daily) and combination therapy (allopurinol 50mg/kg daily+CEP-33779 2.5mg/kg daily). Tumors and weights of the mice were measured 3 times a week. After the treatment course (30 days), 3 mice from each group were used for in vivo imaging using Pan Caspase (VAD-FMK) near infrared assay (Vergent Bioscience) in The IVIS® Spectrum in vivo imaging system.

Statistical Analysis
All experimental data are shown as mean±SEM. Unpaired t-test and one-way ANOVA were used and p<0.05 was considered as significance. All statistical analyses of experimental data were done in GraphPad Software 7.

TCGA Candidate Gene Signature Identification
In brief, we identified patients with both methylation data and gene expression data from the TCGA-LUAD dataset (Snapshot 12/2012). In addition, we excluded any patient who did not have tissue level control samples. We divided samples into four categories: case-male, casefemale, control-male, and control-female. We evaluated the significance of the difference between case-control by determining the absolute difference among the mean divided by the square root of the sum of variance among each of the groups for each gene. This is given by the formula below: In this formula ! " is the significance of a gene expression, $ / is the average gene expression for the specified category, , / is the standard deviation of the gene expression for the specified category. The gene expressions were determined by the Agilent 4502A microarray. We selected for gene targets that had a positive gene expression change where ! " > 1.5.
We repeated this procedure selecting for methylation markers assessed by Illumina Human Methylation 27k microarray. In this procedure we selected for markers that had a ! " < -2. We repeated the procedure for markers assessed by Illumina Human Methylation 450k microarray. We identified methylation patterns that were selected based on both array formats. We then selected genes that were selected by both gene expression and methylation differentials.
Upon identifying gene signatures of interest, we correlated the gene expression signatures and methylation signatures to the "days to death". We used linear correlations to evaluate the associations between molecular data and clinical data. A gene signature was identified as correlated if it had a correlation coefficient that had a one-sided P < 0.05 as evaluated by the student's T distribution. Through this, we identified genes that had either positive correlations between "days to death" and DNA methylation signatures or negative correlations between "days to death" and gene expression signatures. (Supplemental Files 1, 2, and 3)

Extracting the Gene Signatures of Sensitivity and Resistance to Allopurinol
CCLE gene expression data of 12 cell lines tested for siRNA screening were used. A Welch's ttest, with P<0.001 was used to compare the differentially expressed genes in two sets of cell lines of allopurinol sensitive and allopurinol-resistant. 12 genes were found (6 in each set) which were differentially expressed.

Network analysis of Gene Signatures
To find a gene set capable of forming a protein-interaction network, we selected the top 10 upregulated genes and top 10 down-regulated genes. We used X2K to build a protein interaction network using these new gene sets.

Selecting Cell Lines for Validation of Gene Signatures
For predicting new cell lines as allopurinol-sensitive and allopurinol-resistant, we extracted CCLE gene expression data of all NSCLC cell lines. We then used mean of normalized expression of all genes in gene signature of allopurinol sensitivity to rank all of these cell lines; this rank of cell lines was called sensitivity rank. We also used mean of normalized expression of all genes in gene signature of allopurinol resistance to rank all of these cell lines, this rank of cell lines was called resistance rank. We calculated sensitivity score as Sensitive Score=Resistance Rank-Sensitive Rank and we calculated resistance rank as Resistance Score=Sensitivity Rank-Resistance Rank. We selected the top 5 cell lines (those available to purchase) with highest sensitivity score as allopurinol-sensitive cell lines (if a cell line was not available, we used the next available cell line in the ranked list of allopurinol-sensitive cell lines). The same method was used to select allopurinol-resistant cell lines and 4 cell lines were selected for validation in vitro.

Selecting PDX Models for Validation of Gene Signatures
We extracted gene expression and RAS mutation data of all NSCLC PDX models provided by the Jackson Laboratory. We analyzed the data based on the technology used for measuring gene expression separately (Supplemental files 4, 5 and 6). There were 35 NSCLC PDX models available with RNAseq expression data and 18 NSCLC PDX models available with Affymetrix hg10st gene expression data. For selecting allopurinol-sensitive models, we first calculated the sensitive score and resistance score the same way we calculated them for the cell lines. Then among models with highest sensitivity score and positive for RAS mutation, we selected a model for validation as an allopurinol-sensitive model using RNAseq and Affymetrix hg10st gene expression data (TM01563 model and TM00206 Model respectively). Among models with highest resistance score and negative for RAS mutation, we selected a model for validation as an allopurinol-resistant model using Affymetrix hg10st gene expression data (TM00188).

Gene Set Enrichment Analysis
The sets of gene signatures (allopurinol-sensitivity and allopurinol-resistance) were used for gene set enrichment analysis by Enricher (Go ontology and WikiPathways) and MBC ontology (16,33).         C) Protein interaction network built using genomic signatures. Relaxation of stringency led to a gene set capable of forming a network with additional intermediary nodes like JAK2. D) Validation of combination treatment with allopurinol (400µM) and knockdown of JAK2 in 3 cell lines, percent of cell viability compared to control is shown. Combination treatment significantly decreased cell viability compared to allopurinol treatment alone and JAK2 knockdown alone. (mean+SEM, One-way ANNOVA test, ***p<0.001, ****p<0.0001) E) Effects of combination treatment with CEP-33779 and allopurinol on cell viability of NCI-H1975 and HCC827 cell lines. Combination therapy significantly decreased cell viability compared to allopurinol treatment alone. Percent of cell viability compared to vehicle-control is shown. (mean±SEM, one-way ANNOVA, *p<0.05, **p<0.01, ****p<0.0001) F) Combination Index (CI) for different doses of CEP-3379 and allopurinol in NCI-H1975 and HCC827 cell lines. CI lower than 1 indicates a synergistic effect; which is the case for most of the combination doses. G) Combination therapy with CEP-33779 and allopurinol in xenograft models using 3 different cell lines. PBS, allopurinol (200mg/kg) alone, CEP-33779 (10mg/kg) alone and combination doses of allopurinol (200mg/kg) and CEP-33779 (10mg/kg) were administered by oral gavage 3 times a week. [placebo group (n= 6 for HCC827, n=6 for NCI-H1650 and n=5 for NCI-H1975), allopurinol treatment group (n= 6 for HCC827, n=6 for NCI-H1650 and n=5 for NCI-H1975), CEP-33779 treatment group (n=7 for HCC827, n=6 for NCI-H1650 and n=6 for NCI-H1975) and combination therapy group (n=7 for HCC827, n=7 for NCI-H1650 and n=8 for NCI-H1975)]. The tumor size in mice bearing HCC827 and NCI-H1650 and receiving combination therapy was significantly decreased compared to placebo and single treatment groups at day 18. The tumor size in mice bearing NCI-H1975 and receiving combination therapy was significantly decreased compared to placebo and single treatment groups at day 14 (mean+SEM, unpaired t-test, *p<0.05, **p<0.01, ****p<0.0001)      C) IC50 comparison for the predicted cell lines shown as mean±SEM; error bars are 95% confidence intervals. The IC50 cut-off to determine sensitive and resistant cells was considered to be 755µM (red dotted line). D) Concentration-response curves for allopurinol treatment in predicted sensitive and resistant cell lines (mean±SEM). E) Effects of combination treatment with allopurinol and CEP-33779 on two cell lines predicted as resistant (COR-L105 and NCI-H1568). Combination therapy significantly decreased cell viability compared to treatments with allopurinol alone. Percent of cell viability compared to vehicle-control is shown. (mean+SEM, one-way ANNOVA, *p<0.05, **p<0.01) F) CI for different concentrations of allopurinol and CEP-33779 in COR-L105 and NCI-H1568 cell lines; most of the combinations showed synergistic effects (CI < 1).   A) Flowchart describing how the sensitive and resistant PDX models were selected from a set of available PDX models. B, C and D) Post-treatment tumor size and tumor weight in 4 PDX models; TM01563 and TM00206 models were predicted as sensitive to allopurinol and TM00188 and TM00939 models were predicted as resistant to allopurinol based on the genomic signatures. Tumor size was significantly lower in the allopurinol group compared to placebo group and at the end of the study tumor weights were significantly lower in the allopurinol group compared to placebo group in TM01563 and TM00206 models. Combination therapy decreased the post-treatment tumor size and tumor weight significantly in the TM00188 and TM00939 models compared to single treatment with allopurinol while allopurinol alone and CEP_33779 alone were not able to decrease the tumor size compared to placebo (PBS). (mean+SEM, unpaired t-test, *p<0.05, **p<0.01, ***p<0.001) (Allopurinol (70 mg/kg daily), CEP-3379 (10 mg/kg daily), combination therapy (Allopurinol 50 mg/kg and CEP-33779 2.5 mg/kg daily) and PBS as placebo daily)