SIPA1L3 methylation modifies the benefit of smoking cessation on lung adenocarcinoma survival: an epigenomic–smoking interaction analysis

Smoking cessation prolongs survival and decreases mortality of patients with non-small-cell lung cancer (NSCLC). In addition, epigenetic alterations of some genes are associated with survival. However, potential interactions between smoking cessation and epigenetics have not been assessed. Here, we conducted an epigenome-wide interaction analysis between DNA methylation and smoking cessation on NSCLC survival. We used a two-stage study design to identify DNA methylation–smoking cessation interactions that affect overall survival for early-stage NSCLC. The discovery phase contained NSCLC patients from Harvard, Spain, Norway, and Sweden. A histology-stratified Cox proportional hazards model adjusted for age, sex, clinical stage, and study center was used to test DNA methylation–smoking cessation interaction terms. Interactions with a false discovery rate q-value ≤ 0.05 were further confirmed in a validation phase using The Cancer Genome Atlas database. Histology-specific interactions were identified by stratification analysis in lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) patients. We identified one CpG probe (cg02268510 SIPA1L3) that significantly and exclusively modified the effect of smoking cessation on survival in LUAD patients [hazard ratio (HR) interaction = 1.12; 95% confidence interval (CI): 1.07–1.16; P = 4.30 × 10⁻⁷]. Further, the effect of smoking cessation on early-stage LUAD survival varied across patients with different methylation levels of cg02268510 SIPA1L3. Smoking cessation benefited only LUAD patients with low methylation (HR = 0.53; 95% CI: 0.34–0.82; P = 4.61 × 10⁻³) and not those with medium or high methylation (HR = 1.21; 95% CI: 0.86–1.70; P = 0.266) of cg02268510 SIPA1L3. Moreover, there was an antagonistic interaction between elevated methylation of cg02268510 SIPA1L3 and smoking cessation (HR interaction = 2.1835; 95% CI: 1.27–3.74; P = 4.46 × 10⁻³).
In summary, smoking cessation benefited survival of LUAD patients with low methylation at cg02268510 SIPA1L3. The results have implications for not only smoking cessation after diagnosis, but also possible methylation-specific drug targeting.

Ruyang Zhang and Linjing Lai contributed equally to the work. David C. Christiani is a senior author who supervised the work.

Introduction
Lung cancer is a leading cause of cancer mortality worldwide. In the United States, lung cancer was estimated to account for 154 050 deaths in 2018, or one-fourth of all cancer deaths (Siegel et al., 2017). A large proportion of lung cancer cases are attributed to smoking, a well-known risk factor (Flanders et al., 2003), and smoking cessation prolongs survival and decreases mortality of lung cancer patients (Balduyck et al., 2011; Parsons et al., 2010). However, the underlying mechanisms of these benefits remain largely unclear (Bhatt et al., 2015; Parsons et al., 2010).
DNA methylation, a reversible epigenetic modification, regulates gene expression and provides potential cancer biomarkers and therapeutic targets (Egger et al., 2004; Feinberg and Tycko, 2004), including for non-small-cell lung cancer (NSCLC) (Shen et al., 2018; Wei et al., 2018). Furthermore, as a potential mechanistic link between cigarette smoking and disease, DNA methylation changes can result from various environmental exposures and may explain part of the association between smoking and cancer recurrence or mortality (Lee and Pausova, 2013; Shui et al., 2016).
Progression of complex diseases, such as cancer, results from interactions between clinical, environmental, genetic, and epigenetic factors (Lacombe et al., 2016;Mcnerney et al., 2017). However, most epigenome-wide association studies are designed to identify main effects using a standard marginal test (Karlsson et al., 2014) while ignoring epigenetic-environment interactions. These traditional mining procedures may reduce the power to identify new epigenomic biomarkers (Slade and Kraft, 2016).
In this study, we hypothesized that epigenetic and smoking cessation interactions may affect NSCLC survival. Epigenome-wide DNA methylation data composed of four study cohorts containing lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cases were used for discovery, and the findings were independently validated in The Cancer Genome Atlas (TCGA) data.

Study population
Early-stage (stage I-II) LUAD and LUSC patients who were former or current smokers were included in the study. Never smokers were defined as those who smoked ≤ 100 cigarettes over a lifetime. Current smokers were defined as those who were smoking within 1 year of diagnosis. Former smokers were defined as smokers who quit > 1 year before diagnosis or interview (Suk et al., 2006). We encoded the variable smoking cessation as 'yes' for former smokers and 'no' for current smokers. Data were harmonized from five international study centers, which have been previously described (Shen et al., 2018; Wei et al., 2018; Zhang et al., 2019). All patients provided written informed consent, and the study methodologies at each center conformed to the standards set by the Declaration of Helsinki and received approval from the respective institutional review board.

Harvard
The Harvard Lung Cancer Study cohort was described previously (Suk et al., 2006). Briefly, all patients were recruited at Massachusetts General Hospital (MGH) from 1992 to present and had newly diagnosed, histologically confirmed primary NSCLC. We included 133 early-stage LUAD and LUSC patients who were former or current smokers for the current study. DNA was extracted from tumor specimens that were evaluated by an MGH pathologist for amount (tumor cellularity > 70%) and quality of tumor cells and histologically classified using World Health Organization criteria. The study protocol was approved by the Institutional Review Boards at Harvard School of Public Health and MGH.

Spain
The Spain study population was reported previously (Sandoval et al., 2013) and included 196 LUAD and LUSC patients recruited at eight subcenters from 1991 to 2009. In brief, tumor DNA was extracted from fresh-frozen tumor specimens that were collected by surgical resection, and the median clinical follow-up was 7.2 years. The study was approved by the Bellvitge Biomedical Research Institute institutional review board.

Norway
The Norway cohort consisted of 116 LUAD patients with operable lung cancer tumors who were seen at Oslo University Hospital, Rikshospitalet, Norway, in 2006 (Bjaanaes et al., 2016). Tumor tissues obtained during surgery were snap-frozen in liquid nitrogen and stored at −80 °C until DNA isolation. The project was approved by the Oslo University institutional review board and regional ethics committee (S-05307).

Sweden
The Sweden cohort included 85 LUAD and LUSC patients. Tumor DNA was collected from early-stage lung cancer patients who underwent an operation at the Skane University Hospital, Lund, Sweden (Karlsson et al., 2014). The study was approved by the Regional Ethical Review Board in Lund, Sweden (Registration no. 2004/762 and 2008/702).

TCGA
The TCGA database contains 562 early-stage LUAD and LUSC patients who have full information on survival time and covariates. Level 1 HumanMethylation450 DNA methylation data (image data) for each patient were downloaded on October 1, 2015.

Quality control procedures
DNA methylation was profiled using Infinium HumanMethylation450 BeadChips (Illumina Inc., San Diego, CA, USA) for all patients. Raw image data were imported into GenomeStudio Methylation Module V1.8 (Illumina Inc.) to calculate methylation signals and to perform normalization, background subtraction, and quality control (QC). Beta values, which range from 0% (unmethylated) to 100% (methylated), were used to measure the methylation level of each probe. Probes were excluded if they met any one of the following QC criteria: (a) failed detection (P > 0.05) in > 5% of samples; (b) coefficient of variation of < 5%; (c) methylated or unmethylated in all samples; (d) common single nucleotide polymorphisms located in the probe sequence or 10-bp flanking regions; (e) cross-reactive or cross-hybridizing probes; or (f) did not pass QC in all centers. Samples with > 5% undetectable probes were excluded. Methylation signals were further processed for quantile normalization, design bias correction for type I and II probes, and batch effect adjustment using ComBat correction (Marabita et al., 2013). We performed the QC procedures above in each center separately and then merged all data together before association analysis. Details of QC processes are described in Fig. S1.
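As an illustration of criteria (a) and (b) and of the sample-level filter, the following is a minimal sketch and not the pipeline used in the study. The function names, the 0-1 beta scale, and the use of the sample standard deviation for the coefficient of variation are assumptions made here; criteria (c) to (f) depend on probe annotation and cross-center information and are not shown.

```python
from statistics import mean, stdev


def probe_passes_qc(betas, detection_p, p_cut=0.05, max_fail_frac=0.05, min_cv=0.05):
    """Check criteria (a) and (b) for one probe within one center (hypothetical helper).

    betas: per-sample beta values for the probe (assumed 0-1 scale).
    detection_p: per-sample detection P-values for the same probe.
    """
    # (a) exclude if detection failed (P > p_cut) in more than 5% of samples
    fail_frac = sum(p > p_cut for p in detection_p) / len(detection_p)
    if fail_frac > max_fail_frac:
        return False
    # (b) exclude if the coefficient of variation is below 5%
    m = mean(betas)
    if m == 0:
        return False  # CV undefined; the probe carries no signal
    return stdev(betas) / m >= min_cv


def sample_passes_qc(detection_p, p_cut=0.05, max_fail_frac=0.05):
    """Keep a sample unless more than 5% of its probes are undetectable."""
    fail_frac = sum(p > p_cut for p in detection_p) / len(detection_p)
    return fail_frac <= max_fail_frac
```

In this sketch a probe is kept only if both checks pass in a given center; applying criterion (f) would then require intersecting the retained probe sets across centers.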

Gene expression data
Expression and mRNA sequencing data were available for 281 LUAD and 277 LUSC patients of the TCGA dataset (Table S1). TCGA mRNA sequencing data processing and QC were done by the TCGA workgroup. Raw counts were normalized using RNA sequencing by expectation maximization. Level 3 gene quantification data were downloaded from the TCGA data portal (https://tcga-data.nci.nih.gov; now hosted at https://portal.gdc.cancer.gov) and were further checked for quality. Gene expression data were extracted and log2-transformed before analysis.

Epigenome-wide DNA methylation-smoking cessation interaction analysis
Analysis flow is described in Fig. 1. Patients from the first four study centers (Harvard, Spain, Norway, and Sweden) were assigned to the discovery phase. A histology-stratified Cox proportional hazards model was used to test the interaction term, which was the interaction effect between DNA methylation of each CpG probe and smoking cessation (CpG probe × smoking cessation) on overall survival. The model was adjusted for age, sex, smoking cessation, clinical stage, and study center. Hazard ratios (HRs) and 95% confidence intervals (CIs) were reported per 1% methylation increment. Multiple testing correction was performed using the false discovery rate method (FDR, measured by the FDR-q value) with the Benjamini-Hochberg procedure. CpG probes with interaction FDR-q ≤ 0.05 were tested for replication in the validation phase using the TCGA dataset. Probes were considered robustly significant if they met both criteria: (a) interaction P ≤ 0.05 in the validation phase; and (b) consistent effect direction in the discovery and validation phases. We performed stratified analysis for robustly significant CpG probes in LUAD and LUSC patients. Finally, CpG probes with a significant interaction with smoking cessation in both phases were identified as histology-specific probes.
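The Benjamini-Hochberg step can be sketched as follows. This is a generic, standard-library illustration of the procedure and not the code used in the study (which was run in R); the function name is an assumption made here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input P-values."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that q-values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Probes whose interaction q-value is <= 0.05 would move on to validation
interaction_p = [0.01, 0.04, 0.03, 0.005]
selected = [i for i, q in enumerate(benjamini_hochberg(interaction_p)) if q <= 0.05]
```

With hundreds of thousands of probes, only very small interaction P-values reach a q-value of 0.05 or less, which is why the validation criteria that follow use the nominal threshold P ≤ 0.05.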

Sensitivity analysis for significant CpG probes
Because the complex tumor microenvironment includes noncancerous components that might alter the analysis of tumor samples (Aran et al., 2015), we assessed tumor purity with InfiniumPurify (Zhang et al., 2015) using methylation array data from TCGA samples. Tumor purity was included as an additional covariate in the Cox regression model for sensitivity analysis.

Genome-wide methylation transcription analysis
For robustly significant histology-specific prognostic CpG probes, we also performed genome-wide methylation transcription analysis using mRNA sequencing data from TCGA. The correlation between DNA methylation and gene expression was tested using a linear regression model adjusted for the same covariates mentioned above. Association with FDR-q ≤ 0.05 was considered significant. Additionally, we tested the association between gene expression and overall survival using a Cox proportional hazards model adjusted for the same covariates. Genes involved in significant associations with both methylation and NSCLC survival were filtered.

Statistical analysis
Continuous variables were summarized as mean ± standard deviation (SD), and categorical variables were described by frequency (n) and proportion (%). Kaplan-Meier survival curves were used to compare survival differences among subgroups. Statistical analyses were performed using R version 3.4.4 (The R Foundation for Statistical Computing).
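For readers unfamiliar with the Kaplan-Meier estimator, the sketch below shows the product-limit calculation on toy data. It is a standard-library illustration only; the study's analyses were run in R, and the function name and example values are assumptions made here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t
        n_at_risk -= leaving
    return curve


toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Patients censored at a given time are counted as still at risk at that time, which is the usual convention; curves for subgroups are obtained by calling the function once per subgroup.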

Results
After QC, epigenome-wide DNA methylation data including 311 891 CpG sites from 1092 tumor samples of early-stage (stage I-II) NSCLC patients were retained. There were 530 patients (N LUAD = 413 and N LUSC = 117) in the discovery phase and 562 patients (N LUAD = 285 and N LUSC = 277) in the validation phase. Table 1 presents information for the study population. There were 37% and 27% current smokers in the discovery and validation phases, respectively.
After including tumor purity as an additional covariate in sensitivity analysis, DNA methylation at cg02268510 SIPA1L3 retained a significant interaction with smoking cessation on LUAD survival (HR interaction = 1.18; 95% CI: 1.02-1.36; P = 0.024). The interaction P-value was still significant but slightly inflated due to (a) the smaller sample size (51% of original) of the sensitivity analysis, which was only performed in TCGA; and (b) low tumor purity (~60%) for NSCLC samples in TCGA due to mixed cell types (Zheng et al., 2017).
To better illustrate the interaction pattern between DNA methylation and smoking cessation, patients were categorized into low, medium, and high groups based on tertiles of cg02268510 SIPA1L3 methylation. The effect of smoking cessation varied across LUAD patients with different DNA methylation levels.
These results also indicated that LUAD patients who did not quit smoking (current smokers) had the poorest prognosis if their cg02268510 SIPA1L3 methylation was low. We therefore combined the medium and high methylation groups for further analysis. Current smokers in the low methylation group had 1.94 times the mortality risk of those in the medium or high methylation group (Fig. 3A), but there was no statistically significant difference between groups for former smokers (Fig. 3B). The results also indicated that smoking cessation was particularly urgent for LUAD patients with low methylation of cg02268510 SIPA1L3.
In addition, we evaluated the joint effect of CpG methylation level (medium-high vs low) and smoking cessation (yes vs no) on LUAD survival (Table 2). We used the poorest-prognosis group (current smokers with low methylation) as the reference to evaluate the effects of elevated methylation level, smoking cessation, and their interaction. In the combined dataset, the effect of smoking cessation was HR = 0.5506 (95% CI: 0.36-0.84; P = 5.62 × 10⁻³) and the effect of medium-high methylation of cg02268510 SIPA1L3 was HR = 0.5214 (95% CI: 0.34-0.81; P = 3.48 × 10⁻³). However, the joint effect was HR = 0.6268 (95% CI: 0.43-0.92; P = 1.84 × 10⁻²), which was greater than the product of the two individual protective effects (0.5506 × 0.5214 = 0.2871). The joint effect of the two protective factors was thus less protective than expected, indicating an antagonistic interaction between elevated methylation of cg02268510 SIPA1L3 and smoking cessation (HR interaction = 2.1835; 95% CI: 1.27-3.74; P = 4.46 × 10⁻³).
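The multiplicative-scale arithmetic behind this interaction estimate can be checked directly from the reported hazard ratios. The snippet below is a small worked example, not part of the original analysis; the variable names are chosen here for illustration, and the inputs are the rounded values quoted above.

```python
import math

# Hazard ratios reported in the text, relative to current smokers with low methylation
hr_cessation = 0.5506    # smoking cessation alone
hr_methylation = 0.5214  # medium-high methylation alone
hr_joint = 0.6268        # smoking cessation together with medium-high methylation

# Joint effect expected if the two factors combined multiplicatively with no interaction
expected_joint = hr_cessation * hr_methylation

# Interaction on the multiplicative scale: observed joint effect over expected joint effect
hr_interaction = hr_joint / expected_joint

# Equivalent statement on the log-hazard scale, where the interaction is additive
beta_interaction = (
    math.log(hr_joint) - math.log(hr_cessation) - math.log(hr_methylation)
)

print(f"expected joint HR without interaction: {expected_joint:.4f}")
print(f"interaction HR: {hr_interaction:.4f}")
```

An interaction HR above 1 for two protective factors means that their combined benefit is smaller than the product of the individual benefits, which is what the text describes as antagonism. Because the inputs are rounded to four decimals, the recomputed ratio can differ from the reported 2.1835 in the last digits.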
A growing body of research has reported potential associations of DNA methylation with age and smoking (Fraga and Esteller, 2007; Wan et al., 2012; Zaghlool et al., 2015). Therefore, we also tested the association between methylation of cg02268510 SIPA1L3 and age, as well as smoking-related variables (pack-years of smoking, years of smoking, and years of smoking cessation), using a linear regression model adjusted for age, sex, clinical stage, and study center. Smoking-related characteristics of former and current smokers in early-stage LUAD are described in Table S3. There was no significant association between methylation of cg02268510 SIPA1L3 and age (β = −0.01; P = 0.521) or years of smoking (β = 0.03; P = 0.210), but methylation was significantly associated with pack-years of smoking (β = 0.02; P = 3.42 × 10⁻³) and with years of smoking cessation (β = −0.06; P = 5.08 × 10⁻³) in former-smoker LUAD patients (Fig. S4). Further, because cg02268510 SIPA1L3 maps to SIPA1L3, the association between cg02268510 SIPA1L3 and SIPA1L3 expression was evaluated using the TCGA dataset. We observed a significant association between cg02268510 SIPA1L3 and SIPA1L3 expression (β = −0.02; P = 0.015) in LUAD patients (Fig. 4), indicating that cg02268510 SIPA1L3 cis-regulates gene expression. Moreover, genome-wide methylation-transcription analysis revealed that expression of 633 genes was significantly correlated with the methylation level of cg02268510 SIPA1L3 (Fig. S5A). Among them, expression of only seven genes was significantly associated with overall survival: growth arrest and DNA damage-inducible gamma (GADD45G), maturin (MTURN), TMEM200B, RGS20, RELT-like 1 (RELL1), PGM2, and receptor-interacting serine/threonine kinase 2 (RIPK2; Fig. S5B-H).

Table 1 footnotes: Censored rate is the proportion of samples lost to follow-up or alive at the study end. Restricted mean survival time is provided because the median was not available. *A statistically significant difference (P ≤ 0.05) was observed between the combined discovery set and the validation set (TCGA).

Discussion
In this study, we systematically evaluated all pairwise DNA methylation-smoking cessation interactions on an epigenome-wide scale and further confirmed these interactions in an independent population. To our knowledge, this is the first study with a large sample size to investigate interactions between DNA methylation and smoking behavior on lung cancer survival, and it provides new evidence to account for the missing heritability of complex diseases (Trerotola et al., 2015). Our results show that the effect of smoking cessation on early-stage LUAD patient survival varies with the methylation level of cg02268510 SIPA1L3. Smoking cessation only benefits LUAD patients with low methylation, rather than medium or high methylation, of cg02268510 SIPA1L3. Further, there is an antagonistic interaction between elevated methylation of cg02268510 SIPA1L3 and smoking cessation.
We found that in LUAD patients with low methylation of cg02268510 SIPA1L3, current smokers with more cumulative exposure had worse survival than former smokers. However, for a population with medium-high methylation, the prognosis of current smokers was similar to that of former smokers. The effect of smoking cessation is therefore modified by DNA methylation level, indicating opportunities for epi-drug intervention due to the inherent reversibility of epigenetic events (Wright, 2013).
Up to 50% of lung cancer patients are estimated to keep smoking after diagnosis or to frequently relapse after smoking cessation (Park et al., 2012; Walker et al., 2006). Our results indicated that smoking cessation was especially urgent for LUAD patients with low methylation of cg02268510 SIPA1L3. On the other hand, reduced methylation of cg02268510 SIPA1L3 might strengthen the protective effect of smoking cessation on survival.
Many studies have reported significant associations between smoking cessation and overall survival (Koshiaris et al., 2017; Nia et al., 2005), while other studies have reported negative results (Baser et al., 2006; Parsons et al., 2010). Based on our interaction analysis, we suspect that epigenetic modifications might account for this inconsistency. Because the effect of smoking cessation varies across populations with different methylation levels of cg02268510 SIPA1L3, the effect could be neutralized in a population of patients with mixed cg02268510 SIPA1L3 methylation levels. Thus, the traditional marginal test for association between smoking cessation and cancer survival inherently loses statistical power to report significant findings due to complex association patterns.
SIPA1L3, the gene in which cg02268510 is located, encodes GTPase-activating proteins (GAPs) specific for the GTP-binding protein Ras-associated protein-1 (RAP1), which is implicated in regulation of cell adhesion, cell polarity, and cytoskeletal organization (Kooistra et al., 2007). SIPA1L3 is a member of the SPA1 family of RapGAPs, which play a crucial role in spatiotemporal control of Rap1 activation in cells (Mochizuki et al., 2001). Rap1 plays many roles during cell invasion and metastasis in different cancers. Additionally, overexpression of RAP1 may desensitize NSCLC cells to cisplatin, a first-line drug to treat NSCLC (Besse et al., 2014). Our results suggest that low methylation at cg02268510 SIPA1L3 might promote SIPA1L3 expression, further leading to Rap1 activation and resulting in poor prognosis (Fig. 5).

Table 2 footnotes: (a) Patients were categorized into two groups (medium-high vs low) by tertiles of cg02268510 SIPA1L3 methylation level. (b) Main effects of elevated methylation and smoking cessation and their joint effect and interaction were derived from the Cox proportional hazards model adjusted for covariates. (c) Interaction = Joint effect ÷ (main effect 1 × main effect 2); 2.1835 = 0.6268 ÷ (0.5506 × 0.5214).
Many of the deleterious effects of smoking are due to induction of inflammatory responses that contribute to lung cancer progression (Crusz and Balkwill, 2015; Walser et al., 2008). In vitro experiments in human umbilical vein endothelial cells demonstrate that nicotine stimulates cellular inflammatory responses by activating the NF-κB transcription factor axis through a second messenger pathway (Ueno et al., 2006). Activation of NF-κB, one of the most investigated transcription factors, controls multiple cellular processes in cancer, including inflammation, transformation, proliferation, angiogenesis, invasion, metastasis, chemoresistance, and radioresistance (Chaturvedi et al., 2010). Nicotine protects NSCLC cells against chemotherapy-induced apoptosis and serum deprivation-induced apoptosis through NF-κB, and NF-κB activity is also directly stimulated by nicotine (Anto et al., 2002; Tsurutani et al., 2005). Therefore, for current smokers, nicotine in tobacco stimulates activation of NF-κB, induces inflammatory responses, and is relevant to poor patient prognosis (Fig. 5).
Moreover, Rap1 is an essential modulator of NF-κB-mediated pathways. NF-κB is induced by ectopic expression of Rap1, whereas its activity is inhibited by Rap1 depletion (Teo et al., 2010). Furthermore, levels of Rap1 are positively regulated by NF-κB, and human breast cancers with NF-κB hyperactivity show elevated levels of cytoplasmic Rap1 (Teo et al., 2010). Thus, positive feedback mechanisms might exist between Rap1 expression and NF-κB activation (Fig. 5). In terms of the interaction between cg02268510 SIPA1L3 and smoking cessation, continued smoking was associated with poor prognosis only in LUAD patients with low methylation, rather than medium or high methylation, possibly because high activation of both Rap1 and NF-κB may only occur in patients with low methylation.
We also found that methylation of cg02268510 SIPA1L3 increased with more pack-years of smoking but decreased with more years of smoking cessation. As presented in Fig. 5, low methylation combined with continued smoking resulted in the worst prognosis, which might be due to the positive feedback between Rap1 and NF-κB. However, methylation of cg02268510 SIPA1L3 increased with the cumulative amount of smoking, and high methylation resulted in low SIPA1L3 expression, which made it hard to activate Rap1 and NF-κB and might thereby weaken the harmful effect of smoking; this is also consistent with an antagonistic effect. This implies that there might be a self-protective mechanism in the human body that prevents excessive damage from exposure. As reported, smoking increases reactive oxygen species (ROS) production and is a significant source of oxidative stress (Athanasios et al., 2013), but in vivo, a variety of antioxidant defense mechanisms exist to counteract the detrimental effects of ROS by regulating the production of free radicals and their metabolites (Deponte, 2013; He et al., 2017). The increased superoxide dismutase enzyme levels observed in the blood and saliva of smokers may be an adaptive defense mechanism to counteract increased ROS production (Jenifer et al., 2015). Moreover, a previous study found that activation of Rap1 serves to attenuate ROS production (Remans et al., 2004), and there is a potential interrelationship between Rap1, ROS, and NF-κB activation (Moon et al., 2011). Nevertheless, further functional studies are warranted to elucidate the mechanism of the interaction between cg02268510 SIPA1L3 and smoking cessation on LUAD survival.
GADD45G is a member of the GADD45 family, which plays an essential role in cellular stress response, survival, senescence, and apoptosis regulation (Liebermann et al., 2011). GADD45G has been reported to be a tumor suppressor in multiple cancer types and can inhibit cell growth and induce apoptosis (Ying et al., 2005). Patients with high GADD45G expression had a better prognosis in our study. MTURN is a neural progenitor differentiation regulator homolog. 12-O-tetradecanoylphorbol-13-acetate (TPA) is an effective cancer therapeutic reagent for myelocytic leukemia patients (Han et al., 1998), and MTURN is TPA-responsive and may promote both leukemic and normal megakaryocyte differentiation (Sun et al., 2014). Indeed, differentiation therapy by forced differentiation of cancer cells has been successful in curing acute promyeloid leukemia (Chen et al., 2011). Similarly, LUAD patients with high MTURN expression had favorable survival in our study. RGS20 is suggested to promote cellular characteristics that contribute to metastasis, including enhanced cell aggregation, motility, and invasion. Selective inhibition of RGS20 expression may represent an alternative means to suppress metastasis (Yang et al., 2016). Its high expression is significantly associated with progression and prognosis of triple-negative breast cancer, and our study showed similar results in LUAD patients. Although explicit evidence linking these genes to smoking is lacking, our findings may inspire functional studies of these candidate genes and help to complete the picture of the mechanistic pathway underlying the interaction between cg02268510 SIPA1L3 and smoking cessation on LUAD survival.
Our study has some significant strengths. First, this is the first study to investigate the interaction between DNA methylation and smoking cessation on lung cancer survival on an epigenome-wide scale, which provides new evidence to account for the missing heritability of complex diseases (Trerotola et al., 2015). Second, the two-stage study design we used to exhaustively search for interactions, as well as the sensitivity analysis, is quite conservative in controlling for false positives. Third, our study included a large sample size to analyze DNA methylation-smoking cessation interactions of early-stage NSCLC prognosis, providing an opportunity to identify complex associations with small-medium effect size.
Despite the strengths of our study, we acknowledge some limitations. First, the data measured categorical smoking cessation rather than smoking pack-years, which may reduce the power of the study. Second, smoking cessation was collected at the time of diagnosis and was not reassessed during follow-up. Previous studies have found that 'former smokers' might more accurately represent a mixed exposure status, since quitters are more likely to relapse (Hughes et al., 2004; Walker et al., 2006). Thus, we likely underestimated the benefits of smoking cessation. Third, the association between cg02268510 SIPA1L3 and expression of several genes requires more biological evidence, though methylation is believed to play a crucial role in regulating gene expression (Bird, 2007) and to further influence disease gene function (Schübeler, 2015), cell differentiation, or reprogramming (Khavari et al., 2010). Functional experiments are therefore warranted to confirm these associations, and our findings should be biologically interpreted with caution for now. In addition, our study consisted mainly of a Caucasian population (89.19%), since TCGA data contained only ~10% non-Caucasian samples. Our results should therefore be translated with caution to other populations. Lastly, the censored rate of survival time for the TCGA population is relatively high, since early-stage NSCLC patients need longer follow-up time. Thus, the validation phase using the TCGA population had low statistical power. However, we still successfully replicated one significant interaction, indicating a quite conservative and robust result (Leung et al., 1997; Watt et al., 1996).

Conclusion
This epigenome-wide DNA methylation-smoking cessation interaction analysis of early-stage NSCLC identified one LUAD-specific CpG probe, cg02268510 SIPA1L3, which could significantly modify effects of smoking cessation on lung cancer survival.
Smoking cessation benefited survival of LUAD patients with low methylation at cg02268510 SIPA1L3. These results have implications for not only smoking cessation after diagnosis, but also possible methylation-specific drug targeting.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Fig. S1. Quality control processes for DNA methylation chip data. Fig. S2. Manhattan plot of DNA methylation-smoking cessation interaction P-values (A) and main effect P-values (B) derived from histology-stratified Cox proportional hazards model in the discovery phase. Fig. S3. Fixed-effect meta-analysis of interaction between DNA methylation of cg02268510 and smoking cessation for LUAD patients from five centers. Fig. S4. Linear regression analysis between methylation of cg02268510 and age (A) as well as smoking-related variables: pack-years of smoking (B), years of smoking (C), and years of smoking cessation (D), adjusted for age, sex, smoking status, clinical stage, and study center. Table S1. Demographic and clinical characteristics of early-stage NSCLC patients in TCGA dataset. Table S2. Results for 15 methylation-smoking interactions using a two-stage association study. Table S3. Smoking-related characteristics of former and current smokers in early-stage LUAD.