Identification of key miRNAs and genes associated with stomach adenocarcinoma from The Cancer Genome Atlas database

Stomach adenocarcinoma (STAD) is the second leading cause of cancer death, and a fuller understanding of its molecular basis is needed to develop new therapeutic targets. miRNA and mRNA data were downloaded from The Cancer Genome Atlas database, and the differentially expressed miRNAs and genes were identified. The target genes of the differentially expressed miRNAs were screened by prediction tools, and the biological function of these target genes was investigated. Several key miRNAs and their target genes were selected for validation using quantitative real-time polymerase chain reaction (qRT-PCR). A Gene Expression Omnibus (GEO) dataset was used to verify the expression of selected miRNAs and target genes. The diagnostic value of the identified miRNAs and genes was assessed by receiver operating characteristic analysis. A total of 1248 differentially expressed genes were identified in STAD. Additionally, nine differentially expressed miRNAs were identified, and target gene detection yielded 160 target genes of these nine miRNAs. These target genes were significantly enriched in the calcium signaling pathway and bile secretion. qRT-PCR confirmed the expression of several key miRNAs and their target genes. The expression levels of hsa-miR-145-3p, hsa-miR-145-5p, ADAM12, ACAN, HOXC11 and MMP11 in the GEO database were consistent with the bioinformatics results. hsa-miR-139-5p, hsa-miR-145-3p and MMP11 have potential diagnostic value for STAD. Differential expression of the mature forms of miRNAs (hsa-miR-139-5p, hsa-miR-145-3p, hsa-miR-145-5p and hsa-miR-490-3p) and of genes including ADAM12, ACAN, HOXC11 and MMP11, together with the calcium signaling and bile secretion pathways, may play important roles in the development of STAD.

According to the Lauren classification, there are two main types of STAD: the diffuse type and the intestinal type. Risk factors for STAD include smoking, a high-salt diet, chronic gastritis and Helicobacter pylori infection [3]. To date, the principal treatment of STAD has been gastrectomy accompanied by chemotherapy and radiation therapy. Despite advances in treatment, the 5-year survival rate is 5-15% [4]. Therefore, understanding the pathogenesis of STAD and searching for new therapeutic targets are urgent issues.
Recently, several studies have improved our understanding of the molecular mechanisms and signaling pathways underlying tumorigenesis in STAD. For example, some tyrosine kinase receptors, including ERBB2, EGFR, FGFR2 and MET, are activated, which leads to tumorigenesis in STAD [5]. Moreover, the phosphoinositide-3-kinase-Akt signaling pathway is activated and results in aggressive proliferation of STAD tumors [6][7][8].
The Cancer Genome Atlas (TCGA) database is an application platform for genome analysis consisting of large-sample genome sequencing data for 33 types of cancer, including STAD [9]. miRNAs function as post-transcriptional regulators that can repress translation or promote degradation or cleavage of complementary target mRNA sequences [10], and they have emerged as key players in the pathogenesis of STAD [11]. In this study, we downloaded the miRNA and mRNA data for STAD from the TCGA database and identified several differentially expressed miRNAs and a number of differentially expressed genes (DEGs). We then obtained the target genes of these differentially expressed miRNAs and analyzed their biological function. We used quantitative real-time polymerase chain reaction (qRT-PCR) to validate some of the bioinformatics results. The Gene Expression Omnibus (GEO) was used to verify the expression of selected miRNAs and target genes. Finally, the diagnostic value of the identified miRNAs and genes was assessed by receiver operating characteristic (ROC) analysis. These findings may help us to understand the progression and development of STAD.

Identification of differentially expressed miRNAs and genes
From the TCGA database (http://cancergenome.nih.gov), we downloaded miRNA (IlluminaHiSeq_miRNASeq) data for 84 samples (42 case samples and 42 normal samples) and mRNA (IlluminaHiSeq_RNASeqV2) data for 64 samples (32 case samples and 32 normal samples). Differential expression between normal and case samples was assessed using the R/Bioconductor package DESeq2, and P-values were calculated. The P-values were adjusted for multiple testing using the Benjamini-Hochberg procedure. Differentially expressed miRNAs were screened with thresholds of false discovery rate (FDR) < 0.001, log2 fold change > 2 and base mean > 100. The DEGs were screened with thresholds of FDR < 0.001 and absolute value of log2 fold change > 2.
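The screening step described above can be illustrated with a minimal sketch. It is not the DESeq2 pipeline used in the study; it only shows, under the assumption that per-gene P-values and log2 fold changes are already available, how a Benjamini-Hochberg adjustment and the DEG thresholds (FDR < 0.001, absolute log2 fold change > 2) could be applied. The `Result` type, the function names and the gene labels are hypothetical.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    """One tested gene (hypothetical container, not a DESeq2 object)."""
    name: str
    log2_fold_change: float
    p_value: float


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_degs(results: List[Result],
                fdr_cutoff: float = 0.001,
                lfc_cutoff: float = 2.0) -> List[str]:
    """Keep genes with FDR < fdr_cutoff and |log2 fold change| > lfc_cutoff."""
    fdr = benjamini_hochberg([r.p_value for r in results])
    return [r.name for r, q in zip(results, fdr)
            if q < fdr_cutoff and abs(r.log2_fold_change) > lfc_cutoff]


# Illustrative values only; these are not data from the study.
example = [
    Result("GENE_A", 3.1, 1e-8),
    Result("GENE_B", -2.5, 1e-6),
    Result("GENE_C", 1.0, 1e-9),
    Result("GENE_D", 4.0, 0.2),
]
print(screen_degs(example))
```

In this toy input, GENE_C fails the fold-change cutoff and GENE_D fails the FDR cutoff. The additional base mean filter applied to miRNAs is omitted here for brevity.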

Functional annotation of target genes
To further investigate the biological function, all DEGs and target genes of differentially expressed miRNAs were analyzed in the context of several databases such as Gene Ontology (GO) functional categories and the Kyoto Encyclopedia of Genes and Genomes (KEGG) biochemical

qRT-PCR validation
Among the identified differentially expressed miRNAs and their target genes, we selected those whose expression differed significantly and that were associated with STAD. Based on this criterion, hsa-miR-139-5p, hsa-miR-145-3p, hsa-miR-145-5p, hsa-miR-490-3p and their target genes ADAM12, ACAN, HOXC11 and MMP11 were selected for validation. Three tumor tissues and three para-carcinoma tissues were obtained from participating individuals immediately after surgery. The collected tissues were frozen in liquid nitrogen for subsequent RNA extraction. All participating individuals provided informed consent, and the study was approved by the China-Japan Friendship Hospital. Total RNA was extracted from the tissue samples using TRIzol® Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocols. One microgram of RNA was used to synthesize cDNA with SuperScript® III Reverse Transcriptase (Invitrogen). Real-time PCR was then performed in an ABI 7500 real-time PCR system with SYBR® Green PCR Master Mix (Invitrogen). To ensure reliability and validity, U6 and 18S rRNA were selected as the endogenous controls. All reactions were carried out in triplicate, and relative gene expression was analyzed by the 2^(-ΔΔCt) method. The primer sequences are shown in Table 1.
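As an illustration of the relative quantification mentioned above, the following sketch computes a fold change by the 2^(-ΔΔCt) method from triplicate Ct values. The function name and the Ct values are hypothetical and are not taken from the study; the reference Ct stands for an endogenous control such as U6 or 18S rRNA.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target_case: Sequence[float],
                        ct_ref_case: Sequence[float],
                        ct_target_control: Sequence[float],
                        ct_ref_control: Sequence[float]) -> float:
    """Fold change of case vs control by the 2^(-ddCt) method."""
    # dCt: target Ct normalized to the endogenous reference, per group.
    delta_case = mean(ct_target_case) - mean(ct_ref_case)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: case dCt relative to control dCt.
    delta_delta = delta_case - delta_control
    return 2.0 ** (-delta_delta)


# Illustrative triplicates: the target amplifies two cycles earlier in
# the case sample after normalization, so the fold change is 2^2.
fold = relative_expression([24.0, 24.0, 24.0], [18.0, 18.0, 18.0],
                           [26.0, 26.0, 26.0], [18.0, 18.0, 18.0])
print(fold)
```

A value above 1 indicates higher expression in the case sample than in the control, and a value below 1 indicates lower expression.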

Validation of the expression of miRNAs and target genes by GEO
The GEO (http://www.ncbi.nlm.nih.gov/geo) database was used to validate the expression of selected miRNAs and target genes. We compared the expression levels of miRNAs and target genes between STAD cases and adjacent non-tumor controls, and the differences in expression levels were displayed as box-plots.

ROC analyses
Using the pROC package in R, we performed ROC analyses to assess the diagnostic value of selected miRNAs and target genes. The area under the curve (AUC) with a binomial exact confidence interval was calculated and the ROC curve was generated.
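For readers unfamiliar with the AUC, a minimal sketch follows. It is not the pROC implementation and does not reproduce the binomial exact confidence interval; it only computes the AUC as the probability that a randomly chosen case sample scores higher than a randomly chosen control sample, with ties counted as one half. The function name and the values are hypothetical.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float],
            control_scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0      # case ranked above control
            elif c == n:
                wins += 0.5      # a tie counts as half
    return wins / (len(case_scores) * len(control_scores))


# Illustrative expression values only; not data from the study.
auc = roc_auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.5])
print(round(auc, 3))
```

For a marker that is down-regulated in cases, such as a tumor-suppressive miRNA, this orientation gives a value below 0.5; negating the scores (or taking 1 minus the result) gives the AUC for "lower is more likely a case".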

Identification of miRNA-target interactions
In the present study, potential target genes of the differentially expressed precursor form of miRNAs were identified. We found 160 target genes in 270 miRNA-target pairs, including 189 interaction pairs with the miRNA up-regulated and the target gene down-regulated and 81 interaction pairs with the miRNA down-regulated and the target gene up-regulated in STAD subjects. The established miRNA-target regulatory network is shown in Fig. 3. In addition, we checked all of the miR-target interactions (MTIs) from the TCGA data analysis against miRTarBase, which yielded a total of 46 MTIs. Using the miRWalk database, we obtained a total of 187 MTIs. Thus, miRWalk contained more of the MTIs than miRTarBase. Taking the intersection of the miRTarBase and miRWalk results, a total of 22 common MTIs were identified. The original MTIs from the miRWalk and miRTarBase databases are listed in Tables S1 and S2, respectively. A Venn diagram of the MTIs in the miRTarBase database vs the miRWalk database is shown in Fig. S1.
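The comparison of the two databases amounts to a set intersection over (miRNA, target gene) pairs. The sketch below shows the idea with placeholder identifiers; the pair names are hypothetical and do not correspond to entries in Tables S1 or S2.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]  # (miRNA, target gene)


def common_mtis(db_a: Iterable[Pair], db_b: Iterable[Pair]) -> Set[Pair]:
    """Return the miRNA-target pairs reported by both sources."""
    return set(db_a) & set(db_b)


# Placeholder pairs standing in for the two database exports.
mirtarbase_pairs = [("miR-X", "GENE1"), ("miR-X", "GENE2"), ("miR-Y", "GENE3")]
mirwalk_pairs = [("miR-X", "GENE2"), ("miR-Y", "GENE3"), ("miR-Z", "GENE4")]

shared = common_mtis(mirtarbase_pairs, mirwalk_pairs)
print(sorted(shared))
```

Treating each interaction as a pair, rather than intersecting the miRNA and gene lists separately, avoids counting a miRNA and a gene that each appear in both sources but are never linked to each other.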

Target genes enrichment analysis
Based on the DAVID database, we analyzed the biological function and pathways of these 160 target genes. We also analyzed the function of all DEGs by KEGG. The enrichment of GO functional categories and KEGG biochemical pathways showed that these target genes were significantly enriched in calcium-transporting ATPase activity (FDR = 0.001048), calmodulin binding (FDR = 0.002055), calcium signaling pathway (FDR = 7.97 × 10^-6), salivary secretion (FDR = 9.13 × 10^-5), pancreatic secretion (FDR = 0.00209), neuroactive ligand-receptor interaction (FDR = 0.004133), cell adhesion molecules (FDR = 0.003932), bile secretion (FDR = 0.003801) and vascular smooth muscle contraction (FDR = 0.018674). The GO analysis of these target genes is shown in Table 4, and the KEGG pathway analysis is listed in Table 5. The KEGG pathway maps of the calcium signaling pathway and bile secretion, which are related to STAD, are shown in Figs 4 and 5. Interestingly, all DEGs were found to be enriched in the gastric acid secretion pathway (Fig. 6), which also plays a role in STAD.
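To make the notion of pathway enrichment concrete, the sketch below computes a one-sided hypergeometric P-value for the overlap between a gene list and a pathway. This is a simplification offered as an assumption: DAVID uses a modified Fisher exact test, so the numbers would not match its output exactly, and the gene counts used here are illustrative rather than taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population: int, pathway_size: int,
                           selected: int, overlap: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(population, pathway_size, selected).

    population:   number of background genes
    pathway_size: background genes annotated to the pathway
    selected:     size of the gene list being tested
    overlap:      genes in the list that are annotated to the pathway
    """
    total = comb(population, selected)
    upper = min(pathway_size, selected)
    tail = sum(comb(pathway_size, k) * comb(population - pathway_size, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Toy numbers: 10 background genes, 4 in the pathway, 3 selected, 2 overlapping.
p = hypergeom_enrichment_p(population=10, pathway_size=4, selected=3, overlap=2)
print(round(p, 4))
```

In practice one P-value is computed per GO term or KEGG pathway, and the resulting set is corrected for multiple testing to obtain the FDR values reported above.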

Validation of the expression of miRNAs and target genes
In this study, two down-regulated miRNAs (hsa-miR-145-3p and hsa-miR-145-5p) and four up-regulated target genes (ADAM12, ACAN, HOXC11 and MMP11) in STAD were selected for expression validation (Fig. 8). Their expression levels in STAD and non-tumor tissues were compared and depicted as box-plots showing the median and interquartile range. The expression levels of hsa-miR-145-3p and hsa-miR-145-5p were significantly down-regulated in the case group compared with the normal group, and the expression of ADAM12, ACAN, HOXC11 and MMP11 was significantly up-regulated in the case group compared with the normal group. These expression patterns were consistent with our bioinformatics analysis.

ROC curve analysis
We performed ROC curve analyses and calculated the AUC to assess the discriminatory ability of two selected miRNAs (hsa-miR-139-5p and hsa-miR-145-3p) and one target gene (MMP11) from the GEO dataset (Fig. 9). The AUC of each was greater than 0.7, and MMP11 had the largest AUC. For STAD diagnosis, the 1 - specificity (proportion of false positives) and sensitivity (proportion of true positives) were 87.5% and 71.7%, respectively, for hsa-miR-139-5p; 100% and 56.7%, respectively, for hsa-miR-145-3p; and 100% and 83.3%, respectively, for MMP11.
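Each point on a ROC curve corresponds to one expression cutoff. The following sketch, with hypothetical names and illustrative values, shows how sensitivity and 1 - specificity could be computed at a given cutoff when higher expression is taken to indicate a case.

```python
from typing import Sequence, Tuple


def sensitivity_and_fpr(case_scores: Sequence[float],
                        control_scores: Sequence[float],
                        cutoff: float) -> Tuple[float, float]:
    """Return (sensitivity, 1 - specificity) when score >= cutoff is called positive."""
    true_pos = sum(s >= cutoff for s in case_scores)
    false_pos = sum(s >= cutoff for s in control_scores)
    sensitivity = true_pos / len(case_scores)
    false_positive_rate = false_pos / len(control_scores)
    return sensitivity, false_positive_rate


# Illustrative values only; not data from the study.
sens, fpr = sensitivity_and_fpr([5.0, 6.0, 7.0, 2.0], [1.0, 2.0, 3.0, 6.0], cutoff=4.0)
print(sens, fpr)
```

Sweeping the cutoff over all observed values and plotting sensitivity against 1 - specificity traces out the ROC curve.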

Discussion
STAD remains the second leading cause of cancer death and makes up approximately 10% of newly diagnosed cancers [13]. Therefore, an understanding of the molecular mechanism of STAD is needed. In the current study, we obtained nine differentially expressed miRNAs and 1248 DEGs for STAD. Additionally, target gene detection revealed 160 target genes of these differentially expressed miRNAs. After biological function analysis, we found that these target genes were most significantly enriched in the calcium signaling pathway and bile secretion. Calcium signals can regulate the majority of physiological processes, ranging from cell proliferation to apoptosis [14]. It has been reported that disorders of Ca2+ channels and/or receptors can lead to various diseases, such as cancer [15]. In this study, we found that the target genes SLC8A2, ADCY2, ATP2B4, CACNA1H, HTR2C, ATP2B2, PLN, NOS1 and ATP2B3 of differentially expressed miRNAs were associated with the calcium signaling pathway, which suggests a role for calcium signaling in STAD.
A positive relationship between bile acid concentration and gastric carcinoma development has been reported, which suggests a carcinogenic role of bile acid in gastritis [16]. Furthermore, Suzuki et al. [17] found that patients with a high level of bile acid developed gastric carcinoma more frequently than patients with a low bile acid level. In this study, we found that the target genes ADCY2, AQP4, SLCO1A2 and BAAT were related to bile secretion, which suggests a relationship between bile secretion and STAD.
All in all, these identified target genes played roles in the biological processes of calcium signaling and bile secretion in STAD. Additionally, we also investigated the function of all DEGs by KEGG pathway analysis and found that these DEGs were remarkably involved in signaling pathways of gastric acid secretion.
The principal secretory function of the stomach is secreting gastric acid [18]. Gastric acid functions in a number of ways including modulating the gut microbiome, assisting in protein digestion and facilitating absorption of iron and calcium [19]. It is reported that H. pylori is related to gastric carcinoma [20]. Interestingly, it is found that in patients with relatives with gastric carcinoma, H. pylori infection is related to decreasing gastric acid secretion [21]. In this study, we found all DEGs were significantly enriched in the signaling pathway of gastric acid secretion, which further demonstrated the role of gastric acid secretion in the development of STAD.
Among the identified miRNAs, the mature forms of hsa-miR-139-5p, hsa-miR-145-3p, hsa-miR-145-5p and hsa-miR-490-3p were validated by qRT-PCR, and the results were consistent with the bioinformatics results except for the expression of hsa-miR-139-5p. The small sample size we used for qRT-PCR may account for this inconsistency. It has been reported that hsa-miR-139-5p shows decreased expression associated with STAD [22,23]. Previous reports found that hsa-miR-145 was a potential tumor suppressor and that hsa-miR-145-3p and hsa-miR-145-5p were down-regulated in stomach carcinoma [24][25][26]. Kuo et al. [22] demonstrated that hsa-miR-490-3p was down-regulated in gastric carcinoma tissues. Our results were in agreement with these previous reports, which further supports the role of these differentially expressed miRNAs in STAD. Notably, both hsa-miR-139-5p and hsa-miR-145-3p had potential diagnostic value for STAD.
Additionally, we validated the target genes (ADAM12, ACAN, HOXC11 and MMP11) of the four miRNAs mentioned above, and their expression patterns were consistent with the bioinformatics results. ADAM12 is a complex multi-domain protein that functions in cell proliferation and movement [27]. Furthermore, it is strongly expressed in various cancers including stomach carcinoma [28,29]. Comprehensive whole-genome and transcriptome sequencing analysis has shown that mutation of ACAN occurs in stomach carcinoma [30]. HOXC11 is considered to be a novel potential oncogene with altered expression in stomach carcinoma pathogenesis, based on association analysis with a candidate gene strategy [31]. MMP11 is known as a marker of tumor invasion and metastasis [32]. Moreover, previous reports detected the expression level of MMP11 by microarray analysis and found that it was elevated in stomach carcinoma patients, and quantitative polymerase chain reaction analysis confirmed its up-regulated expression [32,33]. Thus, up-regulated expression of ADAM12, ACAN, HOXC11 and MMP11 may be involved in the pathology of STAD. Interestingly, MMP11 was significantly associated with STAD diagnosis.
Besides the validated differentially expressed miRNAs described above, we found that two further miRNAs (hsa-miR-196b and hsa-miR-135b) were differentially expressed in STAD. Moreover, hsa-miR-196 and hsa-miR-135 covered the most downstream target genes in the regulatory network between differentially expressed miRNAs and target genes. It has been demonstrated that the expression level of hsa-miR-196b is significantly higher in stomach carcinoma [34,35]. Additionally, overexpression of hsa-miR-196b is linked to stomach carcinoma and may be considered a stomach carcinoma marker [36,37]. It has also been demonstrated that hsa-miR-135b is up-regulated in stomach carcinoma tissues [38,39]. Based on the results obtained in several studies, hsa-miR-135b is reported as a potential biomarker of the intestinal type of STAD [40][41][42][43]. In our study, hsa-miR-196b and hsa-miR-135b were also up-regulated, which was in line with previous reports. Further qRT-PCR validation experiments for hsa-miR-196b and hsa-miR-135b are needed.
There are limitations to our study. Firstly, target site information for miRNA-mRNA pairs is needed for validation of the miRNA-mRNA interaction. Secondly, RNA-seq and miRNA-seq are further needed to screen larger numbers of candidate miRNA and mRNA. Thirdly, a luciferase assay for direct verification of the identified miRNA-target interactions is needed in further studies.

Supporting information
Additional Supporting Information may be found online in the supporting information tab for this article:
Fig. S1. Venn diagram of MTIs in the groups of miRTarBase database vs miRWalk database.
Table S1. The original miR-target interactions from the miRWalk database.
Table S2. The original miR-target interactions from the miRTarBase database.