The clinical value of miR‐193a‐3p in non‐small cell lung cancer and its potential molecular mechanism explored in silico using RNA‐sequencing and microarray data

miR‐193a‐3p is a tumor‐related miRNA playing an essential role in tumorigenesis and progression of non‐small cell lung cancer (NSCLC). The objective of the present study was to investigate the relationship between miR‐193a‐3p expression and clinical value and to further explore the potential signaling of miR‐193a‐3p in the carcinogenesis of NSCLC. RNA‐sequencing and microarray data were collected from the databases GEO, ArrayExpress and The Cancer Genome Atlas (TCGA). Furthermore, in silico assessments were performed to analyze the prospective pathways and networks of the target genes of miR‐193a‐3p. In total, 453 cases of NSCLC patients and 476 normal controls were included in blood samples, while 920 cases of NSCLC patients and 406 normal controls were included in tissue samples. The pooled positive likelihood ratio, the pooled negative likelihood ratio and the pooled diagnostic odds ratio were calculated to reflect the diagnostic value of miR‐193a‐3p in blood and tissue samples. Moreover, the areas under the curve of the summary receiver operating characteristic curve of blood and tissue were 0.64 and 0.79, respectively. In addition, we found a lower level of miR‐193a in NSCLC tissues than in non‐cancerous controls based on TCGA. A gene ontology (GO) enrichment analysis demonstrated that miR‐193a‐3p could be related to key signaling pathways in NSCLC. Also, several vital pathways were illustrated by KEGG. Lower expression of miR‐193a‐3p in tissue samples of NSCLC may be associated with tumorigenesis and be a predictor of deterioration of NSCLC patients, and pathway analysis revealed crucial signaling pathways correlated with the incidence and progress of NSCLC.

Non-small cell lung cancer (NSCLC) accounted for approximately 80-85% of newly diagnosed cases of lung cancer, and the 5-year survival rate of NSCLC was approximately 11% [3]. Currently, standard diagnostic approaches mainly include radiography, computed tomography, ultrasound-guided bronchial needle aspiration biopsy and detection of tumor markers in bronchial lavage. Despite the advantages of these diagnostic strategies, their insufficient specificity and sensitivity make it challenging to identify lung cancer at an early stage [4]. Tumor biomarkers have attracted great attention in lung cancer research, because the disease arises through a long-term process in which a normal cell becomes malignant through gradual genetic alterations [5]. Many lung cancer-related oncogenes and tumor suppressors have been identified, and aberrations of several signaling pathways have also been discovered [6]. Nevertheless, the specific molecular mechanism underlying the occurrence of lung cancer remains uncertain. Recent studies have confirmed the correlation between lung cancer and non-coding RNAs, among which microRNAs (miRNAs) account for a large proportion.
miRNAs are a type of naturally occurring, small noncoding RNA molecule with approximately 21-25 nucleotides [7]. Recently, an increasing amount of research has shown that abnormally expressed miRNAs participating in the incidence and development of malignant tumors may become a new type of tumor marker [8]. miRNAs can control numerous biological pathways driving tumor behavior by targeting and controlling gene expression in lung cancer [9,10]. The characteristics of miRNAs modulating gene networks and corresponding biological pathways have provided new hope for diagnosing and guiding novel therapeutic decisions for lung cancer patients [10]. However, numerous miRNAs need to be identified and their potential molecular mechanisms still remain to be further determined.
miR-193a-3p is one of the cancer-related miRNAs. It has been found that up-regulated miR-193a-3p functions as an oncogene in esophageal squamous cell carcinoma, modulating proliferation, migration and apoptosis [11]. Aberrant regulation of miR-193a-3p has also been found in the development of other types of cancer, such as prostate cancer, breast cancer, head and neck squamous cell carcinomas, and colorectal cancer [12][13][14][15]. Previously, we reported that miR-193a-3p might be a tumor suppressor in hepatocellular carcinoma. Besides, the expression level of serum miR-193a-3p combined with alpha-fetoprotein and ultrasound could assist in the diagnosis of hepatocellular carcinoma at an early stage [16]. Our previous work using quantitative real-time polymerase chain reaction also demonstrated that the miR-193a-3p level in NSCLC tissue samples was significantly down-regulated compared with that in non-tumorous lung tissues [17]. Several other reports also determined the level of miR-193a-3p in lung cancer tissues, but the sample size varied and the results were inconsistent. However, no study has mined the public high-throughput data of RNA-sequencing and microarray to explore the clinical role of miR-193a-3p in lung cancer.
The function and molecular mechanisms of miR-193a-3p have been explored in a few diseases. For instance, in human osteosarcoma, miR-193a-3p functioned as a suppressor by suppressing the signaling pathway of Rab27B and serine racemase [18,19]. Similarly, the suppressive influence of miR-193a-3p on growth was related to the reduction of MCL1 expression in malignant pleural mesothelioma [20]. In NSCLC, only three targets have been confirmed, including ERBB4, S6K2 and KRas [19,21,22]. Nevertheless, other specific regulatory mechanisms of miR-193a-3p in lung cancer are still uncertain, due to the diversity of target genes regulated by a single miRNA. Hence, the objective of the present study was to evaluate the association between miR-193a-3p level and the development of NSCLC using public high-throughput data including RNA-sequencing, microarray and all available published documents. Further, we also explored the potential signaling of miR-193a-3p in the carcinogenesis of NSCLC via in silico approaches, such as Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) pathway analyses.

Search strategy and inclusion criteria
We searched for NSCLC-related miRNA microarray or RNA-sequencing data in the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) and ArrayExpress up to 1 July 2017.
The search strategy was as follows: (lung OR pulmonary OR respiratory OR bronchi OR bronchioles OR alveoli OR pneumocytes OR 'air way') AND (cancer OR carcinoma OR tumor OR neoplas* OR malignan* OR adenocarcinoma) AND (MicroRNA OR miRNA OR 'Micro RNA' OR 'Small Temporal RNA' OR 'non-coding RNA' OR ncRNA OR 'small RNA').
The inclusion criteria of eligible datasets were as follows. Firstly, the experimental group should be NSCLC patients and the control group should be non-cancerous individuals.
Secondly, the minimum number of samples for each group should be 30. Thirdly, the raw miRNA expression data from profiling of the experimental and control groups, including the expression level of miR-193a-3p, should be available or calculable. Fourthly, only human samples were included. Fifthly, both tissue and peripheral blood samples from NSCLC patients were included.

Statistical analysis
Firstly, the expression level of miR-193a-3p was extracted from each microarray. Student's paired or unpaired t test was used to assess the difference of miR-193a-3p level between different groups with SPSS STATISTICS v. 23.0 (IBM Corp., Armonk, NY, USA). Receiver operating characteristic (ROC) curve analyses were used to assess the diagnostic significance of miR-193a-3p in NSCLC. P < 0.05 was considered as statistically significant in the current study.
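As a rough illustration of the per-dataset ROC analysis, the area under the curve (AUC) can be obtained from two groups of expression values through its equivalence with the Mann-Whitney statistic. The sketch below is plain Python rather than the SPSS procedure used in this study; the function name and the example values are hypothetical, and it assumes that lower expression points towards NSCLC.

```python
def auc_lower_in_cases(cases, controls):
    """AUC for a marker assumed to be LOWER in cases than in controls.

    This is the probability that a randomly chosen case has a lower value
    than a randomly chosen control, with ties counted as one half.
    """
    wins = 0.0
    for x in cases:
        for y in controls:
            if x < y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical expression values, for illustration only.
nsclc = [6.1, 6.8, 7.0, 7.4]
normal = [7.2, 7.6, 7.9, 8.3]
print(auc_lower_in_cases(nsclc, normal))
```

An AUC near 0.5 would indicate no discrimination between the groups, while values approaching 1 indicate that the marker separates them well.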
Secondly, a meta-analysis of the GEO and ArrayExpress data was performed with STATA v. 12.0 (StataCorp, College Station, TX, USA) and the METAN package. Continuous outcomes were presented as the standardized mean difference (SMD) with 95% confidence interval (CI), and effect sizes were pooled with a random-effects or fixed-effects model depending on heterogeneity. Heterogeneity across studies was assessed with the chi-square test of Q and the I² statistic. A P value < 0.05 or I² > 50% was considered to indicate heterogeneity. In that case, the random-effects model (the DerSimonian-Laird method) was selected to calculate the summarized SMD. Otherwise, the fixed-effects model (the Mantel-Haenszel method) was preferred for the pooling process.
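The pooling logic described here can be sketched in a few lines of Python. This is a minimal illustration, not the STATA METAN implementation: the function names are hypothetical, the SMD is computed as Cohen's d without small-sample correction, the fixed-effect estimate uses inverse-variance weights (an assumption of this sketch), and the P value of the Q test is not computed.

```python
import math


def smd(mean_case, sd_case, n_case, mean_ctrl, sd_ctrl, n_ctrl):
    """Cohen's d (case minus control) and its approximate variance."""
    pooled_sd = math.sqrt(
        ((n_case - 1) * sd_case**2 + (n_ctrl - 1) * sd_ctrl**2)
        / (n_case + n_ctrl - 2)
    )
    d = (mean_case - mean_ctrl) / pooled_sd
    var = (n_case + n_ctrl) / (n_case * n_ctrl) + d**2 / (2 * (n_case + n_ctrl))
    return d, var


def pool_smd(effects, variances):
    """Pool SMDs; return fixed and random estimates with Q, I^2 and tau^2."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / total
    # Cochran's Q and the I^2 statistic (in percent).
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    # DerSimonian-Laird estimate of the between-study variance.
    c = total - sum(w**2 for w in weights) / total
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    random_effect = sum(w * e for w, e in zip(re_weights, effects)) / re_total
    half_width = 1.96 * math.sqrt(1.0 / re_total)
    return {
        "fixed": fixed,
        "random": random_effect,
        "ci_random": (random_effect - half_width, random_effect + half_width),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Two hypothetical studies with equal variance, for illustration only.
print(pool_smd([0.0, 1.0], [0.1, 0.1]))
```

Following the rule stated above, the `random` estimate would be reported when I² exceeds 50% or the Q test is significant, and the `fixed` estimate otherwise.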
If heterogeneity was present, to further explain whether the pooled result was achieved due to one large study or a single study with an extremely divergent result, sensitivity analysis was applied to omit one study at a time. In addition, the potential publication bias was evaluated with Begg's and Egger's tests. If P < 0.05, there would be publication bias.
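The omit-one-study sensitivity analysis can be illustrated as follows. This is a sketch under stated assumptions: the function names and study labels are hypothetical, each study is summarized by an SMD and its variance, and the pooled value is a DerSimonian-Laird random-effects estimate. Begg's and Egger's tests are not reproduced here.

```python
def dl_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / total
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    c = total - sum(w**2 for w in weights) / total
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    return sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)


def leave_one_out(labels, effects, variances):
    """Re-pool the SMD after omitting each study in turn."""
    results = {}
    for i, label in enumerate(labels):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results[label] = dl_random_effects(rest_effects, rest_variances)
    return results


# Hypothetical studies, for illustration only: study_C is the outlier.
labels = ["study_A", "study_B", "study_C"]
for label, pooled in leave_one_out(labels, [0.0, 0.0, 3.0], [0.1, 0.1, 0.1]).items():
    print(f"omitting {label}: pooled SMD = {pooled:.2f}")
```

A study whose omission shifts the pooled SMD much more than the others is a candidate source of heterogeneity.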
Another approach, meta-analysis with summary receiver operating characteristic (SROC), was further carried out to verify the expression level of miR-193a-3p in NSCLC cases. Further, diagnostic odds ratio analysis was executed to assess the diagnostic possibility of miR-193a-3p for NSCLC patients. Positive and negative likelihood ratios were also obtained to reflect the diagnostic value of miR-193a-3p.
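For a single dataset, the diagnostic measures named above follow directly from a 2x2 table of true and false positives and negatives. The sketch below shows only these per-study definitions; the function name and counts are hypothetical, no continuity correction is applied (so a zero cell raises an error), and the pooling across studies and the SROC fit are not reproduced.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Diagnostic accuracy measures from one 2x2 table of counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1.0 - specificity)  # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity  # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)              # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PLR": plr,
        "NLR": nlr,
        "DOR": dor,
    }


# Hypothetical counts, for illustration only.
print(diagnostic_measures(tp=80, fp=20, fn=20, tn=80))
```

The diagnostic odds ratio equals the ratio of the positive to the negative likelihood ratio, so a value of 1 means the marker does not discriminate between patients and controls.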

Implication of miR-193a-3p expression in NSCLC based on The Cancer Genome Atlas data
The Cancer Genome Atlas (TCGA) only offered expression data of precursor miRNA; therefore, we extracted the miR-193a expression data from lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) patients. We also collected the clinicopathological parameters of the patients. The differences in miR-193a expression between groups were assessed as described above. The association between miR-193a expression and the clinicopathological features was determined using SPSS STATISTICS 23.0.

Meta-analysis of studies from the literature
PubMed, Web of Science, EMBASE, Science Direct, Wiley Online Library, Ovid, Cochrane Central Register of Controlled Trials, LILACs and Google Scholar, and the CNKI, VIP, CBM and WanFang databases were used to search for studies evaluating the clinical value of miR-193a-3p in NSCLC. Literature published up to April 2017 was retrieved. The search terms were the same as aforementioned for the GEO database. All eligible studies were included in line with the following criteria: (a) NSCLC patients should be affirmed pathologically; (b) the expression level of miR-193a-3p should be evaluated for NSCLC patients or mean and standard deviation (SD) should be provided; (c) the literature should be the most complete or recent if the same patient cohort was reported more than once by the same authors or research group; (d) it should be written in either Chinese or English in full text.

Meta-analysis based on GEO and TCGA databases and literature
To obtain a comprehensive picture of the clinical role of miR-193a-3p in NSCLC, we combined all available data from GEO, TCGA and the literature in the final meta-analyses. The same approaches as described above were used, including the calculation of SMD and SROC.
In addition, we assessed the differentially expressed genes of LUAD and LUSC from TCGA. All datasets were processed and calculated for total read counts and reads per million values. The EDGER package was used to perform the statistical analysis. All the dysregulated genes of LUAD and LUSC were obtained for intersection elements. Finally, the overlap section of predicted target genes and the dysregulated ones from TCGA were collected for the following analysis.

Signaling pathway and gene network analyses
The potential target genes of miR-193a-3p obtained above were uploaded to the Database for Annotation, Visualization and Integrated Discovery (DAVID) for GO enrichment and KEGG pathway analyses, and only terms and pathways with P < 0.05 were considered statistically significant. The STRING database was used to construct the PPI network for hub gene identification. Hub genes were regarded as the key target genes of miR-193a-3p in NSCLC. Associations among proteins were assessed with a confidence score threshold of > 0.4.
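One common way to pick hub genes from a PPI network is to rank nodes by the number of interactions that pass the confidence threshold. The sketch below illustrates that idea only; the criterion actually used for hub selection is not specified in the text, so ranking by degree is an assumption, as are the function name, the placeholder gene names, and the scores expressed on a 0-1 scale.

```python
from collections import Counter


def hub_genes(edges, min_score=0.4, top_n=3):
    """Rank genes by degree, counting only edges above the score threshold."""
    degree = Counter()
    for gene_a, gene_b, score in edges:
        if score > min_score:
            degree[gene_a] += 1
            degree[gene_b] += 1
    return degree.most_common(top_n)


# Hypothetical interactions (gene, gene, confidence score), for illustration.
edges = [
    ("GENE_A", "GENE_B", 0.90),
    ("GENE_A", "GENE_C", 0.70),
    ("GENE_A", "GENE_D", 0.50),
    ("GENE_B", "GENE_C", 0.30),  # below the threshold, ignored
    ("GENE_C", "GENE_D", 0.45),
]
print(hub_genes(edges, top_n=1))
```

Other centrality measures could be substituted for degree without changing the thresholding step.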
In total, 453 cases of NSCLC patients and 476 normal controls were contained in blood samples, while 920 cases of NSCLC patients and 406 normal controls provided tissue samples (Table 1).

Potential diagnostic value of miR-193a-3p as a marker for NSCLC based on GEO and ArrayExpress
Firstly, the expression level of miR-193a-3p was extracted from each microarray. Student's paired or unpaired t test was used to assess the alteration of miR-193a-3p level between different groups with SPSS (Figs 1 and 2). P < 0.05 was regarded as being statistically significant in the current study.
In the meta-analysis of blood samples, the SMD ranged from -0.56 to 0.62 among the seven datasets (Fig. 3A). According to the result of the heterogeneity test, there was significant heterogeneity in these datasets (P = 0.045, I² = 53.5%). Based on the random-effects model, the expression level of blood miR-193a-3p did not differ significantly between NSCLC patients and normal controls (SMD = 0.03; 95% CI, -0.23 to 0.29).
The sensitivity analysis indicated that study GSE27486 had the most negative influence on the summary SMD, which was consistently verified by Begg's funnel plot (Fig. 3B,C). Thus, study GSE27486 was removed and the pooled SMD changed to 0.11 (95% CI, -0.12 to 0.34) as assessed by the random-effects model, since heterogeneity was still present (P = 0.161, I² = 36.9%, Fig. 3D).
In the meta-analysis of tissue samples, the SMD ranged from -1.09 to 0.52 among the 12 datasets (Fig. 4A). According to the result of the heterogeneity test, there was significant heterogeneity in these datasets (P = 0.003, I² = 61.4%). Based on the random-effects model, the expression of tissue miR-193a-3p did not differ significantly between NSCLC patients and normal controls (SMD = -0.20; 95% CI, -0.43 to 0.04).
The results of the sensitivity analysis indicated that the microarrays of GSE72526 and GSE25508 had the most negative influence on the summary SMD, which was consistently verified by Begg's funnel plot (Fig. 4B,C). Thus, the studies GSE72526 and GSE25508 were removed and the pooled SMD changed to -0.33 (95% CI, -0.52 to -0.13) as assessed by the random-effects model, since heterogeneity was still present (P = 0.126, I² = 35.3%, Fig. 4D).

Characteristics of the patients with NSCLC from TCGA
The correlations between miR-193a expression and clinicopathological features in LUAD and LUSC tissues from TCGA were summarized. In LUSC, a significant difference in miR-193a expression was found between lung cancer tissues and adjacent non-cancerous tissues (P < 0.001). The expression of miR-193a was lower in lung cancer than in adjacent non-cancerous lung. LUSC patients with a smoking habit (7.66 ± 0.79) had higher expression of miR-193a than those without the habit (7.45 ± 0.88, P = 0.023, Tables 2-4).
For LUAD, patients with a smoking habit (7.58 ± 0.90) also had higher expression of miR-193a than those without it (7.36 ± 0.88, P = 0.020). Concerning the clinical tumor, node and metastasis (TNM) stage of LUAD, the relative level of miR-193a was remarkably higher in stage N than in other stages (7.5 ± 0.87, P = 0.023). Compared with tumors in a peripheral location, those in a central location expressed lower miR-193a (Tables 5-7).
However, no obvious association was observed between miR-193a level and other clinicopathological features.

Information from studies from literature
A total of 17 documents were retrieved, but only one of them met our criteria. Therefore, a meta-analysis of published studies could not be performed. In that study, the expression level of miR-193a-3p in NSCLC was markedly down-regulated compared with that in adjacent non-cancerous lung tissues (P < 0.001) [17].

Target gene aggregation of miR-193a-3p
A total of 512 genes validated by qPCR, western blot or luciferase assay were derived from miRWalk2.0, Tarbase, miRTarbase and miRecords. Twelve online databases were used for the prediction, and a total of 2586 target genes predicted by at least five databases were obtained. Together with the genes extracted from the literature, we collected a total of 3121 predicted target genes.
From TCGA, 6138 up-regulated genes in LUAD and LUSC were obtained. Finally, bioinformatics analyses were performed on the 379 genes that overlapped between these two sets.
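The aggregation of target genes amounts to a vote across prediction databases followed by set operations. The sketch below illustrates this under stated assumptions: the function name, database keys and gene symbols are placeholders, and the real inputs would be the gene lists exported from each resource.

```python
from collections import Counter


def candidate_targets(predictions, validated, dysregulated, min_databases=5):
    """Combine predicted and validated targets, then keep dysregulated ones.

    predictions maps a database name to the genes it predicts as targets.
    """
    votes = Counter()
    for genes in predictions.values():
        votes.update(set(genes))  # each database votes once per gene
    predicted = {gene for gene, n in votes.items() if n >= min_databases}
    return (predicted | set(validated)) & set(dysregulated)


# Placeholder data, for illustration only (threshold lowered to 2 of 3).
predictions = {
    "db1": ["G1", "G2"],
    "db2": ["G1", "G3"],
    "db3": ["G1", "G2"],
}
print(sorted(candidate_targets(predictions, ["G4"], ["G1", "G4", "G5"], min_databases=2)))
```

With twelve databases and `min_databases=5`, the same function would reproduce the "at least five databases" rule described above.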

GO and pathway enrichment analyses
According to the target-GO analysis in DAVID, genes were highly enriched in the biological processes of neurotransmitter transport, skeletal system development, ion transport, etc. (P < 0.005, Fig. 7A), in the cellular components of integral to organelle membrane, intrinsic to organelle membrane, chromosomal part, etc. (P < 0.05, Fig. 7B), and in the molecular functions of transcription factor activity, sequence-specific DNA binding, leukotriene-B4 20-monooxygenase activity, etc. (P < 0.05, Fig. 7C). In the KEGG pathway analysis, target genes were mainly enriched in the cell cycle pathway (P < 0.05, Fig. 7D). In addition, we constructed a protein-protein interaction (PPI) network with STRING 10.0 to find the hub genes of miR-193a-3p. E2F3, CDC6, AURKA, CHEK1, H2AFX, CDC25A and MYCN were obtained (Figs 8 and 9).
Discussion
miRNAs have the potential to function as steady and reproducible biomarkers for different solid malignant tumors, especially NSCLC [23,24]. The methylation of the miR-193a gene has an indirect effect on the expression level of the target gene [25][26][27]. In oral squamous cell carcinoma, the miR-193a gene is hypermethylated and the expression of miR-193a is frequently down-regulated [26]. Similarly, silencing by miR-193a gene methylation could reduce miR-193a expression in other diseases, for example acute myeloid leukemia and NSCLC [27,28]. Down-regulation of the miR-193a level in NSCLC tissues was reported by Chen et al. [29]. In addition, our previous study found that the miR-193a-3p level was reduced in NSCLC tissues [17]. In the current study, we studied the miR-193a expression of NSCLC patients from TCGA and their corresponding paracancer tissue (PT). We found lower expression of miR-193a in NSCLC tissues than in non-cancerous controls, which confirmed the findings of Chen et al. [29]. The relative level of miR-193a in LUAD was predominantly down-regulated compared with that of the PT (P < 0.001). Additionally, the relative level of miR-193a in LUSC was also lower than that of the PT (P = 0.0056). The additional ROC curve demonstrated that miR-193a had a diagnostic value for NSCLC, especially for LUAD. Thus, we were strongly convinced that miR-193a could be a tumor-suppressive predictor in NSCLC.
Since few studies have concentrated on the correlation between the expression of miR-193a and the clinicopathological features of NSCLC, we used patient data from TCGA to investigate this correlation in LUAD and LUSC. Firstly, there was a significant difference in relative miR-193a expression between lung cancer and adjacent non-cancerous lung. miR-193a was down-regulated in lung cancer compared with PT, which suggested its role in diagnosis. Secondly, the smoking habit was the common risk factor for LUAD and LUSC. In LUAD, however, miR-193a expression also differed significantly according to tumor location and clinical TNM stage. Tumors located in central parts expressed lower miR-193a than those in peripheral parts. As for clinical TNM stage, miR-193a was expressed at high levels in LUAD patients whose tumors were in stage N. These two indexes indicated poor prognosis and metastasis of LUAD. As for the discrepancy of miR-193a expression between LUSC and LUAD, the highly expressed miR-193a in the peripheral location and stage N of LUAD might be due to tumor cell differentiation. The exact mechanism needs further investigation.
miR-193a-3p is a member of the miR-193a family. Nonetheless, information on miR-193a-3p is inadequate and its molecular function and mechanism in early diagnosis remain unidentified. No specific blood miRNA has been verified as a biomarker in the clinic for the early diagnosis of NSCLC [15,30]. miR-193a-3p attracted our attention, and in the present study we attempted to investigate the potential value of miR-193a-3p in early screening of NSCLC based on microarray databases, and to further explore its prospective relevant pathways via bioinformatics analysis.
Table 2. Relationship between the expression of miR-193a and clinicopathological features in LUSC from TCGA (mean ± SD). For RPKM, Student's paired or unpaired t test was used.
In the meta-analysis, a total of seven blood miR-193a-3p microarray datasets were involved, including 453 NSCLC patients and 297 healthy controls. The random effects model showed that significant inconsistency of miR-193a-3p expression was noticeable between NSCLC patients and healthy controls, since the AUC of the SROC curve was 0.64, which indicated a diagnostic value of blood miR-193a-3p in NSCLC compared with non-cancerous controls.
Since we previously found a significantly lower miR-193a-3p level in NSCLC tissues, down-regulation of miR-193a-3p may also play a vital role in the prognosis of NSCLC [17]. We therefore attempted to probe the probable value of tissue miR-193a-3p. In sum, we studied 12 tissue miR-193a-3p microarray datasets including 920 NSCLC patients and 406 healthy controls. The pooled diagnostic odds ratio was 7.36 (95% CI, 3.54-15.27, P = 0.0018). In addition, the AUC of the SROC curve was 0.79, which suggested that tissue miR-193a-3p had a reliable diagnostic value for NSCLC. Thus, we were strongly convinced that tissue miR-193a-3p was a tumor-suppressive predictor in NSCLC. However, more studies are still required to confirm the diagnostic value of tissue miR-193a-3p.
Concerning the molecular mechanism of miR-193a-3p in NSCLC, several reports have described its function. miR-193a-3p could inhibit the metastasis of lung cancer cells by modulating the expression of cancer-related proteins [31]. It could also suppress the metastasis of human NSCLC by inhibiting the Erb-B2 Receptor Tyrosine Kinase 4 (ERBB4)/S6 kinase 2 (S6K2) signaling pathway [19]. In addition, our previous study showed that astrocyte elevated gene-1 (AEG-1) had the potential to be one target of miR-193a-3p [4]. Since bioinformatics analysis might help in understanding the potential molecular mechanism of miR-193a-3p in the carcinogenesis and progression of NSCLC, we performed in silico predictions to gather all prospective target genes. Finally, we combined the validated and predicted genes and obtained hub genes for miR-193a-3p. The GO and KEGG pathway analyses revealed that miR-193a-3p could be related to several key signaling pathways of NSCLC, such as modulation of apoptosis and modulation of programmed cell death in biological processes; organelle lumen, membrane-enclosed lumen and intracellular organelle lumen in cellular components; and identical protein binding, enzyme binding and transcription factor binding activities in molecular functions. Also, several vital pathways were illustrated by KEGG enrichment analysis, including pathways in cancer and focal adhesion signaling pathways. PPI showed the hub genes of miR-193a-3p, including E2F3, CDC6, CHEK1, H2AFX, CDC25A, MYCN and AURKA.
Table 5. Relationship between the expression of miR-193a and clinicopathological features in LUAD from TCGA (mean ± SD). For RPKM, Student's paired or unpaired t test was used.
In view of these results, we consulted the literature to verify our findings. E2F3 mRNA levels were significantly higher in lung cancer tissues than in non-cancerous lung tissues, and its overexpression was related to poor prognosis [32,33]. Also, CDC6 has been confirmed to be linked to DNA replication and to regulate the occurrence and development of lung tumors [34,35]. Overexpression of CHEK1 in lung cancers was related to poor overall survival [36]. An increase in CDC25A expression caused by a decrease in miR-184 promoted the invasive capacity of cells [37]. MYCN was found to be overexpressed in NSCLC, which was positively related to a more invasive tumor phenotype and poorer outcome [38]. AURKA functioned as an oncogene, and its low expression level inhibited tumor cell proliferation, promoted apoptosis and hindered cell cycle progression in NSCLC [39,40]. However, there are some limitations to the current study. Firstly, the sample size in our study was still small, which would weaken the conclusion about the impact of miR-193a-3p in NSCLC patients. Secondly, the small sample size restricted the reliability of the meta-analysis. Further larger studies should be conducted to support the under-expression of tissue miR-193a-3p in NSCLC. Thirdly, extra bias might arise from the limited regions involved in the current meta-analysis, since the blood miR-193a-3p datasets were only from the USA and Germany. Fourthly, the potential target genes were obtained via in silico prediction, and further validation will also be needed.
Overall, the current observations showed that significant down-regulation of tissue miR-193a-3p was detected in NSCLC patients, which indicated that detection of tissue miR-193a-3p, rather than blood miR-193a-3p, might play an integral part in the early diagnosis of NSCLC. Furthermore, the pathway analyses of the prospective target genes of miR-193a-3p, including the hub genes, revealed several key signaling pathways correlated with the incidence and worsening of NSCLC, such as neuroactive ligand-receptor interaction and cell cycle. Cohorts with larger sample sizes and further in vitro and in vivo studies are needed to determine the diagnostic value and relevant mechanism of tissue miR-193a-3p in NSCLC.