Serum small extracellular vesicle‐derived LINC00853 as a novel diagnostic marker for early hepatocellular carcinoma

This study aimed to identify novel long noncoding RNA (lncRNA) biomarkers for hepatocellular carcinoma (HCC) using publicly available tissue genomic datasets and validate their diagnostic utility for early‐stage HCC. Differentially expressed lncRNAs between 371 HCC and 50 nontumor tissues were obtained from The Cancer Genome Atlas liver hepatocellular carcinoma (TCGA_LIHC) project. Subsequently, the expression of the serum‐ and extracellular vesicle (EV)‐derived lncRNA was assessed in 10 patients with HCC and 10 healthy controls using RT–qPCR. The candidate lncRNAs were validated in 90 HCC and 92 non‐HCC (29 healthy control, 28 chronic hepatitis, 35 liver cirrhosis) patients. The sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were calculated for the candidate lncRNAs and the current HCC biomarker, alpha‐fetoprotein (AFP). SFTA1P, HOTTIP, HAGLROS, LINC01419, HAGLR, CRNDE, and LINC00853 were markedly upregulated in HCC in TCGA_LIHC dataset. Among them, LINC00853 has not been reported in relation to HCC before. In patients with HCC, only expression of small EV‐derived LINC00853 (EV‐LINC00853) was increased. EV‐LINC00853 showed excellent discriminatory ability in the diagnosis of all‐stage HCC (AUC = 0.934, 95% confidence interval = 0.887–0.966). Moreover, using a 14‐fold increase and 20 ng·mL−1 as cutoffs for EV‐LINC00853 expression and AFP level, respectively, EV‐LINC00853 was found to have a sensitivity of 93.75% and specificity of 89.77%, while AFP showed only 9.38% sensitivity and 72.73% specificity for the diagnosis of early‐stage HCC (mUICC stage I). EV‐LINC00853 had a positivity of 97% and 67% in AFP‐negative and AFP‐positive early HCC, respectively. Serum EV‐derived LINC00853 may be a novel potential diagnostic biomarker for early HCC, especially for AFP‐negative HCC.

Introduction
Liver cancer is the sixth most prevalent cancer and the fourth most common cause of cancer-related death globally; the high mortality rate is mainly due to late diagnosis and poor response to therapy [1]. Hepatocellular carcinoma (HCC) accounts for ~90% of primary liver cancers and represents a major global health problem. Approximately 90% of HCCs are associated with a known underlying etiology, most frequently chronic viral hepatitis B or C, excessive alcohol intake, or aflatoxin exposure [2]. Individuals at high risk of developing HCC are recommended to undergo abdominal ultrasonography every 6 months [3]. However, ultrasonography has a sensitivity of only 63% for detecting early-stage HCC [4]. Alpha-fetoprotein (AFP), the most widely used blood biomarker in HCC, also performs suboptimally as a serological test in HCC surveillance because AFP levels fluctuate during hepatitis flares and only 10-20% of early-stage HCCs are AFP-positive [5]. Therefore, there is an urgent need for better screening tools and diagnostic tests for early-stage HCC to improve the prognosis of patients with this fatal disease.
Long noncoding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that generally do not encode proteins; they are involved in various functions, including regulation of transcription in cis or trans, modulation of messenger RNA (mRNA) processing, post-translational control of protein activity, and organization of nuclear domains [6,7]. Many lncRNAs have been functionally associated with human diseases [8], and their dysregulation has been observed in several cancers, including liver cancer [9-11]. Altered lncRNA expression can contribute to cancer phenotypes by stimulating cellular proliferation, angiogenesis, immune evasion, and metastasis, and by inhibiting apoptosis [8,12].
The present study aimed to identify novel HCC-related lncRNAs using publicly available tissue genomic datasets and validate their diagnostic performance for early-stage HCC in a moderately large cohort of patients with different liver diseases.

Resources of publicly available genomic data
To investigate the expression of lncRNA biomarkers in HCC, genomic data were acquired from The Cancer Genome Atlas liver HCC project (TCGA_LIHC, https://cancergenome.nih.gov) and the GEO database of the NCBI (Accession Numbers: GSE94660, GSE114584, and GSE124535). The expression data for each lncRNA were log2-transformed [log2(FPKM + 1)] for downstream analyses.
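The log2(FPKM + 1) transform described above can be sketched as follows. This is a minimal illustration only; the function name and the example FPKM values are hypothetical and are not taken from the study data.

```python
import math


def log2_fpkm(fpkm_values):
    """Apply the log2(FPKM + 1) transform to a list of FPKM values.

    Adding 1 keeps zero-expression transcripts defined (log2(1) = 0).
    """
    return [math.log2(value + 1.0) for value in fpkm_values]


# Hypothetical FPKM values chosen so the results are easy to check by hand.
example_fpkm = [0.0, 1.0, 3.0, 7.0]
transformed = log2_fpkm(example_fpkm)
print(transformed)  # [0.0, 1.0, 2.0, 3.0]
```

The pseudocount of 1 is what allows transcripts with an FPKM of zero to remain in the downstream comparison.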

Patient enrollment and clinical term definitions
Patients were enrolled from the Ajou University Hospital, Suwon, South Korea, between January 2014 and December 2018. The study subjects were allocated into one of four groups: healthy control, chronic hepatitis (CH), liver cirrhosis (LC), and HCC. A healthy control was defined as an individual between 18 and 50 years of age without any medical history, who visited the Ajou Health Promotion Center for a health check-up. CH was diagnosed based on the persistence of serum hepatitis B surface antigen or hepatitis C virus RNA for more than 6 months. LC was diagnosed based on ultrasonographic findings including splenomegaly, blunt angle, and morphological changes [19]. HCC was diagnosed if the tumor had a maximum diameter > 1 cm and characteristic features of HCC (arterial phase hyperenhancement, washout in the portal venous or delayed phase, threshold growth, and capsule appearance) on multiphase computed tomography and/or magnetic resonance imaging. If these criteria were present but diagnostic certainty was lacking, a liver biopsy was performed to confirm the diagnosis of HCC [20]. Early-stage HCC was defined as a single lesion less than 2 cm in diameter, corresponding to the modified Union for International Cancer Control (mUICC) stage I. The test cohort consisted of 10 patients with HCC and 10 healthy controls, and the validation cohort consisted of 90 patients with HCC and 92 patients without HCC (29 healthy controls, 28 with CH, and 35 with LC). Patients whose AFP level measurements were unavailable were excluded from the comparative analysis. Overall survival was defined as the time from HCC diagnosis to death from any cause. All investigations performed in the present study were conducted in accordance with the guidelines of the 1975 Declaration of Helsinki. The study protocol was approved by the Institutional Review Board of the Ajou University Hospital, Suwon, South Korea (AJRIB-BMR-KSP-18-397 and AJIRB-BMR-KSP-18-299).
Anonymous serum samples and clinical data were provided by the Ajou Human Bio-Resource Bank. Informed consent was waived.

Separation of blood serum
Five milliliters of blood was collected from each individual directly into serum collection tubes. The blood was centrifuged at 1800 g for 10 min to extract the serum, which was aliquoted into 1.5-mL tubes and stored at −80°C. The serum samples were centrifuged at 3000 g at 4°C for 15 min to remove cell debris before analysis.

Characterization of serum small EVs
Small EVs were extracted from the serum using ExoQuick (System Biosciences, Mountain View, CA, USA) according to the manufacturer's instructions with minor modifications [21]. Briefly, serum samples (300 μL) were mixed with ExoQuick (72 μL) and incubated at 4°C overnight. The mixtures were then centrifuged at 1500 g for 30 min at room temperature. The supernatants were collected and used as EV-depleted serum (EDS), whereas the pellets were resuspended in PBS (100 μL) and stored at −80°C for subsequent extraction of RNA and proteins.
Transmission electron microscopy (TEM), nanoparticle tracking analysis (NTA), and western blotting were performed to confirm the presence and size of small EVs. For TEM, small EVs were marked with 10-nm gold particles conjugated to anti-CD63 antibody. Sample fixation was performed with 2% glutaraldehyde and 4% paraformaldehyde for 2 h at room temperature, and the EVs were inspected under a Sigma 500 electron microscope (Carl Zeiss, Jena, Germany). The size and quantity of the isolated EVs were examined using the NanoSight NS300 instrument (Malvern Panalytical Ltd., Malvern, UK) equipped with a 405 nm laser. A 60-s video was recorded with a frame rate of 30 frames/s, and the particle movement was evaluated using NTA software (version 3.0, Malvern Panalytical). Each sample was analyzed three times, and the counts were merged.
For western blotting, EDS, serum-derived small EVs, and Huh7 total cell lysate were lysed in RIPA lysis buffer (100 μL; Thermo Scientific, Waltham, MA, USA) and incubated on ice for 10 min. Total protein concentration was quantified by the bicinchoninic acid assay (Thermo Scientific). The proteins (10 μg) were separated on 4-20% Mini-PROTEAN TGX™ gels (Bio-Rad, Hercules, CA, USA) and then transferred to poly(vinylidene difluoride) membranes (Amersham; GE Healthcare, Munich, Germany). The membranes were blocked in 5% nonfat milk in TBS-T and immunoblotted using the following primary antibodies:

Isolation of serum RNA and small EV-derived RNA from peripheral blood samples from patients
RNA from serum-derived EVs was extracted using the SeraMir™ Exosome RNA Amplification Kit (System Biosciences) according to the manufacturer's instructions. Briefly, serum samples (300 μL) were mixed with ExoQuick solution (72 μL) and incubated at 4°C overnight. The mixtures were centrifuged at 1500 g for 30 min at room temperature before the supernatants were removed and the pellets resuspended in PBS (100 μL). The EV lysates were mixed with lysis buffer (300 μL) and 100% EtOH (200 μL). After vortexing for 10 s, the mixtures were transferred to a spin column and centrifuged at 15 928 g for 1 min and then washed twice with wash buffer (400 μL). After further centrifugation for 2 min, small EV-derived RNA was eluted in elution buffer (30 μL).
The RNA concentration was assessed using the NanoDrop 2000 spectrophotometer (Thermo Scientific), while its yield and size distribution were analyzed using the Agilent 2100 Bioanalyzer and RNA 6000 Nano kit (Agilent Technologies, Foster City, CA, USA).

Quantitative reverse transcription PCR (RT-qPCR)
The expression levels of the serum-derived and the serum small EV-derived lncRNA were measured using RT-qPCR. Serum RNA (500 ng) was reverse transcribed into complementary DNA (cDNA) using the PrimeScript™ RT Master mix (TaKaRa Bio, Otsu, Japan), whereas small EV-derived RNA (500 ng) was reverse transcribed using the miScript II RT kit (QIAGEN, Hilden, Germany). The resultant cDNAs were used as templates for RT-qPCR with the amfiSure qGreen Q-PCR Master Mix (GenDEPOT), which was monitored in real time using the ABI 7300 Real-Time PCR System (Applied Biosystems™, Foster City, CA, USA). PCR conditions were as follows: 15 s at 95°C for denaturation, 34 s at 60°C for primer annealing, and 30 s at 72°C for primer extension. The following primer pairs were used: LINC00853 forward: AAAGGCTAGGCGATCCCACA, reverse: ACTCCCTAGCTTGGCTCTCCT; HMBS forward: GGAGGGCAGAAGGAAGAAAACAG, reverse: CACTGTCCGTCTGTATGCGAG. The 2^(−ΔΔCt) method was used to determine target gene expression relative to the internal control gene, HMBS. Relative LINC00853 levels were calculated as 2^(−ΔΔCt), where ΔCt = Ct(LINC00853) − Ct(HMBS) and ΔΔCt = ΔCt(individual sample) − ΔCt(mean of normal samples). All measurements were performed in triplicate.
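The relative quantification described above can be illustrated with a short sketch. The function names and all Ct values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this study.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Delta Ct: target Ct (LINC00853) minus reference Ct (HMBS)."""
    return ct_target - ct_reference


def relative_expression(sample_dct, control_dcts):
    """Fold change by the 2^(-ddCt) method.

    ddCt is the sample delta Ct minus the mean delta Ct of the
    normal (control) samples.
    """
    ddct = sample_dct - mean(control_dcts)
    return 2.0 ** (-ddct)


# Hypothetical Ct values (illustration only).
control_dcts = [delta_ct(30.0, 24.0), delta_ct(31.0, 25.0)]  # both equal 6.0
sample_dct = delta_ct(27.0, 25.0)                            # 2.0

fold_change = relative_expression(sample_dct, control_dcts)
print(fold_change)  # ddCt = 2.0 - 6.0 = -4.0, so 2 ** 4 = 16.0
```

A lower ΔCt in a sample than in the controls means the target amplified earlier relative to HMBS, which translates into a fold change above 1.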

Statistical analysis
The data are presented as mean ± SD of three experiments. All statistical analyses were performed in IBM SPSS version 22.0 (SPSS Inc., Chicago, IL, USA) and GRAPHPAD PRISM version 7.01 (GraphPad Software, San Diego, CA, USA). For numerical variables, one-way analysis of variance (ANOVA) with Tukey's post hoc analysis was used to perform multiple comparisons between the three groups. The associations between categorical parameters were assessed using the two-sided chi-square test. Survival curves were plotted using the Kaplan-Meier method, and significant differences between the curves were determined using the log-rank test. P values < 0.05 were considered statistically significant. The accuracy of each candidate biomarker for HCC was assessed by the area under the curve (AUC), sensitivity, and specificity based on receiver operating characteristic (ROC) curve analysis. The Youden index was used to determine optimal cutoff values.
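As a rough illustration of the ROC-based steps above, the sketch below picks the cutoff that maximizes the Youden index (J = sensitivity + specificity − 1) and computes the AUC as the probability that a case scores higher than a control. The analyses in this study were run in SPSS and GraphPad Prism; this plain-Python version, its function names, and its toy data are assumptions made purely for illustration.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity, J) with the largest J.

    labels: 1 = disease, 0 = control. A sample is called positive
    when its value is greater than or equal to the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


def auc(values, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    score = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(pos) * len(neg))


# Toy fold-change values (hypothetical): four controls, then four cases.
values = [1.0, 2.0, 3.0, 20.0, 15.0, 30.0, 40.0, 5.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(youden_cutoff(values, labels))  # (5.0, 1.0, 0.75, 0.75)
print(auc(values, labels))            # 0.875
```

In the toy data, a cutoff of 5 captures all four cases while misclassifying one control, which gives the largest J among the candidate cutoffs.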

Selection of candidate HCC-associated lncRNAs
To identify novel lncRNAs that play key roles in the development of HCC, we analyzed the publicly available lncRNA profiles of 371 HCC and 50 surrounding nontumor tissues from the TCGA_LIHC dataset (Fig. 1A). Among the 14 269 lncRNAs, 3674 were significantly differentially expressed between the HCC and nontumor specimens (P < 0.05 and ≥ 1.5-fold change). Specifically, 3140 lncRNAs were upregulated while 534 lncRNAs were downregulated in HCC (Fig. 1B). Volcano plot analysis identified seven distinctly upregulated lncRNAs (SFTA1P, HOTTIP, HAGLROS, LINC01419, HAGLR, CRNDE, and LINC00853; Fig. 1C). Based on a review of the literature (Table S1), we propose LINC00853 as a novel HCC-related lncRNA that has not been reported thus far. We verified the expression of LINC00853 in publicly available HCC RNA-Seq datasets (TCGA_LIHC, GSE94660, GSE124535, and GSE114564) and found not only that it was consistently overexpressed in HCC in all three datasets, but also that its expression increased with the progression of liver disease to HCC (Fig. 1D, 1E, P = 0.0017). The remaining six lncRNAs were also significantly overexpressed in HCC (Fig. S1). Survival analysis based on LINC00853 expression in the TCGA_LIHC dataset showed that high LINC00853 expression was associated with poor overall survival and disease-free survival (Fig. 1F, log-rank P = 0.002 and P = 0.006, respectively).
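The selection criteria above (P < 0.05 and at least a 1.5-fold change) can be sketched as a simple filter. How the fold change was computed in the original analysis is not stated here, so the ratio of group means on a linear scale used below is an assumption, as are the function name and the example numbers.

```python
def classify_lncrna(tumor_mean, nontumor_mean, p_value,
                    fold_cutoff=1.5, alpha=0.05):
    """Label a transcript as 'up', 'down', or 'ns' (not significant).

    Assumes linear-scale group means with nontumor_mean > 0, and a
    fold change defined as tumor_mean / nontumor_mean.
    """
    if p_value >= alpha:
        return "ns"
    ratio = tumor_mean / nontumor_mean
    if ratio >= fold_cutoff:
        return "up"
    if ratio <= 1.0 / fold_cutoff:
        return "down"
    return "ns"


# Hypothetical transcripts: (tumor mean, nontumor mean, P value).
print(classify_lncrna(6.0, 2.0, 0.001))  # up
print(classify_lncrna(1.0, 4.0, 0.010))  # down
print(classify_lncrna(6.0, 2.0, 0.200))  # ns
print(classify_lncrna(2.2, 2.0, 0.001))  # ns

# The reported counts are internally consistent: 3140 up + 534 down = 3674.
total_differential = 3140 + 534
```

Applying both thresholds together is what separates statistically significant but small shifts from the changes counted as differential expression.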

Expression of LINC00853 in the serum and serum small EVs in the test cohort
To evaluate the utility of LINC00853 as a noninvasive diagnostic marker for HCC, we measured LINC00853 expression in the serum (Fig. 2A) and serum EVs of 10 healthy controls and 10 patients with HCC. After separation from the serum, the EVs were characterized using TEM (Fig. 2B), immunoblotting for positive and negative protein markers of EV (Fig. 2C), and NTA (Fig. 2D). RT-qPCR analysis revealed that the level of serum-derived LINC00853 was similar in the two groups (Fig. 2A, P = 0.7246), whereas that of the serum EV-derived LINC00853 (EV-LINC00853) was significantly higher in patients with HCC than in healthy controls (Fig. 2E, P < 0.001).

Validation of EV-LINC00853 as a diagnostic biomarker for HCC
A total of 90 patients with HCC and 92 patients without HCC were enrolled to validate the diagnostic performance of EV-LINC00853 for HCC. Demographic and clinical parameters of all subjects are listed in Table 1. The most common etiology of CH, LC, and HCC was hepatitis B virus. The percentage of patients with mUICC stage I, II, III, IVA, and IVB tumors was 35%, 16%, 29%, 12%, and 8%, respectively. The expression of EV-LINC00853 was significantly higher in patients with HCC than in healthy controls, patients with CH, and patients with LC (Fig. 3A). ROC curve analysis revealed that EV-LINC00853 had excellent discriminatory ability [AUC = 0.934, 95% confidence interval (CI) = 0.887-0.966] in the diagnosis of HCC. The optimal cutoff value for the change in EV-LINC00853 expression was 14-fold (Fig. 3B). Considering the uneven age distribution between the groups, we tested whether EV-LINC00853 expression varied with patient age. EV-LINC00853 expression was comparable between different age groups (Fig. S2).

Diagnostic performance of EV-LINC00853 for early-stage HCC
Next, we evaluated the diagnostic value of EV-LINC00853 for early-stage HCC and compared it with the diagnostic performance of AFP. Table 2 and Fig. 4A-C summarize the diagnostic performance and ROC curves of EV-LINC00853 and AFP for the diagnosis of HCC based on the tumor stage and when compared with different control groups. EV-LINC00853 displayed excellent discriminatory ability in the diagnosis of all-stage HCC (Fig. 4A) as well as early-stage HCC (Fig. 4C). The high diagnostic performance of EV-LINC00853 was maintained even when the control group was changed from patients without HCC to patients with CH or LC. The ROC AUC of EV-LINC00853 (0.908-0.969) was significantly higher than that of AFP (0.541-0.713) in all subgroup analyses (P < 0.001 for all comparisons). Using a 14-fold increase as cutoff for EV-LINC00853 expression, and 20 ng·mL−1 as cutoff for the AFP level, EV-LINC00853 had a sensitivity of 93.75%, specificity of 89.77%, and 76.92% positive predictive value, while AFP showed only 9.38% sensitivity, 72.73% specificity, and 11.11% positive predictive value for the diagnosis of early-stage HCC (mUICC stage I). Figure 5A compares the positivity rate of EV-LINC00853 and AFP in healthy subjects and in the CH, LC, and HCC groups. Interestingly, EV-LINC00853 had a high positivity rate even in AFP-negative HCC (Fig. 5B-D). In mUICC stage I tumors, EV-LINC00853 had 97% positivity in AFP-negative HCC and 67% positivity in AFP-positive HCC (Fig. 5D).
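For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and positive predictive value follow from a 2 × 2 table. The counts are hypothetical: they were chosen only because they reproduce the percentages reported above, and the actual counts should be taken from Table 2 rather than from this example.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and PPV from 2 x 2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all cases
        "specificity": tn / (tn + fp),  # true negatives among all controls
        "ppv": tp / (tp + fp),          # true positives among all positives
    }


# Hypothetical counts (assumption): 32 early-stage cases and 88 controls,
# picked so that the resulting percentages match those reported in the text.
ev_linc00853 = diagnostic_metrics(tp=30, fn=2, tn=79, fp=9)
afp = diagnostic_metrics(tp=3, fn=29, tn=64, fp=24)

for name, result in (("EV-LINC00853", ev_linc00853), ("AFP", afp)):
    formatted = ", ".join(
        f"{metric} = {value * 100:.2f}%" for metric, value in result.items()
    )
    print(f"{name}: {formatted}")
```

The positive predictive value depends on the ratio of cases to controls in the cohort, so it would differ in a surveillance population where early HCC is much rarer.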

Prognostic performance of EV-LINC00853 in the validation cohort
Previous survival analyses showed that high tissue LINC00853 expression was associated with poor overall and disease-free survival in TCGA_LIHC dataset (Fig. 1F). Therefore, we evaluated the prognostic power of EV-LINC00853 in our validation cohort. In mUICC stage II HCC, patients with high EV-LINC00853 expression had lower overall survival rate than those with low EV-LINC00853 expression (Fig. 6, HR = 16.55, 95% CI = 1.52-179.7, log-rank P = 0.021). EV-LINC00853 expression was not associated with the overall survival rate in other-stage HCC (Fig. S3).

Discussion
In this study, we identified seven lncRNAs that were differentially expressed between the HCC and the nontumor tissues using TCGA_LIHC data. Among them, LINC00853 was a novel lncRNA that has never been reported in association with HCC or other malignancies. Interestingly, LINC00853 was upregulated in serum EVs, but not in the serum of patients with HCC. In the validation cohort consisting of 90 HCC subjects and 92 non-HCC subjects, EV-LINC00853 had better discriminative power in the diagnosis of all-stage HCC and early-stage HCC (AUC = 0.935 and 0.969, respectively) than AFP (AUC = 0.713 and 0.548, respectively).
To the best of our knowledge, our study is the first to report LINC00853 as a novel HCC-related EV-derived biomarker. We believe EV-LINC00853 will be useful for diagnosing early-stage tumors without elevated AFP levels. In the present study, only 9% of the early HCC (mUICC stage I) cases were AFP-positive, while 94% of them were EV-LINC00853-positive. Moreover, 97% of the AFP-negative early-stage HCC cases were positive for EV-LINC00853. These results contrast with those of previous studies, which reported that lncRNA expression is correlated with AFP levels [31,38]. LINC00853 is likely to act independently of AFP, and thus, it may be a more useful biomarker in patients with CH or LC who sometimes exhibit elevated AFP levels in the absence of HCC, leading to false-positive results.
Fig. 3. Expression of EV-LINC00853 and its diagnostic performance in the validation cohort. (A) Violin plot of EV-LINC00853 expression, as measured by RT-qPCR. Statistically significant differences were determined using the one-way ANOVA with Tukey's multiple comparisons test. Black horizontal lines denote means, and error bars represent SEM. Compared to healthy liver: *P < 0.05, **P < 0.01, ***P < 0.001; compared to CH: #P < 0.05, ##P < 0.01, ###P < 0.001; compared to LC: §P < 0.05, §§P < 0.01, §§§P < 0.001. (B) Analysis of EV-LINC00853 ROC curve in patients with HCC vs control (healthy, CH, and LC). Statistically significant differences in the AUC were relative to AUC of 0.5. Target gene expression was calculated relative to that of HMBS.
LINC00853 is a 1826-nucleotide lncRNA located on chromosome 1p33. However, little is known about its biological function in disease, including malignancy. Generally, lncRNAs exert their influence via epigenetic modifications, such as chromatin modulation and DNA methylation, by altering the stability of proteins and complexes, or by acting as miRNA sponges. Through these mechanisms of action, lncRNAs have been implicated in the six hallmarks of cancer: self-sustained growth signaling, resistance to growth inhibition, avoidance of apoptosis, uncontrolled proliferation, promotion of angiogenesis, and metastasis [42]. In HCC, the lncRNA HOTAIR induces epigenetic silencing of the HOXD locus [43], HULC may function as a competing endogenous RNA [44], while TERC forms part of the catalytic center of the telomerase complex [45]. Moreover, several lncRNAs have been shown to be involved in Wnt/β-catenin and STAT3 signaling, cancer stem cells, and epithelial-to-mesenchymal transition in HCC [46].
In our study, EV-LINC00853 expression was associated with overall survival only in patients with mUICC stage II HCC. In fact, the positivity rate of EV-LINC00853 decreased with increasing tumor stage, suggesting that overexpression of EV-LINC00853 may not reflect aggressiveness of HCC. Considering that tissue expression of LINC00853 increased with tumor progression in the TCGA_LIHC dataset, the prognostic performance of EV-LINC00853 ought to be evaluated in a larger number of patients to confirm these results.
The current study has several limitations. First, we did not investigate the functional role of EV-LINC00853 in HCC development. Considering that little is known about LINC00853 and its roles in cancer, this is an area that warrants further research. Second, we did not confirm the diagnostic performance of EV-LINC00853 in an external patient cohort. HCC is a heterogeneous disease with various underlying etiologies, variable global prevalence, and many poorly defined prognostic patient subsets. Ours was a single-center study in which hepatitis B was the cause of CH, LC, and HCC in the vast majority of subjects; thus, the results may not be directly generalizable to patient populations with different etiological backgrounds. For these reasons, large multicenter studies involving patients with a variety of liver diseases and originating from different geographical regions will be needed to comprehensively evaluate the diagnostic and prognostic usefulness of the lncRNA biomarkers [46]. Third, we did not explore EV-LINC00853 expression in malignancies other than HCC; thus, we could not confirm its specificity for HCC. However, the majority of lncRNAs are highly tissue-specific, and it is possible that some may be specifically expressed in HCC, allowing for fast diagnosis and better disease management [22,46]. Finally, due to the limited amount of human serum available (300 μL/sample), we were unable to use EV separation methods such as ultracentrifugation [47,48], as recommended by MISEV2018 [49]. However, through preliminary experiments, we identified the optimal precipitation kit and RNA isolation method to use with a small amount of sample (data not shown), and confirmed that the extracted EVs satisfied the MISEV2018 criteria (Fig. 2).

Conclusions
A member of the lncRNA family, LINC00853, was significantly upregulated in the serum EVs of patients with HCC. EV-LINC00853 had excellent discriminatory ability, significantly better than that of AFP, in the diagnosis of both all-stage HCC and early HCC. Furthermore, EV-LINC00853 showed high positivity even in AFP-negative early HCC cases. Our findings indicate that EV-derived LINC00853 can serve as a potential noninvasive diagnostic biomarker for HCC that may be of particular value in patients with AFP-negative tumors.

Table S1. Seven long non-coding RNAs overexpressed in hepatocellular carcinoma.
Fig. S1. Six known lncRNAs expression in HCC cohorts.
Fig. S2. Age-related LINC00853 expression in subjects without HCC.
Fig. S3. Prognostic power of EV-LINC00853 expression in the validation cohort.