Rapid bacterial identification by direct PCR amplification of 16S rRNA genes using the MinION™ nanopore sequencer

Rapid identification of bacterial pathogens is crucial for appropriate and adequate antibiotic treatment, which significantly improves patient outcomes. 16S ribosomal RNA (rRNA) gene amplicon sequencing has proven to be a powerful strategy for diagnosing bacterial infections. We have recently established a sequencing method and bioinformatics pipeline for 16S rRNA gene analysis utilizing the Oxford Nanopore Technologies MinION™ sequencer. In combination with our taxonomy annotation analysis pipeline, the system enabled the molecular detection of bacterial DNA in a reasonable time frame for diagnostic purposes. However, purification of bacterial DNA from specimens remains a rate‐limiting step in the workflow. To further accelerate the process of sample preparation, we adopted a direct PCR strategy that amplifies 16S rRNA genes from bacterial cell suspensions without DNA purification. Our results indicate that differences in cell wall morphology significantly affect direct PCR efficiency and sequencing data. Notably, mechanical cell disruption preceding direct PCR was indispensable for obtaining an accurate representation of the specimen bacterial composition. Furthermore, 16S rRNA gene analysis of mock polymicrobial samples indicated that primer sequence optimization is required to avoid preferential detection of particular taxa and to cover a broad range of bacterial species. This study establishes a relatively simple workflow for rapid bacterial identification via MinION™ sequencing, which reduces the turnaround time from sample to result, and provides a reliable method that may be applicable to clinical settings.

Acute infectious diseases remain one of the major causes of life-threatening conditions with high mortality, particularly in patients under intensive care. Rapid and accurate identification of pathogenic bacteria is therefore needed to initiate appropriate and adequate antibiotic treatment [1,2]. Although culture-based techniques are still at the forefront of clinical microbial detection, these methods are time-consuming and have the critical drawback of not being applicable to noncultivable bacteria [3].
As an alternative approach for overcoming the limitations of traditional culture-based bacterial identification, metagenomic sequencing analysis has been introduced for the diagnosis of bacterial infections [4]. Among the sequence-based microbiome studies, the 16S ribosomal RNA (rRNA) genes have been the most predominantly used molecular marker for bacterial classification [5]. The bacterial 16S rRNA gene is approximately 1500 bp long and contains both conserved and variable regions that evolve at different rates. The slow evolution rates of the former regions enable the design of universal primers that amplify genes across different taxa, whereas fast-evolving regions reflect differences between species and are useful for taxonomic classification [6].
Targeted amplification of specific regions of the 16S rRNA gene followed by next-generation sequencing (NGS) is a powerful strategy for identifying bacteria in a given sample. Despite their high-throughput capacity, second-generation DNA sequencing technologies provide relatively short read lengths with limited sequence information, which often hampers accurate classification of bacterial species [7]. The portable sequencing device MinION™ from Oxford Nanopore Technologies offers a number of advantages over existing NGS platforms [8,9]. Besides its small size and low cost, an intriguing feature of the MinION™ sequencer is that it can provide real-time, on-site analysis of any genetic material, which should be especially useful for clinical applications [10]. With its ability to generate longer reads, MinION™ analysis can target the whole coding region of the 16S rRNA gene, showing great potential for rapid pathogen detection with greater accuracy and sensitivity [11][12][13][14][15][16][17]. We have previously established a sequencing method and bioinformatics pipeline for rapid determination of bacterial composition based on 16S rRNA gene amplicon sequencing via the MinION™ platform [15]. A 5-min data acquisition using MinION™ and sequence annotation against our in-house genome database enabled the molecular detection of bacterial DNA in a reasonable time frame for diagnostic purposes.
In the current study, we attempted to further refine and update the protocols for 16S rRNA gene sequencing analysis. We evaluated the performance of primer sets targeting the near-full-length 16S rRNA gene. To accelerate the process of sample preparation, we adopted a direct PCR strategy to amplify the 16S rRNA gene from bacterial extracts without DNA purification.

Direct PCR amplification of 16S rRNA genes
The number of colony-forming units (CFU) of bacteria (Escherichia coli and Staphylococcus aureus) was determined by plating serial dilutions of cultures on agar plates and counting colonies [18]. For mechanical cell disruption, zirconia beads (EZ-Beads™; Promega, Madison, WI, USA) were added to the bacterial cell suspensions and the samples were vortexed for 30 s. The bacterial cell samples with or without mechanical disruption were added directly to PCRs for amplifying the 16S rRNA genes. For comparison, bacterial DNA was purified using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) and used as a PCR template. PCR amplification of 16S rRNA genes was conducted using the 16S Barcoding Kit (SQK-RAB204; Oxford Nanopore Technologies, Oxford, UK) containing the 27F/1492R primer set [19,20] and LongAmp™ Taq 2× Master Mix (New England Biolabs, Ipswich, MA, USA). Amplification was performed using an Applied Biosystems Veriti™ Thermal Cycler (Thermo Fisher Scientific, Waltham, MA, USA) with the following PCR conditions: initial denaturation at 95°C for 3 min, 25 cycles of 95°C for 20 s, 55°C for 30 s, and 65°C for 2 min, followed by a final extension at 65°C for 5 min. To determine the effects of human DNA contamination on 16S rRNA gene amplification, genomic DNA purified from the human monocytic cell line THP-1 was mixed with E. coli DNA and subjected to PCR. To amplify the human β-globin gene as an internal control for the human genome, the following primers were used: forward, 5ʹ-GGTTGGCCAATCTACTCCCAGG-3ʹ; and reverse, 5ʹ-TGGTCTCCTTAAACCTGTCTTG-3ʹ. Quantitative real-time PCR was performed using SYBR Green I fluorescence and a Rotor-Gene Q cycler (Qiagen). Melting-curve analysis was done using Rotor-Gene Q series software version 2.1.0 (Qiagen).
Genomic DNA from a mock bacterial community, MSA-1000™ 10 Strain Even Mix Genomic Material, was obtained from the American Type Culture Collection (ATCC, Manassas, VA, USA). The DNA mixture (1 ng) was used as a template for amplifying 16S rRNA genes. PCR amplification was conducted using the 16S Barcoding Kit and LongAmp™ Taq 2× Master Mix following the thermal cycling protocol described above. Alternatively, 16S rRNA genes were amplified using the KAPA2G™ Robust HotStart ReadyMix PCR Kit (Kapa Biosystems, Wilmington, MA, USA). Amplification conditions for fast PCR using the KAPA2G™ polymerase were as follows: initial denaturation at 95°C for 3 min, 25 cycles of 95°C for 15 s, 55°C for 15 s, and 72°C for 30 s, followed by a final extension at 72°C for 1 min.
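To make the difference between the two cycling programs concrete, the following minimal Python sketch sums the programmed hold times stated above. The class and variable names are illustrative only, and the calculation deliberately ignores temperature ramping and instrument overhead, so it gives a lower bound rather than the actual run time on the thermal cycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Programmed hold times of a PCR protocol, in seconds (ramping excluded)."""
    initial_denaturation_s: int
    cycles: int
    step_seconds: tuple  # (denaturation, annealing, extension) per cycle
    final_extension_s: int

    def hold_seconds(self) -> int:
        # Total programmed time = initial hold + cycles * per-cycle holds + final hold
        return (self.initial_denaturation_s
                + self.cycles * sum(self.step_seconds)
                + self.final_extension_s)


# Values taken from the protocols described in the text.
LONGAMP = CyclingProgram(180, 25, (20, 30, 120), 300)
KAPA2G = CyclingProgram(180, 25, (15, 15, 30), 60)

for name, program in (("LongAmp", LONGAMP), ("KAPA2G", KAPA2G)):
    print(f"{name}: {program.hold_seconds() / 60:.1f} min of programmed holds")
```

Under these assumptions the programmed holds add up to about 79 min for the LongAmp™ protocol and 29 min for the KAPA2G™ protocol; the remainder of any observed run time would come from ramping between temperatures.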

Whole-cell mock bacterial community
MSA-3000™ 10 Strain Mix Whole Cell Material was obtained from ATCC. Lyophilized bacterial cell pellets were suspended in PBS and divided into aliquots. The resulting cell suspensions were then either used for direct PCR to amplify the 16S rRNA genes (2.5 × 10⁴ cells/reaction) or subjected to mechanical cell disruption via bead-beating prior to PCR amplification. Bacterial DNA purified from the cell suspension was also used for 16S rRNA amplicon sequencing.

Sequencing of 16S rRNA gene amplicons
PCR products were purified using AMPure XP (Beckman Coulter, Indianapolis, IN, USA) and quantified using a NanoDrop (Thermo Fisher Scientific). A total of 100 ng DNA was used for library preparation, and MinION™ sequencing was performed using R9.4 flow cells (FLO-MIN106; Oxford Nanopore Technologies) according to the manufacturer's instructions. MinKNOW software ver. 1.11.5 (Oxford Nanopore Technologies) was used for data acquisition.

Bioinformatics analysis
MinION™ sequence reads (i.e., FAST5 data) were converted into FASTQ files using Albacore software ver. 2.2.4 (Oxford Nanopore Technologies). The FASTQ files were then converted to FASTA files using our own program. In these reads, simple repetitive sequences were masked using the tantan program ver. 13 with default parameters [21]. To remove reads derived from humans, we searched each read against the human genome (GRCh38) using minimap2 with default parameters [22]. Unmatched reads were regarded as reads derived from bacteria. Each of these reads was then searched with minimap2 against 5850 representative bacterial genome sequences stored in the GenomeSync database (http://genomesync.org). For each read, the species with the highest minimap2 score was taken as the species present in the sample. Taxa were determined using our in-house script based on the NCBI taxonomy database [23] and then visualized using Krona Chart [24]. Sequence data from this article have been deposited in the DDBJ DRA database (https://www.ddbj.nig.ac.jp/dra/index-e.html) under accession numbers DRR157203 to DRR157213.
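The best-hit step can be illustrated with a short Python sketch. This is not the in-house script used in the study; it only shows one way to keep the top-scoring minimap2 hit per read from PAF output. Two assumptions are made: the score is taken from the `AS:i:` tag when minimap2 reports one, and otherwise from PAF column 10 (number of matching residues) as a proxy; and the mapping from reference sequence names to species (`target_to_species`) is a hypothetical lookup table that would have to be built from the database in use.

```python
from collections import Counter


def best_hits(paf_lines):
    """Return {read_id: (target_name, score)}, keeping the top-scoring hit per read."""
    best = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # PAF has 12 mandatory columns
            continue
        read_id, target = fields[0], fields[5]
        score = int(fields[9])  # column 10: number of matching residues (proxy score)
        for tag in fields[12:]:
            if tag.startswith("AS:i:"):  # alignment score, if present (assumption)
                score = int(tag[5:])
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (target, score)
    return best


def species_counts(best, target_to_species):
    """Tally reads per species; unknown reference names are counted as unclassified."""
    return Counter(target_to_species.get(target, "unclassified")
                   for target, _ in best.values())


# Hypothetical PAF records and reference-to-species table, for illustration only.
paf_example = [
    "read1\t1450\t10\t1440\t+\tref_A\t5000000\t100\t1530\t1300\t1430\t60",
    "read1\t1450\t10\t1440\t+\tref_B\t4800000\t200\t1630\t1250\t1430\t0",
    "read2\t1500\t5\t1490\t-\tref_B\t4800000\t300\t1785\t1400\t1485\t60",
]
target_to_species = {"ref_A": "species A", "ref_B": "species B"}

print(species_counts(best_hits(paf_example), target_to_species))
```

Ties are resolved here in favor of the first hit encountered; a production pipeline would need an explicit policy for reads that match closely related species equally well.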

Statistical analysis
For permutational multivariate analysis of variance (PERMANOVA), Morisita's index of similarity ranging from 0 (no similarity) to 1 (complete similarity in composition) was used [25]. PERMANOVA tests were performed using the R package vegan [26].
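As an illustration of the similarity measure, the sketch below computes Morisita's overlap index between two samples from raw read counts per taxon, using the standard formula C = 2 Σ xᵢyᵢ / ((Dₓ + D_y) X Y) with Dₓ = Σ xᵢ(xᵢ − 1) / (X(X − 1)). The analysis in this study was performed with vegan in R; this Python function is only a sketch of the underlying calculation, and the example counts are hypothetical.

```python
def morisita_overlap(x, y):
    """Morisita's overlap index for two vectors of integer counts per taxon.

    x[i] and y[i] are the counts of taxon i in the two samples.
    """
    if len(x) != len(y):
        raise ValueError("count vectors must list the same taxa")
    total_x, total_y = sum(x), sum(y)
    if total_x < 2 or total_y < 2:
        raise ValueError("each sample needs at least two counts")
    # Simpson-type dominance terms for each sample
    d_x = sum(n * (n - 1) for n in x) / (total_x * (total_x - 1))
    d_y = sum(n * (n - 1) for n in y) / (total_y * (total_y - 1))
    if d_x + d_y == 0:
        raise ValueError("index is undefined when every count is 0 or 1")
    shared = sum(a * b for a, b in zip(x, y))
    return 2 * shared / ((d_x + d_y) * total_x * total_y)


# Hypothetical read counts for three taxa in two samples.
sample_a = [30, 10, 0]
sample_b = [10, 20, 10]
print(f"Morisita overlap: {morisita_overlap(sample_a, sample_b):.2f}")
```

Samples that share no taxa give an index of 0. Because the index is estimated from finite counts, samples with identical composition give a value close to 1 rather than exactly 1.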

Direct PCR approach for amplifying 16S rRNA genes from crude bacterial extracts
To overcome the time-consuming and laborious process of sample preparation for DNA sequencing, we tried to amplify the 16S rRNA gene directly from bacterial suspensions without a DNA purification step (Fig. 1). We used a commercially available kit (16S Barcoding Kit; Oxford Nanopore Technologies) with primers optimized for 16S rRNA amplicon sequencing on the MinION™ platform. The primers were designed to amplify the near-full-length 16S rRNA gene for bacterial identification [19,20]. Each indexing primer has a unique barcode for multiplexing and contains a tag sequence at the 5ʹ-end for attachment of sequencing adapters. Performance of the barcoded primers for 16S rRNA gene amplification was evaluated by PCR assays using bacterial cell suspensions. E. coli was chosen to represent gram-negative pathogens, and S. aureus was used as a model for hard-to-lyse bacteria with gram-positive cell walls. A defined amount of each bacterium was serially diluted, and the resulting cell suspensions were directly added to PCRs with LongAmp™ Taq DNA polymerase. For E. coli suspensions, the lower limit of detection was less than 1 × 10² CFU in agarose gel electrophoresis (Fig. 2A). In contrast, a higher number of cells was required for detecting 16S rRNA gene amplicons from S. aureus suspensions, for which the detection limit was 1 × 10³ CFU (Fig. 2B).
To facilitate the release of bacterial DNA, cells in suspensions were disrupted by vortexing with zirconia beads before being subjected to PCR amplification. Bead-beating proved to be effective for the direct amplification of 16S rRNA genes from cell suspensions of both E. coli and S. aureus, improving the yield of PCR products (Fig. 2C).

Rapid detection and identification of bacterial strains via direct PCR amplicon sequencing on MinION™
Having demonstrated the efficacy of the barcoded primers for amplifying 16S rRNA genes directly from bacterial suspensions, we investigated whether the direct PCR method affects MinION™ sequencing results and the accuracy of strain identification. Bacterial cell suspensions of E. coli and S. aureus were used for preparing 16S rRNA gene amplicon libraries, and the samples were then sequenced on MinION™ for 5 min (Table 1 and Fig. 3). In addition, sequencing libraries were prepared using purified bacterial DNA templates as a standard reference for comparison. Sequencing reads were analyzed using a bioinformatics pipeline based on a BLAST search against our in-house genome database GenomeSync. MinION™ sequencing data identified the bacteria at the species level, with more than 90% of reads being correctly assigned to each species (Fig. 3A,B). Shigella flexneri was additionally detected at a low abundance, probably due to its high sequence similarity to E. coli [27,28].
Moreover, the type of PCR template (purified DNA versus cell suspension) did not substantially affect the quality of sequence reads or the bacterial identification results (Table 1). These results demonstrate the utility of the direct PCR method, which can enable rapid pathogen identification from crude materials without the need for DNA purification.

Impact of nonbacterial DNA contamination on 16S rRNA gene amplification
Successful identification of infectious pathogens relies on the specific amplification of bacterial target sequences in clinical samples, which can often be contaminated with patient-derived human genetic material. We tested whether a higher amount of human DNA would affect the amplification of bacterial 16S rRNA genes. E. coli DNA (0.1 ng) was mixed with increasing amounts of human DNA extracted from the monocytic cell line THP-1, after which the mixture was subjected to PCR amplification of the 16S rRNA gene. The β-globin gene was amplified as an internal control for the human genome (Fig. 4A; lower panel). Bacterial 16S rRNA genes were specifically amplified even in a background of high human DNA concentrations. The contaminating human DNA had no significant inhibitory effect on PCR product yield, which was further confirmed by quantitative real-time PCR (Fig. 4B). Melting-curve analysis suggested that the targeted 16S rRNA amplicons were specifically generated by PCR (Fig. 4C). Thus, the presence of nonbacterial genetic material in the sample did not affect the sensitivity and specificity of 16S rRNA gene detection.

16S rRNA gene sequencing of a mock bacterial community
The performance of the current tools for 16S rRNA gene amplicon sequencing was further tested with a mixture of DNA prepared from 10 different bacterial species. The relative abundance of individual bacterial taxa was estimated from genome size and copy number of the 16S rRNA gene (Table 2). The mock community DNA mixture was used as a template for PCR, and the 16S rRNA gene amplicon libraries were sequenced on MinION™. Nine out of 10 bacterial strains were successfully identified at the species level, and PERMANOVA showed no significant community difference (P = 0.5) between data collected at different time points (Fig. 5). Thus, 3 min of run time generating 3985 reads was sufficient for identifying the nine species, whereas longer run times (5 min, 10 167 reads; 30 min, 44 248 reads) did not significantly affect species detection accuracy (Table 3). Some biases were observed in the taxonomic profile: Bacillus cereus was detected at lower abundances than expected, while Clostridium beijerinckii and E. coli were overrepresented. These instances of partially biased assignment or misidentification of bacterial species were not resolved by increasing the number of sequencing reads analyzed (Fig. 5). Our approach failed to identify Bifidobacterium adolescentis in the mock community even though it was represented in the database.
We also evaluated the potential of another DNA polymerase and PCR amplification protocol for bacterial species identification by MinION™ sequencing. KAPA2G™ Robust DNA Polymerase has a significantly faster extension rate than standard wild-type Taq, enabling shorter reaction times (approximately 100 min with LongAmp™ Taq versus 45 min with KAPA2G™) for amplifying 16S rRNA genes from the mock bacterial community. The rapid amplification protocol with KAPA2G™ did not affect the overall taxonomy assignment results of 16S rRNA gene sequence reads generated by MinION™ (Fig. 5 and Table 3). Three-minute sequencing of KAPA2G™-amplified 16S rRNA libraries identified nine bacterial species from the mock community. B. adolescentis was not detected, as was the case with LongAmp™ Taq polymerase.
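The estimate of expected relative abundance from genome size and 16S rRNA gene copy number can be sketched as follows. The sketch assumes that each strain contributes an equal mass of genomic DNA, so that the number of genome copies is inversely proportional to genome size and the expected share of 16S rRNA gene templates is proportional to copy number divided by genome size. Both this assumption and the example values are hypothetical and are not taken from Table 2.

```python
def expected_16s_fractions(strains):
    """Expected fraction of 16S rRNA gene templates per strain.

    strains maps a strain name to (genome_size_bp, rrn_copy_number).
    Assumes equal DNA mass per strain (an assumption made for this sketch).
    """
    weights = {
        name: copies / genome_bp
        for name, (genome_bp, copies) in strains.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical strains, for illustration only.
example_strains = {
    "taxon A": (5_000_000, 10),
    "taxon B": (2_500_000, 5),
    "taxon C": (2_000_000, 2),
}

for name, fraction in expected_16s_fractions(example_strains).items():
    print(f"{name}: {fraction:.0%}")
```

If a mixture is instead even by genome copies rather than by mass, the genome-size term drops out and the expected share depends on copy number alone.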

Evaluation of sample preparation methods for accurate bacterial identification via MinION™ sequencing
Given the successful identification of a broad range of bacterial species from mixed DNA samples, we further tested the utility of direct 16S rRNA gene amplification and MinION™ sequencing on a mixture of whole bacterial cells with intact cell walls (Table 4 and Fig. 6). We assessed the effects of DNA extraction procedures on MinION™ sequencing results. Bacterial cell pellets comprising 10 different species were suspended in PBS and divided into three aliquots. The first aliquot remained untreated (designated as 'Direct') and the second was subjected to bead-beating for mechanical cell disruption ('Processed'). Bacterial DNA purified from the third aliquot served as a reference for comparison ('Purified'). Regardless of the extraction procedure, all bacterial species except for B. adolescentis were correctly identified, and PERMANOVA did not indicate a significant effect of sample preparation method on community composition (P = 0.33). Although the difference was not statistically significant, the similarity indices suggest that species abundance differed across the three groups (Morisita's indices: [Direct:Processed] = 0.66, [Direct:Purified] = 0.65, and [Processed:Purified] = 0.87). The relative abundance of E. coli was especially high in the 16S rRNA sequencing library amplified directly from the untreated cell suspension, and sensitivity was impaired for the detection of several types of bacteria (Fig. 6A). Mechanical disruption of bacteria by bead-beating improved the results, yielding patterns of bacterial composition that were more similar to the reference group (Fig. 6B,C).

Discussion
Currently, identification of clinically relevant bacteria largely relies on culture-based techniques. However, culture-dependent methods are time-intensive and can lead to delayed or even incorrect diagnoses [29]. Metagenomic sequencing analysis provides an alternative approach for identifying bacterial pathogens in clinical specimens [5][6][7]. As previously reported, we developed a sequencing method and bioinformatics pipeline for 16S rRNA gene amplicon sequencing and analysis utilizing the nanopore sequencer MinION™ [15]. Although the system offers a faster turnaround time than other NGS platforms, purification of bacterial DNA from samples, which typically takes around 1-2 h, remains a rate-limiting step in the workflow. Moreover, bacterial DNA purification requires a multistep procedure including cell lysis, separation from contaminants, washing, and elution of purified material. These processes are not only time-consuming and laborious but can also increase the risk of sample mix-ups and cross-contamination. To further facilitate the process of sample preparation, we attempted to amplify 16S rRNA genes directly from bacterial cell suspensions without DNA purification [30,31]. Three minutes of sequencing run time generated a sufficient number of reads for taxonomic assignment, and we achieved successful identification of bacterial species with a total analysis time of less than 2 h. The direct PCR approach revealed that differences in cell wall morphology (gram status) significantly affected amplification efficiency and sequencing results. For example, S. aureus, a gram-positive bacterium with thick cell walls, was more resistant to heat lysis for DNA extraction and yielded less PCR product than E. coli. A mechanical disruption method such as bead-beating was useful in minimizing sample preparation bias.
Indeed, samples processed by bead-beating prior to PCR amplification exhibited a better representation of the mock bacterial composition. The differential susceptibility to cell lysis among bacterial species can affect 16S rRNA gene amplification and may introduce a bias in the relative abundance of bacterial species in the community. Thus, mechanical cell disruption preceding direct PCR amplification was indispensable for obtaining an accurate representation of the sample bacterial composition. We used new primer sets from Oxford Nanopore Technologies that are optimized for rapid 16S rRNA gene sequencing on the MinION™ platform. These universal primers are designed to amplify the near-full-length sequence of bacterial 16S rRNA genes. The specificity and sensitivity of 16S rRNA gene amplification with these primers were not substantially affected even when human DNA contaminants outweighed bacterial DNA. Using these primer sets and the updated sequencing protocols, we performed a metagenomic analysis of the precharacterized bacterial community consisting of 10 different species. Although the universal primers are expected to bind to regions that [32]. We speculate that these sequence mismatches lead to poor amplification of the Bifidobacterium 16S rRNA gene, resulting in the absence of these bacteria in the sequence data. Consistent with our results, it has been reported that the 27F primer has a bias toward underrepresentation of Bifidobacterium and other bacterial taxa in microbiome analysis, which is caused by nucleotide variations even in the phylogenetically highly conserved regions of 16S rRNA genes [32][33][34]. As shown here and in previous publications, universal primers (e.g., the 27F primer) commonly used for metagenomic analyses have a limitation related to amplification bias; thus, modifications of primer sequences are required to avoid preferential detection of particular taxa and to cover a broad range of bacterial species [35].
Our study has some limitations. First, the direct PCR approach has been tested only on pure cultures of bacteria and a mock community of precharacterized species. Successful amplification of 16S rRNA genes will depend greatly on the type of biological sample. More extensive studies are required to establish a reliable method for rapid bacterial identification, and future work will focus on optimizing and validating our direct PCR strategy on patient-derived clinical samples. Another issue is that some bacteria share high sequence identity. In this study, Sh. flexneri was additionally detected from a pure culture of E. coli. Thus, 16S rRNA gene sequencing has limited power to discriminate between closely related species [36,37]. Sequence analysis targeting additional genetic markers such as 23S rRNA genes may provide better resolution [38].
In conclusion, direct amplification of 16S rRNA genes from crude bacterial extracts can further accelerate sample processing for MinION™ sequencing. Direct 16S rRNA gene amplification combined with MinION™ sequencing provides an attractive option for accelerating pathogen detection. Further optimization and establishment of this relatively simple workflow for rapid bacterial identification via MinION™ sequencing would reduce the turnaround time from sample to result and provide a reliable method applicable to clinical settings.