Functional characterization of diverse ring-hydroxylating oxygenases and induction of complex aromatic catabolic gene clusters in Sphingobium sp. PNB

Highlights
• Gene clusters responsible for aromatic degradation in Sphingobium sp. PNB were sequenced.
• The ferredoxin from sphingomonads is structurally unique.
• Substrate specificities of several ring-hydroxylating oxygenases were determined.
• Oxygenase capable of transforming alkylaromatics was characterized.
• Complex regulation of degradative genes was revealed by real-time PCR analyses.


Introduction
Bacteria of the genera Sphingomonas, Novosphingobium, Sphingopyxis and Sphingobium, commonly referred to as sphingomonads [1], are well known for their potential in bioremediation and industrial applications [2,3]. Sphingomonads are widespread in various aquatic and terrestrial environments and are isolated with an exceptionally high frequency compared with bacteria from other taxonomic groups [3]. The members of these genera are often isolated and studied because of their ability to degrade a wide range of recalcitrant natural and anthropogenic aromatic compounds, including polycyclic aromatic hydrocarbons (PAHs) [3]. Degradation pathways in sphingomonads and non-sphingomonads are quite similar, but there is a low degree of homology between the genes/enzymes of the degradation pathways. The extraordinary metabolic diversity of sphingomonads is primarily due to the existence of multiple ring-hydroxylating oxygenases (RHOs) and the conservation of specific gene clusters. These bacteria supposedly evolved as an independent group, restricting gene transfer to other bacteria and enabling these organisms to adapt faster to new potential carbon sources in the environment.
RHOs catalyze the initial oxidation step of a broad range of aromatic hydrocarbons, including PAHs. RHOs have one or two soluble electron transport (ET) proteins, which deliver reducing equivalents to the α-subunit of the hetero-multimeric αnβn or homo-multimeric αn forms of terminal oxygenases for oxygen activation during catalysis [4]. Structural studies on representative oxygenases showed that the α-subunit of RHOs contains an N-terminal iron-sulfur protein (ISP) domain, with a conserved Rieske [2Fe-2S] center, and a C-terminal catalytic domain having a conserved mononuclear iron-binding site [5]. RHO α-subunits have been classified on the basis of their evolutionary and functional behaviors, in relation to structural configuration of substrates and preferred oxygenation site(s) [6].
The sphingomonad strains Sphingobium yanoikuyae B1 [7], Novosphingobium aromaticivorans F199 [8], Sphingobium sp. P2 [9] and Sphingomonas sp. LH128 [10] are capable of degrading a wide range of aromatic compounds. All of them possess seven pairs of genes coding for the large and small subunits of RHOs and a single ET system, consisting of a ferredoxin and a ferredoxin reductase. The arrangement of degradative genes in sphingomonads is complex, with the genes scattered across several gene clusters, in contrast to the coordinately regulated, organized operonic structure of genes in other bacteria. The metabolic versatility of sphingomonads is presumed to be due to their ability to oxidize a wide range of organic compounds, but the substrate specificities of the individual oxygenases are poorly studied. Furthermore, the regulation of the complex gene clusters in sphingomonads remains undefined.
Sphingobium sp. PNB, isolated from municipal waste-contaminated soil, is capable of growing with phenanthrene as the sole source of carbon and energy. Strain PNB can also utilize or cometabolize a number of aromatic compounds, including high-molecular-weight PAHs [11,12]. The present study focuses on the molecular cloning, sequencing and organization of genes responsible for the degradation of various aromatic compounds, in order to understand the functional aspects of diverse RHO α-subunits. Further, the structural uniqueness of a single ferredoxin component involved in electron transfer to multiple RHO subunits and the induction profiles of the degradative genes in strain PNB were revealed, expanding the current perception of the complex catabolic architecture present in sphingomonads.

Identification of RHO α-subunits in strain PNB
Using degenerate primers (Table S1) designed from a multiple sequence alignment (MSA) of RHO α-subunits from sphingomonads, the respective gene segments in strain PNB were amplified by PCR. Sequencing of the PCR products followed by blastx analyses confirmed the amplification of genes corresponding to α-subunits of six distinct RHOs, designated as ahdA1b, ahdA1c, ahdA1d, ahdA1e, ahdA1f and xylX. The primers designed to amplify ahdA1a failed to yield any amplicon.

Cloning and sequence analysis of aromatic catabolic genes
Out of 1000 fosmid clones, eight supported the desired PCR amplification with one or more of the primer sets corresponding to the different RHO α-subunit genes (Table S1). Among them, fosmid clone FC-31 was found to harbor the α-subunit gene ahdA1f, clone FC-183 harbored ahdA1c and ahdA1d, and clone FC-781 harbored ahdA1b, ahdA1e and xylX. Subcloning, screening for the presence of RHO α-subunit gene(s) and sequencing, followed by sequence alignment and blast searches, identified putative ORFs. The subclones that did not serve as templates for the amplification of RHO α-subunit gene(s) were also sequenced using M13 forward and reverse primers and analyzed. Those that showed the presence of putative genes involved in the metabolism of aromatic compounds were further sequenced by primer walking and analyzed as described above to identify additional putative genes and proteins of the degradative gene clusters. Further, gaps between genes expected to be in close proximity were bridged by a conventional primer walking method, using primers designed from the sequences at the proximal ends of the genes. Examination of the sequence revealed 37 complete, 5 partial and 2 disrupted ORFs. Putative genes and proteins identified from the above analyses are listed in Table S2. Based on protein sequence homology and conserved domain analyses, a number of genes are likely to be involved in PAH or other aromatic degradation pathways. Sequenced aromatic catabolic gene clusters of strain PNB were mapped and compared with the homologous gene clusters reported in various sphingomonads (Fig. 1).
Aligned sequence data encode seven pairs of putative oxygenase α- and β-subunits (AhdA1[a-f]A2[a-f], XylXY). Of these, the α-subunit gene ahdA1a is disrupted by the insertion of a transposase and a resolvase, which prevented PCR amplification, as stated above. In addition, the α-subunit gene ahdA1e was found to be truncated, with a deletion of 10 nucleotides (between bases 646 and 647) with respect to the corresponding homologous genes in related sphingomonads [8,9,13]. Interestingly, this is the only RHO large subunit in sphingomonads with a consensus sequence of D-X-D-X2-H-X4-H, which is slightly different from that of the classical non-heme iron coordination site, E-X3/4-D-X2-H-X4/5-H [8,14]. The majority of the catabolic enzymes from strain PNB (Table S2) are 99-100% identical with those found in Sphingomonas sp. LH128 whole genome data (NCBI BioProject: PRJNA172017), while the rest showed maximum similarity to those encoded in either Novosphingobium aromaticivorans F199 [8] or Sphingobium chungbukense DJ77 [15]. The aromatic degradative genes identified in this study showed maximum identity to those found in other sphingomonads, viz. Sphingomonas sp. LH128, Sphingobium yanoikuyae B1, Sphingobium sp. P2, Sphingomonas sp. CHY-1 (described as the closest neighbor of genus Sphingobium) [16], Sphingomonas chungbukensis DJ77 (reclassified as Sphingobium) and Novosphingobium aromaticivorans F199, whereas the 16S rRNA gene sequence (1451 bp) of strain PNB showed 92.11, 95.04, 95.17, 96.39, 95.36 and 92.85% identity, respectively, to those of these strains. The observed sequence similarity at the catabolic gene level and at the 16S rRNA level, along with the signatures of transposases found in the catabolic gene clusters of strain PNB as well as in other sphingomonads [3], clearly indicates gene transfer events during their evolution.
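The two iron-coordination consensus patterns quoted above can be written as regular expressions, which makes the difference between them explicit. The following is only a minimal sketch: the function name and the toy peptides are hypothetical and are not taken from AhdA1e or any other real protein, and a regex hit on its own says nothing about whether a site is functional.

```python
import re

# Classical non-heme iron site quoted in the text: E-X(3/4)-D-X2-H-X(4/5)-H
CLASSICAL = re.compile(r"E.{3,4}D.{2}H.{4,5}H")
# Variant consensus quoted in the text: D-X-D-X2-H-X4-H
VARIANT = re.compile(r"D.D.{2}H.{4}H")


def classify_iron_site(seq: str) -> str:
    """Report which of the two consensus patterns occurs in a protein sequence.

    The classical pattern is checked first; 'none' means neither was found.
    """
    if CLASSICAL.search(seq):
        return "classical"
    if VARIANT.search(seq):
        return "variant"
    return "none"


# Hypothetical toy peptides, for illustration only.
toy_classical = "MKTEAAADWWHGGGGHLL"  # E-AAA-D-WW-H-GGGG-H
toy_variant = "MKTDADWWHGGGGHLL"      # D-A-D-WW-H-GGGG-H

print(classify_iron_site(toy_classical))
print(classify_iron_site(toy_variant))
```

In practice such a scan would be run on the deduced amino acid sequences of the α-subunits, and any hit would still need to be checked against the alignment.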

Phylogenetic analysis of multiple RHO α-subunits and ET protein sequences
Sequence analysis revealed seven sets of putative α- and β-subunit RHO genes, with one of the α-subunits (ahdA1a) disrupted. Each α-subunit contains an N-terminal ISP domain, with a conserved Rieske [2Fe-2S] center, a C-terminal catalytic domain having a conserved mononuclear iron-binding site and a conserved aspartate, which is known to facilitate inter-subunit electron transfer between the ISP and catalytic domains of α-subunits [14,17,18].
The genes encoding β-subunits were found adjacent to those of the α-subunits in all the sets of oxygenases, indicating possible co-evolution of α- and β-subunits and the presence of hetero-multimeric (αnβn)-type RHOs in strain PNB. Fig. 2 illustrates the phylogenetic relation of the α-subunit protein sequences (AhdA1b, AhdA1c, AhdA1d, AhdA1e, AhdA1f and XylX) in strain PNB and the homologous sequences from other organisms, as mentioned in Table S3. Although the α-subunits in strain PNB share conserved domain regions, their nucleotide sequences and deduced amino acid sequences share limited homology with those of the non-sphingomonad counterparts. Moreover, phylogenetic analysis reveals that the individual α-subunit proteins in sphingomonads are distantly related (Fig. 2). Pairwise sequence alignments among the α-subunits in strain PNB showed identities in the ranges of 55-64% and 25-48% at the levels of nucleotide and amino acid sequences, respectively. In previous studies describing multiple RHOs in sphingomonads, the substrate preferences of most of the RHOs have not been studied at length. Rather, the degradative genes have been annotated as bph, phn or ahd genes, merely on the basis of the aromatic compounds degraded by the individual species. A closer look at the phylogenetic tree of α-subunits reveals that the clustering depends broadly on substrate specificities [6]. Tree topology indicates that the homologous proteins first branch according to their substrate class preferences; within each branch, more similar sequences group in accordance with substrate sub-classes and ultimately cluster according to the species tree. It has been observed that each of the homologous α-subunit proteins from sphingomonads clusters together in different clades (Fig. 2). Apart from the homologous α-subunit proteins from sphingomonads (78-100% identity), a few homologous α-subunits were also detected from the whole genome sequence of Cycloclasticus sp. P1, which showed up to 64% sequence identity to the α-subunit proteins determined in strain PNB. According to the classification suggested by Chakraborty et al. [6], AhdA1b and AhdA1f of strain PNB and their homologues belong to A-IIIαβ type RHOs, where AhdA1f corresponds to the well-studied PAH dioxygenases in sphingomonads [7,10,19] while AhdA1b corresponds to ethylbenzene dioxygenase (72.8% identity) in Rhodococcus jostii RHA1 [20], one of the least explored A-IIIαβ type RHOs. On the other hand, XylX has largely been described as a benzoate/toluate dioxygenase belonging to B-IIαβ type RHOs [8]. The α-subunit from strain PNB, designated as XylX, also clustered with benzoate/toluate dioxygenases present in various genera and showed maximum identity (50.8%) with the well-characterized benzoate dioxygenase from Pseudomonas putida [21]. On the other hand, AhdA1c, AhdA1d and AhdA1e, all of which belong to C-IVαβ type RHOs, branched into three different subclusters (Fig. 2). AhdA1c and AhdA1d showed 49.75 and 47.25% identity with the biochemically characterized o-halobenzoate dioxygenases of Achromobacter xylosoxidans A8 [22] and Burkholderia mallei ATCC 23344 [23], respectively. Similarly, AhdA1e clustered distinctly along with the homologous α-subunits from other sphingomonads and shared a common ancestry with o-halobenzoate dioxygenase and salicylate 5-hydroxylase.
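Percent identity values such as those quoted above depend on how the denominator is chosen. The sketch below shows one common convention, counting identical residues over gap-free aligned columns. The source does not state which convention or software was used, so the function name, the gap handling and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap ('-') in either sequence are
    ignored, so the denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment: 6 gap-free columns, 5 of them identical.
identity = percent_identity("ACDE-FGH", "ACDQWFG-")
print(round(identity, 2))
```

Other conventions (for example dividing by the full alignment length or by the shorter sequence length) give different numbers for the same alignment, which is worth keeping in mind when comparing identities across studies.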
As in other sphingomonads, only one ferredoxin and one ferredoxin reductase were identified, each encoded in a separate cluster [3]. Analysis of AhdA3 revealed it to be a Rieske [2Fe-2S] type ferredoxin with the conserved C-X-H-Xn-C-X2-H motif, a distinct feature of this family [24]. Fig. 3A illustrates the phylogenetic relation of AhdA3 with homologous sequences from various xenobiotic-degrading organisms described in Table S4. Although these proteins share a conserved domain, AhdA3 shares limited sequence homology (<50% identity) with its non-sphingomonad counterparts. It is evident from the dendrogram that the ferredoxins of the sphingomonads are highly similar (81-100% identity at the amino acid level) and cluster together. AhdA3 and the homologous sequences in sphingomonads displayed a few differentially conserved amino acids across the length of the proteins (Met26, Asn33, Gln57, Ile61, Phe66, Gly68, Ser70, Ala77, Ala80 and Phe81) in comparison to those of non-sphingomonads (Fig. 3B). However, the ferredoxins from a few non-sphingomonads, viz. EtbAc, PhnA3 and PhnAb, complementing ethylbenzene dioxygenase in Rhodococcus jostii RHA1, PAH dioxygenase in Cycloclasticus sp. A5 and phenanthrene dioxygenase in Alcaligenes faecalis AFK2, respectively, were found to be phylogenetically close to the sphingomonad ferredoxins, with significant sequence similarity (Fig. 3). This observation is congruent with the phylogenetic relatedness of the corresponding oxygenases from these organisms to those of the sphingomonads (Fig. 2), indicating a possible event of lateral transfer of oxygenase gene clusters among them.
The ferredoxin reductase encoded by ahdA4 belongs to the glutathione reductase (GR) type. It showed a maximum of 40.11% identity with the biochemically characterized ferredoxin-NAD+ reductase component of ethylbenzene dioxygenase present in the non-sphingomonad strain Rhodococcus jostii RHA1 [20]. Fig. S1 shows the phylogenetic relationship of AhdA4 with the homologous sequences from various xenobiotic-degrading organisms listed in Table S5. The ferredoxin reductase sequences from sphingomonads cluster together (72-100% identity at the amino acid level), similar to the clustering obtained with the corresponding terminal oxygenase and ferredoxin components.
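The notion of differentially conserved residues used for the ferredoxin alignment can be illustrated with a short scan over alignment columns. This is a minimal sketch under an assumed definition: a column counts as differentially conserved when one group is invariant at that position and its residue never appears in the other group. The source does not give its exact criterion, and the sequences below are hypothetical toy strings, not ferredoxin sequences.

```python
def differentially_conserved(group_a, group_b):
    """Return (position, residue) pairs conserved in group_a but absent in group_b.

    Both groups are lists of pre-aligned sequences of identical length.
    Positions are 1-based alignment columns.
    """
    length = len(group_a[0])
    if any(len(s) != length for s in group_a + group_b):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i in range(length):
        col_a = {s[i] for s in group_a}
        col_b = {s[i] for s in group_b}
        # Invariant, non-gap residue in group A that group B never uses.
        if len(col_a) == 1 and "-" not in col_a and col_a.isdisjoint(col_b):
            hits.append((i + 1, next(iter(col_a))))
    return hits


# Hypothetical toy alignment: group A stands in for one clade,
# group B for the comparison set.
clade_a = ["MKFGA", "MRFGA"]
clade_b = ["MKYGS", "MKWGT"]
print(differentially_conserved(clade_a, clade_b))
```

Columns reported this way are in alignment coordinates; mapping them back to residue numbers of a particular protein (such as Phe66 in AhdA3) requires subtracting the gaps that precede each column in that sequence.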

Homology modeling of terminal oxygenase subunits and ferredoxin proteins
Secondary structure prediction of the translated protein sequences of the terminal oxygenase α- and β-subunits (AhdA1fA2f) revealed that both proteins belong to the structural class 'alpha and beta' proteins (a+b) (SCOP: 53931), whereas the ferredoxin (AhdA3) belongs to the 'all beta' class (SCOP: 48724) with three stacked beta sheets. A search for homologs of the above proteins in the Brookhaven Protein Data Bank (PDB) yielded a close resemblance to the oxygenase components (PDB: 2GBX:A; 2GBX:B) and ferredoxin (PDB: 2I7F:A), respectively, of the biphenyl 2,3-dioxygenase from Sphingobium yanoikuyae B1. AhdA1f showed 78.6% identity over 454 amino acids, whereas AhdA2f showed 68.39% identity over 174 amino acids, with 2GBX:A and 2GBX:B, respectively. On the other hand, AhdA3 was found to be 81.48% identical over 108 amino acids with 2I7F:A. Using these chains as templates, models of the terminal oxygenase α- and β-subunits (AHD-O PNB ) and ferredoxin (AHD-F PNB ) proteins were generated. The quality of the modeled structures was found to be satisfactory, as observed from PROCHECK, VERIFY3D and VADAR analyses. Table S6. As observed from the docked poses, NDO-F 98164 binds at the depression between two adjacent α-subunits of NDO-O 98164 , which is in congruence with an earlier study [26]. In contrast, both BDO-F B1 and AHD-F PNB seem to bind at a pronounced depression formed by two α- and two β-subunits (Fig. 4), the other putative ferredoxin binding site, as postulated by Ashikawa et al. [26]. Mapping of predicted interface residues onto the alignment of ferredoxin sequences obtained from various xenobiotic-degrading organisms revealed a few differentially conserved amino acid residues (Phe66, Gly68 and Phe81) in sphingomonads (Fig. 3B). Thus, it is believed that the Rieske-type [2Fe-2S] ferredoxins of sphingomonads might have evolved to complement the multiple oxygenases present in these organisms.

Functional expression of RHOs
In order to investigate the substrate specificities of the six RHOs present in strain PNB, the corresponding α-subunits along with the evolutionarily related β-subunits were PCR-amplified and cloned into pET28a. To provide the terminal oxygenase component with an appropriate ET system, the ahdA3 (ferredoxin) and ahdA4 (ferredoxin reductase) genes were cloned in a pET28a-compatible vector, pCDF-1b. The construct harboring the ET components was co-transformed individually with each of the terminal oxygenase constructs into Escherichia coli (E. coli) BL21(DE3). The recombinant E. coli strains, when induced with isopropyl-β-thiogalactopyranoside (IPTG), produced appreciable levels of characteristic polypeptides, indicative of the expression of the various components of the oxygenases, including the ET proteins, as revealed by SDS-PAGE analysis (data not shown). The recombinant E. coli strains producing multi-component oxygenases were incubated overnight separately with several aromatics chosen on the basis of their phylogenetic affiliation to substrate classes. GC-MS analyses of the n-butyl boronated (NBB) derivatives of the neutral extract of the water-soluble products released into the culture medium indicated the formation of aromatic dihydrodiols (Table 1). Among the oxygenases, AhdA1fA2f, belonging to the A-IIIαβ sub-class of RHOs, was shown to dioxygenate naphthalene, phenanthrene, anthracene, biphenyl, acenaphthene, benzo[a]pyrene and benz[a]anthracene to the respective dihydrodiols. AhdA1bA2b, which phylogenetically belongs to the same subclass (A-IIIαβ) of RHOs (although AhdA1b displayed an identity of 39% at the amino acid level with AhdA1f), could also dioxygenate naphthalene, phenanthrene, anthracene, biphenyl, acenaphthene, benzo[a]pyrene and benz[a]anthracene to their respective dihydrodiols, in addition to dioxygenating ethylbenzene, propylbenzene, cumene and p-cymene.
Recombinant strains expressing AhdA1fA2f and AhdA1bA2b individually along with the same set of ET proteins were also found to transform indole to indigo. However, in comparison to AhdA1fA2f, transformation of indole to indigo was found to be more rapid in the presence of AhdA1bA2b, even with no gene induction. On the other hand, AhdA1cA2c, AhdA1dA2d and AhdA1eA2e broadly clustered with biochemically characterized o-substituted benzoate dioxygenases. The recombinant strains expressing AhdA1cA2c and AhdA1dA2d along with the ET components (AhdA3 and AhdA4) could transform salicylic acid to catechol by salicylate 1-hydroxylase activity, as reported in other sphingomonads [9,27,28]. These recombinant strains were also shown to transform anthranilic acid to 2-aminophenol; however, no detectable metabolite was identified when 2-chloro- or 2-iodobenzoate was used as substrate. No such activities could be detected in the recombinant strain harboring AhdA1eA2e, which might be due to the truncated form of AhdA1e. XylX-XylY, which clustered with typical benzoate/toluate dioxygenases, when expressed along with the ET components (AhdA3 and AhdA4), showed transformation of benzoic acid and p-toluic acid to catechol and 4-methylcatechol, respectively. Salicylate 1-hydroxylase and benzoate/p-toluate dioxygenase activities were further confirmed by incubating the respective reaction mixtures in the presence of another recombinant strain producing catechol 2,3-dioxygenase (XylE), which furnished yellow-colored products with characteristic absorbance around 340 nm, indicating the formation of 2-hydroxymuconic semialdehyde or its methyl derivative. The biotransformed products catechol, methylcatechol and 2-aminophenol were subsequently characterized by HPLC analyses, by comparing the retention times and UV-visible spectra (obtained from diode array analysis) with those of the authentic compounds analyzed under identical conditions (data not shown).
In addition, a few more putative catabolic upper pathway enzymes (XylM, XylA, XylB and XylC) and lower pathway enzymes (XylF, XylG, XylJ, XylQ and XylK), involved in the metabolism of xylene and 2-hydroxymuconic semialdehyde, respectively, were identified (Table S2). The sequenced clusters also encode a putative NtrC-type regulator (AhdR), largely reported to be involved in the regulation of aromatic degradation pathway enzymes [36], and a TetR-type transcriptional regulator. However, ahdR in strain PNB is disrupted by the insertion of a transposase. The sequenced clusters also encode one each of pyruvate phosphate dikinase, TonB-dependent receptor protein, glutathione S-transferase, 4-hydroxythreonine-4-phosphate dehydrogenase and IS4 family transposase, as well as three hypothetical proteins, whose roles in aromatic degradation could not be determined.

Real-time PCR analyses
Results obtained from real-time PCR analyses with cDNA synthesized from the respective RNAs isolated from cells grown in the presence of either phenanthrene or biphenyl are shown in Fig. 5. With the exception of ahdA1b and ahdA4, most of the genes were overexpressed in phenanthrene-grown cells as compared with succinate-grown cells. However, ahdA1b, ahdA4, nahD, ahdC and xylE were not overexpressed in biphenyl-grown cells. On the other hand, catA (GenBank: KC683533), encoding catechol 1,2-dioxygenase, was found to be upregulated in biphenyl-grown cells but marginally downregulated in phenanthrene-grown culture.
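Relative expression in real-time PCR experiments of this kind is often reported as a fold change of substrate-grown over control-grown cells. The sketch below uses the widely used 2^-ΔΔCt calculation with a reference gene. This is an assumption for illustration: the section above does not state which quantification method or reference gene was used, and the Ct values in the example are hypothetical.

```python
def fold_change(ct_target_test: float, ct_ref_test: float,
                ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Relative expression by the 2^-ddCt method (assumes ~100% PCR efficiency).

    'test' would be, e.g., phenanthrene-grown cells and 'ctrl'
    succinate-grown cells; 'ref' is a reference (housekeeping) gene.
    """
    d_ct_test = ct_target_test - ct_ref_test   # normalize test sample
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalize control sample
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# (after normalization) in the test condition than in the control.
ratio = fold_change(ct_target_test=20.0, ct_ref_test=15.0,
                    ct_target_ctrl=24.0, ct_ref_ctrl=16.0)
print(ratio)  # values above 1 indicate upregulation, below 1 downregulation
```

A ratio close to 1 corresponds to genes described above as not overexpressed, while a ratio slightly below 1 would correspond to marginal downregulation.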

Discussion
The ability of sphingomonads to degrade diverse harmful aromatic compounds may be linked to the presence of an evolutionarily unique enzyme system. Evolutionary relationships among several RHOs in sphingomonads have already been studied [37][38][39][40][41]; these studies showed a radical divergence of their RHO genes from those of other genera, indicating a restriction in genetic exchange between sphingomonads and non-sphingomonads [3]. The presence of multiple peripheral enzymes (terminal oxygenases) and a single copy of each of the downstream degradative enzymes in the catabolic clusters is unique to strain PNB and various other sphingomonads [8,13]. Moreover, it has already been reported that the multiple RHOs from sphingomonads interact with only a single set of the corresponding ET system [3]. Effectively, the maximal activity of RHOs is shown to require the specific ET proteins, ferredoxin and ferredoxin reductase. Although the specific ET proteins can be partially replaced by endogenous E. coli ET proteins at the cost of reduced RHO activity, the role of ferredoxin is more significant than that of the reductase in productive catalysis [42,43]. The ET proteins from sphingomonads are reported to be quite flexible in their redox partner interactions, as together they are capable of transferring electrons to some of the oxygenase components of the RHOs from several non-sphingomonads. In contrast, although the reductase from sphingomonads could be replaced by reductases from non-sphingomonads, alternative ferredoxin components from non-sphingomonads failed to transfer electrons to the terminal oxygenase component of RHOs in sphingomonads [44]. Again, a single ferredoxin present in the degradative cluster of sphingomonad strain CHY-1 has been reported to have varied affinities for the different terminal oxygenases [28].
Thus, it is interesting to understand the structural nature of a single ferredoxin component capable of transferring electrons to structurally diverse terminal oxygenases in sphingomonads. In the present study, molecular modeling followed by docking analysis of the ferredoxin component with the heterohexameric α3β3-type terminal oxygenase of strain PNB reflects its unique structural configuration, which allows it to bind at a pronounced depression formed by two α- and two β-subunits, similar to that observed with the structurally characterized subunits from Sphingobium yanoikuyae B1. However, similar analysis with the corresponding proteins from Pseudomonas putida NCIB 9816-4 revealed striking differences, with binding at the other depression formed by the two adjacent α-subunits of NDO-O 98164 [26]. The unique structural configuration of ferredoxin in sphingomonads is also reflected in the presence of differentially conserved amino acids, which are also involved as interface residues in protein-protein interactions. Indeed, structural information on the rest of the oxygenases in sphingomonads will help to understand the mechanism of interaction of a single ferredoxin with multiple terminal oxygenases.
Seven pairs of α- and β-subunits identified in strain PNB correspond to the analogous subunits of RHOs (bphA1A2[a-f] and xylXY) in Sphingomonas yanoikuyae B1 [13], Novosphingobium aromaticivorans F199 [8] and Sphingomonas sp. LH128 (NCBI BioProject: PRJNA172017). Substrates were chosen on the basis of the phylogenetic relationships and substrate preferences of the α-subunits in strain PNB. The transformation of these putative substrate(s) and related compounds by recombinant strains expressing the individual RHO terminal oxygenases, along with the sole available set of constituent ET components, indicates their broad substrate specificities.
The α-subunit of the terminal oxygenase corresponding to AhdA1f, responsible for the initial dioxygenation of a number of PAHs and polycyclic heteroaromatic hydrocarbons, has been well characterized in strains LH128 [10], B1 [7] and CHY-1 [45], whose α-subunits are respectively 99.92, 76.71 and 76.62% identical to AhdA1f. On the other hand, AhdA1bA2b, functionally reported for the first time, is capable of transforming alkylbenzenes, such as ethylbenzene, propylbenzene, cumene and p-cymene, in addition to the aromatics described for AhdA1fA2f. Based on the formation of biotransformed products (Table 1), differences in the regiospecificity of AhdA1fA2f and AhdA1bA2b have been noticed for a number of PAHs. AhdA1fA2f can transform phenanthrene to both phenanthrene cis-3,4-dihydrodiol (retention time, Rt 11.15 min) and phenanthrene cis-1,2-dihydrodiol (Rt 13.67 min), in contrast to the formation of phenanthrene cis-3,4-dihydrodiol (Rt 11.20 min) only with AhdA1bA2b. Phenanthrene cis-3,4-dihydrodiol and phenanthrene cis-1,2-dihydrodiol have already been reported in the phenanthrene degradation pathways via 1-hydroxy-2-naphthoic acid and 2-hydroxy-1-naphthoic acid, respectively, in strain PNB [11]. Based on phylogenetic affiliation, AhdA1b uniquely clustered together with the homologous sequences from other sphingomonads and evolved from the common ancestor of the largely defined PAH and alkyl- and/or arylbenzene (which includes biphenyl) dioxygenases, showing the closest relationship with the biochemically characterized ethylbenzene dioxygenases from Rhodococcus jostii RHA1 and Rhodococcus sp. DK17, which are 100% identical at the protein level. However, the ethylbenzene dioxygenase from strain RHA1 was also reported to transform various aromatic compounds, including benzene, biphenyl, ethylbenzene and naphthalene, with the latter as the preferred substrate. Functionally, AhdA1bA2b showed characteristics of both A-IIIαβ and A-IVαβ type RHOs, justifying the phylogenetic affiliation of AhdA1b within the tree.
Salicylate hydroxylase (AhdA1cA2c and AhdA1dA2d) described in strain PNB and the homologous proteins reported in related sphingomonads, viz. strains P2, B1 and CHY-1 [9,27,28], are the only three-component decarboxylative monooxygenases that transform salicylic acid to catechol. Apart from salicylate 1-hydroxylase activity, anthranilate 1-hydroxylase activity is reported in strain PNB, similar to that reported in CHY-1. On the other hand, XylX-XylY, as reported earlier, showed transformation of benzoic acid and p-toluic acid to catechol and 4-methylcatechol, respectively. Thus, it is believed that the presence of multiple copies of highly conserved RHO α- and β-subunits and their broad substrate specificities may provide strain PNB with a pronounced selective advantage in the management of a wide range of aromatics present in the environment. Although phylogenetic analyses suggested a substrate preference of AhdA1aA2a towards phenanthrene and/or hetero-substituted aromatics such as dibenzothiophene and dibenzofuran, this could not be validated experimentally owing to the disrupted nature of this gene in strain PNB. Although the substrate preference of AhdA1eA2e was predicted to be salicylic acid, this particular RHO α-subunit, being truncated in strain PNB, failed to show any oxygenase activity.
The presence of multiple transposons and insertion elements in strain PNB as well as in the genomes of other sphingomonads strongly indicates pronounced DNA rearrangements [3,46], and suggests significant roles for them in the localization of the conserved gene clusters and the establishment of the degradation pathways for various compounds [47][48][49]. Comparison of the genome sequences of Sphingobium chlorophenolicum L-1 and Sphingobium japonicum UT26 suggests horizontal gene transfer events in the pentachlorophenol degradation pathway [50]. The complex genetic architecture of sphingomonads was further revealed by the presence of a number of overlapping genes (nahD-ahdA1c, ahdA2c-ahdA3, xylX-xylY-orf183_9, ahdA1d-ahdA2d, xylQ-xylK and ahdA2a-ahdA1a) involved in aromatic degradation in strain PNB. The presence of overlapping genes is thought to be the result of evolutionary pressure to conserve sequence length [51][52][53], minimize genome size and regulate gene expression [52,54,55].
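Overlapping genes of the kind listed above can be detected from annotation coordinates alone. The following is a minimal sketch, assuming each gene is given as a name with 1-based inclusive start and end positions on the same sequence and ignoring strand. The gene names and coordinates in the example are hypothetical placeholders, not the coordinates of the PNB clusters.

```python
def overlapping_pairs(genes):
    """Return (gene1, gene2, overlap_length) for every overlapping gene pair.

    genes: list of (name, start, end) with 1-based inclusive coordinates.
    Strand is ignored in this sketch.
    """
    ordered = sorted(genes, key=lambda g: g[1])  # sort by start position
    pairs = []
    for i, (name1, start1, end1) in enumerate(ordered):
        for name2, start2, end2 in ordered[i + 1:]:
            if start2 > end1:
                break  # later genes start even further right
            overlap = min(end1, end2) - start2 + 1
            pairs.append((name1, name2, overlap))
    return pairs


# Hypothetical annotation: geneA and geneB share 4 bases, geneC is separate.
toy_genes = [("geneA", 1, 300), ("geneB", 297, 900), ("geneC", 950, 1500)]
print(overlapping_pairs(toy_genes))
```

Short overlaps of a few bases, as in the toy example, typically arise when the stop codon of one gene overlaps the start codon of the next.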
Compared to succinate-grown cells, phenanthrene- or biphenyl-induced cells of strain PNB showed upregulation of many genes present in the gene clusters reported in this study. Generally, the groups of genes transcribed in the same frame were upregulated simultaneously. However, in biphenyl-grown cells, ahdC was downregulated in spite of being in the same frame as the genes necessary for the upper pathway of degradation of various aromatics. Again, nahD, encoding an isomerase essential for the degradation of phenanthrene but not for biphenyl, was found to be overexpressed in phenanthrene-grown cells but not in biphenyl-grown cells. Expression profiles of catA and xylE (Fig. 5) suggest that the central metabolite catechol is processed through the ortho (β-ketoadipate) pathway in biphenyl and benzoic acid degradation but through the meta (α-ketoadipate) pathway in the case of phenanthrene, naphthalene and salicylic acid degradation. Interestingly, although AhdA1b, along with the corresponding β-subunit and ET components, was able to transform a large spectrum of aromatics including biphenyl and phenanthrene (Table 1), its expression was downregulated in the presence of either of the substrates. Again, AhdA4, which is present as a single copy in the degradation cluster and is essential for the functional activity of the multiple RHOs, was not overexpressed in the presence of either biphenyl or phenanthrene. Moreover, neither salicylic acid-grown nor benzoic acid/p-toluic acid-grown cells were found to overexpress the majority of the degradative genes reported in this study. Thus, it is believed that the expression of the degradative gene cluster is governed by specific inducing substrates rather than by the range of compounds that the peripheral enzymes can transform. Based on the prediction of regulation, it has already been suggested that multiple inducers are required for the expression of aromatic catabolic enzymes in sphingomonads [56].
Moreover, it is likely that genes, which are not overexpressed but essential for the degradation of a particular compound, must be complemented by the expression of appropriate catabolic genes present in other location in the genome. In this context, it may be mentioned that the analysis of genomes of different aromatics-degrading sphingomonads revealed the presence of multiple copies of degradative genes, in addition to those present in the aromatic-degradative clusters reported in this study [46].
As mentioned above, the regulation of genes for the degradation of various aromatics in sphingomonads is quite complex. The ahdR gene, encoding a putative regulator belonging to the NtrC family, was identified in close proximity to the genes coding for the catabolism of aromatic compounds. Members of this family are known to activate RNA polymerase containing the alternative sigma factor σ54.
Homologs of ahdR are found in many sphingomonads with a similar organization of degradative genes. Analyses of promoter region sequences of the catabolic plasmid pNL1 in strain F199 suggested that regulatory events are modulated through the interaction of BphR with σ54-type promoters [56]. However, the actual role of ahdR in aromatic degradation cannot be ascertained, because truncated versions of the gene are found not only in Sphingomonas sp. P2 and Novosphingobium pentaromativorans US6-1 but also in strain PNB, where it carries a transposase insertion (orf26, Fig. 1); all of these strains show a similar degradative gene arrangement and are reported to mineralize phenanthrene or high molecular weight PAHs. Apart from ahdR, a gene (orf781_19) encoding a TetR-type transcriptional regulator, a family often involved in aromatic degradation [36], is also present in the degradative cluster of strain PNB, similar to that observed in strain LH128 (Fig. 1). However, the role of the TetR-type regulator in the regulation of these proximal genes cannot be inferred, as the gene is absent from the rest of the sphingomonads compared in this study. Thus, the present study lays the groundwork for uncovering the molecular basis of the complex regulation of gene expression involved in the degradation of a broad spectrum of aromatics in sphingomonads.

Experimental procedures
Amplification and identification of RHO α-subunit genes from strain PNB
RHO α-subunit genes belonging to the seven paralogous groups (annotated as bphA1a-bphA1f and xylX in Novosphingobium aromaticivorans F199, GenBank: NC_002033) were amplified from strain PNB using degenerate primers (Table S1). Primers were designed from a multiple sequence alignment (MSA) of nucleic acid sequences obtained from various reported phenanthrene-degrading sphingomonads (Table S3). PCR amplifications were performed with an MJ Mini Gradient Thermal Cycler (Bio-Rad Laboratories, Inc.), followed by sequencing of the amplified PCR products according to the manufacturer's specifications for Taq DNA polymerase-initiated cycle sequencing reactions using fluorescent-labeled dideoxynucleotide terminators with an ABI PRISM 377 automated sequencer (Perkin-Elmer Applied Biosystems, Inc.). Sequence homology analyses were performed using both the blastn and blastx programs [57], available at the NCBI (NIH, Bethesda, MD).
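As an illustration of what a degenerate primer represents, the following minimal sketch counts and enumerates the concrete oligonucleotides encoded by a primer written with IUPAC ambiguity codes. The primer string used here is hypothetical and is not one of the primers listed in Table S1.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primer for illustration only (not from Table S1).
example_primer = "GGNTAYCAR"
print(degeneracy(example_primer))   # N (4) x Y (2) x R (2) = 16
print(expand(example_primer)[:3])
```

A lower degeneracy generally means a higher effective concentration of each individual oligonucleotide in the reaction, which is one reason primer design from an alignment tends to target well-conserved positions.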

Construction and screening of genomic library
Sphingobium sp. PNB was grown in Luria-Bertani (LB) broth overnight at 28°C. Genomic DNA from strain PNB was isolated and purified according to Marmur and Doty [58] with certain modifications and improvisations, as suggested by Lambert et al. [59]. A genomic library was prepared in the pCC2FOS Fosmid vector (Epicentre, Madison, Wisconsin) according to the manufacturer's protocol. Briefly, the genomic DNA from strain PNB was randomly sheared to approximately 40 kb fragments, and the ends were blunted using End-repair Enzyme Mix (CopyControl™ HTP Fosmid Library Production Kit, Epicentre) and ligated into the pCC2FOS vector. The ligated DNA was packaged with MaxPlax Lambda Packaging Extracts and transduced into the EPI300-T1R Phage T1-resistant E. coli plating strain, followed by spreading onto LB-agar plates containing 12.5 µg ml⁻¹ chloramphenicol. The resulting library was replica-plated and the clones containing RHO(s) were screened by PCR amplification using primers as described above.

Subcloning and sequencing
Fosmid DNA was individually isolated using the Miniprep Spin-Kit (Qiagen Inc., Stanford, USA) from the clones containing various RHO genes. Isolated DNA was then digested individually with PstI, SmaI, HindIII, XhoI and BglII, and the DNA fragments ranging from 2 to 8 kb were subcloned into pBluescript SK(−) and transformed into E. coli XL1-Blue cells. The transformants were plated onto LB-ampicillin plates containing 20 µg ml⁻¹ of 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal) and 0.1 mM IPTG. The plates were incubated overnight at 37°C. The transformants were then screened by PCR for the presence of various RHO α-subunit genes. For sequencing, the recombinant plasmids were isolated and subjected to DNA sequencing using M13 forward and reverse primers, followed by primer walking in both directions. Gaps between genes located in close proximity were bridged by the conventional primer walking method. Both DNA sequencing and sequence homology analyses were carried out as described above.

Phylogenetic analyses
Homologous RHO α-subunit (Table S3), ferredoxin (Table S4) and reductase (Table S5) protein sequences were identified with the blastp program [57] against the non-redundant (NR) database at NCBI using each RHO α-subunit paralog, AhdA3 and AhdA4 from strain PNB as query sequences. ClustalX v1.81 [60] was used to generate individual MSAs of the RHO α-subunit, ferredoxin and reductase protein sequences obtained from strain PNB and the corresponding homologous sequences from other sphingomonads and non-sphingomonads, followed by manual adjustment wherever necessary. Phylogenetic trees were constructed from distance data by the neighbor-joining (NJ) method, using the NJ algorithm implemented in ClustalX. The trees were visualized and manipulated using either the program Tree Explorer v2.12 [61] or iTOL: Interactive Tree Of Life, an online phylogenetic tree viewer and Tree Of Life resource [62].
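To make the distance-based tree construction concrete, the sketch below computes an uncorrected p-distance between two aligned sequences and runs the textbook neighbor-joining procedure on a small toy distance matrix. It is an illustrative re-implementation only: ClustalX applies its own distance calculation and options, and the toy distances and taxon names are hypothetical, not data from this study.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing residues over alignment columns without gaps."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Return (parent, child, branch_length) edges of an unrooted NJ tree.

    dist[a][b] must hold the symmetric distance for every pair of names.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in nodes}
    edges, next_id = [], 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other current nodes.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair minimizing Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the new internal node u to a and b.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(u, a, len_a), (u, b, len_b)]
        # Distances from u to every remaining node.
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Toy symmetric distance matrix (hypothetical values, five taxa).
toy_pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
             ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
             ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = {x: {} for x in toy_names}
for (x, y), value in toy_pairs.items():
    toy_dist[x][y] = toy_dist[y][x] = value

for parent, child, length in neighbor_joining(toy_names, toy_dist):
    print(parent, child, length)
```

In practice the distance matrix would be filled from the MSA (for example with `p_distance` or a corrected distance) rather than typed in by hand.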

In silico analysis
The homology models of monomers of the oxygenase α-subunit (AhdA1f), β-subunit (AhdA2f) and ferredoxin (AhdA3) were generated using the software MODELLER 9v7 [63], with the respective oxygenase components (PDB: 2GBX:A; 2GBX:B) and ferredoxin (PDB: 2I7F:A) of the biphenyl 2,3-dioxygenase from Sphingobium yanoikuyae B1 as templates. The models were checked using PROCHECK [64], VERIFY3D [65], VADAR [66] and PSIPRED [67]. For docking experiments, structures of the terminal oxygenase and ferredoxin components of naphthalene dioxygenase from Pseudomonas putida NCIB 9816-4 (NDO-O 98164 and NDO-F 98164 ) and those of biphenyl dioxygenase from Sphingobium yanoikuyae B1 (BDO-O B1 and BDO-F B1 ) were downloaded from the PDB. The monomeric structures of the oxygenase α-subunit (AhdA1f) and β-subunit (AhdA2f) from strain PNB, modeled above, were used to construct the heterohexameric form of the enzyme (AHD-O PNB ) for docking with the corresponding ferredoxin (AHD-F PNB ). GRAMM-X [68] was used to predict and assess the interactions between NDO-O 98164 and NDO-F 98164 , BDO-O B1 and BDO-F B1 as well as AHD-O PNB and AHD-F PNB . The program performs rigid-body docking using fast Fourier transform (FFT) methods, applying a smoothed Lennard-Jones potential to find protein complexes with the highest surface complementarity. From each docking, the 50 most probable predictions (ordered from most to least favorable) based on geometry, hydrophobicity and electrostatic complementarity of the molecular surface were considered for further analyses. The interface residues in the docked complex were predicted using the ProFace server [69].

Overexpression of oxygenases and in vivo assays
E. coli BL21(DE3) cells, each containing one of the RHOs together with the ferredoxin and reductase genes, were grown overnight in 5 ml LB medium in the presence of appropriate antibiotics. These cultures were used to inoculate 100 ml LB medium (0.1% v/v) and were incubated at 37°C until an OD600 of 0.5 was reached. After induction with IPTG (0.5 mM), the cells were further incubated overnight at 25°C. For in vivo assays, cells were centrifuged, washed and resuspended to an OD600 of 2.0 in M9 medium [70] supplemented with 0.2% glucose. Cells overexpressing the various RHO components were incubated overnight at 25°C with 400 µM of each test substrate dissolved in 2 ml of silicone oil.

Chemical analyses
After incubation, the resting cell cultures were centrifuged (8,000 × g, 10 min), and the supernatants were adjusted to pH 7.0 and extracted thrice with an equal volume of ethyl acetate. The combined extracts were dried over anhydrous sodium sulfate, evaporated under reduced pressure and finally resuspended in 100 µl N,N-dimethylformamide (DMF). Then, 100 µl of n-butylboronic acid solution (500 µg of n-butylboronic acid dissolved in 1 ml of DMF) was added, and the mixture was heated at 70°C for 15 min to form the NBB derivatives. The reaction mixture thus obtained was diluted 15-fold with cyclohexane and analyzed by GC-MS using a Varian model 3800 (Varian Inc., California, USA) with a Saturn 2200 mass spectrometer equipped with a 30 m × 0.25 mm (0.25 µm film thickness) DB5 MS capillary column (Agilent Technologies, California, USA). The inlet temperature was kept at 285°C while the transfer line temperature was kept at 270°C. The temperature program consisted of a 2 min hold at 80°C, an increase to 260°C at 18°C min⁻¹, a 6 min hold at 260°C, a further increase to 285°C at 4°C min⁻¹ and an 11 min hold at 285°C. The injection volume was 1 µl, and the carrier gas was helium (1 ml min⁻¹). The mass spectrometer was operated at an electron ionization energy of 70 eV, and dihydrodiols were detected by selected ion monitoring using the calculated mass of the NBB derivative. Low molecular weight polar metabolites, on the other hand, were resolved by HPLC using a Shimadzu model LC20-AT pump system equipped with a diode array model SIL-M20A detector and a C18 reversed-phase column attached to a model SIL-20A autosampler. The biotransformed products were eluted using a programmed gradient solvent system at a flow rate of 1.0 ml min⁻¹ and detected at 254 nm along with diode array analysis.
The mobile phase, consisting of methanol and water containing 1% (v/v) acetic acid, was a 45 min linear gradient from 50% (v/v) to 95% (v/v) aqueous methanol with hold at 95% (v/v) aqueous methanol for 10 min followed by 95% (v/v) to 50% (v/v) aqueous methanol over 5 min.
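Both the GC oven program and the HPLC gradient described above are piecewise-linear schedules, so their total run times and the setpoint at any time follow directly from the stated holds and ramps. The sketch below encodes the two schedules using only the values given in the text; the helper names (`segments`, `value_at`, `total_time`) are illustrative and not part of any instrument software.

```python
def segments(start, steps):
    """Turn (target_value, minutes) steps into (t0, t1, v0, v1) segments."""
    t, v, out = 0.0, float(start), []
    for target, minutes in steps:
        out.append((t, t + minutes, v, float(target)))
        t, v = t + minutes, float(target)
    return out


def value_at(segs, t):
    """Linearly interpolate the programmed value at time t (minutes)."""
    for t0, t1, v0, v1 in segs:
        if t0 <= t <= t1:
            if t1 == t0:
                return v1
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("time outside program")


def total_time(segs):
    """Total programmed run time in minutes."""
    return segs[-1][1]


# GC oven program (degrees C): ramp durations are (delta T) / (rate).
gc_oven = segments(80, [
    (80, 2),                      # 2 min hold at 80
    (260, (260 - 80) / 18),       # 18 degrees/min to 260
    (260, 6),                     # 6 min hold at 260
    (285, (285 - 260) / 4),       # 4 degrees/min to 285
    (285, 11),                    # 11 min hold at 285
])

# HPLC gradient (% v/v methanol): 50 -> 95 over 45 min, hold 10, back over 5.
hplc = segments(50, [(95, 45), (95, 10), (50, 5)])

print(total_time(gc_oven))        # 35.25 min
print(total_time(hplc))           # 60.0 min
print(value_at(hplc, 22.5))       # 72.5 % methanol at mid-gradient
```

The computed totals (35.25 min for the GC run and 60 min for the HPLC run) exclude any equilibration or cool-down time, which the text does not specify.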

RNA isolation, cDNA preparation and real-time PCR analysis
Total RNA was isolated using TRIzol (Invitrogen, Carlsbad, CA) from mid-exponential phase cultures of strain PNB grown individually on phenanthrene, biphenyl or succinate as sole carbon source. Residual DNA was removed by additional treatment with RNase-free DNase I (Thermo Scientific, Waltham, MA). Subsequently, cDNA was prepared with RevertAid reverse transcriptase (Thermo Scientific) and Random Hexamer Primer (Thermo Scientific), according to the manufacturer's instructions. To quantitatively estimate the expression of genes involved in the degradation of different aromatics, real-time PCR was performed in an ABI 7500 real-time PCR system (Applied Biosystems, California, USA) with various sets of primers (Table S8) using SYBR Green mix and cDNAs prepared from the different sets of cells. Relative changes in mRNA expression of various genes were compared with succinate as control, normalized to 16S rRNA, and quantified by the 2^−ΔΔCt method [71]. Mean values were obtained from triplicate experiments.
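The relative quantification described above can be sketched as follows: the target gene Ct is normalized to the 16S rRNA Ct in each condition (ΔCt), the succinate control is subtracted (ΔΔCt), and the fold change is 2 raised to the negative of that difference. The Ct values below are hypothetical, and averaging replicate Ct values before taking differences is an assumption of this sketch; the text does not state at which stage the triplicates were averaged.

```python
from statistics import mean


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values. 'ref' is the reference
    gene (16S rRNA here) and 'ctrl' is the control condition (succinate).
    """
    d_ct_test = mean(ct_target_test) - mean(ct_ref_test)
    d_ct_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    dd_ct = d_ct_test - d_ct_ctrl
    return 2 ** (-dd_ct)


# Hypothetical triplicate Ct values for one gene (not data from this study).
target_phenanthrene = [22.0, 22.5, 21.5]
rrna_phenanthrene = [10.0, 10.5, 9.5]
target_succinate = [25.0, 25.5, 24.5]
rrna_succinate = [10.0, 10.0, 10.0]

print(fold_change(target_phenanthrene, rrna_phenanthrene,
                  target_succinate, rrna_succinate))  # 8.0 (ddCt = -3)
```

A value above 1 indicates upregulation relative to succinate-grown cells, and a value below 1 indicates downregulation.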

Nucleotide sequence accession numbers
The nucleotide sequences described in this study were deposited in the GenBank database under the accession numbers GenBank: KF483792, GenBank: KF483793 and GenBank: KF483794.