Structure and steroid isomerase activity of Drosophila glutathione transferase E14 essential for ecdysteroid biosynthesis

Ecdysteroids are critically important for the formation of the insect exoskeleton. Cholesterol is a precursor of ecdysone and its active form 20‐hydroxyecdysone, but some steps in the ecdysteroid biosynthesis pathway remain unknown. An essential requirement of glutathione (GSH) transferase GSTE14 in ecdysteroid biosynthesis has been established in Drosophila melanogaster, but its function is entirely unknown. Here, we have determined the crystal structure of GSTE14 in complex with GSH and investigated the kinetic properties of GSTE14 with alternative substrates. GSTE14 has high‐ranking steroid double‐bond isomerase activity, albeit 50‐fold lower than the most efficient mammalian GSTs. Corresponding steroid isomerizations are unknown in insects, and their exact physiological role remains to be shown. Nonetheless, the essential enzyme GSTE14 is here demonstrated to be catalytically competent and have a steroid‐binding site.

Steroids play various essential roles in higher forms of life. Originating from cholesterol, a series of enzymatic reactions leads to steroid hormones in mammals and insects, as exemplified by sex hormones and ecdysteroids. Certain members of the family of glutathione transferases (GSTs), enzymes originally discovered as detoxication enzymes [1], have been shown to fulfill significant functions in steroid hormone biosynthesis. In humans and horses, GST A3-3 catalyzes steroid double-bond isomerizations with extraordinary efficiency, preceding the formation of testosterone and progesterone [2][3][4][5] (Fig. 1). The chemical mechanism of this isomerization has been described in molecular detail [6]. In Drosophila melanogaster, ecdysone is secreted from the ring gland and the fat body and plays a pivotal role in development and metamorphosis, as in other insects and arthropods. Its product 20-hydroxyecdysone (Fig. 1) is the active form required for molting (ecdysis), which encompasses shedding of the exoskeleton in transitions between different larval stages, pupation, and formation of the imago. Various reactions catalyzed by different cytochrome P450 enzymes have been uncovered in the biosynthesis of ecdysteroids [7]. Interestingly, an epsilon-class GST, GSTE14, has been shown to play an essential role in ecdysteroid biosynthesis [8] and has been identified with the mutant Noppera-bo among the Halloween genes [9]. Specifically, loss of the GSTE14 gene is detrimental to cuticle formation: it interrupts the formation of the exoskeleton and prevents ecdysis, but the effect can be rescued by the administration of 20-hydroxyecdysone. However, the actual function of GSTE14 remains unknown.
The genome of D. melanogaster has been found to harbor 42 GST genes [10], and a preliminary examination of the corresponding GSTome [11] of soluble enzymes has been performed [12]. Based on sequence similarities, the insect GSTs have been assigned to six classes: theta, omega, sigma, zeta, delta, and epsilon. The first four are common to eukaryotes, whereas the delta and epsilon classes appear specific for arthropods, including insects. In D. melanogaster, the delta and epsilon GSTs have 11 and 14 members, respectively [12,13], accounting for more than half of the fly GSTome and suggesting that the proteins have evolved for diverse functions. GSTE14 is the most divergent member in the epsilon branch of the phylogenetic tree [12]. Just a limited number of the D. melanogaster GSTs have been characterized in detail, and among the epsilon-class enzymes, structural studies of only GSTE6 and GSTE7 have previously been published [14]. Structural information on GSTs has been reviewed by Ketterman [15] and Wu and Dong [16]. We here report the crystal structure and functional properties of D. melanogaster GSTE14.

Materials and methods
Cloning, expression, and purification of the DmGSTE14 protein
DNA encoding the D. melanogaster GSTE14 (DmGSTE14) protein sequence (NP_610855.1) was codon-optimized for Escherichia coli expression and designed to include a His6-tag at the C terminus. The DNA sequence was synthesized by GENEWIZ (Leipzig, Germany) and subcloned into the pET22b vector (Novagen, Darmstadt, Germany) between the NdeI and SacI restriction sites. The resulting plasmid was used to transform the E. coli BL21 star strain. Isolated colonies from an LB-ampicillin agar plate (50 mg·L⁻¹ ampicillin) were used to inoculate a 50 mL LB-ampicillin culture (50 mg·L⁻¹ ampicillin), which was incubated at 37°C overnight. The 50 mL culture was then used to inoculate 4 L of LB-ampicillin medium. Protein expression was induced at OD600 = 0.6 with 1 mM (final concentration) isopropyl β-D-1-thiogalactopyranoside, and the bacteria were grown for a further 18 h at 37°C. Cells were harvested by centrifugation (4000 g, 20 min, 4°C), resuspended in binding buffer (20 mM sodium phosphate, 0.5 M NaCl, 100 µM DTT, and 20 mM imidazole, pH 7.4), and disrupted at 4°C using sonication (Vibracell; Bioblock, Waltham, MA, USA). After centrifugation at 20 000 g for 45 min at 4°C, the supernatant was loaded onto a 5 mL column containing Ni²⁺ Sepharose 6 Fast Flow resin (GE Healthcare, Chicago, IL, USA). Following washing with binding buffer, DmGSTE14 was eluted using 20 mM sodium phosphate, 0.5 M NaCl, 100 µM DTT, and 500 mM imidazole, pH 7.4. Fractions containing DmGSTE14 were pooled and dialyzed against 100 mM potassium phosphate buffer, pH 6.4. DmGSTE14 was concentrated using an Amicon ultra-spin column with a cutoff of 10 kDa (Millipore, Billerica, MA, USA). Protein purity was confirmed by Coomassie-stained 12% SDS/PAGE.
Steady-state kinetic parameters were determined under the corresponding assay conditions using variable substrate concentrations. The Michaelis-Menten equation was fitted to the data by nonlinear regression, and Vmax values were transformed into kcat based on the molar concentration of the dimeric protein.
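The fitting step described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names, the synthetic data, and the enzyme concentration are hypothetical, and the regression is done with a simple golden-section search over Km (with Vmax solved in closed form for each trial Km), on the assumption that the residual sum of squares has a single minimum in the search range.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def _profile_sse(km, s_vals, v_vals):
    """For a fixed Km, return (SSE, best-fit Vmax); Vmax enters the model linearly."""
    f = [s / (km + s) for s in s_vals]
    vmax = sum(v * fi for v, fi in zip(v_vals, f)) / sum(fi * fi for fi in f)
    sse = sum((v - vmax * fi) ** 2 for v, fi in zip(v_vals, f))
    return sse, vmax


def fit_michaelis_menten(s_vals, v_vals, km_lo=1e-3, km_hi=1e4, iters=100):
    """Nonlinear least-squares fit of (Vmax, Km) by golden-section search on log(Km)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(km_lo), math.log(km_hi)
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        sse_c = _profile_sse(math.exp(c), s_vals, v_vals)[0]
        sse_d = _profile_sse(math.exp(d), s_vals, v_vals)[0]
        if sse_c < sse_d:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp(0.5 * (a + b))
    _, vmax = _profile_sse(km, s_vals, v_vals)
    return vmax, km


def kcat_from_vmax(vmax_uM_per_s, dimer_uM):
    """kcat (s^-1) from Vmax (µM/s) and the molar concentration of dimeric enzyme (µM)."""
    return vmax_uM_per_s / dimer_uM


# Hypothetical, noise-free data generated with Vmax = 2.0 µM/s and Km = 50 µM.
substrate_uM = [5.0, 10.0, 20.0, 50.0, 100.0, 200.0, 500.0]
rates_uM_per_s = [mm_rate(s, 2.0, 50.0) for s in substrate_uM]

vmax_fit, km_fit = fit_michaelis_menten(substrate_uM, rates_uM_per_s)
kcat_fit = kcat_from_vmax(vmax_fit, dimer_uM=0.01)  # assumed 10 nM dimer
print(f"Vmax = {vmax_fit:.3f} µM/s, Km = {km_fit:.2f} µM, kcat = {kcat_fit:.1f} s^-1")
```

With real measurements, a general-purpose nonlinear least-squares routine would normally be used instead, and parameter uncertainties would be reported alongside the estimates.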

Crystallization and diffraction data collection
The protein was further purified for crystallization experiments by size-exclusion chromatography in 20 mM HEPES pH 7.5, 100 mM NaCl, 2 mM tris(2-carboxyethyl)phosphine, and 10% glycerol using a Superdex 200 16/600 column on an ÄKTA Prime FPLC system (GE Healthcare Life Sciences, Uppsala, Sweden) and concentrated to 22 mg·mL⁻¹. The protein was supplemented with 10 mM glutathione (GSH), and 20 µL of this solution was supplemented with 0.4 mg of solid 20-hydroxyecdysone, giving a final 20-hydroxyecdysone concentration of > 10 mM. The crystals were obtained by the vapor diffusion technique at 21°C in a hanging drop containing 2 µL of protein solution and 1 µL of reservoir solution derived from the Morpheus protein crystallization screen [18], containing 0.1 M MOPS/HEPES pH 7.5, 12.5% w/v PEG 1000, 12.5% w/v PEG 3350, and 12.5% v/v MPD. Large cube-shaped crystals were harvested after 24 h and flash-cooled without additional cryoprotection. A complete diffraction data set at 1.3 Å resolution was collected at 100 K at beamline I04 of the Diamond Light Source.
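As a quick check on the ligand concentration quoted above, the nominal concentration reached if 0.4 mg of 20-hydroxyecdysone dissolved completely in 20 µL can be computed from its molecular formula. The sketch below assumes the formula C27H44O7 and standard atomic masses; the helper names are illustrative, and the actual dissolved concentration may be lower because the steroid was added as a solid.

```python
# Standard atomic masses (g/mol), rounded.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumed molecular formula of 20-hydroxyecdysone: C27H44O7.
FORMULA_20E = {"C": 27, "H": 44, "O": 7}


def molar_mass(formula):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def nominal_concentration_mM(mass_mg, volume_uL, molar_mass_g_per_mol):
    """Concentration in mM if mass_mg dissolves completely in volume_uL."""
    moles = mass_mg * 1e-3 / molar_mass_g_per_mol
    litres = volume_uL * 1e-6
    return moles / litres * 1e3


mm_20e = molar_mass(FORMULA_20E)
conc_mM = nominal_concentration_mM(0.4, 20.0, mm_20e)
print(f"M = {mm_20e:.2f} g/mol, nominal concentration = {conc_mM:.1f} mM")
```

The nominal value of roughly 42 mM is consistent with the stated lower bound of 10 mM.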
The diffraction data were processed using xia2 [19], DIALS [20], and Aimless [21] from the CCP4 package [22]. The structure was solved by molecular replacement using the program Molrep [23] and the structure of epsilon-class GSH S-transferase from Musca domestica [24] (PDB ID: 3vwx) as a search model. Refinement was performed using Refmac 5.8.0232 [25] in combination with manual adjustments in Coot [26]. The MolProbity server [27] was used for the evaluation of the final model quality. The data collection and refinement statistics are listed in Table 1. All figures representing structures were created using PyMOL [28]. Atomic coordinates and structure factors were deposited in the PDB under the accession code 6t2t.
Table 1 (surviving entry and footnotes). Residues in Ramachandran allowed regions (%) e: 100. a Rmeas defined in Ref. [29]. b Pearson's correlation coefficient determined on the data set randomly split in half. c … where Fo and Fc are the observed and calculated structure factors, respectively. d Rfree-value is equivalent to the R-value but is calculated for 5% of the reflections chosen at random and omitted from the refinement process [30]. e As determined by MolProbity [27].

Preparation of DmGSTE14
The DmGSTE14 protein was obtained in pure form following IMAC and size-exclusion chromatography, as judged by SDS/PAGE analysis. However, a component corresponding to the dimer required rigorous reducing conditions to merge with the monomeric form of 25 kDa. Gel filtration of the purified enzyme demonstrated that the native DmGSTE14 is dimeric, as expected for a soluble GST. Figure 2 shows the DmGSTE14 structure, including the sequence (238 amino acids with a calculated molecular mass of 27 444 Da, including the C-terminal purification His6-tag). Near the middle of the sequence, C106 is suitably positioned to form an intersubunit disulfide bond, as judged from the crystallographic analysis (see below). A corresponding Cys residue is conserved in related GSTE14 sequences from other insect species in UniProt.
Fig. 2. DmGSTE14 structure, including the sequence. Interface cysteine residues (C106) are shown as sticks, and active-site ligands GSH and MPD are in pink and yellow ball-and-stick representations, respectively. The primary and secondary structure is shown below the cartoon representation with β-sheets indicated by arrows and α-helices as barrels. Important residues discussed further in the text, S14, C106, and D113, are highlighted in the sequence in blue, yellow, and red, respectively.

Crystal structure analysis
We solved the crystal structure of DmGSTE14 in complex with GSH at a resolution of 1.3 Å with one monomer of the enzyme in the asymmetric unit (Table 1). Three N-terminal residues (M-S-Q) and the His6-tag at the C terminus could not be modeled into the electron density due to disorder. The crystal structure of DmGSTE14 represents the typical GST fold with an N-terminal thioredoxin-like α/β-domain and a C-terminal bundle of α-helices. The overall structure of the biological form of the enzyme, a dimer generated by a symmetry operation, is shown in Fig. 2. The large dimer interface (1431 Å²) includes a cysteine residue (C106) that is likely to form a disulfide bond and further stabilize the dimer assembly.
The G-site of the enzyme, formed by residues S14, P15, P16, L38, Q43, F44, H55, S56, V57, P58, D69, S70, H71, C106, and F110, is occupied by GSH. Molecular details of enzyme-GSH hydrogen-bond interactions are shown in Fig. 3. The conserved active-site residue S14, which interacts with the sulfur of GSH through a sulfur-centered hydrogen bond, is located close to the N-terminal end of the α1-helix, as noted for other GSTs belonging to the serine/cysteine type [31].
A hydrophobic cavity (H-site), located adjacent to the G-site, is formed by predominantly hydrophobic residues R13, S14, P15, L38, F39, F110, D113, M117, S118, V121, T172, L208, and M212 and appears suited to accommodate hydrophobic substrates. In our crystal structure, the H-site is occupied by 2-methyl-2,4-pentanediol (MPD) originating from the crystallization mother liquor and a water molecule (Fig. 4). The only charged residue in the H-site is aspartate D113, which forms a water-mediated hydrogen-bonding interaction with MPD. This residue is flexible, and we have modeled it in two different conformations. These structural features suggest that aspartate D113 is likely to be a catalytic residue in the steroid isomerase reaction. During the preparation of this manuscript, a structure of DmGSTE14 in complex with GSH and 17β-estradiol (PDB ID: 6kep, Fig. 1) was deposited in the PDB. The superposition of this steroid-bound structure with our structure (Fig. 4) illustrates that the steroid binds to the same site as MPD. The hydroxyl oxygen O3 of the steroid overlays with the water molecule in our structure and interacts with D113, which in the steroid-bound structure is present in only a single conformation, facing into the H-cavity. This provides further evidence for the key role of D113.
A structure similarity search of the Protein Data Bank using the Dali server [32] identified a number of homologous structures of insect epsilon-class GSTs. The closest homologues are a GST from Drosophila mojavensis (PDB ID: 4hi7, RMSD 1.5 Å for 214 residues), D. melanogaster GST E7 (PDB ID: 4png [14], RMSD 1.5 Å for 214 residues), a GST from M. domestica (PDB ID: 3vwx [24], RMSD 1.5 Å for 212 residues), and D. melanogaster GST E6 (PDB ID: 4yh2 [33] and 4pnf [14], RMSD 1.7 Å for 213 residues). Further homologous GST structures originate from mosquito species, including epsilon-class GSTs from Anopheles gambiae (PDB ID: 2il3, 2imi, 2imk [34], and 4gsn [35]), Anopheles funestus (PDB ID: 3zmk [36]), and Aedes aegypti (PDB ID: 5ft3), with RMSD values ranging between 1.5 and 1.6 Å (for 211-213 aligned residues). The sequence identity between DmGSTE14 and its insect structural homologues is between 30% and 34% [32]. The structure of DmGSTE14 is highly similar to all these homologues, which share the same overall fold. However, DmGSTE14 has a unique long C-terminal tail, which is not present in any of the homologues and stretches in an extended conformation along the enzyme surface away from the active pocket. Moreover, none of the structural homologues possess the cysteine residue in the dimer interface that is likely to form an interchain disulfide bridge stabilizing the dimer assembly. Instead, they all have a conserved serine residue in this position, apart from the GST from M. domestica [24], which has an alanine residue.
Fig. 3. GSH bound in the G-site of DmGSTE14. GSH is shown as pink sticks, and residues involved in direct (black dashed lines) or water-mediated (yellow dashed lines) hydrogen bonds are shown as blue sticks. Residues S107 and R111 from the second GST monomer of the dimer, which participate in GSH binding, are shown as pale blue sticks.
DmGSTE14 also shares structural homology with mammalian GSTs involved in steroid metabolism [5]. High steroid isomerase activity was originally observed with Homo sapiens GST (HsaGST) A3-3 [2]. By contrast, the homologous HsaGST A2-2 has 5000-fold lower activity, which could be explained by different orientations of the steroid substrate Δ⁴-AD in the active site [37]. The overlay of the human enzyme structures with the DmGSTE14 structure shown in Fig. 5 clearly demonstrates that the architecture of their active sites and the steroid-binding modes are completely different, despite the similarity of the overall fold of the enzymes. The RMSD for the superposition of DmGSTE14 with both human GST A2-2 and A3-3 is 3.2 Å (192 aligned residues, sequence identity 21%). The structures of the C-terminal portions are quite different. In both HsaGST A2-2 and HsaGST A3-3, the C terminus forms an α-helix which folds over and shields the H-site of the active pocket, while in DmGSTE14 a substantially longer C-terminal tail does not form a helix and is oriented away from the active site. In the human GSTs, GSH is bound further away from the hydrophobic cavity, providing more space for steroid binding. In DmGSTE14, GSH is bound in close proximity to the hydrophobic pocket occupied by MPD. The mode of steroid binding to DmGSTE14, with the steroid plane being roughly parallel to the GSH molecule, is therefore quite distinct from the binding mode of Δ⁴-AD to both human GSTs (Fig. 5).

Enzymatic activities
To examine the general enzymatic competence, a number of well-known GST substrates, including 1-chloro-2,4-dinitrobenzene (CDNB), five isothiocyanates (ITCs), and nonenal, were tested. Table 2 shows that CDNB and ITCs give low, but clearly demonstrable, activities with DmGSTE14. ITCs occur abundantly in the form of glucosinolates in cruciferous plants. By contrast, nonenal, which is related to products of lipid peroxidation, did not show any detectable activity. Steady-state kinetic parameters for selected substrates are shown in Table 3. These studies demonstrate that DmGSTE14 indeed is catalytically competent, even though activities with the naturally occurring ITC substrates are significantly lower than the activities of the homologous GSTE6 and GSTE7 [38]. In enzymes catalyzing several alternative reactions, it is often observed that acquisition of high activity with a particular substrate is accompanied by decreased efficiency with alternative substrates [39]. It can therefore be surmised that DmGSTE14 should be more active with other substrates. In view of the importance of DmGSTE14 for ecdysteroid metabolism, a surrogate for a so far unknown reaction with steroids in insects was tested. The most striking example of a GST-catalyzed reaction involving a steroid is the positional double-bond isomerization of the ketosteroids Δ⁵-androstene-3,17-dione (Δ⁵-AD) and Δ⁵-pregnene-3,20-dione (Δ⁵-PD), precursors to the hormones testosterone and progesterone [3]. These steroids are efficiently isomerized by some mammalian alpha-class GSTs, particularly by the horse and human enzymes Equus caballus GST (EcaGST) A3-3 [5] and HsaGST A3-3 [2]. We therefore tested these steroids as substrates for DmGSTE14, even though there is no evidence that these reactions occur in insects. Table 4 shows the steroid double-bond isomerase activity of DmGSTE14 assayed with Δ⁵-AD and Δ⁵-PD, and Tables 4 and 5 together show that both are suitable substrates for the enzyme. DmGSTE14 is less active than the most efficient mammalian enzymes, but nevertheless superior to many of the mammalian GSTs.
Fig. 5. Comparison of DmGSTE14 with human GST A2-2 and A3-3. DmGSTE14 (blue cartoon) in complex with GSH (pink sticks) and MPD (yellow sticks) is superposed with human GST A3-3 (light gray cartoon, PDB ID: 2vcv [37]) in complex with GSH and Δ⁴-AD (dark gray sticks), and human GST A2-2 (brown cartoon, PDB ID: 2vct [37]) in complex with Δ⁴-AD (brown sticks), using the N-terminal domains. 17β-estradiol from the DmGSTE14 structure (PDB ID: 6kep) is shown as green sticks.
Table 2. Specific activities of DmGSTE14 in the glutathionylation of conventional GST substrates. Measurements were made in triplicate at pH 6.5 using 100 µM electrophilic substrate and 1 mM GSH. Columns: Substrate; Specific activity (µmol·min⁻¹·mg⁻¹).
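Specific activities of the kind reported in Table 2 (µmol·min⁻¹·mg⁻¹) can be related to an apparent turnover rate per enzyme dimer through the protein's molar mass. The sketch below is illustrative only: it uses the subunit mass of 27 444 Da stated in the text, assumes the dimer mass is simply twice that, and takes a hypothetical specific activity of 1.0 µmol·min⁻¹·mg⁻¹ as input. Because the assays used a single substrate concentration (100 µM), the result is an apparent rate under those conditions, not kcat.

```python
SUBUNIT_MASS_DA = 27444.0                # calculated subunit mass, including the His6-tag
DIMER_MASS_DA = 2.0 * SUBUNIT_MASS_DA    # assumption: dimer = two identical subunits


def apparent_turnover_per_s(specific_activity_umol_min_mg, molar_mass_da):
    """Convert µmol·min^-1·mg^-1 to s^-1 per enzyme molecule of the given molar mass."""
    mg_per_umol_enzyme = molar_mass_da / 1000.0  # 1 µmol of an M-Da protein weighs M/1000 mg
    return specific_activity_umol_min_mg * mg_per_umol_enzyme / 60.0


example_specific_activity = 1.0  # hypothetical value, not taken from Table 2
rate_per_dimer = apparent_turnover_per_s(example_specific_activity, DIMER_MASS_DA)
print(f"{rate_per_dimer:.3f} s^-1 per dimer")
```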
EcaGST A3-3 is approximately 50 times more catalytically efficient than DmGSTE14 (Table 5). It should be noted that among all known enzymes the highest catalytic efficiencies (kcat/Km) are approximately 10⁸ s⁻¹·M⁻¹ and that the isomerase efficiencies of (0.136-0.16) × 10⁸ s⁻¹·M⁻¹ for GST A3-3 are just an order of magnitude below this value (Table 5). By inference, the 50-fold lower isomerase efficiency of approximately 0.3 × 10⁶ s⁻¹·M⁻¹ for DmGSTE14 can still be regarded as high, ranking near the geometric mean of representative kcat/Km values of typical enzymes [40].
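The arithmetic behind this comparison can be reproduced in a few lines. The sketch below uses only the figures quoted in the paragraph above (the approximate upper limit of 10⁸ s⁻¹·M⁻¹, the GST A3-3 range, and the 50-fold factor); the variable names are illustrative.

```python
import math

UPPER_LIMIT = 1e8                    # approximate highest known kcat/Km, s^-1 M^-1
A3_3_EFFICIENCY = (0.136e8, 0.16e8)  # GST A3-3 isomerase efficiencies, s^-1 M^-1
FOLD_LOWER = 50.0                    # DmGSTE14 relative to the most efficient mammalian GSTs

# Efficiency range implied for DmGSTE14 by the 50-fold factor.
dm_efficiency = tuple(value / FOLD_LOWER for value in A3_3_EFFICIENCY)


def orders_below_limit(efficiency):
    """Number of orders of magnitude by which an efficiency falls below the upper limit."""
    return math.log10(UPPER_LIMIT / efficiency)


for label, values in (("GST A3-3", A3_3_EFFICIENCY), ("DmGSTE14", dm_efficiency)):
    low, high = values
    print(f"{label}: {low:.3g} to {high:.3g} s^-1 M^-1, "
          f"{orders_below_limit(high):.1f} to {orders_below_limit(low):.1f} orders below the limit")
```

The implied DmGSTE14 range of about 2.7 × 10⁵ to 3.2 × 10⁵ s⁻¹·M⁻¹ agrees with the quoted value of approximately 0.3 × 10⁶ s⁻¹·M⁻¹.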
The reaction mechanism of HsaGST A3-3 catalyzing the isomerization of Δ⁵-AD into Δ⁴-AD has been investigated by computational methods at atomic resolution [6]. GSH in the active site acts as a base, via its thiolate group, and polarizes the 3-keto group of Δ⁵-AD via a hydrogen bond to the amide nitrogen of its glycine residue. Y9 of the G-site assists in the transfer of a proton from C4 in Δ⁵-AD, via GSH, to C6 in the steroid nucleus. By contrast, in DmGSTE14 the most likely catalytic residue is D113, which in a hydrophobic environment could be a functional base. In the bacterial ketosteroid isomerase, which catalyzes the same reaction, an aspartic acid and a tyrosine residue form an oxyanion hole accommodating the 3-keto group of Δ⁵-AD, and a second aspartate serves as the base abstracting a proton from C4 [41].
Despite the high concentration of 20-hydroxyecdysone in the crystallization experiments, we were not able to obtain a DmGSTE14 structure in complex with this steroid. Cocrystallization attempts with the enzyme and 10 mM Δ⁴-androstene-3,17-dione (Δ⁴-AD) also did not yield a steroid-bound structure. It is not obvious how Δ⁵-AD or Δ⁵-PD binds to DmGSTE14, but judging from the position of 17β-estradiol in the recently deposited structure (PDB ID: 6kep), D113 may interact with the 3-keto group of Δ⁵-AD. We believe that some other steroid compound in the earlier steps of the ecdysone biosynthesis pathway, rather than 20-hydroxyecdysone, might be the true physiological ligand of DmGSTE14. The lack of polar residues in the H-site of the enzyme likely prevents the binding of 20-hydroxyecdysone, which contains a higher number of polar modifications on the sterane core. The more hydrophobic precursor cholesterol or its products in the early steps of the ecdysone biosynthetic pathway are more likely to be accommodated by the H-site of DmGSTE14. Even though we have demonstrated that GSTE14 from D. melanogaster is catalytically competent and active with a steroid substrate, the nature of a possible substrate in the biosynthetic pathway to ecdysteroids remains obscure. Some GSTs are known as binding proteins (ligandins) [42], and therefore, in addition to catalysis, the function of an intracellular steroid carrier may also be considered a potential role of this enzyme.