Multiple binding modes of a small molecule to human Keap1 revealed by X-ray crystallography and molecular dynamics simulation

Introduction
Living organisms have specific defense systems against various environmental stresses. Among these stresses, oxidative stress has received constant attention in medicine because it is related to many pathologies, including cancer [1,2], cardiovascular disease [3,4], diabetes [5,6], neurodegenerative disease [7,8], chronic arthritis [9,10] and aging [11,12]. Understanding the antioxidant response system is therefore important for developing medical treatments for these pathologies [13]. The antioxidant response is accomplished by the sensing of oxidants and the subsequent activation of antioxidant genes [14]. Oxidants such as reactive oxygen species and electrophilic xenobiotics are sensed by the Kelch-like ECH-associated protein 1 (Keap1) [15], whereas the nuclear factor erythroid 2-related factor 2 (Nrf2) protein [16] is responsible for the transcriptional regulation of about 200 antioxidant proteins, including many enzymes and transporters for drug metabolism and Nrf2 itself [17]. Keap1 is susceptible to posttranslational modification by oxidants at certain reactive cysteine residues, which allows Keap1 to sense them [18,19]. Keap1 assembles with other proteins into the Cullin 3-based E3 ubiquitin ligase complex that binds and ubiquitinates Nrf2. After ubiquitination, Nrf2 is rapidly degraded by the proteasome, which keeps the intracellular Nrf2 level low, and therefore the transcription of antioxidant genes is suppressed in quiescent cells [19,20]. The oxidative-stress-induced modification of Keap1 at specific cysteine residues inhibits Nrf2 ubiquitination, thereby elevating the intracellular Nrf2 level [19]. As a result, intact Nrf2 activates the transcription of antioxidant genes by forming a complex with the small Maf protein and subsequently binding to a cis-acting DNA sequence, termed the antioxidant response element, located in the promoter regions of target genes [16].
Keap1 mainly consists of the N-terminal domain, the C-terminal Kelch domain and the intervening region located between the two domains. A single-particle electron microscopy analysis confirmed that Keap1 in solution forms a homodimer assembled through its two N-terminal domains [21]. The N-terminal domain also mediates interactions with other components of the Cullin 3-based E3 ubiquitin ligase complex [22], and contains a cysteine residue, Cys151, that senses oxidative stress [19]. The other sensing cysteine residues, Cys273 and Cys288, are present in the intervening region [18]. The Kelch domain is responsible for the interaction with Nrf2. The ETGE motif located in the N-terminal region of Nrf2 was the first Keap1-binding site to be found [23]. The DLG motif in the N-terminal region was identified later as the second binding site required for the ubiquitination and degradation of Nrf2 [24]. Biophysical analyses using nuclear magnetic resonance and isothermal titration calorimetry revealed that the DLG and ETGE motifs interact independently with the Kelch domain, with different dissociation constants of 0.5 μM and 8 nM, respectively [25]. Based on molecular biology analyses, McMahon et al. proposed a two-site interaction model, the so-called "tethering" mechanism, in which the two Kelch domains of the Keap1 dimer recognize a single molecule of Nrf2 to facilitate the Cullin-mediated ubiquitination of Nrf2 [26].
These studies imply that a small molecule capable of binding to the Nrf2 interaction site on the Keap1 Kelch domain could be a useful medicine that activates the cellular defense against oxidative stress by inhibiting the ubiquitination of Nrf2. Although several candidate compounds have been reported, none has been put to practical use [27-29]. Precise structural information, for instance from high-resolution X-ray crystallography, is indispensable for structure-based drug design. To date, several crystal structures have been reported: the Kelch domain [30], and the Kelch domain in complex with an Nrf2 peptide containing the ETGE motif [31,32] or the DLG motif [33,34]. However, structural information on Keap1 complexes with small-molecule ligands is still limited; only two complex structures have been reported recently [35,36]. Structural comparison between multiple crystal structures of Keap1 complexes with different small-molecule ligands would be useful for the more effective design of Keap1 inhibitors [28]. Here we determined crystal structures of the Kelch domain of human Keap1 in complex with a small ligand, referred to as Ligand1, that has a novel chemical scaffold. Interestingly, two different binding modes were observed in the Keap1-Ligand1 complex crystals.

Screening and characterization of Ligand1
To search for candidate compounds, an in silico screening was performed on the crystal structure of the Kelch domain of human Keap1 in complex with the ETGE peptide of Nrf2 (PDB entry 2flu) [32]. In-house and commercially available compounds were docked against the Nrf2 peptide binding site. Based on the docking score and the predicted affinity for Keap1, 65 compounds were selected. The binding ability of these compounds was then evaluated using a surface plasmon resonance-based solution assay at a compound concentration of 50 μM. We found 27 active compounds, including LigandX (Fig. 1).
Based on the structural comparison between the docking pose of LigandX and the ETGE peptide of Nrf2 in 2flu, we designed and synthesized Ligand1, in which the phenol moiety of LigandX was modified with an oxyacetic acid group intended to mimic the sidechain of the first (N-terminal) glutamic acid residue in the ETGE peptide (see Section 4 for the synthesis of Ligand1). The association constant of Ligand1 for the human Keap1 Kelch domain was estimated to be on the order of 10³ to 10⁴ M⁻¹ from the equilibrium affinity analysis of a surface plasmon resonance-based solution experiment (Fig. 2). Unfortunately, the limited solubility of Ligand1 (less than 1 mM) hampered further analyses of its binding properties, including the precise association constant, the number of binding sites, and the cooperativity. The Keap1-Ligand1 interaction was also confirmed by another assay using AlphaScreen (PerkinElmer Inc.), a bead-based, amplified luminescent proximity homogeneous assay, which revealed the competitive effect of Ligand1 on the interaction between the Nrf2 peptide and the Kelch domain of Keap1 (Supplemental Fig. S1).

Quality of crystal structures
Two crystal structures of the human Keap1 Kelch domain in complex with Ligand1, the soaking form and the cocrystallization form, were determined at 2.1 Å resolution (Table 1). In both forms, the asymmetric unit contained one Kelch domain and one Ligand1 molecule. The final models of the Kelch domain covered amino-acid residues 322-609 with well-defined electron densities, while the N-terminal 21 residues (20 His-tag residues and Ala321) were not included because of structural disorder, as observed in the previously reported crystal structure of the same Kelch domain (PDB entry 1u6d) [30]. The average temperature factor (B) values calculated from the final models were comparable to the Wilson B values from the corresponding diffraction data. The soaking form crystal was isomorphous to 1u6d with the space group P6₅22, whereas the cocrystallization form revealed a different crystal packing with the space group P2₁2₁2₁. The stereochemistry analysis revealed no residues in the disallowed regions of the Ramachandran plot, and only Arg336 and His516 in the generously allowed region. These two residues were found in well-defined electron densities without steric clashes. All atoms of Ligand1 were identified in the electron density maps with reasonable individual B values comparable to those of neighboring protein atoms, indicating high occupancy of the ligand (Table 1). We therefore fixed the occupancies of all Ligand1 atoms at 1.0 and did not refine them. In addition, annealed 2Fo − Fc omit maps for Ligand1 in both forms provided clear electron densities comparable to those of the corresponding final 2Fo − Fc maps (Supplemental Fig. S2), confirming the presence of the bound ligands.

Recognition of Ligand1
The overall architecture of the human Keap1 Kelch domain in our structures is the same as that in the PDB entry 1u6d, which represents a β-propeller structure composed of a sixfold repeat of an all-β domain called a "blade" [30]. The Rmsd values from the superposition of Cα atoms (residues 325-609) between 1u6d and our structures are 0.241 Å for the soaking form and 0.431 Å for the cocrystallization form. The latter value is similar to that from the Cα superposition between the soaking and the cocrystallization forms: 0.449 Å. Reflecting the crystal packing difference, the N-terminal three residues adopt totally different main-chain conformations in the cocrystallization form compared with the other forms, and these residues were therefore excluded from the superposition. Elsewhere, the cocrystallization form is similar to the others in terms of the main-chain structure. In contrast, 1u6d and the soaking form have essentially the same main-chain structure, including the N-terminal region.
When the soaking and the cocrystallization forms are compared, the Ligand1 molecule is found on the same side of the 6-bladed β-propeller, opposite to the N- and C-termini. However, the precise locations of the two binding sites are distinct (Fig. 3B and C). In the soaking form, the Ligand1 molecule is located approximately on the central hole of Keap1. In the cocrystallization form, on the other hand, the binding site of Ligand1 is shifted by about 10 Å toward the first blade. The Ligand1 molecule is recognized by ten residues in the soaking form and seven residues in the cocrystallization form (Table 2). These residues are well conserved in the Keap1 members from various organisms. Interestingly, some of these residues overlap, indicating that the same ligand is recognized in multiple modes (Fig. 3A). The residues used in both forms are Tyr334, Ser363, Arg380, Asn382 and Asn414. The form-specific recognition residues are Arg415, Arg483, Ser508, Ala556 and Gly603 for the soaking form, and Arg336 and Ser602 for the cocrystallization form. The interaction types used are: 1 hydrogen bond, 1 water-mediated hydrogen bond, 3 electrostatic interactions and 19 nonpolar interactions for the soaking form; 5 hydrogen bonds, 3 water-mediated hydrogen bonds and 11 nonpolar interactions for the cocrystallization form. Thus, in terms of the interaction type, nonpolar interactions dominate in the soaking form and hydrogen bonds dominate in the cocrystallization form.
Table 1 footnotes: … is the ith observation of reflection hkl and ⟨I(hkl)⟩ is the weighted average intensity for all observations i of reflection hkl. $R_{\mathrm{cryst}} = \sum_{hkl} \bigl|\,|F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|\,\bigr| / \sum_{hkl} |F_{\mathrm{obs}}|$, where $|F_{\mathrm{obs}}|$ and $|F_{\mathrm{calc}}|$ are the observed and calculated structure-factor amplitudes, respectively. R free was calculated with 5% of the reflections chosen at random and omitted from refinement. Values in parentheses are for the outermost shell.
A detailed comparison between 1u6d and the soaking form reveals a ligand-induced rearrangement of side-chain atoms at several ligand recognition residues: Arg380, Asn382, Arg415, Arg483 and Ser508. A comparison between 1u6d and the cocrystallization form reveals another ligand-induced rearrangement at a few ligand recognition residues: the side chains of Arg336 and Arg380, and the main and side chains of Asn382. In addition, probably reflecting the crystal packing, several peripheral loops adopt slightly different main-/side-chain conformations. Of these peripheral loops, the β-hairpin protrusion of the second blade (residues 380-389), which contains the two ligand recognition residues Arg380 and Asn382, shows a limited conformational change in the main-/side-chain structure, probably reflecting both the ligand binding and the crystal packing difference. Therefore, this β-hairpin in the second blade may be important for ligand recognition. Among the six β-hairpin protrusions of Keap1, only that of the first blade (residues 334-338), which contains the two ligand recognition residues Tyr334 and Arg336, adopts the relatively rare class 1 conformation [37]; the other five β-hairpins adopt the class 2 conformation. Interestingly, the six β-hairpins of Keap1 are accompanied by similar uncommon β-bulges including the invariant glycine doublet, which cannot be classified by the current criteria [38] for β-bulges. In the soaking form, the ligand recognition residue Arg483 is located on the β-hairpin in the fourth blade (residues 478-483). Because they harbor ligand-binding residues, the β-hairpins in the first and the fourth blades may also be important for ligand recognition, although they do not show large conformational differences in the main-chain structure upon ligand binding.

Structural comparison between Ligand1 and Nrf2
To understand the binding ability of Ligand1 to Keap1, the two present Keap1-Ligand1 complexes were compared with other reported Keap1 complex structures that contain peptides from the physiological ligand, Nrf2. The clearest result was obtained from the Keap1 complex structures with the ETGE peptide of Nrf2: 1x2r [31], 2flu [32] and 3zgc [39]. When the soaking and the cocrystallization forms are superimposed onto 1x2r at corresponding Cα atoms, the carboxylate group of Ligand1 overlaps with the first and the second glutamate of the ETGE motif, respectively (Fig. 4A and B). The other ETGE complexes, 2flu and 3zgc, provided essentially the same results. In another reported Keap1 complex structure, 3wn7 [34], which contains the DLG peptide of Nrf2, the binding mode of the DLG peptide was dissimilar to that of Ligand1. Accordingly, the two binding modes of Ligand1 in the superposed soaking and cocrystallization forms together mimic the recognition mode of the ETGE peptide by Keap1 (Fig. 4C). This result is consistent with the solution experiment using AlphaScreen (PerkinElmer Inc.), in which Ligand1 inhibits the Keap1-ETGE binding (Supplemental Fig. S1). Other peptides, from p62 [40] or prothymosin α [41], with essentially the same binding mode as that of Keap1-ETGE may also compete with Ligand1. In addition, the overlap of the binding sites for the ETGE and DLG peptides suggests that Ligand1 can partly inhibit the Keap1-DLG binding. This observation is similar to that in the first report on Keap1 complexes with small molecules by Marcotte et al. (PDB entries 4iqk and 4in4) [35], in which the sulfone group of the small molecules was located near the acidic sidechains of the ETGE peptide when the Keap1-peptide complex was superimposed. However, the degree of mimicry seems to be higher in our structures, where the recognition mode for the carboxylate group of Ligand1 by the Keap1 residues corresponds exactly to that for the glutamate sidechains of the ETGE peptide. In the other reported Keap1 complexes with small molecules by Jnoff et al. (PDB entries 4n1b, 4l7c and 4l7d) [36], the binding mode of the small molecule is dissimilar to that in our structures.
Table 2 footnote: Interactions are classified in three types: HB as a hydrogen bond with a distance not greater than 3.4 Å (angle considered); NP as a nonpolar interaction with a distance not greater than 3.4 Å; ES as an electrostatic interaction with a distance not greater than 4.0 Å.

Molecular dynamics simulation
In the present structures, the Ligand1 molecule bound to Keap1 mediates the crystal packing (Fig. 5). The numbers of contacts with interatomic distances not greater than 3.4 Å from the Ligand1 molecule to the Keap1 molecule in the asymmetric unit and to the symmetry-related Keap1 molecules are 20 and 5 for the soaking form, and 16 and 7 for the cocrystallization form, respectively. This indicates a substantial contribution of the crystal packing to the protein-ligand interactions in the crystals. Crystal packing is known as a factor that can influence structure-based drug design. For instance, a crystal contact at the active site can hamper ligand binding when the soaking method is used to prepare the complex crystals [42]. One solution to that situation is crystal-contact engineering, in which the inappropriate crystal packing is disrupted by introducing mutations at the crystal contact to obtain a new crystal form suitable for ligand soaking experiments [39]. However, the influence of crystal packing on the ligand binding mode when the packing is mediated by the ligand has not been fully investigated to date. Thus, to examine the influence of the crystal packing, we employed a molecular dynamics (MD) simulation in aqueous conditions. Such MD simulations have been used, for instance, to rule out a false ligand that was ejected from the binding pocket of the target protein within 50-100 ps of simulation [43]. In another example, an MD simulation for 20 ns was carried out to analyze the stability of a protein-ligand complex [44].
For that reason, we performed a 20 ns MD simulation of the present Keap1-Ligand1 complexes. The time course of protein-ligand contacts with interatomic distances not greater than 3.4 Å revealed that the contact number tends to decrease (Fig. 6). In the MD structures from the soaking form, the contact number reached equilibrium with an average of about ten. However, in those from the cocrystallization form, several periods with no contacts, indicating dissociation of the ligand, were observed in the later part of the MD simulation. Judging from the time course of the Rmsd from the superposition of backbone atoms between the MD structures and the original crystal structure, the MD simulation reached a metastable state after 10 ns (Supplemental Fig. S3). Thus, to classify the binding modes, 500 structures from 10 to 20 ns were submitted to a cluster analysis based on the method described by Daura et al. [45]. In this analysis, the structure with the highest number of neighbors is defined as the center of a cluster. As a result, the MD structures in a trajectory could be classified into a few binding modes that were related to but distinct from those observed in the crystal structures (Fig. 7 and Table 3).
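The contact time course described above can be sketched in a few lines of Python. This is only an illustration of counting atom pairs within the 3.4 Å cutoff; it is not the CONTACT program used in this work, and the function names and the toy coordinates are hypothetical.

```python
import numpy as np

def count_contacts(protein_xyz, ligand_xyz, cutoff=3.4):
    """Count protein-ligand atom pairs with a distance not greater than cutoff (Å)."""
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    # Pairwise difference vectors, shape (n_protein, n_ligand, 3).
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return int(np.count_nonzero(dist <= cutoff))

def contact_time_course(frames, cutoff=3.4):
    """Return one contact count per frame.

    frames is an iterable of (protein_xyz, ligand_xyz) pairs, one per
    recorded MD structure (hypothetical input layout).
    """
    return [count_contacts(p, l, cutoff) for p, l in frames]

# Toy coordinates (hypothetical, not taken from the structures in this work).
toy_protein = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
toy_ligand = np.array([[3.0, 0.0, 0.0], [0.0, 3.3, 0.0], [10.0, 10.0, 10.0]])
toy_frames = [
    (toy_protein, toy_ligand),                      # ligand bound
    (toy_protein, toy_ligand + [20.0, 0.0, 0.0]),   # ligand displaced
]
toy_counts = contact_time_course(toy_frames)
```

A frame whose count drops to zero corresponds to what the text describes as a period with no contacts.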
In the MD structures from the soaking form, two clusters each containing over 10% of the 500 structures were obtained: the major cluster containing 204 structures with the 892nd as the center and the minor cluster containing 70 structures with the 774th as the center. The crystal structure does not belong to the major or the minor cluster. In both clusters, the guanidinium groups of Arg415 and Arg483 interact with the ureido oxygen OAB and the carboxyl oxygens OAA/OAC of Ligand1, respectively. Thus, in the MD structures from the soaking form, Arg415 and Arg483 dominate the protein-ligand interactions. The difference between the two clusters is that Phe478 is used for ligand recognition in the major cluster whereas Gly509 is used in the minor cluster (Table 3). On the other hand, in the MD structures from the cocrystallization form, three clusters each containing over 10% of the 500 structures were obtained: the major cluster containing 180 structures with the 658th as the center, the second cluster containing 79 structures with the 842nd as the center, and the third cluster containing 63 structures with the 597th as the center. Again, the crystal structure does not belong to any of these clusters. In the major, the second, and the third clusters, the Ligand1 molecule is located between the β-hairpins of the first and the second blades, near the β-hairpin of the first blade, and near the β-hairpin of the second blade, respectively (Fig. 7B and Supplemental Fig. S4). The second cluster showed the lowest average contact number, 8.34 per structure, indicating the weakest protein-ligand interactions among these three clusters (Table 3). A common feature of these clusters is that the guanidinium group of Arg380 interacts with the carboxyl oxygens OAA/OAC of Ligand1, although Arg336 dominates the ligand recognition in the second cluster. Notably, the structural diversity of Ligand1 is higher in the cocrystallization form than in the soaking form.
Fig. 7 caption (partially recovered): … and a stick model, respectively. Important residues for the ligand recognition are shown as stick models and labeled. The β-hairpins in the first, the second and the fourth blades are indicated as Roman numerals. In the 20.014 ns MD trajectories, only the center structure of each cluster is shown: the 892nd at 17.834 ns from the major cluster (blue) and the 774th at 15.474 ns from the minor cluster (orange) for the soaking form; the 658th at 13.154 ns from the major cluster (blue), the 842nd at 16.834 ns from the second cluster (orange) and the 597th at 11.934 ns from the third cluster (gray) for the cocrystallization form. These models are superimposed on the crystal structures (green) at corresponding backbone atoms.
The results of the MD simulation suggest that the binding modes observed in the crystals are atypical in the solution state, indicating that an MD simulation is required for structure-based drug design when the ligand of interest mediates the crystal packing. Importantly, however, a few ligand recognition residues, namely Arg380, Arg415 and Arg483, are used in both the crystal and the solution states. The binding modes observed in the MD structures from the soaking form may account for the moderate association constant of Ligand1 in solution, because the ligand binding was retained over 20 ns in the simulation. On the other hand, the binding modes observed in the MD structures from the cocrystallization form may be less stable. Presumably, in the cocrystallization form, an atypical binding mode compatible with the crystal packing was selected from solution when the crystal nucleation occurred. Such selection of a minor and weak binding mode could also occur for other protein-ligand complexes with higher affinity.

Conclusions
To elucidate how Keap1 recognizes a small-molecule ligand, we determined the crystal structure of the Keap1-Ligand1 complex in two different forms. Because the two binding modes of Ligand1 mimic that of the physiological ligand, the Nrf2 peptide, in different manners, the present structural information together with the MD simulation will be a useful basis for pharmaceutical drug development. For instance, fragment-based drug discovery [46,47] based on the present results may contribute to the design of more potent inhibitors of Keap1, although the pharmacological efficacy of Ligand1 needs to be examined elsewhere in the future. At the same time, this work provides a lesson about the crystal packing effect, which should be considered in the interpretation of protein-ligand complex structures. The MD simulation may be a good tool to investigate the crystal packing effect.
Table 3 footnote: For selected structures in the latter half of the 20.014 ns MD trajectory, from 10.014 ns to 19.994 ns, comprising 500 structures from the 501st to the 1000th, the protein-ligand contacts with interatomic distances not greater than 3.4 Å were counted and listed after a descending sort by the number of contacts. Only major contacts with appearance frequency values not less than 20% are shown. For the atomic superposition, pairs of protein-ligand atoms whose shortest interatomic distance in the 500 structures from the 501st to the 1000th was not greater than 3.4 Å were selected, except for atoms that may flip in the MD simulation: Cδ1, Cδ2, Cε1 and Cε2 of tyrosine/phenylalanine; OAA and OAC of Ligand1.

Screening of compounds
The docking program FRED ver2.2 (OpenEye Scientific Software, Inc.) and the crystal structure of the Keap1 Kelch domain in complex with the Nrf2 peptide (PDB entry 2flu) were used in our in silico screening and the docking pose analysis of LigandX. The compounds in the ZINC database ver8 [48] and those in our in-house library were used for the in silico screening. The 3D coordinates of the library compounds were prepared by the program LigPrep (Schrödinger, LLC) and the 3D conformers for the docking were generated by the program OMEGA 2.3 (OpenEye Scientific Software, Inc.). The MASC consensus score in FRED was used for the compound selection.

Surface plasmon resonance-based equilibrium affinity analysis
To evaluate the Keap1-Ligand1 interaction, a surface plasmon resonance-based equilibrium affinity analysis was performed using a Biacore S51 (Biacore AB/GE Healthcare). The protein sample used was the GST fusion of the human Keap1 Kelch domain. The Keap1 sample was immobilized on the Series S Sensor Chip CM5 (GE Healthcare) using the amine coupling kit (GE Healthcare). The experiment was performed at 298 K in a physiological solution [150 mM NaCl, 3.4 mM EDTA, 1% dimethylsulfoxide, 0.005% polysorbate 20, 10 mM HEPES buffer pH 7.4]. Data analysis was performed using the S51Evaluation program (GE Healthcare). The number of experiments was 2 for the zero concentration of Ligand1 and 1 for the other concentrations. The resonance unit was calculated by subtracting the response on the reference sensor spot with immobilized GST from that on the sample sensor spot, and by applying a bulk correction to remove the difference in the refractive index of the solvents [49]. The maximum value of the resonance unit was estimated as 24 in the case of a single binding site, based on the positive control experiment using the ETGE peptide of Nrf2. The steady-state evaluation in S51Evaluation failed to calculate a precise affinity but yielded a range for the K d value of more than 50 μM. A manual analysis using the double reciprocal plot of the experimental data yielded a rough estimate of the association constant on the order of 10³ to 10⁴ M⁻¹, depending on the assumed number of binding sites.
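The double reciprocal analysis mentioned above can be illustrated with a minimal sketch. It assumes a single-site model, R = Rmax·C/(Kd + C), which gives the straight line 1/R = 1/Rmax + (Kd/Rmax)·(1/C). The function name and the synthetic data are hypothetical and are not the measured responses; the single-site assumption may not hold for Ligand1, whose number of binding sites could not be determined.

```python
import numpy as np

def double_reciprocal_fit(conc_molar, response_ru):
    """Fit 1/R against 1/C with a straight line (single-site assumption).

    Returns (k_d in M, k_a in 1/M, r_max in RU).
    """
    x = 1.0 / np.asarray(conc_molar, dtype=float)
    y = 1.0 / np.asarray(response_ru, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # y = slope * x + intercept
    r_max = 1.0 / intercept                 # intercept = 1 / Rmax
    k_d = slope / intercept                 # slope = Kd / Rmax
    return k_d, 1.0 / k_d, r_max

# Synthetic, noise-free data (hypothetical): Kd = 200 μM, Rmax = 24 RU.
true_kd, true_rmax = 200e-6, 24.0
conc = np.array([25e-6, 50e-6, 100e-6, 200e-6, 400e-6])
resp = true_rmax * conc / (true_kd + conc)

k_d, k_a, r_max = double_reciprocal_fit(conc, resp)
```

With the synthetic values chosen here, the association constant is 5 × 10³ M⁻¹, which falls in the range quoted in the text. When Rmax is fixed from a positive control instead of being fitted, Kd follows from the slope alone as slope × Rmax.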

Synthesis of ethyl 2-(3-cyanophenoxy)acetate
To a solution of 3-hydroxybenzonitrile (1.429 g, 12 mmol) in acetone (120 ml), calcium carbonate (2.156 g, 15.8 mmol) and ethyl 2-bromoacetate (2.81 g, 18.8 mmol) were added. The mixture was stirred at room temperature for 11 h. The reaction mixture was filtered and the filtrate was evaporated under reduced pressure. The residue was purified by flash chromatography on silica gel to afford the product (2.46 g, 99%) as a colorless oil.

Synthesis of 2-(3-cyanophenoxy)acetic acid
To a solution of ethyl 2-(3-cyanophenoxy)acetate (616 mg, 3 mmol) in THF (12 ml), distilled water (6 ml) and 4 M lithium hydroxide (3.75 ml) were added. The mixture was stirred at room temperature for 1 h. The mixture was acidified with 1 N hydrochloric acid to pH < 1 and extracted with dichloromethane (20 ml). The organic phase was washed with brine (20 ml), dried over magnesium sulfate, and evaporated in vacuo to give the desired product (531 mg, 99%) as a white solid.
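As a quick arithmetic check of the near-quantitative yield, the following sketch computes the percent yield from the isolated mass and the amount of starting ester. The molecular formula C9H7NO3 for 2-(3-cyanophenoxy)acetic acid and the standard atomic weights are supplied here as assumptions; they are not stated in the text, and the helper names are hypothetical.

```python
# Standard atomic weights (g/mol), rounded; supplied here as an assumption.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(formula_counts):
    """Molar mass in g/mol from a dict of element counts."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in formula_counts.items())

def percent_yield(product_mg, limiting_mmol, product_formula):
    """Percent yield, assuming 1:1 stoichiometry with the limiting reagent."""
    theoretical_mg = limiting_mmol * molar_mass(product_formula)  # mmol * g/mol = mg
    return 100.0 * product_mg / theoretical_mg

# 2-(3-cyanophenoxy)acetic acid, assumed formula C9H7NO3.
ACID = {"C": 9, "H": 7, "N": 1, "O": 3}

acid_yield = percent_yield(531.0, 3.0, ACID)
```

The result is close to 100%, in line with the essentially quantitative conversion reported for this hydrolysis step.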

Expression of Keap1
We expressed the Kelch domain of human Keap1 (residues 321-609) in an Escherichia coli system using a protocol modified from the original method described by Li et al. [50]. A 20-residue tag containing a hexahistidine sequence was added to the N-terminus. The plasmid encoding Keap1 was digested with NdeI and BamHI, and the fragment was inserted into the expression vector pET15b (Novagen) linearized with NdeI and BamHI. Luria-Bertani medium containing carbenicillin was inoculated with a single colony of E. coli BL21(DE3) (Novagen) carrying the recombinant plasmid and grown at 310 K with vigorous shaking. Plusgrow medium (Nacalai Tesque) containing carbenicillin was inoculated with the resulting Luria-Bertani culture and grown at 310 K to an optical density of 0.8 at 600 nm. Keap1 expression was then induced by the addition of IPTG to a final concentration of 0.4 mM at 293 K. The cells were harvested by centrifugation at 5000 × g for 10 min and stored at 193 K.

Purification
The cell pellets were treated with BugBuster HT Protein Extraction Reagent (Novagen). The soluble fraction was applied onto a His-Trap HP column (GE Healthcare) equilibrated with the binding buffer [500 mM sodium chloride, 20 mM sodium phosphate buffer pH 7.4]. The column was washed with 2% Buffer A [1 M imidazole, 500 mM sodium chloride, 20 mM sodium phosphate buffer pH 7.4], and the His-tagged protein was eluted with 30% Buffer A. The eluate was concentrated by ultrafiltration using an Amicon Ultra-15 (Millipore, 10,000 cut-off) and applied onto a …
Although the actual concentration of Ligand1 in solution in the crystallization drop was assumed to be less than 1 mM, the saturation level would be maintained during crystallization because a suspension was used. Orthorhombic crystals with low diffraction quality grew in seven days to an approximate size of 10 × 60 × 10 μm. To obtain high-resolution crystals of the cocrystallization form, these crystals were used for seeding, in which the seed crystals were added to another crystallization drop three days after the crystallization setup. The high-resolution orthorhombic crystals grew in seven days after the seeding to an approximate size of 10 × 60 × 10 μm. All crystals for the X-ray data collection were frozen using a cryoprotectant solution consisting of 30% (v/v) ethylene glycol in the crystallization precipitant solution.

X-ray data collection and structure determination
X-ray diffraction data were collected at the beamline BL26B2 of SPring-8, Japan. The X-ray wavelength used was 1.0 Å, with a crystal-to-detector distance of 200 mm and an oscillation angle of 1°. For both the soaking and the cocrystallization forms, complete diffraction data sets were obtained to 2.1 Å resolution at 100 K. The data were processed and scaled using the HKL2000 program package [51].
The crystal structures were solved by the molecular replacement method using the program MOLREP [52] from the CCP4 program suite [53], using an available structure of the human Keap1 Kelch domain (PDB entry 1u6d) [30] as a search model. Refinement was carried out using the programs REFMAC5 [54] and CNX (Accelrys Inc.) [55]. The structure was visualized and modified using the program Coot [56]. Several rounds of manual fitting and refinement were carried out through careful inspection of 2Fo − Fc and Fo − Fc electron-density maps. The stereochemical quality of the final structures was checked using the program PROCHECK [57]. The statistics from crystallographic analysis are given in Table 1. The superposition of models was performed using the program LSQKAB [58] from the CCP4 suite. The visualization of molecules in figures was prepared using the program Quanta 2000 (Accelrys Inc.) for Figs. 3-5, and the program Discovery Studio (Accelrys Inc.) for Fig. 7.
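The Rmsd after least-squares superposition of matched atoms can be sketched with the Kabsch method in NumPy. This is an illustrative re-implementation under the assumption that both coordinate sets list the same atoms in the same order (for example, Cα atoms of residues 325-609); it is not the LSQKAB code, and the function name and toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(mobile, target):
    """Rmsd (same units as input) after optimal rigid-body superposition.

    mobile and target are (N, 3) arrays of matched atoms in the same order.
    """
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(target, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("coordinate arrays must both have shape (N, 3)")
    # Remove the translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate the mobile set onto the target and measure the deviation.
    P_fit = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_fit - Qc) ** 2, axis=1))))

# Toy check (hypothetical coordinates): a rotated and shifted copy fits exactly.
rng = np.random.default_rng(0)
ref = rng.normal(size=(10, 3))
theta = np.deg2rad(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = ref @ rot_z.T + np.array([1.0, -2.0, 0.5])
rmsd_exact = kabsch_rmsd(moved, ref)
```

Residues that differ strongly between crystal forms, such as the N-terminal three residues here, would simply be left out of both arrays before the call.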

Molecular dynamics calculations
The molecular dynamics (MD) simulation was performed using the program Discovery Studio v4.0.100.13344 (Accelrys Inc.) with the CHARMm force field [59]. The atomic coordinates of 3vng and 3vnh were read into Discovery Studio without crystal water molecules. The protonation state was estimated using the pK prediction function of Discovery Studio; the total charge of the molecules including Ligand1 was set to −5 at pH 7.4. The system was placed in a truncated octahedron cell under periodic boundary conditions with a molecule-boundary distance of 20 Å. The system was then explicitly solvated and neutralized by adding Na+ and Cl− ions to reach an ionic strength of 0.145. Each cell contained: 4334 protein atoms, 39 ligand atoms, 46 Na+ ions, 41 Cl− ions and 46,560 water atoms for the soaking form (3vnh); 4334 protein atoms, 39 ligand atoms, 47 Na+ ions, 42 Cl− ions and 47,430 water atoms for the cocrystallization form (3vng). The simulation protocol was composed of six sequential stages: the steepest descent minimization with a target gradient of 1.0 kcal mol⁻¹ Å⁻¹; the first heating stage from 0 to 10 K for 0.1 ps with a step duration of 0.1 fs; the first equilibration stage at 10 K for 1 ps with a step duration of 0.1 fs; the second heating stage from 50 to 300 K for 4 ps with a step duration of 2 fs; the second equilibration stage at 300 K for 10 ps with a step duration of 2 fs; and the final NPT production stage using the leap-frog integrator at 300 K for 20 ns with a reference pressure of 1.0 atm and a step duration of 2 fs. The SHAKE constraint [60] was turned off in the first three stages and applied in the later stages. The particle-mesh Ewald method [61] was selected for the electrostatic calculation throughout the stages. The coordinates of the trajectory were recorded every 20 ps. Either 8 or 16 processors were used for the calculation. Unless otherwise noted, parameters were set to the default values in Discovery Studio.
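Because coordinates were recorded every 20 ps, the structure numbers quoted for the cluster centers map directly to simulation times. The sketch below assumes that the first recorded structure sits at 14 ps; this offset is inferred from the times quoted in this article (for example, the 501st structure at 10.014 ns) and is not stated as a protocol parameter. The function names are hypothetical.

```python
RECORD_INTERVAL_PS = 20   # stated recording interval
FIRST_FRAME_PS = 14       # assumption: inferred from the quoted frame times

def frame_time_ps(frame_number):
    """Time in ps of the frame_number-th recorded structure (1-based)."""
    if frame_number < 1:
        raise ValueError("frame numbers are 1-based")
    return FIRST_FRAME_PS + (frame_number - 1) * RECORD_INTERVAL_PS

def frame_time_ns(frame_number):
    """Same time expressed in ns."""
    return frame_time_ps(frame_number) / 1000.0

# Structure numbers of the cluster centers quoted in the text.
center_times_ps = {n: frame_time_ps(n) for n in (892, 774, 658, 842, 597)}
```

Integer picoseconds are used internally so that the comparison with the quoted times is exact.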
The temperature fluctuation during the production stage was comparable to that estimated from the number of degrees of freedom of the system. The number of interatomic contacts in each MD structure was calculated using the program CONTACT in the CCP4 suite [53]. The cluster analysis was performed using Perl scripts based on the method described by Daura et al. [45], in which a series of non-overlapping clusters is obtained. First, the Rmsd from a superposition of relevant atoms was calculated between all pairs of structures. Then, for each structure, the number of neighbor structures with an Rmsd value less than a cutoff value was calculated. The cutoff value of Rmsd was determined by searching around 70% of the average intra-trajectory Rmsd.
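The neighbor-counting scheme can be sketched in Python as follows. This is a minimal illustration of a Daura-type clustering on a precomputed pairwise Rmsd matrix, written under the assumption that the structure with the most neighbors is taken as a center, removed together with its neighbors, and the procedure repeated; it is not the Perl scripts used in this work. Indices are 0-based, unlike the 1-based structure numbers in the text, and the toy matrix is hypothetical.

```python
import numpy as np

def daura_cluster(rmsd, cutoff):
    """Return a list of (center_index, member_indices) for non-overlapping clusters.

    rmsd is a symmetric (N, N) matrix of pairwise Rmsd values; two structures
    are neighbors when their Rmsd is less than cutoff.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    if cutoff <= 0.0:
        raise ValueError("cutoff must be positive")
    n = rmsd.shape[0]
    neighbors = rmsd < cutoff            # each structure is its own neighbor
    remaining = np.ones(n, dtype=bool)   # structures not yet assigned
    clusters = []
    while remaining.any():
        # Count neighbors among unassigned structures only.
        counts = (neighbors & remaining[None, :]).sum(axis=1)
        counts[~remaining] = -1          # assigned structures cannot be centers
        center = int(np.argmax(counts))
        members = np.flatnonzero(neighbors[center] & remaining)
        clusters.append((center, members.tolist()))
        remaining[members] = False       # remove the whole cluster from the pool
    return clusters

# Toy example: six "structures" on a line, with |xi - xj| standing in for Rmsd.
toy_x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 9.0])
toy_rmsd = np.abs(toy_x[:, None] - toy_x[None, :])
toy_clusters = daura_cluster(toy_rmsd, cutoff=0.15)
```

In practice the cutoff would be chosen as described above, and only clusters above a population threshold (over 10% of the 500 structures in this work) would be reported.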