High‐resolution crystal structures of the botulinum neurotoxin binding domains from subtypes A5 and A6

Botulinum neurotoxins (BoNTs) cause the deadly condition called botulism, but they can be used as therapeutics for a wide range of indications. BoNTs are classified into many different serotypes and subtypes, each with potentially different intoxication properties. Here, we report crystal structures of the receptor‐binding domains from two subtypes (BoNT/A5 and BoNT/A6) and compare their binding sites with previous BoNT/A structures.

Clostridium botulinum neurotoxins (BoNTs) cause flaccid paralysis through inhibition of acetylcholine release from motor neurons; however, at tiny doses, this property is exploited for use as a therapeutic. Each member of the BoNT family of proteins consists of three distinct domains: a binding domain that targets neuronal cell membranes (HC), a translocation domain (HN) and a catalytic domain (LC). Here, we present high-resolution crystal structures of the binding domains of BoNT subtypes /A5 (HC/A5) and /A6 (HC/A6). These structures show that the core fold identified in other subtypes is maintained, but with subtle differences at the expected receptor-binding sites.
Clostridial botulinum neurotoxins (BoNTs) are responsible for causing the deadly condition, botulism, in vertebrates [1][2][3][4]. There are seven distinct serotypes termed BoNT/A through BoNT/G, of which serotypes /A, /B, /E and /F are associated with botulism in humans [5]. Each BoNT serotype can be further categorised into subtypes based on amino acid sequence identity. For example, there are currently eight known subtypes of BoNT/A (/A1-/A8), which share between 84% and 97% sequence identity [6]. Although BoNTs are the most toxic biological molecules known to science, they are used in human therapy, especially BoNT/A1 [7].
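As an illustration of how a pairwise sequence identity figure such as those quoted above can be obtained from an alignment, the following Python sketch counts identical residues over ungapped aligned columns. The function name and the toy sequences are hypothetical and are not taken from this study; other conventions for the denominator (for example, the full alignment length) exist and give slightly different values.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    toy_a = "ACDEFGHIKL"
    toy_b = "ACDEYGHIKL"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```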
The BoNTs contain three major functional domains: a binding domain located in the C-terminal half of the heavy chain (HC), a translocation domain located in the N-terminal half of the heavy chain (HN) and a Zn2+-dependent protease domain located in the light chain (LC). The HC is responsible for targeting the BoNT to the neuronal cell membrane by binding to specific gangliosides and protein receptors on the neuronal cell surface. The HN facilitates entry of the LC into the cytosol, where it cleaves its target SNARE protein(s), which inhibits exocytosis. Although there are currently more than 46 different BoNT subtypes, limited structural information is available for the majority of these natural variants. Many of these subtypes have been found to possess beneficial properties when compared to the commercially available toxins.
The BoNT subtypes from within the same serotype display a high degree of amino acid sequence identity and similarity; however, several studies have found distinct differences in their properties [8][9][10][11][12] (Fig. 1). Although the molecular basis of intoxication is not yet fully understood, the LC appears to define the length of intoxication (duration of action), while both HN and HC appear to be responsible for the spread and speed of cellular entry (onset of action). Owing to their toxicity, BoNTs are classed as tier 1 select agents because of their potential misuse in bioterrorism or as a bioweapon. From this perspective, structural details of each subtype may aid the design of broadly BoNT-neutralising antibodies.

Fig. 1. Alignment of the binding domain sequences from BoNT/A1 to A8. BoNT/A1 numbering and secondary structure are used for annotation. Figure generated using ESPript [34].

Materials and methods
All reagents used were purchased from Sigma-Aldrich (Dorset, UK) or Fisher Scientific (Leicestershire, UK) unless otherwise specified.

Protein expression and purification
The binding domains (residues 871-1296) of BoNT/A5 and BoNT/A6 were cloned into the pJ401 vector (Atum Bio, California, USA) from their respective full-length sequences (UniProtKB: C7BEA8 and C9WWY7) with an N-terminal 6xHis tag. Constructs were expressed and purified as described previously [17]. The N-terminal 6xHis tag was not removed from the proteins prior to crystallisation.

X-ray data collection and structure determination
Complete X-ray diffraction data sets were collected from single crystals of HC/A5 and HC/A6 (3600 images each) using 0.1° oscillations and a wavelength of 0.98 Å at beamlines I03 and I04 (Diamond Light Source, Didcot, UK). Raw images were processed using DIALS [23], and integrated data were scaled and merged using Aimless [24] from the CCP4 suite [25]. The 3D structures of both proteins were solved by molecular replacement with PHASER [26] using the coordinates from Phyre2 web server homology models [27] as search models. Both models were manually built in COOT [28] and refined with REFMAC [29] in the CCP4 suite of programs [25]. The structures were validated with PDB_REDO [30], MOLPROBITY [31] and wwPDB VALIDATION [32]. Crystallographic data processing and refinement statistics are given in Table 1. Structure-based figures were generated with either PyMOL (Schrödinger, LLC, New York, NY, USA) or MOE (Chemical Computing Group, Quebec).
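The data collection parameters quoted above imply a full 360° rotation per crystal and a photon energy of roughly 12.65 keV. The short Python sketch below reproduces this arithmetic; the function names are hypothetical, and the conversion uses the standard relation E = hc/λ with hc ≈ 12.398 keV·Å.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def total_rotation_deg(n_images: int, oscillation_deg: float) -> float:
    """Total crystal rotation covered by a data set."""
    return n_images * oscillation_deg


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy (keV) for a given X-ray wavelength (Å)."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # Values taken from the text: 3600 images, 0.1° per image, 0.98 Å.
    print(f"rotation: {total_rotation_deg(3600, 0.1):.1f} degrees")
    print(f"energy:   {photon_energy_kev(0.98):.2f} keV")
```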

Results and Discussion
Structure of the BoNT/A5-binding domain (HC/A5)
The crystal of HC/A5 belonged to the orthorhombic space group P2₁2₁2₁ and diffracted to a resolution of 1.15 Å (Table 1). Electron density was excellent throughout, with all HC/A5 residues (except the N-terminal 6xHis tag and Lys871) being easily observed. The structure closely resembles the structures of other BoNT-binding domains [6], with an N-terminal jelly roll-like fold and a C-terminal modified β-trefoil fold containing a conserved ganglioside-binding site (SxWY) (Fig. 2). However, compared to the structure of BoNT/A1 in complex with GT1b (PDB: 2VU9), the loop of residues 1260-1280, which contains ganglioside-interacting residues, adopts a different arrangement (Fig. 3a,b). It is possible that upon ganglioside binding, the loop changes conformation to allow S1275 and R1276 to take part in the interaction. In comparison with the unbound GD1a-binding sites of BoNT/A1 (PDB: 3BTA) and BoNT/A3 (PDB: 6F0O), the corresponding site in HC/A5 appears to resemble the latter more closely than the former, which is consistent with a higher sequence identity between the sites at residues corresponding to positions 1117, 1254 and 1278 (Figs 1 and 4). Either way, given that both BoNT/A1 and BoNT/A3 are able to bind to GD1a (PDB: 5TPB and 6THY, respectively), BoNT/A5 is likely able to do so too.
An unusual feature was observed in close proximity to the ganglioside-binding loop: a methylene bridge between the Sγ of Cys1280 and the Nζ of Lys1236 (Fig. 5a), rather than a disulfide bond with a nearby cysteine residue (Cys1235). During refinement of the HC/A5 structure, clear electron density was observed between the side chains of Cys1280 and Lys1236, into which a methylene group could be fitted. Weak anomalous data recorded at the start of data collection were used to generate a low-resolution anomalous difference map. Despite the noise, large peaks were observed at the location of sulfur atoms, which confirmed the location of each cysteine residue (Fig. 5b). This specific methylene bond between a lysine and a cysteine side chain is unusual, and the mechanism by which a methylene-bridged lysine and cysteine forms is not fully understood [33]. Whether this bond is biologically relevant remains to be established. While there are indications of this bond in the electron density maps of other BoNT crystal structures, it is possible that it is an artefact of exposure to synchrotron radiation.
Inspection of the region of the HC/A5 structure corresponding to the BoNT/A1 SV2C-binding site (¹¹³⁹PRGSVMTT¹¹⁴⁶ + Arg1156) reveals what may be a slightly shortened β-hairpin (Fig. 3d,e). The three different residues at positions 1143, 1144 and 1156 (V→I, M→V and R→M, respectively) do not appear to preclude the possibility of SV2C binding. Indeed, the related binding domain of BoNT/HA possesses the same residues at the corresponding location and is still able to bind to SV2C [20]. However, inspection of the accompanying SV2C glycan-binding site reveals one residue (Gln1064) potentially hindering the binding of glycan (Fig. 3g,h). This residue, which corresponds to His1064 in BoNT/A1, has been shown to drastically decrease the binding affinity for SV2C [20]. Although this suggests that SV2C may not be the protein receptor for BoNT/A5, it should be noted that there exists a second BoNT/A5 sequence that differs by this one residue (UniProtKB: C1IPK2).

Structure of the BoNT/A6-binding domain (HC/A6)
The crystals of HC/A6 belonged to the orthorhombic space group P2₁2₁2₁ and diffracted to a resolution of 1.35 Å. Electron density was excellent, with all but the first six residues of HC/A6 being clearly observed, and, as for HC/A5, the overall protein fold was highly similar to other BoNT HC structures. The ganglioside-binding site was identical to that of HC/A5 except for residue 1117, which was a Phe rather than a Tyr (Fig. 3b,c).
Although the absence of the hydroxyl group would result in the loss of hydrogen bonding with the terminal sialic acid of GT1b, the side chain can still interact with the carbon ring. Compared to the unbound GD1a-binding sites of BoNT/A1 (PDB: 3BTA) and BoNT/A3 (PDB: 6F0O), the corresponding site in HC/A6 also resembles the latter more closely than the former, even though there is no greater sequence identity between the sites at residues corresponding to positions 1117, 1254 and 1278 (Figs 1 and 4). As for BoNT/A5, BoNT/A6 is predicted to be able to bind to GD1a as well.
For the corresponding BoNT/A1 SV2C-binding site in HC/A6, a larger sequence variation is observed: ¹¹³⁹SRSTLLTT¹¹⁴⁶ + Met1156 rather than ¹¹³⁹PRGSVMTT¹¹⁴⁶ + Arg1156 for BoNT/A1. Despite these differences, the β-hairpin remains available to bind to SV2C via mostly backbone-backbone hydrogen bonding (Fig. 3d,f). Like HC/A5, HC/A6 possesses a different residue in the glycan-binding site at position 1064 (Arg) compared to that of BoNT/A1 (His, Fig. 3g,i), and this has also been reported to significantly reduce binding of glycosylated SV2C [20]. This would suggest that BoNT/A6 may have a lower affinity for SV2C than BoNT/A1. Interestingly, BoNT/A2 also contains an Arg at position 1064, and it has previously been reported that both BoNT/A2 and BoNT/A6 are capable of entering hiPSC-derived neurons faster than BoNT/A1 [8].
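The motif differences described for the SV2C-binding site can be enumerated position by position. The Python sketch below is a minimal illustration with a hypothetical helper name; the BoNT/A1 and BoNT/A6 motifs are taken from the text, whereas the BoNT/A5 motif is an assumption obtained by applying the two reported substitutions at positions 1143 and 1144 (V→I and M→V) to the BoNT/A1 motif.

```python
def list_substitutions(reference: str, variant: str, start: int) -> list[str]:
    """Return substitutions such as 'V1143I' between two equal-length motifs."""
    if len(reference) != len(variant):
        raise ValueError("motifs must have equal length")
    return [
        f"{ref_res}{start + offset}{var_res}"
        for offset, (ref_res, var_res) in enumerate(zip(reference, variant))
        if ref_res != var_res
    ]


A1_MOTIF = "PRGSVMTT"  # BoNT/A1 residues 1139-1146, from the text
A6_MOTIF = "SRSTLLTT"  # BoNT/A6 residues 1139-1146, from the text
A5_MOTIF = "PRGSIVTT"  # assumption: A1 motif with V1143I and M1144V applied

if __name__ == "__main__":
    print("A1 -> A5:", list_substitutions(A1_MOTIF, A5_MOTIF, 1139))
    print("A1 -> A6:", list_substitutions(A1_MOTIF, A6_MOTIF, 1139))
```

Under these assumptions, the comparison lists two substitutions for BoNT/A5 and five for BoNT/A6 within residues 1139-1146; residue 1156 is handled separately in the text and is not part of the motif string.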

Conclusions
The BoNT/A subtypes are believed to bind to the target cell surface via a dual-receptor complex involving a ganglioside and a protein receptor. For BoNT/A1, these are GT1b (preferentially) and SV2C, respectively, but for most of the other subtypes, the exact identities of these receptors have not yet been determined. Structural analysis of the expected binding sites has revealed some differences from those of BoNT/A1, suggesting either an altered binding affinity for each receptor or a different receptor specificity altogether. Our high-resolution structures add to the body of knowledge around BoNT receptor binding and enhance the available molecular information for engineering novel therapeutic BoNTs and BoNT-binding moieties.