A high‐resolution (1.2 Å) crystal structure of the anti‐CRISPR protein AcrIF9

This study determined the 1.2 Å crystal structure of AcrIF9 to understand the molecular basis of its anti‐CRISPR function. The crystal structure was compared with the reported cryo‐electron microscopy structure of AcrIF9 associated with the cascade complex to identify similarities and differences in their features.

To counteract this prokaryotic immune system, which confers resistance to foreign genetic elements, phages have evolved multiple anti-CRISPR genes encoding anti-CRISPR (Acr) proteins that inhibit the host CRISPR-Cas system [12-14]. Based on genome searches and comparisons using advanced machine learning tools, approximately 60 Acr genes have been identified thus far [12,15]. Because Acr genes share little sequence homology, they are classified according to the CRISPR-Cas systems they target [11,12].
Although the diverse structures and mechanisms of many Acr proteins have been revealed [11,14,16,17], the inhibitory mechanism of the AcrIF9 family has not been identified because of limited structural information. Thus, to understand the molecular basis underlying the AcrIF9 anti-CRISPR function, we determined the 1.2 Å high-resolution crystal structure of AcrIF9. Structural and biochemical studies showed that AcrIF9 exists as a monomer in solution and can interact directly with DNA through a positively charged cleft. During the preparation of our manuscript, the cryo-electron microscopy (EM) structure of AcrIF9 associated with the cascade complex was released [18]. By comparing our structure with that of cascade-complexed AcrIF9, we identified a number of similarities and differences in various features of the AcrIF9 structure.

Cloning, overexpression, and purification
The AcrIF9 gene from the Pseudomonas aeruginosa phage was synthesized by Bionics (Daejeon, Korea) and cloned into a pET21a plasmid vector (Novagen, Madison, WI, USA) with a C-terminal polyhistidine tag. The NdeI and XhoI restriction sites were used for cloning. The procedures used for expression and purification of the target protein were similar to those used in our previous study [19]. The resulting recombinant vector containing the full-length AcrIF9 gene (residues 1-68) was transformed into Escherichia coli strain BL21(DE3) competent cells. The cells were cultured at 37°C in 1 L of lysogeny broth containing 50 μg·mL⁻¹ kanamycin. When the optical density at 600 nm reached 0.7, the temperature was lowered to 20°C, and 0.5 mM isopropyl β-D-1-thiogalactopyranoside was added to induce expression of the target gene. The induced cells were cultured for a further 18 h in a shaking incubator. The cultured cells were harvested by centrifugation at 2000 g for 15 min at 4°C, resuspended in lysis buffer [20 mM Tris/HCl (pH 8.0), 500 mM sodium chloride, and 25 mM imidazole], and lysed by ultrasonication at 4°C. The cell debris and supernatant were separated by centrifugation at 10 000 g for 30 min at 4°C. The collected supernatant was mixed with Ni-nitrilotriacetic acid (NTA) affinity resin for 3 h, and the mixture was loaded onto a gravity-flow column (Bio-Rad, Hercules, CA, USA). To remove impurities, the resin was washed with 50 mL of washing buffer [20 mM Tris/HCl (pH 8.0), 500 mM NaCl, and 60 mM imidazole]. After washing, the resin-bound target protein was eluted from the column using elution buffer [20 mM Tris/HCl (pH 8.0), 500 mM NaCl, and 250 mM imidazole]. AcrIF9 was further purified by size-exclusion chromatography (SEC) using a Superdex 200 10/300 GL column (GE Healthcare, Waukesha, WI, USA) that had been pre-equilibrated with a solution comprising 20 mM Tris/HCl (pH 8.0) and 150 mM NaCl. The target protein eluted from SEC was collected, pooled, and concentrated to 3.0 mg·mL⁻¹ for crystallization. The purity of the protein was visually assessed using SDS/PAGE.

Multi-angle light scattering analysis
The absolute molecular weight of AcrIF9 in solution was measured using SEC-coupled multi-angle light scattering (SEC-MALS). The protein solution was loaded onto a Superdex 200 Increase 10/300 GL 24 mL column (GE Healthcare) pre-equilibrated with SEC buffer [20 mM Tris/HCl (pH 8.0) and 150 mM NaCl]. The flow rate of the buffer was set to 0.4 mL·min⁻¹, and SEC-MALS was performed at 20°C. A DAWN-TREOS MALS detector was connected to an ÄKTA Explorer system. The molecular weight of bovine serum albumin was measured as a reference value. Data were processed and assessed using ASTRA software (Wyatt Technology, Santa Barbara, CA, USA).

Crystallization and X-ray diffraction data collection
AcrIF9 was crystallized using the hanging-drop vapor diffusion method at 20°C. Initial crystals were obtained by equilibrating a mixture containing 1 μL of protein solution [...] The crystallization conditions were further optimized by testing a range of protein and precipitant concentrations at various pH values. The best crystals were obtained by adding 4% (v/v) 2,2,2-trifluoroethanol to the reservoir solution. The optimized crystals appeared in 14 days. A single crystal was selected and soaked in reservoir solution supplemented with 40% (v/v) glycerol for cryoprotection. X-ray diffraction data were collected at −178°C on beamline BL-5C at the Pohang Accelerator Laboratory (Pohang, Korea). Data processing, including indexing, integration, and scaling, was conducted using HKL2000 software [20].

Structure determination and refinement
The structure was determined using the ARCIMBOLDO_BORGES ab initio phasing software [21], which combines fragment search with PHASER [22] and density modification with SHELXE [23]. The initial model was built automatically using AutoBuild from the PHENIX package, and further model building and refinement were performed using COOT [24] and phenix.refine [25]. Full anisotropic refinement was used. The structure quality and stereochemistry were validated using MOLPROBITY [26]. All structural figures were generated using the PYMOL program [27].

Results and Discussion
Overall structure of AcrIF9 from the P. aeruginosa phage
The type I CRISPR-Cas system forms RNA-guided multi-subunit cascade complexes. Cas3 (trans-acting nuclease) is involved in this system to cleave the target DNA (Fig. 1A). The type I CRISPR-Cas system is divided into six subtypes, I-A to I-F, based on the subunit composition of the cascade complex [28]. Because the classification of Acr proteins depends on the target CRISPR-Cas system, AcrI proteins, which target the type I CRISPR-Cas system, can be divided into six families, AcrIA to AcrIF. Among these families, the inhibitory mechanism has been most intensively analyzed through structural studies of the AcrIF family. Previous studies have shown that the AcrIF family inhibits the type I-F CRISPR-Cas system in two different ways: (a) directly binding to cascade complex proteins and blocking the target DNA interaction (e.g., AcrIF1 [29], AcrIF2 [17], and AcrIF10 [30]) or (b) directly binding to the Cas3 helicase/nuclease protein and inhibiting the interaction of Cas3 with target DNA (e.g., AcrIF3 [31]; Fig. 1A).
Fig. 1. [...] The color of the chain from the N terminus to the C terminus gradually moves through the spectrum from blue to red. The four antiparallel β-sheets and one α-helix are labeled S1-S4 and H1, respectively. Extra residues from the expression construct (LEHHHHHH) are indicated at the C terminus. (E) Topology representation of AcrIF9. (F) Crystallographic packing of AcrIF9. The single AcrIF9 molecule in the asymmetric unit is colored in orange. The other gray molecules are symmetry-related molecules. The C-terminal six-histidine tag that is critical for crystal packing is indicated by the red dotted circle. A view focused on the single AcrIF9 molecule in the asymmetric unit is provided on the right side of the panel for a better view of crystal packing.
Although diverse structures and mechanisms of several Acr proteins have been revealed, the inhibitory mechanism of the AcrIF9 family has not been identified because of limited structural information. Thus, to understand the molecular basis underlying AcrIF9 anti-CRISPR function, we purified AcrIF9 using two-step chromatography: affinity chromatography followed by SEC. In SEC, the protein eluted at approximately 19 mL, suggesting that AcrIF9 exists as a monomer in solution (Fig. 1B). Although various AcrI families can function in monomeric form, previous structural and biochemical studies have shown that AcrI families sometimes form homodimers in solution [16,31,32]. Given the stoichiometric diversity of the AcrI families, we used MALS to determine the absolute molecular mass of AcrIF9 in solution, which was 11.2 kDa (8.2% fitting error) with a polydispersity of 1.002 (Fig. 1C). Because the theoretical molecular weight of monomeric AcrIF9 with the C-terminal histidine tag is 9.8 kDa, the peak is attributable to monomeric AcrIF9. Together, the SEC and MALS experiments indicate that AcrIF9 exists in a monomeric state in solution.
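The monomer assignment above rests on simple arithmetic: the measured mass is compared with integer multiples of the theoretical monomer mass. A minimal Python sketch of that comparison is shown below; the function name `closest_oligomer` and the `max_n` search limit are illustrative assumptions, not part of the analysis software used in this study.

```python
def closest_oligomer(measured_kda, monomer_kda, max_n=6):
    """Return (n, relative_deviation) for the n-mer whose mass is closest to the measurement."""
    # Pick the integer copy number whose theoretical mass is nearest the measured mass.
    best_n = min(range(1, max_n + 1),
                 key=lambda n: abs(measured_kda - n * monomer_kda))
    # Signed deviation of the measurement from that theoretical mass.
    deviation = (measured_kda - best_n * monomer_kda) / (best_n * monomer_kda)
    return best_n, deviation


measured = 11.2  # kDa, SEC-MALS value reported in the text
monomer = 9.8    # kDa, theoretical mass of His-tagged monomeric AcrIF9 from the text

n, dev = closest_oligomer(measured, monomer)
print(f"closest oligomeric state: {n}-mer (deviation {dev:+.1%})")
```

With the values from the text, a dimer would have a theoretical mass of 19.6 kDa, far from the measured 11.2 kDa, so the monomer is the nearest state.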
With no known structural homologues available in the PDB, we were unable to solve the phase problem by molecular replacement. However, phases were obtained using the ARCIMBOLDO_BORGES ab initio phasing software, which can use small helix and sheet fragments available in the PDB for ab initio phasing. The final 1.2 Å structure was refined to Rwork = 19.4% and Rfree = 20.5%. The diffraction data and refinement statistics for AcrIF9 are summarized in Table 1. The crystal belongs to space group P2₁2₁2₁ with one molecule in the asymmetric unit (Fig. 1D). The final model contains the complete sequence of AcrIF9 (from M1 to Q68) together with six C-terminal histidine residues and the extra leucine and glutamic acid residues (LE) from the expression construct (Fig. 1D). The structure of AcrIF9 is composed of four antiparallel β-sheets (S1-S4) surrounding one α-helix (H1; Fig. 1D). Detailed topology analysis indicated that the fold of AcrIF9 is constructed from two antiparallel β-sheets connected by one α-helix in the middle (Fig. 1E). Crystallographic packing analysis showed that the uncleaved C-terminal six-histidine tag was critical for crystal packing through interactions with neighboring molecules (Fig. 1F). We were unable to obtain crystals of AcrIF9 from which the C-terminal six-histidine tag had been removed, which may reflect the role of the tag in crystal packing.

Structural comparison with the cryo-EM structure of cascade-complexed AcrIF9
During our analysis of the AcrIF9 structure and further biochemical studies, Zhang et al. [18] released the cryo-EM structure of AcrIF9 associated with the cascade complex. Because the advantage of our crystal structure is its accuracy at very high resolution, we compared our structure with the newly reported cryo-EM structure. The 1.2 Å high-resolution crystal structure of AcrIF9 was highly ordered, and individual atoms were clearly resolved in the electron density map (Fig. 2A). Even the hole in the center of phenyl rings (e.g., F40) was visible in our structure (Fig. 2B). A structural comparison with the cryo-EM structure of cascade-complexed AcrIF9 by pairwise superimposition showed that the overall structures were almost identical, with an RMSD value of 0.5 Å; only the locations of a few loops, including the H1-S3 connecting loop and the C-terminal loop, did not align perfectly (Fig. 3A). The cryo-EM structure showed that the inhibitory mechanism of AcrIF9 results from its direct binding to the cascade spiral backbone, particularly Cas7f and Cas8f, to prevent DNA binding.
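For readers unfamiliar with the RMSD metric quoted above, the following Python sketch computes the root-mean-square deviation between matched atoms of two models. It assumes the two coordinate sets have already been optimally superimposed and paired atom by atom (the superposition step itself is not shown), and the coordinates used are hypothetical toy values, not data from either AcrIF9 structure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized, already-superimposed lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between each matched atom pair.
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (in Å) for two matched atoms in each model.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.5), (1.0, 0.0, -0.5)]
print(f"RMSD = {rmsd(model_a, model_b):.2f} Å")
```

In practice the value depends on which atoms are paired (for example, Cα atoms only versus all atoms), so the atom selection should be stated alongside any reported RMSD.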
Because Y5 and L27 of AcrIF9 are involved in interactions with Cas7f, and Q38, C39, and F40 in interactions with Cas8f, we analyzed the cascade-binding region of AcrIF9 to see whether any structural changes occur upon binding. This analysis showed that the side chains of Y5, L27, Q38, C39, and F40 in our crystal structure adopted the same conformations as those in the cryo-EM structure, indicating that the structure of AcrIF9 does not change when it binds to the cascade complex for inhibition (Fig. 3A). Although the cascade complex-binding residues of AcrIF9 were structurally identical, the side-chain positions of many other residues, especially surface residues such as K2, Q11, R17, Q21, E23, K36, D60, R63, and Q68, were not identical, indicating that those surface residues are dynamic (Fig. 3B).
Fig. 2. [...] The region contains F40, and the hole in the center of the phenyl ring is visible in the structure.
B-factor analysis indicated that our high-resolution crystal structure was rigid with a low B-factor (average of 19.30 Å²), while the cryo-EM structure was relatively less rigid with a higher B-factor (average of 75.73 Å²; Fig. 3C,D). Interestingly, the highest B-factor area in the crystal structure was the C-terminal loop right next to S4 (Fig. 3C), while the highest B-factor areas in the cryo-EM structure were H1 and the S2-H1 connecting loop (Fig. 3D). These findings indicate that AcrIF9 has a flexible C-terminal loop and changes its structural properties after binding to the cascade complex. The flexible C-terminal loop becomes rigid, and H1 along with the S2-H1 connecting loop becomes less rigid after target protein binding.
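A per-residue B-factor profile of the kind compared above can be derived directly from the ATOM records of a PDB-format coordinate file. The Python sketch below is a minimal illustration assuming standard fixed-column PDB formatting; the helper `make_atom_line` and the B-factor values it generates are hypothetical toy data, not values from the AcrIF9 structures.

```python
from collections import defaultdict


def mean_b_per_residue(pdb_lines):
    """Average the B-factors of ATOM records per (chain, residue number, residue name)."""
    sums = defaultdict(lambda: [0.0, 0])
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: resName 18-20, chainID 22, resSeq 23-26, B-factor 61-66.
        key = (line[21], int(line[22:26]), line[17:20].strip())
        sums[key][0] += float(line[60:66])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def make_atom_line(serial, name, res_name, chain, res_seq, b_factor):
    """Build a toy ATOM record (hypothetical helper; coordinates are set to zero)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}")


# Hypothetical records: two atoms each for one core and one C-terminal residue.
records = [
    make_atom_line(1, "CA", "PHE", "A", 40, 10.0),
    make_atom_line(2, "CB", "PHE", "A", 40, 12.0),
    make_atom_line(3, "CA", "GLN", "A", 68, 30.0),
    make_atom_line(4, "CB", "GLN", "A", 68, 34.0),
]
means = mean_b_per_residue(records)
most_flexible = max(means, key=means.get)
print(means)
print("highest mean B-factor:", most_flexible)
```

Note that B-factors from a crystal structure and a cryo-EM model are refined differently, so comparing the relative pattern along the chain is generally more informative than comparing absolute values.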

AcrIF9 directly binds to DNA as well as cascade complex proteins
The charge distribution and surface features were analyzed by calculating the surface electrostatic potential. This analysis showed that AcrIF9 contains a highly positively charged cleft between the Cas7f- and Cas8f-binding regions (Fig. 4A). Based on this observation, we speculated that AcrIF9 may bind to negatively charged DNA as well as to cascade proteins (Cas7f and Cas8f). In a structural homology search using the Dali server [33], Cas3 (PDB id: 5B7I) [31] and Cas2 (PDB id: 4P6I) [34] were identified as structural homologues of AcrIF9, although the top hits were, in order, antitoxin Dmd (PDB id: 5I8J) [35], PURS protein (PDB id: 1VQ3) [36], and insecticidal protein (PDB id: 5V3S) [37] (Table 2). Because Cas3 and Cas2 are nucleases/helicases that bind DNA, this structural similarity supports our idea that AcrIF9 may also have the ability to bind DNA. Finally, we performed a direct DNA-binding test using an agarose gel shift assay. As indicated in Fig. 4B, linearized plasmid DNA was shifted upward by the addition of AcrIF9 in a concentration-dependent manner, indicating that AcrIF9 directly binds to DNA. It has been revealed that AcrIF9 inhibits CRISPR-Cas systems by binding to the spiral backbone of the cascade complex to prevent further DNA cleavage by Cas3. If AcrIF9 binds to DNA while it is bound to the cascade complex, AcrIF9 may hijack the target DNA. Our structural comparison with the cryo-EM structure of cascade-complexed AcrIF9 revealed that the Cas7f- and Cas8f-binding regions of AcrIF9 were rigid in conformation with or without cascade proteins, whereas the S2-H1 connecting loop and H1 of AcrIF9 became less rigid after binding to the cascade complex. Because the S2-H1 connecting loop and H1 region are putative DNA-binding regions featuring highly positive charges, this loss of rigidity after binding to the cascade complex may be important for further DNA recognition and subsequent inhibition of CRISPR-Cas systems. Future biochemical and structural studies are needed to elucidate the role of the DNA-binding capability of AcrIF9 during CRISPR-Cas system inhibition.