Characterization of P. falciparum dipeptidyl aminopeptidase 3 specificity identifies differences in amino acid preferences between peptide‐based substrates and covalent inhibitors

Malarial dipeptidyl aminopeptidases (DPAPs) are cysteine proteases important for parasite development, thus making them attractive drug targets. In order to develop inhibitors specific to the parasite enzymes, it is necessary to map the determinants of substrate specificity of the parasite enzymes and their mammalian homologue cathepsin C (CatC). Here, we screened peptide-based libraries of substrates and covalent inhibitors to characterize the differences in specificity between parasite DPAPs and CatC, and used this information to develop highly selective DPAP1 and DPAP3 inhibitors. Interestingly, while the primary amino acid specificity of a protease is often used to develop potent inhibitors, we show that equally potent and highly specific inhibitors can be developed based on the sequences of nonoptimal peptide substrates. Finally, our homology modelling and docking studies provide potential structural explanations of the differences in specificity between DPAP1, DPAP3, and CatC, and between substrates and inhibitors in the case of DPAP3. Overall, this study illustrates that focusing the development of protease inhibitors solely on substrate specificity might overlook important structural features that can be exploited to develop highly potent and selective compounds.


Introduction
Malaria is a devastating infectious parasitic disease causing nearly half a million deaths every year [1]. Malaria is caused by parasites of the Plasmodium genus and is transmitted by Anopheles mosquitoes during a blood meal. Within the mosquito midgut, parasites reproduce sexually, multiply and travel to the salivary glands from where they are transmitted to the human host. Upon infection, parasites first establish an asymptomatic infection in the liver, followed by exponential asexual replication in the bloodstream through multiple rounds of red blood cell (RBC) invasion, intracellular replication and egress from infected RBCs. This erythrocytic cycle is responsible for the symptoms and pathology of malaria. Over the last 15 years, the world has seen a very significant drop in malaria incidence, mainly due to the global distribution of insecticide-impregnated bed nets and the use of artemisinin-based combination therapies as the standard of care for uncomplicated malaria [2]. However, malaria remains a major global health burden with half of the world population at risk and around 200 million clinical cases per year. Unfortunately, mosquitoes are becoming increasingly resistant to insecticides [3], and artemisinin resistance is on the rise [4], making the identification of antimalarial targets and the development of drugs with novel mechanisms of action extremely urgent [5].
Dipeptidyl aminopeptidases (DPAPs) are papain-fold cysteine proteases that are expressed at all stages of parasite development [6,7] and might therefore be viable drug targets to treat malaria and prevent its transmission. DPAPs recognize the free N-terminus of protein substrates and cleave N-terminal dipeptides [8,9]. The mammalian homologue cathepsin C (CatC) is the best studied DPAP [10]. In most cells, CatC has a catabolic lysosomal function. However, in immune cells it is responsible for activating various granule serine proteases involved in the immune response and inflammation such as neutrophil elastase, chymase, granzyme A and B or cathepsin G [11][12][13][14]. Because of its role in activating pro-inflammatory proteases, CatC has been pursued as a potential target for chronic inflammatory diseases [15][16][17]. Phase I clinical trials with CatC inhibitors have been performed by GSK (GSK2793660) [18] and AstraZeneca (AZD7986) [19], thus proving that DPAPs can be targeted with small drug-like molecules.
Three DPAPs are conserved across Plasmodium species but very little is known about their molecular functions. In P. falciparum, the most virulent Plasmodium species responsible for 90% of malaria mortality, attempts to directly knock out (KO) DPAP1 [20] or DPAP3 [21] have been unsuccessful, suggesting that they are important for parasite replication. In the P. berghei murine model of malaria, KO of DPAP1 or DPAP3 results in a significant decrease in parasite replication [22][23][24]. DPAP1 localizes mainly in the digestive vacuole [20], an acidic organelle where degradation of haemoglobin takes place. This proteolytic pathway provides a source of amino acids for protein synthesis and liberates space within the RBC for parasites to grow. DPAP1 has been proposed to play an essential role at the bottom of this catabolic pathway [20,25]. However, this function has not yet been confirmed genetically. Previously published inhibition studies suggested that DPAP3 was at the top of the proteolytic cascade that controls parasite egress from infected RBCs (iRBCs) [26]. However, our recent conditional KO studies have disproven this hypothesis and shown instead that DPAP3 activity is critical for efficient RBC invasion [21]. Finally, DPAP2 is only expressed in sexual stages and has been shown to be important for gamete egress from iRBCs, thus making it a potential target to block malaria transmission [27,28]. Overall, a pan-DPAP inhibitor will target the parasite at different stages of development, thus potentially slowing down the emergence of resistance.
A clear understanding of the determinants of substrate specificity of Plasmodium DPAPs and CatC will be required in order to develop pan-DPAP inhibitors with minimal off-target effects on host CatC, and to design highly specific inhibitors to study the biological function of DPAP1 and DPAP3. In this article we will use the accepted Schechter and Berger nomenclature to describe the specificity of proteases [29]. Residues upstream of the scissile bond will be referred to as P1, P2, P3, etc. Their side chains bind into the S1, S2, S3, etc. pockets of the active site respectively. Residues downstream of the scissile bond are referred to as P1′, P2′, P3′, etc., and they bind into the corresponding S1′, S2′, S3′, etc. pockets. The scissile bond is between the P1 and P1′ positions.
A general approach to determine the specificity of proteases upstream of the scissile bond (nonprime pockets) is the use of positional scanning substrate libraries where a fluorophore is conjugated to the C terminus of a peptide library via an amide bond. Proteolytic cleavage of this bond results in a significant increase in fluorescence intensity allowing accurate measurement of substrate turnover. The most common libraries used for this purpose are positional scanning synthetic combinatorial libraries (PS-SCL) [30][31][32]. PS-SCL are composed of multiple sub-libraries designed to determine the specificity of each nonprime binding pocket in a protease. In each sub-library, the amino acid (AA) at a specific position is varied while a stoichiometric mixture of all natural AAs is used in all other positions. PS-SCL thus provide the substrate specificity at each site in the context of all possible combinations of AAs at all other positions. Alternatively, the specificity of a given binding pocket can be determined by varying the identity of the AA at that position while fixing the rest of the peptide to residues known to be recognized by the protease of interest. This approach has been used to fingerprint the specificity of amino exopeptidases such as aminopeptidases [33,34] or DPAPs [35], which only recognize one or two AAs upstream of the scissile bond respectively. PS-SCL have also been applied to protease inhibitor libraries by replacing the fluorophore with a reversible or irreversible warhead [36]. Optimum substrates and inhibitors are then designed by combining the best residues in each position. Importantly, the recent incorporation of non-natural AAs into these libraries has significantly increased the chemical space that can be explored to characterize the specificity of proteases and has allowed the design of substrates and inhibitors with enhanced selectivity over compounds that contain only natural amino acids [37][38][39][40].
Structure-activity relationship (SAR) studies with positional scanning substrate and inhibitor libraries have been performed both on DPAP1 and CatC [25,35,41], but to a much lesser extent on DPAP3 [26]. Here, we used libraries of peptide-based substrates and inhibitors to determine the specificity of P. falciparum DPAP3 at the P1 and P2 positions. Importantly, the libraries used in this study have been previously screened against DPAP1 and CatC and are therefore ideal to compare the specificities of these three proteases [35]. Our studies show that DPAP3 preferentially cleaves after basic and large aromatic residues (P1 position), and that it prefers substrates having N-terminal aliphatic residues (P2 position). We also identified several non-natural P2 residues that are exclusively recognized by either DPAP1 or DPAP3. By combining the SAR information obtained from these substrate and inhibitor screens, we developed specific DPAP1 and DPAP3 inhibitors that remain selective in live parasites. Interestingly, while SAR information obtained from positional scanning substrate libraries is often used to develop potent protease inhibitors [39], we have identified significant differences in specificity between substrates and inhibitors, particularly in the case of DPAP3. Homology modelling and docking studies provide structural explanations for the differences in specificity between these enzymes, and between substrates and inhibitors in the case of DPAP3. Overall, our study shows that while highly potent inhibitors can be designed based on the sequence of optimal substrates, equally potent and specific inhibitors can be developed using sequences of nonoptimal substrates.

DPAP3 substrate specificity
A positional scanning library of 96 substrates (Fig. 1A), composed of a P1 sub-library of 39 substrates (P2 fixed to Met) and a P2 sub-library of 57 substrates (P1 fixed to homophenylalanine (hPhe)), was screened at 1 µM against recombinant DPAP3 (DPAP3; Fig. 1B,C). The fixed P1 and P2 side chains were selected based on previously known AA preferences for CatC. The heat map shown in Fig. 1B compares the specificities of DPAP3 with those previously obtained for DPAP1 and CatC [35] at the same substrate concentration. Note that D-Phg is the only D-AA in P2 that is cleaved by DPAP3, albeit poorly. The remaining 17 substrates containing D-AAs in P2 (D-hPhe and all natural D-AAs except D-Cys and D-Met) were not cleaved by DPAP3, nor by DPAP1 or CatC. To simplify Fig. 1, D-Phg is the only substrate containing a D-AA that is shown.
Clear differences in specificity were observed between the three DPAPs at the P2 position (Fig. 1B). DPAP3 seems to have a narrower P2 specificity for natural AAs than DPAP1 or CatC, probably reflecting its more specific biological function in RBC invasion compared to the catabolic functions of DPAP1 and CatC. Both DPAP1 and DPAP3 have a strong preference for long aliphatic residues such as Leu, Ile, norleucine (nLeu), Met or norvaline (nVal). Some non-natural AAs seem to be exclusively cleaved by DPAP3 such as cyclohexylglycine (Chg), Phg or 4-methyl-phenylalanine (Phe(Me)). Surprisingly, Phe(Me) is the only substrate in the library with an aromatic P2 residue that is accepted by DPAP3 even though vinyl sulfone (VS) inhibitors with P2 aromatic residues such as Tyr, Trp or nitrotyrosine (Tyr(NO2)) have been shown to be potent DPAP3 inhibitors [21,26]. Interestingly, an Ile in P2 is efficiently cleaved by both Plasmodium DPAPs but not by CatC.

Development of optimum DPAP3 substrates
To determine how P1 and P2 side chains influence kcat and Km for DPAP3, we performed Michaelis-Menten studies on selected substrates from the P1 and P2 libraries. In addition, we synthesized a series of substrates that combine optimal natural and non-natural AAs for DPAP3: Arg, hPhe, nLeu(o-Bzl) and Bpa in P1, and Leu, Val, nVal and Phg in P2. We also tested substrates predicted to be DPAP1-selective (Pro-Arg-ACC and hPro-hPhe-ACC) against DPAP3. Finally, because we were surprised by the lack of activity observed for substrates with aromatic P2 residues, we measured Michaelis-Menten parameters for Phe-Arg-ACC, Trp-hPhe-ACC, and Tyr(NO2)-hPhe-ACC. The sequence of the last substrate is based on the structure of SAK1 (Tyr(NO2)-hPhe-VS), which is the most potent DPAP3 inhibitor identified so far (see below). Table 1 and Fig. 1D report the Michaelis-Menten parameters determined for all these substrates (Michaelis-Menten curves are shown in Fig. S1). P1 residues have a significant influence on kcat, with nLeu(o-Bzl) and positively charged residues having the highest values. A positive charge on the δ position (Arg(NO2) and Arg) is favoured over the ε position (Lys and hArg). Elongated aliphatic and hydrophobic residues in P1 decrease Km, especially when aromatic groups are distant from the peptide backbone. This is evident from the decreasing Km values between nVal, Met, hPhe, Bpa, Glu(Bzl) and nLeu(o-Bzl). This tendency was also observed for CatC and DPAP1 and might suggest the presence of a distal binding pocket (Fig. 1B), potentially an exosite, since P1 residues are usually solvent exposed in clan CA proteases.
P2 residues have a bigger influence on Km than P1, with Leu and nVal being optimal P2 residues for DPAP3. Beta-branched residues are not optimal for DPAP3, as can be observed by an increase in Km between nVal and Val or nLeu and Ile. However, the γ-branched AA Leu has the lowest Km value.
Substrates with aliphatic P2 side chains that extend to the δ position (Met and nLeu) result in higher Km values than slightly shorter ones (Leu and nVal) but also higher kcat values. Overall, combining optimal P1 (nLeu(o-Bzl) and Arg) and P2 (nVal and Leu) residues results in improved kcat/Km values (Table 1).
Interestingly, although substrates with Phg and indanyl-glycine (Igl) in P2, or Bpa in P1, are not the preferred AAs at these positions, these non-natural residues are structurally very different from natural AAs and are turned over quite efficiently by DPAP3 when combined with optimal P1 or P2 residues respectively, that is, Leu-Bpa-ACC or Phg-nLeu(o-Bzl)-ACC. Finally, the optimal substrate for DPAP1, hPro-hPhe-ACC [35], is very poorly turned over by DPAP3 (>200-fold difference in kcat/Km). We think that substrates containing these non-natural AAs could be used as specific tools to measure DPAP1 or DPAP3 activity in biological samples, that is, parasite lysates or live parasites, an application we are currently investigating.
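As a quick consistency check on how these parameters relate to one another, the sketch below evaluates the Michaelis-Menten rate law and the specificity constant kcat/Km. The inputs are the DPAP1 and CatC values quoted in the notes to Table 1 (footnote a); the function names are illustrative and are not taken from any published analysis code.

```python
MICROMOLAR = 1e-6  # conversion factor from µM to M


def mm_rate(substrate, kcat, km, enzyme=1.0):
    """Michaelis-Menten rate v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def specificity_constant(kcat, km_molar):
    """Second-order specificity constant kcat/Km in M^-1 s^-1."""
    return kcat / km_molar


# kcat (s^-1) and Km (µM) as quoted in footnote a of Table 1.
dpap1 = specificity_constant(0.22, 0.67 * MICROMOLAR)
catc = specificity_constant(10.8, 91 * MICROMOLAR)

# Fraction of Vmax reached at the 1 µM screening concentration (kcat = [E] = 1).
frac_dpap1 = mm_rate(1 * MICROMOLAR, 1.0, 0.67 * MICROMOLAR)
frac_catc = mm_rate(1 * MICROMOLAR, 1.0, 91 * MICROMOLAR)

print(f"DPAP1 kcat/Km ~ {dpap1:.3g} M^-1 s^-1, fraction of Vmax at 1 uM = {frac_dpap1:.2f}")
print(f"CatC  kcat/Km ~ {catc:.3g} M^-1 s^-1, fraction of Vmax at 1 uM = {frac_catc:.3f}")
```

Both ratios fall within the reported uncertainties (320 000 ± 50 000 and 116 000 ± 11 000 M⁻¹·s⁻¹). The second pair of numbers illustrates that a single 1 µM measurement is close to the linear kcat/Km regime only when Km is well above 1 µM.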
Finally, our studies show that substrates with aromatic P2 residues are poorly cleaved by DPAP3 compared to optimal substrates, that is, 100- to 1000-fold lower kcat/Km. This is surprising since vinyl sulfone inhibitors containing aromatic P2 residues such as Tyr(NO2) or Trp are potent and selective DPAP3 inhibitors [21,26]. Because these two AA side chains have fluorogenic properties, we investigated whether the low turnover rate measured for Tyr(NO2)-hPhe-ACC and Trp-hPhe-ACC might be due to quenching effects. The emission of free ACC (0, 1, or 5 µM) in assay buffer was measured in the presence of 0-100 µM of these substrates (Fig. 2A,B). No significant decrease in the ACC emission signal was observed even when substrates were present in 100-fold excess, thus indicating that the low turnover rates measured for these substrates are not due to quenching effects.
As an alternative method to confirm that Tyr(NO2)-hPhe-ACC and Trp-hPhe-ACC bind relatively poorly to DPAP3, we performed substrate competition assays using the (PR)2Rho substrate (λex = 492 nm, λem = 523 nm), which emits at much higher wavelengths than ACC (λex = 355 nm, λem = 460 nm), thus allowing us to simultaneously measure the turnover of (PR)2Rho and ACC substrates without quenching interference. (PR)2Rho was initially designed as a DPAP1-specific substrate to directly measure the activity of this protease in crude parasite extracts [42], but it is also cleaved by DPAP3 with a Km,app of 40 µM (Fig. 3C). This substrate is cleaved twice by DPAPs, releasing two Pro-Arg dipeptides and the rhodamine 110 fluorophore. In this assay, we simultaneously measured inhibition of (PR)2Rho turnover by ACC substrates (IC50 values) as well as the apparent Km values of these ACC substrates in the presence of 40 µM (PR)2Rho. As shown in Fig. 2D, the IC50 and Km,app values obtained are within experimental error and, as expected, slightly higher than the Km values reported in Table 1 due to the substrate competition effect. Overall, the lack of quenching effect between Trp or Tyr(NO2) and ACC, and the good agreement between IC50 and Km,app measured under substrate competition conditions, indicate that the Michaelis-Menten parameters reported in Table 1 are accurate and that substrates containing aromatic P2 AAs are indeed relatively poor DPAP3 substrates compared to those containing optimal aliphatic P2 residues. These results raise the question of why vinyl sulfone inhibitors with aromatic P2 residues are among the most potent DPAP3 inhibitors identified so far. To better understand this discrepancy, we measured the kinetics of inactivation of DPAP3 by the previously published vinyl sulfone inhibitor library [25,26].

N is the number of replicates. Standard errors are shown for each parameter.
a For DPAP1, kcat = 0.22 ± 0.004 s⁻¹, Km = 0.67 ± 0.14 µM and kcat/Km = 320 000 ± 50 000 M⁻¹·s⁻¹; for CatC, kcat = 10.8 ± 0.2 s⁻¹, Km = 91 ± 5 µM and kcat/Km = 116 000 ± 11 000 M⁻¹·s⁻¹ [35].
b For DPAP1, kcat = 6.2 ± 0.4 s⁻¹, Km = 84 ± 9 µM and kcat/Km = 74 000 ± 2000 M⁻¹·s⁻¹; for CatC, kcat = 490 ± 10 s⁻¹, Km = 130 ± 10 µM and kcat/Km = 3 600 000 ± 300 000 M⁻¹·s⁻¹ [41].
c For DPAP1, kcat = 3.5 ± 0.1 s⁻¹, Km = 21 ± 2 µM and kcat/Km = 170 000 ± 10 000 M⁻¹·s⁻¹; for CatC, kcat = 180 ± 10 s⁻¹, Km = 51 ± 8 µM and kcat/Km = 3 600 000 ± 300 000 M⁻¹·s⁻¹ [41].
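The agreement expected between IC50 and Km,app in the substrate competition assay can be made concrete with a small sketch. It assumes the textbook model of two substrates competing for a single active site, in which the apparent Km of the ACC substrate and its IC50 against the reporter substrate both equal Km × (1 + [reporter]/Km,reporter). The function names are hypothetical, and the input Km of 9.9 µM is the value reported in this article for Tyr(NO2)-hPhe-ACC.

```python
def reporter_rate(conc_reporter, conc_competitor, km_reporter, km_competitor, vmax=1.0):
    """Turnover of the reporter substrate in the presence of a competing substrate."""
    km_apparent = km_reporter * (1.0 + conc_competitor / km_competitor)
    return vmax * conc_reporter / (km_apparent + conc_reporter)


def apparent_km(km_competitor, conc_reporter, km_reporter):
    """Apparent Km of the competing (ACC) substrate when the reporter is present."""
    return km_competitor * (1.0 + conc_reporter / km_reporter)


def ic50_numeric(conc_reporter, km_reporter, km_competitor, upper=1e6, iterations=200):
    """Bisection for the competitor concentration that halves reporter turnover."""
    target = 0.5 * reporter_rate(conc_reporter, 0.0, km_reporter, km_competitor)
    lower = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lower + upper)
        if reporter_rate(conc_reporter, mid, km_reporter, km_competitor) > target:
            lower = mid  # still above half activity: need more competitor
        else:
            upper = mid
    return 0.5 * (lower + upper)


# All concentrations in µM: 40 µM (PR)2Rho with Km,app = 40 µM; ACC substrate Km = 9.9 µM.
km_app = apparent_km(9.9, 40.0, 40.0)
ic50 = ic50_numeric(40.0, 40.0, 9.9)
print(f"Km,app = {km_app:.1f} uM, IC50 = {ic50:.1f} uM")
```

Under this idealized model the two quantities coincide and exceed the true Km by a factor of (1 + 40/40) = 2. These are model predictions for illustration only, not measured values.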


DPAP3 inhibitor specificity
Time- and concentration-dependent inactivation of DPAP3 by the P2 library of vinyl sulfone inhibitors (P1 fixed to hPhe) was measured using a continuous assay at 2.2 µM of Met-nLeu(o-Bzl)-ACC (0.25 × Km). For most compounds, the mechanism of inhibition was consistent with a two-step irreversible inhibition model (Eqn. 1, Fig. 3A).
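For orientation, the two-step irreversible inhibition model is conventionally written as shown below. This is the standard textbook form rather than a transcription of the original Eqn. 1; in particular, the substrate-competition factor (1 + [S]/K_m) is an assumption about how the continuous assay was analysed.

```latex
\[
\mathrm{E} + \mathrm{I}
\;\overset{K_i}{\rightleftharpoons}\;
\mathrm{E{\cdot}I}
\;\xrightarrow{\;k_{inact}\;}\;
\mathrm{E{-}I},
\qquad
k_{obs} = \frac{k_{inact}\,[\mathrm{I}]}{K_i\left(1 + [\mathrm{S}]/K_m\right) + [\mathrm{I}]}
\]
```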
Ki is the inhibition equilibrium constant, and kinact the rate of covalent modification of the catalytic cysteine. For a few inhibitors only kinact/Ki values could be obtained, that is, no saturation was achieved in the kobs vs. [I] graph (Fig. 3B). The inhibition constants are reported in Fig. 3C and Table 2 (see also Fig. S2 for representative fits). Only one inhibitor, containing an amino-1-methyl-benzyl (Amb) group in P2, was not able to inhibit DPAP3 in a time-dependent manner under our assay conditions (Fig. 3D). This is probably because this extended and rigid P2 AA (Fig. 3E) prevents proper positioning of the vinyl sulfone group into the active site of DPAP3 to allow covalent modification of the catalytic cysteine. For this compound, a Ki value for reversible inhibition was measured.
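The saturation behaviour described here can be illustrated numerically. The sketch below assumes the standard hyperbolic dependence kobs = kinact[I]/(Ki,app + [I]), with Ki,app = Ki(1 + [S]/Km) to account for the competing substrate; this functional form and the helper name are assumptions, not the authors' code. The example parameters are derived from values quoted in this article for Tyr(NO2)-hPhe-VS (Ki = 2.4 nM and kinact/Ki = 950 000 M⁻¹·s⁻¹, hence kinact ≈ 2.3 × 10⁻³ s⁻¹).

```python
def k_obs(conc_inhibitor, k_inact, k_i, s_over_km=0.0):
    """Observed inactivation rate (s^-1) for a two-step irreversible inhibitor.

    conc_inhibitor and k_i are in M; s_over_km is the ratio [S]/Km of the
    reporter substrate used in the continuous assay.
    """
    k_i_apparent = k_i * (1.0 + s_over_km)
    return k_inact * conc_inhibitor / (k_i_apparent + conc_inhibitor)


K_I = 2.4e-9                 # M, quoted for Tyr(NO2)-hPhe-VS
K_INACT = 950_000 * K_I      # s^-1, derived from the quoted kinact/Ki
S_OVER_KM = 0.25             # assay run at 0.25 x Km of the substrate

# Far below Ki,app the curve is linear with slope (kinact/Ki) / (1 + [S]/Km).
low_conc = 1e-11
slope_low = k_obs(low_conc, K_INACT, K_I, S_OVER_KM) / low_conc

# Far above Ki,app the rate saturates at kinact.
plateau = k_obs(1e-5, K_INACT, K_I, S_OVER_KM)

print(f"initial slope ~ {slope_low:.3g} M^-1 s^-1, plateau ~ {plateau:.3g} s^-1")
```

When every tested inhibitor concentration lies well below Ki,app, the kobs vs. [I] plot stays in the linear regime, so only the slope kinact/Ki can be determined. That is the situation described for the inhibitors that showed no saturation.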
Overall, changes in P2 do not have a big influence on kinact, with the exceptions of Asn, Phe(Me) and Tyr(NO2), which significantly increase kinact. Intriguingly, substrates containing the latter two P2 residues were the only substrates with a P2 aromatic residue that could be cleaved by DPAP3 with kcat/Km > 2000 M⁻¹·s⁻¹ (Table 1). In terms of Ki, we observed some SAR similarities between substrates and inhibitors: DPAP3 does not bind inhibitors with an N-terminal basic residue (Arg or Lys), but it is strongly inhibited (Ki < 35 nM) by aliphatic residues (Leu, nLeu, hAla, or Val). However, we measured potent inhibition of DPAP3 with aromatic residues in P2 such as Phe, Tyr or Trp (kinact/Ki ≥ 60 000 M⁻¹·s⁻¹). Substrates with these P2 residues show relatively poor substrate turnover (kcat/Km ≤ 2000 M⁻¹·s⁻¹). We also observed other discrepancies between substrates and inhibitors. For example, Thr in P2 results in a poor substrate (Fig. 1B) but a relatively potent inhibitor (kinact/Ki = 55 000 M⁻¹·s⁻¹). Inversely, DPAP3 cleaves Phg-hPhe-ACC and Phe(Me)-hPhe-ACC with a kcat/Km of 11 000 and 12 000 M⁻¹·s⁻¹, respectively, but the kinact/Ki values for the respective inhibitors are only 36 and 580 M⁻¹·s⁻¹. As shown in Fig. 4A, there is no clear correlation between kcat/Km and kinact/Ki for DPAP3 as a function of the P2 residue; however, P2 residues that make optimal substrates also make good inhibitors (Val, nVal, nLeu).

Correlation between substrate turnover and inhibition for DPAP1 and CatC
To determine whether the lack of correlation between kcat/Km and kinact/Ki observed for DPAP3 is a common feature in DPAPs, we calculated apparent kcat/Km values for DPAP1 for the P2 substrate library based on the activity measurements previously reported at 1 µM and the kcat/Km for hPro-hPhe-ACC [35]. We also measured Michaelis-Menten parameters for CatC for the P2 substrate library (Table 3 and Fig. S3), and kinact and Ki values for DPAP1 and CatC for the P2 vinyl sulfone library (Table 2 and Fig. S2). As shown in Fig. 4B,C, we observed a good correlation between substrate turnover and kinact/Ki for DPAP1 and CatC. However, a few discrepancies were observed for CatC (Fig. 4C). For example, the kcat/Km for Phe(Me)-hPhe-ACC is 30-fold higher than for Phg-hPhe-ACC while the kinact/Ki for Phe(Me)-hPhe-VS is 13-fold lower than for Phg-hPhe-VS, thus resulting in a 400-fold discrepancy in the changes in kcat/Km and kinact/Ki.
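One way such apparent kcat/Km values could be derived from single-concentration data is sketched below. It assumes that the test substrate is far below saturation at 1 µM, so that its rate is approximately (kcat/Km)[E][S], and it uses the reference substrate only to cancel the unknown enzyme concentration. The exact procedure used for Fig. 4 is not spelled out in this section, so this is an illustrative reading; the function name and the relative rate of 0.1 are hypothetical, while the reference parameters are the DPAP1 values quoted for hPro-hPhe-ACC [35].

```python
def apparent_kcat_over_km(relative_rate, kcat_ref, km_ref, substrate):
    """Apparent kcat/Km (M^-1 s^-1) from a rate measured relative to a reference.

    relative_rate: v_test / v_ref at the same enzyme and substrate concentration.
    The reference obeys v_ref = kcat_ref * [E] * [S] / (km_ref + [S]); the test
    substrate is assumed to obey v_test ~ (kcat/Km) * [E] * [S].
    Dividing the two expressions eliminates [E] and [S] from the numerator.
    """
    return relative_rate * kcat_ref / (km_ref + substrate)


# Hypothetical test substrate turned over at 10% of the reference rate at 1 µM.
estimate = apparent_kcat_over_km(
    relative_rate=0.1, kcat_ref=0.22, km_ref=0.67e-6, substrate=1e-6
)
print(f"apparent kcat/Km ~ {estimate:.3g} M^-1 s^-1")
```

The estimate is a lower bound for any test substrate whose Km is comparable to or below 1 µM, because such a substrate is already partly saturated at the screening concentration.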
A systematic way to visualize discrepancies between substrate turnover and inhibition is to compare the fold difference in kcat/Km with that in kinact/Ki for any two P2 residues, A and B (Eqn. 2).

$$\frac{\Delta\mathrm{ACT}}{\Delta\mathrm{INH}} = \frac{(k_{cat}/K_m)_A \,/\, (k_{cat}/K_m)_B}{(k_{inact}/K_i)_A \,/\, (k_{inact}/K_i)_B} \qquad (2)$$

We performed these pairwise calculations for each of the DPAPs studied here and have presented the results as heat maps in Fig. 4D-F. We observed significant and numerous discrepancies between substrates and inhibitors for DPAP3 (30% of pairwise ΔACT/ΔINH > 100 or < 0.01), almost no discrepancies for DPAP1 (only 1% of ΔACT/ΔINH > 100 or < 0.01), and only a few for CatC (4% of ΔACT/ΔINH > 100 or < 0.01). Overall, this study indicates that the level of correlation between kcat/Km and kinact/Ki is protease-dependent.
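The pairwise comparison can be sketched in a few lines. The code below assumes that ΔACT/ΔINH is the fold change in kcat/Km between two P2 residues divided by the corresponding fold change in kinact/Ki, which is how the text describes it; the function names are illustrative. The example inputs are the DPAP3 values quoted in this article for Phg and Phe(Me), and the CatC fold changes (30-fold higher turnover, 13-fold lower inhibition).

```python
import math
from itertools import permutations


def discrepancy(act_a, act_b, inh_a, inh_b):
    """Fold change in kcat/Km (A vs. B) divided by fold change in kinact/Ki."""
    return (act_a / act_b) / (inh_a / inh_b)


def pairwise_log_discrepancies(data):
    """Map each ordered residue pair to log10(dACT/dINH).

    data: {residue: (kcat_over_km, kinact_over_ki)}, both in M^-1 s^-1.
    """
    return {
        (a, b): math.log10(discrepancy(data[a][0], data[b][0], data[a][1], data[b][1]))
        for a, b in permutations(data, 2)
    }


def fraction_discordant(log_values, threshold=2.0):
    """Fraction of pairs with more than a 100-fold discrepancy (|log10| > 2)."""
    values = list(log_values.values())
    return sum(abs(v) > threshold for v in values) / len(values)


# DPAP3 values quoted in the text: (kcat/Km, kinact/Ki).
dpap3 = {"Phg": (11_000, 36), "Phe(Me)": (12_000, 580)}
logs = pairwise_log_discrepancies(dpap3)

# CatC example from the text, expressed directly as fold changes.
catc_example = discrepancy(30.0, 1.0, 1.0, 13.0)

print(logs)
print(f"CatC Phe(Me) vs. Phg: {catc_example:.0f}-fold")
```

The CatC example evaluates to 390-fold, consistent with the roughly 400-fold discrepancy quoted above. With a full library, `fraction_discordant` gives the kind of summary percentage reported for each protease.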

Development of DPAP1 and DPAP3-selective inhibitors
We next synthesized several inhibitors to determine whether the optimal nLeu(o-Bzl) P1 residue identified from the substrate screen could be used to increase the potency and specificity of inhibitors towards DPAP1 or DPAP3. We selected P2 AAs that were predicted to provide specificity towards DPAP1 (Pro and hPro) or DPAP3 (aromatic residues: Tyr(NO2), Trp, Igl and 2Nal) based on our substrate and inhibitor screening results. We also included in our analysis the previously synthesized compound JCP410 (nVal-hPhe-VS) [26], since nVal is one of the best P2 residues identified from the substrate screen. We determined the inhibition constants of these compounds for DPAP1, DPAP3 and CatC (Table 4 and Figs 3 and S2). The major structural difference between these compounds and the inhibitor library is that they have a phenyl group in P′ instead of a long aliphatic linker (Fig. 3A). This change usually increases the potency of compounds except in the context of a P2 Trp for DPAP3, or P2 Tyr(NO2) or 2Nal for CatC (Tables 2 and 3). These exceptions indicate some level of interdependence between the prime and nonprime sites of DPAPs.
Compared to hPhe, P1 nLeu(o-Bzl) decreases the kinact/Ki value for DPAP3 by 4- to 18-fold except in the context of a P2 Trp, where it increases it by 24-fold, or a P2 hPro, where there is no significant change (Table 4). These differences are mainly due to changes in Ki rather than kinact and likely reflect cooperativity between the S1 and S2 pockets of DPAP3. However, in the case of DPAP1 and CatC, replacement of P1 hPhe with nLeu(o-Bzl) decreases Ki. Because of this P1-P2 interdependence, the most potent inhibitors for DPAP3 are either a combination of P2 Tyr(NO2) and P1 hPhe, that is, SAK1, or P2 Trp and P1 nLeu(o-Bzl), resulting in kinact/Ki values close to 10⁶ M⁻¹·s⁻¹. Importantly, nVal-hPhe-VS is as potent as these two inhibitors (kinact/Ki = 940 000 M⁻¹·s⁻¹), confirming that optimal inhibitors can be designed based on the structure of optimal substrates. These high second order rate constants are mainly driven by low Ki values (<5 nM). While Trp-nLeu(o-Bzl), Igl-hPhe-VS and Tyr(NO2)-hPhe-VS show more than 100-fold […]

Fig. 4. (A-C) […] and CatC (C) between vinyl sulfone inhibitors and P2 substrate library. For DPAP1, apparent kcat/Km (kcat/Km,app) values were calculated based on previously reported turnover rates at 1 µM [35]. kcat/Km,app values were calculated similarly for DPAP3 and CatC for P2 substrates whose activity was too low to obtain accurate Michaelis-Menten parameters (i.e., substrates not present in Tables 1 and S2). Filled circles correspond to compounds belonging to the vinyl sulfone library (compounds in Table 2), and empty circles to inhibitors having a phenyl group in P1′ (Table 4). The P2 residue is labelled next to each data point. Pearson correlation coefficients (ρ) are shown for each protease and were calculated using the default function in Prism. Error bars represent the standard error of the fit for each parameter. (D-F) Comparison of changes in substrate turnover relative to inhibitor potency for any pair of P2 residues for the P2 substrate library and the VS inhibitor library (both with P1 hPhe). The log values of ΔACT/ΔINH (Eqn. 2) calculated for DPAP3 (D), DPAP1 (E) and CatC (F) are shown as heat maps with values above and below zero in red and blue respectively. Each pairwise value showing more than a 100-fold discrepancy between activity and inhibition (ΔACT/ΔINH > 100 or < 0.01) is highlighted in bold.

Homology modelling and docking studies
In order to study the differences in substrate and inhibitor specificity between the different DPAPs, homology models of DPAP1 and DPAP3 were built based on the crystal structure of CatC. Selected compounds were docked into these structures using the MOE (Molecular Operating Environment) software. To visualize differences in the general structure of the active sites, the structures of the pan-DPAP inhibitor and substrate nVal-hPhe-VS and nVal-hPhe-ACC were superimposed into the active sites of each DPAP (Fig. 5A). As expected, the hPhe side chain is solvent exposed in the S1 pocket, the free N termini are in close proximity to the carboxylic group of the exclusion domain N-terminal Asp, and the nVal side chain points into the S2 pocket. The images in Fig. 5A,B clearly show that the DPAP3 S2 pocket is significantly larger than that of DPAP1 or CatC. To better quantify the volume of the S2 pockets, we used our models docked to the nVal-hPhe-VS inhibitor and applied the 'Site Finder' function in MOE to identify the residues that form the S2 pocket in each protease, and to calculate its volume (Fig. 5B). The relative volumes thus obtained for DPAP1, CatC and DPAP3 are 17, 19 and 24, respectively, thus confirming that the S2 pocket of DPAP3 is significantly larger than that of DPAP1 or CatC, and that the CatC S2 pocket is slightly larger than the DPAP1 one. These differences in size explain the specificity of these proteases, with DPAP3 being the only one able to accommodate large aromatic P2 residues such as Trp or Tyr(NO2), and DPAP1 preferring substrates with relatively small P2 AAs such as Val, hAla, Ser or nVal. Similarly to DPAP1, CatC has a preference for small P2 residues but can also accommodate slightly bigger side chains such as Phe, His or 2fa (Fig. 1 and Table 2).
To explain why DPAPs preferentially cleave substrates containing long hydrophobic non-natural AAs in P1, the structure of nVal-nLeu(o-Bzl)-ACC was docked into the different DPAP structures (Fig. 5C). Although it is difficult to predict how this side chain will bind in the active site of each enzyme given the flexibility of the aliphatic chain, the docked structures indicate that the phenyl group of nLeu(o-Bzl) might be able to bind beyond the S1 pocket into grooves that are not accessible to natural amino acids. Our docking studies identified two potential distal binding sites: one above the S1 pocket (see DPAP3 model in Fig. 5C), the other adjacent to the S2 pocket (DPAP1 and CatC models in Fig. 5C). Of particular interest is the well-defined groove next to the S2 pocket that is at the interface of the exclusion and catalytic domains. These pockets are present in all DPAPs, which might explain why long hydrophobic residues in P1 make better substrates for this enzyme family.

Finally, we performed docking studies to try to explain why Tyr(NO2) and hPro are the preferred P2 residues for DPAP3 and DPAP1 respectively (Fig. 5D,E). For both DPAP1 and CatC, we observed proper docking of the hPro-hPhe-ACC substrate, with the hPro residue fitting tightly into the S2 pocket, and the secondary amine of hPro pointing towards the N-terminal Asp side chain of the exclusion domain (Fig. 5E). However, we were unable to obtain any reasonable conformation of this substrate bound to DPAP3. These results reflect the difference in substrate turnover for the three DPAPs: kcat/Km = 320 000 M⁻¹·s⁻¹ for DPAP1, 116 000 M⁻¹·s⁻¹ for CatC and 300 M⁻¹·s⁻¹ for DPAP3. On the other hand, the hPro-hPhe-VS inhibitor could only be docked properly into DPAP1, which again might reflect the differences in kinact/Ki values: 1 230 000 M⁻¹·s⁻¹ for DPAP1 vs. 69 000 and 76 000 M⁻¹·s⁻¹ for DPAP3 and CatC respectively.
We were not able to dock the Tyr(NO2)-hPhe-VS compound into any of the DPAP structures. This is especially surprising since this is the most potent DPAP3 inhibitor identified so far. As an alternative, we modelled how this inhibitor would fit into the DPAP active sites after covalent modification of the catalytic Cys by the vinyl sulfone electrophile (Fig. 5D). In these models, we clearly see that the Tyr(NO2) side chain makes very significant steric clashes in the DPAP1 and CatC S2 pockets, but only a few were observed in DPAP3. The steric clashes in DPAP3 could easily be avoided by allowing movement of the side chains forming the S2 pocket, as shown in the energy minimized structure of Fig. 5D. This docked structure identified potential interactions between the Tyr(NO2) side chain and DPAP3 that can explain why Tyr(NO2)-hPhe-VS is such a potent DPAP3 inhibitor (kinact/Ki = 950 000 M⁻¹·s⁻¹). These include hydrogen bonds between the NO2 group and the amide bond of Ile552 and the hydroxyl group of Tyr716 at the bottom of the pocket, stacking interactions with Tyr551 and the free amine of the inhibitor, and further hydrophobic interactions with Tyr551. This very tight fit within the S2 pocket might also explain why the equivalent substrate Tyr(NO2)-hPhe-ACC has a relatively low turnover rate, given that potential cooperativity in binding between the S2 and S′ pockets might prevent binding of the Tyr(NO2) side chain into the S2 pocket. Indeed, the Km for Tyr(NO2)-hPhe-ACC is 9.9 µM while the Ki for Tyr(NO2)-hPhe-VS is 2.4 nM. We speculate that the larger size of the substrate ACC group, compared to the inhibitor phenyl group, might prevent proper binding of the Tyr(NO2) side chain deep into the S2 pocket.

Testing inhibitor specificity in live parasites
We then tested the potency and selectivity of our DPAP1- and DPAP3-specific inhibitors in live parasites using the FY01 activity-based probe (ABP) in a competition assay. ABPs are small molecule reporters of activity that use the catalytic mechanism of the targeted enzyme to covalently modify its active site. A reporter tag, usually a fluorophore or a biotin, allows visualization and quantification of the labelled enzymes in a gel-based assay [44]. FY01 is a cell-permeable fluorescent ABP that was initially developed for CatC [45] but also labels Plasmodium DPAPs and, to a lesser extent, the falcipains [25,26]. Falcipains (FPs) are clan CA cysteine proteases involved in haemoglobin degradation (FP2, FP2′ and FP3) [46] and possibly RBC invasion (FP1) [47,48]. Binding of inhibitors into the active site of any of these cysteine proteases prevents probe labelling, resulting in the disappearance of fluorescent bands in an SDS/PAGE gel.
Live parasites were treated with different concentrations of inhibitor for 30 min, and the residual DPAP and FP activities were labelled with FY01 and quantified by densitometry (Fig. 6A). Dose response curves are shown in Fig. 6B, and IC50 values are reported in Table 5. Inhibitors with a P2 Pro or hPro are equally potent and inhibit DPAP1 at mid nanomolar concentrations. However, P2 Pro makes the inhibitor more selective since it does not target the FPs. Compounds with a P2 Trp or Tyr(NO2) are highly specific for DPAP3 in intact parasites. Surprisingly, the inhibitor with homoprolylglycine in P1 (Trp-hPG-VS) is by far the most potent DPAP3 inhibitor in intact parasites (IC50 = 1.4 nM) despite being five- to 100-fold less potent than Trp-nLeu(o-Bzl)-VS or Trp-hPhe-VS against recombinant DPAP3 (see kinact/Ki values in Table 4). This suggests that this compound is metabolically more stable and/or that decreasing the hydrophobicity of the P1 residue enhances the cell permeability of the compound. Inhibitors need to cross four membranes to reach DPAP3: the RBC, parasitophorous vacuole and parasite plasma membranes, plus the membrane of the apical organelle where DPAP3 resides [21]. (The parasitophorous vacuole is a membrane-bound structure within which the parasite grows and multiplies isolated from the RBC cytosol.) However, crossing of the parasitophorous vacuole membrane is not likely to be a limiting factor to reach DPAP3 since this membrane is generally permeable to small molecules. It is also possible that the apparent increased potency of Trp-hPG-VS in live parasites is due to its accumulation in the acidic organelle containing DPAP3 via protonation of its free amine. However, we predict that this lysosomotropic effect likely occurs for all DPAP inhibitors presented here.

Discussion
This study provides the first characterization of the specificity of DPAP3, a cysteine protease important for efficient invasion of RBCs by the malaria parasite [21]. DPAP3 is a highly efficient proteolytic enzyme showing kcat and kcat/Km values similar to those measured for DPAP1 or CatC when using optimal substrates (Table 1). In general, DPAP3 has a narrower substrate specificity than either DPAP1 or CatC, and it preferentially cleaves substrates with aliphatic residues at the N terminus. Our study also shows similar P1 substrate specificity across all DPAPs, that is, a strong preference for basic or aromatic residues. The previously described DPAP1 inhibitor SAK2 (Pro-hPhe-VS) shows the greatest specificity for DPAP1 in live parasites. Incorporation of optimal P1 (nLeu(o-Bzl)) and P2 (hPro) residues identified from the substrate screen improves the potency of DPAP1 inhibitors both in vitro and in live parasites, but also results in some loss in specificity (Table 4 and Fig. 6). DPAP3 was the only DPAP able to cleave some substrates with aromatic P2 residues (Tyr, Phe(Me), Phg), albeit with relatively low turnover rates. This is consistent with the larger volume of the S2 pocket observed in our modelling studies (Fig. 5). Surprisingly, vinyl sulfone inhibitors with P2 aromatic residues are highly specific for DPAP3 and as potent as compounds with optimal P2 residues identified from the substrate screen (Table 4 and Fig. 6). Our docking studies with Tyr(NO2)-hPhe-VS predict very significant steric clashes in the S2 pocket of DPAP1 and CatC (Fig. 5D), explaining why this inhibitor is more than 1000-fold specific for DPAP3 (Table 4). We also observed that the Tyr(NO2) side chain is able to fully occupy the DPAP3 S2 pocket and form potential hydrogen-bond, stacking and hydrophobic interactions with residues in this pocket, which explain the potency of this inhibitor (Fig. 5D).
Finally, our computational and experimental studies suggest the presence of cooperativity between the S2 and S' pockets of DPAP3, which can explain the differences in specificity observed between substrates and inhibitors containing large hydrophobic residues.
Despite being highly potent DPAP1 and/or DPAP3 inhibitors, the vinyl sulfone inhibitors presented here only show antiparasitic activity at mid to high micromolar concentrations [21,25,26], probably due to metabolic stability issues. Indeed, we have previously shown that this is the case for Pro-hPhe-VS (SAK2), which is not able to sustain target inhibition in live parasites [25]. A possible cause for this instability is the presence of multiple aminopeptidases in the malaria parasite that might cleave the amide bond of these compounds [49], thus preventing them from binding into the DPAPs' active sites. Nonetheless, this study provides a strong SAR foundation to develop potent nonpeptidic inhibitors able to sustain DPAP inhibition. From a drug development point of view, our SAR studies indicate that inhibitors with short aliphatic P2 residues strongly inhibit both DPAPs (Fig. 4E), and that potent pan-DPAP inhibitors can be developed. Unfortunately, we did not identify any clear P1 or P2 residue that would discriminate malarial DPAPs from host CatC. Therefore, further studies are required to determine whether differences in specificity in the prime binding pockets can be exploited to develop parasite-specific inhibitors.

Fig. 5 (continued). The benzyl group (spacefill atoms) can reach into different conserved grooves distal from the S2 and S1 pockets. (D) Side view of the S2 pocket after modelling the Tyr(NO2)-hPhe-VS inhibitor into the structures of each DPAP after covalent modification of the catalytic cysteine by the vinyl sulfone group. The surface of the S2 pocket is shown to illustrate steric clashes between the P2 side chain (spacefill atoms) and the S2 pocket. The structure of the inhibitor bound to DPAP3 was further refined by allowing energy minimization of residues within 4.5 Å of the Tyr(NO2) side chain (lower right image). Note that in DPAP3, the NO2 group might form hydrogen bonds with the amide bond of Ile552 (blue arrows) and with the hydroxylic group of Tyr716 (white arrows) at the bottom of the S2 pocket. We also observed potential stacking interactions between Tyr551 (orange arrows) and the free amine of the inhibitors, as well as hydrophobic interactions with the Tyr(NO2) side chain. (E) Docking of hPro-hPhe-ACC into the structure of CatC and of hPro-hPhe-VS and hPro-hPhe-ACC into the model of DPAP1. Note that neither of these compounds could be docked into the DPAP3 model, nor hPro-hPhe-VS into the CatC structure.
Interestingly, our docking studies with nVal-nLeu(o-Bzl)-ACC identified potential distant binding sites that can only be reached with non-natural AAs and that could explain the DPAPs' preference for long hydrophobic residues in P1 (Fig. 5C). Of particular interest is the well-defined pocket adjacent to the S2 pocket at the interface of the catalytic and exclusion domains that is present in all DPAPs (Fig. 5A). Because DPAPs are the only clan CA proteases with an exclusion domain, designing compounds that bind into this pocket might be a good strategy to prevent off-target effects against other clan CA endopeptidases. In addition, our homology models suggest that this pocket is much deeper in DPAP1 and DPAP3 than in CatC (Fig. 5A). Therefore, compounds designed to bind deep into this pocket might be specific for Plasmodium DPAPs.
While inhibition of host CatC might be a concern when developing DPAP inhibitors as antimalarials, it is important to point out that given the short-term treatment required for antimalarial therapy (single dose or less than 3 doses in 3 days), we think it is unlikely that inhibition of CatC would lead to adverse side effects. Firstly, highly specific DPAP inhibitors might not be necessary given that a high level (> 95%) of sustained CatC inhibition is required to induce a decrease in the activity of serine proteases activated by CatC [50]. Secondly, activation of granule serine proteases by CatC takes place during cell differentiation in the bone marrow, and a decrease in the levels of serine proteases activities in circulating immune cells is only achieved after more than 2 weeks of daily treatment with CatC inhibitors [19]. And thirdly, no signs of toxicity were observed in phase I clinical trials when volunteers were treated daily for more than 3 weeks with CatC inhibitors, albeit some on-target side effects such as plantar and palmar epithelial desquamation were observed in some instances [18,19]. However, these side effects were not observed in volunteers who received a single dose of CatC inhibitor, nor within the first week of daily treatments.
Fig. 6. Selective inhibition of DPAP1 or DPAP3 in live parasites. Intact mature schizonts were pretreated for 30 min with increasing concentrations of inhibitor followed by 1 h treatment with FY01. Samples were then run in an SDS/PAGE gel and the in-gel fluorescence measured using a fluorescence scanner. (A) Representative gel images obtained for each of the inhibitors. Fluorescent bands corresponding to the different cysteine proteases labelled by the probe are indicated with arrowheads (DPAP3, red; FPs, green; DPAP1, blue). (B) The fluorescence intensity of each of the indicated bands was quantified by densitometry using ImageJ, and the fluorescence values normalized to the DMSO control. IC50 values are reported in Table 5. Note that control inhibitors containing D-Trp are more than 1000-fold less potent than those with L-Trp. Three technical replicates were performed; dose responses for a representative replicate are shown.

Positional scanning substrate libraries have been successfully used over the last 20 years [30][31][32] to determine the specificity of proteases and guide the synthesis of inhibitors. For example, highly potent and specific inhibitors for caspases [51], neutrophil elastase [38], the proteasome [52] or the Zika virus NS2B-NS3 protease [53] have been developed based on the substrate specificity of proteases. Here, we have also shown that vinyl sulfone inhibitors containing P1 and P2 residues corresponding to optimal DPAP1 or DPAP3 substrates result in extremely potent inhibitors. However, our studies clearly identified differences in amino acid preferences between substrates and inhibitors, especially for DPAP3. This lack of correlation between substrates and inhibitors might not have been more broadly reported in the literature because, in general, either substrate or inhibitor libraries are used to determine the specificity of a protease, but not both.
Also, inhibitors are not usually designed based on the structure of nonoptimal substrates, and as indicated above, inhibitors that mimic optimal substrates are generally very potent. That said, there are multiple reasons that can account for discrepancies in specificity between substrates and inhibitors: First, although the substrate and inhibitor libraries used in this study have equivalent P1 and P2 residues, the structural features that bind into the S' pockets are quite different. Therefore, if the specificity of a protease shows interdependence between its prime and nonprime binding pockets, it might explain differences in specificity. This is likely the case for DPAPs since we observed a 50-fold increase in kcat between the Phe-Arg-ACC and Phe-Arg-bNA substrates. These two substrates only differ in the structure of the fluorophore that binds in the S1′ pocket (Table 1). Also, while we observed very good correlation between kcat/Km and kinact/Ki of DPAP1 for the P2 substrate and VS library (compounds in Table 2), in the context of a phenyl group in P1′ (inhibitors in Table 4), we observed clear discrepancies between substrates and inhibitors (Fig. 4B).
Second, the position of the electrophilic warhead within the active site might differ from that of the scissile bond in a substrate, especially in terms of distance and orientation relative to the catalytic cysteine. This positioning might be differently affected by changes in the P1 and P2 residues of substrates and inhibitors. Indeed, a recent study on caspases has shown that acyloxymethyl ketone covalent inhibitors might act through a reversible mechanism even if they are designed based on the sequence of optimal substrates [51]. However, this is an exception rather than the norm. Also, ABPs designed to profile deubiquitinating proteases by conjugating an electrophile to the C terminus of ubiquitin have been shown to label different subsets of enzymes depending on the warhead used [54].
Third, substrate turnover by Cys and Ser proteases requires two different chemical steps: first, nucleophilic attack of the peptide bond by the catalytic residue to form the acyl intermediate and release of the C-terminal product of proteolysis; and second, hydrolysis of the acyl intermediate by an activated water molecule to reconstitute the free enzyme and release the N-terminal product of the reaction (Scheme 1). Therefore, peptide sequences that are poorly turned over because of a very slow acyl intermediate hydrolysis step might be good starting points for the design of covalent inhibitors. For example, a substrate that displaces the catalytic water from the active site upon formation of the acyl intermediate will be very poorly turned over.
And fourth, the reaction mechanisms of covalent inhibition and substrate turnover are quite different, making kcat, Km, and kcat/Km not directly comparable with kinact, Ki, and kinact/Ki (Scheme 1). kcat, Km, and kcat/Km are empirical parameters obtained under steady-state conditions, while kinact, Ki, and kinact/Ki are real kinetic and thermodynamic constants that cannot be measured under steady-state conditions since covalent inhibitors deplete the concentration of free enzyme over time. kcat might be equivalent to kinact only if kAc ≪ kDac, that is, formation of the acyl intermediate is the rate limiting step. Km might be equivalent to Ki only if kAc ≪ koff and kAc ≪ kDac, and kcat/Km might be equivalent to kinact/Ki only if koff ≫ kAc. Although these conditions might be met for certain substrates and inhibitors, the relative magnitudes of all these rate constants depend on the amino acid sequence. For example, detailed presteady-state kinetics and kinetic isotope effect studies on CatC have shown that the rate limiting step in the turnover of dipeptidic AMC substrates can be either formation of the acyl intermediate, its hydrolysis, or a contribution of both, depending on the nature of the P1 residue [55,56].

Scheme 1. Comparison between irreversible inhibition and substrate turnover kinetic constants. E, I, E:I and E-I represent free enzyme, inhibitor, inhibitor associated with the enzyme and enzyme covalently modified by the inhibitor respectively. S, E:S, E-Ac, PC and PN represent the substrate, the Michaelis-Menten enzyme-substrate complex, the acyl intermediate and the C- and N-terminal products of proteolysis respectively; kon and koff represent the association and dissociation rate constants for substrates or inhibitors; and kAc and kDac represent the kinetic constants for the formation of the acyl intermediate and its hydrolysis respectively.
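The limiting cases stated in the fourth point can be checked numerically. The following is a minimal sketch, assuming the three-step mechanism of Scheme 1 and the textbook steady-state expressions for an acyl-enzyme mechanism, together with a rapid-equilibrium two-step inhibition model in which Ki = koff/kon. The function names and all rate constants are illustrative assumptions, not values measured in this study.

```python
def substrate_constants(k_on, k_off, k_ac, k_dac):
    """Steady-state constants for E + S <-> E:S -> E-Ac -> E (Scheme 1).

    Textbook acyl-enzyme expressions (assumed here, not taken from the paper):
      k_cat     = k_ac * k_dac / (k_ac + k_dac)
      K_m       = (k_off + k_ac) / k_on * k_dac / (k_ac + k_dac)
      k_cat/K_m = k_on * k_ac / (k_off + k_ac)
    """
    k_cat = k_ac * k_dac / (k_ac + k_dac)
    k_m = (k_off + k_ac) / k_on * k_dac / (k_ac + k_dac)
    return k_cat, k_m, k_cat / k_m


def inhibitor_constants(k_on, k_off, k_inact):
    """Two-step irreversible inhibition, rapid-equilibrium assumption."""
    k_i = k_off / k_on
    return k_inact, k_i, k_inact / k_i


# Hypothetical rate constants (M^-1 s^-1 for k_on, s^-1 otherwise).
K_ON, K_OFF, K_AC = 1e6, 100.0, 1.0

# Acylation rate limiting (k_ac << k_dac) versus deacylation rate limiting.
fast_deacyl = substrate_constants(K_ON, K_OFF, K_AC, k_dac=1000.0)
slow_deacyl = substrate_constants(K_ON, K_OFF, K_AC, k_dac=0.01)
# An inhibitor whose covalent step is as fast as the acylation step above.
inhibitor = inhibitor_constants(K_ON, K_OFF, k_inact=K_AC)

for label, values in (("fast deacylation", fast_deacyl),
                      ("slow deacylation", slow_deacyl),
                      ("inhibitor", inhibitor)):
    rate, affinity, efficiency = values
    print(f"{label}: rate={rate:.4g} s^-1, "
          f"Km or Ki={affinity:.4g} M, efficiency={efficiency:.4g} M^-1 s^-1")
```

With these made-up numbers, fast deacylation gives kcat and Km within about 1% of kinact and Ki, whereas slow deacylation lowers both kcat and Km roughly 100-fold while leaving kcat/Km unchanged. This is one way a poorly turned-over substrate sequence can still correspond to a potent covalent inhibitor.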
Although several studies have compared the potency of inhibitors to the turnover of equivalent substrates for selected peptide sequences, to the best of our knowledge, this is the first study that systematically compares the potency of a peptide-based covalent inhibitor library with the turnover efficiency of a substrate library at a specific position. We think that the discrepancies observed here between kcat/Km and kinact/Ki might be present in other proteases, but the level of discrepancy will be dependent on the protease studied as well as on the type of substrate and covalent inhibitor that are being compared. We also predict that these discrepancies will be more pronounced if cooperativity exists between the prime and nonprime binding pockets.
Finally, we would like to point out that discrepancies between substrates and inhibitors have been reported in the literature. Some studies on caspases [51,57] and cathepsins [58] have shown that inhibitors designed based on the structure of specific substrates do not always retain their selectivity. Also, in a few instances, the sequence of an optimal substrate or inhibitor might render the equivalent inhibitor or substrate completely inactive [51]. Finally, different N-terminal capping groups in PS-SCL of substrates have been shown to result in different amino acid preferences in the S4 pocket of the Zika virus NS2B-NS3 protease [53,59]. Therefore, if cooperativity exists between P4 and other positions, the influence of the capping group on the S4 pocket specificity might also influence that of other pockets.
Overall, our detailed specificity study on DPAPs indicates that there can be substantial differences in specificity between substrates and covalent inhibitors. Although it is now well established that highly potent inhibitors can be developed based on the structure of optimal substrates, this might sometimes result in some loss of specificity. This study clearly demonstrates that optimal inhibitors with improved specificity can be developed based on the structure of relatively poor substrates.

Reagents
The syntheses of the DPAP substrate library and that of Val-Arg-ACC, Phe-Arg-ACC, Tyr(NO2)-hPhe-ACC [41] and (PR)2Rho [41] have been previously described. Additional substrates used in this study were synthesized following previously published methods [35,60]. One gram (0.74 mmol) of Fmoc-protected Rink Amide resin (Iris Biotech GmbH, Germany) was added to a glass solid-phase reaction vessel. Next, 5 mL of dichloromethane (DCM) was added and the resin was gently stirred once every 10 min for 1 h, then filtered and washed three times with N,N-dimethylformamide (DMF). The Fmoc-protecting group was removed using 20% piperidine in DMF (in three cycles: 5 min, 5 min, and 25 min), and the resin was filtered each time and rinsed with DMF (six times). A ninhydrin test was performed to confirm resin Fmoc deprotection. Next, 2.5 eq of Fmoc-ACC-OH (1.85 mmol, 816 mg) was preactivated with 2.5 eq of HOBt (1.85 mmol, 278 mg) and 2.5 eq of DICI (1.85 mmol, 242 µL) in DMF for 3 min and the mixture was added to the resin. The reaction was gently agitated for 24 h at room temperature. Next, the resin was washed five times with DMF and the reaction was repeated using 1.5 eq of the above reagents to improve the yield of ACC coupling. After 24 h of gentle stirring, the resin was washed with DMF and the Fmoc-protecting group was removed from ACC with 20% piperidine in DMF (5 min, 5 min, and 25 min), and the resin was filtered and washed with DMF (six times). The resin was subsequently washed with DCM (three times) and MeOH (three times), dried over P2O5 and divided into eight equal portions (0.09 mmol per portion). Each portion of the H2N-ACC-resin was placed into a well of a semiautomatic FlexChem solid phase synthesizer cartridge (SciGene, USA). Then, to each well, 2.5 eq of Fmoc-P1-OH (0.225 mmol) with 2.5 eq of HATU (0.225 mmol, 86 mg), and 2.5 eq of 2,4,6-collidine (0.225 mmol, 30 µL) in DMF were added.
Reactions were carried out for 24 h with gentle agitation of the reaction cartridge, followed by washing the resin five times with DMF. P1 coupling reactions were repeated using 1.5 eq of the above reagents. The P1 Fmoc-protecting group was removed from each substrate using 20% piperidine in DMF (5 min, 5 min, and 25 min), and the resin was washed six times with DMF. A ninhydrin test was performed to confirm P1 Fmoc deprotection. Next, 2.5 eq of Fmoc-P2-OH (0.225 mmol) was preactivated with 2.5 eq of HOBt (0.225 mmol, 34 mg) and 2.5 eq of DICI (0.225 mmol, 30 µL) in DMF, added to the cartridge wells containing 1 eq of H2N-P1-ACC-resin and gently agitated for 3 h. A ninhydrin test confirmed complete P2 coupling. Next, the resin was filtered and washed with DMF (six times). The Fmoc-protecting group was removed using 20% piperidine in DMF (5 min, 5 min and 25 min), followed by washing the resin six times with DMF and performing a ninhydrin test. Next, the H2N-P2-P1-ACC-resin product was washed with DMF (six times), DCM (three times) and MeOH (three times), dried over P2O5 and cleaved from the resin with a mixture of TFA:TIPS:H2O (95:2.5:2.5, v/v/v). The crude product was purified by HPLC (Waters system), lyophilized and dissolved in DMSO to a final concentration of 20 mM. Each substrate was analysed by analytical HPLC and high-resolution mass spectrometry (see Appendix S1). The syntheses of the vinyl sulfone inhibitor library, SAK2, SAK1, L-WSAK and D-WSAK were also previously described [21,26]. The synthesis of additional vinyl sulfone inhibitors with natural and non-natural amino acids in the P1 and P2 positions is based on a peptide coupling between the activated carboxylic acid of the P2 amino acid and the unprotected amine of the P1 amino acid bearing the vinyl sulfone. The coupling reaction is followed by the removal of the Boc-protecting group of P2 to afford the desired inhibitor. The synthesis of the vinyl sulfone was done following previously reported methods [61].
Briefly, the carboxylic acid of the amino acid in P1 is coupled with N,O-dimethylhydroxylamine to form a Weinreb-Nahm amide. The amide is reduced to the aldehyde with LiAlH4. With the aldehyde in hand, a Horner-Wadsworth-Emmons (HWE)-type reaction with diethyl phenylsulfonylmethylphosphonate forms the E-alkene, which is the electrophilic trap for the active Cys residue of the protease. The tert-butyloxycarbonyl-protecting group is removed under mildly acidic conditions, and the resulting free amine is coupled with the activated carboxylic acid as indicated above. Details about the synthesis and structural characterization of these inhibitors and their synthetic intermediates are described in the supplementary methods.
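The reagent quantities quoted for the solid-phase couplings follow from the resin loading and the number of equivalents. As a rough check, the sketch below recomputes them; the helper names are hypothetical, and the HATU molecular weight (380.23 g/mol) is a value assumed here rather than one given in the text.

```python
def reagent_mmol(resin_mmol, equivalents):
    """Amount of reagent (mmol) for a given resin loading and equivalents."""
    return resin_mmol * equivalents


def mass_mg(mmol, mw_g_per_mol):
    """Mass in mg: mmol multiplied by g/mol gives mg directly."""
    return mmol * mw_g_per_mol


RESIN_MMOL = 0.74      # 1 g of Rink Amide resin, as stated in the text
PORTION_MMOL = 0.09    # one of the eight resin portions, as stated
HATU_MW = 380.23       # g/mol; assumed value, not given in the text

acc_mmol = reagent_mmol(RESIN_MMOL, 2.5)     # ACC coupling: 1.85 mmol
p1_mmol = reagent_mmol(PORTION_MMOL, 2.5)    # P1 coupling: 0.225 mmol
hatu_mg = mass_mg(p1_mmol, HATU_MW)          # about 85.6 mg of HATU

# Molecular weight implied by the stated 816 mg of Fmoc-ACC-OH at 1.85 mmol.
implied_acc_mw = 816 / acc_mmol              # about 441 g/mol

print(f"ACC coupling: {acc_mmol:.2f} mmol")
print(f"P1 coupling: {p1_mmol:.3f} mmol, HATU {hatu_mg:.1f} mg")
print(f"Implied Fmoc-ACC-OH MW: {implied_acc_mw:.0f} g/mol")
```

Under these assumptions, 0.225 mmol of HATU corresponds to a mass on the milligram scale (about 86 mg), which is the scale expected for a 0.09 mmol resin portion.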
The Phe-Arg-bNA and Gly-Arg-AMC substrates were purchased from Sigma. Recombinant DPAP3 was expressed in insect cells using the baculovirus system as recently described [21]. Bovine CatC was purified to homogeneity from spleen by modification of a method described previously [62,63].

Recombinant DPAP3 active site titration
Recombinant DPAP3 was expressed in SF9 insect cells and purified from the culture supernatant by sequential ion exchange, Ni-NTA and size exclusion chromatography as previously described [21]. To accurately determine the concentration of active DPAP3 in our enzyme stocks, we used the FY01 ABP. Because ABPs only react with the active form of an enzyme and covalently modify its active site, in this case the catalytic Cys, they can be used to perform accurate active site titrations.
Our stock of DPAP3 was diluted 20-fold in assay buffer (100 mM sodium acetate, 100 mM NaCl, 5 mM MgCl2, 1 mM EDTA, 0.1% CHAPS, and 5 mM DTT at pH 6), pretreated for 30 min with DMSO or 1 µM SAK1 (Tyr(NO2)-hPhe-VS), and then labelled with 1 µM FY01 for 1 h at RT. These samples were run on an SDS/PAGE gel along with a serial dilution of free probe (1.5-100 nM). In-gel fluorescence was measured using a Bio-Rad PharosFX flatbed scanner and the intensity of the fluorescent bands quantified using ImageJ. The fluorescent signal from the free probe was used as a calibration curve and compared to the difference in signal between the DMSO- and SAK1-treated DPAP3 (Fig. 7). Using this method, we determined that our DPAP3 stock contained 840 nM of active protease.
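The arithmetic behind this titration can be sketched as follows: fit a straight line to the free-probe calibration bands, convert the SAK1-sensitive signal into a concentration, and correct for the 20-fold dilution. The band intensities below are invented for illustration and were chosen so that the result matches the 840 nM reported above; the linear calibration and the function names are assumptions, not the authors' actual ImageJ workflow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical two-fold serial dilution of free probe (nM) and band signals.
probe_nM = [100.0, 50.0, 25.0, 12.5, 6.25, 3.125, 1.5625]
probe_signal = [100.0 * c + 50.0 for c in probe_nM]   # synthetic, noise-free

# Hypothetical DPAP3 band intensities with and without SAK1 pretreatment.
signal_dmso = 4300.0
signal_sak1 = 100.0
DILUTION = 20.0

slope, intercept = fit_line(probe_nM, probe_signal)
# Taking the difference removes both the background and the intercept.
labelled_nM = (signal_dmso - signal_sak1) / slope
stock_nM = labelled_nM * DILUTION

print(f"labelled DPAP3 in assay: {labelled_nM:.1f} nM")
print(f"active DPAP3 in stock:   {stock_nM:.0f} nM")
```

Using the signal difference rather than the raw DMSO signal means that any probe labelling that is not blocked by SAK1 is not counted as active DPAP3.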

Substrate turnover assay
Fig. 7. Active site titration of DPAP3 using activity-based probes. (A) Labelling of purified rDPAP3 by FY01 in the presence or absence of SAK1. Our stock of rDPAP3 was diluted 20-fold in assay buffer, treated with DMSO or 1 µM SAK1 for 30 min, and labelled with FY01 for 1 h. Samples were run on an SDS/PAGE gel and DPAP3 labelling measured using a flatbed fluorescence scanner. (B) Calibration curve of free probe measured on the same SDS/PAGE gel as the one shown in A. (C) The fluorescence signal for rDPAP3 labelling and free probe was quantified using ImageJ, and the concentration of labelled rDPAP3 calculated based on the calibration curve.

The substrate library was screened in triplicate at 1 µM substrate and 1 nM DPAP3 in assay buffer. Substrate turnover was measured over 30 min at RT using a SpectraMax M5e plate reader (Molecular Devices): λex = 355 nm, λem = 460 nm, emission filter 455 nm, for ACC or AMC (7-amino-4-methylcoumarin) substrates; λex = 315 nm, λem = 355 nm, emission filter 420 nm, for Phe-Arg-bNA; and λex = 492 nm, λem = 523 nm, emission filter 520 nm for (PR)2Rho. Calibration curves of free bNA (β-naphthylamide) and ACC (0-500 nM) under the same assay conditions were performed to convert the turnover rate measured as fluorescent units per second into moles per second. To determine kcat and Km values, substrate turnover was measured at different substrate concentrations in assay buffer using 1 nM of DPAP3 or 1 nM of bovine CatC. The initial velocities were then fitted with Prism to the Michaelis-Menten Eqs. 3 or 4 to obtain accurate kcat and Km or kcat/Km values respectively. A minimum of three replicates were performed for each substrate.
In Eqs. 3 and 4, Vi is the initial velocity and Et is the total concentration of active protease.
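For reference, and as an assumption rather than a transcription of the original equations, Eqs. 3 and 4 presumably take the standard Michaelis-Menten forms. The second expression is the low-substrate limit used when Km is too high for saturation to be reached, so that only the ratio kcat/Km can be determined.

```latex
% Assumed standard forms; [S] is the substrate concentration.
V_i = \frac{k_{cat}\, E_t\, [S]}{K_m + [S]} \tag{3}

V_i = \frac{k_{cat}}{K_m}\, E_t\, [S] \qquad \text{for } [S] \ll K_m \tag{4}
```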

Irreversible inhibition assay
To determine the inhibition constants of vinyl sulfone inhibitors against DPAP3, 2.2 µM of Met-nLeu(o-Bzl)-ACC (0.25 × Km) was mixed with increasing concentrations of inhibitor in assay buffer, and the turnover rate was measured over 40 min at RT after addition of 0.2 nM of DPAP3. The data were analysed according to the irreversible inhibition model shown in Eqn. 1. First, the progress curves (fluorescent units vs. time) were fitted to Eqn. 5, where F is the measured fluorescence at time t, F0 the initial fluorescence, V0 the initial turnover rate, and kobs the observed pseudo-first-order rate constant of inhibition measured at each inhibitor concentration [64].

$$F - F_0 = \frac{V_0\left(1 - e^{-k_{obs}\, t}\right)}{k_{obs}} \tag{5}$$

Second, kobs values were fitted to Eqs. 6 or 7 to obtain the kinact and Ki or kinact/Ki values respectively.
Inhibition constants for DPAP1 and CatC were determined in a similar way but using 0.25 × Km concentrations of Pro-Arg-AMC (20 µM) and Gly-Arg-AMC (15 µM) respectively. Inhibition of DPAP1 was directly measured in parasite lysates diluted 100-fold in assay buffer using the DPAP1-selective substrate Pro-Arg-AMC [42]; 0.2 nM of CatC was used for these inhibition studies. Note that because DPAP1 inhibition was directly measured in parasite lysates, inhibitors might have been partially inactivated through the action of aminopeptidases present in the lysates during the course of the assay. Therefore, the potency of the inhibitors might have been underestimated. However, for all inhibitors the kinetics of DPAP1 inhibition is consistent with our inhibition model (Eqn. 1), indicating that time-dependent degradation of the inhibitor is likely to be negligible.
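The two-stage analysis described in this section can be illustrated with a short sketch. The progress-curve function implements the standard form of Eqn. 5; the two kobs functions are the usual hyperbolic and linear expressions for two-step irreversible inhibition with a substrate-competition term (1 + [S]/Km), which is assumed here to be what Eqs. 6 and 7 represent. The example uses the kinact/Ki (950 000 M⁻¹·s⁻¹) and Ki (2.4 nM) reported above for Tyr(NO2)-hPhe-VS, while the inhibitor concentration, V0 and F0 are arbitrary.

```python
import math


def progress_curve(t, f0, v0, k_obs):
    """Eqn. 5: F = F0 + V0 * (1 - exp(-k_obs * t)) / k_obs."""
    if k_obs == 0.0:
        return f0 + v0 * t          # uninhibited limit: linear progress curve
    return f0 + v0 * (-math.expm1(-k_obs * t)) / k_obs


def k_obs_hyperbolic(inhibitor, k_inact, k_i, s_over_km=0.25):
    """Assumed form of Eqn. 6: saturable dependence of k_obs on [I]."""
    return k_inact * inhibitor / (inhibitor + k_i * (1.0 + s_over_km))


def k_obs_linear(inhibitor, k_inact_over_k_i, s_over_km=0.25):
    """Assumed form of Eqn. 7: valid when [I] is well below K_i."""
    return k_inact_over_k_i * inhibitor / (1.0 + s_over_km)


# Values reported in the text for Tyr(NO2)-hPhe-VS against DPAP3.
K_INACT_OVER_K_I = 950_000.0          # M^-1 s^-1
K_I = 2.4e-9                          # M
K_INACT = K_INACT_OVER_K_I * K_I      # 2.28e-3 s^-1

inhibitor_conc = 1e-9                 # M, arbitrary example concentration
k_obs = k_obs_hyperbolic(inhibitor_conc, K_INACT, K_I)

# Arbitrary initial rate (fluorescence units per second) and baseline.
V0, F0 = 2.0, 100.0
plateau = F0 + V0 / k_obs             # fluorescence once all enzyme is modified

print(f"k_inact = {K_INACT:.3g} s^-1, k_obs at 1 nM = {k_obs:.3g} s^-1")
print(f"signal after 40 min: {progress_curve(2400.0, F0, V0, k_obs):.0f}")
print(f"plateau signal:      {plateau:.0f}")
```

Because the substrate is kept at 0.25 × Km, the competition term only shifts the apparent Ki by a factor of 1.25, so kobs remains sensitive to the inhibitor concentration over the whole titration.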

Homology modelling and docking studies
Homology models for DPAP1 and DPAP3 were built based on the crystal structure of CatC (PDB: 1JQP), and selected inhibitors and substrates were docked into the CatC structure and the DPAP1 and DPAP3 models. Phyre was used to build the initial alignment of human CatC, DPAP1 and DPAP3 with PSIPRED secondary structure prediction, followed by manual adjustment of the alignment to remove insertion sequences not present in the CatC structure. The alignments used to build the DPAP1 and DPAP3 models showed 33% and 30% sequence identity and 50% and 49% similarity to CatC respectively. MOE 2016.08 was used to build the homology models of DPAP1 and DPAP3 from this alignment. Ten models were built with Protonate3D and coarse minimization applied to the final model. Docking was carried out with GOLD v5.6.2. The docking region was defined as a 10 Å radius from the catalytic cysteine. Energy minimization was only applied to the small molecules, not to the enzyme backbone or side chains. Default/automatic settings were employed with GoldScore selected as the scoring function and the top 20 poses saved. Each pose was visually inspected for suitability. The volume of the S2 pocket was calculated based on the docked structures of the nVal-hPhe-VS inhibitor. We used the 'Site Finder' function in MOE to identify a binding pocket within 9 Å of the δ carbon of the nVal side chain and to calculate a relative volume of the S2 pocket for each of the DPAPs.

Measurement of inhibitor specificity in live parasites
Anonymized human blood used to culture P. falciparum was sourced ethically from the United Kingdom National Health Service Blood and Transplant (NHSBT) Special Health Authority in accordance with the United Kingdom Human Tissue Act, which conforms to the Declaration of Helsinki. Blood was obtained with informed consent of the donors and for research purposes only.
Inhibition of Plasmodium DPAPs and the falcipains in intact parasites was measured using the FY01 ABP in competition assays as previously described [25,26]. Because DPAP3 is maximally expressed in very mature schizonts [6], labelling was performed after treating parasites with 1 µM of 'Compound 2', a cGMP-dependent protein kinase inhibitor that arrests parasite development 15-30 min before they egress from the infected RBC [65]. For each sample, 5 µL of Percoll-purified schizonts was diluted in 45 µL of RPMI (Gibco), pretreated for 30 min with a dose response of inhibitor, and labelled for 1 h with 1 µM FY01. Samples were then boiled for 10 min in loading buffer and run on a 12% SDS/PAGE gel. Fluorescently labelled proteases in the gel were detected using a Bio-Rad PharosFX flatbed fluorescence scanner.
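The quantification step of this competition assay (normalizing each band to the DMSO control and reading off an IC50) can be sketched as below. The curve-fitting procedure actually used for Table 5 is not described in this section, so the log-linear interpolation here is a simple stand-in, and all band intensities and concentrations are hypothetical.

```python
import math


def normalize(intensities, dmso_intensity):
    """Express band intensities as a percentage of the DMSO control."""
    return [100.0 * value / dmso_intensity for value in intensities]


def ic50_by_interpolation(concs, percent_activity):
    """Estimate IC50 by linear interpolation on a log10 concentration axis.

    `concs` must be in ascending order. Returns None if the residual
    activity never crosses 50% within the tested range.
    """
    points = list(zip(concs, percent_activity))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical densitometry values for one protease band.
concs_nM = [0.1, 1.0, 10.0, 100.0]
raw_bands = [1900.0, 1500.0, 500.0, 100.0]
dmso_band = 2000.0

activity = normalize(raw_bands, dmso_band)      # [95.0, 75.0, 25.0, 5.0]
ic50_nM = ic50_by_interpolation(concs_nM, activity)

print(f"residual activity (%): {activity}")
print(f"estimated IC50: {ic50_nM:.2f} nM")
```

Interpolating on a logarithmic axis matches how dose responses are usually plotted, but a full sigmoidal fit would make better use of all data points and replicates.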

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Fig. S1. Michaelis-Menten fits for DPAP3. Fig. S2. Representative irreversible inhibition fits. Fig. S3. Michaelis-Menten fits for CatC. Appendix S1. Synthesis and characterization of inhibitors and substrates.