Volume 271, Issue 10 p. 2000-2011
Characterization of a digestive carboxypeptidase from the insect pest corn earworm (Helicoverpa armigera) with novel specificity towards C-terminal glutamate residues

David P. Bown

School of Biological and Biomedical Sciences, University of Durham, UK

John A. Gatehouse

School of Biological and Biomedical Sciences, University of Durham, UK

First published: 27 April 2004
J. A. Gatehouse, School of Biological and Biomedical Sciences, University of Durham, South Road, Durham, DH1 3LE, UK. Fax: + 44 191 334 1201, Tel.: + 44 191 334 1264, E-mail: [email protected]

Enzymes: glutamate carboxypeptidases (EC 3.4.17).

Note: A website is available at http://silver-server.dur.ac.uk

Abstract

Carboxypeptidases were purified from guts of larvae of corn earworm (Helicoverpa armigera), a lepidopteran crop pest, by affinity chromatography on immobilized potato carboxypeptidase inhibitor, and characterized by N-terminal sequencing. A larval gut cDNA library was screened using probes based on these protein sequences. cDNA HaCA42 encoded a carboxypeptidase with sequence similarity to enzymes of clan MC [Barrett, A. J., Rawlings, N. D. & Woessner, J. F. (1998) Handbook of Proteolytic Enzymes. Academic Press, London.], but with a novel predicted specificity towards C-terminal acidic residues. This carboxypeptidase was expressed as a recombinant proprotein in the yeast Pichia pastoris. The expressed protein could be activated by treatment with bovine trypsin; degradation of bound pro-region, rather than cleavage of pro-region from mature protein, was the rate-limiting step in activation. Activated HaCA42 carboxypeptidase hydrolysed a synthetic substrate for glutamate carboxypeptidases (FAEE, C-terminal Glu), but did not hydrolyse substrates for carboxypeptidase A or B (FAPP or FAAK, C-terminal Phe or Lys) or methotrexate, cleaved by clan MH glutamate carboxypeptidases. The enzyme was highly specific for C-terminal glutamate in peptide substrates, with slow hydrolysis of C-terminal aspartate also observed. Glutamate carboxypeptidase activity was present in larval gut extract from H. armigera. The HaCA42 protein is the first glutamate-specific metallocarboxypeptidase from clan MC to be identified and characterized. The genome of Drosophila melanogaster contains genes encoding enzymes with similar sequences and predicted specificity, and a cDNA encoding a similar enzyme has been isolated from gut tissue in tsetse fly. We suggest that digestive carboxypeptidases with sequence similarity to the classical mammalian enzymes, but with specificity towards C-terminal glutamate, are widely distributed in insects.

Abbreviations

  • FAAK, furylacryloyl-Ala-Lys
  • FAEE, furylacryloyl-Glu-Glu
  • FAPP, furylacryloyl-Phe-Phe
  • PCI, potato carboxypeptidase inhibitor
  • SKTI, soya bean Kunitz trypsin inhibitor
  • ACE, angiotensin I-converting enzyme

    Carboxypeptidases are exopeptidases that remove a single amino acid residue from the C-terminus of a protein or peptide substrate. They play an important role in protein digestion in the guts of higher animals, acting to liberate free amino acids from the peptides produced by endopeptidase action, thus completing the digestive process and generating molecules that can be absorbed by the gut via amino acid transporters. Mammals contain three genes encoding digestive carboxypeptidases, designated carboxypeptidases A1, A2 and B [1]. All three proteins are zinc-containing metallopeptidases of clan MC [2].

    The specificity of the digestive carboxypeptidases in mammals has been extensively investigated. These enzymes show a specificity directed towards the C-terminal amino acid residue in their substrates. Carboxypeptidases A1 and A2 prefer neutral amino acids, with A1 favouring smaller amino acid side chains, whereas A2 favours bulkier side chains [3]. Carboxypeptidase B is highly specific for basic C-terminal residues (with arginine favoured over lysine [4]). Specificity is primarily determined by interaction of the side chain of the C-terminal amino acid of the substrate with a binding pocket on the enzyme, with amino acid 255 (human carboxypeptidase A1 numbering) at the bottom [5]. The side chain of this residue interacts with the substrate side chain; in carboxypeptidase B amino acid 255 is negatively charged aspartic acid to interact with positively charged basic side chains, whereas in carboxypeptidase A1 and A2 it is isoleucine, to form a hydrophobic interaction with neutral side chains. Amino acids Tyr248 (hydrogen bond to P1 amino group), Arg71 (hydrogen bond to P2 carbonyl oxygen), Asn144, Arg145 (bind C-terminal carboxylate group of substrate) and Tyr198 are also important for substrate binding [2,6].

    The presence of digestive carboxypeptidases in insects was established by Ward [7], who partially purified an enzyme with carboxypeptidase A activity from gut extracts of the webbing clothes moth, Tineola bisselliella. Soluble carboxypeptidase activity was subsequently found in gut extracts from larvae of both coleopteran (mealworm; Tenebrio molitor [8]) and lepidopteran (armyworm; Spodoptera frugiperda [9]) insect species. In addition to carboxypeptidase A activity, carboxypeptidase B activity has been detected in corn earworm (Helicoverpa armigera), although at much lower levels than carboxypeptidase A activity [10]. The molecular characterization of these enzymes has been carried out through the identification of cDNA clones encoding them. The first putative insect digestive carboxypeptidase to be cloned was from black fly (Simulium vittatum [11]), followed by enzymes from corn earworm (H. armigera [12]), mosquito (Anopheles gambiae [13]; Aedes aegypti [14]), tsetse fly (Glossina morsitans [15]), and bertha armyworm (Mamestra configurata [16]). On the basis of sequence similarity, the genes from black fly, mosquito and corn earworm were predicted to encode proteins with similar specificity to carboxypeptidase A, whereas the cDNA from tsetse fly was predicted to encode a protein showing similar specificity to carboxypeptidase B. These predicted enzyme activities have not been directly demonstrated except in the case of a carboxypeptidase A-like enzyme from corn earworm, which has been expressed as a recombinant protein in insect cells using a baculovirus-based expression system, and has been shown to hydrolyse a synthetic substrate for carboxypeptidase A, but not a synthetic substrate for carboxypeptidase B [10]. Although other insect carboxypeptidases have not been expressed as recombinant enzymes, expression of putative digestive carboxypeptidases has been shown to be strongly gut-specific, and upregulated by feeding [11,13,14,17]. Insects also contain other types of metallopeptidases, such as angiotensin I-converting enzyme (clan MA) [18], but these have not been shown to be involved in digestion.

    Mammalian digestive carboxypeptidases are synthesized as inactive proenzymes (after cotranslational removal of signal peptides) which contain a long N-terminal pro-region [19]. Activation of the carboxypeptidase results from cleavage of the peptide bond between the pro-region and the mature enzyme, catalysed by an activating proteinase (trypsin in mammals). The insect digestive carboxypeptidases also show evidence of the presence of pro-regions, on the basis of sequence similarity, but the activation process has only been directly demonstrated with the carboxypeptidase A-like enzyme from Helicoverpa armigera[20].

    The present paper describes the identification of further digestive carboxypeptidases in corn earworm, which are predicted to show differing specificities towards C-terminal amino acid residues. One of these enzymes is shown to have a novel specificity towards C-terminal glutamate residues.

    Experimental procedures

    Materials

    Cultures of H. armigera were obtained from Syngenta plc (Jealott's Hill Research Station, Bracknell, Berks, UK) and were maintained at 25 °C, with a 16 h day length in a licenced facility (DEFRA PHL 179/4428). Larvae were routinely reared on the standard artificial diet described by Bown et al. [12].

    Purification and characterization of carboxypeptidases from H. armigera larval gut extract

    Gut extract was prepared from fourth instar larvae of H. armigera as described previously [10]. Extract from 65 larvae (13.25 mL) was diluted with an equal volume of 2× loading buffer (1 × = 50 mm Tris/HCl, 100 mm NaCl pH 7.5), centrifuged at 10 000 g for 10 min, and filtered through a GF/C glass fibre disc (Whatman Biochemicals) followed by a 0.47 µm cellulose acetate membrane. The extract was applied to a column of immobilized potato carboxypeptidase inhibitor (PCI) which had been prepared by coupling 2 mg PCI (gift from F. X. Aviles, Institut de Biotecnologia i de Biomedicina, Universitat Autonoma de Barcelona, Spain) to a 1 mL Hi-Trap NHS-activated Sepharose column as described by the manufacturer (Amersham-Pharmacia Biotech). The column was washed successively with 6 mL portions of: loading buffer; 0.25 m NaCl; 2 m glycine/HCl pH 2.0; 0.25 m NaCl; 0.1 m glycine/NaOH, pH 12.0; 0.25 m NaCl; and 6 m guanidine hydrochloride in loading buffer. Pooled fractions were acetone precipitated and analysed by SDS/PAGE. N-terminal sequencing was carried out on proteins blotted onto poly(vinylidene difluoride) membrane after SDS/PAGE by standard Edman degradation procedures using an Applied Biosystems Model 477A Protein Sequencer.

    Isolation and characterization of cDNAs encoding H. armigera carboxypeptidase

    The construction of a cDNA library in the phage λ vector Lambda Uni-ZAP XR (Stratagene) from RNA extracted from gut tissue of H. armigera larvae has been described previously [12]. The library was screened (as described by Sambrook & Russell [21]) using PCR products as probes for carboxypeptidases. The probes were generated by PCR amplification of the library with specific primers encoding carboxypeptidase N-terminal sequences: Band C Fig. 1A, 5′-ATIACITGGGA(C/T)ACITA(C/T)TA(C/T)(A/C)G-3′; band D Fig. 1A, 5′-TT(C/T)GA(C/T)CA(A/G)ATITA(C/T)CA(C/T)C-3′; and a generic vector primer (T7 primer), 5′-GTAATACGACTCACTATAGGGCG-3′. PCR products were cloned in pCR2.1 using the TOPO cloning method (Invitrogen) and checked by DNA sequencing. Clones identified in the primary screen of the library were plaque-purified, excised into pBluescript SK+, and characterized by DNA sequencing as described previously [12]. 5′ RACE was carried out using a BD SMART RACE cDNA Amplification Kit, using the manufacturer's protocol (http://www.bdbiosciences.com/clontech/) and the gene-specific primer: 5′-CCTCGTCAATGGAGTACTCGTAGCCATCAG-3′. The amplified product was cloned in pCR2.1 as described for DNA sequencing. DNA sequences were determined by standard dideoxynucleotide sequencing protocols as adapted for ABI automated DNA sequencers, carried out by the DNA Sequencing Service, School of Biological and Biomedical Sciences, University of Durham. Both DNA strands were fully sequenced on overlapping fragments. Sequences were assembled using sequencher software (Genecodes; http://www.genecodes.com) running on Apple MacOS computers. Sequence analysis was carried out using blast searches to identify sequence similarities (http://www.ncbi.nlm.nih.gov/BLAST/), and signalp[22] to identify signal peptides (http://www.cbs.dtu.dk/services/SignalP-2.0/). 
Multiple sequence comparison and phylogenetic tree analysis were carried out using the Clustal method in the megalign program (DNAStar LaserGene software; http://www.dnastar.com).
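
    The relationship between a degenerate primer and the peptide it was designed from can be checked mechanically. The sketch below is illustrative only: it expands the band C primer quoted above into the set of amino acids each complete codon can encode and compares the result with residues 2 to 8 of the band C N-terminal sequence (ITWDTYY). The function names are hypothetical, and inosine (I) is assumed to pair with any base; the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def parse_degenerate(primer):
    """Turn e.g. 'GA(C/T)ACI' into a list of base sets, one per position."""
    positions, i = [], 0
    while i < len(primer):
        ch = primer[i]
        if ch == "(":
            end = primer.index(")", i)
            positions.append(set(primer[i + 1:end].split("/")))
            i = end + 1
        else:
            # Assumption: inosine (I) is treated as pairing with any base.
            positions.append(set("ACGT") if ch == "I" else {ch})
            i += 1
    return positions


def encoded_residues(primer):
    """Amino acids each complete codon of the primer can encode."""
    positions = parse_degenerate(primer)
    usable = len(positions) - len(positions) % 3  # drop a trailing partial codon
    residues = []
    for start in range(0, usable, 3):
        codon_sets = positions[start:start + 3]
        residues.append({CODON_TABLE["".join(c)] for c in product(*codon_sets)})
    return residues


def primer_matches_peptide(primer, peptide):
    """True if every residue of the peptide is encodable at its codon."""
    residues = encoded_residues(primer)
    return len(peptide) == len(residues) and all(
        aa in allowed for aa, allowed in zip(peptide, residues))


BAND_C_PRIMER = "ATIACITGGGA(C/T)ACITA(C/T)TA(C/T)(A/C)G"
print(primer_matches_peptide(BAND_C_PRIMER, "ITWDTYY"))
```

    The last two bases of the primer, (A/C)G, form an incomplete codon and are ignored by the check; they are compatible with the arginine that follows in the band C sequence.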

    Fig. 1. Purification of native and recombinant carboxypeptidases. (A) Affinity chromatography of gut extract from larval H. armigera on immobilized PCI. Fractions eluted under conditions as shown were analysed by SDS/PAGE. Bands A–F refer to polypeptides subjected to N-terminal sequence analysis (Table 1). (B) Purification of recombinant HaCA42 carboxypeptidase from culture medium after expression in P. pastoris. The purified protein was analysed by SDS/PAGE.

    Preparation of expression construct for recombinant HaCA42 procarboxypeptidase

    A complete ORF for the predicted HaCA42 procarboxypeptidase (i.e., without the signal peptide) was produced by PCR using primers designed to match the first 21 and last 21 bases of the coding sequence of the proprotein, which had extra bases added to include PstI (N-terminal) and SalI (C-terminal) restriction sites: Forward, 5′-CGCGCTGCAGGTCATGAGAAATATGAAGGA-3′; Reverse, 5′-GCGCGTCGACTGAATAGTTTTGCAAGACGTACTG-3′. They were designed to allow the amplified sequences encoding the proproteins to be ligated into the pGAPZα (Invitrogen Life Technologies) to form a continuous reading frame from the α-factor secretion signal of the vector, through the procarboxypeptidase sequence, and into the 6 × His-tag and stop codon of the vector. The PCR products were first cloned into the pCR2.1 vector using a TOPO cloning system (Invitrogen). After confirming the identity of the intermediate clone, the coding sequence fragment was excised by restriction with PstI and SalI, and ligated to pGAPZα which had been restricted with the same enzymes. Vector constructs were transformed into chemically competent TOP10F′ cells (Invitrogen) and maintained on medium containing zeocin (Invitrogen) at a final concentration of 50 µg·mL−1. All expression constructs were sequenced through the ligation sites and inserted sequence to ensure that no errors had been introduced into the expressed polypeptides by the PCR process, and that the construct had been correctly assembled.
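
    As a quick sanity check on the primer design described above, the restriction sites can be located in the two primer sequences. This is a minimal sketch, not part of the published method: the helper names are hypothetical, and the recognition sequences used (PstI, CTGCAG; SalI, GTCGAC) are the standard ones for these enzymes.

```python
# Recognition sequences for the two enzymes used to build the construct.
SITES = {"PstI": "CTGCAG", "SalI": "GTCGAC"}

# Primer sequences as quoted in the text (5' to 3').
FORWARD = "CGCGCTGCAGGTCATGAGAAATATGAAGGA"
REVERSE = "GCGCGTCGACTGAATAGTTTTGCAAGACGTACTG"


def find_site(primer, enzyme):
    """Index of the first occurrence of the enzyme's site, or -1 if absent."""
    return primer.find(SITES[enzyme])


def split_primer(primer, enzyme):
    """Split a primer into (5' clamp, restriction site, remainder)."""
    idx = find_site(primer, enzyme)
    if idx < 0:
        raise ValueError(f"{enzyme} site not found in primer")
    site = SITES[enzyme]
    return primer[:idx], site, primer[idx + len(site):]


for name, primer, enzyme in (("Forward", FORWARD, "PstI"),
                             ("Reverse", REVERSE, "SalI")):
    clamp, site, rest = split_primer(primer, enzyme)
    print(f"{name}: clamp={clamp} site={site} ({enzyme}) remainder={rest}")
```

    In both primers the site follows a four-base 5′ extension, with the remainder of the primer lying downstream of the site.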

    Expression and purification of recombinant HaCA42 procarboxypeptidase

    Competent Pichia pastoris cells (protease-deficient strain SMD1168H) were prepared using the Pichia EasyComp Transformation Kit (Invitrogen) following the manufacturer's protocol. Cells were transformed using linearized DNA (restricted with BlnI) from the expression construct. Transformed yeast cells were selected by plating on YPDS agar medium (10 g·L−1 yeast extract, 20 g·L−1 peptone, 20 g·L−1 dextrose, 1 m sorbitol, 20 g·L−1 agar) containing zeocin at a final concentration of 100 µg·mL−1. Selected colonies were screened for the presence of the expression construct by colony PCR. Selected PCR-positive colonies were screened for expression of the recombinant protein by immuno dot-blot analysis [23] of culture supernatant from small-scale (10 mL) cultures grown for 72 h at 30 °C in YPD/zeocin medium (10 g·L−1 yeast extract, 20 g·L−1 peptone, 20 g·L−1 dextrose, 100 µg·mL−1 zeocin). Recombinant protein was detected with anti-His(C-term) primary antibodies (Invitrogen) followed by horseradish peroxidase-linked goat antimouse secondary Ig (Bio-Rad). Bound peroxidase activity was visualized with a chemiluminescent ECL detection system (Amersham Biosciences).

    The highest-expressing clone was grown in large-scale culture in a 2.5 L laboratory fermenter (BioFlo 3000, New Brunswick Scientific Co. Inc.; http://www.nbsc.com) using the method described in Rogelj et al. [24], but omitting the methanol induction step. After pelleting the yeast cells by centrifugation at 8000 g for 30 min at 4 °C, NaCl was added to the resulting culture supernatant to a final concentration of 2 m. Recombinant protein was purified from this solution by hydrophobic interaction chromatography on a column of phenyl-Sepharose (1 cm i.d., 25 mL volume), equilibrated in and washed with 2 m NaCl. The column was eluted with water, and the eluted peak of protein was pooled. The pooled fractions were adjusted to 20 mm Tris/HCl pH 7.8, 0.5 m NaCl (buffer A) and 5 mm imidazole by adding concentrated buffer solutions. The recombinant protein was finally purified by nickel affinity chromatography on a Ni/nitrilotriacetic acid agarose (Qiagen) column (1 cm i.d., 5 mL volume). The column was washed with buffer A plus 5 mm imidazole. The recombinant 6 × His-tagged protein bound comparatively weakly, and eluted with both buffer A plus 20 mm imidazole and buffer A plus 300 mm imidazole. These fractions were pooled and the protein was precipitated with ammonium sulphate to 90% saturation. The precipitated protein was resuspended in a minimum volume of buffer and desalted by gel filtration. Glycerol was added to the excluded peak and this material was stored frozen in aliquots at −20 °C. The frozen aliquots were thawed before use in all subsequent assays; no loss of activity occurred on storage under these conditions.

    Activation of HaCA42 carboxypeptidase with trypsin

    The HaCA42 procarboxypeptidase was activated by treatment with bovine trypsin in 50 mm Tris/HCl pH 8 at 37 °C. Both the molar ratio of trypsin/procarboxypeptidase and the time of incubation were varied. Samples were removed and diluted 1 : 5 into ice-cold sodium borate buffer pH 8.5. Samples of diluted enzyme were assayed for activity against furylacryloyl-Glu-Glu (FAEE, see below) as a substrate, and the remaining protein was precipitated by acetone. The protein pellet was redissolved in SDS sample buffer and analysed by SDS/PAGE.

    Carboxypeptidase assays and expression of HaCA42 mRNA

    Carboxypeptidase assays using the synthetic substrates furylacryloyl-Phe-Phe (FAPP), furylacryloyl-Ala-Lys (FAAK) and FAEE and Northern blotting of RNA from larval gut tissue, were carried out as described previously [10].

    Peptide digestion by HaCA42 carboxypeptidase

    Recombinant HaCA42 procarboxypeptidase was activated by treatment with bovine trypsin as described above. The activated enzyme was diluted into buffer containing 1 mm benzamidine to inhibit trypsin, and the mixture was treated with phenyl methylsulphonyl fluoride or aminoethyl-benzene sulphonyl fluoride (1 mm) to inactivate the serine proteinase. Diluted carboxypeptidase was incubated with peptide substrates at a concentration of 1–2 µm in 10 mm Tris/HCl pH 7.5 for varying times up to 120 min at 30 °C, routinely at an enzyme/substrate molar ratio of 1 : 200. Other ratios were used as required. Reactions were sampled and quenched by adding dithiothreitol to 20 mm, and spotted onto Ciphergen H4 protein chips (http://www.ciphergen.com). Peptides were analysed by surface enhanced laser desorption/ionization MS, using a Ciphergen instrument, as described in the manufacturer's literature. Mass ion sizes were estimated by calibrating the instrument with size standards covering the range analysed.
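
    Carboxypeptidase digestion is followed by MS because each residue removed from the C-terminus lowers the peptide mass by that residue's mass. The sketch below illustrates the expected mass ladder under stated assumptions: neutral monoisotopic masses are used (the instrument reports ions, but mass differences between species are unaffected), and the example peptide GAFKEE is hypothetical and is not one of the substrates used in this work.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def c_terminal_ladder(sequence, steps):
    """Peptide and mass after removing 0..steps residues from the C-terminus."""
    if not 0 <= steps < len(sequence):
        raise ValueError("steps must leave at least one residue")
    return [(sequence[:len(sequence) - n],
             peptide_mass(sequence[:len(sequence) - n]))
            for n in range(steps + 1)]


EXAMPLE_PEPTIDE = "GAFKEE"  # hypothetical substrate with C-terminal Glu-Glu
for fragment, mass in c_terminal_ladder(EXAMPLE_PEPTIDE, 2):
    print(f"{fragment:8s} {mass:10.4f}")
```

    Loss of a C-terminal glutamate appears as a shift of about 129.04 Da between successive species, and loss of aspartate as a shift of about 115.03 Da.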

    Results

    Purification of carboxypeptidase enzymes from H. armigera larval gut

    In order to characterize the total complement of digestive carboxypeptidases in larval corn earworm, a H. armigera larval gut extract was subjected to affinity chromatography using immobilized PCI as a ligand. The gut extract was applied to the column under nondenaturing conditions at neutral pH, and the column was washed extensively prior to elution under successively more denaturing conditions. Eluted protein fractions were pooled, concentrated and analysed by SDS/PAGE (Fig. 1A). No protein bands were visible in the fraction eluted using buffer at pH 2 (data not shown). Subsequent elution of the column with buffer at pH 12 gave a fraction containing a number of discrete polypeptides, with major bands at ≈ 25, ≈ 50 and ≈ 55 kDa. Finally, the column was eluted under highly denaturing conditions, using buffer containing 6 m guanidine hydrochloride; the eluted fraction contained three polypeptides, a major band at ≈ 35 kDa, and a closely spaced doublet of bands at ≈ 30 kDa.

    Proteins were identified by N-terminal sequencing of polypeptide bands blotted from gel electrophoretic separations (Table 1). None of the major bands eluted at pH 12 contained N-terminal sequences similar to carboxypeptidases present in the databases. The two proteins migrating at ≈ 50 and ≈ 55 kDa (bands A and B; Fig. 1A) were identified from their N-terminal sequences as similar to α-amylase (accession no. AAA17751) from silkworm (Bombyx mori). The 25 kDa polypeptide band (band F; Fig. 1A) gave an N-terminal sequence which corresponded to that predicted by a cDNA previously isolated from the H. armigera larval gut library [12]. This cDNA, SR21 (accession No. Y12274) encodes a protein with sequence similarity to serine proteases, but which appears to lack members of the catalytic triad required for enzyme activity. Binding of these proteins to PCI may be a result of specific interactions between the inhibitor and the proteins themselves, although this would not be expected on the basis of their functional properties and sequences, or may suggest that they are present in a tightly bound complex with carboxypeptidase(s) in vivo.

    Table 1. N-terminal sequences of polypeptides eluted from affinity column containing immobilized PCI at pH 12 and with 6 m guanidine hydrochloride. For each band the sequence determined is listed first, followed by the partial amino acid sequence predicted by the specified cDNA (accession numbers in brackets).

    Band (kDa)   Type        N-terminal sequence                   Identification
    PCI
    F (25)       determined  SSSPARXEDYPSTVQLETGI                  Ha cDNA SR21
                 predicted   AYSSSSPARIEDYPSTVQLETGIGRV            Similar to serine protease (Y12274)
    B (50)       determined  YKNPYYAPGR(S)VNVN                     Bombyx mori
                 predicted   ALAYKNPHYASGRTTMVHLFE                 α-amylase (AAA17751)
    A (55)       determined  YLNPXY                                Bombyx mori
                 predicted   ALAYKNPHYASGRTTMVHLFE                 α-amylase (AAA17751)
    6 m guanidine hydrochloride
    E (30–)      determined  LSFDKIHSYEEVDAYLQELAKEFPNVVTVV        Ha cDNA CM1
                 predicted   RSRLSFDKIHSYEEVDAYLQELAKEFPNVVTVVEGG  carboxypeptidase (AJ005176)
    D (30+)      determined  LD(F/S)LPFDQIYTYHQVDTFLA              Ha cDNA CB6
                 predicted   ASRLDSLPFDQIYTYHQVDTFLDMLA            carboxypeptidase
    C (35)       determined  SITWDTYYRHDEINDYLDELAEQNSD(L/I)XTV    Ha cDNA CA42
                 predicted   SGKSITWDTYYRHDEINDYLDELAEQNSDLVTVINA  carboxypeptidase
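
    The correspondence between a determined N-terminal sequence and a cDNA-predicted sequence in Table 1 can be reproduced with a simple ungapped comparison. The sketch below is illustrative only; the function name is hypothetical, and positions reported as X (residue not identified) are assumed to be uninformative and are skipped rather than counted as mismatches.

```python
def best_ungapped_match(determined, predicted, unknown="X"):
    """Slide `determined` along `predicted` without gaps.

    Returns (offset, identical, compared) for the best-scoring offset, where
    `compared` excludes positions whose determined residue is unknown.
    Returns None if `determined` is longer than `predicted`.
    """
    best = None
    for offset in range(len(predicted) - len(determined) + 1):
        window = predicted[offset:offset + len(determined)]
        scored = [(d, p) for d, p in zip(determined, window) if d != unknown]
        identical = sum(d == p for d, p in scored)
        if best is None or identical > best[1]:
            best = (offset, identical, len(scored))
    return best


# Band F: sequence determined by Edman degradation vs. cDNA SR21 prediction.
BAND_F_DETERMINED = "SSSPARXEDYPSTVQLETGI"
BAND_F_PREDICTED = "AYSSSSPARIEDYPSTVQLETGIGRV"

offset, identical, compared = best_ungapped_match(BAND_F_DETERMINED,
                                                  BAND_F_PREDICTED)
print(f"offset {offset}: {identical}/{compared} identical "
      f"({100 * identical / compared:.0f}%)")
```

    An ungapped comparison is sufficient where the determined sequence lies wholly within the predicted one, as for bands F and E; comparing sequences that differ in length internally would need a gapped alignment instead.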

    In contrast to the bands eluted at pH 12, the polypeptides eluted by 6 m guanidine hydrochloride had N-terminal sequences with similarity to, or identity with carboxypeptidases. The band estimated as 35 kDa (band C; Fig. 1A) gave an N-terminal sequence which had 41% identity over 29 amino acid residues to the N-terminal region of a crayfish carboxypeptidase (P04069). The less strongly stained bands estimated as ≈ 30 kDa (bands D and E; Fig. 1A) contained two different N-terminal sequences. The lower molecular mass band of the doublet (band E; Fig. 1) gave an N-terminal sequence identical over 32 amino acid residues to the N-terminal region of carboxypeptidase A from H. armigera larval gut (sequence predicted by cDNAs AJ005176–8). The higher molecular mass band of the doublet (band D; Fig. 1A) gave an N-terminal sequence of 20 amino acid residues which was similar to (47% identity), rather than identical with the N-terminal region of H. armigera carboxypeptidase A.

    Identification of cDNAs encoding carboxypeptidases in H. armigera larval gut library

    The characterization of three similar cDNAs encoding carboxypeptidase A-like digestive proteases as a result of screening a cDNA library prepared from RNA extracted from gut tissue of corn earworm larvae has been described previously [10]. In order to isolate cDNA clones encoding other digestive carboxypeptidases, degenerate oligonucleotide primers were designed using the N-terminal sequence data obtained from gut polypeptides, as described above. Using these specific primers, and primers directed against vector sequences, PCR was carried out on the larval gut cDNA library. Both N-terminal primers in combination with a generic 3′ primer gave products of ≈ 1.0 kb. PCR products were individually excised from gel, purified, and cloned. At least three independent clones for each product were characterized by a preliminary DNA sequencing run. The PCR reactions using the two separate carboxypeptidase-specific primers each gave essentially a single product with similarity to carboxypeptidases (although minor heterogeneity, potentially resulting from amplification errors, was present). These sequences were then used as probes to screen the cDNA library. cDNAs detected by each of the two PCR products were isolated and sequenced.

    Characterization of H. armigera carboxypeptidase-encoding cDNAs

    cDNAs encoding the 35 kDa H. armigera carboxypeptidase (band C, Fig. 1A) are exemplified by a clone designated HaCA42. This cDNA (accession no. AJ626862) was fully sequenced on both strands; it is truncated at the 5′ end, and starts at nucleotide 8 of the coding sequence. The sequence at the 5′ end of the mRNA was completed by 5′ RACE, from which two independent clones gave identical sequences at the same starting point for the mRNA. A poly(A) tail is present. The corresponding mRNA thus contains an 11 base 5′ untranslated region (UTR), a coding sequence of 1275 bases (including stop codon), and an 89 base 3′ UTR excluding the poly(A) sequence. A cDNA clone with 98% identity with HaCA42 at the nucleotide level over the coding sequence and 99% identity with HaCA42 in the deduced amino acid sequence was also sequenced, and represents a second member of the subfamily of carboxypeptidase genes exemplified by HaCA42. The deduced amino acid sequence of HaCA42 (Fig. 2) predicts that this is a secreted protein, the first 18 residues constituting a typical signal peptide (SignalP prediction, version 2.0). The predicted proprotein is therefore 406 amino acids in length, with a predicted MW of 46.0 kDa. When this sequence was used to query the protein sequence databases, the closest similarity (38–40% identity, based on identity of corresponding amino acid residues) was to the carboxypeptidase A sequences encoded by the cDNAs previously isolated from H. armigera (accession numbers AJ005176–8). Similar levels of similarity were found to sequences of ORFs found in the genomes of Drosophila (34–37% identity, NM139861-3), Anopheles (42% identity, AAAB01008960) and Caenorhabditis elegans (40% identity, NM074283). A carboxypeptidase B enzyme from crayfish also lies within the group of sequences showing high levels of similarity to HaCA42 (37% identity, P04069).

    Fig. 2. Predicted protein sequence from cDNA HaCA42. The predicted signal peptide is indicated; propeptide and mature protein are designated from N-terminal sequence determined for carboxypeptidase purified from H. armigera larval gut extract (shaded). Sequence features of clan MC carboxypeptidases are denoted as follows (numbering from human carboxypeptidase A sequence): *, catalytically active residues (Arg127, Glu270); •, zinc ligand residues (His69, Glu72, His196); [inline image], substrate binding residues (Arg71, Asn144, Arg145, Tyr198, Tyr248); ○, S1′ site residue (Arg255). Potential N-glycosylation sequences are boxed.

    The N-terminal sequence determined for the 35 kDa carboxypeptidase from H. armigera larval guts (band C; SITWDTY…; Table 1) is located 98 amino acids from the predicted N-terminus of the pro-region (Fig. 2). The amino acid sequence predicted by the cDNA is identical to the sequence determined (29 amino acid residues). Removal of the pro-region results in a predicted protein of 308 amino acids, MW 34.8 kDa, in close agreement with that determined by SDS/PAGE. Other features of the predicted protein sequence are consistent with the conserved residues in metallocarboxypeptidases of clan MC [2]. The mature sequence contains amino acid residues His69, Glu72 and His196 (numbering based on human carboxypeptidase A) which ligate the catalytic zinc ion in metallocarboxypeptidases; Arg127 and Glu270 also involved in catalysis; and Arg71, Asn144, Arg145, Tyr198 and Tyr248, which participate in substrate binding (Fig. 2). A distinguishing feature of this predicted protein sequence is the amino acid residue at position 255, which determines substrate specificity by interacting with the side chain of the P1′ residue. In the protein predicted by HaCA42 this residue is arginine. There are also two consensus N-glycosylation sites within the amino acid sequence predicted by HaCA42, both of which lie within the mature protein, with one near the C-terminus.
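
    The lengths quoted for the HaCA42 mRNA and its protein products follow from one another by simple arithmetic. The short sketch below restates that bookkeeping using only figures given in the text; the variable names are illustrative.

```python
# Figures taken from the text.
UTR_5_BASES = 11
CDS_BASES = 1275          # coding sequence, including the stop codon
UTR_3_BASES = 89          # excluding the poly(A) tail
SIGNAL_PEPTIDE_AA = 18
PRO_REGION_AA = 98

codons = CDS_BASES // 3                          # includes the stop codon
preproprotein_aa = codons - 1                    # translated residues
proprotein_aa = preproprotein_aa - SIGNAL_PEPTIDE_AA
mature_aa = proprotein_aa - PRO_REGION_AA
mrna_bases = UTR_5_BASES + CDS_BASES + UTR_3_BASES

print(f"primary translation product: {preproprotein_aa} aa")
print(f"proprotein: {proprotein_aa} aa")
print(f"mature enzyme: {mature_aa} aa")
print(f"mRNA excluding poly(A): {mrna_bases} bases")

# Mean residue mass implied by the quoted molecular weights (Da per residue).
print(f"proprotein: {46000 / proprotein_aa:.0f} Da/residue")
print(f"mature enzyme: {34800 / mature_aa:.0f} Da/residue")
```

    The derived values of 406 and 308 residues agree with the proprotein and mature protein lengths stated in the text, and both quoted molecular weights imply a mean residue mass of about 113 Da.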

    The cDNAs encoding the polypeptide present in the upper band of the 30 kDa doublet of H. armigera carboxypeptidases (band D, Fig. 1A) are exemplified by a clone designated HaCB6 (accession no. AJ626863). This cDNA also encodes a clan MC metallocarboxypeptidase enzyme, and will be described elsewhere.

    Expression and purification of recombinant procarboxypeptidase HaCA42

    A construct to allow the protein encoded by HaCA42 to be expressed in the yeast P. pastoris was assembled by amplifying the coding sequence of the cDNA by PCR. Primers were designed to allow the PCR product to be inserted into the Pichia expression vector pGAPZαB with the N-terminus of the proprotein in-frame and adjacent to the cleavage point of the yeast α-mating factor secretion signal encoded by the vector. In addition, a (His)6-tag encoded by the vector was added to the C-terminus of the protein before the stop codon. The construct was verified by DNA sequencing after assembly, and linearized plasmid DNA was used to transform competent P. pastoris. After selection for transformation on zeocin plates, colonies were screened by PCR for the presence of the HaCA42 sequence. Positive colonies were individually grown in small-scale cultures, and samples of culture medium were assayed for expression of His-tagged protein by immunodot blot. The transformant that showed the highest level of expression was chosen for protein production, and was grown up under optimized conditions in a 2 L laboratory fermenter.

    The recombinant protein (referred to subsequently as HaCA42 procarboxypeptidase or carboxypeptidase) was purified from culture medium by hydrophobic interaction chromatography followed by affinity chromatography on immobilized nickel ions. The purified protein ran as a single band when analysed by SDS/PAGE, with an estimated MW of ≈ 50 kDa (Fig. 1B). The yield of purified protein was ≈ 5 mg·L−1 of fermenter culture.

    Activation of recombinant procarboxypeptidase HaCA42

    The recombinant HaCA42 procarboxypeptidase enzyme had no detectable activity when assayed against synthetic furylacryloyl (FA)–peptide substrates for carboxypeptidases. Three substrates were assayed: FAPP, with a C-terminal phenylalanine residue (hydrolysed by carboxypeptidase A); FAAK, with a C-terminal lysine residue (hydrolysed by carboxypeptidase B); and FAEE, with a C-terminal glutamate residue (hydrolysed by glutamate carboxypeptidase). However, after the protein was treated with substoichiometric amounts of bovine trypsin (procarboxypeptidase/trypsin molar ratio > 5 : 1) carboxypeptidase activity against FAEE could be detected. Trypsin gave no activity against any of these substrates in the absence of the recombinant carboxypeptidase. Because the activation of mammalian digestive procarboxypeptidases in vivo is known to be caused by cleavage of the propeptide by trypsin [19], the results suggested that a similar activation process was necessary for the HaCA42 procarboxypeptidase, and that endogenous yeast proteases in the protease-deficient Pichia strain used were not sufficient to cause activation.

    To further investigate the activation process, recombinant HaCA42 procarboxypeptidase was incubated with trypsin (9.4 : 1 molar ratio) at 37 °C. At various timepoints samples were withdrawn and trypsin activity was quenched; the carboxypeptidase activity against FAEE was then assayed, and the polypeptides present were analysed by SDS/PAGE. Results are shown in Fig. 3A. A control sample of HaCA42 procarboxypeptidase (track C) contained no detectable polypeptides of < 48 kDa, but even after a nominal zero incubation time (track 0), corresponding to sampling the mixture of procarboxypeptidase and trypsin immediately after addition of trypsin, polypeptides of ≈ 36 kDa and ≈ 13 kDa are present in the sample. Neither of these polypeptides was present in trypsin when this enzyme was analysed by SDS/PAGE (data not shown). By analogy with the activation of mammalian carboxypeptidases, these polypeptides correspond to the active HaCA42 carboxypeptidase (36 kDa polypeptide) and the pro-region (13 kDa polypeptide). After 5 min the majority, and by 20 min all, of the original 50 kDa protein had been digested, with an increase in staining of the bands of ≈ 36 kDa and ≈ 13 kDa. On further incubation with trypsin the 13 kDa polypeptide is itself digested by trypsin, and decreases in amount until it is no longer detectable after 80 min of digestion, but the amount of 36 kDa polypeptide remains constant up to 2 h digestion under these conditions. When the carboxypeptidase activity (FAEE substrate) of the mixture was assayed, there was a qualitative correlation between the appearance of the putative 36 kDa activated carboxypeptidase polypeptide, and the level of activity detected. Thus, the control procarboxypeptidase sample had no detectable activity, but the zero time sample contained detectable activity (5% of maximum activity) which increased with time (Fig. 3B).
However, when quantitative estimates of activity were compared to results of the gel analysis, it was apparent that cleavage alone was not sufficient for activation. After 20 min digestion by trypsin, all of the procarboxypeptidase band at 50 kDa had been cleaved to 36 kDa and 13 kDa bands, but the carboxypeptidase activity was only ≈ 55% of the maximum activity (Fig. 3A,B). The carboxypeptidase activity only reaches a maximum after ≈ 60–80 min incubation with trypsin, and further incubation with trypsin to 120 min does not affect the level of activity against this substrate. Attainment of maximum carboxypeptidase activity in this assay corresponds to the disappearance of the 13 kDa pro-region polypeptide (Fig. 3A,B); once this polypeptide has been completely digested by trypsin, the carboxypeptidase activity is maximal.


    Activation of HaCA42 carboxypeptidase by trypsin. (A) SDS/PAGE of cleavage of procarboxypeptidase, sampled after stated times of digestion with bovine trypsin (9.4 : 1 molar ratio procarboxypeptidase/trypsin). Pro-, procarboxypeptidase; Mature, mature carboxypeptidase; Activn. peptide, pro-region. The faint band at ≈ 25 kDa is from the trypsin used for activation. (B) Carboxypeptidase activity (digestion of FAEE substrate) after stated times of digestion with bovine trypsin.

    Characterization of recombinant HaCA42 carboxypeptidase activity

    The pH optimum for hydrolysis of FAEE by the activated HaCA42 carboxypeptidase was determined over the range 2.2–10.5 using a variety of buffer systems. There was a marked optimum activity at pH 8.5 in borate buffer, with activity declining to 50% of maximum at pH 7.5 and 10.0 (data not shown). Various diagnostic inhibitors were used to characterize the activity of the recombinant enzyme. No inhibition (< 10% reduction in activity compared to enzyme preincubated without inhibitor) was observed after preincubation with: the cysteine protease inhibitor E-64 (10⁻⁵ M final concentration); the aspartic protease inhibitor pepstatin (10⁻⁵ M); the serine protease inhibitors phenylmethylsulphonyl fluoride (2 × 10⁻⁵ M) and soybean Kunitz trypsin inhibitor (5 × 10⁻⁷ M); chymostatin (10⁻⁵ M), an inhibitor of chymotrypsin; and benzamidine (10⁻² M), an inhibitor of trypsin.

    The metalloprotease inhibitors phenanthroline (5 × 10⁻³ M) and EDTA (10⁻² M) both had marked effects on activity (82% inhibition and 96% inhibition, respectively), as did the protein carboxypeptidase inhibitor PCI (94% inhibition at 2.5 × 10⁻⁶ M). Interestingly, preincubation with zinc, used by many authors in activating carboxypeptidase, has a deleterious effect on activity; 10⁻⁵ M ZnCl₂ inhibits activity by 68% and 10⁻⁶ M ZnCl₂ inhibits activity by 21%. The reducing agent dithiothreitol also inhibits activity of the recombinant HaCA42 carboxypeptidase at concentrations above 10⁻⁵ M, resulting in 45% inhibition at 10⁻⁴ M and 86% inhibition at 10⁻³ M.

    The kinetic parameters for hydrolysis of FAEE by the recombinant HaCA42 carboxypeptidase were determined by a standard Michaelis–Menten analysis using varying substrate concentrations. Km was estimated as 6 × 10⁻⁵ M (mean of three determinations), and Vmax was estimated as 7.3 × 10⁻⁷ moles FAEE hydrolysed·s⁻¹·mg protein⁻¹. Assuming a MW of 36 kDa for the active recombinant enzyme, and that all the proenzyme has been activated and remains active in the assay, these figures give values of 26 s⁻¹ for kcat and 4.3 × 10⁵ s⁻¹·M⁻¹ for kcat/Km.
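    The conversion from Vmax to kcat and kcat/Km can be reproduced with a short calculation. The following is a minimal sketch in Python that uses only the values reported above and the same two assumptions (a 36 kDa active enzyme, fully activated); the constant and function names are illustrative and are not taken from the paper.

```python
# Values reported in the text; all names here are illustrative only.
KM_MOLAR = 6e-5                  # Km, mol/L
VMAX_MOL_PER_S_PER_MG = 7.3e-7   # Vmax, mol FAEE hydrolysed per s per mg protein
ENZYME_MW_G_PER_MOL = 36_000     # assumed mass of the active enzyme (36 kDa)


def turnover_number(vmax_mol_s_mg: float, mw_g_mol: float) -> float:
    """Return kcat in s^-1, assuming every enzyme molecule is active."""
    mol_enzyme_per_mg = 1e-3 / mw_g_mol   # 1 mg = 1e-3 g
    return vmax_mol_s_mg / mol_enzyme_per_mg


kcat = turnover_number(VMAX_MOL_PER_S_PER_MG, ENZYME_MW_G_PER_MOL)
specificity_constant = kcat / KM_MOLAR    # kcat/Km in s^-1 M^-1

print(f"kcat    = {kcat:.1f} s^-1")
print(f"kcat/Km = {specificity_constant:.2e} s^-1 M^-1")
```

    The unrounded kcat is slightly above 26 s⁻¹, so the script gives a kcat/Km close to 4.4 × 10⁵ s⁻¹·M⁻¹; the reported 4.3 × 10⁵ s⁻¹·M⁻¹ follows from dividing the rounded value of 26 s⁻¹ by Km.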

    Substrate specificity of recombinant HaCA42 carboxypeptidase

    The activated recombinant HaCA42 carboxypeptidase hydrolysed FAEE (Glu C-terminal residue), but gave no detectable hydrolysis of synthetic substrates for carboxypeptidase A (FAPP, Phe C-terminal residue) or carboxypeptidase B (FAAK, Lys C-terminal residue) even when used in large amount for extended digestion periods. The specificity of the activated enzyme was investigated in more detail by incubation with a selection of peptides of known sequence in the presence of trypsin inhibitors to prevent digestion by the activating enzyme. The presence or absence of digestion was assayed by MS over a mass range which included the peptide substrates. Results are presented in Table 2, which defines the peptide sequences and their abbreviations.

    Table 2. Digestion of peptide substrates by activated recombinant HaCA42 carboxypeptidase. Digestion was detected by MS after varying times of digestion up to 2 h. 0, no digestion detectable; ±, slight digestion detectable; +++, digestion readily detectable. ACTH, adrenocorticotropic hormone.

    Peptide | C-terminal amino acid | Sequence (Mr) | Digestion
    [Glu1]-Fibrinopeptide B | Arginine | EGVNDNEEGFFSAR | 0
    PDI substrate | Asparagine | NRCSQGSCWN | 0
    ACE inhibitor | Aspartate | PTHIKYGD | +
    β-endorphin (aa 61–91) | Glutamate | YGGFMTSEKSQTPLVTLFKNAIIKNAYKKGE | +++
    Cys-CD36 (aa 139–155) | Glutamine | CNLAVAAASHIYQNQFVQ | 0
    Angiotensin 1 | Leucine | DRVYIHPFHL | 0
    ACTH (1–24) | Proline | SYSMEHFRWGKPVGKKRRPVKVYP | 0
    Angiotensinogen 1–14 (rat) | Serine | DRVYIHPFHLLYYS | 0 (unstable)

    When hydrolysing peptide substrates at enzyme/substrate ratios of 1 : 200, the HaCA42 carboxypeptidase had a similar specificity to that observed when synthetic dipeptide substrates were used. Angiotensin 1, a peptide with a C-terminal neutral, hydrophobic amino acid (Leu) was not hydrolysed, like the synthetic substrate FAPP (Phe C-terminal residue). Similarly, fibrinopeptide B, with a C-terminal basic residue (Arg), like FAAK (Lys C-terminal residue), was not hydrolysed. The neutral hydrophilic C-terminal serine of angiotensin (1–14), and the C-terminal proline of ACTH 1–24 were also not hydrolysed. On the other hand, a peptide with a C-terminal glutamate residue (β-endorphin amino acids 61–91) was readily cleaved by the HaCA42 carboxypeptidase, like the synthetic substrate FAEE (C-terminal Glu residue). The specificity was further explored by using peptide substrates with the C-terminal side-chain amide residues, asparagine (PDI substrate) and glutamine (Cys-CD36). Neither peptide was hydrolysed by the HaCA42 carboxypeptidase, suggesting that the C-terminal amino acid must carry a negative charge on the side chain. Finally, an angiotensin 1 converting enzyme (ACE) inhibitor peptide with a C-terminal aspartate residue was assayed; this was hydrolysed by the carboxypeptidase, but very slowly. Under conditions sufficient to completely cleave the β-endorphin substrate, < 5% of the ACE-inhibitor peptide was cleaved, as estimated by the appearance of a new peptide with lower molecular mass (data not shown).
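    The specificity pattern described above reduces to a simple rule on the C-terminal residue. The sketch below, in Python, encodes that rule and checks it against the entries of Table 2. The function names and the labels "fast", "slow" and "none" are illustrative choices rather than terms from the paper, and any non-zero entry in the table is treated here as "digestion detected".

```python
# Sequences and digestion scores transcribed from Table 2.
TABLE_2 = {
    "[Glu1]-Fibrinopeptide B": ("EGVNDNEEGFFSAR", "0"),
    "PDI substrate": ("NRCSQGSCWN", "0"),
    "ACE inhibitor": ("PTHIKYGD", "+"),
    "beta-endorphin (aa 61-91)": ("YGGFMTSEKSQTPLVTLFKNAIIKNAYKKGE", "+++"),
    "Cys-CD36 (aa 139-155)": ("CNLAVAAASHIYQNQFVQ", "0"),
    "Angiotensin 1": ("DRVYIHPFHL", "0"),
    "ACTH (1-24)": ("SYSMEHFRWGKPVGKKRRPVKVYP", "0"),
    "Angiotensinogen 1-14 (rat)": ("DRVYIHPFHLLYYS", "0"),
}


def predicted_cleavage(sequence: str) -> str:
    """Apply the rule suggested by the data: only acidic C-termini are cleaved."""
    c_terminal = sequence[-1]
    if c_terminal == "E":   # glutamate: readily hydrolysed
        return "fast"
    if c_terminal == "D":   # aspartate: hydrolysed, but very slowly
        return "slow"
    return "none"           # amide, basic, hydrophobic, Ser or Pro C-termini


def rule_matches_table() -> bool:
    """True if the rule predicts cleavage exactly where digestion was seen."""
    for sequence, observed in TABLE_2.values():
        digested = observed != "0"
        if digested != (predicted_cleavage(sequence) != "none"):
            return False
    return True


for name, (sequence, observed) in TABLE_2.items():
    print(f"{name}: C-terminal {sequence[-1]}, "
          f"predicted {predicted_cleavage(sequence)}, observed {observed}")
```

    The rule is descriptive only: it summarizes the eight peptides tested and says nothing about residues further from the C-terminus, which were not varied systematically in these experiments.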

    The specificity of the HaCA42 carboxypeptidase was also investigated by carrying out a time-course experiment for digestion of the β-endorphin substrate. Results are presented in Fig. 4. At an enzyme/peptide ratio of 1 : 5000, appearance of a peptide product of correct mass for cleavage of the C-terminal glutamate from the β-endorphin peptide was observed after 1 min. The amount of this product relative to the undigested peptide increased with time, until digestion was essentially complete after 90 min. After removal of the C-terminal glutamate, the next residue is a glycine, but there was no evidence for removal of this residue from the initial product of HaCA42 carboxypeptidase digestion in the timescale of this experiment (up to 120 min), or in experiments where HaCA42 carboxypeptidase was present at ratios up to 1 : 200 with respect to substrate.


    Time-course for digestion of β-endorphin peptide substrate by activated HaCA42 carboxypeptidase. Traces show mass spectra from peptide sampled after varying times of digestion. Mass ion at m/e 3465.0 corresponds to uncleaved peptide; mass ion at 3335.9 corresponds to removal of a glutamate residue from the C-terminus; this product is then stable to further C-terminal exopeptidase action. The small peak at m/e 3150.7 visible after extended digestion results from cleavage between lysine residues in the peptide C-terminal sequence (…YKKGE) caused by residual trypsin activity from the activating enzyme.
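    The peak assignments in this figure can be checked by subtracting residue masses from the mass of the uncleaved peptide. The sketch below, in Python, assumes that the spectra report average (not monoisotopic) masses and uses standard average residue masses rounded to two decimal places; the names are illustrative.

```python
# Average residue masses (Da), rounded; assumed appropriate for these spectra.
AVERAGE_RESIDUE_MASS = {"E": 129.12, "G": 57.05, "K": 128.17}

UNCLEAVED_MASS = 3465.0   # mass ion of the intact beta-endorphin peptide (…YKKGE)


def mass_after_c_terminal_loss(parent_mass: float, removed: str) -> float:
    """Mass of the peptide left after removing the given C-terminal residues.

    Hydrolysis of a peptide bond adds one water across the two products, so
    the shortened peptide is lighter by exactly the residue mass(es) removed.
    """
    return parent_mass - sum(AVERAGE_RESIDUE_MASS[r] for r in removed)


# Carboxypeptidase product: loss of the C-terminal glutamate only.
glu_removed = mass_after_c_terminal_loss(UNCLEAVED_MASS, "E")

# Residual trypsin product: cleavage between the two lysines removes K-G-E.
tryptic_product = mass_after_c_terminal_loss(UNCLEAVED_MASS, "KGE")

# Hypothetical product if glycine were also removed; no such peak is reported.
glu_gly_removed = mass_after_c_terminal_loss(UNCLEAVED_MASS, "GE")

print(f"-Glu:     {glu_removed:.1f}")
print(f"-Lys-Gly-Glu: {tryptic_product:.1f}")
print(f"-Gly-Glu: {glu_gly_removed:.1f}")
```

    Under these assumptions the two computed products agree with the reported ions at 3335.9 and 3150.7 to within 0.1 mass units.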

    The HaCA42 carboxypeptidase was also assayed for its ability to hydrolyse the folate analogue methotrexate, which contains a glutamate residue linked via an amide bond to pteroic acid. No activity against this substrate could be detected in a spectrophotometric assay in the presence of excess enzyme.

    Glutamate carboxypeptidase activity in H. armigera larvae

    Activity towards synthetic substrates for carboxypeptidase A and B (FAPP and FAAK) has previously been characterized in gut extracts from H. armigera larvae [10]. However, crude extracts of H. armigera larval gut contents showed little detectable activity towards the glutamate carboxypeptidase substrate FAEE, although carboxypeptidase A activity, and low levels of carboxypeptidase B activity, could be detected in the same material. To confirm that the digestive carboxypeptidases in this insect did include enzymes with activity towards substrates with C-terminal glutamate residues, two approaches were taken. When insects were induced to regurgitate gut contents, and the regurgitant was collected and analysed, carboxypeptidase activity towards the FAEE substrate could be readily detected. The activity was shown to be present in bulk gut contents by partial purification of total gut content proteins by ammonium sulphate precipitation. The redissolved ammonium sulphate pellet was assayed for carboxypeptidase activity, and hydrolysis of both FAPP (carboxypeptidase A activity) and FAEE (glutamate carboxypeptidase activity) was detected, although more activity towards the former substrate was present. Quantitative analysis gave values of 4.3 × 10⁻⁸ moles FAPP hydrolysed·min⁻¹·gut equivalent⁻¹ and 7.3 × 10⁻⁹ moles FAEE hydrolysed·min⁻¹·gut equivalent⁻¹ under the conditions of this assay, suggesting that approximately six times as much carboxypeptidase A activity as glutamate carboxypeptidase activity is present in bulk gut contents.

    A Northern blot of RNA extracted from gut tissue of larval H. armigera was probed with the HaCA42 cDNA. A single band of estimated size 1.45 kb was observed after autoradiography (Fig. 5), consistent with the estimated size of the mRNA, and its assumed abundance in gut tissue.


    Expression of HaCA42 in gut mRNA. RNA extracted from midgut tissue of H. armigera larvae fed on control diet (C) and diet supplemented with SKTI (S) was separated by formaldehyde/agarose gel electrophoresis and blotted onto nitrocellulose. The blot was probed with the HaCA42 cDNA (coding sequence and 3′ UTR) and washed to a final stringency of 0.1 × NaCl/Cit, 0.1% SDS at 50 °C. The size of the hybridizing band was estimated from markers run on separate tracks of the same gel, which were excised and stained.

    Discussion

    Carboxypeptidases specific for glutamate have been characterized from a number of bacterial species; they are referred to as carboxypeptidase G (various subtypes), or more correctly, glutamate carboxypeptidases (NC-IUBMB preferred [25]), and have been given the EC classification 3.4.17.11. These enzymes are able to cleave C-terminal glutamate residues in peptides, and also the glutamate residue linked via its α-amino group to pteroic acid in folic acid and folate analogues, such as the drug methotrexate (4-amino-N10-methylpteroylglutamate). A distinct enzyme, known as glutamate carboxypeptidase II (EC 3.4.17.21), which is active towards acidic dipeptides with C-terminal glutamate, and folate analogues, is present in mammalian nervous tissue and prostate [26]. These enzymes all belong to clan MH of metalloproteinases, and have little sequence similarity or structural similarity to clan MC carboxypeptidases. The enzyme described in the present paper is different from these previously described glutamate carboxypeptidases in belonging to the clan MC of metallocarboxypeptidases. It is also more specific than the clan MH glutamate carboxypeptidases, as it has no detectable activity towards glutamate residues linked to folic acid. No other carboxypeptidase in clan MC has a similar specificity to the HaCA42 enzyme, and to date the only eukaryotic digestive carboxypeptidase activity demonstrated has been of the -A or -B type [2,27]. The HaCA42 enzyme is therefore the first example of a new peptidase. The best nomenclature for this enzyme would be carboxypeptidase C (which would emphasize its similarity to carboxypeptidases A and B), but this name is already used for a subclass of serine carboxypeptidases, although not for any specific enzyme in this class. ‘Glutamate carboxypeptidase MC’ is a possible alternative.

    It seems unlikely that this type of carboxypeptidase is unique to H. armigera, and it would be reasonable to expect similar enzymes to be present in other lepidopteran herbivores, and possibly in a wider range of arthropods. The Drosophila melanogaster (fruit fly) genome contains 19 genes encoding proteins with sequence similarity to the HaCA42 carboxypeptidase (BLAST comparison, E < 10⁻³⁰), plus two genes encoding proteins with a low level of similarity (CG4122, CG4678; E = 7 × 10⁻⁶ and 2 × 10⁻⁶, respectively). A phylogenetic tree based on sequence comparison between HaCA42 and similar proteins predicted by the Drosophila genome is shown in Fig. 6A. The HaCA42 carboxypeptidase maps within the phylogenetic tree of similar Drosophila predicted proteins. Although not all the Drosophila genes encode active carboxypeptidase enzymes, the majority contain the residues necessary for activity, and have sufficient similarity over the region corresponding to residues 248–270 in human carboxypeptidase A to allow the residue equivalent to amino acid 255, the specificity-determining residue, to be identified. Three genes, CG4408, CG12374 and CG14820, predict proteins with lysine residues at position 255 (Fig. 6B), where a positively charged basic side chain should give these proteins a similar specificity to HaCA42. All these proteins are predicted to have metallocarboxypeptidase activity; the CG12374 product is annotated in FlyBase as having carboxypeptidase A activity, but this assignment is based only on overall sequence similarity and, we suggest, is probably incorrect.


    Sequence comparisons for carboxypeptidases. (A) Phylogenetic tree for predicted carboxypeptidases of clan MC, family M14, from D. melanogaster (designated by CG-gene identifier) compared to HaCA42 carboxypeptidase (shaded branch). The S1′ site residue (AA255) and the predicted carboxypeptidase activity (A-like, B-like or glutamate-) based on this residue are as indicated. Sequence similarity over the region corresponding to amino acids 248–270 (human carboxypeptidase A numbering) is not present for the predicted products of CG32379 and CG8945; CG15679 lacks both E270 and Y248, and CG3097 and CG8564 lack Y248. These genes are predicted to encode proteins inactive as carboxypeptidases. (B) Sequence alignment over the region including amino acids 246–272 (human carboxypeptidase A numbering) for human carboxypeptidase A, and enzymes of clan MC, family M14 predicted to show glutamate carboxypeptidase activity. H. armigera (shaded) and D. melanogaster genes are designated as above; GmZcp cDNA, protein predicted by G. morsitans (tsetse fly) gut cDNA clone.
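    The assignment of predicted activity from the residue at position 255 can be written as a small lookup. The Python sketch below is a deliberately simplified heuristic, not the procedure used to build the figure. The glutamate-like class for a basic residue follows the argument in the text; the B-like class for an acidic residue is an assumption based on the Asp255 of mammalian carboxypeptidase B, and treating every other residue as A-like is a simplification. The function name is hypothetical.

```python
def predict_specificity(residue_255: str) -> str:
    """Heuristic class prediction from the S1' residue (one-letter code).

    Human carboxypeptidase A numbering is assumed for position 255.
    """
    residue = residue_255.upper()
    if residue in {"K", "R"}:
        # Basic side chain: expected to pair with an acidic C-terminal residue.
        return "glutamate-like"
    if residue in {"D", "E"}:
        # Acidic side chain (assumption from mammalian carboxypeptidase B):
        # expected to pair with a basic C-terminal residue.
        return "B-like"
    # Anything else, e.g. Ile255 of human carboxypeptidase A (simplification).
    return "A-like"


# Lys255 is reported for HaCA42-like sequences; Ile255 for human carboxypeptidase A.
for residue in ("K", "I", "D"):
    print(residue, "->", predict_specificity(residue))
```

    A prediction of this kind only applies to sequences that retain the residues required for catalysis; genes lacking them are predicted to be inactive regardless of the residue at position 255.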

    In contrast with the situation in Drosophila, the Anopheles gambiae (mosquito) genome does not contain genes encoding carboxypeptidases with similar predicted specificity to HaCA42. There are 22 genes predicting proteins with sequence similarity to HaCA42 (E < 10⁻²⁹), but none of the genes predicting active enzymes have a basic residue at positions equivalent to Ile255 in human carboxypeptidase A, all being carboxypeptidase A- or B-like in predicted specificity. In support of a wider distribution of glutamate-specific carboxypeptidases beyond H. armigera, examination of the global sequence databases suggests that a further enzyme similar in sequence to HaCA42, and with a similar predicted cleavage specificity, is present in one other insect species. An incomplete cDNA from tsetse fly (Glossina morsitans morsitans), designated GmZcp (accession number AAK07479; amino acid sequence given is not complete), encodes a carboxypeptidase which contains a lysine residue at position 255 (Fig. 6B), like the Drosophila sequences, predicting similar specificity to HaCA42. In this case the enzyme has been incorrectly predicted to have carboxypeptidase B-like specificity [15]. The tsetse fly sequence is predicted to encode a digestive enzyme, as it was cloned from gut tissue, and the corresponding mRNA increases in level in response to feeding, but the role(s) of the Drosophila genes are not known. It seems likely that further enzymes with this predicted specificity will be found in other insects, and possibly in a wider range of eukaryotes, as more sequence data become available.

    Expression of the HaCA42 carboxypeptidase as a recombinant protein in P. pastoris has allowed its functional properties to be fully characterized. A similar approach has been taken for a second carboxypeptidase from this species, the digestive carboxypeptidase A-like enzyme described by Bown et al. [10]. A study using recombinant enzyme produced in P. pastoris showed that the insect enzyme had a broader substrate specificity than human carboxypeptidase A, showing some hydrolytic activity towards both aliphatic and basic C-terminal residues as well as more hydrophobic residues [20]. In that case also, activation of the proenzyme produced in Pichia by bovine trypsin was necessary for activity.

    The activation of human carboxypeptidase A by trypsin in the intestine involves both the cleavage of a susceptible peptide bond between pro-region and mature polypeptide, and the subsequent degradation of the pro-region peptide by trypsin and other digestive enzymes. The pro-segment behaves as an inhibitor of the carboxypeptidase prior to activation, and can be shown to act as a potent inhibitor (Ki in the nM range) when produced separately and added to activated enzyme [28]. Structural studies have shown that the pro-segment binds to the active site region of the enzyme, rendering the active centre inaccessible to protein and peptide substrates [19]. The activation process is thus limited by degradation of the bound pro-region, because the carboxypeptidase will be inhibited as long as the activation domain of the pro-region is kept in place. Structural studies on recombinant carboxypeptidase A-like enzyme from H. armigera have shown the presence of an inhibitory pro-region in the enzyme prior to activation with bovine trypsin [29], and suggest that the activation process for this enzyme in vivo is similar to that established for mammalian enzymes, with insect trypsin-like enzymes being responsible for the cleavage and degradation of the pro-region. Subsequent studies suggested that the lysine-specific endoprotease LysC was a more effective activator of the enzyme in vitro [20], although enzymes of this type have not been detected in the insect host, whereas trypsin-like enzymes are abundant [12], and lepidopteran trypsins have been shown to hydrolyse more efficiently at Lys than at Arg residues [30]. The activation process for the HaCA42 carboxypeptidase corresponds well to this general model; there is a lysine residue immediately prior to the mature peptide N-terminus (Fig. 2), so that the peptide bond between propeptide and mature protein can be cleaved by Helicoverpa trypsin.
There are a further five basic residues in the preceding 11 amino acids of the propeptide, giving this region a high probability for cleavage by trypsin. Cleavage of the recombinant protein by trypsin in vitro takes place at or near the N-terminus of the mature peptide produced in vivo, as the estimated molecular masses of pro- and mature polypeptides in vitro correspond to cleavage having taken place in this region.

    The presence of hydrolytic activity towards FAEE in H. armigera larval gut extract and regurgitant shows that activated glutamate carboxypeptidase is present in vivo, a conclusion confirmed by the isolation of a polypeptide with the predicted N-terminal sequence of the mature protein by affinity chromatography on immobilized potato carboxypeptidase inhibitor. The H. armigera carboxypeptidases bound very tightly to this inhibitor, with complete denaturation in 6 M guanidine hydrochloride necessary for elution. This inhibitor forms stable complexes with a wide range of carboxypeptidases, with dissociation constants in the nanomolar range, but without binding being dependent on the cleavage specificity of the enzyme (the C-terminal residue of the inhibitor is glycine). The purified polypeptides are thus likely to represent the range of digestive carboxypeptidases in the gut extract. An unexpected result was that the most abundant carboxypeptidase to be eluted from the PCI affinity column was not the carboxypeptidase A-like enzyme previously characterized [10], but (apparently) the glutamate carboxypeptidase encoded by HaCA42, whereas gut carboxypeptidase activity towards synthetic substrates shows more A-like activity than other types. The kinetic parameters for the recombinant HaCA42 carboxypeptidase are similar to those found for the carboxypeptidase A-like enzyme from this insect [10,20] and therefore the difference in FAPP-specific and FAEE-specific carboxypeptidase activities in vivo cannot be explained by differences in specific activities of the respective enzymes. Possibly the majority of the glutamate carboxypeptidase enzyme remains associated with its inhibitory pro-peptide in vivo in gut contents. Further studies will be required to fully characterize the complement of digestive carboxypeptidases in this insect.

    Acknowledgements

    The authors thank John Gilroy for coaxing some excellent protein sequence data out of an ageing instrument, and Prof. F. X. Aviles for the generous gift of recombinant PCI. This work was funded in part by EC Programme FAIR6-CT98-4239 and by the McKnight Foundation.