Biological functions, genetic and biochemical characterization, and NMR structure determination of the small zinc finger protein HVO_2753 from Haloferax volcanii

The genome of the halophilic archaeon Haloferax volcanii encodes more than 40 one‐domain zinc finger µ‐proteins. Only one of these, HVO_2753, contains four C(P)XCG motifs, suggesting the presence of two zinc binding pockets (ZBPs). Homologs of HVO_2753 are widespread in many euryarchaeota. An in frame deletion mutant of HVO_2753 grew indistinguishably from the wild‐type in several media, but had a severe defect in swarming and in biofilm formation. For further analyses, the protein was produced homologously as well as heterologously in Escherichia coli. HVO_2753 was stable and folded in low salt, in contrast to many other haloarchaeal proteins. Only haloarchaeal HVO_2753 homologs carry a very hydrophilic N terminus, and NMR analysis showed that this region is very flexible and not part of the core structure. Surprisingly, both NMR analysis and a fluorimetric assay revealed that HVO_2753 binds only one zinc ion, despite the presence of two ZBPs. Notably, the analysis of cysteine to alanine mutant proteins by NMR as well as by in vivo complementation revealed that all four C(P)XCG motifs are essential for folding and function. The NMR solution structure of the major conformation of HVO_2753 was solved. Unexpectedly, ZBP1 was found to be composed of C(P)XCG motifs 1 and 3, and ZBP2 of C(P)XCG motifs 2 and 4. There are several indications that ZBP2 is occupied by zinc, in contrast to ZBP1. To our knowledge, this study represents the first in‐depth analysis of a zinc finger µ‐protein in all three domains of life.


Introduction
For a long time, the existence and importance of very small proteins of less than 100 amino acids were severely underestimated. The main reason is that the annotation of protein-coding genes in newly sequenced genomes typically had a lower limit of 100 codons, and, thus, smaller proteins could not be addressed in similarity searches or proteome analyses. In recent years, it has become clear that many very small proteins exist in all three domains of life: archaea, bacteria, and eukaryotes. Besides the lack of annotation, experimental work with very small proteins is challenging, and they are often overlooked by approaches that had been optimized for medium-sized and large proteins; for example, they are lost in standard protein isolation procedures or gel and staining systems.
Two recent experimental developments in particular have enabled the systematic identification and analysis of very small proteins: (1) Ribosomal profiling allows quantifying the positions of all ribosomes on all transcripts of a given species [1–3]. When ribosomes are found on small transcripts with very short open reading frames (sORFs), this is experimental proof for the translation of these sORFs into very small proteins.
(2) Optimization of several steps of sample preparation, mass spectrometry, and bioinformatics analysis of MS spectra transformed proteomics into the new field of peptidomics [4,5].
Several recent reviews give an overview of the current knowledge about very small proteins in eukaryotes [2,6,7] and in prokaryotes [1,8–10]. A generally accepted terminology does not yet exist. 50 amino acids, 70 amino acids, or 100 amino acids are used as the upper limit for very small proteins. Very small proteins are denoted as 'small proteins', 'µ-proteins', 'peptides', 'micropeptides', and 'sORF-encoded peptides/proteins' (SEPs). Here, we will use the term µ-proteins to distinguish these very small proteins from 'small proteins', which might 'only' be smaller than an average protein of 300 amino acids, and from 'peptides', which might be processed from larger precursors.
The number of studies concentrating on µ-proteins is much higher with eukaryotic and bacterial than with archaeal species. However, already more than 10 years ago the 'low molecular weight proteome' of the halophilic archaeon Halobacterium salinarum was characterized [11]. Several steps of the workflow were optimized for the application to small proteins. 380 proteins of less than 20 kDa were identified. The majority (62%) had no assigned function. Among these proteins were 20 proteins that had two C(P)XCG motifs (the P is in brackets, because it is present in only one of the two motifs). It was predicted that the four cysteines complex a zinc ion, and, thus, that these proteins contain zinc fingers [11].
Zinc fingers are very versatile interaction domains, which can mediate interactions to DNA, RNA, proteins including membrane proteins, lipids, and other molecules [12,13]. Zinc finger proteins are involved in many cellular processes, including regulation of transcription, signal transduction, protein degradation, and many others [14]. Therefore, the identification of zinc binding domains does not allow the prediction of putative interaction partners or biological functions of the respective proteins. The zinc ion can be complexed by four cysteines (C4), by three cysteines and one histidine (C3H), or by two cysteines and two histidines (C2H2, the 'classical' zinc finger of eukaryotic transcription factors). The zinc fingers exhibit high structural variability and have been classified into eight structural groups [15].
Zinc finger proteins were first discovered in eukaryotes and were long thought to be confined to this domain [16]. Later, it was discovered that zinc finger proteins are also present in bacteria and archaea. However, a bioinformatic analysis revealed that zinc finger motifs are only present in 1.5% of bacterial small proteins of less than 100 amino acids, but in 8% of archaeal small proteins [17]. Therefore, the importance of small zinc finger proteins seems to be much higher in archaea than in bacteria.
The genome of the halophilic archaeon Haloferax volcanii contains 282 genes for µ-proteins of up to 70 aa, 43 of which contain zinc finger motifs [18]. Single in frame gene deletion mutants were generated for 16 of these genes. Phenotypic analyses under eight different conditions revealed differences from the wild-type for 12 of the 16 mutants, underscoring the high biological importance of zinc finger µ-proteins for haloarchaea. A deletion mutant of a further gene, HVO_2922, which encodes a µ-protein without a zinc finger, grew better than the wild-type under iron limitation [19]. The NMR solution structure of this protein was solved, and it turned out that HVO_2922 is a symmetrical dimer [19]. Solution-state nuclear magnetic resonance (NMR) spectroscopy is the method of choice in the structural study of small proteins. It is particularly powerful for the elucidation of the structure and dynamics of proteins in isolated form and within complexes with their biological targets [20,21]. It should be noted that µ-proteins are not typically structured. A systematic analysis of 27 µ-proteins from six archaeal and bacterial species revealed that only 5 of them were fully structured, while 7 had partial structure (molten globule), 8 were fully unstructured, and 7 could not be produced [22]. For the proteins in the molten globule state, it was predicted that they probably need an interaction partner for the formation of a fixed structure. Four of the five structured proteins were from H. volcanii, and, thus, it seems that haloarchaeal µ-proteins are especially well suited for structural analyses.
Here, we present the in-depth characterization of the µ-protein HVO_2753 from H. volcanii. An in frame deletion mutant was generated and characterized. The protein was produced homologously in H. volcanii and heterologously in E. coli for biochemical and structural analyses. Its zinc content was quantified, and single amino acid mutants were generated to evaluate the importance of the C(P)XCG motifs. NMR analyses were used to determine the secondary structure and the temperature-dependent dynamics of the protein.
Last, this report presents the 3D high-resolution solution structure, which was elucidated using conventional NOE-based restraints derived from 3D NOESY experiments.

Results
Characteristic features, distribution, and phylogeny of the µ-protein HVO_2753
The genome of H. volcanii contains 43 genes encoding µ-proteins with less than 70 amino acids and at least two C(P)XCG motifs, which can safely be predicted to form zinc fingers. Only one of these proteins, HVO_2753, contains four C(P)XCG motifs, and, thus, should contain two zinc fingers. This solitary feature was the reason to choose HVO_2753 for a detailed characterization. The sequence of the 59 amino acid protein is shown in Fig. 1. The four C(P)XCG motifs make up one third of the protein, and they are more or less evenly distributed over the sequence, leaving only short amino acid patches between the motifs and at the N terminus and C terminus. The fraction of charged and hydrophilic amino acids is very high. Table S1 shows the amino acid composition of HVO_2753 in comparison with that of the H. volcanii proteome. Aliphatic amino acids are underrepresented, and the protein does not contain a single tryptophan. At least twofold overrepresented are serine, methionine, asparagine, and, of course, cysteine. The high serine content prompted us to look for possible posttranslational modifications (see below).
BLAST searches with the HVO_2753 sequence revealed that the protein is widely distributed in hundreds of euryarchaeota. Distant homologs were found in only eight species of bacteria, which probably have obtained the gene by lateral transfer from archaea. A HVO_2753 homolog is not present in any eukaryotic species. Figure S1 shows a multiple sequence alignment of HVO_2753 and 20 selected species that represent various genera of halophilic and methanogenic archaea. Overall, the sequence is highly conserved. A few positions are conserved only in haloarchaea and differ in methanogenic archaea, for example, the glutamic acid at position 51 and the aspartic acid at position 54. The replacement of the glycine in the third C(P)XCG motif with an arginine is highly conserved in all species. Notably, all haloarchaea contain an N-terminal highly hydrophilic extension of about 10 amino acids, which is absent in methanogenic archaea. It is tempting to speculate that this hydrophilic extension is important for the solubility of the protein in the high-salt cytoplasm of haloarchaea.
Phylogenetic trees with 1000 bootstrap replications were constructed using both the maximum parsimony approach and the maximum-likelihood approach (Fig. S2A,B). Both trees are very similar to one another as well as to 16S rRNA-based species trees. This indicates that the last common ancestor of halophilic and methanogenic archaea already contained a HVO_2753 homolog, which has been vertically inherited since then.

Genomic localization of the HVO_2753 gene and its expression
Figure 2A gives an overview of the genomic organization of the HVO_2753 gene and its neighborhood. It is very close to the gene HVO_2752 with an intergenic distance of only 2 nt, making co-transcription of the two genes very likely. The results from a recent RNA-Seq study [23] also indicate co-transcription of the two genes (red signal), and a dRNA-Seq study [24] found a single transcription start signal in front of HVO_2753 (green signal). For a further transcript analysis, northern blot analyses were performed. The probes for both genes gave a single signal about 500 nt in size, which fits well with the combined size of the two genes (448 nt) and a 3'-UTR. Together, the results provide experimental proof that HVO_2752 and HVO_2753 are transcribed into one bicistronic transcript. HVO_2752 encodes the beta subunit of translation elongation factor aEF1, indicating that HVO_2753 might also be involved in translation.

Fig. 1. Sequence of the protein HVO_2753. Top: The four C(P)XCG motifs are highlighted in red. Bottom: Acidic amino acids are highlighted in red, basic residues in blue, residues with amide side chains in yellow and residues with hydroxy groups in green. The accession number in the UniProtKB database is D4GWB3_HALVD.

Generation of an in frame deletion mutant and its phenotypic characterization
An in frame deletion mutant was generated to study the potential biological roles of HVO_2753. H. volcanii is highly polyploid; therefore, Southern blot analysis and multiple cycle PCR were used to verify that the deletion mutant is homozygous. In addition, northern blot analysis was used to compare gene expression in the mutant with that of the wild-type. The HVO_2753 probe did not detect any transcript in the mutant, while the HVO_2752 probe detected a transcript with the expected size difference to the bicistronic transcript of the wild-type (Fig. 2B).

Fig. 2 (caption, partial). A screenshot from the Integrated Genome Browser is shown, which includes the genomic numbering in the middle, the annotated protein-coding genes in blue, the results from a RNA-Seq study [23] in red, and the results from a dRNA-Seq study [24].

The phenotypes of the in frame deletion mutant and the wild-type were compared under various conditions. Unexpectedly, the mutant did not exhibit a growth defect, neither in complex medium nor in synthetic medium with glucose as sole carbon and energy source (Fig. S3). This result excludes that HVO_2753 has a critical role in translation, which would lead to a decrease in growth rate. In stark contrast, the lack of HVO_2753 led to a complete loss of swarming (Fig. 3A). In addition, the ability to form a biofilm was severely compromised (Fig. 3B). Therefore, it turned out that HVO_2753 is essential for surface-dependent lifestyles of H. volcanii, whereas it is dispensable in suspension culture. The open reading frame (ORF) of the HVO_2753 gene was introduced into an expression vector to complement the phenotypes of the deletion mutant. The swarming defect of the mutant could be fully complemented, both with an N-terminally and with a C-terminally tagged version of the protein (Fig. 3A). Unexpectedly, the biofilm defect could not be complemented with either of the two tagged protein versions (Fig. 3B), indicating that either the required protein concentration or the interaction partners are different for the two lifestyles.

Homologous production and purification of HVO_2753
A His6-tagged version of HVO_2753 was introduced into the haloarchaeal expression vector pSD1/R1-6 [25], which enables homologous overproduction of the protein. The protein was isolated under native salt conditions using affinity purification followed by size exclusion chromatography. Figure S4 shows that this two-step procedure resulted in the isolation of a very pure protein of the expected size, which was used for further analyses (see below). The yield was 3.5 mg of protein per liter culture after the two-step isolation procedure. Notably, HVO_2753 forms a monomer in solution, in contrast to another recently characterized µ-protein from H. volcanii [19].

Analysis by mass spectrometry
The homologously produced protein was analyzed by both top-down and bottom-up mass spectrometry (MS/MS). Top-down analysis of the protein determined the monoisotopic mass of the protein to be 7249.9940 Da (z = 5 to z = 8) (Fig. 4). This is in excellent agreement with the theoretical monoisotopic mass of 7250.1003 Da (Δmass −14.66 ppm) for HVO_2753 including the His6-tag. In addition, the combined use of both CID and HCD activation types across a range of NCE values resulted in 63% of possible inter-residue cleavages being observed (Fig. S5A). In spite of the high fraction of amino acids that could potentially be modified, the MS analysis revealed that the native protein lacks any posttranslational modifications, at least during exponential growth in complex medium. Further confirmation of the absence of any nontransient modifications was obtained via bottom-up MS analysis. The latter, in concert with the top-down assessment, resulted in 100% sequence coverage of the

Heterologous production in E. coli
For NMR analyses, HVO_2753 was also produced heterologously in Escherichia coli. It was produced as a fusion protein with a GST tag, which enabled affinity purification as a first isolation step. The tag was removed by treatment with TEV protease, and the protein was further purified by size exclusion chromatography (Fig. S6). The two-step procedure resulted in an extremely pure protein, which lacked the N-terminal methionine and instead had four additional amino acids at its N terminus (GAMG), which were numbered from −4 to −1 to ensure an identical numbering of the protein obtained after homologous and heterologous production. The production in E. coli offers the advantage of established procedures for labeling the protein with stable isotopes, which is essential for several NMR approaches.

Structural characterization of the zinc finger protein
The comparison of the HVO_2753 protein heterologously expressed in E. coli and isolated from Haloferax volcanii shows consistent chemical shifts for the two differently expressed proteins and thus indistinguishable structures (Fig. S7). Further structural investigation of the zinc finger protein via heteronuclear multidimensional NMR experiments was conducted with heterologously expressed protein using 13C/15N isotope labeling schemes.
High-salt conditions are typically required for growth of the halophilic archaeon H. volcanii, which is why the influence of 1 M NaCl on the folding motif of HVO_2753 protein was examined. The chemical shifts of 2D 1H-15N BEST-TROSY [26,27] spectra recorded at low (200 µM) and high-salt (1 M) concentrations are consistent, indicating no significant effect of NaCl on the protein folding propensities (Fig. S8). Thus, further NMR experiments were conducted at low-salt conditions preferable for NMR spectroscopy.
HVO_2753 protein adopts a persistent structure rich in β-strand structure elements. At 298 K, 77 amide cross-peaks were observed in the 2D 1H-15N BEST-TROSY spectrum, which corresponds to 32% more signals than initially expected (58 expected signals). The additional peaks indicate the presence of a second conformation. The analysis of the signal intensity defines two conformations with a population ratio of approximately 4:1.

NMR backbone assignment and TALOS prediction
The 1H, 15N, and 13C chemical shift assignment of HVO_2753 was performed manually using standard double- and triple-resonance NMR experiments at 298 K. The backbone assignment of the major conformation was completed to 97%. 23 residues show a splitting pattern of the signals, indicative of two conformations in slow exchange on the NMR time scale (Fig. 5A). The population ratio of the coexisting conformers is temperature-dependent (Figs S9 and S10).
In order to analyze the kinetics of the interconversion between the major and the minor conformation, we performed temperature-dependent 2D 15N ZZ-exchange NMR experiments [28,29]. Cross-signals correlating the two populations could be observed at 338 K, with a population ratio of the two conformations close to 1:1 (Fig. 5C). For three residues (C29, C39, and G58), cross-signals between the major and the minor conformation could be resolved (Fig. S11). The exchange rate k_ex, defined as the sum of the forward and reverse rate constants, was obtained as the average of the values for the residues C29, C39, and G58 and is around 1.2 s−1. With decreasing temperature, the exchange rates increased beyond the detectable range of 0.1 to 10 s−1 in the ZZ-exchange experiment [30].
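The relation between the overall exchange rate and the individual rate constants can be illustrated with a short sketch. It assumes a simple two-state exchange model at equilibrium (an assumption of this example, not a statement about how the rates were fitted in the study), in which detailed balance gives each directional rate constant as the population of the destination state multiplied by k_ex. The function name `two_state_rates` is hypothetical.

```python
def two_state_rates(k_ex, pop_major, pop_minor):
    """Split k_ex into directional rate constants for a two-state exchange.

    Assumes equilibrium between exactly two states, so that
    p_major * k_major_to_minor == p_minor * k_minor_to_major
    and k_ex == k_major_to_minor + k_minor_to_major.
    """
    total = pop_major + pop_minor
    p_major = pop_major / total
    p_minor = pop_minor / total
    k_major_to_minor = p_minor * k_ex  # forward rate constant, s^-1
    k_minor_to_major = p_major * k_ex  # reverse rate constant, s^-1
    return k_major_to_minor, k_minor_to_major


# Values reported in the text for 338 K: k_ex of about 1.2 s^-1, populations close to 1:1.
k_forward, k_reverse = two_state_rates(1.2, 1.0, 1.0)
print(f"forward: {k_forward:.2f} s^-1, reverse: {k_reverse:.2f} s^-1")
```

Under these assumptions, a 1:1 population ratio splits k_ex evenly between the forward and reverse directions, whereas a skewed ratio such as 4:1 would make the return to the major state the faster process.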
The presence of the minor conformation hampers the 3D structure calculation of the protein. Multiple signals from the minor conformation overlap with the peaks from the major conformation, and the automated restricted peak picking procedure by SPARKY 3.114 [31] becomes challenging. Even more crucial is the impeded NOESY-based analysis of two populations, where the signal intensity directly relates to the distance restraints. Therefore, we screened for conditions to observe mainly one conformation.
Since the major conformation of HVO_2753 is favored at low temperature (278 K), the chemical shift assignment and eventually the structure calculation were performed at 278 K (Figs 5B and S9). Chemical shift assignments of the backbone for the well-defined rigid core and the full length were completed to 99% and 97%, respectively, while the overall assignment was achieved to 88% and 86%, respectively. For structural comparison of different conformations, the backbone resonance assignments of both major and minor population at 298 K and of the single major conformation at 278 K were used to predict the secondary structure elements of the protein with TALOS-N software [32] (Fig. 5D). Besides the N-terminal α-helical region, which in the major conformation is present at higher temperature and is absent at 278 K, no significant differences in secondary structure elements between the two conformations could be detected.

Quantification of zinc binding and its influence on protein folding
Typically, zinc ions stabilize the native structure of zinc finger proteins. Since HVO_2753 protein contains two zinc finger motifs in the sequence and NMR structural analysis showed two temperature-dependent conformations of the protein, the questions how zinc ions are involved in the protein folding pathway and how many zinc ions are incorporated in the binding pockets are particularly important. In order to answer these questions, zinc ions taken up by the protein even without extra addition of zinc to the growth medium were first removed from the protein to allow for titration experiments to determine binding stoichiometry. Structural changes in the protein induced by the zinc ions were monitored by 2D 1H-15N HSQC spectra. Complete removal of zinc ions turned out to be nontrivial; it was achieved by incubating the protein with threefold excess of EDTA and dialysis against EDTA at 338 K for four hours. The high temperature required for EDTA to chelate all the zinc ions indicates higher affinity of the protein for zinc ions compared to EDTA (K_D = 10^−16 M) [33]. Eventually, the unfolded metal-free protein was titrated with Zn2+ ions in 0.25 equivalent steps at 298 K (Fig. 6). Addition of 0.25 equivalents of Zn2+ ions already induces the refolding of the protein. Remarkably, both conformations refold simultaneously, appearing with the same population ratio (4:1). The native structure of the protein is completely restored by 0.75 equivalents of Zn2+. Additional zinc ions do not change the fold or the population ratio of the protein any further. This experiment shows that despite the presence of two zinc finger motifs in the HVO_2753 sequence, only one zinc ion is bound to the protein.
The unexpected finding that only one zinc ion is bound despite the two zinc finger motifs was obtained with the protein after heterologous production in E. coli. Because it could not be excluded that E. coli lacks a potential zinc incorporation chaperone, the zinc content was also quantified for the protein after homologous production in H. volcanii. HVO_2753 was purified under native conditions (see above) and dialyzed against a low-salt buffer. Zinc was quantified with the fluorophore ZnAF-2F, which is highly specific for zinc and discriminates against all other ions [34].
Only a very small amount of zinc could be detected, underscoring that HVO_2753 is stable under low-salt conditions, in contrast to many other haloarchaeal proteins. Proteinase K was used to hydrolyze the protein to liberate bound zinc. Quantification with the fluorimetric assay led to the detection of 0.7 equivalents of zinc (Fig. S12), in excellent agreement with the NMR results. Taken together, two very different experimental approaches revealed that HVO_2753 binds only one zinc ion, in contrast to the expectation based on the presence of four C(P)XCG motifs.

Point mutations reveal the essentiality of all four C(P)XCG motifs
The binding of only one zinc ion opened the possibility that only two of the four C(P)XCG motifs might be required. To test this possibility, the first cysteine in each of the four motifs was replaced by alanine. The mutant proteins were produced in the HVO_2753 deletion mutant, and the phenotype of the resulting strains was tested. It turned out that none of the four mutant proteins could complement the deletion mutant, and all four strains were totally devoid of swarming (Fig.  S13). It was attempted to purify the four homologously produced mutant proteins and quantify their zinc contents. However, it turned out that the protein levels were very low, indicating that all four mutant proteins were misfolded in vivo and rapidly degraded.
Two further amino acids were arbitrarily chosen (Q34 and Y49), and single amino acid mutants were generated. These two mutant proteins also could not complement the HVO_2753 deletion mutant (Fig. S13). This result shows that residues outside of the C(P)XCG motifs are also essential for function, and, in addition, that single amino acid replacements are suitable for a future in-depth scanning mutagenesis approach for HVO_2753 characterization.
We also introduced mutations into the ORF of the HVO_2753 gene that had been inserted into an expression vector for E. coli. In contrast to the homologous production described above, heterologous production and purification turned out to be feasible. However, NMR analysis revealed that all mutations lead to a severe loss of structure and stability, and, instead, the mutant proteins adopted molten globule-like conformations (Fig. S14). Taken together, characterization of proteins with single amino acid replacements after homologous as well as heterologous production clearly showed that all four C(P)XCG motifs are essential for protein stability, folding, and functionality, despite the fact that only one zinc ion is bound.

Dynamic studies of HVO_2753
We further characterized the dynamics of HVO_2753 by conducting relaxation experiments and analysis of intramolecular hydrogen bond formation (Figs 7, S15 and Table S4). Dynamics studied by heteronuclear 15N relaxation measurements provide values of T1, T2, and {1H}-15N het-NOE relaxation parameters, which are then used to determine the Lipari-Szabo order parameter S2 and the experimental rotational correlation time (τc) by the TENSOR2 [35] software.
The N terminus, which includes the first 13 residues, shows low het-NOEs and order parameters, indicating high flexibility compared to the more rigid core of the protein (Fig. 7). The experimental rotational correlation time (τc) of the rigid core is 5.1 ns, which is characteristic of a protein of this size.
Intramolecular hydrogen bond formation was monitored by temperature-dependent amide proton chemical shift perturbations. A series of 2D 1H-15N BEST-TROSY spectra was recorded for the temperature range from 278 to 348 K in 10 K increments. The temperature coefficients of the amide protons were calculated from a linear fit of the chemical shift perturbation as a function of the temperature [36]. Temperature coefficients with values more negative than −4.5 ppb/K are characteristic of rapidly exchangeable and therefore not hydrogen-bonded amide protons, while protons with a temperature coefficient higher than −4.5 ppb/K are involved in hydrogen bonding [36]. The temperature coefficients of the N terminus of the HVO_2753 protein are consistent with the relaxation data, showing a flexible unstructured part of the protein (Fig. S15).
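The temperature coefficient analysis described above amounts to a least-squares line through chemical shift versus temperature, followed by a comparison of the slope with the −4.5 ppb/K cutoff. The following is a minimal sketch of that calculation. The function names and the chemical shift values are hypothetical and serve only as illustration; the temperature series (278 to 348 K in 10 K steps) and the cutoff are taken from the text.

```python
def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of amide 1H shift vs. temperature, in ppb/K."""
    n = len(temps_k)
    mean_t = sum(temps_k) / n
    mean_s = sum(shifts_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_k)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_k, shifts_ppm))
    return 1000.0 * sxy / sxx  # ppm/K converted to ppb/K


def is_hydrogen_bonded(coefficient_ppb_per_k, cutoff=-4.5):
    """Coefficients less negative than the cutoff suggest a hydrogen-bonded amide."""
    return coefficient_ppb_per_k > cutoff


temps = list(range(278, 349, 10))  # 278, 288, ..., 348 K

# Hypothetical amide proton shifts (ppm) for two made-up residues.
protected = [8.50 - 0.002 * (t - 278) for t in temps]  # shallow slope
exposed = [8.30 - 0.007 * (t - 278) for t in temps]    # steep slope

for label, shifts in (("protected", protected), ("exposed", exposed)):
    coeff = temperature_coefficient(temps, shifts)
    print(label, round(coeff, 2), "ppb/K, H-bonded:", is_hydrogen_bonded(coeff))
```

In practice, residues whose shifts do not follow a straight line (for example because of a conformational change within the temperature range) would need to be inspected separately, since a single slope does not describe them well.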

3D structure determination
The structure calculation of HVO_2753 protein was performed by fully automated NOESY cross-peak assignment using CYANA [37–39]. NOE cross-peaks were obtained by restricted peak picking with Sparky 3.114 [31]. 32 pairs of φ/ψ backbone dihedral angles obtained from TALOS-N software, 42 3J(HN,Hα) coupling constants, and 8 hydrogen bond restraints were used for structure calculation. The 20 NMR structures with the lowest energy represent the 3D NMR solution structure of HVO_2753 (Fig. 8). The structural statistics are shown in Table 1. The atomic coordinates have been deposited in the Protein Data Bank (PDB ID: 6YDH). HVO_2753 consists of a well-folded rigid core and an unstructured flexible N terminus. The unstructured part comprises the first 13 amino acids. The core domain shows two twisted antiparallel β-sheets and one helix. The first β-sheet is composed of two β-strands including residues Q34 to R38 and A25 to C29, while the second one includes amino acids Y49 to C51 and F56 to G58. The α-helix is formed by the amino acids C42 to K44 and is extended at each end by two residues forming a 3(10)-helix.
According to the calculated structure, two zinc binding pockets (ZBP) are possible. The first one (ZBP1) involves binding motifs 1 (CVSCG) and 3 (CSKCR), while the second binding pocket (ZBP2) consists of binding motifs 2 (CPDCG) and 4 (CPDCG). Regarding cis/trans-peptide bond conformation of the proline residues using the experimental chemical shift assignment and the prediction tool Promega [40,41], prolines P30 and P52 in ZBP2 adopt a trans-conformation, while the C-terminal P59 adopts a cis-conformation (Table S3).
In order to identify the occupied binding pocket, CYANA structure calculation was performed separately with manually defined ZBP1 and ZBP2, which were further compared with a structure without a strictly defined binding pocket (Fig. S16). Each binding pocket was defined by setting the additional six upper distance restraints between four coordinating sulfur atoms from the cysteine residues to 4Å, thus forming a symmetrical tetrahedral ZBP. No significant differences in the RMSD value could be detected between the structures without restricted ZBP and with defined ZBP2 (about 0.25), while the structure with defined ZBP1 shows a larger RMSD value (0.33). Moreover, the structure calculation of the last one was accompanied by Ramachandran plot outliers in all 20 structures. The affected residue is C15, which is directly involved in the coordination of the zinc ion in this structure. We take this as evidence that ZBP2 carries the zinc cation but not ZBP1.
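The six upper distance restraints per pocket follow from simple combinatorics: four coordinating sulfur atoms give 4·3/2 = 6 unique pairs. The sketch below enumerates those pairs. It is an illustration only and does not reproduce the CYANA restraint file format; the function name is hypothetical. The residue numbers used for ZBP2 are an assumption of this example: C29 and C51 are named in the text, while 32 and 54 are inferred from the spacing of the CPDCG motifs. `SG` is the conventional PDB atom name for the cysteine sulfur.

```python
from itertools import combinations


def tetrahedral_site_restraints(cys_residues, upper_limit=4.0):
    """Return (res_i, atom_i, res_j, atom_j, limit) for every S-S pair of one site.

    Expects exactly four cysteine residue numbers; four ligands yield six pairs.
    """
    if len(cys_residues) != 4:
        raise ValueError("a tetrahedral zinc site needs exactly four cysteines")
    return [
        (res_i, "SG", res_j, "SG", upper_limit)
        for res_i, res_j in combinations(sorted(cys_residues), 2)
    ]


# Assumed ZBP2 ligands (motifs 2 and 4); residues 32 and 54 are inferred, not stated.
zbp2 = tetrahedral_site_restraints([29, 32, 51, 54])
for restraint in zbp2:
    print(restraint)
print(len(zbp2), "restraints")
```

Using one common upper limit for all six pairs is what makes the defined pocket symmetrical, as described in the text.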
Further indication that ZBP2 contains the metal ion provides 13 C NMR chemical shift analysis of cysteines. 13 C C α and C β NMR chemical shifts are sensitive to the redox state of the cysteines and can, therefore, be used to predict the reduced and oxidized form of cysteines. Furthermore, chemical shift analysis allows distinguishing between reduced zinc coordinating and noncoordinating cysteines [42]. According to the chemical shift assignment of the cysteines (Fig.  S17 and Table S2), all cysteine residues except for C42 are in the zinc-ligated region. Cysteine 42 shows characteristic downfield and upfield shifts of the C α and C β atoms, respectively, indicating the reduced metal nonbound form and, therefore, a free binding pocket.
In order to validate the structure and in particular the binding pockets, CYANA structure calculations were performed without the dihedral angle restraints derived from the TALOS-N [32] predictions, in both cases, with defined ZBP1 and with defined ZBP2. From the resulting bundle of the 20 best structures, the backbone chemical shifts were predicted by SHIFTX2 [43] and compared with the experimental set of resonances. The analysis was performed by calculating the root mean square error (RMSE) and the mean square error (MSE) for both structures. The RMSE values of the backbone chemical shifts for the structure with manually defined ZBP1 do not differ significantly from those of the structure with fixed ZBP2 (Fig. S18). However, the MSE values of the binding pocket region show overall higher errors for the structure with fixed ZBP1 than for the structure with fixed ZBP2 (Fig. S19). For instance, the MSE of the fixed ZBP1 is 22% higher for the CB resonances and higher by a factor of two for the HA chemical shifts. Taken together, the results of the MSE structure validation indicate that ZBP2 is preferred and, therefore, most likely occupied by the zinc ion.

Analysis of the HVO_2753 minor conformation
In order to investigate the nature of the minor conformation, we performed a chemical shift perturbation (CSP) analysis and mapped the affected residues onto the calculated solution structure of the rigid core domain (Fig. S20). The residues with significant CSPs are located predominantly in the binding motifs. However, according to the TALOS prediction, no significant differences in the secondary structure elements between the major and minor conformations could be detected (Fig. 5D). This indicates that the minor population is characterized by a slight movement of the binding motifs that does not affect the overall structure. Furthermore, detailed analysis of the relaxation data revealed that several residues exhibit a faster transverse relaxation rate R₂ than the average value of the rigid core of the protein (9.5 ± 0.2 s⁻¹). This contribution to the transverse relaxation indicates the presence of chemical exchange. Almost all amino acids with R₂ rates higher than 13 s⁻¹ were observed to adopt the second, minor conformation at higher temperature. Residues K44 and R38 show the highest R₂ values, with 18.4 ± 0.4 s⁻¹ and 15.3 ± 0.3 s⁻¹, respectively (Fig. S21). Both are involved in the formation of the minor conformation. Interestingly, among those residues are two cysteines, C29 and C51, which are part of zinc binding pocket 2 (ZBP2) and are the first amino acids in the CPDCG zinc binding motifs. These residues were found to be essential for protein stability and function.

Analysis of the HVO_2753 binding pockets
The analysis of the intramolecular interactions of the cysteines revealed that each residue is involved in the formation of several long-distance NOEs. The recording of a 3D ¹H ¹H ¹⁵N NOESY-HSQC experiment at a  MHz spectrometer with a mixing time of 120 ms allowed us to observe cross-peaks corresponding to distances of up to 5.7 Å (Figs S22 and S23). Remarkably, residues C12 and C39, which were found to be essential for protein folding despite ZBP1 being unoccupied, show two intramolecular distances each. These interactions are important for the tertiary structure and are directly involved in structure formation of the protein.

Discussion
In this contribution, we describe an in-depth analysis of the µ-protein HVO_2753 from H. volcanii. Until now, only very few archaeal µ-proteins have been characterized. A very early analysis of the low molecular weight proteome led to the identification of 370 small proteins of less than 20 kDa [11]. However, only two of these proteins were further characterized. A 60 aa µ-protein with an unusual C3H zinc finger was found to be important for the expression of the bacterioopsin (bop) gene, and, accordingly, it was named Brz (bacteriorhodopsin-regulating zinc finger protein) [17]. A 55 aa µ-protein was named 'bacteriorhodopsin-regulating basic protein' (Brb), because it was found that Brb cooperates with Brz and another protein, Bat, in the transcriptional regulation of the bacterioopsin gene [44].
Optimization of mass spectrometric techniques was used to systematically enhance the identification of µ-proteins of Methanosarcina mazei Gö1 together with the analysis of the standard proteome [45]. Two different approaches led, in summary, to the simultaneous identification of 28 µ-proteins and nearly 1900 standard proteins. In another study, LC-MSMS was used to verify the presence of µ-proteins in M. mazei Gö1 [46]. The concentrations of two µ-proteins, Sp36 and Sp41, were found to be elevated during nitrogen starvation, indicating that these µ-proteins are involved in (the regulation of) nitrogen metabolism.
The annotation of the microproteome of H. volcanii, the species used in this study, is more comprehensive than that of most other archaeal species, because the results obtained with H. salinarum discussed above could be transferred to all µ-proteins present in both species. The annotation contains 282 proteins with a length of up to 70 aa and 575 proteins with a length of up to 100 aa. Two recent studies characterized several of these µ-proteins: 1) 16 single gene deletion mutants of zinc finger proteins were generated and their phenotypes were characterized [18]. For 12 of the mutants, differences from the wild-type were observed. 2) The NMR structure of one µ-protein was solved, which was found to be a dimer in solution [19].
It should be noted that the characterization of HVO_2753 presented here combines genetic, biochemical, and structural approaches and is more comprehensive than the studies mentioned above. The protein was structured and stable in low salt, in contrast to typical haloarchaeal proteins, which require high-salt concentrations for stability [47]. However, low-salt stability has been described for several other proteins from H. volcanii, for example, the µ-protein HVO_2922 [19] and tRNase Z [48], which, surprisingly, was found to be active only at low salt.
The HVO_2753 deletion mutant had no growth defect in complex or synthetic media, indicating that HVO_2753 is not important for central processes like transcription or translation, in contrast to expectations based on its co-transcription with the translation elongation factor HVO_2752. Interestingly, a quantitative proteomic study has revealed that the differential protein level patterns of the two proteins under different conditions are not the same, but very dissimilar [49]. The HVO_2753 level is severely downregulated under several different stress conditions, while, in contrast, the HVO_2752 level is upregulated at low and high temperatures. This indicates differential regulation of translation or/and protein stability of HVO_2753 and HVO_2752, which are encoded in one operon.
The HVO_2753 deletion mutant revealed that the protein is essential for swarming and biofilm formation. The swarming phenotype could be complemented with the wild-type protein, but with none of four C-A mutants, which covered all four C(P)XCG motifs. In addition, protein levels of the four mutant proteins were very low, indicating that they are misfolded and unstable in vivo. In excellent agreement with these results, NMR analyses revealed that none of the four C-A mutant proteins was structured; instead, all were in the molten globule state.
The essentiality of all four C(P)XCG motifs was expected as long as it was thought that HVO_2753 binds two zinc ions and has two zinc fingers. However, very unexpectedly, two very different methods revealed that it binds only one zinc ion, in spite of the four motifs. Both methods found about 0.75 equivalents of zinc, which is somewhat lower than one. The difference is most probably due to a slight overestimation of the protein concentration. The protein does not contain any tryptophan, making a spectroscopic protein quantification impossible due to the low extinction coefficient. Therefore, standard protein quantification assays were applied, which are a bit less accurate. In any case, it can be safely excluded that HVO_2753 binds two zinc ions.
The NMR solution structure of HVO_2753 was solved and revealed several interesting features: 1) The  [50]. For Fdx, it could be experimentally proven that this extra domain is essential for the halophilic character of the protein, including stability and formation of the Fe-S cluster.
2) The four C(P)XCG motifs form two zinc binding pockets (ZBPs), and at first sight, the 3D localization of the eight cysteines did not explain why only one is able to bind zinc.
3) The two ZBPs are not formed by C(P)XCG motifs 1 + 2 and 3 + 4, as expected, but by motifs 1 + 3 and 2 + 4. The presence of two ZBPs with differential zinc binding capacity is highly unusual; in fact, to our knowledge, HVO_2753 represents the first such case. For the following reasons, we propose that ZBP2 is occupied by a zinc ion, while ZBP1 is empty: (i) In proteins with two C(P)XCG motifs, it is very highly conserved that one of the motifs contains a proline residue at the second position, and that both motifs contain a glycine residue at the fifth position. However, ZBP1 is formed by the motifs CVSCG and CSKCR, neither of which includes a proline. In addition, motif 3 does not have the highly conserved glycine at the fifth position, but an arginine. (ii) When additional distance restraints were included in the NMR structure calculation to simulate zinc ion binding, one of the cysteines of ZBP1 (C15) became a Ramachandran plot outlier, while this was not the case for ZBP2. Furthermore, adding the restraints that model an occupied ZBP1 to the structure calculation resulted in a larger RMSD value than for an occupied ZBP2. (iii) The chemical shift analysis showed that C42, which is part of ZBP1, is in the reduced form, indicating that ZBP1 is not occupied by zinc. (iv) In ZBP2, the zinc ion binding site is formed by two hairpins between antiparallel beta-strands (Fig. 8). This zinc binding pocket is similar to the zinc ribbon fold group, which has been described earlier for zinc finger motifs in large proteins [15].
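The sequence-conservation argument can be checked mechanically. The following minimal Python sketch applies the rule stated in the text (at least one motif with proline at position 2, both motifs with glycine at position 5) to the two pockets. The motif sequences and pocket composition are taken from the text; the function and variable names are illustrative assumptions.

```python
# Motif sequences and pocket composition as given in the text.
MOTIFS = {1: "CVSCG", 2: "CPDCG", 3: "CSKCR", 4: "CPDCG"}
POCKETS = {"ZBP1": (1, 3), "ZBP2": (2, 4)}


def has_proline_at_2(motif: str) -> bool:
    """True if the second residue of the five-residue motif is proline."""
    return motif[1] == "P"


def has_glycine_at_5(motif: str) -> bool:
    """True if the fifth residue of the five-residue motif is glycine."""
    return motif[4] == "G"


def matches_consensus(pocket: str) -> bool:
    """Rule from the text: one motif carries P at position 2, both carry G at position 5."""
    motifs = [MOTIFS[i] for i in POCKETS[pocket]]
    return any(has_proline_at_2(m) for m in motifs) and all(
        has_glycine_at_5(m) for m in motifs
    )


if __name__ == "__main__":
    for name in POCKETS:
        print(name, "matches consensus:", matches_consensus(name))
```

Under this rule, only ZBP2 satisfies the conserved pattern, in line with the first of the reasons listed above.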
While the structures of various isolated zinc finger motifs from larger proteins have been reported, which represent very small fractions of the complete proteins, the structures of only very few complete small zinc binding proteins have been solved. Here, we present the structure of an especially interesting and to date unique protein, which contains two ZBPs, only one of which binds a zinc ion. However, the geometry of the other ZBP, which lacks a zinc ion, is also essential for the folding, stability, and functionality of the protein, because all single C-A mutations were detrimental and the cysteines are involved in long-range interaction networks. Future work will unravel interaction partners and the molecular mechanism of HVO_2753, and it will also include further analyses of the influence of single amino acids on the formation of a folded and functional structure.

Databases and bioinformatic analyses
Homologous sequences to HVO_2753 were identified in the NCBI nonredundant protein sequence database using BlastP. The top 500 hits were retrieved, and the phylogenetic distribution was analyzed.
Multiple sequence alignments of selected sequences were generated with MEGA-X [51] using the algorithm MUSCLE. The alignments were visualized using Jalview, which enables coloring according to different criteria.
Phylogenetic trees were generated with MEGA-X [51] using the maximum parsimony and the maximum-likelihood approach, respectively. In both cases, 1000 bootstrap replicates were calculated, and the percentage of replicates in which each node was recovered is shown in the resulting consensus tree.
The Integrated Genome Browser [52] was used to visualize the manually curated genome annotation [53] as well as the results of an RNA-Seq study [23] and a dRNA-Seq study [24]. Gene sequences from the H. volcanii genome were retrieved from the database HaloLex [54]. The amino acid composition of the H. volcanii proteome was provided by Friedhelm Pfeiffer (MPI for Biochemistry, Martinsried, Germany).

Strains, media, and culture conditions
The H. volcanii strain H26 [55] was used as a wild-type in this study. Generation of deletion and complementation strains is described below. The strains were grown in complex medium and in synthetic media with different carbon and energy sources as described [56,57].
The E. coli strain XL1-Blue MRF' (Agilent Technologies, Waldbronn, Germany) was used for cloning, and it was grown in standard media [58].

Northern blot analysis
Northern blot analyses were essentially performed as described [59]. In short, H. volcanii cultures were grown in complex medium to mid-exponential growth phase. Cells were harvested, and total RNA was isolated. 15 µg of total RNA from each sample was separated on a denaturing formaldehyde agarose gel. The RNA was transferred by capillary blotting onto a nylon membrane and fixed by UV-cross-linking. Digoxigenin-labeled single-stranded DNA probes were generated by PCR using DIG-dUTP and a dNTP mix with a reduced dTTP concentration. The primers for probe generation are listed in Table S5. Hybridization was performed overnight at 50°C. The membrane was washed twice with 2 × SSC/0.1% SDS and twice with 1 × SSC/0.1% SDS. The probes were detected using an anti-DIG antibody coupled to alkaline phosphatase and the chemiluminescence substrate CDP star according to the instructions of the manufacturer (Roche, Mannheim, Germany). The signals were visualized with X-ray films (GE Healthcare), and the sizes were analyzed with the marker 'RiboRuler Low Range RNA Ladder' (Thermo Fisher Scientific).

Generation of an in frame deletion mutant and its complementation
The in frame deletion mutant was generated by the so-called Pop-In-Pop-Out method as described previously [55,60]. It should be noted that in frame deletions are important for genes that are situated in an operon, because polar effects on the expression of other genes can be avoided. The primers for the generation of the mutant are listed in Table S5. H. volcanii is highly polyploid with about 30 copies of the major chromosome [61], and thus, it is possible that not all copies of the wild-type chromosome are replaced by the deletion copies and that heterozygous clones arise. Therefore, the homozygosity of the deletion mutant was verified both by Southern blot analysis and by multicycle PCR with isolated DNA as template.
For the complementation of the mutant, the HVO_2753 gene was cloned into the expression vector pTA929 [62]. The vector contains a tryptophan-inducible promoter and enables modulation of the expression level. The primers that were used to amplify the gene with genomic DNA as template are listed in Table S5. The sequence of the resulting plasmid was verified by sequencing, and it was introduced into the deletion mutant by PEG-mediated transformation. As a negative control, the deletion mutant was also transformed with the empty vector.

Analyses of growth, swarming, and biofilm formation
Growth analyses were performed in microtiter plates as described previously [57]. For each condition, 150 µL medium was inoculated to an OD 600 of 0.05 from a preculture that had been grown under the respective condition. The cultures were grown on a Heidolph Titramax 1000 rotary shaker (Heidolph, Schwalbach, Germany) with 1100 rpm at 42°C. The OD 600 was determined using the microtiter plate photometer Spectramax 340 (Molecular Devices, Ismaning, Germany). Three biological replicates were performed, and average values and standard deviations were calculated.
The swarming assay was performed as described previously [18]. 5 mL of the respective medium with 0.3% (w/v) agar was filled into each well of six-well plates (Sarstedt, Nümbrecht, Germany) one day prior to start of the assay. The cells from precultures in the respective medium were harvested by centrifugation and resuspended in basal salts (medium without a carbon source) to an OD 600 of 20. 2 µL of the cell suspension was injected deeply into the swarm agar in the center of the wells, because H. volcanii swarms only at low oxygen concentrations. The plates were incubated at 42°C in a Styrofoam box together with a glass of water to inhibit drying. Three biological replicates were performed, and averages and standard deviations were calculated.
Biofilm formation was quantified in 96-well plates. 150 µL of cell suspensions was used for each well, and the plates were incubated at 42°C without shaking for 24 h and 48 h. The principle of the method is to remove unattached cells, fix and stain cells in the biofilm, wash the biofilm, remove the bound stain, and quantify it spectroscopically. A detailed description has recently been published [18].

Homologous production and native purification of HVO_2753
For homologous production of the protein, the HVO_2753 gene was cloned into the expression vector pSD1/R1-6 [25], which had been modified by the addition of an NdeI site. The vector contains a very strong promoter, which leads to a 30-fold higher trimethoprim resistance than a median promoter when it controls the expression of the dhfr (dihydrofolate reductase) gene [25]. The primers used to amplify the gene with genomic DNA as template are listed in Table S5. The primers added an NdeI site, the sequence for an N-terminal hexahistidine tag, and a KpnI site. The resulting plasmid was verified by sequencing and introduced into the ΔHVO_2753 deletion mutant by transformation. A preculture of the ΔHVO_2753 pSD1_R1/6_HVO_2753_NHis production strain was grown overnight in 30 ml complex medium containing 0.4 µg·mL⁻¹ novobiocin. 1.5 ml of the preculture at mid-exponential growth phase was used to inoculate 500 ml of complex medium. The production culture was grown overnight, and the cells were harvested by centrifugation (6500 × g, 30 min, 4°C). The pellet was washed with 20 ml of basal salts and suspended in 5 ml binding buffer (2.1 M NaCl, 20 mM HEPES pH 7.5, 20 mM imidazole, 1 mM PMSF). The cells were lysed by sonication on ice. The lysate was clarified by centrifugation at 16000 × g for 60 min at 4°C. The supernatant was loaded on a Chelating Sepharose® Fast Flow (GE Healthcare) gravity column. The resin was washed with 20 ml of binding buffer to remove nearly all unspecifically bound proteins. Specifically binding proteins were eluted with 4 ml binding buffer containing 300 mM imidazole. The subsequent size exclusion chromatography was performed using a Superose 6 FPLC column (10/300 GL, GE Healthcare). The mobile phase was 2.1 M NaCl, 20 mM HEPES pH 7.5, and the flow rate 0.3 ml.
The proteins that were used to generate a standard curve were aprotinin (6.5 kDa), ribonuclease A (13.7 kDa), ovalbumin (44.3 kDa), γ-globulin (150 kDa), and thyroglobulin (670 kDa).
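A size exclusion standard curve of this kind is commonly evaluated by fitting log10(molecular mass) against elution volume. The sketch below illustrates that approach under stated assumptions: the molecular masses are those listed above, whereas the elution volumes are hypothetical placeholder values (not measured data), and the function names are illustrative.

```python
import math

# Molecular masses (kDa) of the standards, as listed in the text.
STANDARDS_KDA = {
    "aprotinin": 6.5,
    "ribonuclease A": 13.7,
    "ovalbumin": 44.3,
    "gamma-globulin": 150.0,
    "thyroglobulin": 670.0,
}

# HYPOTHETICAL elution volumes (mL), for illustration only; not measured data.
ELUTION_ML = {
    "aprotinin": 20.0,
    "ribonuclease A": 19.0,
    "ovalbumin": 17.5,
    "gamma-globulin": 16.0,
    "thyroglobulin": 13.5,
}


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_calibration(masses, volumes):
    """Fit log10(mass in kDa) as a linear function of elution volume."""
    names = sorted(masses)
    xs = [volumes[name] for name in names]
    ys = [math.log10(masses[name]) for name in names]
    return linear_fit(xs, ys)


def estimate_mass_kda(volume_ml, slope, intercept):
    """Apparent molecular mass (kDa) for a given elution volume."""
    return 10 ** (slope * volume_ml + intercept)


if __name__ == "__main__":
    slope, intercept = fit_calibration(STANDARDS_KDA, ELUTION_ML)
    print(f"apparent mass at 18.0 mL: {estimate_mass_kda(18.0, slope, intercept):.1f} kDa")
```

With real elution volumes substituted for the placeholders, the same fit yields the apparent native mass of the purified protein.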

Fluorimetric zinc binding assay
The zinc content of the protein was quantified using the fluorophore ZnAF-2F, which is highly specific for zinc ions and discriminates against all other bivalent metal ions [34]. HVO_2753 was purified under native conditions as described above. The protein concentration was quantified using the BCA protein assay (https://www.thermofischer.com) with BSA as standard. 1 µM HVO_2753 was dialyzed against 25 mM NaCl and 20 mM HEPES at pH 7.5, because haloarchaeal proteins typically denature in low salt. As this treatment turned out to be insufficient for HVO_2753, the protein was treated with 100 µg·mL⁻¹ proteinase K at 37°C overnight. Measurements were performed using a microtiter plate fluorimeter with 492 nm as excitation wavelength and 517 nm for detection. 3 µM ZnAF-2F was used, and a standard curve with different concentrations of ZnCl₂ was generated alongside the HVO_2753 samples. Four technical replicates were used for each of the seven biological replicates, and average values and standard deviations were calculated.
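The conversion from fluorescence readings to zinc equivalents per protein can be sketched as follows. This is a minimal illustration assuming a linear ZnCl₂ standard curve; the standard concentrations and fluorescence signals are hypothetical values, only the protein concentration of 1 µM is taken from the text, and all names are illustrative.

```python
PROTEIN_UM = 1.0  # protein concentration used in the assay (from the text)

# HYPOTHETICAL standard curve: ZnCl2 concentration (µM) versus fluorescence (a.u.).
STANDARD_ZN_UM = [0.0, 0.5, 1.0, 1.5, 2.0]
STANDARD_SIGNAL = [100.0, 200.0, 300.0, 400.0, 500.0]


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zinc_concentration_um(signal, slope, intercept):
    """Invert the standard curve to obtain the zinc concentration (µM)."""
    return (signal - intercept) / slope


def zinc_equivalents(signal, protein_um=PROTEIN_UM):
    """Zinc ions per protein molecule for one fluorescence reading."""
    slope, intercept = linear_fit(STANDARD_ZN_UM, STANDARD_SIGNAL)
    return zinc_concentration_um(signal, slope, intercept) / protein_um


if __name__ == "__main__":
    # A hypothetical sample reading of 250 a.u.
    print(f"{zinc_equivalents(250.0):.2f} zinc equivalents")
```

Because the result scales inversely with the protein concentration, any overestimation of the protein concentration lowers the apparent zinc equivalents proportionally.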

Direct infusion top-down MS analysis of HVO_2753
Direct infusion analysis of the intact protein was performed on the LTQ Orbitrap Velos MS (Thermo Scientific, Germany) using the heated electrospray ionization source (HESI). Parameters were adjusted until a stable signal was acquired (voltage of approx. 4 kV, flow rates between 5 and 15 μL·min⁻¹). Both full scan and MS/MS analyses were performed at a target resolution of 60,000. Selective MS/MS fragmentation of various dominant multiply charged species was performed across a range of normalized collision energies (NCE) utilizing both CID (NCE: 20, 25, 30) and HCD (NCE: 17, 25, 30) activation types. MS raw data files were opened in Xcalibur Qual Browser (version 4.2.28.14), and summed regions for the various precursor ions at the specified NCE were processed (deconvoluted and deisotoped) using the Xtract algorithm. Peak lists for the various summed spectra were exported, and combined lists were then assessed using the ProSight Lite software package (version 1.4).

Bottom-up analysis by LC-MS
A 100 μg aliquot of the sample was reduced (10 mM dithiothreitol, 56°C, 1 h) and alkylated (55 mM iodoacetamide, 20°C, 30 min). A 40 μg aliquot was digested for 3 hours with 1 μg of sequencing grade trypsin in 100 mM triethylammonium bicarbonate buffer (pH 8). Following digestion, the peptides were cleaned via solid phase extraction on a C18 cartridge (33 mg) (Waters, Germany), with a two-step elution off the cartridge using 100 μL of 60% acetonitrile (ACN) + 0.1% trifluoroacetic acid (TFA) and 100 μL of 100% ACN. The peptides were dried via vacuum centrifugation (2 h, 45°C) before resuspension in 50 μL of HPLC loading buffer (3% ACN + 0.1% TFA). Aliquots of the reduced, the reduced and alkylated, and the unprocessed protein were also collected during the sample preparation and diluted to a concentration of 0.1 μg·μL⁻¹ in 10% ACN + 0.1% formic acid for direct injection analysis.
Chromatographic separation was performed on a Dionex U3000 nanoHPLC system (Thermo Scientific, Germany) equipped with an Acclaim PepMap 100 column (2 μm particle size, 75 μm × 150 mm) coupled online to a mass spectrometer. Eluent A: 0.05% formic acid (FA), eluent B: 80% ACN + 0.05% FA. Separation was performed over a programmed 60-minute run. Initial chromatographic conditions were 5% B for 2 minutes, followed by a linear gradient from 5% to 50% B over 30 minutes, a 5-minute increase to 95% B, and 8 minutes at 95% B. Following this, an inter-run equilibration of the column was achieved by 15 minutes at 5% B. A flow rate of 300 nl·min⁻¹ was used, and 1 μL of sample was injected per run.
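The gradient program can be written down as a piecewise-linear table, which also confirms that the segments add up to the 60-minute run. The following Python sketch encodes the segments given in the text; the instantaneous step back to 5% B at 45 minutes is an assumption, and the names are illustrative.

```python
# (start_min, end_min, start_percent_B, end_percent_B), following the text.
GRADIENT = [
    (0, 2, 5, 5),      # initial conditions
    (2, 32, 5, 50),    # linear gradient over 30 min
    (32, 37, 50, 95),  # 5-min increase to 95% B
    (37, 45, 95, 95),  # 8-min hold at 95% B
    (45, 60, 5, 5),    # 15-min re-equilibration (step change assumed instantaneous)
]


def percent_b(t_min):
    """Programmed percentage of eluent B at time t_min (minutes)."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            return b_start + (b_end - b_start) * (t_min - start) / (end - start)
    raise ValueError("time outside the programmed run")


def total_run_minutes():
    """Sum of all segment durations."""
    return sum(end - start for start, end, _, _ in GRADIENT)


if __name__ == "__main__":
    for t in (0, 17, 32, 40, 50):
        print(f"{t:2d} min: {percent_b(t):.1f}% B")
    print("total run:", total_run_minutes(), "min")
```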
Acquisition of data following separation was performed on the LTQ Orbitrap Velos mass spectrometer (Thermo Scientific, Germany) utilizing a combination of CID and HCD activation. A full scan MS acquisition was performed (resolution 60,000) with subsequent data-dependent MS/MS of the top 5 most intense ions fragmented using both CID (NCE 35) and HCD (NCE 35) (resolution 7,500); dynamic exclusion was enabled (30-s duration).
MS raw data files were searched against a protein FASTA database comprising the complete UniProt Haloferax volcanii database (accessed from UniProt on 7 November 2018) with the target HVO_2753_Histagged protein appended; in addition, the common laboratory contaminant database was appended. The searches were performed using the Proteome Discoverer software package (version 2.2.0.388) and the SequestHT search algorithm. Precursor mass tolerance was set to 7 ppm, with a fragment mass tolerance of 0.02 Da. Searches were performed with semitrypsin specificity, a maximum of 3 missed cleavages, variable modifications of methionine oxidation and asparagine and glutamine deamidation, and a fixed modification of cysteine carbamidomethylation. Strict parsimony criteria were applied (high stringency). A protein level FDR < 5% was applied, and at least two high confidence peptides were required, of which at least one had to be unique. Peptides were required to be of high confidence with an FDR < 1%.
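The two mass tolerances use different units: the precursor tolerance is relative (ppm), whereas the fragment tolerance is absolute (Da). The following Python sketch illustrates how such tolerances are typically evaluated; it is not the SequestHT implementation, and the function names and example m/z values are illustrative assumptions.

```python
PRECURSOR_TOL_PPM = 7.0  # relative tolerance, from the text
FRAGMENT_TOL_DA = 0.02   # absolute tolerance, from the text


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_matches(observed_mz, theoretical_mz, tol_ppm=PRECURSOR_TOL_PPM):
    """True if the precursor lies within the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_matches(observed_mz, theoretical_mz, tol_da=FRAGMENT_TOL_DA):
    """True if the fragment lies within the absolute tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da


if __name__ == "__main__":
    # Hypothetical m/z values chosen for illustration.
    print(precursor_matches(1000.005, 1000.0))  # about 5 ppm off
    print(precursor_matches(1000.010, 1000.0))  # about 10 ppm off
    print(fragment_matches(500.015, 500.0))
```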

Heterologous production in E. coli and low-salt purification for NMR analyses
For heterologous expression in E. coli, the expression vector pGEX-CS was used, which leads to the production of fusion proteins with an N-terminal GST (glutathione S-transferase) tag, followed by a TEV (tobacco etch virus) protease cleavage site, followed by the respective protein of interest. The NcoI and BamHI restriction sites were used to introduce the ORF of the HVO_2753 gene into the vector. The ORF was amplified by PCR using primers that added the respective restriction enzyme recognition motifs. The NcoI recognition motif, CCATGG, contains the start codon, but adds further nucleotides on both sides. This required the addition of one additional codon upstream (alanine) to stay in frame with the TEV recognition site and one additional codon downstream (glycine) to stay in frame with the HVO_2753 ORF, which started with the second codon. The TEV protease has the recognition site ENLYFQ\G, and it cleaves between the last two amino acids. Therefore, the four amino acids GAMG remain fused to HVO_2753 after TEV cleavage. They were numbered from −4 to −1, while the HVO_2753 sequence was numbered from +2 to +59, so that the amino acid numbering is identical for the heterologously produced and the homologously produced protein.
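The numbering scheme of the cleaved construct can be made explicit with a short sketch. The tag residues and the numbering ranges are taken from the text; the resulting construct length of 62 residues is derived from those ranges, and the function names are illustrative assumptions.

```python
TAG_RESIDUES = "GAMG"  # remains fused after TEV cleavage (from the text)
FIRST_NATIVE = 2       # HVO_2753 numbering starts at +2
LAST_NATIVE = 59       # and ends at +59


def construct_numbering():
    """Residue numbers along the cleaved construct, N to C terminus."""
    tag = list(range(-len(TAG_RESIDUES), 0))  # -4, -3, -2, -1
    native = list(range(FIRST_NATIVE, LAST_NATIVE + 1))
    return tag + native


def residue_number(construct_index):
    """Map a 0-based position in the construct to its residue number."""
    return construct_numbering()[construct_index]


if __name__ == "__main__":
    numbering = construct_numbering()
    print("construct length:", len(numbering))
    print("first residues:", numbering[:6])
```

Positions 0 and +1 are not used in this scheme, which keeps the numbering of the native residues identical to that of the homologously produced protein.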
The resulting plasmid was used to transform T7 Express E. coli cells. The production strain was grown in M9 synthetic medium [58] to enable the incorporation of stable isotopes for some of the NMR experiments. 1 mM ampicillin was added to guarantee the presence of the plasmid. For labeling experiments, ¹⁵N-labeled NH₄Cl alone or together with ¹³C-labeled glucose (Cambridge Isotope Laboratories, Cambridge, MA, USA) replaced the unlabeled components of the M9 medium [58]. Cultures were grown at 37°C to an OD 600 of 0.6, and expression of the fusion gene was induced by the addition of 1 mM IPTG. After a further incubation of 12 h, cells were harvested by centrifugation (5000 rpm, 15 min, 4°C). The cell pellet was resuspended in lysis buffer (25 mM Tris pH 8, 200 mM NaCl, 3 mM DTT), which was supplemented with 10% glycerol (v/v) and one protease-inhibitor tablet (cOmplete™, Roche, Germany) per liter. Cells were lysed by sonication, membranes and cell debris were removed by centrifugation, and the resulting cytoplasmic extract was used for protein purification.
The first step was affinity isolation using glutathione-sepharose beads [63]. The fusion protein was bound to the beads via its GST tag, the beads were washed, and the protein was eluted with buffer containing 1 mM glutathione. Cleavage of the fusion protein with TEV protease was performed for 8 hours at 4°C. The liberated HVO_2753 was separated from the GST tag and nonprocessed fusion protein by size exclusion chromatography. The identity and purity of the protein were confirmed using analytical SDS/PAGE and MALDI mass spectrometry (Fig. S6).
The QuikChange Lightning site-directed mutagenesis kit (Agilent Technologies, Santa Clara, CA, USA) was used for the introduction of single amino acid replacements. Successful mutagenesis was confirmed by sequencing (Eurofins Genomics, Ebersberg, Germany). Purification of the mutant proteins was performed as described above for the wild-type protein.

NMR spectroscopic experimental data
The protein samples were measured in NMR buffer containing 25 mM Bis-Tris pH 7, 200 mM NaCl, 3 mM DTT, 95% H₂O/5% D₂O. Spectra were recorded at 278 K and 298 K on Bruker spectrometers ranging from 600 to 950 MHz and equipped with z-axis gradient ¹H{¹³C, ¹⁵N} triple-resonance cryogenic probes. The spectrometer was locked on D₂O. ¹H chemical shifts were referenced to DSS at 0.00 ppm, and ¹³C, 15 [65].
The temperature series of 2D ¹H ¹⁵N BEST-TROSY spectra was measured from 278 to 348 K with 10 K increments on a 600 MHz spectrometer. Amide proton temperature coefficients were calculated from a linear fit of the chemical shift perturbation as a function of temperature [36].
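A temperature coefficient of this kind is the slope of a straight line fitted to the amide proton chemical shift as a function of temperature. The sketch below shows the calculation for one residue; the temperature points follow the series described in the text, whereas the chemical shift values are hypothetical and the function names are illustrative.

```python
# Temperature points of the series described in the text (278 to 348 K, 10 K steps).
TEMPERATURES_K = list(range(278, 349, 10))

# HYPOTHETICAL amide proton shifts (ppm) of one residue, for illustration only.
SHIFTS_PPM = [8.50, 8.45, 8.40, 8.35, 8.30, 8.25, 8.20, 8.15]


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def temperature_coefficient_ppb_per_k(temperatures_k, shifts_ppm):
    """Slope of chemical shift versus temperature, converted from ppm/K to ppb/K."""
    slope_ppm_per_k, _ = linear_fit(temperatures_k, shifts_ppm)
    return slope_ppm_per_k * 1000.0


if __name__ == "__main__":
    coefficient = temperature_coefficient_ppb_per_k(TEMPERATURES_K, SHIFTS_PPM)
    print(f"temperature coefficient: {coefficient:.2f} ppb/K")
```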
In order to analyze the slow exchange between the two conformations present, ZZ-exchange experiments were measured. Experiments were carried out at temperatures ranging from 298 K to 338 K with mixing times from 50 to 1000 ms. Cross-peaks could only be observed at 328 K and higher temperatures. The peak intensities were determined with Topspin 3.5, and values were normalized. Fits and exchange rates k₁ and k₋₁ were calculated according to Farrow et al. [29]. This was done for three well-resolved residues (C29, C39, and G58) at 338 K.
¹⁵N het-NOEs were calculated from the signal intensity ratio (I_on/I_off) obtained from the spectra recorded with and without saturation of amide protons, with recovery (d1) and saturation delays of 3.5 seconds. The S² order parameter was calculated using the TENSOR2 12 software and the average of the first three best NMR structure models calculated with CYANA 3.97.
All NMR spectra were processed by using TopSpin version 3.5 (Bruker Biospin) and analyzed and visualized with SPARKY version 3.114 [31].

Structure calculation
Structure calculations were performed by CYANA 3.97. The chemical shift assignment of the HVO_2753 protein was performed manually and, together with unassigned NOESY peak lists, used as input for fully automated NOE cross-peak assignment. For this purpose, three 3D NOESY NMR experiments were used: i) the aliphatic 3D ¹H ¹H ¹³C-NOESY-HSQC and ¹H ¹H ¹⁵N-NOESY-HSQC with a mixing time of 120 ms and ii) the aromatic 3D ¹H ¹H ¹³C-NOESY-HSQC with a mixing time of 100 ms. Peak picking of the spectra was performed by using the restricted peak picking function in the Sparky 3.114 program [31]. The NOESY peak lists were refined by manual inspection of the NOE cross-peaks.
The chemical shift tolerances were set to 0.02 ppm for the bound protons and other protons and to 0.20 ppm for the heavy atoms. The final structure calculation included the NOE data, hydrogen bond distances, dihedral angles derived from the TALOS-N 9 predictions, and ³J(HN,Hα) coupling constants (Karplus relation).
To imitate the coordination of the Zn²⁺ ion in the binding pocket, the upper distance restraints between the four coordinating sulfur atoms of the cysteine residues were manually defined to 4 Å.
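Four sulfur atoms give six pairwise distances, so each pocket is described by six upper-limit restraints. The following Python sketch enumerates them. The residue numbers assigned to each pocket are inferred from the residues and motifs named in the text, and the output layout is only an approximation of a restraint list; the exact CYANA input format and all names used here are assumptions.

```python
from itertools import combinations

# Cysteine residue numbers per pocket, inferred from the motifs named in the text.
ZBP_CYSTEINES = {
    "ZBP1": (12, 15, 39, 42),  # motifs 1 and 3
    "ZBP2": (29, 32, 51, 54),  # motifs 2 and 4
}
UPPER_LIMIT_ANGSTROM = 4.0  # upper distance limit from the text


def sulfur_restraints(pocket):
    """All pairwise S(gamma)-S(gamma) upper-limit restraints of one pocket."""
    residues = ZBP_CYSTEINES[pocket]
    return [
        (a, "SG", b, "SG", UPPER_LIMIT_ANGSTROM)
        for a, b in combinations(residues, 2)
    ]


def format_restraint(restraint):
    """Render one restraint as a text line (layout is an assumption)."""
    res_a, atom_a, res_b, atom_b, limit = restraint
    return f"{res_a:4d} CYS  {atom_a:<4s} {res_b:4d} CYS  {atom_b:<4s} {limit:7.2f}"


if __name__ == "__main__":
    for line in map(format_restraint, sulfur_restraints("ZBP2")):
        print(line)
```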
The structure calculation was performed using a standard protocol with 100 initial steps, 15000 refinement steps per iteration, and 20 final structures. The final bundle of 20 top-ranked (lowest energy) structures was analyzed and validated, and the atomic coordinates were subsequently deposited in the Protein Data Bank (PDB ID: 6YDH). The chemical shift assignment was deposited in the Biological Magnetic Resonance Bank (BMRB ID: 34500). Structure validation was performed by calculating the root mean square error (RMSE) and mean square error (MSE) according to formulae 1 and 2, respectively:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - E_i\right)^2} \tag{1}$$
where n is the number of resonances, P is the predicted value, and E is the experimental value,
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2 \tag{2}$$
where n is the number of resonances, X is the difference between the predicted and the experimental value, and $\bar{X}$ is the mean value.
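The two validation measures can be sketched in a few lines of Python. This is one reading of the variable definitions given above (P and E for the RMSE; the difference X and its mean for the MSE), not a confirmed transcription of the original formulae; the function names and example values are illustrative assumptions.

```python
import math


def rmse(predicted, experimental):
    """Root mean square error between predicted and experimental shifts."""
    n = len(predicted)
    return math.sqrt(
        sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n
    )


def mse_about_mean(predicted, experimental):
    """Mean squared deviation of the differences X from their mean value.

    Assumption: X is the predicted-minus-experimental difference, following
    the variable definitions in the text.
    """
    diffs = [p - e for p, e in zip(predicted, experimental)]
    mean_diff = sum(diffs) / len(diffs)
    return sum((x - mean_diff) ** 2 for x in diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical chemical shifts (ppm), for illustration only.
    predicted = [1.0, 2.0, 3.0]
    experimental = [1.0, 2.0, 5.0]
    print(f"RMSE = {rmse(predicted, experimental):.4f}")
    print(f"MSE  = {mse_about_mean(predicted, experimental):.4f}")
```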
Chemical shift perturbations (CSPs) between the minor and major conformations were calculated using the following formula:
$$\Delta\delta = \sqrt{\Delta\delta_{H}^{2} + \left(\alpha\,\Delta\delta_{N}\right)^{2}}$$
where Δδ is the combined ¹H and ¹⁵N CSP of the backbone amides, Δδ_H and Δδ_N are the chemical shift differences between the major and minor conformations of the ¹H and ¹⁵N atoms, respectively, and α is a correction factor for the gyromagnetic ratio (α = 0.1) [36].
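A combined CSP of this kind can be computed per residue as sketched below. The sketch assumes the widely used Euclidean combination of the ¹H difference and the α-scaled ¹⁵N difference; whether exactly this form was used in the original analysis is an assumption, as are the function name and the example values. Only α = 0.1 is taken from the text.

```python
import math

ALPHA = 0.1  # correction factor given in the text


def combined_csp(delta_h_ppm, delta_n_ppm, alpha=ALPHA):
    """Combined 1H/15N chemical shift perturbation (ppm).

    Assumed form: sqrt(dH^2 + (alpha * dN)^2).
    """
    return math.sqrt(delta_h_ppm ** 2 + (alpha * delta_n_ppm) ** 2)


if __name__ == "__main__":
    # Hypothetical shift differences between major and minor conformation.
    print(f"{combined_csp(0.03, 0.4):.3f} ppm")
```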
of the analysis, all authors contributed to writing the manuscript.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Figure S1. Multiple sequence alignment (MSA) of HVO_2753 from H. volcanii and 20 homologous proteins from selected halophilic and methanogenic genera and species.
Figure S2. Maximum parsimony (A) and maximum likelihood (B) phylogenetic trees of the multiple sequence alignment shown in Figure S1.
Figure S3. Growth of the wild-type and the HVO_2753 deletion mutant.
Figure S4. Native purification of HVO_2753 after homologous production in H. volcanii.
Figure S5. Top-down and bottom-up MS analysis of HVO_2753.
Figure S6. Purification of HVO_2753 after heterologous production in E. coli.
Figure S7.
Figure S9. Amide proton temperature-dependency of HVO_2753 protein.
Figure S10. Temperature dependent population ratio between major and minor conformations.
Figure S11. ZZ-exchange NMR experiments.
Figure S12. Zinc ion quantification with a fluorimetric assay.
Figure S13. Swarm assay of single amino acid point mutants of HVO_2753.
Figure S14. 2D ¹H ¹⁵N BEST-TROSY spectra of four cysteine to alanine point mutations.
Figure S15. Amide proton temperature-dependency of HVO_2753 protein.
Figure S16. NMR solution structure of HVO_2753 protein.
Figure S17. Mapping of the Cα/Cβ chemical shift pairs from eight cysteines of HVO_2753 on the schematic distribution plot of three thiol states.
Figure S18. Root mean square error (RMSE) structure validation of HVO_2753 protein.
Figure S19. Mean square error (MSE) structure validation of HVO_2753 protein.
Figure S20. CSP analysis of the HVO_2753 minor conformation.
Figure S21. Analysis of protein dynamics measured by NMR relaxation parameters based on the transverse relaxation rates.
Figure S22. Representative long-distance NOEs of cysteines.
Figure S23. Analysis of NOEs of the cysteines.
Table S1. Amino acid composition of HVO_2753 in comparison to that of the H. volcanii proteome.
Table S2. ¹³C chemical shift assignment of cysteine residues of the HVO_2753 protein.
Table S3. ¹³C chemical shift assignment of proline residues of the HVO_2753 protein.
Table S4. ³J(HN,Hα) coupling constants determined from a 3D HNHA NMR experiment.
Table S5. List of oligonucleotides used for the generation of the deletion mutant and for probes for Southern and Northern analyses.