Origin and adaptation of green‐sensitive (RH2) pigments in vertebrates

One of the critical times for the survival of animals is twilight where the most abundant visible lights are between 400 and 550 nanometres (nm). Green‐sensitive RH2 pigments help nonmammalian vertebrate species to better discriminate wavelengths in this blue‐green region. Here, evaluation of the wavelengths of maximal absorption (λmaxs) of genetically engineered RH2 pigments representing 13 critical stages of vertebrate evolution revealed that the RH2 pigment of the most recent common ancestor of vertebrates had a λmax of 503 nm, while the 12 ancestral pigments exhibited an expanded range in λmaxs between 474 and 524 nm, and present‐day RH2 pigments have further expanded the range to ~ 450–530 nm. During vertebrate evolution, eight out of the 16 significant λmax shifts (or |Δλmax| ≥ 10 nm) of RH2 pigments identified were fully explained by the repeated mutations E122Q (twice), Q122E (thrice) and M207L (twice), and A292S (once). Our data indicated that the highly variable λmaxs of teleost RH2 pigments arose from gene duplications followed by accelerated amino acid substitution.

The molecular bases of the λmax shift (or spectral tuning) in visual pigments have been studied mostly by introducing various mutations into various present-day pigments. This approach is based on two implicit assumptions: (a) identical amino acid changes in different pigments shift the λmax by the same magnitude and (b) forward and reverse mutations shift the λmax in opposite directions by the same magnitude. These assumptions often fail because of the interactions among different amino acids, and conclusions derived from these mutagenesis experiments can be erroneous [27][28][29][30]. To elucidate the correct mechanism of spectral tuning, it is imperative to genetically engineer and manipulate ancestral pigments by following the actual evolutionary processes in forward directions [27][28][29][30]. This was done for RH1 [30], SWS1 [31], and M/LWS [32] pigments, but similar molecular analyses have not been applied to the RH2 and SWS2 pigments.
Here, we inferred and genetically engineered the ancestral RH2 pigments at 13 critical stages of vertebrate evolution and determined their λmaxs. Then, by introducing additional mutations, we established the molecular mechanisms of λmax shifts at eight evolutionary steps of RH2 pigments.

Inference of ancestral sequences
The amino acid sequences of ancestral RH2 pigments were inferred by applying the Phylogenetic Analysis by Maximum Likelihood (PAML) program with Jones-Taylor-Thornton (JTT), Whelan and Goldman (WAG), and Dayhoff amino acid substitution models [33] to the amino acid sequences of the 37 vertebrate RH2 pigments and those of the more restricted 24 pigments with known λmaxs (Fig. 1), together with the RH1 pigment of bovine (Bos taurus) and the RH1, SWS1 and SWS2 pigments of lamprey (Geotria australis) as the outgroup (Table S1). Among the four paralogous groups of pigments, RH2 pigments are most closely related to RH1, SWS2 and SWS1 pigments, in that order [6,34]. The divergence times of nonduplicated RH2 pigments were estimated from the TimeTree of Life Web server (www.timetree.org). The actual branch lengths of the composite phylogenetic tree were also determined by applying PAML to the user tree based on the amino acid sequences.
To infer the ancestral RH2 pigments, the amino acids between sites 31 and 311 were considered (Fig. S1A). In engineering ancestral pigments, we used an expression vector pMT5 that contained the amino (N) and carboxyl (C) termini (amino acids at positions 1-30 and 312-354, respectively) of the chameleon (P495; red letters in Fig. S1B) with the proper internal segment of an RH2 pigment. The N and C termini of chameleon (P495) do not affect the λmax of an RH2 pigment significantly. For example, zebrafish4 (P505) has a λmax of 505 nm with its own termini [16] and 507 nm with the chameleon (P495) termini, and gecko (P467) has λmaxs of 467 nm with its own termini [35] and 466 nm with the chameleon termini. Similarly, different amino acids at the N and C termini do not modify the λmaxs of SWS1 pigments [31].
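The chimeric construct described above can be sketched in a few lines of Python. This is only an illustration of the segment boundaries (sites 1-30, 31-311 and 312-354 in bovine RH1 numbering); the function name `build_chimera` and the placeholder sequences are hypothetical, and the sketch assumes gap-free sequences that are already aligned to the bovine RH1 numbering.

```python
# Segment boundaries in bovine RH1 numbering, as given in the text.
N_TERM_END = 30      # N terminus: sites 1-30
C_TERM_START = 312   # C terminus: sites 312-354
TOTAL_SITES = 354
INTERNAL_LEN = C_TERM_START - N_TERM_END - 1  # sites 31-311, i.e. 281 sites


def build_chimera(backbone: str, internal: str) -> str:
    """Join the backbone's N and C termini to an internal segment.

    Assumes both sequences are gap-free and follow bovine RH1 numbering
    (an assumption of this sketch, not a statement about the real alignment).
    """
    if len(backbone) != TOTAL_SITES:
        raise ValueError("backbone must cover sites 1-354")
    if len(internal) != INTERNAL_LEN:
        raise ValueError("internal segment must cover sites 31-311")
    n_term = backbone[:N_TERM_END]          # sites 1-30
    c_term = backbone[C_TERM_START - 1:]    # sites 312-354
    return n_term + internal + c_term


# Placeholder sequences (NOT real opsin sequences): letters only mark origin.
chameleon_like = "N" * 30 + "x" * 281 + "C" * 43
ancestral_internal = "A" * 281
chimera = build_chimera(chameleon_like, ancestral_internal)
```

The 281-site internal segment in this sketch matches the 281 comparable sites used for the ancestral inference.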
All mutant opsins were generated by using QuikChange Site-Directed Mutagenesis Kits (Stratagene, La Jolla, CA, USA). To rule out spurious mutations, the DNA fragments were sequenced by cycle sequencing reactions using the Sequitherm Excel II long-read kits (Epicentre Technologies, Madison, WI, USA) with dye-labelled M13 forward and reverse primers. Reactions were run on a LI-COR (Lincoln, NE, USA) 4300LD automated DNA sequencer.

The in vitro assay
Ancestral and other mutant opsins were expressed in COS1 cells by transient transfection [36]. The contiguous RH2 opsins between sites 31 and 311 were cloned into the EcoRI and SalI restriction sites of the expression vector pMT5, which contained the N terminus (amino acids between positions 1 and 30) and C terminus (amino acids between positions 312 and 354) of the chameleon (P495; red letters in Fig. S1B). The visual pigments were regenerated by incubating the opsins with 11-cis-retinal (provided by R. K. Crouch at Storm Eye Institute, Medical University of South Carolina and the National Eye Institutes) and were purified using immobilized 1D4 (The Culture Center, Minneapolis, MN, USA) in buffer W1 (50 mM N-(2-hydroxyethyl) piperazine-N′-2-ethanesulfonic acid (HEPES; pH 6.6), 140 mM NaCl, 3 mM MgCl2, 20% (w/v) glycerol and 0.1% dodecyl maltoside). UV-visible spectra were recorded at 20°C using a Hitachi U-3000 dual beam spectrophotometer (LI-COR Biosciences, Lincoln, NE, USA). Visual pigments were bleached for 3 min using a 60 W standard light bulb equipped with a Kodak Wratten #3 filter at a distance of 20 cm. Data were analysed using SIGMAPLOT software (Jandel Scientific, San Rafael, CA, USA).

Ethics
Research was carried out under approval of Emory University according to the university's animal ethics guidelines.

Results
The amino acid sequences of ancestral pigments
The ancestral amino acid sequences inferred are highly consistent. For example, when the earliest vertebrate ancestor AncAgnatha (node 1, Fig. 1) is considered, the amino acid sequences inferred by JTT and WAG models show that 241 (86%) out of a total of the 281 comparable sites are identical with Bayesian posterior probabilities (PPs) ≥ 0.95, 22 out of the remaining 40 sites also have identical amino acids with a 0.70 ≤ PP < 0.95, and different amino acids are predicted only at eight highly variable sites. The ancestral sequences inferred using the 24 and 37 sequences are also very similar. For AncAgnatha, the JTT model shows that amino acids at 263 out of 281 sites (94%) are identical and only those at the other 18 sites differ (indicated by * in Table S2). At these 18 sites, the present-day pigments have variable amino acids, but their λmaxs are similar, which implies that these amino acids are not critical in determining the λmaxs of visual pigments [34,37].
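As a minimal sketch of how such a site-by-site comparison could be tallied, the following Python snippet classifies sites by agreement and posterior probability. The function `classify_sites`, the toy data, and the choice to use the lower of the two PPs at each site are assumptions made for illustration; the source does not state how the PPs of the two models were combined.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, float]  # (inferred amino acid, posterior probability)


def classify_sites(model_a: List[Site], model_b: List[Site],
                   high: float = 0.95, low: float = 0.70) -> Dict[str, int]:
    """Count sites where two reconstructions agree, binned by PP.

    Assumption of this sketch: a site's PP is the lower of the two models' PPs.
    "other" collects disagreeing sites and agreeing sites with PP < low.
    """
    counts = {"identical_high_pp": 0, "identical_mid_pp": 0, "other": 0}
    for (aa_a, pp_a), (aa_b, pp_b) in zip(model_a, model_b):
        pp = min(pp_a, pp_b)
        if aa_a == aa_b and pp >= high:
            counts["identical_high_pp"] += 1
        elif aa_a == aa_b and pp >= low:
            counts["identical_mid_pp"] += 1
        else:
            counts["other"] += 1
    return counts


# Toy four-site example (hypothetical values, not taken from Table S2).
jtt = [("E", 0.99), ("M", 0.80), ("A", 0.60), ("Q", 0.97)]
wag = [("E", 0.98), ("M", 0.75), ("S", 0.55), ("Q", 0.96)]
toy_counts = classify_sites(jtt, wag)

# Arithmetic check on the figures reported in the text.
reported_identical = 241 + 22            # sites identical with PP >= 0.70
reported_fraction = reported_identical / 281
```

With the reported figures, 241 + 22 = 263 sites, which is about 94% of the 281 comparable sites.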
One of these results disagrees with the published result. Previously, using PAML with JTT model, it was suggested that AncCyprini1 (node 5, Fig. 1; or Ancestor1 in Fig. S2A) had E122 with a PP of 0.77 [38], but our result using the identical method shows that AncCyprini1 (Fig. S2B) had Q122 with PPs of 0.89 and 0.97 using the 24 and 37 sequences, respectively (Table S3). Comparing the two data sets, we can find the cause of the discrepancy. That is, Chinen et al. used the teleost data consisting of 10 pigments with E122 and four pigments with Q122, which were heavily biased towards E122. On the other hand, our teleost data consist of eight pigments with E122 and nine pigments with Q122 (Fig. S2A,B, respectively). Because of the less-biased data set and significantly higher PPs, it is most likely that AncCyprini1 had Q122.

Functional differentiation
We applied the in vitro assay to the 13 ancestral pigments engineered. The results show that (a) the amino acid sequence of AncJawedFish differs from that of AncTetrapod at multiple sites (Fig. S1B), but these two pigments have identical λmax values at 488 nm, and (b) the ratio of absorbance at ~280 nm to that at 500 nm of the in vitro assay ranged from 2.5 (AncEuteleost) to 4.8 (AncSquamata; Fig. S3). In particular, AncAgnatha had a λmax of 503 nm and its closest descendant, AncJawedFish (node 2, Fig. 1), decreased its λmax to 488 nm, which had been maintained by six out of the remaining 11 ancestral pigments (nodes 3-5, 7, 8 and 11; grey ovals, Fig. 1; Fig. S3). However, the five others expanded their λmaxs from 474 nm (node 9) to 524 nm (node 10) and the λmaxs of present-day pigments have been expanded further to between ~450 and 530 nm.
The λmax shifts of RH2 pigments reveal three characteristics (Fig. 1). First, despite their similarities, the λmaxs of zebrafish3 (P488) and medakaC (P492) did not evolve directly from that of AncJawedFish (P488), but these pigments reversed their λmaxs to over 500 nm before reaching about 490 nm. These changes can be found only by reconstructing ancestral pigments at intermediate evolutionary steps. Second, with the exceptions of eel (P506), loosejaw (P468), coelacanth (P478) and gecko (P467), significant λmax shifts have been generated by gene duplications followed by critical amino acid substitutions. Third, the decreased λmaxs of loosejaw (P468) and coelacanth (P478) are closely connected with their unique adaptations to their highly species-specific light environments [39][40][41] and that of gecko (P467) indicates the adaptation to its nocturnal environment.
Our mutagenesis results show that E122Q, Q122E, M207L and A292S shifted the λmaxs of ancestral pigments by 7-20 nm (Table 1).
These two examples reveal the complex nature of the spectral tunings in RH2 pigments. Chinen et al. [38] encountered the same problem in explaining the significant λmax shift from Ancestor1 (P506) to Ancestor2 (P474; supplementary information, Fig. S2A; for more details, see Discussion section).
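The criterion for a significant shift, |Δλmax| ≥ 10 nm, can be written out as a small check. This is a sketch only: the helper names are hypothetical, and the branch list is illustrative and incomplete. The first pair is the AncAgnatha to AncJawedFish step reported in the Results; the second pair (AncCyprini1 at 489 nm to AncCyprini2 at 505 nm) assumes the parent and descendant relationship implied in the Discussion.

```python
THRESHOLD_NM = 10  # |delta lambda_max| >= 10 nm counts as significant


def delta_lambda_max(parent_nm: float, child_nm: float) -> float:
    """Shift along a branch; negative values are blue shifts."""
    return child_nm - parent_nm


def is_significant(parent_nm: float, child_nm: float) -> bool:
    """Apply the |delta lambda_max| >= 10 nm criterion."""
    return abs(delta_lambda_max(parent_nm, child_nm)) >= THRESHOLD_NM


# Illustrative branches: (parent, parent lambda_max, child, child lambda_max).
branches = [
    ("AncAgnatha", 503, "AncJawedFish", 488),
    ("AncCyprini1", 489, "AncCyprini2", 505),  # pairing assumed from Discussion
]

shifts = {
    (parent, child): delta_lambda_max(p_nm, c_nm)
    for parent, p_nm, child, c_nm in branches
}
significant = [
    (parent, child)
    for parent, p_nm, child, c_nm in branches
    if is_significant(p_nm, c_nm)
]
```

Both illustrative branches pass the threshold, with shifts of -15 nm and +16 nm, respectively.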

Evolutionary rates of amino acid substitution
To evaluate the effects of gene duplication on the induction of the highly variable λmaxs in teleost RH2 pigments, we considered 22 representative sequences and evaluated the numbers of amino acid substitutions per site per year at three lineages: (a) the vertebrate pigment lineage (lineage a), which excludes all Clupeocephala pigments (lineage c); (b) the Tetrapod pigment lineage (lineage b); and (c) the Clupeocephala pigment lineage where gene duplication events are prevalent (shown by a rectangle, lineage c; Fig. 2a). The branch lengths from these nodes to their descendant pigments were determined by applying PAML to the composite evolutionary tree of the amino acid sequences. The evolutionary rates were evaluated by taking the averages between bifurcated (or trifurcated) branches sequentially and assuming that lineages a, b and c originated 615, 230 and 413 million years ago (MYA), respectively (www.timetree.org). The results show that the evolutionary rate for lineage c (0.9 ± 0.12 × 10⁻⁹) is significantly higher than that for lineage a (0.3 ± 0.05 × 10⁻⁹; Z = 4.0) and lineage b (0.4 ± 0.06 × 10⁻⁹; Z = 3.6) at the 1% level (Fig. 2b). Hence, the duplications of RH2 opsin genes were followed by significantly accelerated amino acid substitutions, which led Clupeocephala RH2 pigments to expand the range of their λmaxs.
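The rate comparison can be illustrated with a two-sample Z statistic. The source does not give the formula used, so this sketch makes two assumptions: the ± values are treated as standard errors, and Z is computed as the rate difference divided by the square root of the summed squared errors. With the rounded rates quoted above it gives roughly 4.6 and 3.7 rather than the reported 4.0 and 3.6, presumably because the published values were computed from unrounded rates or by a different procedure; both exceed the two-sided 1% critical value of about 2.576.

```python
import math

CRITICAL_Z_1PCT = 2.576  # two-sided 1% critical value of the standard normal


def z_score(rate_1: float, se_1: float, rate_2: float, se_2: float) -> float:
    """Two-sample Z statistic, assuming independent estimates with known SEs."""
    return (rate_1 - rate_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)


# Rates in units of 1e-9 substitutions per site per year (rounded, from text).
lineage_a = (0.3, 0.05)  # vertebrate lineage excluding Clupeocephala
lineage_b = (0.4, 0.06)  # Tetrapod lineage
lineage_c = (0.9, 0.12)  # Clupeocephala lineage

z_c_vs_a = z_score(*lineage_c, *lineage_a)
z_c_vs_b = z_score(*lineage_c, *lineage_b)

significant_c_vs_a = z_c_vs_a > CRITICAL_Z_1PCT
significant_c_vs_b = z_c_vs_b > CRITICAL_Z_1PCT
```

Under these assumptions, both comparisons remain significant at the 1% level, in line with the conclusion in the text.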

Discussion
Depending on whether AncCyprini1 (node 5, Fig. 1) had Q122 or E122, the evolutionary processes of zebrafish RH2 pigments can be interpreted very differently. If AncCyprini1 had E122 and a λmax of 506 nm, the λmax of zebrafish3 (P488) was decreased by E122Q and that of Ancestor2 (P474) by E122Q and additional unidentified mutations [38] (Fig. S2A). On the other hand, if AncCyprini1 had Q122 and a λmax of 489 nm and if the λmax of Ancestor2 (P474) still holds, the λmax shifts of AncCyprini2 (P505) and zebrafish3 (P488) are explained fully by Q122E and E122Q, respectively, but the critical mutations that caused the λmax shift of Ancestor2 (P474) remain to be discovered (Fig. S2B).
This example shows that the mechanism of phenotypic adaptation can be understood by inferring the ancestral sequences correctly. For that, we need to use unbiased sequence data which, among other things, should consist of roughly equal numbers of molecules with different functions. Furthermore, to identify mutations that generated highly variable λmaxs of ancestral and present-day RH2 pigments, it would be necessary to construct various sets of chimeric pigments between a pair of pigments with different λmaxs and then perform extensive mutagenesis experiments [42][43][44][45].
The functional differentiation of euteleost RH2 pigments has been accelerated significantly by gene duplications. Evolution by gene duplication may be classified into four categories [46]: (a) gene duplication itself does not cause any selection, (b) the duplication itself renders selective advantage, (c) duplication occurs in a gene for which genetic variation already existed, and (d) duplications occur by whole-genome duplication or large segmental duplication. The first category includes Ohno's neofunctionalization model [47] and various subfunctionalization models [48][49][50].
Among these possibilities, neither selection caused by gene duplication itself (category 2) nor functionally meaningful genetic variation (category 3) has been established in the preduplication phase of RH2 pigments. During fish evolution, Euteleost and Cypriniform ancestors appeared roughly 240 and 100 MYA, respectively [51]. Hence, RH2 gene duplications in fishes occurred much later than the fish-specific whole-genome duplication (WGD), which occurred about 350 MYA [52], showing that neither WGD nor large segmental duplication (category 4) [16] seems to be involved. The differential ontogenetic RH2 gene expression was described as 'subfunctionalization' [15], but actual dual functions of the ancestral RH2 gene have not been established. On the other hand, coelacanths (Latimeria chalumnae) seem to have moved to a depth of 200 m about 200 MYA [53], and the λmaxs of their duplicated RH1 and RH2 pigments were decreased to 482 and 478 nm, respectively, allowing them to distinguish the narrow range of wavelengths available in their habitat [30,54]. Therefore, the λmax shifts of RH1 and RH2 pigments and newer duplicate RH2 pigments can be described best by the neofunctionalization model. The accelerated evolutionary rates of the duplicate RH2 genes (lineage c, Fig. 2b) agree with the prediction [47,55] and observations [56][57][58] of the evolution of duplicate genes, and not with the decelerated evolutionary rates that were caused presumably by the more critical biological functions of duplicate genes, which cause purifying selection [59]. In general, the latter evolutionary patterns of duplicate genes do not apply to opsin genes because in order to move into different ecological environments, organisms need to readjust the λmaxs of their paralogous visual pigments.

Conclusions
By inferring and engineering the RH2 pigments at 13 critical stages of vertebrate evolution, we have shown that the green-sensitive pigments of the vertebrate ancestor had a λmax of 503 nm, from which the 12 ancestral pigments changed their λmaxs between 474 and 524 nm and the present-day RH2 pigments have further expanded the range to ~450-530 nm. Eight out of the 16 significant λmax shifts of RH2 pigments can be explained by the mutations E122Q (twice), Q122E (thrice), M207L (twice) and A292S (once). The highly variable λmaxs of teleost RH2 pigments have been achieved by gene duplications followed by accelerated amino acid substitution.

Acknowledgements
SY was supported partially by National Institutes of Health (EY016400) and Emory University. We thank Dr. Rosalie K. Crouch (Storm Eye Institute, Medical University of South Carolina), National Eye Institute, for supplying 11-cis-retinal to us, and Dr. Ruth Yokoyama and anonymous reviewers for helpful comments.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Fig. S1. Aligned amino acid sequences of RH2 pigments. (A) Present-day pigments, where Bkillifish, scabbard, and Ital lizard are bluefin killifish, scabbardfish, and Italian lizard, respectively. The numbers after P in parentheses show λmaxs. Amino acid sites 122, 207, and 292 are indicated by stars (*). (B) 13 ancestral pigments inferred by applying PAML with the JTT model to the 24 sequence data, where the ancestral amino acids with a PP below 95% are indicated by bold italic letters. The amino acids in red letters are those of chameleon (P495). Following the tradition in vision science, the amino acid site numbers are those of bovine RH1 (GenBank accession no. M21606).
Fig. S2. Two different inferences of the RH2 pigment evolution in Cypriniformes. AncCyprini1 was inferred to have either E122 (A, Chinen et al. [38]) or Q122 (B, present analysis). The λmaxs of Ancestors 1-3 are taken from Chinen et al. [38]. The numbers in ovals and after P in rectangles show λmaxs of the ancestral and present-day pigments, respectively. The amino acids at site 122 are given in the right column. E122Q decreases the λmax, whereas Q122E increases the λmax. E122Q* explains about 47% of the λmax shift of Ancestor2. Blue, grey, black, and red indicate λmaxs of 452-478, 488-492, 495-511, and 516-530 nm, respectively.
Fig. S3. Absorption spectra of ancestral RH2 pigments. The λmax values of AncJawedFish and AncTetrapod are identical at 488 nm but their absorbance levels at ~280 nm are 1.1 and 1.6, respectively.
Table S1. The source of RH2 pigment sequences.
Table S2. Amino acids of AncAgnatha with PP < 0.95 (in parentheses) inferred using PAML with the JTT model.
Table S3. Amino acids of ancestral pigments at three critical sites with PP (in parentheses) inferred using PAML with the JTT substitution model.