Structural basis of the specific interactions of GRAS family proteins

The plant‐specific GAI‐RGA‐and‐SCR (GRAS) family of proteins function as transcriptional regulators and play critical roles in development and signalling. Recent structural studies have shed light on the molecular functions at the structural level. The conserved GRAS domain comprises an α‐helical cap and α/β core subdomains. The α‐helical cap mediates head‐to‐head heterodimerization between SHR and SCR GRAS domains. This type of dimerization is predicted for the NSP1‐NSP2 heterodimer and DELLA proteins such as RGA and SLR1 homodimers. The α/β core subdomain possesses a hydrophobic groove formed by surface α3‐ and α7‐helices and mediates protein–protein interactions. The groove of the SHR GRAS domain accommodates the zinc fingers of JKD, a BIRD/IDD family transcription factor, while the groove of the SCL7 GRAS domain mediates the SCL7 homodimerization.

GRAS gene family members were first characterized in 1999 as key regulators in plant biology and comprised the three members GIBBERELLIN-INSENSITIVE (GAI), Repressor of ga1-3 (RGA) and SCARECROW (SCR), together with SCR-LIKEs (SCLs) identified from a database [1]. Of the initially identified members, the gene products GAI and RGA are members of the DELLA proteins, which are key regulators in gibberellin (GA) signalling and also play important roles in jasmonate (JA) and light signalling [2]. SCR acts as a key regulator of radial patterning of Arabidopsis roots together with SHORT-ROOT (SHR), a member of another class of GRAS proteins [3,4]. At present, 33 and 66 members have been identified in the Arabidopsis and rice genomes, respectively, and their gene products, the GRAS proteins, are recognized as plant-specific key regulators of transcription in diverse processes including GA signal transduction, phytochrome signalling and root development [2,5]. Another class of GRAS proteins, the Nodulation Signalling Pathway proteins, regulates nodulation in legumes [2]. GRAS proteins function by forming homo- or heterodimers and/or by interacting with other proteins such as transcription factors [5-11]. Notwithstanding their importance in plant biology, determination of the three-dimensional structures of GRAS domains had been unsuccessful for quite some time. Recently, two crystallographic structural reports have appeared and clarified the fundamental three-dimensional structural and physical properties of the GRAS domain [12,13]. One crystallographic report has outlined the structure of the SCR-SHR heterodimer and its complex with a zinc finger-type transcription factor of the BIRD/INDETERMINATE DOMAIN (IDD) family, JACKDAW (JKD)/IDD10. SHR and SCR are extensively studied GRAS proteins that act as key regulators in Arabidopsis root development [3,4].
Intriguingly, SHR is referred to as a moving transcriptional regulator since after being transcribed in the stele, it moves into the adjacent layer where SCR sequesters SHR to the nucleus by forming the specific heterodimer SHR-SCR and blocks SHR movement out of the single cell layer of the endodermis. The SHR-SCR complex upregulates several genes including zinc finger (ZF) transcription factors of the BIRD/IDD family, containing JKD/IDD10 and MAGPIE (MGP)/IDD3 [14,15]. A regulatory network involving BIRD/IDD transcription factors and SCR with SHR organizes tissue patterns at all formative steps during growth, thus ensuring developmental plasticity [16]. The other crystallographic report has outlined the structure of rice SCL7 (OsSCL7), which belongs to the SCL4/7 subfamily. Although investigations concerning the biological functions of the SCL4/7 family members are limited, SCL4/7 members may function in response to environmental stress [17]. SCL7 is localized in the nucleus, and its overexpression led to increased salt and drought tolerance [18]. A recent study indicated that SCL4/7 members also play a role in axillary meristem development [19].
In this review, the available structural information on GRAS domains is summarized, and the consequences of protein-protein interactions, including homo- and heterodimerization and the recognition of direct targets of GRAS proteins, are addressed together with their implications for plant biology.

GRAS protein classification
Most GRAS proteins comprise an N-terminal less-conserved variable region and a C-terminal conserved GRAS domain. A small number of GRAS proteins, however, have an N-terminal GRAS domain followed by another functional domain at the C-terminus, while other members are more exceptional in that they possess double GRAS domains. GRAS genes are usually monoexonic and encode proteins with lengths between 360 and 850 amino acids. Typical GRAS domains comprise ~390 amino acids and can be subdivided into five peptide regions possessing conserved sequence motifs: leucine heptad repeats I (LHRI), VHIID, leucine heptad repeats II (LHRII), PFYRE and SAW [1] (Fig. 1). VHIID, PFYRE and SAW are very short or scattered conserved sequence motifs. As discussed later, none of these regions forms a structural domain on its own; each is part of a conserved structural domain. In contrast with the C-terminal conserved domain, the N-terminal region of GRAS proteins appears hypervariable and contains sequences typical of intrinsically disordered proteins [1,5,12,20]. Some GRAS proteins contain conserved sequence motifs in the N-terminal region, which are expected to be involved in molecular recognition. For example, DELLA proteins possess a conserved DELLA sequence motif in the N-terminal region; the peptide region containing the DELLA motif is conformationally disordered in the free state, but folds into a helical structure upon binding to the GA-bound GA receptor GID1 [20].
Phylogenetic analyses have shown that the GRAS family can be divided into 8-13 subfamilies [1,21]. A very recent study based on a panel of eight representative species of angiosperms (for monocots: Musa acuminata (Zingiberales), Phoenix dactylifera (Arecales) and Oryza sativa (Poales); for dicots: Arabidopsis thaliana, Vitis vinifera, Theobroma cacao (rosids) and Coffea canephora (asterid); and Amborella trichopoda as the basal angiosperm and outgroup for monocot and dicot phylogenies) identified 29 orthologous groups (OGs) of the GRAS gene family, and these OGs were regrouped into 17 subfamilies (NSP1, SCL32, SHR, PAT1, RAD1, SCLA, SCR, DELLA, RAM1, SCL3, DLT, SCLB, LISCL, SCL4/7, LS, NSP2 and HAM), including five new subfamilies (DLT, RAD1, RAM1, SCLA and SCLB) (Table 1) [22]. Compared with phylogenetic results obtained in a previous study that mainly focused on the model species Arabidopsis and rice, loss of GRAS members within these species was detected in a few OGs (12 and 4, respectively). Although these model species have often been used as a reference, they have shown evidence of higher evolution rates than other species.
The most extensively studied subfamilies contain the initially identified SCR, SHR and DELLA protein members. DELLA proteins are key regulators in GA-response signalling, and are recognized by GA-bound GID1, which recruits DELLA proteins to the E3 enzyme with F-box protein SLY for ubiquitylation followed by degradation by the proteasome [2]. DELLA proteins also participate in PIF coactivation and function as JA signalling modulators. Arabidopsis DELLA proteins comprise five functional members, GAI, RGA and RGL1-3, while rice contains one functional DELLA protein, SLENDER RICE 1 (SLR1), and two nonfunctional DELLA proteins, SLENDER LIKE 1 and 2 (SLRL1 and SLRL2). Although SHR interacts with SCR to play a key role in regulating the radial patterning of both the root and shoot [3,4], the two proteins belong to different subfamilies, which are distantly branched in the phylogenetic tree [22]. The Arabidopsis SCR subfamily comprises three OGs (SCR1-3), with SCR and SCL23 being included in these OGs, although OG-SCR3 is absent. SHR regulates the expression of SCR and SCL23. The SHR subfamily comprises two OGs, SHR1 and SHR2; SHR2 OG members are absent in both Arabidopsis and rice, and are also absent in the Brassicales and Poales orders.
Because the molecular functions of SCL4/7 subfamily members were poorly understood, the subfamily was named after two Arabidopsis paralogs, SCL4 and SCL7. SCL7 is upregulated under stress conditions, while its close homolog SCL4 is downregulated [17]. A Populus euphratica ortholog (PeSCL7) was found to be overexpressed during the early stage of induced severe salt stress, and transgenic Arabidopsis plants overexpressing this GRAS gene showed increased tolerance to salt and drought stress [18]. These experiments suggested that SCL4/7 members participate in the response to environmental stresses such as salt, osmotic shock and drought.

Monomer-dimer equilibrium in solution
Intermolecular interactions between GRAS proteins were analysed using bioassays or biochemical methods such as the yeast two-hybrid (Y2H) assay, in vivo bimolecular fluorescence complementation (BiFC) binding assays, or pull-down binding assays with cell lysates or purified recombinant protein samples [5-11,23-27]. The data obtained from such experiments suggested that GRAS proteins could form homodimers and/or heterodimers with other GRAS proteins. For example, SLR1 and Lotus japonicus NSP2 were reported to form homodimers [23,24], while formation of heterodimers was suggested for Arabidopsis SHR-SCR, SCL3-RGA and SCL3-GAI, Medicago truncatula NSP1-NSP2 and RAM1-NSP2, and L. japonicus RAD1-RAM1 and RAD1-NSP2 [8,11,23-27]. RAD1 did not interact with itself, suggesting that RAD1 may exist as a monomer in the absence of RAM1 and NSP2. More definitive evidence showing oligomerization of GRAS proteins has been obtained using hydrodynamic analyses such as analytical ultracentrifugation (AUC) with purified protein samples (Fig. 2) [12]. The AUC analyses demonstrated stable monodispersed states of Arabidopsis GRAS proteins in solution, with estimated molecular masses showing that SCL5 exists as a monomer, SCL3 exists as a homodimer, and SHR-SCR exists as a 1 : 1 heterodimer. Interestingly, the AUC profiles show a single dominant peak without additional major peaks, suggesting that each dimeric or monomeric form is stable in solution, rather than existing as two or more metastable forms in equilibrium.
In efforts to identify subdomains and/or motifs that are responsible for dimerization, the LHRI and LHRII regions were repeatedly deduced to be critical for homo- and heterodimerization [8,11,23,27]. The Y2H assay showed that the LHRI-VHIID-LHRII region is important for SCR-SHR binding [8]. The LHRI region of NSP2 is essential for the NSP1-NSP2 interaction and the LHRI region of RAD1 is essential for the RAD1-RAM1 interaction [11,27]. Similar dependencies were observed for DELLA protein members: the LZ region of SLR1, which corresponds to the LHRI region, is important for homodimerization [23]. These results are consistent with the crystal structure, in which the LHRI region was found to be directly involved in the intermolecular interactions in the SHR-SCR dimer [12] (see below).

GRAS domain structure
The reported crystal structures of the SHR-SCR heterodimer and the SCL7 homodimer were determined at 2.0 and 1.8 Å resolution, respectively [12,13]. These structures revealed a common subdomain organization of the GRAS domain comprising an α-helical cap and α/β core subdomains (Fig. 3). The SHR GRAS domain contains fourteen α-helices, three 3₁₀-helices and nine β-strands, which form a central β-sheet (Fig. 4). The architecture belongs to the α/β folds of S-adenosyl methionine-dependent methyltransferases (SAM-MTs). The central β-sheet of SAM-MT comprises seven β-strands, which are conserved in the GRAS domain. In addition to the seven-stranded β-sheet, GRAS domains have two additional strands (β6 and β7) and one α-helix (α13) forming a β6-α13-β7 segment at one edge of the central β-sheet. This segment is important for protein-protein interactions that mediate direct binding between SHR and the zinc finger of the BIRD/IDD family of transcription factors, or that mediate dimerization of SCL7 (see below).
The α-helical cap of SHR forms an antiparallel helix bundle comprising six helices: three N-terminal α-helices (α1, α2 and α3) encompassing the LHRI motif, two α-helices (α10 and α11) from the α/β core subdomain, and the αA-helix, which is located between the α3 and α4 helices and links the α-helical cap and α/β core subdomains (Fig. 3). SCR has a short 3₁₀-helix (ηA) and a loop in lieu of the αA-helix of SHR. In SCL7, a helix corresponding to the αA-helix of SHR or the ηA-helix of SCR was not found, and the entire loop between the α3 and α4 helices is not visible, probably because it is disordered. The α-helical cap sits on the α/β core subdomain, which incorporates a nine-stranded mixed β-sheet at the centre with seven α-helices on both sides of the β-sheet. The main part of the central β-sheet, β3-β2-β1-β4-β5, is parallel, but the remaining part of the β-sheet, β9-β8/β6-β7, is antiparallel. One side contains two α-helices (α9 and α12) and a long loop containing a short α-helix (α8) and a short 3₁₀-helix (η1), whereas the other side contains helices (α4, α5, α6, α7 and α13) forming a large groove, which mediates [...]. The overall structures of the GRAS domains, however, display a somewhat large deviation (2.7 Å). This discrepancy is caused by a movement of the α-helical cap with respect to the α/β core subdomain. The well-conserved subdomain structures of the α-helical cap and α/β core subdomains are observed in the SCL7 structure. The overall structure of the SCL7 GRAS domain resembles that of SHR (2.0 Å) rather than that of SCR (2.3 Å). GRAS domains possess a large cavity in the α/β core covered by the α-helical cap, as observed in members of the SAM-MT family (Fig. 5).
However, GRAS domains lack the SAM-binding motifs, which are conserved in SAM-MT members and no binding was observed for SAM, S-adenosyl homocysteine, or the product monomethyl-L-lysine in our binding assays using isothermal titration calorimetry (ITC) [12], suggesting that the GRAS domains lack methyltransferase activity.

Structures of GRAS domain dimers
The GRAS domains of SHR and SCR form a 1 : 1 heterodimer with pseudo-dyad symmetry [12] (Fig. 6). Dimerization of SHR and SCR GRAS domains is mediated by the α-helical caps to form a head-to-head dimer. The dimer interface comprises eight α-helices (α2, α3, αA/ηA and α11) and four loops (α2-α3 and α10-α11) from both α-helical caps with a large buried accessible surface area (~2070 Å²). Both polar and nonpolar contacts comprise the interface, involving direct hydrogen bonding and salt bridging, water-mediated hydrogen bonding, and hydrophobic interactions. The interface contains asymmetric interactions, which may confer the specificity required for SHR-SCR heterodimerization. Notably, the SCR nonpolar segment encompassing the C-terminal half of the α3 helix followed by the α3-ηA loop and the short 3₁₀-helix ηA acts as a 'hydrophobic belt', which wraps around the α2 helix from SHR with nonpolar contacts. The hydrophobic belt is conserved in SCR across species but not in other GRAS proteins, and nonpolar residues of the SHR α2 helix are also conserved and specific for SHR. In SHR, most of the long α3-ηA loop is folded into the αA helix, which makes a parallel helix-helix interaction with the α2 helix from SCR. Thus, these interactions should be specific to heterodimerization between SHR and SCR.
The SHR-SCR heterodimeric structure naturally suggests that a class of GRAS domains may be capable of forming dimers mediated by the antiparallel helix bundle of the α-helical cap. However, the homodimeric structure of the SCL7 GRAS domain provides another variation in dimerization (Fig. 7) [13]. The SCL7 GRAS domain forms a dimer in which protomers are related by crystallographic dyad symmetry. The interface consists of a groove formed by the α4- and α7-helices at the molecular surface of the α/β core subdomain and the α12 helix from the other protomer docked into the groove. The interface buries an accessible surface area of 1346 Å², which is significantly smaller than that of the SHR-SCR heterodimer. Consistently, SCL7 exists in an equilibrium between monomeric and dimeric states in solution, as shown by size exclusion chromatography (SEC) analysis [13]. The α-helical cap of SCL7 has an unusual feature in that the nonpolar sidechains of three alanine residues (Ala227, Ala230 and Ala234) from the α2-helix, Phe248 and Ala259 from the α3-helix and Trp489 from the α11-helix are exposed to the solvent region. These exposed nonpolar residues form a hydrophobic patch on the molecular surface and suggest that the α-helical cap of SCL7 should have a hitherto unidentified binding partner.

Downstream effector recognition by GRAS domains
The interaction between the SHR-SCR complex and BIRD/IDD transcription factors is one of the best studied examples of target effector protein recognition by GRAS proteins. BIRD/IDD transcription factors contain four conserved zinc fingers (ZF1-ZF4) in the N-terminal region, and ZF3-ZF4 is important for binding to the SHR-SCR complex (Fig. 8). ITC experiments identified the relatively high affinity of the SHR-SCR complex for MGP/IDD3 ZF3-ZF4 (KD = 36 nM) and JKD/IDD10 ZF3-ZF4 (KD = 124 nM) [12]. The crystal structure of the SCR-SHR heterodimer bound to ZF3-ZF4 of JKD, the JKD-SHR-SCR complex structure, has been determined at 2.7 Å resolution [12]. The crystal structure reveals that the zinc fingers ZF3 and ZF4 bind directly to SHR of the SHR-SCR complex (Fig. 8). Each ZF of JKD possesses a common ββα-type structure representative of a classical C2HC zinc finger. It is well established that the α-helix of ββα-type ZFs docks into the major groove of DNA for reading of the DNA sequence [28]. In the JKD-SHR-SCR complex structure, the α-helix of ZF4 is docked into the groove formed by the α4- and α7-helices of the α/β core subdomain of SHR with stabilization by nonpolar interactions, while the β-sheet of ZF3 binds a shallow groove formed by the α13- and α6-helices of the α/β core subdomain and is stabilized via polar interactions. Thus, the orientations of the two ZFs of JKD against SHR differ, and the α-helix of ZF3 is accessible to DNA, but that of ZF4 is not. The observed binding mode is consistent with the fact that, unlike ZF4, zinc fingers ZF1-ZF3 are critical for DNA binding.
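To put the two ITC dissociation constants quoted above on a common energy scale, the short Python sketch below converts them to standard binding free energies using ΔG° = RT ln(KD/c°). The temperature (298.15 K) and the 1 M standard state are assumptions made here for illustration; the review does not state the conditions of the ITC measurements, and the function and variable names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant.

    Uses dG = R * T * ln(K_D / c0) with a 1 M standard state (c0 = 1 M).
    The default temperature is an assumption, not a value from the review.
    """
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


# Dissociation constants quoted in the text, in nM.
KD_NM = {"MGP/IDD3 ZF3-ZF4": 36.0, "JKD/IDD10 ZF3-ZF4": 124.0}

# Convert nM to M before taking the logarithm.
DG = {name: binding_free_energy_kj(kd * 1e-9) for name, kd in KD_NM.items()}
AFFINITY_RATIO = KD_NM["JKD/IDD10 ZF3-ZF4"] / KD_NM["MGP/IDD3 ZF3-ZF4"]

for name, dg in DG.items():
    print(f"{name}: K_D = {KD_NM[name]:.0f} nM, dG ~ {dg:.1f} kJ/mol")
print(f"MGP binds about {AFFINITY_RATIO:.1f}-fold more tightly than JKD")
```

Under these assumptions the two complexes differ by roughly 3 kJ/mol, which corresponds to the approximately 3.4-fold difference in KD.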
The ZF4 α-helix bound to the SHR groove contains the SHR-binding motif that comprises the ZF4-specific sequence R(K/R)DxxITHxAFCD (in which x represents any amino acid residue). The SHR-binding motif is highly conserved in 13 (from IDD1 to IDD13) of the 16 members of the A. thaliana BIRD/IDD family of transcription factors (Fig. 9). The other three members, IDD14, IDD15 and IDD16, lack a Phe residue corresponding to Phe206 of JKD ZF4, and lack other residues important for SHR binding. The SHR-binding motif is less conserved in other GRAS proteins, suggesting specific binding to SHR but not to other GRAS family proteins. The crystal structure of the DNA-bound form of mouse immediate early protein Zif268, which is a typical ZF transcription factor containing three tandem repeats of zinc fingers, revealed that each ZF is folded into a typical ββα structural module, which is docked into the major groove of DNA in a configuration where the ZF α-helix is inside the DNA groove and the β-sheet is outside of the groove [29]. This binding mode enabled us to build a model of the DNA-bound JKD-SHR-SCR complex (Fig. 10). The model suggests that SHR-SCR are transcriptional cofactors that regulate target gene transcription via binding of SHR to BIRD transcription factors.
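The motif notation R(K/R)DxxITHxAFCD maps directly onto a regular expression, which gives one simple way to scan a protein sequence for a candidate SHR-binding motif. The Python sketch below is only an illustration: the helper names are hypothetical, and the two test sequences are invented for this example and are not real IDD sequences. A sequence match says nothing about actual binding.

```python
import re

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def motif_to_regex(motif):
    """Translate the motif notation used in the text into a regex string.

    '(K/R)' means either residue at that position; 'x' means any residue.
    """
    # (K/R) -> [KR]
    pattern = re.sub(
        r"\(([A-Z](?:/[A-Z])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    # x -> any one of the 20 standard residues
    return pattern.replace("x", "[" + AMINO_ACIDS + "]")


SHR_BINDING_MOTIF = "R(K/R)DxxITHxAFCD"
SHR_BINDING_RE = re.compile(motif_to_regex(SHR_BINDING_MOTIF))


def find_motif(sequence):
    """Return (start index, matched segment) for each motif occurrence."""
    return [(m.start(), m.group()) for m in SHR_BINDING_RE.finditer(sequence.upper())]


# Invented sequences for illustration only (not taken from any database).
WITH_MOTIF = "MSTRKDSLITHRAFCDALG"
WITHOUT_PHE = "MSTRKDSLITHRALCDALG"  # Phe of the motif replaced by Leu

print(find_motif(WITH_MOTIF))
print(find_motif(WITHOUT_PHE))
```

The second sequence mimics the situation described for IDD14-IDD16, where loss of the conserved Phe removes the match.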
In contrast with the JKD-SHR-SCR complex structure, the SCL7 homodimer was suggested to bind directly to DNA [13]. This idea was drawn from the fact that the SCL7 homodimer found in the crystal has a positively charged cleft between the SCL7 protomers. If the cleft is expanded by 6.4 Å in width, the enlarged cleft was shown to be capable of accommodating a double-stranded DNA helix. In vitro binding analysis using electrophoretic mobility shift assays with an oligonucleotide having a modelled sequence and the purified recombinant SCL7 GRAS domain showed a band shift indicating formation of a protein-DNA complex, although the band shift was rather faint, probably in part because an oligonucleotide with an artificial sequence was used. Further experiments are required to confirm the DNA-binding activity of the GRAS domain by identifying the genuine target DNA sequence(s) involved.
[Figure legend, beginning missing] Highly conserved residues (more than 80%) and relatively conserved residues (60-80%) are filled in pink and yellow, respectively, while the conserved Cys or His residues, which are essential for coordinating zinc ions, are filled in green. In the JKD-SHR-SCR ternary complex [12], JKD/IDD10 residues whose side chain atoms form hydrogen bonds with SHR residues, residues whose main chain atoms form hydrogen bonds with SHR residues and residues involved in hydrophobic interactions with SHR are marked with a filled circle (cyan), an open circle (cyan) and a triangle (red), respectively.
[...] protein-protein or protein-ligand interactions. One site is the α-helical cap, which facilitates intermolecular helix-bundle formation to form a GRAS domain dimer. The SHR-SCR heterodimer is an example of α-helical cap-mediated dimerization, and similar dimerization modes mediated by the α-helical cap are predicted in homodimers of other GRAS proteins.
For example, α-helical cap-mediated dimerization of RGA is expected based on the results of interaction site mapping that showed impaired RGA homodimerization by deletion of the LHRI region (the LRI region) [30]. Impaired homodimerization by deletion of the LZ region, which encompasses LHRI, supports SLR1 dimerization mediated by the α-helical cap [23]. Moreover, the importance of the LHRI region in heterodimerization of M. truncatula NSP1 and NSP2 has also been demonstrated [26]. The interaction between the α-helical caps of SHR and SCR is mediated by nonpolar contacts in addition to polar contacts. Therefore, in the absence of binding partners, the hydrophobic patch formed by surface nonpolar residues should be exposed to the solvent region and may destabilize the α-helical cap structure or tend to facilitate the formation of nonspecific aggregates. Moreover, the N-terminal variable region directly linked to the α-helical cap is characterized by conformationally flexible sequences displaying no stable secondary structure in the absence of binding partners. These facts partly explain why the preparation of recombinant GRAS protein samples for structural and physical studies is relatively difficult compared with other soluble proteins. Studies directed towards identifying binding partners, which may be other GRAS proteins or other classes of proteins, are essential, and structural studies of their respective stable complex forms may overcome the difficulty associated with structural studies of isolated GRAS proteins. The crystal structure of SCL7 shows a homodimeric form with the α-helical cap in the free state, and this α-helical cap possesses surface hydrophobic patches that are capable of mediating interactions with other molecules. Thus, the presence of surface hydrophobic patches on the α-helical cap seems to be common to all GRAS domains and to contribute to intermolecular interactions.
The other potential site for intermolecular interactions is the hydrophobic groove formed by the α3- and α7-helices on the α/β core of GRAS domains. This groove is found in all three GRAS domain structures of SHR, SCR and SCL7, suggesting the presence of common structural characteristics in GRAS domains. The zinc finger ZF4 of the JKD transcription factor of the BIRD/IDD family binds this hydrophobic groove of the SHR α/β core subdomain in the JKD-SHR-SCR structure. Detailed inspection of the binding mode revealed the SHR-binding motif found in the ZF4 zinc finger. Intriguingly, 13 (IDD1-IDD13) of the 16 members of the BIRD/IDD family have a conserved SHR-binding motif, predicting that these 13 transcription factors function in conjunction with SHR. Since the 13 transcription factors bind a common binding site, the SHR-binding of these transcription factors is mutually exclusive. This type of binding competition could play a key role in regulation, as seen with JKD and MGP [12]. Thus, structural information of protein-protein complexes provides valuable information for understanding biologically important proteins. The hydrophobic groove formed by the α3- and α7-helices is extended to a negatively charged shallow groove formed by the α13- and α6-helices. This groove accommodates the positively charged β-sheet of the ZF3 zinc finger of JKD in a fashion enabling this zinc finger to bind DNA. Our model building shows that JKD can bind DNA in the SHR-SCR-bound form, suggesting that SHR-SCR plays a role as a transcriptional cofactor, which does not bind DNA directly but interacts with other transcription factors or general transcription factors such as polymerases. SCR possesses a hydrophobic groove formed by the α3- and α7-helices, although the binding partner is unknown. SCR also possesses a shallow groove formed by the α13- and α6-helices, but this groove has no negative charges, suggesting distinct specificity from that of SHR. SCL7 also possesses a hydrophobic groove formed by the α3- and α7-helices.
In this case, the groove accommodates the α12-helix (corresponding to the α13-helix of SHR) from the other protomer to mediate homodimerization by α/β core subdomain docking, with direct interactions with the α7-helix and the α6-helix at the bottom of the groove. Thus, the hydrophobic groove may be utilized for dimerization of some GRAS proteins. A positively charged putative DNA-binding groove is located between the SCL7 protomers, suggesting that some GRAS proteins may function as transcription factors that bind directly to DNA. However, additional experimental evidence is required to confirm this possibility. Direct DNA-binding of M. truncatula NSP1 has been demonstrated with identification of the core AATTT motif in the target ENOD11, NIN and ERN1 promoters [26]. A closer relationship or common sequence characteristics between NSP1 and SCL7 might be expected, but SCL7 shows low sequence identity (14%) with NSP1. Moreover, the α-helical cap but not the α/β core subdomain was shown to be essential for dimerization of NSP1 and NSP2, as described above. These results suggest that the DNA-binding model of NSP1 and NSP1-NSP2 should differ from the model proposed for SCL7 [13]. The SHR-SCR heterodimer shows that both the SHR and SCR GRAS domains are overall negatively charged and lack the prerequisite for direct DNA-binding.
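The binding competition at the SHR groove discussed above can be illustrated with a textbook single-site competition model. The Python sketch below is a simplified illustration, not an analysis from the original studies: it assumes one shared binding site, equilibrium conditions, and known free concentrations of each competitor. The KD values are the ITC values quoted earlier for MGP and JKD ZF3-ZF4; the 100 nM free concentrations and the function name are hypothetical.

```python
def competitive_occupancy(free_conc_nm, kd_nm):
    """Fractional occupancy of one shared site by mutually exclusive ligands.

    For ligand i: occupancy_i = (L_i / K_i) / (1 + sum_j L_j / K_j),
    where L is the free ligand concentration and K the dissociation constant.
    """
    weights = {name: free_conc_nm[name] / kd_nm[name] for name in kd_nm}
    denominator = 1.0 + sum(weights.values())
    occupancy = {name: w / denominator for name, w in weights.items()}
    occupancy["unbound"] = 1.0 / denominator
    return occupancy


# Dissociation constants from the text (nM); concentrations are hypothetical.
KD_NM = {"MGP": 36.0, "JKD": 124.0}
FREE_CONC_NM = {"MGP": 100.0, "JKD": 100.0}

OCCUPANCY = competitive_occupancy(FREE_CONC_NM, KD_NM)
for state, fraction in OCCUPANCY.items():
    print(f"{state}: {fraction:.2f}")
```

At equal free concentrations the occupancy ratio of the two competitors equals the inverse ratio of their KD values, so in this model the tighter binder dominates the shared site unless the other factor is present at a correspondingly higher concentration.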