
Volume 4, Issue 3 p. 174-191
Open Access

Genetic susceptibility to breast cancer

Nasim Mavaddat

Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge CB1 8RN, United Kingdom

Antonis C. Antoniou

Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge CB1 8RN, United Kingdom

Douglas F. Easton

Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge CB1 8RN, United Kingdom

Montserrat Garcia-Closas (Corresponding Author)

Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge CB1 8RN, United Kingdom

Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville MD, United States

Corresponding author. Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, United Kingdom. Tel.: +44 1223740145; fax: +44 1223740147.
First published: 21 May 2010
Citations: 241


Genetic and lifestyle/environmental factors are implicated in the aetiology of breast cancer. This review summarizes the current state of knowledge on rare high penetrance mutations, as well as moderate and low-penetrance genetic variants implicated in breast cancer aetiology. We summarize recent discoveries from large collaborative efforts to combine data from candidate gene studies, and to conduct genome-wide association studies (GWAS), primarily in breast cancers in the general population. These findings are compared with results from collaborative efforts aiming to identify genetic modifiers in BRCA1 and BRCA2 carriers. Breast cancer is a heterogeneous disease, and tumours from BRCA1 and BRCA2 carriers display distinct pathological characteristics when compared with tumours unselected for family history. The relationship between genetic variants and pathological subtypes of breast cancer, and the implication of discoveries of novel genetic variants to risk prediction in BRCA1/2 mutation carriers and in populations unselected for mutation carrier status, are discussed.


  • GWAS: genome-wide association studies
  • FRR: familial relative risk
  • ER: oestrogen receptor
  • PR: progesterone receptor
  • HER2: human epidermal growth factor receptor 2
  • SNP: single nucleotide polymorphism
  • LD: linkage disequilibrium
  • MAF: minor allele frequency
  • RR: relative risk
  • TN: triple negative

    1 Introduction

    Both non-genetic and genetic factors are involved in the aetiology of breast cancer. Non-genetic factors include menstrual and reproductive history, body mass index, alcohol intake and physical activity. The genetic component of the disease is reflected in a tendency to cluster in families, although this could also reflect shared life-style and environment. A measure of this familial clustering is the familial relative risk (FRR), defined as the ratio of the risk of disease for a relative of an affected individual to that for the general population. FRR for breast cancer varies both with age of cancer diagnosis of the index case and the age of the relative (Familial breast cancer, 2001; Pharoah et al., 2000). For example, FRR decreases from more than five-fold in women younger than age 40 years with a relative aged younger than 40 years at diagnosis, to 1.4 fold in women older than 60 years with a relative diagnosed over age 60 years (Easton, 2002). FRR increases progressively with the number of affected relatives (Familial breast cancer, 2001), and the increased risk extends to more distant relatives (Amundadottir et al., 2004). None of the known environmental risk factors for breast cancer influence FRR (Familial breast cancer, 2001). Simulation studies have demonstrated that even with complete correlation in the environmental risk factor among relatives, such risk factors would need to confer very large risk ratios in order to lead to even modest increases in the FRR (Hopper and Carlin, 1992). These findings, together with the observation that risk is very high in monozygotic twins and yearly incidence of breast cancer in the monozygotic twin is approximately constant following diagnosis of the affected twin (Peto and Mack, 2000), suggest that FRR for breast cancer is a direct reflection of the genetic component of the disease (Easton, 2002; Peto and Mack, 2000).

    This review focuses on genetic risk factors for breast cancer in individuals with and without a strong history of breast cancer. The genetic variants associated with breast cancer risk can be classified as high-penetrance mutations that are rare in the population but associated with very high risk (relative risk of carriers versus non-carriers of 5 to >20); moderate penetrance variants associated with moderate increases in risk; and low-penetrance polymorphisms which are common and associated with small increases in risk (relative risk <1.5). Here we summarize recent discoveries from large collaborative efforts to identify common genetic factors for breast cancer in populations unselected for mutation carrier status, and genetic modifiers of BRCA1 and BRCA2 mutation carriers, using candidate gene and genome-wide association studies (GWAS). We also discuss the relationship between genetic variants and pathological subtypes of breast cancer, and the implication of discoveries of novel genetic variants to risk prediction in BRCA1/2 mutation carriers and in the general population.

    2 Genetic factors associated with breast cancer

    2.1 High penetrance mutations

    Some of the familial clustering in breast cancer occurs as part of specific familial breast cancer syndromes where disease results from single alleles conferring a high risk (Table 1) (Turnbull and Rahman, 2008). Linkage studies conducted in the 1990s led to the discovery that mutations in the tumour suppressor genes BRCA1 and BRCA2 conferred a high risk of breast cancer (Easton et al., 1993; Hall et al., 1990; Wooster et al., 1994). These genes also predispose to ovarian cancer and a substantial percentage of families with breast and ovarian cancers harbour mutations in BRCA1 and/or BRCA2 (Ford et al., 1998; Narod, 2002). For example, in families with multiple cases of breast cancer, disease was linked to BRCA1 in 52% and to BRCA2 in 32% of families, and in families with breast-ovarian cancer, disease was linked to BRCA1 in 84% and to BRCA2 in 14% of cases (Ford et al., 1998). Most of the deleterious mutations are small deletions or insertions that result in translation of a truncated protein. The frequency of these mutations may vary in different populations, particularly in certain founder populations. For example, three deleterious founder mutations have been described in the Ashkenazi Jewish population (King et al., 2003). In BRCA1/2 associated tumours loss of heterozygosity of the normal allele is frequently observed (Osorio et al., 2002). Bi-allelic germ-line mutations in BRCA2 are associated with a subgroup of Fanconi anaemia (D1) in which there is susceptibility to childhood tumours (Howlett et al., 2002).

    Table 1. Allele frequency and effect sizes associated with high, moderate and low penetrance variants, and their estimated contribution to the FRR for breast cancer.
    Locus | Ref | Genes in/near region | Variant | MAF | RR | FRR explained (b)
    High penetrance mutations
    17q21 | (Antoniou et al., 2008a) | BRCA1 | | 0.0006 | 5–45 (a) | 10%
    13q12.3 | (Antoniou et al., 2008a) | BRCA2 | | 0.001 | 9–21 (a) | 12%
    17p13.1 | (Birch et al., 2001) | TP53 | | rare | 2–10 | 5%
    10q23.3 | (Nelen et al., 1996) | PTEN | | rare | 2–10 |
    19p13.3 | (Jenne et al., 1998) | STK11 | | rare | 2–10 |
    16q22.1 | (Masciari et al., 2007) | CDH1 | | rare | 2–10 |
    Moderate penetrance variants
    11q22.3 | (Renwick et al., 2006) | ATM | | 0.003 | 2–3 |
    22q12.1 | (Meijers-Heijboer et al., 2002) | CHEK2 | | 0.004 | 2–3 |
    17q22-q24 | (Seal et al., 2006) | BRIP1 | | 0.001 | 2–3 |
    16p12.1 | (Rahman et al., 2007) | PALB2 | | rare | 2–4 |
    Low penetrance variants
    10q26 | (Easton et al., 2007) | FGFR2 | rs2981582 | 0.38 | 1.26 | 8.3%
    16q12 | (Easton et al., 2007) | TOX3 | rs3803662 | 0.25 | 1.20 |
    5q11 | (Easton et al., 2007) | MAP3K1 | rs889312 | 0.28 | 1.13 |
    8q24 | (Easton et al., 2007) | FAM84B/c-MYC | rs13281615 | 0.40 | 1.08 |
    11p15 | (Easton et al., 2007) | LSP1 | rs3817198 | 0.30 | 1.07 |
    3p24 | (Ahmed et al., 2009) | NEK10/SLC4A7 | rs4973768 | 0.46 | 1.11 |
    17q23.2 | (Ahmed et al., 2009) | COX11 | rs6504950 | 0.27 | 0.95 |
    10p14 | (Cox et al., 2007) | CASP8 (D302H) | rs1045485 | 0.13 | 0.88 |
    2q35 | (Milne et al., 2009) | TNP1/IGFBP5/IGFBP2/TNS1 | rs13387042 | 0.52 | 1.12 |
    1p11.2 | (Thomas et al., 2009) | NOTCH2/FCGR1B | rs11249433 | 0.40 | 1.14 |
    14q24.1 | (Thomas et al., 2009) | RAD51L1 | rs999737 | 0.24 | 0.84 |
    5p12 | (Stacey et al., 2008) | MRPS30/FGF10 | rs10941679 | 0.26 | 1.19 |
    6q25.1 (c) | (Zheng et al., 2009b) | ESR1 (d) | rs2046210 | 0.35 | 1.29 |
    • MAF, minor allele frequency from European populations; RR, relative risk; FRR, familial relative risk.
    • (a) For BRCA1 and BRCA2, model-based estimates of relative risk with decreasing age are given; for low-penetrance polymorphisms, the per-allele OR (relative to common homozygotes) is given.
    • (b) FRR explained is FRR as a proportion of total FRR for breast cancer, estimated to be 1.9 (ESR1 has not been included in this estimate); Ref, reference for MAF and RR.
    • (c) MAF and RR in the Chinese population are shown.
    • (d) Other genes are also located in this region.

    Other high penetrance alleles have been identified as part of inherited cancer syndromes. These include germ-line TP53 mutations found in Li-Fraumeni cancer syndrome (Garber et al., 1991; Malkin, 1994), PTEN germ-line mutations in Cowden syndrome (Nelen et al., 1996), and STK11/LKB1 mutations in Peutz-Jeghers syndrome (Hemminki et al., 1998). Genome-wide linkage studies have failed to map other highly penetrant cancer susceptibility genes, suggesting strongly that no further high-penetrance genes of comparable importance to BRCA1 and BRCA2 exist (Smith et al., 2006). In spite of the high risks conferred by high penetrance mutations, these mutations are rare in the population, and are estimated to account for a relatively small percentage (about 20–25%) of the familial risk (Thompson and Easton, 2004).

    While mutations in BRCA1 and BRCA2 are known to confer high risks of breast and ovarian cancer, the precise magnitude of these risks is more uncertain. Initial estimates were derived in studies of selected high-risk families, but more recent estimates have been derived from population-based studies and have generally been lower. For example, a meta-analysis of BRCA1 and BRCA2 carrier families identified through population-based studies estimated the cumulative breast cancer risks by age 70 years to be 65% for BRCA1 and 45% for BRCA2 (Antoniou et al., 2003). Prospective studies of unaffected mutation carriers could provide penetrance estimates that are free of ascertainment bias (Kauff et al., 2008), but large numbers of individuals with long follow-up periods would be necessary to obtain precise estimates. In spite of methodological issues, it is clear that there is considerable variation in risk between mutation-carrying individuals and families (Begg, 2002; Begg et al., 2008). Risk estimates vary by age at diagnosis of the index case, type and site of the cancer (for example breast or ovarian, contralateral and unilateral), and site of the mutation (Antoniou et al., 2003; Begg et al., 2008). Segregation analysis models have also quantified the extent of variation in risk between and within families (Antoniou et al., 2008a; Begg et al., 2008). Lifestyle factors such as parity may influence risk (Andrieu et al., 2006; Cullinane et al., 2005; Milne et al., 2010). However, the higher risk in mutation carriers with a strong family history of the disease suggests that genetic modifiers of BRCA1 and BRCA2 also influence the risk of the disease (Antoniou et al., 2008a).

    2.2 Moderate penetrance variants

    Another group of genetic variants associated with breast cancer risk are uncommon variants (minor allele frequency (MAF): 0.005–0.01) with moderate effects on risk (Table 1). These variants have been identified in candidate gene and family-based association studies. They include protein-truncating variants in CHEK2, notably 1100delC (Meijers-Heijboer et al., 2002; CHEK2*1100delC and susceptibility to breast cancer, 2004), and variants in PALB2 (Rahman et al., 2007), BRIP1 (Seal et al., 2006), and ATM (Renwick et al., 2006). All of these genes are involved in DNA repair mechanisms. CHEK2 encodes a protein that regulates repair of DNA double-strand breaks by phosphorylating p53 and BRCA1. PALB2 encodes a protein that facilitates BRCA2-mediated DNA repair by promoting localization and stability of BRCA2. BRIP1 encodes a helicase that interacts with BRCA1 and plays a role in checkpoint control. The ATM protein kinase is involved in the phosphorylation of multiple proteins including p53, BRCA1 and BRCA2. Other candidate breast cancer genes have been proposed, for example MRE11, encoding a component of the MRE11-RAD50-NBS1 complex, critical for maintenance of genomic integrity and tumour suppression (Bartkova et al., 2008).

    Because of the modest increases in risk and relatively low frequency of this class of genetic variants, their contribution to familial relative risk is estimated to be less than 3% (Rahman et al., 2007). Since few genes have been studied in this way, it is likely that additional susceptibility variants of this class exist. However, re-sequencing of large numbers of cases and controls will be required to uncover them.

    2.3 Low penetrance variants

    Most of the unexplained fraction of familial relative risk is likely to be explained by a polygenic model involving a combination of many individual variants with weak associations with risk, the so called low-penetrance polymorphisms (Antoniou et al., 2002, 2004; Antoniou and Easton, 2006). The sequencing of the human genome and the mapping of single nucleotide polymorphisms (SNP), the most common form of genetic variation, has allowed rapid evolution of approaches to studying common genetic susceptibility factors. These technical advances, coupled with substantial decreases in genotyping cost, have enabled investigators to move beyond evaluating a few candidate variants in key genes, to conducting more comprehensive, as well as exploratory, evaluation of common genetic variation in candidate pathways to cancer, and to performing genome-wide association studies (GWAS) in very large study populations. Studies of candidate genes in breast cancer have evaluated a few hundred genes involved in known breast carcinogenic pathways, such as hormone metabolism and DNA repair. On the other hand, GWAS use a more agnostic approach to identify markers in genomic regions or genes associated with risk, which are subsequently studied to identify causal variants and biological mechanisms. Specifically, this strategy selects subsets of SNPs to capture most common variation in the genome in a given population, taking advantage of the correlation among neighbouring genetic variants that have not been randomly assorted by recombination (linkage disequilibrium, LD).
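    The tagging idea described above rests on a pairwise LD measure between SNPs. The following is a minimal sketch, assuming the commonly used squared correlation r² computed from haplotype frequencies; the function names and the 0.8 tagging threshold are illustrative assumptions and are not taken from the studies cited here.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic SNPs.

    p_a  : frequency of allele A at the first SNP
    p_b  : frequency of allele B at the second SNP
    p_ab : frequency of the haplotype carrying both A and B
    """
    for name, value in (("p_a", p_a), ("p_b", p_b)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def is_tagged(p_ab: float, p_a: float, p_b: float, threshold: float = 0.8) -> bool:
    """True if the second SNP is captured by the first at the given r^2 threshold.

    The default threshold of 0.8 is an assumption made for illustration.
    """
    return ld_r_squared(p_ab, p_a, p_b) >= threshold
```

    With independent SNPs (p_ab equal to p_a times p_b) the measure is zero, and when the two alleles always travel together it approaches one, which is why a genotyped marker can stand in for an ungenotyped neighbour.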

    Most breast cancer susceptibility loci identified through candidate gene approaches have not been confirmed (Commonly studied single-nucleotide polymorphisms and breast cancer, 2006). The most convincing association has been for a coding variant (D302H) in the caspase 8 gene (CASP8) (Cox et al., 2007). The H allele occurs in 13% of women of European background, and is associated with approximately 12% reduction in breast cancer risk (Table 1). Caspase-8 is a cysteine protease that plays an important role in the initiation of apoptosis or programmed cell death in response to DNA damage (Barnhart et al., 2003; Fulda, 2009). The functional implications of the D302H variant are unknown and it is possible that other variants in linkage disequilibrium are causative. Ongoing studies are further evaluating genetic variants in CASP8 and related genes. Another common variant with weaker evidence for an association with risk is a coding variant (L10P) in the transforming growth factor-β (TGFB1) gene (Cox et al., 2007). There was some suggestion that this association is restricted to progesterone receptor negative tumours; however, the evidence for the association is not conclusive.

    In contrast to candidate gene studies, GWAS in breast cancer so far have led to the discovery of genetic markers located in twelve susceptibility loci (Table 1). Some of these regions contain or lie near known genes, namely FGFR2, TOX3, MAP3K1 and LSP1 (Easton et al., 2007) and RAD51L1 (Thomas et al., 2009), while others are located in non-genic regions far from the nearest gene, for example on 8q24 (Easton et al., 2007) and 2q35 (Milne et al., 2009; Stacey et al., 2007). Current knowledge on possible biological functions of the SNPs has recently been reviewed by Ghoussaini and Pharoah (2009). Briefly, the variants in 10q26 lie in intron 2 of FGFR2, which encodes a tyrosine kinase receptor. Fibroblast growth factors can act as oncogenes in murine mammary cancers (Dickson et al., 2000) and FGFR2 is amplified and over-expressed in 5–10% of human breast tumours (Adnane et al., 1991; Moffa et al., 2004). Genetic mapping and functional studies suggest strongly that rs2981578 is the causal variant in FGFR2 (Meyer et al., 2008; Udler et al., 2009). The SNP in 16q12 is located 8 kb from TOX3, a gene that encodes a protein predicted to act as a transcription factor (O'Flaherty and Kaye, 2003). TOX3 has been implicated in metastasis to bone in breast cancer (Smid et al., 2006). The 5q11 variant lies in a 298 kb LD block containing MAP3K1. This gene encodes a component of the MAPK signalling pathways, involved in cell signalling, proliferation and apoptosis (Easton et al., 2007). The associated region in 3p24 (Ahmed et al., 2009) contains two genes: the function of the product of NEK10 is largely unknown but it is one of a family of kinases involved in cell cycle control; the SLC4A7 protein is a tyrosine kinase substrate, expression of which is reduced in breast tumour sections and cell lines (Chen et al., 2007).

    The variant in 8q24 lies in a region that does not contain known genes, but is 400 kb upstream of the proto-oncogene MYC (Easton et al., 2007; Ghoussaini and Pharoah, 2009). This region is a multi-cancer locus, also associated with prostate and colorectal cancer risk (Easton et al., 2007). The SNP in 5p12 (Stacey et al., 2008) is in the region of MRPS30, a major component of apoptotic signalling pathways in cells (Cavdar et al., 2001a, 2001b). The SNP in 1p11.2 (Thomas et al., 2009) is located in the pericentromeric region of the chromosome, about 220 kb away from a SNP desert. The SNP maps to a large block of LD, which contains several pseudogenes, and a member of the low-affinity Fc gamma receptor family (Thomas et al., 2009). The promoter of NOTCH2, a gene recently shown to be associated with type 2 diabetes (Zeggini et al., 2008), lies distal to the SNP desert. RAD51L1 is in the double-strand break repair and homologous-recombination pathway (Thomas et al., 2009). Overall, the current set of susceptibility loci appears to implicate multiple complex cell signalling pathways. With the exception of FGFR2, the causal variants and biological mechanisms underlying most of the associations are largely unknown. Further sequencing, fine-mapping and functional studies are therefore required to elucidate the relevant biological mechanisms.

    In models of breast cancer based on complex segregation analyses, the polygenic component is log-normally distributed with a variance that declines linearly with age (Antoniou et al., 2008a). Age-specific FRRs predicted by the model fit the observed FRR estimated by epidemiological studies (Familial breast cancer, 2001; Antoniou et al., 2008a). It is interesting to note that, whereas relative risk estimates for high penetrance mutations decline with age at diagnosis, relative risks for most of the SNPs discovered thus far have not been found to vary with age. All of the SNPs identified to date confer modest increases in risk (per-allele odds ratios less than 1.5) (Easton and Eeles, 2008). Under a multiplicative model of disease susceptibility, the twelve low-penetrance variants are estimated to explain about 8.3% of the familial clustering of the disease (Table 1). This suggests that many other variants remain to be identified. Such variants may be found in ongoing GWAS of even larger study populations, and populations of different ethnic origins and geographical locations.
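    The contribution of a single SNP to familial clustering can be sketched as follows. This is an illustrative calculation, not the authors' exact procedure: it assumes Hardy-Weinberg proportions, a multiplicative (per-allele) model, the first-degree-relative formula λ = (p·r² + q) / (p·r + q)², and that contributions combine additively on the log scale relative to an overall FRR. MAF and per-allele RR values are those listed in Table 1; the overall FRR is left as a parameter, and the result need not reproduce the published 8.3% exactly.

```python
import math

# (MAF, per-allele relative risk) for the twelve loci in Table 1, ESR1 excluded.
TABLE1_LOCI = {
    "FGFR2 rs2981582": (0.38, 1.26),
    "TOX3 rs3803662": (0.25, 1.20),
    "MAP3K1 rs889312": (0.28, 1.13),
    "8q24 rs13281615": (0.40, 1.08),
    "LSP1 rs3817198": (0.30, 1.07),
    "3p24 rs4973768": (0.46, 1.11),
    "COX11 rs6504950": (0.27, 0.95),
    "CASP8 rs1045485": (0.13, 0.88),
    "2q35 rs13387042": (0.52, 1.12),
    "1p11.2 rs11249433": (0.40, 1.14),
    "RAD51L1 rs999737": (0.24, 0.84),
    "5p12 rs10941679": (0.26, 1.19),
}


def locus_frr(p: float, r: float) -> float:
    """FRR to a first-degree relative attributable to one biallelic locus.

    p is the frequency of the risk allele and r its per-allele relative risk,
    under a multiplicative model and Hardy-Weinberg proportions.
    """
    q = 1.0 - p
    return (p * r * r + q) / (p * r + q) ** 2


def fraction_frr_explained(loci: dict, total_frr: float) -> float:
    """Share of the overall FRR explained, combining loci on the log scale."""
    explained = sum(math.log(locus_frr(p, r)) for p, r in loci.values())
    return explained / math.log(total_frr)


if __name__ == "__main__":
    # total_frr = 1.9 follows the footnote to Table 1; treat it as an input.
    share = fraction_frr_explained(TABLE1_LOCI, total_frr=1.9)
    print(f"Fraction of FRR explained: {share:.1%}")
```

    Because each locus-specific FRR is only slightly above one, even a dozen common variants account for a small share of the overall FRR, which is the point made in the text.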

    3 Pathological characteristics of breast cancer

    Most of the genetic variants described above have been identified in studies considering breast cancer as a single disease. However, breast cancer is a heterogeneous disease and pathological characteristics such as morphology, grade and hormone-receptor profile stratify tumours into biologically and clinically distinct groups. Global gene expression analyses have identified major breast cancer intrinsic subtypes (such as Luminal A, Luminal B, HER2-enriched, and Basal-like) that further explain the biological complexities of the disease (Perou et al., 2000). In this issue, Lønning reviews various gene expression signatures and their value in predicting response to drug treatments (Lønning, 2010), while tumour lineage and hypotheses underlying developmental origins of breast cancer subtypes are discussed in detail in several reviews (Navin and Hicks, 2010; Podo et al., 2010; Kwei et al., 2010).

    Most epidemiological studies have been confined to the analysis of routinely available pathological data, while studies of subtypes derived from expression profiling are only beginning to emerge. Expression of the phenotypic markers oestrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) has been tested in large numbers of breast cancers and these markers are important for classification of this disease. Most breast cancers in the general population express ER and PR, and the percentage of positive tumours increases with age at diagnosis, e.g. ER positivity varies from 60 to 70% in women younger than 35 years to 80% in women over 60 years (Surveillance, Epidemiology, and End Results (SEER) Program Limited-Use Data, 1973–2006). Tumour subtypes defined by expression of ER and PR exhibit different incidence patterns, suggesting that these subtypes are aetiologically heterogeneous (Anderson and Matsuno, 2006; Anderson et al., 2002). The proportion of HER2-negative and HER2-positive tumours remains approximately the same over all age groups. A number of environmental and life-style risk factors for breast cancer vary by hormone-receptor status of the tumour. For example, nulliparity, late age at first birth, obesity among postmenopausal women, and early menarche have been more strongly linked to ER and/or PR-positive than ER-negative tumours (Chlebowski et al., 2005; Colditz et al., 2004; Cotterchio et al., 2003; Garcia-Closas et al., 2006; Huang et al., 2000; Rusiecki et al., 2005; Sherman et al., 2007).

    In modeling genetic susceptibility to specific pathological subtypes it is important to study the FRR specific to each subtype. Epidemiological studies have for the most part found no significant difference between FRR for breast cancer by ER status, or by joint ER/PR status (Colditz et al., 2004; Cotterchio et al., 2003; Hines et al., 2008; Huang et al., 2000; Potter et al., 1995; Rosenberg et al., 2006; Rusiecki et al., 2005; Tutera et al., 1996; Welsh et al., 2009; Yang et al., 2007). A recent population-based cohort study of over 4500 cases with detailed family history information on all affected and unaffected relatives estimated overall breast cancer FRR for first degree relatives by tumour subtype (Mavaddat et al., 2010). Again, there was no significant difference in FRR for relatives of cases with ER-negative and ER-positive disease. However, there was some suggestion that the breast cancer FRR for relatives of patients with ER-negative disease was higher than that for ER-positive disease when the relative was younger than 50 years, whereas the FRR for relatives of patients with ER-positive disease was higher than that for ER-negative disease when the relative was older than 50 years (Mavaddat et al., 2010). Subtype-specific FRRs, that is, the relative risk of cancer of a particular subtype in a relative of a case with that same subtype, have not yet been estimated.

    3.1 Pathological characteristics of BRCA1 and BRCA2-related breast cancer

    BRCA1 and BRCA2-related tumours differ from each other and from sporadic cancers in their histo-pathological appearance, cytological and architectural features (Lakhani et al., 1998; Lakhani, 1999; Armes et al., 1998). BRCA1-related tumours are often poorly differentiated ductal carcinomas, of higher grade and higher mitotic count, greater degree of nuclear pleomorphism and less tubule formation than age-matched sporadic tumours (Lakhani, 1999; Lakhani et al., 2000). These tumours may also exhibit histological features of medullary carcinoma, a rare and poorly characterized form of cancer (Ridolfi et al., 1977), described in greater detail by Weigelt et al. (2010). There are fewer studies describing BRCA2-related tumours and these tumours are less distinctive. They tend to be of higher grade (Lakhani et al., 2000; Phillips, 2000), and exhibit less tubule formation (Pathology of familial breast cancer, 1997; Armes et al., 1998) than age-matched controls. The most striking feature of BRCA1-related tumours is, however, their lack of ER expression (Cortesi et al., 2000; Armes et al., 1999; Foulkes et al., 2004; Palacios et al., 2003; Karp et al., 1997; Lidereau et al., 2000; Eisinger et al., 1999; Loman et al., 1998; Osin and Lakhani, 1999; Johannsson et al., 1997; Lakhani et al., 2002). About 90% of breast tumours in BRCA1 mutation carriers are ER-negative (Lakhani et al., 2002). Tumours from BRCA1 mutation carriers also lack expression of PR and show a lower frequency of HER2-expressing cells compared with sporadic tumours unselected for family history (Lakhani et al., 2002). Tumours from BRCA2 mutation carriers show a lower frequency of HER2-expressing cells than sporadic tumours, but similar levels of ER and PR (Lakhani et al., 2002).

    Differences in BRCA1 and BRCA2 related cancers are also reflected in different gene expression profiles. A pattern of genes differentially expressed by BRCA1 and BRCA2 related and sporadic cancers (Hedenfalk et al., 2001), and a signature of genes able to distinguish BRCA1 tumours from other ER-negative sporadic breast cancers (van 't Veer et al., 2002), have been identified. Tumours from BRCA1 mutation carriers have been shown to be similar to the basal subtype of cancers (Sorlie et al., 2003). Basal tumours are frequently of the triple negative (TN) phenotype, that is, they do not express ER, PR or HER2, but they often express basal cytokeratin surface markers. BRCA1-related tumours and sporadic basal-like tumours are similar in gene expression, phenotype and clinical features. Sporadic basal-like tumours in fact display defects in the BRCA1 DNA repair pathway (Yehiely et al., 2006). Triple negative breast cancers and their association with BRCA1 mutation are discussed in detail by Podo et al. (2010).

    3.2 Low penetrance susceptibility loci by tumour pathological characteristics

    The associations of genetic risk factors for breast tumours from the general population with pathological subtypes of the disease have been assessed, and are summarized in Figure 1. Many of the recently discovered breast cancer susceptibility loci (for example FGFR2, MAP3K1, and the 8q and 5p loci) are more strongly associated with ER-positive disease. Some loci, such as TOX3, are associated with both disease types (Easton et al., 2007; Garcia-Closas et al., 2008; Stacey et al., 2007, 2008; Garcia-Closas and Chanock, 2008). These associations tend to correlate with PR status as ER and PR are highly correlated. Whether the small subset of ER-PR+ tumours is associated with a unique risk profile remains to be determined. As more pathological and genotype information is collated by large consortia such as the Breast Cancer Association Consortium (BCAC) or the Breast and Prostate Cohort Consortium (BPC3), associations for specific tumour subtypes, including assessment of TN and basal-like tumours, will be possible.

    Figure 1. Susceptibility loci in ER-positive and ER-negative disease in the general population, and in BRCA1 and BRCA2 carriers. Data from published studies: Easton et al. (2007); Garcia-Closas et al. (2008); Milne et al. (2009); Zheng et al. (2009b); Thomas et al. (2009); Ahmed et al. (2009); Cox et al. (2007); Stacey et al. (2008); Antoniou et al. (2008b); Antoniou et al. (2009).

    The biological mechanisms of these associations are being assessed in functional studies: FGFR2 is more highly expressed in ER-positive than ER-negative cell lines and tumours and has been implicated in carcinogenesis of ER-positive tumours (Hishikawa et al., 2004; Tamaru et al., 2004; Zhang et al., 1998a, 1998b; Zang and Pento, 2000; Luqmani et al., 1992). The variant in the 5p12 region which also shows strong evidence for association primarily with ER-positive tumours, is close to the FGFR2 ligand, FGF10 (Stacey et al., 2008). The ESR1 SNP is located upstream of the transcription start site of exon 1 of the ESR1 gene (Zheng et al., 2009b). ESR1 encodes ER-α which regulates signal transduction of estrogen, elevated levels of which are associated with increased risk of breast cancer (Key et al., 2002). The association of SNPs with specific tumour subtypes and concomitant functional studies may shed light on the aetiology of subtypes of breast cancer and provide insights into the interaction of susceptibility loci with environmental risk factors.

    Most studies to date have evaluated genetic variants associated with overall breast cancer risk at genome-wide significance levels. The “main effect” or overall effect of an exposure is a weighted average of subgroup effects, depending on the distribution of subgroups and relative difference in effect sizes between the subgroups. If the relative risk in each of the subgroups is sufficiently heterogeneous, subgroup analysis may identify susceptibility loci missed in main effect analyses (Mavaddat et al., 2009). This hypothesis was tested in a study of 120 candidate genes and pathological subtypes in breast cancer. In this study subgroup analysis resulted in substantial reordering of ranks of SNPs, as assessed by the magnitude of the test statistics, and some associations that were not significant for an overall effect were detected in subgroups at a nominal 5% level adjusted for multiple testing (Mavaddat et al., 2009). These results suggest that, particularly in large GWAS, it may be worthwhile to conduct subgroup analysis by tumour subtype in the absence of a notable association in a “main effects” analysis.
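    The dilution of a subtype-specific association in a "main effects" analysis can be illustrated numerically. The sketch below assumes a simple allelic case-control comparison in which cases are a mixture of subtypes; the 80%/20% subtype split, the control allele frequency and the odds ratios are hypothetical values chosen only for illustration, and the function names are not taken from the cited studies.

```python
def allele_odds(p: float) -> float:
    """Odds of carrying the variant allele given its frequency p."""
    return p / (1.0 - p)


def case_allele_freq(p_control: float, odds_ratio: float) -> float:
    """Allele frequency in cases implied by a control frequency and an allelic OR."""
    odds = odds_ratio * allele_odds(p_control)
    return odds / (1.0 + odds)


def overall_odds_ratio(p_control, subgroup_ors, subgroup_weights):
    """Allelic OR for all cases combined.

    subgroup_weights are the proportions of cases in each tumour subtype and
    must sum to one; subgroup_ors are the subtype-specific allelic ORs.
    """
    if len(subgroup_ors) != len(subgroup_weights):
        raise ValueError("one weight is required per subgroup")
    if abs(sum(subgroup_weights) - 1.0) > 1e-9:
        raise ValueError("subgroup weights must sum to 1")
    # Cases are a mixture of subtypes, so their allele frequency is a weighted mean.
    mixed = sum(w * case_allele_freq(p_control, o)
                for w, o in zip(subgroup_weights, subgroup_ors))
    return allele_odds(mixed) / allele_odds(p_control)


if __name__ == "__main__":
    # Hypothetical SNP: no effect in 80% of cases, OR 1.3 in the remaining 20%.
    print(round(overall_odds_ratio(0.30, [1.0, 1.3], [0.8, 0.2]), 3))
```

    In this hypothetical setting, an odds ratio of 1.3 confined to a fifth of cases appears as an overall odds ratio only slightly above one, which is much harder to detect at genome-wide significance than the subtype-specific effect.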

    4 Genetic modifiers of BRCA1 and BRCA2-related breast cancer

    Several candidate genes were initially assessed as modifiers for BRCA1 and BRCA2 related cancer. Due to scarcity of carriers and limited power of individual studies to detect small effect sizes, a collaborative group known as the Consortium of Investigators of Modifiers of BRCA1 and BRCA2 (CIMBA) was set up to investigate these associations further. CIMBA presently comprises 42 groups from across Europe, US, Australia and Asia. Of the candidate genes that have been investigated as modifiers of BRCA1 and BRCA2 (for example TP53, MDM2, AURKA, and others (Osorio et al., 2009; Spurdle et al., 2006, 2009a, 2009b; Sinilnikova et al., 2009; Johnatty et al., 2009; Rebbeck et al., 2009a; Couch et al., 2007)), only one association has been confirmed: a SNP in the 5′ untranslated region of RAD51 in BRCA2 mutation carriers, among whom rare homozygotes were at a threefold increased risk of developing breast cancer (HR 3.18, 95% CI 1.39–7.27, p = 0.0007) (Antoniou et al., 2007). RAD51 is involved in double-stranded DNA repair and interacts with both BRCA1 and BRCA2 (Scully et al., 1997; Chen et al., 1999; Wong et al., 1997). The 135G->C variant affects RAD51 splicing within the 5′ UTR and may alter the expression of RAD51 (Antoniou et al., 2007). The effect of the RAD51 135G->C SNP has not been replicated in breast tumours in the general population (Kuschel et al., 2002; Webb et al., 2005).

    Subsequently, CIMBA genotyped BRCA1 and BRCA2 mutation carriers for SNPs identified through breast cancer GWAS (Antoniou et al., 2008b, 2009). These SNPs were assessed for association with the risk of breast cancer using retrospective cohort methods. The results of these analyses are shown in Figure 1. As shown, SNPs in FGFR2 and LSP1 were associated with an increased breast cancer risk in BRCA2 mutation carriers but not in BRCA1 mutation carriers. The difference in the estimates between BRCA1 and BRCA2 carriers was significant. The SNP in MAP3K1 was likewise associated with breast cancer risk in BRCA2 but not in BRCA1 mutation carriers. However, the SNPs in TOX3 and at 2q35 were associated with risk in both BRCA1 and BRCA2 carriers. The SNP at 8q24 was not associated with breast cancer risk for either BRCA1 or BRCA2 mutation carriers, but the estimated association for BRCA2 mutation carriers was consistent with odds ratio estimates derived from population-based case-control studies. These results may be consistent with predictions from genetic models for the modifying polygenic component for BRCA1 and BRCA2, where the point estimate of the BRCA1 variance (1.32) was (non-significantly) lower than the polygenic variance for non-carriers, while that of the BRCA2 variance (1.73) was more similar to the latter (Antoniou et al., 2008a).

    The pattern of association with ER-positive and ER-negative disease parallels that observed in BRCA2 and BRCA1 mutation carriers, respectively. For example, associations found in BRCA2 carriers, who tend to develop ER-positive tumours (Lakhani et al., 2002), are similar to those for ER-positive tumours in the general population (Figure 1). In contrast, associations are generally weaker for BRCA1 carriers, who develop primarily ER-negative tumours (Lakhani et al., 2002), and for ER-negative tumours. The strongest associations with risk of ER-negative disease, among the six loci also evaluated in BRCA1 carriers, are for SNPs in TOX3 and at 2q35. These SNPs are the only loci significantly associated with risk in BRCA1 carriers (Antoniou et al., 2008b, 2009). As BRCA1-related tumours are phenotypically similar to basal tumours, comparisons of susceptibility loci for TN and basal phenotypes might reveal more similarities. Similarly, a more direct comparison would be between susceptibility loci for specific tumour subtypes in carriers and non-carriers. Finally, two GWAS, one for BRCA1 and one for BRCA2 mutation carriers, are currently underway and should shed further light on genetic predisposition to breast cancer in mutation carriers.

    The relationship between genetic susceptibility loci and tumour subtype has also been observed in ovarian cancer, a hormone-related cancer that shares certain environmental and genetic risk factors with breast cancer. Ovarian cancers in BRCA1/2 mutation carriers and non-carriers are associated with characteristic pathological features (Lakhani et al., 2004). Interestingly, mutations in specific regions of the BRCA1 and BRCA2 genes are associated with higher ovarian and lower breast cancer risks compared to mutations elsewhere in these genes (Thompson and Easton, 2001, 2002). Ovarian GWAS have recently identified an ovarian cancer susceptibility locus on 9p22.2, and this association was strongest for serous cancers (Song et al., 2009). The association of this SNP with breast cancer has yet to be determined.

    5 Genetic predisposition to breast cancer in different populations

    There are large differences in the genetic architecture of populations: different populations have different LD structures, and genetic variants are represented in varying frequencies. It is possible that differences in genetic susceptibility contribute to disparities in the incidence and characteristics of breast cancer in different populations. For example, the incidence of breast cancer is lower in African American women than in women of European background. However, breast cancer in African American women is diagnosed at younger ages, and the disease tends to be more aggressive: tumours are more often ER and PR negative (Amend et al., 2006; Joslyn, 2002; Cunningham et al., 2009; Chu and Anderson, 2002), and there is a higher prevalence of basal-like tumours (Carey et al., 2006). Unique mutations and unclassified variants in BRCA1 and BRCA2 have been identified in African American women.

    The D302H variant in CASP8, found to be associated with breast cancer in populations of European background, is very rare in Asian populations. However, a 6-bp deletion polymorphism (−652 6N del) in the promoter of CASP8 has been associated with the risk of multiple cancers, including breast cancer, in a Chinese population (Sun et al., 2007). Subsequent studies have not been able to confirm the association with breast cancer risk in populations of European background (Cybulski et al., 2008; Frank et al., 2008; Haiman et al., 2008), nor in populations of Asian and African origin in the USA (Haiman et al., 2008). A relatively small study in Italy suggested that this variant might be associated with age at diagnosis in familial breast cancer cases (De et al., 2009). A SNP at 6q22 has been associated with breast cancer in an Ashkenazi Jewish population, but larger replication samples are required to assess its association in populations of European origin (Gold et al., 2008; Kirchhoff et al., 2009).

    GWAS in complex diseases have been primarily conducted in women of European background. It is possible that GWAS in different populations may facilitate the discovery of variants that are weakly tagged, or uncommon in European populations. A GWAS in Chinese populations has recently identified a variant in ESR1 associated with breast cancer risk (Zheng et al., 2009b). The effect size of this SNP is large compared with that of other variants found in populations of European background, particularly for ER-negative disease. The association was replicated in a small population of Europeans (Zheng et al., 2009b), but a much larger study is needed to compare the effect size of the SNP in the two populations, and to investigate if the causal variant/s are common across ethnic groups. Such studies will enhance our understanding of the relationship of tumour subtype, environmental and genetic susceptibility to different subtypes in different populations.

    African populations are genetically more diverse than European and Asian populations. High levels of haplotype diversity may facilitate fine-mapping of the causal variants that underlie disease associations, provided that the causal variant is segregating. For instance, fine mapping in the African-American population contributed to the localisation of the causal variant in FGFR2 (Udler et al., 2009). The low level of LD in the African population can, however, be disadvantageous in GWAS, as it reduces the likelihood that a causal variant will be sufficiently correlated with nearby genotyped SNPs to be detected (Teo et al., 2010). Thus, higher density genome-arrays are required. Zheng et al. evaluated 11 breast cancer susceptibility loci identified through GWAS in a sample of 810 African American cases and 1784 African American controls (Zheng et al., 2009a). Twenty additional SNPs were selected to tag all common SNPs flanking the initially reported SNPs. These were SNPs that are in high linkage disequilibrium with initially reported SNPs in European descendants but not in Africans. Two SNPs, at 2q35 and in FGFR2, were found to be associated with breast cancer risk. No additional significant associations were identified in this study (Zheng et al., 2009a). However, an association at 5p12 has recently been replicated in African American women (Ruiz-Narvaez et al., 2010).

    Admixture-based genome-wide scanning of African American women has also been used to search for risk alleles for breast cancer that are highly differentiated in frequency between African American and European American women, and that may contribute to specific breast cancer phenotypes. An example of the use of this method is the study by Fejerman et al. (2009). Although this study was not large enough to detect alleles of modest effect sizes, admixture-based genome-wide scanning is another method that may allow detection of genetic variants associated with breast cancer in different populations (Fejerman et al., 2009).

    6 Contribution of genetic variants to familial relative risk

    Using a model-based approach, it was estimated that about 24% of the breast cancer FRR to relatives of cases with ER-negative disease and 1% of the breast cancer FRR to relatives of cases with ER-positive disease is due to BRCA1 mutations. BRCA1 and BRCA2 mutations together were estimated to explain 32% of breast cancer FRR for ER-negative disease and 9.4% of FRR for ER-positive disease (Mavaddat et al., 2010).

    The contribution of the 12 common breast cancer susceptibility alleles to the breast cancer FRR in relatives of patients with ER-positive and ER-negative disease was estimated using subtype-specific genotypic relative risks and allele frequencies for each variant. SNPs associated with ER-negative disease contributed only 1.9% to the overall breast cancer FRR for relatives of patients with ER-negative disease, while SNPs associated with ER-positive disease were estimated to account for 9.6% of the breast cancer FRR for relatives of patients with ER-positive disease (Mavaddat et al., 2010).
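A calculation of this kind can be sketched as follows. This is not necessarily the exact method of Mavaddat et al. (2010); it uses a standard single-locus expression for the familial relative risk to offspring of a case, assumes that loci combine multiplicatively, and assumes an overall FRR of about 2 for first-degree relatives. The allele frequency and per-allele relative risk in the example are hypothetical.

```python
import math

def locus_frr(p, r1, r2):
    """Familial relative risk to offspring of a case due to one biallelic locus.

    p  : risk allele frequency
    r1 : relative risk for heterozygotes (vs. non-carriers of the risk allele)
    r2 : relative risk for risk-allele homozygotes
    """
    q = 1.0 - p
    numerator = p * (p * r2 + q * r1) ** 2 + q * (p * r1 + q) ** 2
    denominator = (p * p * r2 + 2 * p * q * r1 + q * q) ** 2
    return numerator / denominator

def fraction_of_frr_explained(loci, overall_frr=2.0):
    """Share of the overall FRR explained on the log scale.

    Assumes loci act multiplicatively; overall_frr=2.0 is an assumed value.
    """
    return sum(math.log(locus_frr(*locus)) for locus in loci) / math.log(overall_frr)

# Hypothetical locus: allele frequency 0.39, per-allele relative risk 1.26
# (so the homozygote relative risk is 1.26 squared).
hypothetical_loci = [(0.39, 1.26, 1.26 ** 2)]
explained = fraction_of_frr_explained(hypothetical_loci)
print(f"fraction of FRR explained: {explained:.3f}")
```

Because the contributions are summed on the log scale, even a fairly common variant with a per-allele relative risk in this range accounts for only a few percent of the familial risk.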

    These estimates assumed concordance of the disease subtype in relatives, and also that the genetic variants interact multiplicatively on the risk of developing the disease. Although the latter seems a plausible assumption (Antoniou et al., 2002; Pharoah et al., 2002), such a model remains to be tested explicitly for all of the recently identified genetic variants. At the other extreme, if the true model for the combined effects was additive, then the estimated contributions would be somewhat lower. Antoniou et al. assessed the combined associations of SNPs in FGFR2 and TOX3 on breast cancer risk in BRCA2 carriers and found no difference between a multiplicative model that included a HR parameter for each SNP and a fully saturated model which included a separate HR parameter for each combined genotype (Antoniou et al., 2008b). The contribution of the variants may prove greater once the true causal variants have been identified. More complex relationships, such as parent-of-origin effects, may also influence the contributions of a particular variant, as has been suggested for a SNP in the 11p15 region (Kong et al., 2009).
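The difference between the two extremes can be made concrete with a small sketch. The multiplicative model takes the product of the genotype-specific relative risks across loci; the additive model is taken here, as one possible definition, to sum the excess relative risks across loci. The per-allele relative risks are hypothetical.

```python
def multiplicative_rr(genotype_rrs):
    """Combined relative risk when loci multiply."""
    combined = 1.0
    for rr in genotype_rrs:
        combined *= rr
    return combined

def additive_rr(genotype_rrs):
    """Combined relative risk when excess relative risks add across loci."""
    return 1.0 + sum(rr - 1.0 for rr in genotype_rrs)

# Hypothetical per-allele relative risks at two loci; a woman homozygous
# for the risk allele at both loci has genotype relative risks of r squared.
per_allele_rrs = [1.3, 1.2]
double_homozygote = [rr ** 2 for rr in per_allele_rrs]

rr_mult = multiplicative_rr(double_homozygote)
rr_add = additive_rr(double_homozygote)
print(f"multiplicative: {rr_mult:.2f}, additive: {rr_add:.2f}")
```

With these assumed values the additive model gives a lower combined relative risk at the high-risk extreme, in line with the statement that the estimated contributions would be somewhat lower under additivity.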

    A small number of studies have suggested that the rare variants in CHEK2 and PALB2 may have stronger associations with ER-positive (de Bock et al., 2006; Cybulski et al., 2009) or ER-negative (Heikkinen et al., 2009) disease. However, larger studies are required to investigate these associations further.

    7 Contribution of newly discovered SNPs to risk prediction

    The contribution of the newly discovered susceptibility loci to risk prediction in complex diseases has been extensively debated (Clayton, 2009; Gail, 2009; Ghoussaini and Pharoah, 2009; Pharoah et al., 2008; Wacholder et al., 2010). It is generally agreed that the effect sizes of individual loci are too small for any to be of independent predictive value. If the loci are considered jointly, the contribution is greater, although, as discussed above, precise models for joint action of the SNPs have not been established. Even in this case, the proportion of the population that would be homozygous for all risk alleles will be small. Stratification of risk in breast cancer may, however, be useful for targeting public health strategies such as screening. For example, using SNP profiles rather than age alone as a basis for deciding who should undergo screening mammography may lead to a greater yield of cancers and fewer false positives (Ghoussaini and Pharoah, 2009; Pharoah et al., 2002, 2008).

    Risk prediction for BRCA1 and BRCA2 mutation carriers is different. In this context, the absolute risk of breast cancer conferred by mutations in these genes is already high. Therefore even a small relative effect of the modifier loci is amplified on the absolute scale. This is demonstrated in Figure 2 (Antoniou et al., 2008b). Based on published penetrance estimates, the average absolute risk of breast cancer by age 70 among BRCA2 carriers is 50% (Antoniou et al., 2008a). When the joint effects of FGFR2 and TOX3 are taken into account, BRCA2 mutation carriers with no FGFR2 or TOX3 risk alleles are predicted to have a risk of 41%, whereas the risk for BRCA2 mutation carriers who are homozygous for the risk allele at both FGFR2 and TOX3 is predicted to be 70% (Antoniou et al., 2008b). Although only 1% of carriers are doubly homozygous, approximately 36% of carriers will have a HR of 1.5 or greater in comparison to the 20% of carriers with no risk allele (Antoniou et al., 2008b). For the general population, on the other hand, the risk of breast cancer by age 70 is 7% (Parkin et al., 2002); for carriers of no FGFR2 or TOX3 risk alleles the risk decreases to 5%, while for carriers with four risk alleles, the risk is only increased to 11%. Pharoah et al. calculated that, based on the predicted distribution of ten susceptibility loci in the general population, women on the lowest centile of risk will have a 50% reduction in risk compared with the population average, whereas those in the top centile of risk will have an 84% increase in risk (Ghoussaini and Pharoah, 2009). Such differences would translate to bigger differences in the absolute risk of developing the disease for BRCA1 or BRCA2 mutation carriers. The effect of the newly discovered SNPs may therefore be particularly useful in predicting cancer risks to carriers of BRCA1 and BRCA2 mutations. Individuals found to be at increased risk may be offered more personalized clinical management and receive prophylactic therapies such as oophorectomy or mastectomy; or aromatase inhibitors, in the case of ER-positive disease (Friebel et al., 2007; Hartmann et al., 2001; Meijers-Heijboer et al., 2001; Rebbeck et al., 2002, 2004, 2009b).
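Why the same relative effect matters more for mutation carriers can be illustrated with a rough sketch. It assumes proportional hazards, so that cumulative risk is 1 − (1 − baseline risk)^HR, treats the HR as constant with age, and ignores competing mortality. The baseline risks of 50% and 7% are the average risks by age 70 quoted above; the HR of 1.5 is an illustrative value applied relative to those averages, not a published estimate for any genotype.

```python
def cumulative_risk(baseline_risk, hazard_ratio):
    """Cumulative risk under proportional hazards (simplifying assumption)."""
    return 1.0 - (1.0 - baseline_risk) ** hazard_ratio

ILLUSTRATIVE_HR = 1.5  # hypothetical modifier effect

for label, baseline in [("BRCA2 carriers", 0.50), ("general population", 0.07)]:
    modified = cumulative_risk(baseline, ILLUSTRATIVE_HR)
    print(f"{label}: {baseline:.0%} -> {modified:.1%} "
          f"(absolute increase {modified - baseline:.1%})")
```

Under these assumptions the same hazard ratio adds roughly fifteen percentage points of absolute risk for a BRCA2 carrier but only about three for a woman at average population risk.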

    Figure 2. Cumulative risk of breast cancer for BRCA2 mutation carriers and the general population by combined FGFR2 and TOX3 genotypes. The combined genotypes at the extremes of risk are as follows: FGFR2/TOX3 = GG/CC, or FGFR2/TOX3 = AA/TT. “Average” represents the cumulative breast cancer risk over all possible modifying effects among BRCA2 mutation carriers, or individuals in the general population born after 1950 (Parkin et al., 2002). The minor allele frequencies for the FGFR2 and TOX3 SNPs were assumed to be 0.39 and 0.26 respectively (adapted from Antoniou et al., 2008b).
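The proportions of carriers at the two extremes of risk follow from the allele frequencies given in the caption. The sketch below assumes Hardy-Weinberg proportions at each locus, independence between the two loci, and that the minor allele is the risk allele at both SNPs.

```python
def genotype_frequencies(risk_allele_freq):
    """Hardy-Weinberg genotype frequencies keyed by number of risk alleles."""
    p = risk_allele_freq
    q = 1.0 - p
    return {0: q * q, 1: 2 * p * q, 2: p * p}

fgfr2 = genotype_frequencies(0.39)  # minor allele frequency from the caption
tox3 = genotype_frequencies(0.26)   # minor allele frequency from the caption

# Independence between the two loci is assumed.
doubly_homozygous = fgfr2[2] * tox3[2]
no_risk_alleles = fgfr2[0] * tox3[0]
print(f"homozygous for both risk alleles: {doubly_homozygous:.1%}")
print(f"no risk alleles at either locus:  {no_risk_alleles:.1%}")
```

These values are consistent with the figures of about 1% doubly homozygous carriers and about 20% with no risk allele quoted in the text.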

    8 Summary and future directions

    Only a minority of the FRR for breast cancer is explained by genetic variants discovered to date. The remainder of the FRR, which has been termed the ‘dark matter’, may be composed of common variants with even smaller effects, or rare variants not well tagged by common SNPs. The 1000 Genomes Project and other sequencing projects, coupled with higher density SNP arrays, may enable such variants to be identified. In addition, it is possible that some of the familial risk may be mediated through complex interactions that do not translate into notable main effects. Additional studies of haplotype, gene–gene or gene–environment interactions could help identify more complex associations with risk; however, these studies require even larger study populations and, in the case of gene–environment interactions, accurate measurement of lifestyle/environmental factors. If tumour subtypes represent distinct biological entities, genetic predisposition to breast cancer might give rise preferentially to certain subtypes. Increasing evidence for such a hypothesis is emerging from analyses of genetic variants associated with breast cancer susceptibility in subgroups of the disease. The collection of reliable and comparable pathological data in individuals and their families is still a challenge. GWAS restricted to ER-negative breast cancer are already underway. It is also possible, however, that the subgroups presently explored are ‘surrogates’ for those sub-groupings that more directly reflect the intrinsic biology of tumours. Integrating pathological information into risk prediction models may improve discrimination between carriers of deleterious mutations and non-carriers and enhance the performance of models, while prediction of subtypes of disease could allow more targeted use of prophylactic therapies. One of the first applications of the discovery of genetic variants of small effect may be in risk prediction for BRCA1 and BRCA2 mutation carriers.