Uracil moieties in Plasmodium falciparum genomic DNA

Plasmodium falciparum parasites undergo multiple genome duplication events during their development. Within the intraerythrocytic stages, parasites encounter an oxidative environment and DNA synthesis necessarily proceeds under these circumstances. In addition to these conditions, the extreme AT bias of the P. falciparum genome poses further constraints for DNA synthesis. Taken together, these circumstances may allow the appearance of damaged bases in the Plasmodium DNA. Here, we focus on uracil that may arise in DNA either via oxidative deamination or thymine-replacing incorporation. We determine the level of uracil at the ring, trophozoite, and schizont intraerythrocytic stages and evaluate the base-excision repair potential of P. falciparum to deal with uracil-DNA repair. We find approximately 7-10 uracil per million bases in the different parasite stages. This level is considerably higher than that found in other wild-type organisms from bacteria to mammalian species. Based on a systematic assessment of P. falciparum genome and transcriptome databases, we conclude that uracil-DNA repair relies on a single uracil-DNA glycosylase and proceeds through the long-patch base-excision repair route. Although potentially efficient, the repair route still leaves a considerable level of uracil in parasite DNA, which may contribute to mutation rates in P. falciparum.

Malaria is a major health threat affecting large regions globally, resulting in the death of ~450 000 people annually [1]. The parasite's capacity for adaptation is a major obstacle to eliminating the disease, most evident in the growing resistance of parasites against antimalarials [1]. The causative agents of malaria belong to the Plasmodium genus. Among the five human parasites, Plasmodium falciparum (P. falciparum) presents an exceptional biomedical challenge, being responsible for the most serious infections and most of the lethal cases [2].
The life cycle of P. falciparum is intriguingly complex (Fig. 1). The parasites undergo multiple DNA replications at several developmental stages in their vector (Anopheles mosquito) and host (human liver and bloodstream). The sexual phase of development occurs in the female Anopheles mosquito. The only meiotic division takes place when the zygote, originating from the fusion of microgametes and macrogametes inside the mosquito, evolves into ookinetes. These will then develop into multinuclear oocysts, wherein mitotic sporogenesis results in the formation of numerous sporozoites. After the mosquito bites a human host, sporozoites invade the liver and undergo at least a dozen rounds of mitosis to produce tens of thousands of haploid merozoites [3]. These start the intraerythrocytic cycle by the invasion of red blood cells. The importance of this cycle is emphasized by the fact that about two-thirds of the genes of a murine Plasmodium have been shown to be necessary for the blood stage growth of parasites [4]. First, merozoites develop into rings, followed by trophozoites. At this stage, parasites enter the G1 phase and start to prepare for DNA replication. The S phase starts around 30 h after erythrocyte invasion, when parasites are in the late trophozoite stage. The replication in the parasite is asynchronous and produces up to about 24n sister chromatids. Replication ends around 44 h postinvasion, after which each genome is packed into daughter merozoites (schizont form) [3,5]. Merozoites may exit the continuous intraerythrocytic cycle by differentiating into male or female gametocytes. These sexual forms, consumed by the mosquito, evolve into micro- or macrogametocytes. Microgametocytes undergo three mitotic cycles and form eight exflagellated microgametes [3].
Importantly, DNA replication cycles within the intraerythrocytic stages proceed in a strongly oxidative environment. In particular, free heme and iron present during hemozoin formation pose notable oxidative stress, which may result in DNA modifications (e.g., oxidative cytosine deamination leading to uracil and other oxidative processes). The specific composition of genomic DNA in Plasmodia may also facilitate the appearance of uracil in the DNA. Members of the Plasmodium genus possess the most AT-rich genomes sequenced so far; P. falciparum has ~80% AT content in the exonic regions and ~90% in the intronic regions [6,7]. Compared with other organisms, such as the host, Homo sapiens, where the average AT content is 58.9%, and other eukaryotic pathogens, namely Toxoplasma gondii and Trypanosoma brucei, with 47.7% and 53.2% AT, respectively, the base composition of the P. falciparum genome is indeed extraordinary [8]. A mutation bias has recently been pointed out in P. falciparum, showing an increased occurrence of GC→AT substitutions, which could promote the AT-rich genome structure [8]. The AT richness of the genome may increase the probability of uracil incorporation: because large numbers of thymidines are incorporated, the polymerase has more opportunities to incorporate uridine in place of thymidine than in a genome with lower AT content. We therefore wished to determine uracil levels in genomic DNA of P. falciparum during the intraerythrocytic stages.
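The AT content figures quoted above are simple base-composition ratios. As a minimal illustration only, the following Python sketch computes the AT fraction of a sequence; the function name at_content and the two toy sequences are hypothetical and are not taken from any genome or from the analysis in this work.

```python
def at_content(sequence: str) -> float:
    """Return the fraction of A and T among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (counts["A"] + counts["T"]) / total


# Hypothetical toy sequences, invented for illustration (not genomic data).
at_rich = "ATATTTAAATGCATTTAATA"
balanced = "ATGCGCATGCATGCGCTAGC"
print(f"AT-rich toy sequence:  {at_content(at_rich):.0%}")
print(f"balanced toy sequence: {at_content(balanced):.0%}")
```

Ambiguous symbols such as N are ignored in the denominator, which is one possible convention among several.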
A dot-blot-based uracil detection method has recently been developed in our laboratory, which provides a robust and straightforward possibility for sensitive and quantitative detection of uracil-DNA levels [9]. The basis of detection is an engineered catalytically inactive uracil-DNA glycosylase (UNG), which is capable of recognizing and binding to uracils incorporated in DNA sequences. The uracil sensor UNG-construct can be equipped with diverse tags for ease of detection via antibodies using the dot-blot method. The sensitivity of the method is equivalent to that of MS-based methods, providing a limit in femtomolar concentrations [9].
In the present work, we analyzed the uracil content of genomic DNA from three different intraerythrocytic developmental stages of P. falciparum 3D7 parasites, namely the ring, trophozoite, and schizont stages. The quantification of uracil moieties was performed by the aforementioned dot-blot-based uracil detection method [9]. To assess the potential efficiency of uracil-DNA repair, we compared the existing orthologues of mammalian base-excision repair enzymes to those present in the parasite based on genome databases of H. sapiens and P. falciparum. We also analyzed transcriptome databases of the intraerythrocytic parasite stages with regard to expression level of base-excision repair enzymes.

Genomic DNA isolation
Synchronized parasites of different developmental stages were collected from two biologically independent cultures (i.e., biological replicates), and lysed as described elsewhere [12]. Briefly, red blood cells were lysed in 5% saponin (Sigma-Aldrich, St. Louis, MO, USA) in PBS, then incubated at 37 °C for 3 h for parasite lysis in a lysis solution (pH = 7.5) of the following composition: 40 mM Tris/HCl; 80 mM EDTA; 2% SDS; 0.1 mg·mL⁻¹ proteinase K. After treatment, genomic DNA was purified using the QuickDNA Miniprep Plus kit obtained from Zymo Research (Irvine, CA, USA).

Dot-blot measurement and analysis
Dot-blot measurements were carried out in four independent replicates using samples from the P. falciparum developmental stages, namely rings, trophozoites, and schizonts, as described elsewhere [9]. Briefly, genomic DNA isolated from the CJ236 Escherichia coli strain [dut−, ung−] served as a uracil standard, applied as 15 ng diluted into 1 µg of ultrapure salmon sperm DNA. The standard was diluted in a ½ dilution series. The two-third serial dilutions for P. falciparum samples started with 1 µg of DNA. Samples were spotted onto a prewetted positively charged nylon membrane (Amersham Hybond-Ny+; GE Healthcare, Little Chalfont, UK) using a vacuum-driven microfiltration apparatus (Bio-Dot, Bio-Rad, Hercules, CA, USA). The DNA was immobilized, and the membrane was blocked and incubated with the detector construct of UNG. After several washing steps, first primary, then secondary antibodies were applied. Immunoreactive bands were visualized by enhanced chemiluminescence reagent (Western Chemiluminescent HRP substrate from Merck Millipore, Burlington, MA, USA), and images were captured by a Bio-Rad ChemiDoc™ MP Imaging system. Densitometry was performed using IMAGEJ 1.48p software (National Institutes of Health, Bethesda, MD, USA). The number of deoxyuridine nucleotides was calculated as described elsewhere [9]. The calibration curve obtained from the dilution series of the standard was fitted with a second-order polynomial, which provided a fit with R² ≥ 0.98. The number of uracil per million bases in the 'unknown' genomic DNA was determined by interpolating their normalized intensities in the calibration plot based on the amount of DNA applied.
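The calibration and interpolation steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of ref. [9]: all intensities are synthetic, STANDARD_U_PER_MILLION is a placeholder rather than the uracil content of CJ236 DNA, and the final normalization (equivalent standard amount times standard uracil density, divided by the amount of sample DNA applied) is our reading of the interpolation described above. The helper names are hypothetical.

```python
import math


def dilution_series(start, factor, steps):
    """Amounts obtained by repeatedly multiplying `start` by `factor`."""
    return [start * factor ** i for i in range(steps)]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]              # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fitted quadratic."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def interpolate_amount(intensity, coeffs, x_min, x_max):
    """Invert the curve: amount of standard in [x_min, x_max] giving `intensity`."""
    a, b, c = coeffs
    if abs(a) < 1e-12:                       # effectively linear calibration
        roots = [(intensity - c) / b]
    else:
        disc = b * b - 4 * a * (c - intensity)
        if disc < 0:
            raise ValueError("intensity lies outside the fitted curve")
        roots = [(-b + sign * math.sqrt(disc)) / (2 * a) for sign in (1, -1)]
    in_range = [r for r in roots if x_min <= r <= x_max]
    if not in_range:
        raise ValueError("intensity lies outside the calibrated range")
    return in_range[0]


# Synthetic demonstration data (not measurements from this study).
standard_ng = dilution_series(15.0, 0.5, 6)        # 15 ng standard, 1/2 series
sample_ng = dilution_series(1000.0, 2.0 / 3.0, 4)  # 1 ug sample, 2/3 series
standard_signal = [-0.002 * x * x + 0.1 * x + 0.01 for x in standard_ng]

coeffs = fit_quadratic(standard_ng, standard_signal)
fit_quality = r_squared(standard_ng, standard_signal, coeffs)

STANDARD_U_PER_MILLION = 1000.0  # placeholder only; the real value is in ref. [9]
sample_signal = 0.46             # hypothetical normalized intensity, first sample dot
equivalent_ng = interpolate_amount(
    sample_signal, coeffs, min(standard_ng), max(standard_ng)
)
sample_u_per_million = equivalent_ng * STANDARD_U_PER_MILLION / sample_ng[0]
print(f"R^2 = {fit_quality:.4f}; sample ~ {sample_u_per_million:.2f} uracil per million bases")
```

Restricting the accepted root to the range spanned by the standard series mirrors the requirement that unknown samples fall within the calibrated range; signals outside that range raise an error instead of being extrapolated.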

Statistical analysis
Statistical analysis was carried out with ORIGINPRO 8.6 (OriginLab, Northampton, MA, USA) using a one-way ANOVA test when samples passed the homogeneity of variance test (Levene's test) and the normal distribution test (Kolmogorov-Smirnov test). Differences were considered statistically significant at P < 0.05.
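For readers who wish to see the arithmetic behind a one-way ANOVA on three groups, a minimal Python sketch follows. It is not the OriginPro procedure used here and does not perform the Levene or Kolmogorov-Smirnov checks. The function name and the replicate values are hypothetical. The P value uses the closed-form tail of the F distribution, which is valid only when the between-group degrees of freedom equal 2, i.e., for exactly three groups.

```python
def one_way_anova_three_groups(group1, group2, group3):
    """Return (F, P) for a one-way ANOVA on exactly three groups."""
    groups = [group1, group2, group3]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")

    df_between = 2                # three groups minus one
    df_within = n_total - 3
    f_stat = (ss_between / df_between) / (ss_within / df_within)

    # Survival function of F(2, d): P(F > f) = (1 + 2*f/d) ** (-d/2).
    p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
    return f_stat, p_value


# Hypothetical replicate values (uracil per million bases), for illustration only.
ring = [8.0, 12.0, 7.0, 11.0]
trophozoite = [5.0, 9.0, 6.0, 7.0]
schizont = [4.0, 12.0, 6.0, 8.0]
f_stat, p_value = one_way_anova_three_groups(ring, trophozoite, schizont)
print(f"F = {f_stat:.3f}, P = {p_value:.3f}")
```

For designs with more than three groups, the tail probability would need the general F distribution, for example from a statistics library.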

Transcriptome analysis
Transcriptome analysis was carried out using the Transcriptomics function of the PlasmoDB database. The gene expression level of each protein was estimated by RNA-Seq data for intraerythrocytic stages [13,14]. Raw data were plotted for four stages: ring, early and late trophozoite, and schizont.

Table 1. Comparison of the mammalian and Plasmodium falciparum BER protein sets and their involvement in short-patch versus long-patch BER (cf. 'x' marks). The functionality of DNA glycosylases is defined as mono- (M) or bifunctional (B). The question mark in the case of DNA polymerase β indicates that a polymerase β-like enzyme was reported in P. falciparum, with an activity related to mammalian Pol β; however, the respective gene is not annotated. All abbreviations are listed in the Abbreviations section of the article.

Results and Discussion
We measured the amount of uracil moieties per million bases in DNA samples extracted from the three intraerythrocytic developmental stages of the P. falciparum parasites using a recently developed dot-blot-based detection method [9]. Parasite cultures were synchronized, and cultures of ring, trophozoite, and schizont were collected for the isolation of genomic DNA. Two representative dot-blot images are shown in Fig. 2. The measured amount of uracil moieties in our samples could be fitted to the linear range of the standard dilution series (Fig. 2). The uracil content of ring, trophozoite, and schizont stage parasites was determined to be 9.6 ± 2.8, 6.7 ± 2.4, and 7.6 ± 3.8 uracil per million bases, respectively. One-way ANOVA statistical analysis revealed that the uracil contents of the measured erythrocytic stages under these experimental conditions do not differ significantly from each other (P = 0.618).
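To put these densities into perspective, they can be converted into an approximate number of uracil moieties per haploid genome. The Python sketch below assumes a nuclear genome size of roughly 23.3 Mb for P. falciparum 3D7; this figure is not stated in the present text and is introduced here only as an assumption, as are the function and constant names.

```python
GENOME_SIZE_BP = 23_300_000  # assumed haploid nuclear genome size (not from this text)


def uracils_per_haploid_genome(u_per_million_bases, genome_size_bp=GENOME_SIZE_BP):
    """Convert a uracil density (per million bases) into a count per haploid genome."""
    bases = 2 * genome_size_bp  # double-stranded DNA: two bases per base pair
    return u_per_million_bases * bases / 1_000_000


for density in (7, 10):
    count = uracils_per_haploid_genome(density)
    print(f"{density} uracil per million bases ~ {count:.0f} uracils per haploid genome")
```

Under this assumption, the measured range corresponds to a few hundred uracil moieties per haploid genome copy.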
It is of interest to note that these uracil-DNA levels are significantly higher than those observed in other samples from different wild-type organisms. The level of uracil moieties in genomic DNA has been assessed by various methods in numerous organisms so far [9,15,16]. The general conclusion from these studies is that wild-type organisms from bacteria to mammals, as well as normal cell lines, show low levels of uracil in DNA in the range of 0.1-1 uracil per million bases, or even lower [16,17]. An interesting exception was found in Drosophila S2 cells, where the uracil-DNA content was reported to be around 15-16 uracil per million bases [9]. This considerably high genomic uracil level is, however, probably directly related to the lack of the most efficient uracil-DNA glycosylase enzyme (UNG protein) from the Drosophila genome [18]. Organisms under genotoxic stress or engineered to lack uracil-DNA glycosylases also present increased genomic uracil levels [9,15-17]. To discuss our results, it was therefore of immediate interest to investigate whether the level of approx. 7-10 uracil per million bases in the P. falciparum genomic DNA samples may be related to a limited set of repair enzymes in the parasite.
The DNA repair route to remove uracil from DNA relies on the base-excision repair (BER) pathway [19]. We therefore systematically compared the relevant set of proteins encoded in mammalian species vs P. falciparum. For the initial search of the related P. falciparum BER enzyme set, the KEGG pathway database [20] was used with manual curation and verification of the hits. In each case, we also performed a sequence alignment to decide whether the orthologues are truly relevant and include the functionally important residues. Whenever available, published studies on the specific proteins were also consulted. Results are shown in Table 1 and identify two interesting limitations of the BER protein set in P. falciparum.
These two limitations relate to, on the one hand, the set of enzymes capable of recognizing and cleaving uracil from DNA, and on the other hand, to the set of proteins required for the short-patch versus long-patch BER routes. Uracil-DNA glycosylases in diverse organisms include at least four enzyme families (UNG, TDG, SMUG, MBD4) [35,36]. The diversity in these enzymes defines their specific roles and different substrate specificities and underlies the high significance of uracil removal from DNA. In P. falciparum, however, only one uracil-DNA glycosylase gene is present: It encodes the archetypical UNG enzyme.
With regard to the second limitation, concerning short-patch vs long-patch BER pathways, it has been argued earlier that P. falciparum predominantly employs the long-patch pathway [37]. In agreement with this study, several proteins involved in short-patch BER were not identified in P. falciparum (e.g., polymerase β, ligase 3) (cf. Table 1). It has to be mentioned that although a protein with polymerase β-like enzyme activity has been reported in P. falciparum [23], its role in short-patch BER in P. falciparum has not been confirmed. As it is responsible for the synthesis of 3- to 5-bp oligonucleotides, it may be involved in long-patch repair [23]. Also, this 'polymerase β-like' protein may have another role in an alternative end-joining pathway in the parasite [38]. No orthologues of LIG3 and its stabilizing scaffold protein, XRCC1, have been found so far. LIG3 and XRCC1 are responsible for the ligation process in short-patch BER [19]. In summary, the protein set encoded in P. falciparum is deficient in short-patch BER, but all protein orthologues necessary for the long-patch BER pathway have been clearly identified in the parasite.
Based on these data, a possible route of the P. falciparum long-patch BER-based uracil-DNA repair mechanism is shown in Fig. 3. The recognition and excision of uracil in the DNA of the parasites are performed by UNG. The next step is DNA strand cleavage by Apn1, which results in the formation of a nick in the DNA backbone. DNA polymerase δ (or ε) binds to the DNA with the help of proliferating cell nuclear antigen (PCNA) and the replication factor C (RFC) orthologue to start the synthesis of ~10 new nucleotides, while displacing the downstream 5′ DNA end. The displaced section forms a so-called flap structure that is still connected to the DNA. It is removed by flap endonuclease 1 (FEN1). The leftover nick is ligated by DNA ligase I [39].
It was also of interest to look into the expression profiles of the key enzymes involved in uracil-DNA repair in the parasite, in relation to the uracil-DNA levels during the intraerythrocytic stages. In this analysis, we also considered the dUTPase enzyme, which is responsible for cleaving dUTP to prevent uracil incorporation into DNA [36]. Based on the analysis of BER enzyme transcript levels in the different P. falciparum developmental stages, the pathway is initiated in the late trophozoite stage (Fig. 4). This is in good agreement with the DNA metabolism of the parasites, as DNA synthesis starts in the late trophozoite stage followed by DNA packaging into merozoites at the end of the intraerythrocytic cycle. In the case of UNG and Pol δ, the expression level drops again in the schizont stage. Apn1, FEN1, and DNA ligase I remain present after the late trophozoite stage as well (Fig. 4C). The expression level of dUTPase, responsible for preventing uracil incorporation, is highly elevated in the late trophozoite stage in parallel with the BER enzymes. However, transcriptome analysis data should be evaluated with caution as they may not reflect the efficiency of the enzymes participating in uracil repair.

Conclusions
We have determined uracil-DNA levels in different intraerythrocytic stages of P. falciparum genomic DNA and found that approx. 7-10 uracil per million bases can be detected. To account for this level, which is significantly higher as compared to other normal wild-type organisms, the balance between processes leading to uracil presence in DNA and its removal needs to be considered.
There are two mechanisms by which uracil can arise in DNA, as shown in Fig. 5. On the one hand, if cellular dUTP levels are high as compared to dTTP levels, polymerases can readily incorporate dUMP moieties into DNA. The enzyme family of dUTPases is responsible for keeping dUTP levels low to prevent thymine-replacing incorporations. The significance of this role of P. falciparum dUTPase is underlined by numerous studies that focus on plasmodial dUTPase inhibition as an important chemotherapeutic strategy against malaria [40-42].
Thymine-replacing uracil incorporation into P. falciparum genomic DNA may be enhanced by the exceptionally high AT content of the parasite genome. Also, the level of dUTPase expression as suggested by transcriptomic analysis is increased only in the late trophozoite stages, potentially allowing uracil incorporation at earlier stages where DNA replication is initiated. This pathway of uracil incorporation does not result in a stable mutation, but is considered dangerous because high levels of uracil in the DNA can lead to the hyperactivity of the uracil-repair mechanism, resulting in thymine-less cell death [36].
Another possibility for uracil to appear in DNA is the oxidative deamination of cytosine, resulting in a G:U pair instead of G:C [36,43]. In the next round of replication, DNA polymerases will incorporate adenine opposite U. Without repair, this will lead to the formation of an AT pair, that is, a GC→AT substitution. The deamination of cytosine is considered one of the most frequent DNA mutations, with a rate of 100 to 500 U·cell⁻¹·day⁻¹ [44]. In the case of P. falciparum, the detoxifying process resulting in the formation of hemozoin crystals gives rise to the formation of oxidative agents. The presence of such reactive oxygen and nitrogen species in the parasitized erythrocytic environment can cause the deamination of cytidine at increased frequency, possibly contributing to the GC→AT substitutions [8].
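The way an unrepaired deamination event becomes a fixed GC→AT substitution can be traced with a toy Python model. It is a sketch only: strands are plain strings written in the same left-to-right orientation (no reverse complement is taken), repair is assumed to be absent, and the sequence and helper names are hypothetical.

```python
# Base-pairing rules used by the toy polymerase; U templates A, like T does.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(strand: str, position: int) -> str:
    """Convert the cytosine at `position` into uracil."""
    if strand[position] != "C":
        raise ValueError("only cytosine can be deaminated to uracil")
    return strand[:position] + "U" + strand[position + 1:]


def replicate(template: str) -> str:
    """Return the newly synthesized strand, aligned position by position."""
    return "".join(COMPLEMENT[base] for base in template)


original = "ATGCAT"                  # toy sequence; C at index 3 pairs with G
damaged = deaminate(original, 3)     # C -> U, giving a G:U mismatch
daughter = replicate(damaged)        # polymerase inserts A opposite U
granddaughter = replicate(daughter)  # the next round inserts T opposite A

print(original, "->", damaged, "->", granddaughter)
```

After two rounds of replication without repair, the position that started as C reads T on one strand and A on the other, so the original G:C pair has been replaced by an A:T pair.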
Uracil removal from DNA requires the base-excision repair process. In mammalian cells, both short- and long-patch BER pathways are present for the repair of damaged bases, but the parasites rely only on the long-patch repair. Moreover, from the different families of uracil-DNA glycosylases, P. falciparum contains only the single UNG protein, further limiting the capacity of parasites to remove uracil from the DNA.
It has been discussed that genomic architecture of P. falciparum, containing low complexity regions and repetitive sequences as a consequence of the AT richness, allows high indel mutation rates in coding and noncoding regions. Indel mutations occur 10-fold more frequently compared to base-pair substitutions, and this is probably the result of DNA polymerase slippages and unequal crossing over events [8]. Possible advantages of high mutation rates include an effect on gene regulation, an extended antigenic variance, a role in drug resistance, and an evolutionary benefit. The mutation of noncoding genes can have an effect on the gene expression, as these regions often have enhancer or repressor roles [8,45]. Probably, the high mutation rates combined with low complexity regions can facilitate adaptive evolution in P. falciparum parasites [8]. The presence of uracil moieties in the parasite genome may also contribute to mutation rates.