Written translation from English into Russian: "DNA Replication in Archaea, the Third Domain of Life"

Written translation into Russian of the specialized English text "DNA Replication in Archaea, the Third Domain of Life". Linguistic analysis of the text, with a description of the types and methods used to translate specialized terms.

Category: Foreign languages and linguistics
Type: diploma thesis
Language: Russian
Date added: 02.12.2013
File size: 3.0 MB


MINISTRY OF EDUCATION AND SCIENCE OF THE RUSSIAN FEDERATION

Federal State Budgetary Educational Institution of Higher Professional Education

"Kuban State University"

Supplementary professional educational program of professional retraining for the award of an additional qualification

"Translator in the Field of Professional Communication"

TRANSLATION PRACTICE REPORT

Written translation from English into Russian

"DNA Replication in Archaea, the Third Domain of Life"

Completed by: S.A. Yukhnovsky

Reviewed by: M.V. Spasova

Krasnodar, 2013

TABLE OF CONTENTS

Essay

English text

1. Introduction

2. Replication origin

3. How does Cdc6/Orc1 recognize oriC?

4. MCM helicase

5. Recruitment of Mcm to the oriC region

6. GINS

7. Primase

8. Single-stranded DNA binding protein

9. DNA polymerase

10. PCNA and RFC

11. DNA ligase

12. Flap endonuclease 1 (FEN1)

13. Summary and perspectives

I. Russian Translation

II. Linguistic analysis of the Text

Bibliography

Glossary

СОДЕРЖАНИЕ

Эссе

Английский текст

1. Введение

2. Инициация репликации

3. Как распознают Cdc6/Orc1 в ORIC?

4. MCM геликаза

5. Комплектование MCM в области ORIC

6. GINS

7. Праймазы

8. Одноцепочечный ДНК-связывающий белок

9. ДНК-полимераза

10. PCNA и RFC

11. ДНК-лигазы

12. Флэп-эндонуклеаза 1 (FEN1)

13. Результаты и перспективы

III. Русский перевод

IV. Лингвистический анализ текста

Библиография

Глоссарий

Essay

My graduation paper includes thirteen parts, which are excerpts from two books. The first part, "DNA Replication in Archaea, the Third Domain of Life", gives a simple explanation of the molecular mechanism of DNA replication for solving biological problems. The source for this part was the book "Mechanisms of DNA replication", written and edited by Stewart. The second part, "Replication origin", describes the replicator recognized by the initiation factor, now referred to as a replication origin. The third part, "How does Cdc6/Orc1 recognize oriC?", explains how the Cdc6/Orc1 protein recognizes the oriC region. The fourth part, "MCM helicase", describes the functions of the MCM helicase. The fifth part, "Recruitment of Mcm to the oriC region", addresses another important question: how MCM is recruited onto the unwound region of oriC. The sixth part, "GINS", describes the main function of the GINS complex. The seventh part, "Primase", describes the enzyme that synthesizes the short oligonucleotide required as a primer for DNA strand synthesis. The eighth part, "Single-stranded DNA binding protein", describes an important factor that protects the unwound single-stranded DNA from nuclease attack, chemical modification, and other disruptions during the DNA replication and repair processes. The ninth part, "DNA polymerase", describes a fundamental ability of DNA polymerases. The tenth part, "PCNA and RFC", explains the functions of these proteins. The eleventh part, "DNA ligase", describes how this enzyme catalyzes phosphodiester bond formation via three nucleotidyl transfer steps. The twelfth part, "Flap endonuclease 1 (FEN1)", describes the function of that enzyme. The thirteenth part, "Summary and perspectives", describes applications of this knowledge.

The sources for these chapters were excerpts from two books: "The mechanisms of DNA replication" and "Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology. Second Edition".

One of the reasons I chose this material for translation is that research on the molecular mechanism of DNA replication is a central theme of molecular biology. In addition, archaeal organisms have become popular in the age of total genome sequencing, and most of the DNA replication proteins are now equally well understood through biochemical characterization.

Now I would like to describe the translation process and the different translation techniques that I used. First, some words and phrases have a strictly defined meaning in the field of biology. There are simple words (eukaryote - эукариоты, polymerase - полимераза, genome - геном) and word combinations (domain of life - домен жизни, protein factor - протеиновый фактор, initiation factor - фактор инициации). Sometimes while translating the text I had to use concretization (Some other protein factors may function in various archaea, for example a protein that is distantly related to eukaryotic Cdt1, which plays a crucial role during MCM loading in Eukaryota, exists in some archaeal organisms, although its function has not been characterized yet - Некоторые другие белковые факторы могут действовать у различных архей, например белок, отдаленно связаный с эукариотической Cdt1, который играет решающую роль во время комплектации MCM у эукариот, существует у некоторых архей, хотя его функции еще не были охарактеризованы). I also employed interpretation when I had to depart from a literal translation in order to make the target text transparent (A characteristic of the oriC is the conserved 13 bp repeats, as predicted earlier by bioinformatics, and two of the repeats are longer and surround a predicted DUE (DNA unwinding element) with an AT-rich sequence in Pyrococcus genomes - ORIC является сохраненными повторами 13 связывающих белков, как предсказано ранее биоинформатиками, и два из повторов больше и окружают предполагаемый DUE (элемент раскручивания ДНК) с AT-богатых последовательностей в геномах Pyrococcus).

I used other translation techniques as well, but it is not possible to describe them all here. In conclusion, I would like to say that I have gained valuable experience in understanding the issues of this field of biology. I not only improved my knowledge of biology, but also gained the experience needed to keep it up to date, as all the latest biological discoveries are always published in English. Now I have the opportunity to review and analyze published work without waiting for a translation.

English text

DNA Replication in Archaea, the Third Domain of Life

1. Introduction

The accurate duplication and transmission of genetic information are essential for living organisms. The molecular mechanism of DNA replication has been one of the central themes of molecular biology, and continuous efforts to elucidate the precise molecular mechanism of DNA replication have been made since the discovery of the double helix DNA structure in 1953. The protein factors that function in the DNA replication process have been identified to date in the three domains of life (Figure 1).

Figure 1. Stage of DNA replication

Archaea, the third domain of life, is a very interesting living organism to study from the aspects of molecular and evolutional biology. Rapid progress of whole genome sequence analyses has allowed us to perform comparative genomic studies. In addition, recent microbial ecology has revealed that archaeal organisms inhabit not only extreme environments, but also more ordinary habitats. In these situations, archaeal biology is among the most exciting of research fields.

Archaeal cells have a unicellular ultrastructure without a nucleus, resembling bacterial cells, but the proteins involved in the genetic information processing pathways, including DNA replication, transcription, and translation, share strong similarities with those of eukaryotes. Therefore, most of the archaeal proteins were identified as homologues of many eukaryotic replication proteins, including ORC (origin recognition complex), Cdc6, GINS (Sld5-Psf1-Psf2-Psf3), MCM (minichromosome maintenance), RPA (replication protein A), PCNA (proliferating cell nuclear antigen), RFC (replication factor C), FEN1 (flap endonuclease 1), in addition to the eukaryotic primase, DNA polymerase, and DNA ligase; these are obviously different from bacterial proteins and these proteins were biochemically characterized. Their similarities indicate that the DNA replication machineries of Archaea and Eukaryota evolved from a common ancestor, which was different from that of Bacteria.

Therefore, the archaeal organisms are good models to elucidate the functions of each component of the eukaryotic type replication machinery complex. Genomic and comparative genomic research with archaea is made easier by the fact that the genome size and the number of genes of archaea are much smaller than those of eukaryotes.

The archaeal replication machinery is probably a simplified form of that in eukaryotes. On the other hand, it is also interesting that the circular genome structure is conserved in Bacteria and Archaea and is different from the linear form of eukaryotic genomes. These features have encouraged us to study archaeal DNA replication, in the hopes of gaining fundamental insights into this molecular mechanism and its machinery from an evolutional perspective.

The study of bacterial DNA replication at a molecular level started in about 1960, and eukaryotic studies followed from 1980. Because Archaea was recognized as the third domain of life later, archaeal DNA replication research became active after 1990. With the increasing number of available total genome sequences, the progress of research on archaeal DNA replication has been rapid, and the depth of our knowledge of archaeal DNA replication has almost caught up with that of the bacterial and eukaryotic research fields. In this chapter, we will summarize the current knowledge of DNA replication in Archaea.

2. Replication origin

The basic mechanism of DNA replication was predicted as “replicon theory” by Jacob et al. They proposed that an initiation factor recognizes the replicator, now referred to as a replication origin, to start replication of the chromosomal DNA. Then, the replication origin of E. coli DNA was identified as oriC (origin of chromosome). The first archaeal replication origin was identified in Pyrococcus abyssi in 2001.

The origin was located just upstream of the gene encoding the Cdc6 and Orc1-like sequences in the Pyrococcus genome. We discovered a gene encoding an amino acid sequence that bore similarity to those of both eukaryotic Cdc6 and Orc1, which are the eukaryotic initiators. After confirming that this protein actually binds to the oriC region on the chromosomal DNA, we named the gene product Cdc6/Orc1 due to its roughly equal homology with regions of eukaryotic Orc1 and Cdc6. The gene forms an operon with the gene encoding DNA polymerase D (it was originally called Pol II, as the second DNA polymerase from Pyrococcus furiosus) in the genome.

A characteristic of the oriC is the conserved 13 bp repeats, as predicted earlier by bioinformatics, and two of the repeats are longer and surround a predicted DUE (DNA unwinding element) with an AT-rich sequence in Pyrococcus genomes (Figure 2). The longer repeated sequence was designated as an ORB (Origin Recognition Box), and it was actually recognized by Cdc6/Orc1 in a Sulfolobus solfataricus study. The 13 base repeat is called a miniORB, as a minimal version of ORB. A whole genome microarray analysis of P. abyssi showed that the Cdc6/Orc1 binds to the oriC region with extreme specificity, and the specific binding of the highly purified P. furiosus Cdc6/Orc1 to ORB and miniORB was confirmed in vitro. It has to be noted that multiple origins were identified in the Sulfolobus genomes. It is now well recognized that Sulfolobus has three origins and they work at the same time in the cell cycle.
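The two sequence features named above, short conserved repeats and an AT-rich stretch, are the kind of signal a simple scan can surface. The following is only a minimal sketch under stated assumptions, not the bioinformatic method used in the cited studies: the function names, the window size, the threshold, and the toy sequence are all hypothetical, and the repeat shown is invented for illustration rather than taken from any Pyrococcus ORB.

```python
from collections import defaultdict

def at_fraction(seq):
    """Fraction of A/T bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)

def at_rich_windows(seq, window=20, threshold=0.8):
    """Return (start, fraction) for every window whose AT fraction >= threshold."""
    hits = []
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits

def repeated_kmers(seq, k=13):
    """Map each k-mer that occurs more than once to its 0-based start positions."""
    positions = defaultdict(list)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        positions[seq[start:start + k]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}

# Hypothetical toy data: an invented 13-base repeat flanking an invented
# 20-base AT-only stretch. This is NOT a real origin sequence.
REPEAT = "GCTCCAGTGGAAC"
TOY = REPEAT + "ATATTTAATATTAAATTATA" + REPEAT

repeats = repeated_kmers(TOY, k=13)                          # repeat found at both flanks
due_like = at_rich_windows(TOY, window=20, threshold=1.0)    # the AT-only window
```

This exact-match scan only considers one strand and tolerates no mismatches; real origin repeats are often imperfect or inverted, so an actual analysis would need approximate matching and reverse-complement handling.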

Analysis of the mechanism of how the multiple origins are utilized for genome replication is an interesting subject in the research field of archaeal DNA replication. The main questions are how the initiation of replication from multiple origins is regulated and how the replication forks progress after the collision of two forks from opposite directions.

Figure 2. The oriC region in the Pyrococcus genome.

3. How does Cdc6/Orc1 recognize oriC?

An important step in characterizing the initiation of DNA replication in Archaea is to understand how the Cdc6/Orc1 protein recognizes the oriC region. Based upon amino acid sequence alignments, the archaeal Cdc6/Orc1 proteins belong to the AAA+ family of proteins. The crystal structures of the Cdc6/Orc1 protein from Pyrobaculum aerophilum and one of the two Cdc6/Orc1 proteins, ORC2 from Aeropyrum pernix (the two homologs in this organism are called ORC1 and ORC2 by the authors) were determined. These Cdc6/Orc1 proteins consist of three structural domains.

Domains I and II adopt a fold found in the AAA+ family proteins. A winged helix (WH) fold, which is present in a number of DNA binding proteins, is found in the domain III. There are four ORBs arranged in pairs on both sides of the DUE in the oriC region of A. pernix, and ORC1 binds to each ORB as a dimer. A mechanism was proposed in which ORC1 binds to all four ORBs to introduce a higher-order assembly for unwinding of the DUE with alterations in both topology and superhelicity. Furthermore, the crystal structures of S. solfataricus Cdc6-1 and Cdc6-3 (two of the three Cdc6/Orc1 proteins in this organism) forming a heterodimer bound to ori2 DNA (one of the three origins in this organism), and that of A. pernix ORC1 bound to an origin sequence were determined. These studies revealed that both the N-terminal AAA+ ATPase domain (domain I+II) and C-terminal WH domain (domain III) contribute to origin DNA binding, and the structural information not only defined the polarity of initiator assembly on the origin but also indicated the induction of substantial distortion, which probably triggers the unwinding of the duplex DNA to start replication, into the DNA strands. These structural data also provided the detailed interaction mode between the initiator protein and the oriC DNA.

Mutational analyses of the Methanothermobacter thermautotrophicus Cdc6-1 protein revealed the essential interaction between an arginine residue conserved in the archaeal Cdc6/Orc1 and an invariant guanine in the ORB sequence. P. furiosus Cdc6/Orc1 is difficult to purify in a soluble form. A specific site in the oriC at which unwinding starts in vitro was recently identified using the protein prepared by a denaturation-renaturation procedure.

As shown in Figure 2, the local unwinding site is about 670 bp away from the transition site between leading and lagging syntheses, which was determined earlier by an in vivo replication initiation point (RIP) assay. Although the details of the replication machinery that must be established at the unwound site are not fully understood in Archaea, it is expected to minimally include MCM, GINS, primase, PCNA, DNA polymerase, and RPA, as described below. The following P. furiosus studies revealed that the ATPase activity of the Cdc6/Orc1 was completely suppressed by binding to DNA containing the ORB.

Limited proteolysis and DNase I-footprint experiments suggested that the Cdc6/Orc1 protein changes its conformation on the ORB sequence in the presence of ATP. The physiological meaning of this conformational change has not been solved, but it should have an important function to start the initiation process as in the case of bacterial DnaA protein. In addition, results from an in vitro recruiting assay indicated that MCM (Mcm protein complex), the replicative DNA helicase, is recruited onto the oriC region in a Cdc6/Orc1-dependent, but not ATP-dependent, manner, as described below. However, this recruitment is not sufficient for the unwinding function of MCM, and some other function remains to be identified for the functional loading of this helicase to promote the progression of the DNA replication fork.

4. MCM helicase

After unwinding of the oriC region, the replicative helicase needs to remain loaded to provide continuous unwinding of double stranded DNA (dsDNA) as the replication forks progress bidirectionally. The MCM protein complex, consisting of six subunits (Mcm2, 3, 4, 5, 6, and 7), is known to be the replicative helicase “core” in eukaryotic cells.

The MCM further interacts with Cdc45 and GINS, to form a ternary assembly referred to as the “CMG complex”, that is believed to be the functional helicase in eukaryotic cells (Figure 3). However, this idea is still not universal for the eukaryotic replicative helicase.

Figure 3. DNA-Unwinding complex in eukaryotes and archaea.

The CMG complex is the replicative helicase for the template DNA unwinding reaction in eukaryotes. The archaeal genomes contain the homologs of the Mcm and Gins proteins, but a Cdc45 homolog has not been identified. Recent research suggests that a RecJ-like exonuclease GAN, which has weak sequence homology to that of Cdc45, may work as a helicase complex with MCM and GINS.

Most archaeal genomes appear to encode at least one Mcm homologue, and the helicase activities of these proteins from several archaeal organisms have been confirmed in vitro. In contrast to the eukaryotic MCM, the archaeal MCMs consist of a homohexamer or homo double hexamer and have distinct DNA helicase activity by themselves in vitro; therefore, these MCMs on their own may function as the replicative helicase in vivo.

The structure-function relationships of the archaeal Mcms have been aggressively studied using purified proteins and site-directed mutagenesis. An early report using the ChIP method showed that the P. abyssi Mcm protein preferentially binds to the origin in vivo in exponentially growing cells. The P. furiosus MCM helicase does not display significant helicase activity in vitro. However, the DNA helicase activity was clearly stimulated by the addition of GINS (the Gins23-Gins51 complex), which is the homolog of the eukaryotic GINS complex (described below in more detail). This result suggests that MCM works with other accessory factors to form a core complex in P. furiosus similar to the eukaryotic CMG complex as described above.

Some archaeal organisms have more than two Cdc6/Orc1 homologs. It was found that the two Cdc6/Orc1 homologs, Cdc6-1 and Cdc6-2, both inhibit the helicase activity of MCM in M. thermautotrophicus. Similarly, Cdc6-1 inhibits MCM activity in S. solfataricus. In contrast, the Cdc6-2 protein stimulates the helicase activity of MCM in Thermoplasma acidophilum. Functional interactions between Cdc6/Orc1 and Mcm proteins need to be investigated in greater detail to achieve a more comprehensive understanding of the conservation and diversity of the initiation mechanism in archaeal DNA replication.

Another interesting feature of DNA replication initiation is that several archaea have multiple genes encoding Mcm homologs in their genomes. Based on the recent comprehensive genomic analyses, thirteen archaeal species have more than one mcm gene. However, many of the mcm genes in the archaeal genomes seem to reside within mobile elements, originating from viruses. For example, two of the three genes in the Thermococcus kodakarensis genome are located in regions where genetic elements have presumably been integrated.

The genetic manipulation system established for T. kodakarensis is the first for a hyperthermophilic euryarchaeon, and is advantageous for investigating the function of these Mcm proteins. Two groups have recently performed gene disruption experiments for each mcm gene. These experiments revealed that the knock-out strains for mcm1 and mcm2 were easily isolated, but mcm3 could not be disrupted. Mcm3 is relatively abundant in the T. kodakarensis cells. Furthermore, an in vitro experiment using purified Mcm proteins showed that only Mcm3 forms a stable hexameric structure in solution. These results support the contention that Mcm3 is the main helicase core protein in the normal DNA replication process in T. kodakarensis.

The functions of the other two Mcm proteins remain to be elucidated. The genes for Mcm1 and Mcm2 are stably inherited, and their gene products may perform some important functions in the DNA metabolism in T. kodakarensis. The DNA helicase activity of the recombinant Mcm1 protein is strong in vitro, and a distinct amount of the Mcm1 protein is present in T. kodakarensis cells. Moreover, Mcm1 functionally interacts with the GINS complex from T. kodakarensis. These observations strongly suggest that Mcm1 does participate in some aspect of DNA transactions, and may be substituted with Mcm3.

Our immunoprecipitation experiments showed that Mcm1 co-precipitated with Mcm3 and GINS, although they did not form a heterohexameric complex, suggesting that Mcm1 is involved in the replisome or repairsome and shares some function in T. kodakarensis cells. Although western blot analysis could not detect Mcm2 in the extract from exponentially growing T. kodakarensis cells, a RT-PCR experiment detected the transcript of the mcm2 gene in the cells (Ishino et al., unpublished). The recombinant Mcm2 protein also has ATPase and helicase activities in vitro. Therefore, the mcm2 gene is expressed under normal growth conditions and may work in some process with a rapid turnover. Further experiments to measure the efficiency of mcm2 gene transcription by quantitative PCR, as well as to assess the stability of the Mcm2 protein in the cell extract, are needed.

Phenotypic analyses investigating the sensitivities of the Δmcm1 and Δmcm2 mutant strains to DNA damage caused by various mutagens, as reported for other DNA repair-related genes in T. kodakarensis, may provide a clue to elucidate the functions of these Mcm proteins. Methanococcus maripaludis S2 harbors four mcm genes in its genome, three of which seem to be derived from phage; a shotgun proteomics study detected peptides originating from three out of the four mcm gene products. Furthermore, the four gene products co-expressed in E. coli cells were co-purified in the same fraction. These results suggest that multiple Mcm proteins are functional in the M. maripaludis cells.

5. Recruitment of Mcm to the oriC region

Another important question is how MCM is recruited onto the unwound region of oriC. The detailed loading mechanism of the MCM helicase has not been elucidated. It is believed that archaea utilize divergent mechanisms of MCM helicase assembly at the oriC.

An in vitro recruiting assay showed that P. furiosus MCM is recruited to the oriC DNA in a Cdc6/Orc1-dependent manner. This assay revealed that preloading Cdc6/Orc1 onto the ORB DNA resulted in a clear reduction in MCM recruitment to the oriC region, suggesting that free Cdc6/Orc1 is preferable as a helicase recruiter, to associate with MCM and bring it to oriC. It would be interesting to understand how the two tasks, origin recognition and MCM recruiting, are performed by the Cdc6/Orc1 protein, because the WH domain, which primarily recognizes and binds ORB, also has strong affinity for the Mcm protein.

The assembly of the Mcm protein onto the ORB DNA by the Walker A-motif mutant of P. furiosus Cdc6/Orc1 occurred with the same efficiency as the wild type Cdc6/Orc1. The DNA binding of P. furiosus Cdc6/Orc1 was not drastically different in the presence and absence of ATP, as in the case of the initiator proteins from Archaeoglobus fulgidus, S. solfataricus, and A. pernix. Therefore, it is still not known whether the ATP binding and hydrolysis activity of Cdc6/Orc1 regulates the Mcm protein recruitment onto oriC in the cells.

One more important issue is the very low efficiency of the Mcm protein recruitment in the reported in vitro assay. Quantification of the recruited Mcm protein by the in vitro assay showed that less than one Mcm hexamer was recruited to the ORB. The linear DNA containing ORB1 and ORB2, used in the recruiting assay, may not be suitable to reconstitute the archaeal DNA replication machinery and a template that more closely mimics the chromosomal DNA may be required.

Additionally, it may be that as yet unidentified proteins are required to achieve efficient in vitro helicase loading in the P. furiosus cells. Finally, it will ultimately be necessary to construct a more defined in vitro replication system to analyze the regulatory functions of Cdc6/Orc1 precisely during replication initiation.

In M. thermautotrophicus, the Cdc6-2 proteins can dissociate the Mcm multimers. The activity of Cdc6-2 might be required as the MCM helicase loader in this organism. The interaction between Cdc6/Orc1 and Mcm is probably general. However, the effect of Cdc6/Orc1 on the MCM helicase activity differs among various organisms, as described above. Some other protein factors may function in various archaea, for example a protein that is distantly related to eukaryotic Cdt1, which plays a crucial role during MCM loading in Eukaryota, exists in some archaeal organisms, although its function has not been characterized yet.

6. GINS

The eukaryotic GINS complex was originally identified in Saccharomyces cerevisiae as an essential protein factor for the initiation of DNA replication. GINS consists of four different proteins, Sld5, Psf1, Psf2, and Psf3 (GINS is an acronym for the Japanese go-ichi-ni-san, meaning 5-1-2-3, after these four subunits). The amino acid sequences of the four subunits in the GINS complex share some conservation, suggesting that they are ancestral paralogs. However, most of the archaeal genomes have only one gene encoding this family protein, and more interestingly, the Crenarchaeota and Euryarchaeota (the two major subdomains of Archaea) characteristically have two genes with sequences similar to Psf2 and Psf3, and Sld5 and Psf1, respectively referred to as Gins23 and Gins51.

A Gins homolog, designated as Gins23, was biochemically detected in S. solfataricus as the first Gins protein in Archaea, in a yeast two-hybrid screening for interaction partners of the Mcm protein, and another subunit, designated as Gins15, was identified by mass-spectrometry analysis of an immunoaffinity-purified native GINS from an S. solfataricus cell extract. The S. solfataricus GINS, composed of two proteins, Gins23 and Gins15, forms a tetrameric structure with a 2:2 molar ratio. The GINS from P. furiosus, a complex of Gins23 and Gins51 with a 2:2 ratio, was identified as the first euryarchaeal GINS. Gins51 was preferred over Gins15 because of the order of the name of GINS.

The MCM2-7 hexamer was copurified in complex with Cdc45 and GINS from Drosophila melanogaster embryo extracts and S. cerevisiae lysates, and the “CMG (Cdc45-MCM2-7-GINS) complex” (Figure 3), as described above, should be important for the function of the replicative helicase. The CMG complex was also associated with the replication fork in Xenopus laevis egg extracts, and a large molecular machine, containing Cdc45, GINS, and MCM2-7, was proposed as the unwindosome to separate the DNA strands at the replication fork.

Therefore, GINS must be a critical factor for not only the initiation process, but also the elongation process in eukaryotic DNA replication. S. solfataricus GINS interacts with MCM and primase, suggesting that GINS is involved in the replisome. The concrete function of GINS in the replisome remains to be determined. No stimulation or inhibition of either the helicase or primase activity was observed by the interaction with S. solfataricus GINS in vitro. On the other hand, the DNA helicase activity of P. furiosus MCM is clearly stimulated by the addition of the P. furiosus GINS complex, as described above.

In contrast to S. solfataricus and P. furiosus, which each express a Gins23 and Gins51, Thermoplasma acidophilum has a single Gins homolog, Gins51. The recombinant Gins51 protein from T. acidophilum was confirmed to form a homotetramer by gel filtration and electron microscopy analyses. Furthermore, a physical interaction between T. acidophilum Gins51 and Mcm was detected by surface plasmon resonance (SPR) analysis. Although the T. acidophilum Gins51 did not affect the helicase activity of its cognate MCM when an equal ratio of each molecule was tested in vitro, an excess amount of Gins51 clearly stimulated the helicase activity (Ogino et al., unpublished). In the case of T. kodakarensis, the ATPase and helicase activities of MCM1 and MCM3 were clearly stimulated by T. kodakarensis GINS in vitro. It is interesting that the helicase activity of MCM1 was stimulated more than that of MCM3.

Physical interactions between the T. kodakarensis Gins and Mcm proteins were also detected. These reports suggested that the MCM-GINS complex is a common part of the replicative helicase in Archaea (Figure 3). Recently, the crystal structure of the T. kodakarensis GINS tetramer, composed of Gins51 and Gins23, was determined, and the structure was conserved with the reported human GINS structures. Each subunit of human GINS shares a similar fold, and assembles into the heterotetramer of a unique trapezoidal shape. Sld5 and Psf1 possess the α-helical (A) domain at the N-terminus and the β-stranded domain (B) at the C-terminus (AB-type).

On the other hand, Psf2 and Psf3 are the permuted version (BA-type). The backbone structure of each subunit and the tetrameric assembly of T. kodakarensis GINS are similar to those of human GINS. However, the location of the C-terminal B domain of Gins51 is remarkably different between the two GINS structures.

A homology model of the homotetrameric GINS from T. acidophilum was built using the T. kodakarensis GINS crystal structure as a template. The Gins51 protein has a long disordered region inserted between the A and B domains and this allows the conformation of the C-terminal domains to be more flexible. This domain arrangement leads to the formation of an asymmetric homotetramer, rather than a symmetrical assembly, of the T. kodakarensis GINS.

The Cdc45 protein is ubiquitously distributed from yeast to human, supporting the notion that the formation of the CMG complex is universal in the eukaryotic DNA replication process. However, no archaeal homologue of Cdc45 has been identified. A recent report of bioinformatic analysis showed that the primary structure of eukaryotic Cdc45 and prokaryotic RecJ share a common ancestry. Indeed, a homolog of the DNA binding domain of RecJ has been co-purified with GINS from S. solfataricus.

Our experiment detected the stimulation of the 5'-3' exonuclease activity of the RecJ homologs from P. furiosus and T. kodakarensis by the cognate GINS complexes (Ishino et al., unpublished). The RecJ homolog from T. kodakarensis forms a stable complex with the GINS, and the 5'-3' exonuclease activity is enhanced in vitro; therefore, the RecJ homolog was designated as GAN, from GINS-Associated Nuclease in a very recent paper.

Another related report found that the human Cdc45 structure obtained by the small angle X-ray scattering analysis (SAXS) is consistent with the crystallographic structure of the RecJ family members. These current findings will promote further research on the structures and functions of the higher-order unwindosome in archaeal and eukaryotic cells.

7. Primase

To initiate DNA strand synthesis, a primase is required for the synthesis of a short oligonucleotide, as a primer. The DnaG and p48-p58 proteins are the primases in Bacteria and Eukaryota, respectively. The p48-p58 primase is further complexed with p180 and p70, to form the DNA polymerase α-primase complex. The catalytic subunits of the eukaryotic (p48) and archaeal primases share a small but distinct sequence homology with those of the family X DNA polymerases.

The first archaeal primase was identified from Methanococcus jannaschii, as an ORF with a sequence similar to that of the eukaryotic p48. The gene product exhibited DNA polymerase activity and was able to synthesize oligonucleotides on the template DNA.

We characterized the p48-like protein (p41) from P. furiosus. Unexpectedly, the archaeal p41 protein did not synthesize short RNA by itself, but preferentially utilized deoxynucleotides to synthesize DNA strands up to several kilobases in length. Furthermore, the gene neighboring the p41 gene encodes a protein with very weak similarity to the p58 subunit of the eukaryotic primase. The gene product, designated p46, actually forms a stable complex with p41, and the complex can synthesize a short RNA primer, as well as DNA strands of several hundred nucleotides in vitro.

Short RNA primers, but not DNA primers, were identified in Pyrococcus cells; therefore, some mechanism that favors the use of RNA primers must exist in the cells.

Further research on the primase homologs from S. solfataricus, Pyrococcus horikoshii, and P. abyssi showed similar properties in vitro. Notably, p41 is the catalytic subunit, and the large one modulates the activity in the heterodimeric archaeal primases.

The small and large subunits are also called PriS and PriL, respectively. The crystal structure of the N-terminal domain of PriL complexed with PriS of S. solfataricus primase revealed that PriL does not directly contact the active site of PriS, and therefore, the large subunit may interact with the synthesized primer, to adjust its length to a 7-14 mer. The structure of the catalytic center is similar to those of the family X DNA polymerases.

The 3'-terminal nucleotidyl transferase activity, detected in the S. solfataricus primase, and the gap-filling and strand-displacement activities in the P. abyssi primase also support the structural similarity between PriS and the family X DNA polymerases. A unique activity, named PADT (template-dependent Polymerization Across Discontinuous Template), in the S. solfataricus PriSL complex was published very recently.

The activity may be involved in double-strand break repair in Archaea. The archaeal genomes also encode a sequence similar to the bacterial type DnaG primase. The DnaG homolog from the P. furiosus genome was expressed in E. coli, but the protein did not show any primer synthesis activity in vitro, and thus the archaeal DnaG-like protein may not act as a primase in Pyrococcus cells.

The DnaG-like protein was shown to participate in RNA degradation, as an exosome component. However, a recent paper reported that a DnaG homolog from S. solfataricus actually synthesizes primers with a 13 nucleotide length. It would be interesting to investigate if the two different primases share the primer synthesis for leading and lagging strand replication, respectively, in the Sulfolobus cells, as the authors suggested.

A proposed hypothesis about the evolution of PriSL and DnaG from the last universal common ancestor (LUCA) is interesting. The Sulfolobus PriSL protein was shown to interact with Mcm through Gins23. This primase-helicase interaction probably ensures the coupling of DNA unwinding and priming during replication fork progression.

Furthermore, the direct interaction between PriSL and the clamp loader RFC (described below) in S. solfataricus may regulate the primer synthesis and its transfer to DNA polymerase in archaeal cells.

8. Single-stranded DNA binding protein

The single-stranded DNA binding protein, which is called SSB in Bacteria and RPA in Archaea and Eukaryota, is an important factor to protect the unwound single-stranded DNA from nuclease attack, chemical modification, and other disruptions during the DNA replication and repair processes. SSB and RPA have a structurally similar domain containing a common fold, called the OB (oligonucleotide/oligosaccharide binding)-fold, although there is little amino acid sequence similarity between them.

The common structure suggests that the mechanism of single-stranded DNA binding is conserved in living organisms despite the lack of sequence similarity. E. coli SSB is a homotetramer of a 20 kDa peptide with one OB-fold, and the SSBs from Deinococcus radiodurans and Thermus aquaticus consist of a homodimer of the peptide containing two OB-folds.

The eukaryotic RPA is a stable heterotrimer, composed of 70, 32, and 14 kDa proteins. RPA70 contains two tandem repeats of an OB-fold, which are responsible for the major interaction with a single-stranded DNA in its central region. The N-terminal and C-terminal regions of RPA70 mediate interactions with RPA32 and also with many cellular or viral proteins. RPA32 contains an OB-fold in the central region, and the C-terminal region interacts with other RPA subunits and various cellular proteins. RPA14 also contains an OB-fold.

The eukaryotic RPA interacts with the SV40 T-antigen and the DNA polymerase α-primase complex, and thus forms part of the initiation complex at the replication origin. The RPA also stimulates Pol α-primase activity and PCNA-dependent Pol δ activity. The RPAs from M. jannaschii and M. thermautotrophicus were reported in 1998 as the first archaeal single-stranded DNA binding proteins. These proteins share amino acid sequence similarity with the eukaryotic RPA70, and contain four or five repeated OB-folds and one zinc-finger motif.

The M. jannaschii RPA exists as a monomer in solution, and has single-strand DNA binding activity. On the other hand, P. furiosus RPA forms a complex consisting of three distinct subunits, RPA41, RPA32, and RPA14, similar to the eukaryotic RPA. The P. furiosus RPA strikingly stimulates the RadA-promoted strand-exchange reaction in vitro.

While the euryarchaeal organisms have a eukaryotic-type RPA homologue, the crenarchaeal SSB proteins appear to be much more related to the bacterial proteins, with a single OB fold and a flexible C-terminal tail. However, the crystal structure of the SSB protein from S. solfataricus showed that the OB-fold domain is more similar to that of the eukaryotic RPAs, supporting the close relationship between Archaea and Eukaryota.

The RPA from Methanosarcina acetivorans displays a unique property. Unlike the multiple RPA proteins found in other archaea and eukaryotes, the three M. acetivorans RPAs, RPA1, RPA2, and RPA3, have 4, 2, and 2 OB-folds, respectively, and each can act as a distinct single-stranded DNA-binding protein. Furthermore, each of the three RPA proteins, as well as their combinations, clearly stimulates the primer extension activity of M. acetivorans DNA polymerase BI in vitro, as shown previously for bacterial SSB and eukaryotic RPA.

Architectures of SSB and RPA suggested that they are composed of different combinations of the OB fold. Bacterial and eukaryotic organisms contain one type of SSB or RPA, respectively. In contrast, archaeal organisms have various RPAs, composed of different organizations of OB-folds. A hypothesis that homologous recombination might play an important role in generating this diversity of OB-folds in archaeal cells was proposed, based on experiments characterizing the engineered RPAs with various OB-folds.

9. DNA polymerase

DNA polymerase catalyzes phosphodiester bond formation between the terminal 3'-OH of the primer and the α-phosphate of the incoming triphosphate to extend the short primer, and is therefore the main player of the DNA replication process. Based on amino acid sequence similarity, DNA polymerases have been classified into seven families: A, B, C, D, E, X, and Y.

The fundamental ability of DNA polymerases to synthesize a deoxyribonucleotide chain is widely conserved, but more specific properties, including processivity, synthesis accuracy, and substrate nucleotide selectivity, differ depending on the family. The enzymes within the same family have basically similar properties. E. coli has five DNA polymerases, and Pol I, Pol II, and Pol III belong to families A, B, and C, respectively. Pol IV and Pol V are classified in family Y, as the DNA polymerases for translesion synthesis (TLS). In eukaryotes, the replicative DNA polymerases, Pol α, Pol δ, and Pol ε, belong to family B, and the translesion DNA polymerases, Pol η, Pol ι, and Pol κ, belong to family Y.

The most interesting feature discovered at the inception of this research area was that the archaea indeed have the eukaryotic Pol α-like (family B) DNA polymerases. Members of the Crenarchaeota have at least two family B DNA polymerases. On the other hand, there is only one family B DNA polymerase in the Euryarchaeota. Instead, the euryarchaeal genomes encode a family D DNA polymerase, proposed as Pol D, which seems to be specific for these archaeal organisms and has never been found in other domains.

The genes for family Y-like DNA polymerases are conserved in several, but not all, archaeal genomes. The role of each DNA polymerase in the archaeal cells is still not known, although the distribution of the DNA polymerases is getting clearer.

The first family D DNA polymerase was identified from P. furiosus, by screening for DNA polymerase activity in the cell extract. The corresponding gene was cloned, revealing that this new DNA polymerase consists of two proteins, named DP1 and DP2, and that the deduced amino acid sequences of these proteins were not conserved in the DNA polymerase families. P. furiosus Pol D exhibits efficient strand extension activity and strong proofreading activity.

Other family D DNA polymerases were also characterized by several groups. The Pol D genes had been found only in Euryarchaeota. However, recent environmental genomics and cultivation efforts revealed novel phyla in Archaea: Thaumarchaeota, Korarchaeota, and Aigarchaeota, and their genome sequences harbor the genes encoding Pol D. A genetic study on Halobacterium sp. NRC-1 showed that both Pol B and Pol D are essential for viability.

An interesting question is whether Pol B and Pol D work together at the replication fork for the synthesis of the leading and lagging strands, respectively. Given its use of an RNA primer and its strand displacement activity, Pol D may catalyze lagging strand synthesis.

Thaumarchaeota and Aigarchaeota harbor the genes encoding Pol D and crenarchaeal Pol BII, while Korarchaeota encodes Pol BI, Pol BII and Pol D. Biochemical characterization of these gene products will contribute to research on the evolution of DNA polymerases in living organisms.

A hypothesis has been proposed that the archaeal ancestor of eukaryotes encoded three DNA polymerases, two distinct family B DNA polymerases and a family D DNA polymerase, which all contributed to the evolution of the eukaryotic replication machinery, consisting of Pol α, Pol δ, and Pol ε. A protein is encoded in the plasmid pRN1 isolated from a Sulfolobus strain. This protein, ORF904 (named RepA), has primase and DNA polymerase activities in the N-terminal domain and helicase activity in the C-terminal domain, and is likely to be essential for the replication of pRN1.

The amino acid sequence of the N-terminal domain lacks homology to any known DNA polymerases or primases, and therefore, family E is proposed. Similar proteins are encoded by various archaeal and bacterial plasmids, as well as by some bacterial viruses.

Recently, one protein, tn2-12p, encoded in the plasmid pTN2 isolated from Thermococcus nautilus, was experimentally identified as a DNA polymerase in this family. This enzyme is likely responsible for the replication of the plasmid. Further investigations of this family of DNA polymerases will be interesting from an evolutionary perspective.

10. PCNA and RFC

The sliding clamp with the doughnut-shaped ring structure is conserved among living organisms, and functions as a platform or scaffold for proteins to work on the DNA strands. The eukaryotic and archaeal PCNAs form a homotrimeric ring structure, which encircles the DNA strand and anchors many important proteins involved in DNA replication and repair (Figure 4).

PCNA works as a processivity factor that retains the DNA polymerase on the DNA by binding it on one surface (front side) of the ring for continuous DNA strand synthesis in DNA replication (Figure 5). To introduce the DNA strand into the central hole of the clamp ring, a clamp loader is required to interact with the clamp and open its ring. The archaeal and eukaryotic clamp loader is called RFC (Figure 5).

The most studied archaeal PCNA and RFC molecules to date are P. furiosus PCNA and RFC. The PCNA and RFC molecules are essential for DNA polymerase to perform processive DNA synthesis. The molecular mechanism of the clamp loading process has been actively investigated (Figure 5).

An intermediate PCNA-RFC-DNA complex, in which the PCNA ring is opened in an out-of-plane mode, was detected by single particle analysis of electron microscopic images using P. furiosus proteins (Figure 6).

The crystal structure of the complex, including the ATP-bound clamp loader, the ring-opened clamp, and the template-primer DNA, using proteins from bacteriophage T4, has recently been published, and our knowledge about the clamp loading mechanism is continuously progressing.

Figure 4. PCNA-interacting proteins.

After clamp loading, DNA polymerase accesses the clamp and the polymerase-clamp complex performs processive DNA synthesis. Therefore, structural and functional analyses of the DNA polymerase-PCNA complex are the next target to elucidate the overall mechanisms of replication fork progression. The PCNA interacting proteins contain a small conserved sequence motif, called the PIP box, which binds to a common site on PCNA. The PIP box consists of the sequence “Qxxhxxaa”, where “x” represents any amino acid, “h” represents a hydrophobic residue (e.g. L, I or M), and “a” represents an aromatic residue (e.g. F, Y or W). Archaeal DNA polymerases have PIP box-like motifs in their sequences. However, only a few studies have experimentally investigated the function of the motifs. The crystal structure of P. furiosus Pol B complexed with a monomeric PCNA mutant was determined, and a convincing model of the polymerase-PCNA ring interaction was constructed. This study revealed that a novel interaction is formed between a stretched loop of PCNA and the thumb domain of Pol B, in addition to the authentic PIP box.
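
The PIP box consensus described above can be read as a simple sequence pattern. The following Python sketch is only an illustration of how one might scan a protein sequence for PIP box-like motifs; it is not a method from the reviewed studies. The function name `find_pip_boxes` and the toy sequence are hypothetical, and the hydrophobic position is restricted to the three example residues given in the text (L, I, M), which is an assumption, since the text lists them only as examples.

```python
import re

# Consensus from the text: Q-x-x-h-x-x-a-a
#   x: any amino acid
#   h: hydrophobic residue; limited here to the examples L, I, M (assumption)
#   a: aromatic residue F, Y, or W
# A lookahead is used so that overlapping candidate motifs are all reported.
PIP_BOX = re.compile(r"(?=(Q[A-Z]{2}[LIM][A-Z]{2}[FYW]{2}))")


def find_pip_boxes(protein_seq):
    """Return (0-based start, 8-residue match) for each PIP box-like motif."""
    seq = protein_seq.upper()
    return [(m.start(), m.group(1)) for m in PIP_BOX.finditer(seq)]


# Hypothetical toy sequence, not taken from any real protein.
toy_sequence = "MKTAQKGLDAFFGSR"
print(find_pip_boxes(toy_sequence))  # [(4, 'QKGLDAFF')]
```

A pattern match of this kind only flags candidates. As the text notes, few studies have tested the function of such motifs in archaeal DNA polymerases, so a hit should not be taken as evidence of PCNA binding.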

A comparison of the model structure with the previously reported structures of a family B DNA polymerase from RB69 phage, complexed with DNA, suggested that the second interaction site plays a crucial role in switching between the polymerase and exonuclease modes, by inducing a PCNA-polymerase complex configuration that favors synthesis over editing.

This putative mechanism for the fidelity control of replicative DNA polymerases is supported by experiments, in which mutations at the second interaction site enhanced the exonuclease activity in the presence of PCNA. Furthermore, the three-dimensional structure of the DNA polymerase-PCNA-DNA ternary complex was analyzed by electron microscopic (EM) single particle analysis.

This structural view revealed the entire domain configuration of the trimeric ring of PCNA and DNA polymerase, including the protein-protein or protein-DNA contacts. This architecture provides clearer insights into the switching mechanism between the editing and synthesis modes.

Figure 5. Mechanisms of processive DNA synthesis

In contrast to most euryarchaeal organisms, which have a single PCNA homolog forming a homotrimeric ring structure, the majority of crenarchaea have multiple PCNA homologues, and they are capable of forming heterotrimeric rings for their functions. It is especially interesting that the three PCNAs, PCNA1, PCNA2, and PCNA3, specifically bind PCNA binding proteins, including DNA polymerases, DNA ligases, and FEN-1 endonuclease.

