Our discussion of RNA synthesis begins with a comparison between transcription and DNA replication (Chapter 25). Transcription resembles replication in its fundamental chemical mechanism, its polarity (direction of synthesis), and its use of a template. And like replication, transcription has initiation, elongation, and termination phases. Transcription differs from replication in that it does not require a primer and, generally, involves only limited segments of a DNA molecule. Additionally, only one DNA strand serves as a template for a particular RNA molecule.
The discovery of DNA polymerase and its dependence on a DNA template spurred a search for an enzyme that synthesizes RNA complementary to a DNA strand. By 1960, four research groups had independently detected an enzyme in cellular extracts that could form an RNA polymer from ribonucleoside 5′-triphosphates. Subsequent work on the purified Escherichia coli RNA polymerase helped to define the fundamental properties of transcription (Fig. 26-1). DNA-dependent RNA polymerase requires, in addition to a DNA template, all four ribonucleoside 5′-triphosphates (ATP, GTP, UTP, and CTP) as precursors of the nucleotide units of RNA, as well as Mg²⁺. The chemistry and mechanism of RNA synthesis closely resemble those used by DNA polymerases (see Fig. 25-3). RNA polymerase elongates an RNA strand by adding ribonucleotide units to the 3′-hydroxyl end, building RNA in the 5′→3′ direction. The 3′-hydroxyl group acts as a nucleophile, attacking the α phosphate of the incoming ribonucleoside triphosphate (Fig. 26-1a) and releasing pyrophosphate. The overall reaction is
(NMP)ₙ + NTP → (NMP)ₙ₊₁ + PPᵢ
where (NMP)ₙ is the RNA and (NMP)ₙ₊₁ is the lengthened RNA.
RNA polymerase requires DNA for activity and is most active when bound to a double-stranded DNA. As noted above, only one of the two DNA strands serves as a template. The template DNA strand is copied in the 3′→5′ direction (antiparallel to the new RNA strand), just as in DNA replication. Each nucleotide in the newly formed RNA is selected by Watson-Crick base-pairing interactions: U residues are inserted in the RNA to pair with A residues in the DNA template, G residues are inserted to pair with C residues, and so on. Base-pair geometry (see Fig. 25-5) may also play a role in base selection.
Unlike DNA polymerase, RNA polymerase does not require a primer to initiate synthesis. Initiation occurs when RNA polymerase binds at specific DNA sequences called promoters (described below). The 5′-triphosphate group of the first residue in a nascent (newly formed) RNA molecule is not cleaved to release PPᵢ, but instead remains intact and functions in eukaryotes as a substrate for the RNA-capping machinery (see Fig. 26-13). During the elongation phase of transcription, the growing end of the new RNA strand base-pairs temporarily with the DNA template to form a short hybrid RNA-DNA double helix, about 8 bp long (Fig. 26-1b). The RNA in this hybrid duplex “peels off” shortly after its formation, and the DNA duplex re-forms.
To enable RNA polymerase to synthesize an RNA strand complementary to one of the DNA strands, the DNA duplex must unwind over a short distance, forming a transcription “bubble.” During transcription, the E. coli RNA polymerase generally keeps about 17 bp unwound. The 8 bp RNA-DNA hybrid occurs in this unwound region. Elongation of a transcript by E. coli RNA polymerase proceeds at a rate of 50 to 90 nucleotides/s. Because DNA is a helix, movement of a transcription bubble requires considerable strand rotation of the nucleic acid molecules. DNA strand rotation is restricted in most DNAs by DNA-binding proteins and other structural barriers. As a result, a moving RNA polymerase generates waves of positive supercoils ahead of the transcription bubble and negative supercoils behind (Fig. 26-1c). This has been observed both in vitro and in vivo (in bacteria). In the cell, the topological problems caused by transcription are relieved through the action of topoisomerases (Chapter 24).
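As a rough worked example of the elongation rate quoted above, the sketch below estimates how long a polymerase moving at 50 to 90 nucleotides/s would take to traverse a gene. The 3,000-nucleotide gene length and the function name are hypothetical, chosen only for illustration, and pausing is ignored.

```python
def elongation_time_s(length_nt: float, rate_nt_per_s: float) -> float:
    """Seconds needed to synthesize a transcript at a constant rate.

    Assumes uninterrupted elongation (no pausing or termination).
    """
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return length_nt / rate_nt_per_s


gene_length_nt = 3000  # hypothetical gene length, for illustration only
slowest = elongation_time_s(gene_length_nt, 50)  # lower bound of quoted rate
fastest = elongation_time_s(gene_length_nt, 90)  # upper bound of quoted rate
print(f"{fastest:.0f} to {slowest:.0f} s per transcript")
```

Under these assumptions a single transcript of this length takes on the order of half a minute to a minute to complete.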
The two complementary DNA strands have different roles in transcription. The strand that serves as template for RNA synthesis is called the template strand. The DNA strand complementary to the template, the nontemplate strand, or coding strand, is identical in base sequence to the RNA transcribed from the gene, with U in the RNA in place of T in the DNA (Fig. 26-2). The coding strand for a particular gene may be located in either strand of a given chromosome (as shown in Fig. 26-3 for a virus). By convention, the regulatory sequences that control transcription (described later in this chapter) are designated by the sequences in the coding strand.
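The relationship between the two strands and the transcript can be sketched in a few lines of code. This is a minimal illustration, not a bioinformatics tool: the function names and the example sequence are hypothetical, and the template strand is written 3′→5′ so that each base lines up with the RNA base it specifies.

```python
# Base-pairing rules used when RNA is copied from a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5_to_3: str) -> str:
    """Template strand, written 3'->5', that pairs with the coding strand."""
    return "".join(DNA_TO_DNA[b] for b in coding_5_to_3.upper())


def transcribe(template_3_to_5: str) -> str:
    """RNA (5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[b] for b in template_3_to_5.upper())


coding = "ATGGCATTC"                     # hypothetical coding strand, 5'->3'
template = template_from_coding(coding)  # written 3'->5'
rna = transcribe(template)

# The transcript matches the coding strand, with U in place of T.
assert rna == coding.replace("T", "U")
```

The final assertion restates the convention in the text: the coding strand and the RNA carry the same sequence, apart from the substitution of U for T.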
The DNA-dependent RNA polymerase of E. coli is a large, complex enzyme with five core subunits (α₂ββ′ω) and a sixth subunit, one of a group designated σ, with variants designated by size (molecular weight). The σ subunit binds transiently to the core and directs the enzyme to specific binding sites on the DNA (described below). These six subunits constitute the RNA polymerase holoenzyme (Fig. 26-4). The RNA polymerase holoenzyme of E. coli thus exists in several forms, depending on the type of σ subunit. The most common subunit is σ⁷⁰ (Mᵣ 70,000), and the upcoming discussion focuses on the corresponding RNA polymerase holoenzyme.
RNA polymerases lack a separate proofreading exonuclease active site (such as that of many DNA polymerases), and the error rate for transcription is higher than that for chromosomal DNA replication: approximately one error for every 10⁴ to 10⁵ ribonucleotides incorporated into RNA. Because many copies of an RNA are generally produced from a single gene, and nearly all RNAs are eventually degraded and replaced, a mistake in an RNA molecule is of less consequence to the cell than a mistake in the permanent information stored in DNA. Many RNA polymerases, including bacterial RNA polymerase and the eukaryotic RNA polymerase II (discussed below), do pause when a mispaired base is added during transcription, and they can remove mismatched nucleotides from the 3′ end of a transcript by direct reversal of the polymerase reaction. But we do not yet know whether this activity is a true proofreading function and to what extent it may contribute to the fidelity of transcription.
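To put the error rate in perspective, the following sketch computes the expected number of errors in one transcript and the chance that a transcript is error-free. It assumes an error rate of one in 10⁴ to 10⁵ nucleotides, that errors occur independently at each position, and a hypothetical 1,000-nucleotide transcript; the function names are illustrative only.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Mean number of misincorporated nucleotides per transcript."""
    return length_nt * error_rate


def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that every position is correct (independent errors assumed)."""
    return (1.0 - error_rate) ** length_nt


transcript_length = 1000  # hypothetical transcript length
for rate in (1e-4, 1e-5):  # assumed error rates, one per 10^4 and 10^5 nt
    mean = expected_errors(transcript_length, rate)
    clean = prob_error_free(transcript_length, rate)
    print(f"rate={rate:g}: mean errors={mean:.3f}, error-free fraction={clean:.3f}")
```

Even at the higher assumed rate, most transcripts of this length would carry no error at all, which is consistent with the argument that occasional mistakes in RNA are tolerable.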
Initiation of RNA synthesis at random points in a DNA molecule would be an extraordinarily wasteful process. Instead, an RNA polymerase binds to specific sequences in the DNA called promoters, which direct the transcription of adjacent segments of DNA (genes). The sequences where RNA polymerases bind are variable, and much research has focused on identifying the particular sequences that are critical to promoter function.
In E. coli, RNA polymerase binding occurs within a region stretching from about 70 bp before the transcription start site to about 30 bp beyond it. By convention, the DNA base pairs that correspond to the beginning of an RNA molecule are given positive numbers, and those preceding the RNA start site are given negative numbers. The promoter region thus extends between positions −70 and +30. Analyses and comparisons of the most common class of bacterial promoters (those recognized by an RNA polymerase holoenzyme containing σ⁷⁰) have revealed consensus sequences centered about positions −10 and −35 (Fig. 26-5a). Although the sequences are not identical for all bacterial promoters in this class, certain nucleotides that are particularly common at each position form a consensus sequence. The consensus sequence at the −10 region is TATAAT; at the −35 region it is TTGACA. A third AT-rich recognition element, called the UP (upstream promoter) element, occurs between positions −40 and −60 in the promoters of certain highly expressed genes. The UP element is bound by the α subunit of RNA polymerase. The efficiency with which an RNA polymerase containing σ⁷⁰ binds to a promoter and initiates transcription is determined in large measure by these sequences, the spacing between them, and their distance from the transcription start site. A change in only one base pair in the promoter can decrease the rate of binding by several orders of magnitude. The promoter sequence thus establishes a basal level of expression that can vary greatly from one E. coli gene to the next. The x-ray crystal structure of the RNA polymerase holoenzyme bound to its promoter shows how the σ factor recognizes both the RNA polymerase and the −10 and −35 regions by introducing a large bend in the DNA (Fig. 26-5b). Information about these interactions can also be obtained using the method illustrated in Box 26-1.
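The idea of scoring a sequence against the two consensus hexamers can be sketched as follows. This is a deliberately naive illustration, not an established promoter-prediction method: it counts identities to TTGACA and TATAAT, and it assumes a spacer of 16 to 18 bp between the two hexamers, a range that is not stated in the text. All names and the test sequence are hypothetical.

```python
MINUS_35 = "TTGACA"  # consensus for the upstream hexamer
MINUS_10 = "TATAAT"  # consensus for the downstream hexamer


def identities(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_promoter_hit(coding_strand: str, spacers=range(16, 19)):
    """Return (score, start_of_-35, start_of_-10, spacer) for the best match.

    score is the number of consensus identities out of 12. The spacer
    range is an assumption made for this sketch. Returns None if the
    sequence is too short to hold both hexamers.
    """
    seq = coding_strand.upper()
    best = None
    for i in range(len(seq)):
        for gap in spacers:
            j = i + 6 + gap  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = identities(seq[i:i + 6], MINUS_35) + identities(seq[j:j + 6], MINUS_10)
            if best is None or score > best[0]:
                best = (score, i, j, gap)
    return best


# Hypothetical sequence containing both consensus hexamers, 17 bp apart.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CCCC"
print(best_promoter_hit(example))
```

A single-base change in either hexamer lowers the score by one, a crude stand-in for the observation that one altered base pair can sharply reduce polymerase binding.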
The pathway of transcription initiation and the fate of the σ subunit are illustrated in Figure 26-6. The pathway consists of two major parts, binding and initiation, each with multiple steps. First, the polymerase, directed by its bound σ factor, binds to the promoter. A closed complex (in which the bound DNA remains double-stranded) and an open complex (in which the bound DNA is partially unwound near the −10 sequence) form in succession. Second, transcription is initiated within the complex, leading to a conformational change that converts the complex to the elongation form, followed by movement of the transcription complex away from the promoter (promoter clearance). Any of these steps can be affected by the specific makeup of the promoter sequences. The σ subunit dissociates at random as the polymerase enters the elongation phase of transcription. The protein NusA binds to the elongating RNA polymerase, competitively with the σ subunit. Once transcription is complete, NusA dissociates from the enzyme, the RNA polymerase dissociates from the DNA, and a σ factor (σ⁷⁰ or another) can again bind to the enzyme to initiate transcription.
[Figure 26-6. Pathway of transcription initiation and elongation by E. coli RNA polymerase (figure description). Step 1: The RNA polymerase core and the σ⁷⁰ subunit bind to the DNA promoter, forming the closed complex. Step 2: The transcription bubble forms at the promoter, giving the open complex. Step 3: Transcription is initiated, and a short RNA begins to emerge from the polymerase. Step 4: Promoter clearance is followed by elongation. Step 5: Elongation continues; σ⁷⁰ dissociates and is replaced by NusA. Step 6: Transcription is terminated; NusA dissociates and the RNA is released, and the RNA polymerase, free of DNA and σ⁷⁰, is recycled and ready to repeat step 1.]
E. coli has other classes of promoters bound by RNA polymerase holoenzymes with different σ subunits, such as the promoters of the heat shock genes. The products of this set of genes are made at higher levels when the cell is exposed to environmental stress, such as a sudden increase in temperature. RNA polymerase binds to the promoters of these genes only when σ⁷⁰ is replaced with the σ³² (Mᵣ 32,000) subunit, which is specific for the heat shock promoters (see Fig. 28-3). By using different σ subunits, the cell can coordinate the expression of sets of genes, permitting major changes in cell physiology. Which sets of genes are expressed is determined by the availability of the various σ subunits. This, in turn, is determined by several factors: regulated rates of synthesis and degradation, posttranslational modifications that switch individual σ subunits between active and inactive forms, and a specialized class of anti-σ proteins, each type binding to and sequestering a particular σ subunit to render it unavailable for transcription initiation.
Requirements for any gene product vary with cellular conditions or developmental stage, and transcription of each gene is carefully regulated to form gene products only in the proportions needed. Regulation can occur at any step of transcription, including elongation and termination. However, much of the regulation is directed at the polymerase binding and transcription initiation steps outlined in Figure 26-6. Differences in promoter sequences are just one of several levels of control.
The binding of proteins to sequences both near to and distant from the promoter can also affect levels of gene expression. Protein binding can activate transcription by facilitating either RNA polymerase binding or steps farther along in the initiation process, or it can repress transcription by blocking the activity of the polymerase. In E. coli, one protein that activates transcription is the cAMP receptor protein (CRP), which increases the transcription of genes coding for enzymes that metabolize sugars other than glucose when cells are grown in the absence of glucose. Repressors are proteins that block the synthesis of RNA at specific genes. In the case of the Lac repressor, transcription of the genes for the enzymes of lactose metabolism is blocked when lactose is unavailable.
As described further in Chapter 27, transcription of mRNAs and their translation are tightly coupled in bacteria. As a protein-coding gene is being transcribed, ribosomes rapidly bind to and begin to translate the mRNA before its synthesis is complete. Another protein, NusG, binds directly to both the ribosome and RNA polymerase, linking the two complexes. The rate of translation directly affects the rate of transcription. In contrast, eukaryotes carry out transcription in the nucleus and translation in the cytoplasm, making it impossible for these two steps to be physically coupled.
RNA synthesis is processive; that is, the RNA polymerase introduces a large number of nucleotides into a growing RNA molecule before dissociating (p. 917). This is necessary because, if the polymerase released an RNA transcript prematurely, it could not resume synthesis of the same RNA and would have to start again from the beginning of the gene. However, an encounter with certain DNA sequences results in a pause in RNA synthesis, and at some of these sequences transcription is terminated. Our focus here is again on the well-studied systems in bacteria. E. coli has at least two classes of termination signals: one class relies on a protein factor called ρ (rho), and the other is ρ-independent.
Most ρ-independent terminators have two distinguishing features. The first is a region that produces an RNA transcript with self-complementary sequences, permitting the formation of a hairpin structure (see Fig. 8-19a) centered 15 to 20 nucleotides before the projected end of the RNA strand. The second feature is a highly conserved string of three A residues in the template strand that are transcribed into U residues near the 3′ end of the hairpin. When a polymerase arrives at a termination site with this structure, it pauses (Fig. 26-7a). Formation of the hairpin structure in the RNA disrupts several base pairs in the RNA-DNA hybrid segment and may disrupt important interactions between RNA and the RNA polymerase, facilitating dissociation of the transcript.
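The two features described above, a self-complementary region followed by a run of U residues, can be searched for with a short routine. The sketch below is a simplified illustration under stated assumptions: it requires a perfect Watson-Crick stem of fixed length (no G-U pairs, bulges, or mismatches), a loop of 3 to 8 nucleotides, and three U residues immediately after the stem. These parameter values, the function names, and the example RNA are all hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna: str) -> str:
    """Sequence that would pair with `rna` in an antiparallel stem."""
    return "".join(PAIR[b] for b in reversed(rna))


def find_terminator(rna: str, stem: int = 5, loops=range(3, 9), u_run: int = 3):
    """Return (stem_start, loop_length) of the first hairpin followed by U's.

    Assumes a perfect stem of `stem` base pairs and a U run directly after
    the 3' arm of the stem. Returns None if no such structure is found.
    """
    rna = rna.upper()
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        wanted_right = reverse_complement_rna(left)
        for loop in loops:
            r = i + stem + loop  # start of the 3' arm of the stem
            right = rna[r:r + stem]
            tail = rna[r + stem:r + stem + u_run]
            if right == wanted_right and tail == "U" * u_run:
                return i, loop
    return None


# Hypothetical transcript: 5-bp stem, 4-nt loop, then UUU.
example = "CC" + "GCCGC" + "UUAA" + "GCGGC" + "UUU" + "A"
print(find_terminator(example))
```

Real terminators tolerate imperfect stems, so a practical search would score candidate hairpins by predicted stability instead of demanding exact complementarity.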
[Figure 26-7. Termination of transcription in E. coli (figure description). (a) ρ-independent termination: the polymerase reaches a terminator sequence with a run of A residues in the template strand, transcribed as U residues in the RNA. An RNA hairpin forms at a palindromic sequence and disrupts interactions between the RNA and the DNA template within the polymerase. The mRNA is released. (b) ρ-dependent termination: no terminator sequence is shown. The hexameric ρ helicase binds the RNA, which carries a rut element near its 5′ end, and moves along the RNA toward the polymerase. The ρ helicase separates the mRNA from the DNA template, and the RNA and ρ are released.]
The ρ-dependent terminators lack the sequence of repeated A residues in the template strand but usually include a CA-rich sequence called a rut (rho utilization) element. The ρ protein associates with the RNA at specific binding sites and migrates in the 5′→3′ direction until it reaches the transcription complex that is paused at a termination site (Fig. 26-7b). Here it promotes release of the RNA transcript. The ρ protein has an ATP-dependent RNA-DNA helicase activity that permits translocation of the protein along the RNA, and ATP is hydrolyzed by the ρ protein during the termination process. The detailed mechanism by which the ρ protein promotes the release of the RNA transcript is not known.
The transcriptional machinery in the nucleus of a eukaryotic cell is much more complex than that in bacteria. Eukaryotes have three nuclear RNA polymerases, designated I, II, and III, which are distinct complexes but have certain subunits in common. Each polymerase has a specific function (Table 26-1) and is recruited to a specific promoter sequence. In addition, eukaryotic mitochondria and chloroplasts have their own RNA polymerases for transcription of genes encoded in their own DNA (see Fig. 19-40). The RNA polymerases in these organelles are similar to bacterial RNA polymerases and less elaborate than the nuclear transcription machinery discussed below.
RNA polymerase | Types of RNA synthesized |
---|---|
I | Pre-ribosomal RNA |
II | mRNA, ncRNA |
III | tRNA, 5S rRNA, ncRNA |
RNA polymerase I (Pol I) is responsible for the synthesis of only one type of RNA, a transcript called pre-ribosomal RNA (or pre-rRNA), which contains the precursor for the 18S, 5.8S, and 28S rRNAs. The principal function of RNA polymerase II (Pol II) is the synthesis of mRNAs and many ncRNAs. This enzyme can recognize thousands of promoters that vary greatly in sequence. Some Pol II promoters have a few sequence features in common, including a TATA box (eukaryotic consensus sequence TATA(A/T)A(A/T)(A/G)) near base pair −30 and an Inr sequence (initiator) near the RNA start site at +1 (Fig. 26-8). However, such promoters are in the minority, and elaborate interactions with regulatory proteins guide Pol II function at many promoters that lack these features.
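The TATA box consensus given above, TATA(A/T)A(A/T)(A/G), translates directly into a regular expression. The sketch below is a minimal illustration that reports exact consensus matches in a coding-strand sequence; the function name and the example sequence are hypothetical, and a match says nothing about whether a site is functional.

```python
import re

# TATA(A/T)A(A/T)(A/G) written as a regular expression.
TATA_BOX = re.compile(r"TATA[AT]A[AT][AG]")


def find_tata_boxes(coding_strand: str):
    """Return (start_index, matched_sequence) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in TATA_BOX.finditer(coding_strand.upper())]


example = "GGCTATAAAAGGC"  # hypothetical sequence
print(find_tata_boxes(example))
```

Because many Pol II promoters lack a TATA box altogether, an empty result from such a search does not imply the absence of a promoter.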
[Figure 26-8. Common sequence features of a Pol II promoter (figure description), drawn 5′ to 3′: various regulatory sequences upstream; a TATA box (TATAAA) near −30; and an Inr element (YYAN(T/A)YY) spanning the transcription start site at +1.]
RNA polymerase III (Pol III) makes tRNAs, the 5S rRNA, and other small, specialized ncRNAs, including the U6 RNA component of the spliceosome, which we will discuss in Section 26.2. The promoters recognized by Pol III are well characterized. Some of the sequences required for the regulated initiation of transcription by Pol III are located within the gene itself, whereas others are in more conventional locations upstream of the RNA start site (Chapter 28).
RNA polymerase II is central to eukaryotic gene expression and has been studied extensively. Although this polymerase is strikingly more complex than its bacterial counterpart, the complexity masks a remarkable conservation of structure, function, and mechanism. Pol II isolated from either yeast or human cells is a 12-subunit enzyme with an aggregate molecular weight of more than 510,000. The largest subunit (RBP1) exhibits a high degree of homology to the β′ subunit of bacterial RNA polymerase. Another subunit (RBP2) is structurally similar to the bacterial β subunit, and two others (RBP3 and RBP11) show some structural homology to the two bacterial α subunits. Pol II must function with genomes that are more complex and with DNA molecules more elaborately packaged than in bacteria. The need for protein-protein contacts with the numerous other protein factors required to navigate this labyrinth accounts in large measure for the added complexity of the eukaryotic polymerase.
The largest subunit of Pol II (RBP1) also has an unusual feature, a long carboxyl-terminal tail consisting of many repeats of a consensus heptad amino acid sequence, —YSPTSPS—. There are 26 repeats in the yeast enzyme (19 exactly matching the consensus) and 52 (21 exact) in the mouse and human enzymes. This carboxyl-terminal domain (CTD) is separated from the main body of the enzyme by an intrinsically disordered linker sequence. The CTD has many important roles in Pol II function, as outlined below.
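Counting how many heptads in a CTD match the consensus exactly, as in the figures quoted above, is a simple string operation. The sketch below assumes the input begins at the first residue of the first repeat, so that the heptads are in frame; the function name and the short example sequence are hypothetical and do not represent a real CTD.

```python
CONSENSUS = "YSPTSPS"  # consensus heptad from the text


def heptad_stats(ctd: str):
    """Return (number_of_complete_heptads, number_matching_consensus_exactly).

    Assumes the sequence starts in frame with the first repeat; any
    trailing residues that do not fill a heptad are ignored.
    """
    ctd = ctd.upper()
    usable = len(ctd) - len(ctd) % 7
    heptads = [ctd[i:i + 7] for i in range(0, usable, 7)]
    exact = sum(h == CONSENSUS for h in heptads)
    return len(heptads), exact


# Hypothetical tail: four exact repeats and one variant repeat.
example = "YSPTSPS" * 3 + "YSPTSPA" + "YSPTSPS"
print(heptad_stats(example))
```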
RNA polymerase II requires an array of other proteins, called transcription factors, to form the active transcription complex. The general transcription factors required at every Pol II promoter (factors usually designated TFII with an additional identifier) are highly conserved in all eukaryotes (Table 26-2). The process of transcription by Pol II can be described in terms of several phases — assembly, initiation, elongation, termination — each associated with characteristic proteins (Fig. 26-9). The step-by-step pathway described below leads to active transcription in vitro. In the cell, many of the proteins may be present in larger, preassembled complexes, simplifying the pathways for assembly on promoters. As you read about this process, consult Figure 26-9 and Table 26-2 to help keep track of the many participants.
[Figure 26-9. Transcription at RNA polymerase II promoters (figure description). (a) A promoter with a TATA box at −30 and an Inr element at +1. Assembly: TFIIA, TBP, TFIID, TFIIB, TFIIF, TFIIE, and TFIIH are added. Step 1: Pol II is recruited to the DNA by transcription factors, yielding the closed preinitiation complex. Step 2: The transcription bubble forms as the DNA at Inr unwinds, yielding the open initiation complex. Step 3: The CTD is phosphorylated during initiation, and the polymerase escapes the promoter. Step 4: Transcription elongation is aided by elongation factors after TFIIE and TFIIH dissociate. Step 5: Elongation factors dissociate; the CTD is dephosphorylated as transcription terminates, a process facilitated by termination factors, and the RNA is released, leaving the factors ready to begin the process again. (b) Structure showing TFIID, TFIIA, and TBP bound to promoter DNA, with the DNA bent at the TATA box (−30) and the Inr element at +1.]
Transcription protein | Number of different subunits | Subunit(s) Mᵣᵃ | Function(s) |
---|---|---|---|
Initiation | | | |
Pol II | 12 | 7,000–220,000 | Catalyzes RNA synthesis |
TBP (TATA-binding protein) | 1 | 38,000 | Specifically recognizes the TATA box |
TFIIA | 2 | 13,000, 42,000 | Stabilizes binding of TFIIB and TBP to the promoter |
TFIIB | 1 | 35,000 | Binds to TBP; recruits Pol II–TFIIF complex |
TFIIDᵇ | 13–14 | 14,000–213,000 | Required for initiation at promoters lacking a TATA box |
TFIIE | 2 | 33,000, 50,000 | Recruits TFIIH; has ATPase and helicase activities |
TFIIF | 2–3 | 29,000–58,000 | Binds tightly to Pol II; binds to TFIIB and prevents binding of Pol II to nonspecific DNA sequences |
TFIIH | 10 | 35,000–89,000 | Unwinds DNA at promoter (helicase activity); phosphorylates Pol II CTD; recruits nucleotide-excision repair proteins |
Elongationᶜ | | | |
ELLᵈ | 1 | 80,000 | |
pTEFb | 2 | 43,000, 124,000 | Phosphorylates Pol II CTD |
SII (TFIIS) | 1 | 38,000 | |
Elongin (SIII) | 3 | 15,000, 18,000, 110,000 | |

ᵃMᵣ reflects the subunits present in the complexes of human cells.
ᵇThe presence of multiple copies of some TFIID subunits brings the total subunit composition of the complex to 21–22.
ᶜThe function of all elongation factors is to suppress the pausing or arrest of transcription by the Pol II–TFIIF complex.
ᵈName derived from eleven-nineteen lysine-rich leukemia. The gene for ELL is the site of chromosomal recombination events frequently associated with acute myeloid leukemia.
The formation of a closed complex begins when the TATA-binding protein (TBP) binds to the TATA box (Fig. 26-9a, step 1). At promoters lacking a TATA box, TBP arrives as part of a multisubunit complex called TFIID. TBP is bound, in turn, by the transcription factor TFIIB. TFIIA then binds and, along with TFIIB, helps to stabilize the TBP–DNA complex. The TFIIB–TBP complex is next bound by another complex consisting of TFIIF and Pol II. TFIIF helps target Pol II to its promoters, both by interacting with TFIIB and by reducing the binding of the polymerase to nonspecific sites on the DNA. Finally, TFIIE and TFIIH bind to create the closed, preinitiation complex (PIC).
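The assembly order just described can be written out as a small ordered data structure. The following Python sketch is only an illustration: `ASSEMBLY_STEPS` and `assemble` are hypothetical names, and the model covers only the TATA-box pathway described in this paragraph, with no kinetics or regulation.

```python
# Ordered assembly of the Pol II preinitiation complex (PIC), following the text.
# Each entry: (components that bind at this step, role stated in the text).
ASSEMBLY_STEPS = [
    (("TBP",), "binds the TATA box"),
    (("TFIIB",), "binds TBP"),
    (("TFIIA",), "stabilizes the TBP-DNA complex together with TFIIB"),
    (("TFIIF", "Pol II"), "binds the TFIIB-TBP complex as a preformed complex"),
    (("TFIIE", "TFIIH"), "bind last, creating the closed complex (PIC)"),
]


def assemble(steps):
    """Return the tuple of bound components after each step, in order."""
    bound = []
    history = []
    for components, _role in steps:
        bound.extend(components)
        history.append(tuple(bound))
    return history


if __name__ == "__main__":
    for (components, role), state in zip(ASSEMBLY_STEPS, assemble(ASSEMBLY_STEPS)):
        print(f"+ {' + '.join(components)}: {role} ({len(state)} components bound)")
```

Keeping the steps in a list rather than a set preserves the binding order, which is the point of the paragraph: each factor binds to a complex formed in the previous step.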
A key function of TFIID in the PIC is to position TBP on the promoter, which in turn dictates the location of Pol II loading and transcription initiation. Because most human promoters (~80%) lack a TATA box, how TFIID correctly positions TBP and Pol II relative to the transcription start site was poorly understood until their structures were determined by cryo-EM (Fig. 26-9b). These structures showed that TFIID binds the promoter DNA in an elongated complex that is anchored by TBP–DNA interactions on one end and extends linearly over 70 base pairs. The Inr sequence is positioned roughly in the middle, straddled on both ends by TFIID subunits. TFIID thus acts as a scaffold to direct binding of Pol II and other PIC components and uses its structure and interactions with TBP to help define the transcription start site.
TFIIH has multiple subunits and includes a DNA helicase activity that promotes the unwinding of DNA near the RNA start site (a process requiring the hydrolysis of ATP), thereby creating an open initiation complex (Fig. 26-9a, step 2). Counting all the subunits of the various factors (including TFIIA and the subunits of TFIID), this active initiation complex can have more than 50 polypeptides.
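The "more than 50 polypeptides" figure can be checked against the subunit counts in the table of transcription proteins above. The tally below is a rough sketch under two assumptions that the text does not state explicitly: TFIID is counted at its full 21–22 copies (footnote b), and TBP is counted separately from TFIID, as in the table rows. If TBP is already included among the TFIID subunits, each total drops by one.

```python
# Subunit counts (low, high) for the initiation-phase proteins in the table.
# Assumption: TFIID uses the 21-22 total from footnote b, and TBP is
# tallied as its own row rather than as part of TFIID.
SUBUNITS = {
    "Pol II": (12, 12),
    "TBP": (1, 1),
    "TFIIA": (2, 2),
    "TFIIB": (1, 1),
    "TFIID": (21, 22),
    "TFIIE": (2, 2),
    "TFIIF": (2, 3),
    "TFIIH": (10, 10),
}


def total_polypeptides(subunits):
    """Sum the low and high ends of every subunit range."""
    low = sum(lo for lo, _hi in subunits.values())
    high = sum(hi for _lo, hi in subunits.values())
    return low, high


if __name__ == "__main__":
    low, high = total_polypeptides(SUBUNITS)
    print(f"Estimated polypeptides in the initiation complex: {low}-{high}")
```

Under these assumptions the range is 51 to 53, consistent with the statement in the text.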
TFIIH has an additional function during the initiation phase. A kinase activity in one of its subunits phosphorylates Pol II at many places in the CTD (Fig. 26-9a, step 3). Several other protein kinases, including CDK9 (cyclin-dependent kinase 9), which is part of the complex pTEFb (positive transcription elongation factor b), also phosphorylate the CTD, primarily on Ser residues of the CTD repeat sequence. CTD phosphorylation causes a conformational change in the overall complex, initiating transcription. During the subsequent elongation phase of transcription, the phosphorylation state of the CTD changes, affecting which RNA processing components are bound to the transcription complexes (Fig. 26-10).
The first step is labeled initiation. R N A polymerase Roman numeral 2 is shown as a rounded structure with a crescent-shaped base and an oval at the center to upper left. It has an opening in the center that is wider to the right and narrower to the upper left. A long strand extends at a slight diagonal upward from its upper left. Beginning at the polymerase, this strand is labeled Y 1, S 2, P 3, T 4, S 5 next to a red circle labeled P, P 6, S 7 next to a red circle labeled P. The strand becomes lighter and the sequence continues as Y 1, S 2, P 3, T 4, S 5, P 6, S 7. Accompanying text reads, multiple repeats of C T D tail sequence not shown. An arrow points right to a similar structure in which only the dark part of the C T D tail is shown and there is an additional red circle labeled P bonded to S 2. An arrow points right labeled elongation to a similar structure with Y 1, S 2, and T 4 all bonded to red circles labeled P. Another arrow points right to a structure labeled termination that is similar except that only S 2 and T 4 are bonded to red circles labeled P.
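The changing phosphorylation pattern of one CTD heptad repeat, as drawn in the figure, can be captured in a small lookup table. This is an illustrative sketch only: the names `HEPTAD`, `PHOSPHO_BY_STAGE`, and `annotate` are hypothetical, the stage label "intermediate" is invented here for the unlabeled second structure, and a real CTD carries many repeats whose states need not be identical.

```python
HEPTAD = "YSPTSPS"  # residues 1-7 of one CTD repeat (Y1 S2 P3 T4 S5 P6 S7)

# Phosphorylated positions (1-based) at each stage, as drawn in the figure.
PHOSPHO_BY_STAGE = {
    "initiation": {5, 7},
    "intermediate": {2, 5, 7},  # hypothetical label; unlabeled in the figure
    "elongation": {1, 2, 4},
    "termination": {2, 4},
}


def annotate(stage):
    """Return the heptad with (P) appended to each phosphorylated residue."""
    marks = PHOSPHO_BY_STAGE[stage]
    return "-".join(
        f"{residue}{pos}{'(P)' if pos in marks else ''}"
        for pos, residue in enumerate(HEPTAD, start=1)
    )


if __name__ == "__main__":
    for stage in PHOSPHO_BY_STAGE:
        print(f"{stage:>12}: {annotate(stage)}")
```

Printing the stages side by side makes the shift visible: Ser5 and Ser7 marks dominate at initiation, while Ser2 and Thr4 marks persist through termination.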
During synthesis of the initial 60 to 70 nucleotides of RNA, first TFIIE, then TFIIH is released, and Pol II enters the elongation phase of transcription (Fig. 26-9a, step 4).
TFIIF remains associated with Pol II throughout elongation. During this stage, polymerase activity is greatly enhanced by protein elongation factors (Table 26-2). The elongation factors, some bound to the phosphorylated CTD, suppress pausing during transcription and also coordinate interactions between the supramolecular complexes involved in the posttranscriptional processing of mRNAs. Once the RNA transcript is completed, transcription is terminated (Fig. 26-9a, step 5). The Pol II CTD is dephosphorylated and the transcription machinery is recycled, ready to initiate another transcript.
Regulation of transcription at Pol II promoters is an elaborate process. It involves the interaction of a wide variety of other proteins with the preinitiation complex. Some of these regulatory proteins interact with transcription factors, others with Pol II itself. The regulation of eukaryotic transcription is described in more detail in Chapter 28.
Both bacterial and eukaryotic RNA polymerases are the targets of a large number of chemical inhibitors. Some of these molecules inhibit transcription by both types of RNA polymerase; others selectively inhibit only certain types of polymerase.
The elongation of RNA strands by RNA polymerase in both bacteria and eukaryotes is inhibited by the antibiotic actinomycin D. The planar portion of this molecule inserts (intercalates) into the double-helical DNA between successive G≡C base pairs, deforming the DNA duplex. This prevents movement of the polymerase along the DNA during transcription. Because actinomycin D inhibits RNA elongation in intact cells as well as in cell extracts, it can be used to identify cell processes that depend on RNA synthesis.
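As a rough illustration of where such intercalation sites could occur, the sketch below scans one DNA strand for positions where two neighboring base pairs are both G≡C. The function name `gc_steps` is hypothetical, and the criterion is a deliberate simplification of the statement above: it treats every pair of adjacent G≡C base pairs as a candidate site and does not model the actual sequence preferences or binding affinity of actinomycin D.

```python
def gc_steps(seq):
    """Return 0-based indices i where bases i and i+1 are both G or C.

    Because G pairs with C, a G or C on one strand marks a G≡C base pair,
    so two adjacent G/C bases mark two successive G≡C base pairs.
    """
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - 1)
        if seq[i] in "GC" and seq[i + 1] in "GC"
    ]


if __name__ == "__main__":
    example = "ATGCGTA"  # arbitrary example sequence, not from the text
    print(gc_steps(example))
```

For the example sequence, the candidate steps fall between G3 and C4 and between C4 and G5 (indices 2 and 3).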
Rifampin (Fig. 26-11a) inhibits bacterial RNA synthesis by preventing the promoter clearance step of transcription. Rifampin is an important antibiotic for the treatment of tuberculosis (TB), which is caused by the bacterium Mycobacterium tuberculosis and kills approximately 1.8 million people each year. The antibiotic binds near the active site of RNA polymerase and prevents extension of the RNA product beyond 2 to 3 nucleotides. Unfortunately, M. tuberculosis can develop resistance to rifampin; more than 600,000 cases of rifampin-resistant TB are reported each year. In many cases, resistance is due to mutation in the rifampin binding site (Fig. 26-11b), particularly at three residues of the β subunit. New drugs that inhibit M. tuberculosis RNA polymerase are desperately needed for treatment of drug-resistant TB.
Part a shows the structure of rifampin. It has a benzene ring that shares its right side with the left side of an adjacent benzene ring. The left-hand ring has a top vertex bonded to O H, an upper left vertex bonded to C H 3, and a lower left side shared with the upper right side of a five-membered ring to the lower left. This five-membered ring has O substituted for C at its upper left vertex, a lower right vertex double bonded to O, and a lower left vertex hashed wedge bonded to C H 3 and solid wedge bonded to O. This O begins a ring of partial ring structures that loops back around to the right-hand benzene ring of the original pair. The O is bonded to C H double bonded to C H bonded to C at the left vertex of a partial ring shape. This C is hashed wedge bonded to O further bonded to C H 3 and bonded to C to the lower right hashed wedge bonded to C H 3 and bonded to C H that is solid wedge bonded to O further bonded to C double bonded to O and bonded to C H 3 and bonded to C to the upper right that is solid wedge bonded to C H 3 and bonded to C at the upper left corner that is hashed wedge bonded to O H and bonded to C at the top vertex of the left-hand partial ring. This C is hashed wedge bonded to C H 3 above and bonded to C to the lower right hashed wedge bonded to O H below and bonded to C at the top vertex of the middle partial ring that is solid wedge bonded to C H 3 above and bonded to C H to the lower right. This C H is double bonded to C H bonded to C H double bonded to C bonded to C H 3 and bonded to C to the lower right double bonded to O and bonded to N H below. This N H is bonded to the upper right vertex of the right-hand benzene ring of the benzene ring pair, which has top and bottom vertices bonded to O H and a lower right vertex bonded to C H double bonded to N bonded to N to the lower right substituted for C at the upper left vertex of a six-membered ring with N substituted for C at its lower right vertex and further bonded to C H 3.
Part b shows highlighted structures surrounded by gray helices and beta sheets. Blue D N A is shown entering from the right and primarily visible in the center. Green R N A is visible to the lower left of the D N A and connected to a green strand below that ends at M g 2 plus at the left end. Rifampin is shown as an oblong structure with many gray spheres with two blue spheres visible behind and many red spheres toward the right side and top side. Wireframe structures that are mostly yellow with some blue and red surround rifampin and are labeled, polymerase amino acids interacting with rifampin.
The death cap mushroom Amanita phalloides has a very effective defense mechanism against predators. It produces α-amanitin, which disrupts transcription in animal cells by blocking Pol II and, at higher concentrations, Pol III. Neither Pol I nor bacterial RNA polymerase is sensitive to α-amanitin, nor is the RNA polymerase II of A. phalloides itself. Because α-amanitin selectively inhibits only certain RNA polymerases, it has proven useful for identifying the functions of different polymerases in the cell. Mitochondrial and bacterial RNA polymerases share significant similarities, including α-amanitin resistance. By exposing eukaryotic cells to α-amanitin, it is possible to detect newly synthesized mRNAs that arise only from mitochondrial and not nuclear transcription. Researchers using α-amanitin need to exercise abundant caution because it is highly toxic to humans. An amount of α-amanitin the size of a grain of rice contains a lethal dose.
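The selectivity of α-amanitin described above amounts to a lookup table, which is how it is used experimentally to assign transcripts to polymerases. The sketch below is a qualitative illustration only: the names `Sensitivity`, `ALPHA_AMANITIN`, and `active_polymerases` are hypothetical, and "high dose" is a stand-in for the "higher concentrations" mentioned in the text, with no actual concentration values implied.

```python
from enum import Enum


class Sensitivity(Enum):
    RESISTANT = "resistant"
    HIGH_DOSE_ONLY = "inhibited only at higher concentrations"
    SENSITIVE = "sensitive"


# Qualitative alpha-amanitin sensitivity of each polymerase, as stated in the text.
ALPHA_AMANITIN = {
    "Pol I": Sensitivity.RESISTANT,
    "Pol II": Sensitivity.SENSITIVE,
    "Pol III": Sensitivity.HIGH_DOSE_ONLY,
    "bacterial RNA polymerase": Sensitivity.RESISTANT,
    "mitochondrial RNA polymerase": Sensitivity.RESISTANT,
}


def active_polymerases(present, high_dose=False):
    """Return the polymerases in `present` expected to stay active in alpha-amanitin."""
    blocked = {Sensitivity.SENSITIVE}
    if high_dose:
        blocked.add(Sensitivity.HIGH_DOSE_ONLY)
    return [p for p in present if ALPHA_AMANITIN[p] not in blocked]


if __name__ == "__main__":
    cell = ["Pol I", "Pol II", "Pol III", "mitochondrial RNA polymerase"]
    print("low dose: ", active_polymerases(cell))
    print("high dose:", active_polymerases(cell, high_dose=True))
```

In this model, Pol II is blocked at any dose, so mRNA synthesized in the presence of the toxin can be attributed to mitochondrial rather than nuclear transcription, matching the experimental use described in the text.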