8.2 Nucleic Acid Structure

DNA Is a Double Helix That Stores Genetic Information

DNA was first isolated and characterized by Friedrich Miescher in 1869. He called the phosphorus-containing substance “nuclein.” Not until the 1940s, with the work of Oswald T. Avery, Colin MacLeod, and Maclyn McCarty, was there any compelling evidence that DNA was the genetic material. Avery and his colleagues found that an extract of a virulent strain of the bacterium Streptococcus pneumoniae (causing disease in mice) could be used to transform a nonvirulent strain of the same bacterium into a virulent strain. They were able to demonstrate through various chemical tests that it was DNA from the virulent strain (not protein, polysaccharide, or RNA, for example) that carried the genetic information for virulence. Then in 1952, experiments by Alfred D. Hershey and Martha Chase — in which they studied the infection of bacterial cells by a virus (bacteriophage) with radioactively labeled DNA or protein — removed any remaining doubt that DNA, not protein, carried the genetic information.

Another important clue to the structure of DNA came from the work of Erwin Chargaff and his colleagues in the late 1940s. Examining dozens of species, they found that the four nucleotide bases of DNA occur in different ratios in the DNAs of different organisms. However, the base composition remains constant in different tissues of the same species, and does not vary with age, environment, nutritional state, or generation. Furthermore, regardless of the species, the number of adenosine residues is equal to the number of thymidine residues (that is, A = T), and the number of guanosine residues is equal to the number of cytidine residues (G = C). From these relationships it follows that the sum of the purine residues equals the sum of the pyrimidine residues; that is, A + G = T + C. These quantitative relationships, sometimes called “Chargaff’s rules,” were a key to establishing the three-dimensional structure of DNA.

To shed more light on the structure of DNA, in the early 1950s Rosalind Franklin and Maurice Wilkins used the powerful method of x-ray diffraction (see Fig. 4-30) to analyze DNA fibers. Although lacking the molecular definition of diffraction from crystals, the x-ray diffraction pattern generated from the fibers was informative (Fig. 8-12). The pattern revealed that DNA molecules are helical, with two periodicities along their long axis: a primary one of 3.4 Å and a secondary one of 34 Å. The problem then was to formulate a three-dimensional model of the DNA molecule that could account not only for the x-ray diffraction data but also for the specific A = T and G = C base equivalences discovered by Chargaff and for the other chemical properties of DNA.

An x-ray diffraction pattern has a roughly circular pale area with darker semicircles to the left and right, an open circle in the center, and an X extending from the center with each side of the X made up of two bands extending out from the center, a space, and then a wider band that runs into the dark outer semicircles. — FIGURE 8-12 X-ray diffraction pattern of DNA fibers. The spots forming a cross in the center denote a helical structure. The heavy bands at the left and the right arise from the recurring bases.

A photograph of Rosalind Franklin, 1920–1958. — Rosalind Franklin, 1920–1958

A photograph of Maurice Wilkins, 1916–2004. — Maurice Wilkins, 1916–2004

James Watson and Francis Crick relied on this accumulated information about DNA to set about deducing its structure. In 1953 they postulated a three-dimensional model of DNA structure that accounted for all the available data. It consists of two helical DNA chains wound around the same axis to form a right-handed double helix. (See Box 4-1 for an explanation of the right- or left-handed sense of a helical structure.) The hydrophilic backbones of alternating deoxyribose and phosphate groups are on the outside of the double helix, facing the surrounding water. The furanose ring of each deoxyribose is in the C-2′ endo conformation. The purine and pyrimidine bases of both strands are stacked inside the double helix, with their hydrophobic and nearly planar ring structures very close together and perpendicular to the long axis. The offset pairing of the two strands creates a major groove and a minor groove on the surface of the duplex (Fig. 8-13). Each nucleotide base of one strand is paired in the same plane with a base of the other strand. Watson and Crick found that the hydrogen-bonded base pairs illustrated in Figure 8-11, G with C and A with T, are those that fit best within the structure, providing a rationale for Chargaff’s rules that in any DNA, G = C and A = T. It is important to note that three hydrogen bonds can form between G and C, symbolized G≡C, but only two can form between A and T, symbolized A═T. Pairings of bases other than G with C and A with T tend (to varying degrees) to destabilize the double-helical structure.

A three-part figure, a, b, and c, shows the structure of D N A. — FIGURE 8-13 Watson-Crick model for the structure of DNA. The original model proposed by Watson and Crick had 10 bp, or 34 Å (3.4 nm), per turn of the helix; subsequent measurements revealed 10.5 bp, or 36 Å (3.6 nm), per turn. (a) Schematic representation, showing dimensions of the helix. (b) Stick representation showing the backbone and stacking of the bases. (c) Space-filling model.

Part a shows a vertical double helix with ribbon sides and bars extending into the center from the sides. Pairs of bars from each side are joined together by sets of three vertical lines. As the strands twist, the relatively narrow space between two adjacent strands is labeled minor groove. Across from this, the distance between two sets of bases is labeled as 3.4 Angstroms. Immediately above the minor groove, there is a wider space between the bottom and top ribbon labeled major groove. The distance of one complete turn is shown to be 36 Angstroms, and the width of the D N A is shown to be 20 Angstroms. Part b shows the same structure as a stick representation with mostly blue rings on the side representing sugars, orange dots on the chains connecting the sugars and yellow bases extending into the center. Each horizontal pair of bases has one with a double ring and one with a single ring. Part c shows the same structure as a space filling model. The center is yellow and the strands are mostly blue with some orange. The difference between the minor groove and major groove is more visible as there are alternating narrow and wide strips of yellow visible.

When Watson and Crick constructed their model, they had to decide at the outset whether the strands of DNA should be parallel or antiparallel, that is, whether their 3′,5′-phosphodiester bonds should run in the same or opposite directions. An antiparallel orientation produced the most convincing model, and later work with DNA polymerases (Chapter 25) provided experimental evidence that the strands are indeed antiparallel, a finding ultimately confirmed by x-ray analysis.

To account for the periodicities observed in the x-ray diffraction patterns of DNA fibers, Watson and Crick manipulated molecular models to arrive at a structure in which the vertically stacked bases inside the double helix would be 3.4 Å apart; the secondary repeat distance of about 34 Å was accounted for by the presence of 10 base pairs (bp) in each complete turn of the double helix. The structure in aqueous solution differs slightly from that in fibers, having 10.5 bp per helical turn (Fig. 8-13).
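The arithmetic behind these repeat distances can be checked with a short Python sketch. The function name `helix_parameters` and its arguments are illustrative and do not come from the text; the inputs are the 3.4 Å rise per base pair and the 10 or 10.5 bp per turn quoted above.

```python
def helix_parameters(bp_per_turn, rise_per_bp=3.4):
    """Return (pitch in angstroms, twist in degrees per base pair)."""
    pitch = bp_per_turn * rise_per_bp   # length of one complete helical turn
    twist = 360.0 / bp_per_turn         # rotation between successive base pairs
    return pitch, twist


for label, bp in (("fiber model", 10.0), ("aqueous solution", 10.5)):
    pitch, twist = helix_parameters(bp)
    print(f"{label}: {pitch:.1f} A per turn, {twist:.1f} degrees per bp")
```

With 10.5 bp per turn the computed pitch is 35.7 Å, which is consistent with the rounded value of 36 Å given in Figure 8-13.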

As Figure 8-14 shows, the two antiparallel polynucleotide chains of double-helical DNA are not identical in either base sequence or composition. Instead they are complementary to each other. Wherever adenine occurs in one chain, thymine is found in the other; similarly, wherever guanine occurs in one chain, cytosine is found in the other.

A figure shows complementary base pairing in a vertical double helix of D N A. — FIGURE 8-14 Complementarity of strands in the DNA double helix. The complementary antiparallel strands of DNA follow the pairing rules proposed by Watson and Crick. The base-paired antiparallel strands differ in base composition: the left strand has the composition $A_{3} T_{2} G_{1} C_{3}$; the right strand has $A_{2} T_{3} G_{3} C_{1}$. They also differ in sequence when each chain is read in the 5′→3′ direction. Note the base equivalences: A = T and G = C in the duplex.


The left strand of D N A runs from 5 prime at the top to 3 prime at the bottom. It begins with a circle connected to a five-membered ring that is connected to another circle and so on, with a total of 9 nucleotide units. Each five-membered ring is connected by its right side vertex to a single hexagonal ring representing a pyrimidine or to a five-membered ring that is joined with six-membered ring representing a purine. From top to bottom, the sequence of purines and pyrimidines is C, A, A, T, C, G, T, C, and A. The right-hand strand has its 3 prime end on top and its 5 prime end on the bottom. It has a similar structure to the first strand but flipped. From top to bottom, the sequence of purines and pyrimidines is G, T, T, A, G, C, A, G, and T. Sets of three blue vertical lines are shown connecting the purines and pyrimidines. Each C has three sets of vertical lines connecting it with a G and each A has two sets of vertical lines connecting it with a T.
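The complementarity shown in Figure 8-14 can be reproduced with a minimal Python sketch. The helper name `reverse_complement` is an illustrative choice; the only data used is the left-strand sequence from the figure, read 5′→3′.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the antiparallel partner strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


left = "CAATCGTCA"                 # left strand of Fig. 8-14, read 5'->3'
right = reverse_complement(left)   # right strand, read 5'->3'

left_counts = Counter(left)        # composition of the left strand
right_counts = Counter(right)      # composition of the right strand
duplex = left_counts + right_counts

print(right)
print(dict(left_counts), dict(right_counts))
print("A = T:", duplex["A"] == duplex["T"], " G = C:", duplex["G"] == duplex["C"])
```

The two strands differ in composition (A3 T2 G1 C3 versus A2 T3 G3 C1), yet the counts for the duplex as a whole satisfy A = T and G = C.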

The DNA double helix, or duplex, is held together by hydrogen bonding between complementary base pairs (Fig. 8-11) and by base-stacking interactions. The complementarity between the DNA strands is attributable to the hydrogen bonding between base pairs; however, the hydrogen bonds do not contribute significantly to the stability of the structure. The double helix is primarily stabilized by metal cations, which shield the negative charges of backbone phosphates, and by base-stacking interactions between successive base pairs. Base-stacking interactions between successive G≡C or C≡G pairs are stronger than those between successive A═T and T═A pairs or adjacent pairs including all four bases. Because of this, DNA duplexes with higher G≡C content are more stable.
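Because stability tracks G≡C content, a common first step is simply to compute the fraction of G and C residues in a strand. The sketch below is a minimal illustration; `gc_fraction` is a hypothetical helper name, and the value is only a crude proxy, since the text attributes the effect to base stacking between neighboring pairs and not to composition alone.

```python
def gc_fraction(strand):
    """Fraction of residues in a strand that are G or C (0.0 to 1.0)."""
    strand = strand.upper()
    if not strand:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in strand) / len(strand)


# Left strand of Fig. 8-14: 1 G + 3 C out of 9 residues.
print(round(gc_fraction("CAATCGTCA"), 3))
```

For a fully base-paired duplex, the G + C fraction of either strand equals the fraction of G≡C pairs in the duplex, because every G in one strand faces a C in the other.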

The important features of the double-helical model of DNA structure are now supported by much chemical and biological evidence. Moreover, the model immediately suggested a mechanism for the transmission of genetic information. The essential feature of the model was the complementarity of the two DNA strands. As Watson and Crick were able to see, well before confirmatory data became available, this structure could logically be replicated by (1) separating the two strands and (2) synthesizing a complementary strand for each. Because nucleotides in each new strand are joined in a sequence specified by the base-pairing rules stated above, each preexisting strand functions as a template to guide the synthesis of one complementary strand (Fig. 8-15). These expectations were experimentally confirmed, inaugurating a revolution in our understanding of biological inheritance.
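The two-step logic of separating the strands and copying each one can be sketched in Python. This is a toy model under stated assumptions: a duplex is represented as a pair of strings, each written 5′→3′, and the function names `synthesize_complement` and `replicate` are illustrative. It says nothing about the enzymes or mechanism of replication.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_complement(template):
    """Build the strand that pairs with `template`; both are written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex):
    """Step 1: separate the parent strands. Step 2: copy each one.

    Each daughter duplex keeps one parent strand and gains one new strand.
    """
    parent_a, parent_b = duplex
    daughter_1 = (parent_a, synthesize_complement(parent_a))
    daughter_2 = (synthesize_complement(parent_b), parent_b)
    return daughter_1, daughter_2


parent = ("CAATCGTCA", "TGACGATTG")   # the duplex of Fig. 8-14
daughter_1, daughter_2 = replicate(parent)
print(daughter_1 == parent, daughter_2 == parent)
```

Both daughter duplexes come out identical to the parent, which is the property that made the complementary structure so suggestive as a copying mechanism.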

A figure shows D N A replication. — FIGURE 8-15 Replication of DNA as suggested by Watson and Crick. The preexisting or “parent” strands become separated, and each is the template for biosynthesis of a complementary “daughter” strand (in pink).

DNA Can Occur in Different Three-Dimensional Forms

DNA is a remarkably flexible molecule. Considerable rotation is possible around several types of bonds in the sugar–phosphate (phosphodeoxyribose) backbone, and thermal fluctuation can produce bending, stretching, and unpairing (melting) of the strands. Many significant deviations from the Watson-Crick DNA structure are found in cellular DNA, some or all of which may be important in DNA metabolism. These structural variations generally do not affect the key properties of DNA defined by Watson and Crick: strand complementarity, antiparallel strands, and the requirement for A═T and G≡C base pairs.

Structural variation in DNA reflects three things: the different possible conformations of the deoxyribose, rotation about the contiguous bonds that make up the phosphodeoxyribose backbone (Fig. 8-16a), and free rotation about the C-1′–N-glycosyl bond (Fig. 8-16b). Because of steric constraints, purines in purine nucleotides are restricted to two stable conformations with respect to deoxyribose, called syn and anti (Fig. 8-16b). Pyrimidines are generally restricted to the anti conformation because of steric interference between the sugar and the carbonyl oxygen at C-2 of the pyrimidine.

A two-part figure, a and b, shows structural variation in D N A. — FIGURE 8-16 Structural variation in DNA. (a) The conformation of a nucleotide in DNA is affected by rotation about seven different bonds. Six of the bonds rotate freely. The limited rotation about bond 4 gives rise to ring pucker. This conformation is endo or exo, depending on whether the atom is displaced to the same side of the plane as C-5′ or to the opposite side (see Fig. 8-3b). (b) For purine bases in nucleotides, only two conformations with respect to the attached ribose units are sterically permitted: anti or syn. Pyrimidines occur in the anti conformation.


Part a shows a ball and stick model with the 5 prime end at the top and the 3 prime end at the bottom. Some of the rods between atoms have numbered arrows over them showing rotation. There is a large yellow atom at the top with four rods extending out from it in a tetrahedron with red atoms at the end of each rod. The rod extending downward and forward has a counterclockwise arrow labeled 1 over the top. The red atom beneath arrow 1 has an almost horizontal rod that extends to the front right to a gray atom. This rod has a clockwise arrow labeled 2 over it. The gray atom has a rod pointing down and slightly to the left that has a clockwise arrow labeled 3 over it and that extends to a gray atom at the top right vertex of a five-membered ring. This atom has a rod extending to the left and slightly upward to a red atom at the top vertex and a rod that extends down to a gray atom at the bottom right vertex and that has a dotted counterclockwise arrow labeled 4 around it. Two rods extend from the gray atom, one down and to the right and the other to the left and slightly down. The first rod has a clockwise arrow labeled 5 and extends to a red atom from which a rod with a clockwise arrow labeled 6 extends down to a large yellow atom with three evenly spaced rods extending from its sides to red atoms. The rod that extends from the gray atom at the lower right vertex of the ring to the left reaches a gray atom from which a rod extends up and forward to another gray ring atom, from which one rod extends up to the red atom at the top vertex to complete the ring and the other rod extends to the left with a counterclockwise arrow labeled 7 to a box labeled base. Part b shows conformations of purine and pyrimidine bases. S y n-Adenosine has a five-membered sugar ring with O at the top vertex between C 4 prime and C 1 prime. 
C 1 prime is bonded to N 9 of adenine above the ring and H below the ring, C 2 prime is bonded to H above the ring and H below the ring, C 3 prime is bonded to H above the ring and O H below the ring, C 4 prime is bonded to C 5 prime above the ring and H below the ring, and C 5 prime is bonded to 2 H and O H. Adenine has a six-membered ring on the left bonded to a five-membered ring on the right. N is in position 1 and position 3, C 6 is bonded to N H 2, there are double bonds between positions 2 and 3 and positions 1 and 6, and there is a double bond between positions 3 and 4 that is shared with the five-membered ring. The five-membered ring has N at position 7, a double bond between positions 7 and 8, and N at position 9 that is bonded to C 1 prime of the sugar below by a long bond with a counterclockwise arrow around it. Anti-adenosine has a similar structure to s y n-adenosine except that adenine has rotated so that its five-membered ring of adenine is on the left and its six-membered ring is on the right. Anti-cytidine has a similar five-membered sugar ring to anti-adenosine, but C 1 prime is bonded to N 1 of cytosine above the ring by a long bond with a counterclockwise arrow around it, and cytosine has a single ring with N 1 bonded to C 1 prime of the sugar, C 2 at the lower right vertex double bonded to O, N at position 3 at the upper right vertex, C 4 bonded to N H 2 at the top vertex, and double bonds between position 3 and position 4 and between position 5 and position 6.

The Watson-Crick structure is also referred to as B-form DNA, or B-DNA. The B form is the most stable structure for a random-sequence DNA molecule under physiological conditions and is therefore the standard point of reference in any study of the properties of DNA. Two structural variants that have been well characterized in crystal structures are the A and Z forms. These three DNA conformations are shown in Figure 8-17, with a summary of their properties. The A form is favored in many solutions that are relatively devoid of water. The DNA is still arranged in a right-handed double helix, but the helix is wider and the number of base pairs per helical turn is 11, rather than 10.5 as in B-DNA. The plane of the base pairs in A-DNA is tilted about 20° relative to B-DNA base pairs, so the base pairs in A-DNA are not perfectly perpendicular to the helix axis. These structural changes deepen the major groove while making the minor groove shallower. The reagents used to promote crystallization of DNA tend to dehydrate it, and thus most short DNA molecules tend to crystallize in the A form.

A figure and table compare the A, B, and Z forms of D N A. — FIGURE 8-17 Comparison of A, B, and Z forms of DNA. Each structure shown here has 36 bp. The riboses and bases are shown in yellow. The phosphodiester backbone is represented as a blue rope. Blue is the color used to represent DNA strands in later chapters. The table summarizes some properties of the three forms of DNA.

Each form of D N A is shown as a vertical double helix and in cross section. The A form has an even double helix with the nitrogenous bases relatively closely packed together and the turns closer together than in the B form. Some nitrogenous bases are visible outside of the parent strands where the strands curve inward at the major grooves. The cross section shows an even circle of backbone with an even circle of nitrogenous bases inside, forming a double-ring structure around an open center. The B form has a double helix with evenly spaced turns, fewer nitrogenous bases visible inside, and no nitrogenous bases significantly visible outside of the parent strands. The cross section shows an even circle of backbone with a filled circle in the center made up of nitrogenous bases with a small opening in the very center and spokes extending out evenly from the central circle to the outer backbone. The Z form has an irregularly shaped backbone with many step like bends and bumps in the outer strands and with some nitrogenous bases outside of the outer strands. The cross section shows a circle of backbone with bends that give it a roughly hexagonal shape and with some nitrogenous bases overlapping and wrapped around it. There is a six-pointed-star-shaped yellow region of nitrogenous bases in the center with a small opening in the very center and edges that touch the backbone. To the right of these structures, the table has three columns and seven rows. 
Row 1, helical sense: A form right-handed; B form right-handed; Z form left-handed.
Row 2, diameter: A form approximately 26 Angstroms; B form approximately 20 Angstroms; Z form approximately 18 Angstroms.
Row 3, base pairs per helical turn: A form 11; B form 10.5; Z form 12.
Row 4, helix rise per base pair: A form 2.6 Angstroms; B form 3.4 Angstroms; Z form 3.7 Angstroms.
Row 5, base tilt normal to the helix axis: A form 20 degrees; B form 6 degrees; Z form 7 degrees.
Row 6, sugar pucker conformation: A form C-3′ endo; B form C-2′ endo; Z form C-2′ endo for pyrimidines, C-3′ endo for purines.
Row 7, glycosyl bond conformation: A form anti; B form anti; Z form anti for pyrimidines, syn for purines.
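The tabulated values make it easy to compare the overall shape of the three forms. The sketch below is a rough illustration: the dictionary layout and function names are assumptions, and the lengths are simple products of base-pair count and rise per base pair, applied to the 36 bp structures described in Figure 8-17.

```python
# form: (helical sense, base pairs per turn, rise per base pair in angstroms)
DNA_FORMS = {
    "A": ("right-handed", 11.0, 2.6),
    "B": ("right-handed", 10.5, 3.4),
    "Z": ("left-handed", 12.0, 3.7),
}


def helix_length(form, n_bp):
    """Approximate axial length in angstroms of n_bp base pairs."""
    _, _, rise = DNA_FORMS[form]
    return n_bp * rise


def helical_turns(form, n_bp):
    """Number of complete or partial helical turns spanned by n_bp base pairs."""
    _, bp_per_turn, _ = DNA_FORMS[form]
    return n_bp / bp_per_turn


for form, (sense, _, _) in DNA_FORMS.items():
    length = helix_length(form, 36)
    turns = helical_turns(form, 36)
    print(f"{form}-DNA ({sense}): {length:.1f} A, {turns:.2f} turns")
```

For the same 36 bp, the A form comes out shortest and the Z form longest, in line with the description of A-DNA as wider and more compact and Z-DNA as more slender and elongated.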

Z-form DNA is a more radical departure from the B structure; the most obvious distinction is the left-handed helical rotation. There are 12 bp per helical turn, and the structure appears more slender and elongated. The DNA backbone takes on a zigzag appearance. Certain nucleotide sequences fold into left-handed Z helices much more readily than others. Prominent examples are sequences in which pyrimidines alternate with purines, especially alternating C and G (that is, in the helix, alternating C≡G and G≡C pairs) or 5-methyl-C and G residues. To form the left-handed helix in Z-DNA, the purine residues flip to the syn conformation, alternating with pyrimidines in the anti conformation. The major groove is barely apparent in Z-DNA, and the minor groove is narrow and deep.
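The sequence pattern described here, strict alternation of pyrimidines and purines, is simple to test for. The following sketch is only a sequence-level check with a hypothetical function name; it does not predict whether a given stretch actually adopts the Z form, which depends on conditions not modeled here.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def alternates_pyrimidine_purine(strand):
    """True if every pair of neighboring bases switches between purine and pyrimidine."""
    strand = strand.upper()
    if len(strand) < 2 or not set(strand) <= PURINES | PYRIMIDINES:
        return False
    return all((a in PURINES) != (b in PURINES)
               for a, b in zip(strand, strand[1:]))


print(alternates_pyrimidine_purine("CGCGCGCG"))   # alternating C and G
print(alternates_pyrimidine_purine("CAATCGTCA"))  # contains two adjacent purines (AA)
```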

Whether A-DNA occurs in cells is uncertain, but there is evidence for some short stretches (tracts) of Z-DNA in both bacteria and eukaryotes. These Z-DNA tracts may play a role (as yet undefined) in regulating the expression of some genes or in genetic recombination.

Certain DNA Sequences Adopt Unusual Structures

Other sequence-dependent structural variations found in larger chromosomes may affect the function and metabolism of the DNA segments in their immediate vicinity. For example, bends occur in the DNA helix wherever four or more adenosine residues appear sequentially in one strand. Six adenosines in a row produce a bend of about 18°. The bending observed with this and other sequences may be important in the binding of some proteins to DNA.
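Runs of four or more adenosines can be located in a strand with a regular expression. In this minimal sketch the function name `find_a_tracts` and the example sequence are made up for illustration; the threshold of four comes from the text, and the code only finds the runs without estimating any bend angle.

```python
import re


def find_a_tracts(strand, min_length=4):
    """Return (start, length) for each run of at least min_length adenines.

    Start positions are 0-based offsets into the strand.
    """
    pattern = re.compile(r"A{%d,}" % min_length)
    return [(m.start(), len(m.group())) for m in pattern.finditer(strand.upper())]


# Hypothetical sequence: one run of six A's and one run of three A's.
print(find_a_tracts("GCAAAAAAGCTAAAGC"))
```

Only the six-residue run is reported, because the three-residue run falls below the threshold.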

A common type of DNA sequence is a palindrome. A palindrome is a word, phrase, or sentence that is spelled identically when read either forward or backward; two examples are ROTATOR and NURSES RUN. In DNA, the term is applied to regions of DNA with inverted repeats, such that an inverted, self-complementary sequence in one strand is repeated in the opposite orientation in the paired strand, as in Figure 8-18. The self-complementarity within each strand confers the potential to form hairpin or cruciform (cross-shaped) structures (Fig. 8-19). When the inverted repeat occurs within each individual strand of the DNA, the sequence is called a mirror repeat. Mirror repeats do not have complementary sequences within the same strand and thus cannot form hairpin or cruciform structures. Sequences of these types are found in almost every large DNA molecule and can encompass a few base pairs or thousands. The extent to which palindromes occur as cruciforms in cells is not known, although some cruciform structures have been demonstrated in vivo in Escherichia coli. Self-complementary sequences cause isolated single strands of DNA (or RNA) in solution to fold into complex structures containing multiple hairpins.

A figure shows palindromes and mirror repeats. — FIGURE 8-18 Palindromes and mirror repeats. Palindromes are sequences of double-stranded nucleic acids with twofold symmetry. To superimpose one repeat (shaded sequence) on the other, it must be rotated 180° about the horizontal axis and then 180° about the vertical axis, as shown by the colored arrows. A mirror repeat, on the other hand, has a symmetric sequence within each strand. Superimposing one repeat on the other requires only a single 180° rotation about the vertical axis.


The figure shows two horizontal strands of double-stranded D N A. The top strand of double-stranded D N A is labeled palindrome above a vertical arrow that curves away from the observer at the top, runs down along the back, and curves forward toward the observer at the front, where it points upward. A horizontal second arrow curves from the front right, back along the front of the downward arrow, and forward toward the viewer on the left before pointing right toward the center. The top strand begins with its 5 prime end and the bottom strand begins with its 3 prime end. Beneath each letter representing a nitrogenous base, there is a vertical line extending down from the top strand to meet a vertical line extending up from the bottom strand. The top strand has a highlighted sequence of 5 prime T T A G C A C, then a non-highlighted sequence of G T G C T A A. The bottom strand has a nonhighlighted sequence of 3 prime A A T C G T G followed by a highlighted sequence of C A C G A T T. The bottom strand of double-stranded D N A has a horizontal curved arrow like the one for the top molecule, curving from the front right, around the back, and forward to the left to point into the center. However, it lacks a vertical arrow. This is labeled mirror repeat. The top strand is highlighted 5 prime T T A G C A C, small space, C A C G A T T. The bottom strand is 3 prime A A T C G T G G T G C T A A.
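The distinction between the two kinds of repeat can be stated as two string tests, sketched below in Python with the top-strand sequences of Figure 8-18. The function names are illustrative, and the checks assume a perfect repeat spanning the whole sequence with no spacer between the two halves.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the paired strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def is_palindrome(strand):
    """Inverted repeat: the strand is identical to its own reverse complement."""
    return strand == reverse_complement(strand)


def is_mirror_repeat(strand):
    """Mirror repeat: the strand reads the same in both directions."""
    return strand == strand[::-1]


palindrome = "TTAGCACGTGCTAA"   # top strand of the palindrome in Fig. 8-18
mirror = "TTAGCACCACGATT"       # top strand of the mirror repeat in Fig. 8-18

print(is_palindrome(palindrome), is_mirror_repeat(palindrome))
print(is_palindrome(mirror), is_mirror_repeat(mirror))
```

The palindrome passes only the first test and the mirror repeat only the second, which matches the statement that mirror repeats lack self-complementarity within a strand and so cannot form hairpins or cruciforms.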

A two-part figure, a and b, shows hairpins and cruciform. — FIGURE 8-19 Hairpins and cruciforms. Palindromic DNA (or RNA) sequences can form alternative structures with intrastrand base pairing. (a) Hairpin structures involve a single DNA or RNA strand. (b) Cruciform structures involve both strands of a duplex DNA. Blue shading highlights asymmetric sequences that can pair with the complementary sequence either in the same strand or in the complementary strand.

Part a shows a strand of D N A with a double-headed arrow backbone. Bases are shown across the top with short vertical lines extending down beneath them. There are three vertical lines, then a highlighted region with the following sequence of bases, each above a vertical line: T G C G A T. There is an unhighlighted region of the following bases, each above a vertical line: A C T C. Another highlighted region has A T C G C A, each above a vertical line. Three more dark vertical lines extending down are shown. An arrow points down to a structure with horizontal lines to the left and right around a central loop. The backbone of this structure is the double-headed arrow from the D N A strand above. The loop is labeled hairpin. The 5 prime end is on the left, then there is a horizontal segment with three vertical lines extending down, then the line bends upward and the highlighted region of T G C G A T extends vertically upward with horizontal lines extending from each letter toward the center of the loop. Above this is a small circular loop with A at the bottom left, C at the top left, T at the top right, and C at the bottom right, all with lines extending inward, then the highlighted region of A T C G C A extends vertically downward with horizontal lines extending from each letter to meet the lines from the other side in the center of the loop. Beneath the right-hand side of the loop, the backbone bends horizontally, three vertical lines extend down, and it ends at the 3 prime end. Part b shows two structures. The top one has two parallel double-headed arrows with bases arranged along them. The top arrow is identical to the single strand shown at the top of part a, and the bottom arrow is inverted so the bases are underneath and the vertical lines point upward to indicate bonds between complementary bases of the two strands. Below this is a structure labeled cruciform. 
There are two parallel double-headed arrows that each have hairpin loops in the center, one above and one below, forming a cross shape. The top strand has its 5 prime end on the left and its 3 prime end on the right. The bottom strand has its 3 prime end on the left and its 5 prime end on the right. The top strand is identical to the hairpin shown in part a, and the bottom strand is the same but mirrored both horizontally and vertically. The unlabeled bases at the end of each strand are connected to the other strand.
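The intrastrand pairing in the hairpin of Figure 8-19a can be traced with a small Python sketch that pairs bases from the two ends of the strand inward. The function name and the `min_loop` parameter (a minimum number of unpaired loop bases) are assumptions made for illustration, and the routine handles only a single perfect stem, not the multiple hairpins that longer single strands can form.

```python
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def hairpin_stem(strand, min_loop=3):
    """Pair the ends of a strand inward and return (stem length, loop sequence).

    Pairing stops at the first mismatch or when fewer than min_loop
    unpaired bases would remain in the loop.
    """
    i, j = 0, len(strand) - 1
    stem = 0
    while j - i - 1 >= min_loop and (strand[i], strand[j]) in PAIRS:
        stem += 1
        i += 1
        j -= 1
    return stem, strand[i:j + 1]


# Highlighted and loop bases of Fig. 8-19a, read 5'->3'.
print(hairpin_stem("TGCGATACTCATCGCA"))
```

For this sequence the sketch finds a six-base-pair stem (TGCGAT paired with ATCGCA) closed by the four-base loop ACTC, as drawn in the figure.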

Several unusual DNA structures are formed from three or even four DNA strands. Nucleotides participating in a Watson-Crick base pair (Fig. 8-11) can form additional hydrogen bonds with a third strand, particularly with functional groups arrayed in the major groove. For example, the guanosine residue of a G≡C nucleotide pair can pair with a cytidine residue (if protonated) on a third strand (Fig. 8-20a); the adenosine of an A═T pair can pair with a thymidine residue. The N-7, $O^{6}$, and $N^{6}$ of purines, the atoms that participate in the hydrogen bonding with a third DNA strand, are often referred to as Hoogsteen positions, and the non-Watson-Crick pairing is called Hoogsteen pairing, after Karst Hoogsteen, who in 1963 first recognized the potential for these unusual pairings. Hoogsteen pairing allows the formation of triplex DNAs. The triplexes shown in Figure 8-20 (a, b) are most stable at low pH because the C≡G•C⁺ triplet requires a protonated cytosine. In the triplex, the $pK_{a}$ of this cytosine is >7.5, altered from its normal value of 4.2. The triplexes also form most readily within long sequences containing only pyrimidines or only purines in a given strand. Some triplex DNAs contain two pyrimidine strands and one purine strand; others contain two purine strands and one pyrimidine strand.
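The effect of the shifted pKa on cytosine protonation can be estimated with the Henderson-Hasselbalch relationship. The sketch below is a rough illustration under stated assumptions: pH 7.0 is chosen arbitrarily as a reference point, 7.5 is used as a lower bound because the text gives the triplex value only as greater than 7.5, and the function name is hypothetical.

```python
def fraction_protonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a weak acid in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


for label, pKa in (("free cytosine", 4.2), ("cytosine in triplex", 7.5)):
    fraction = fraction_protonated(7.0, pKa)
    print(f"{label} (pKa {pKa}): {fraction:.1%} protonated at pH 7.0")
```

With the normal pKa of 4.2, well under 1% of the cytosine would be protonated at pH 7.0, whereas with a pKa of 7.5 roughly three-quarters would be, which illustrates why the shifted pKa matters for forming the protonated triplet.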

A five-part figure, a, b, c, d, and e, shows D N A structures containing three or four D N A strands. — FIGURE 8-20 DNA structures containing three or four DNA strands. (a) Base-pairing patterns in one well-characterized form of triplex DNA. The Hoogsteen pair in each case is shown in red. (b) Triple-helical DNA containing two pyrimidine strands (red and white; sequence TTCCTT) and one purine strand (blue; sequence AAGGAA). The blue and white strands are antiparallel and paired by normal Watson-Crick base-pairing patterns. The third (all-pyrimidine) strand (red) is parallel to the purine strand and paired through non-Watson-Crick hydrogen bonds. The triplex is viewed from the side, with six triplets shown. (c) Base-pairing pattern in the guanosine tetraplex structure. (d) Four successive tetraplets from a G tetraplex structure. (e) Possible variants in the orientation of strands in a G tetraplex. [Data from (b) PDB ID 1BCE, J. L. Asensio et al., *Nucleic Acids Res.* 26:3677, 1998; (d) PDB ID 244D, G. Laughlan et al., *Science* 265:520, 1994.]

Part a shows two base pairing patterns. The structure on the left is labeled T double parallel line A dot T. T on the left is the only molecule that is not highlighted. It is shown on the left as a six-membered ring with N in position 1 bonded to 1 prime C, C in position 2 double bonded to O, N in position 3 bonded to H that is connected by a set of three vertical lines to red highlighted N 1 of adenine, C in position 4 double bonded to O that is connected by a set of three vertical lines to red highlighted H of N H 2 bonded to C 6 in adenine, C 5 is bonded to C H 3, and there is a double bond between C 5 and C 6. A is shown between this T and another T on the right, both of which are entirely highlighted in red. A has a six-membered ring joined with a five-membered ring. N in position 1 is connected to H of N H in position 3 of thymine, there is N at position 3, C in position 6 is bonded to N that is further bonded to an H that is connected by three vertical lines to O double bonded to C 4 in the left-hand T, and the N is also single bonded to a second H that is connected by three vertical lines to O double bonded to C 4 of the right-hand T, there are double bonds between positions 2 and 3 and between positions 1 and 6, and there is a double bond between position 3 and position 4 that is shared with the five-membered ring. The 5-membered ring has N at position 7 that is connected by three vertical lines to H bonded to N at position 3 of the right-hand T, N at position 9 that is further bonded to C 1 prime, and a double bond between C 7 and C 8. The right-hand T has N 1 bonded to C 1 prime, C 2 double bonded to O, N at position 3 bonded to H that is connected by three vertical lines to N at position 7 of A, C 4 double bonded to O that is connected by three vertical lines to H of N H 2 bonded to C 6 of A, C 5 bonded to C H 3, and a double bond between C 5 and C 6. The structure on the right is labeled C three parallel lines G dot C plus.
It has nonhighlighted C on the left with red highlighted G in the middle and red highlighted C on the right side. C has a six-membered ring with N in position 1 bonded to 1 prime C, C 2 double bonded to O that is connected by three vertical lines to highlighted H of N H 2 bonded to C 2 of G, N in position 3 connected by three vertical lines to red H bonded to N in position 1 of G, C 4 bonded to N H 2 that is connected by three vertical lines to O double bonded to C 6 of G, and double bonds between positions 3 and 4 and positions 5 and 6. Red highlighted G has a six-membered ring on the left joined with a five-membered ring on the right. The six-membered ring has N at position 1 bonded to H that is connected by three vertical lines to N in position 3 of C, C 2 is bonded to N H 2 that has three vertical lines connecting one H to O that is double bonded to C 2 of C, there is N in position 3, C 6 is double bonded to O that is connected by three vertical lines on the left to H of N H 2 bonded to C 4 in C and is further connected by three vertical lines on the right to H of N H 2 bonded to C 4 in C plus, there is a double bond between positions 2 and 3 and a double bond between positions 4 and 5 that is shared with the five-membered ring. The five-membered ring has N 7 connected by three vertical lines to H bonded to N plus in position 3 of C plus, N 9 bonded to C 1 prime, and a double bond between C 7 and C 8. C plus has a six-membered ring with N in position 1 bonded to C 1 prime, C 2 double bonded to O, N plus in position 3 and bonded to H that has three horizontal lines connecting it to N 7 in G, C 4 bonded to N H 2 with one H connected by three vertical lines to O bonded to C 6 of G, and double bonds between positions 3 and 4 and positions 5 and 6. Part b shows three backbone strands of different colors. One begins at the lower left and curls around the front to the top center, forming a J shape.
One begins in the bottom center and runs around the back up to the top left before bending back toward the center. The third begins on the right and runs upward along the back almost parallel to the central strand to stop near the place that the middle strand curves inward. Each strand has single and double ring structures extending into the central region. Part c shows a structure labeled guanosine tetraplex. It consists of four G molecules connected by hydrogen bonds, shown as sets of three lines, to form a rotationally symmetric, roughly square structure. The first G is at the lower left. It has a five-membered ring on the left bonded to a six-membered ring on the right. The six-membered ring has N in position 1 bonded to H that is connected by a hydrogen bond to O bonded to C 6 of the right-hand G, C at position 2 bonded to N H 2 from which one H is connected by a hydrogen bond to N 7 of the right-hand G, N at position 3, C at position 6 double bonded to O that is connected by a hydrogen bond to H bonded to N at position 1 of the top left G, a double bond between positions 2 and 3, and a double bond between positions 4 and 5 that is shared with the five-membered ring. The five-membered ring has N at position 7 that is connected by a hydrogen bond to H bonded to N H bonded to C 2 of the top right G, N at position 9 bonded to C 1 prime below, and a double bond between position 7 and position 8. The top left G has the same overall structure as the bottom left G but is oriented with the five-membered ring above and the six-membered ring below. N in position 9 is bonded to C 1 prime to the top left. C 6 of the six-membered ring is double bonded to O that is connected by a hydrogen bond to H bonded to N at position 1 of the top right hand G. N in position 7 on the five-membered ring is connected by a hydrogen bond to H that is further bonded to N H that is bonded to C 2 of the six-membered ring of the top right G.
The top right G is similar in structure overall to the others, but has N 9 bonded to C 1 prime at the top right vertex, N 7 connected by a hydrogen bond to H of N H bonded to C 2 of the G below, and C 6 double bonded to O that is connected by a hydrogen bond to H bonded to N at position 1 of the bottom right G. The bottom right G has a similar structure overall except that C 6 is double bonded to O that is connected by a hydrogen bond to H bonded to N 1 of the left-hand G and N 7 is connected by a hydrogen bond to H of N H 2 bonded to C 2 on the left-hand G. Part d shows four strands in different colors. One begins at the lower left and curves diagonally upward along the front to the upper right. One curves from the bottom center around the left to the center front top. Another curves from the right bottom to the rear center top. A fourth begins at the rear right and curves along the rear up to the top left. Single and double ring bases extend into the center from all four strands. Part e shows two structures. The one on the left, labeled parallel, has three horizontal layers that each contain four sets of double rings, each with one six-membered ring and one five-membered ring, with one pair of rings in each corner in the same location on each layer. Arrows run from below the bottom layer to above the top layer along each corner, and all point upward. The structure on the right, labeled antiparallel, has a similar structure except that the left front and right back arrows point up and the left rear and right front arrows point down.

Four DNA strands can also pair to form a tetraplex (quadruplex), but this occurs readily only for DNA sequences with a very high proportion of guanosine residues (Fig. 8-20c, d). The guanosine tetraplex, or G tetraplex, is quite stable over a broad range of conditions. The orientation of strands in the tetraplex can vary as shown in Figure 8-20e.

In the DNA of living cells, sites recognized by many sequence-specific DNA-binding proteins (Chapter 28) are arranged as palindromes, and polypyrimidine or polypurine sequences that can form triple helices are found within regions involved in the regulation of expression of some eukaryotic genes.
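Both sequence features mentioned here can be recognized from the base sequence alone. The sketch below is illustrative only: the function names, the example sequences, and the minimum tract length of 6 are assumptions chosen for the demonstration, not values from the text. A palindrome is treated as a sequence equal to its own reverse complement, and a candidate triplex-forming tract as a run containing only purines or only pyrimidines.

```python
import re

# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if both strands of the duplex read the same 5'->3'."""
    return seq.upper() == reverse_complement(seq)


def homopolymer_tracts(seq: str, min_len: int = 6):
    """Find runs of only purines (A, G) or only pyrimidines (C, T).

    min_len is an arbitrary illustrative cutoff, not a value from the text.
    Returns a sorted list of (start_index, tract, kind) tuples.
    """
    patterns = (
        ("purine", f"[AG]{{{min_len},}}"),
        ("pyrimidine", f"[CT]{{{min_len},}}"),
    )
    tracts = []
    for kind, pattern in patterns:
        for match in re.finditer(pattern, seq.upper()):
            tracts.append((match.start(), match.group(), kind))
    return sorted(tracts)


print(is_palindrome("GAATTC"))                    # palindromic example
print(is_palindrome("GAATCC"))                    # not palindromic
print(homopolymer_tracts("CGAAGGAATCGTTCCTTAC"))  # hypothetical sequence
```

A scan of this kind only flags candidates; whether a given tract actually forms a triple helix in the cell depends on conditions the sequence alone does not capture.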

Messenger RNAs Code for Polypeptide Chains

We now turn our attention to the expression of the genetic information that DNA contains. Given that the DNA of eukaryotes is largely confined to the nucleus, whereas protein synthesis occurs on ribosomes in the cytoplasm, some molecule other than DNA must carry the genetic message from the nucleus to the cytoplasm. As early as the 1950s, RNA was considered the logical candidate: RNA is found in both the nucleus and the cytoplasm, and an increase in protein synthesis is accompanied by an increase in the amount of cytoplasmic RNA and an increase in its rate of turnover. These and other observations led several researchers to suggest that RNA carries genetic information from DNA to the protein-synthesizing machinery of the ribosome. In 1961, François Jacob and Jacques Monod presented a unified (and essentially correct) picture of many aspects of this process. They proposed the name “messenger RNA” (mRNA) for that portion of the total cellular RNA carrying the genetic information from DNA to the ribosomes. The mRNAs are formed on a DNA template by the process of transcription. Once they reach the ribosomes, the messengers provide the templates that specify amino acid sequences in polypeptide chains. Although mRNAs from different genes can vary greatly in length, the mRNAs from a particular gene generally have a defined size.

In bacteria and archaea, a single mRNA molecule may code for one or several polypeptide chains. If it carries the code for only one polypeptide, the mRNA is monocistronic; if it codes for two or more different polypeptides, the mRNA is polycistronic. In eukaryotes, most mRNAs are monocistronic. (For the purposes of this discussion, “cistron” refers to a gene. The term itself has historical roots in the science of genetics, and its formal genetic definition is beyond the scope of this text.) The minimum length of an mRNA is set in part by the length of the polypeptide chain for which it codes. For example, a polypeptide chain of 100 amino acid residues requires an RNA coding sequence of at least 300 nucleotides, because each amino acid is coded by a nucleotide triplet (this and other details of protein synthesis are discussed in Chapter 27). However, mRNAs transcribed from DNA are always somewhat longer than the length needed simply to code for a polypeptide sequence (or sequences). The additional, noncoding RNA includes sequences required to begin and end translation by the ribosome, as well as regulatory sequences. Figure 8-21 summarizes the general structure of bacterial mRNAs.
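The triplet arithmetic above can be written out as a short sketch. The function names are hypothetical, and the calculation covers only the coding sequence itself; it deliberately ignores the additional noncoding RNA that the text says is always present, so the result is a lower bound on mRNA length.

```python
NUCLEOTIDES_PER_CODON = 3  # each amino acid is coded by a nucleotide triplet


def min_coding_length(n_residues: int) -> int:
    """Minimum number of nucleotides needed to code for n_residues amino acids."""
    if n_residues < 0:
        raise ValueError("number of residues cannot be negative")
    return n_residues * NUCLEOTIDES_PER_CODON


def min_polycistronic_coding_length(residue_counts) -> int:
    """Lower bound for an mRNA coding for several polypeptides.

    Noncoding RNA between and around the genes is not counted.
    """
    return sum(min_coding_length(n) for n in residue_counts)


print(min_coding_length(100))                            # the example from the text
print(min_polycistronic_coding_length([100, 250, 80]))   # hypothetical three-gene transcript
```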

A figure shows monocistronic and polycistronic bacterial m R N A. — FIGURE 8-21 Bacterial mRNA. Schematic diagrams show (a) monocistronic and (b) polycistronic mRNAs of bacteria. Red segments represent RNA coding for a gene product; gray segments represent noncoding RNA. In the polycistronic transcript, noncoding RNA separates the three genes.

Many RNAs Have More Complex Three-Dimensional Structures

Messenger RNA is only one of several classes of cellular RNA. Transfer RNAs are adapter molecules that act in protein synthesis; covalently linked to an amino acid at one end, each tRNA pairs with the mRNA in such a way that amino acids are joined to a growing polypeptide in the correct sequence. Ribosomal RNAs are components of ribosomes. There is also a wide variety of noncoding RNAs, including some (called ribozymes) that have enzymatic activity. All the RNAs are considered in detail in Chapter 26. The diverse and often complex functions of these RNAs reflect a diversity of structure much richer than that observed in DNA molecules.

The product of transcription of DNA is always single-stranded RNA. The single strand tends to assume a right-handed helical conformation dominated by base-stacking interactions (Fig. 8-22), which are stronger between two purines than between a purine and a pyrimidine or between two pyrimidines. The purine-purine interaction is so strong that a pyrimidine separating two purines is often displaced from the stacking pattern so that the purines can interact. Any self-complementary sequences in the molecule trigger folding into structures with more complexity. RNA can base-pair with complementary regions of either RNA or DNA. Base pairing matches the pattern for DNA: G pairs with C and A pairs with U (or with the occasional T residue in some RNAs). One difference is that base pairing between G and U residues is allowed in RNA (see Fig. 8-24) when complementary sequences in two single strands of RNA (or within a single strand of RNA that folds back on itself to align the residues) pair with each other. The paired strands in RNA or RNA-DNA duplexes are antiparallel, as in DNA.
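The pairing rules just described can be expressed as a small check. This is a sketch under simplifying assumptions: the function names and example sequences are hypothetical, only the four standard RNA bases are handled (the occasional T residue is ignored), and pairing is judged purely by base identity, with both strands written 5'→3' and aligned antiparallel.

```python
# Standard Watson-Crick pairs in RNA, plus the G-U pair allowed in RNA.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(a: str, b: str, allow_gu: bool = True) -> bool:
    """True if bases a and b can pair under the rules above."""
    pair = (a.upper(), b.upper())
    return pair in WATSON_CRICK or (allow_gu and pair in WOBBLE)


def pairs_antiparallel(strand1: str, strand2: str, allow_gu: bool = True) -> bool:
    """Check full-length pairing of two RNA strands, each written 5'->3'.

    strand2 is read in reverse so that the strands align antiparallel.
    """
    if len(strand1) != len(strand2):
        return False
    return all(
        can_pair(a, b, allow_gu) for a, b in zip(strand1, reversed(strand2))
    )


# Hypothetical strands containing one U-G pair.
print(pairs_antiparallel("ACCUUG", "CGAGGU"))                  # G-U allowed
print(pairs_antiparallel("ACCUUG", "CGAGGU", allow_gu=False))  # DNA-like rules
```

With the G-U pair allowed, the two example strands pair along their full length; under the stricter rules that apply to DNA, the single U-G position prevents a complete match.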

A figure shows the typical right-handed stacking pattern of single-stranded R N A. — FIGURE 8-22 Typical right-handed stacking pattern of single-stranded RNA. The bases are shown in yellow, the phosphorus atoms in orange, and the riboses and phosphate oxygens in green. Green is used to represent RNA strands in succeeding chapters, just as blue is used for DNA.

When two strands of RNA with perfectly complementary sequences are paired, the predominant double-stranded structure is an A-form right-handed double helix. However, strands of RNA that are perfectly paired over long regions of sequence are uncommon. The three-dimensional structures of many RNAs, like those of proteins, are complex and unique. Weak interactions, especially base-stacking interactions, help stabilize RNA structures, just as they do in DNA. Z-form helices have been made in the laboratory (under very high-salt or high-temperature conditions). The B form of RNA has not been observed. Breaks in the regular A-form helix caused by mismatched or unmatched bases in one or both strands are common and result in bulges or internal loops (Fig. 8-23). Hairpin loops form between nearby self-complementary (palindromic) sequences. Extensive base-paired helical segments are formed in many RNAs (Fig. 8-24), and the resulting hairpins are the most common type of secondary structure in RNA. Specific short base sequences (such as UUCG) are often found at the ends of RNA hairpins and are known to form particularly tight and stable loops. Such sequences may act as starting points for the folding of an RNA molecule into its precise three-dimensional structure. Other contributions are made by hydrogen bonds that are not part of standard Watson-Crick base pairs. For example, the $2'$-hydroxyl group of ribose can hydrogen-bond with other groups. Some of these properties are evident in the tertiary structure of the phenylalanine transfer RNA of yeast (the tRNA responsible for inserting Phe residues into polypeptides) and in two RNA enzymes, or ribozymes, whose functions, like those of protein enzymes, depend on their three-dimensional structures (Fig. 8-25).
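How a hairpin arises from nearby self-complementary sequences can be illustrated with a simple scan of a single strand. The sketch below is not a structure-prediction method: the function name, the fixed stem and loop lengths, and the example sequence are all assumptions made for the illustration, and it ignores thermodynamics, bulges, and internal loops. It only reports positions where one stretch can pair antiparallel with a stretch just downstream, leaving a short loop between them.

```python
# Pairs permitted within an RNA stem: Watson-Crick plus G-U.
ALLOWED_PAIRS = {
    ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G"),
}


def find_hairpins(seq: str, stem_len: int = 4, loop_len: int = 4):
    """Find simple hairpins with a fixed stem and loop length.

    Returns a list of (start, five_prime_arm, loop, three_prime_arm) tuples.
    stem_len and loop_len are illustrative defaults, not values from the text.
    """
    seq = seq.upper()
    window = 2 * stem_len + loop_len
    hits = []
    for start in range(len(seq) - window + 1):
        arm5 = seq[start:start + stem_len]
        loop = seq[start + stem_len:start + stem_len + loop_len]
        arm3 = seq[start + stem_len + loop_len:start + window]
        # The 3' arm is read backward so the two arms align antiparallel.
        if all((a, b) in ALLOWED_PAIRS for a, b in zip(arm5, reversed(arm3))):
            hits.append((start, arm5, loop, arm3))
    return hits


# Hypothetical sequence with a UUCG loop closed by a four-base-pair stem.
for hit in find_hairpins("GGACUUCGGUCC"):
    print(hit)
```

In the example, the arms GGAC and GUCC pair with each other and enclose the UUCG loop, the kind of short loop sequence the text describes as especially stable.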

A two-part figure, a and b, shows the types of secondary structures of R N As, bulge, internal loop, hairpin loop, and how a hairpin loop can form a right-handed helix. — FIGURE 8-23 Secondary structure of RNAs. (a) Bulge, internal loop, and hairpin loop. (b) The paired regions generally have an A-form right-handed helix, as shown for a hairpin. The single UG base pair is identified with a green dot. [(b) Data from PDB ID 1GID, J. H. Cate et al., *Science* 273:1678, 1996.]

Part a shows two single strands on the left that come together, separate slightly to form a bulge, then come together, then separate to form a circular opening labeled internal loop, then come together, then separate as the top forms a hairpin loop before they come together again. The top strand is shown with vertical lines extending downward from each base, and the bottom strand is shown with vertical lines extending upward from each base. The top strand begins with G C A and the bottom strand begins with U G C. This section is labeled single strands. The strands come together next. The top strand has A C C U and the bottom strand has U G G A. Next, there is U on the top strand with no vertical line below, a highlighted dot between the strands, and G on the bottom strand with no vertical line above. Next, there is G on the top strand connected to C on the bottom strand. Next, a highlighted piece of the top strand with A bumps upward and is labeled bulge. It has no complementary base below. Next, the strands reconnect with C U A C C on top and G A U C C on the bottom. Next, the strands separate to form a highlighted internal loop. The top half of the loop has U, then A at the top, then A. The bottom half has G, then A at the bottom, then A. The strands come back together with C G U on the top strand and G C A on the bottom strand. The top strand extends upward to the right in a loop. The strand extends up with U C C C U bonded to the other side of the hairpin, then curves around with unbonded A U U C G G, then A G G A that pairs with the opposite side of the loop. The strand bends parallel to the bottom strand and pairing begins again as the two strands bend down to the right. The top strand has G C C and the bottom strand has C G G. Part b shows a hairpin double helix. A single backbone strand forms a helix that ends in a closed loop before running back in a second helix bonded to the first. 
Single and double ring bases are visible throughout, with some outside of the backbone.

A two-part figure, a and b, shows many base paired helical structures in the P R N A component of an enzyme with the secondary structure shown in part a and the three-dimensional structure shown with a complexed t R N A in part b. — FIGURE 8-24 Base-paired helical structures in an RNA. Shown here are (a) the secondary structure and (b) the three-dimensional structure of the P RNA component of the RNase P of *Thermotoga maritima*. RNase P, which also contains a protein component (not shown), functions in the processing of transfer RNAs. A complexed tRNA is also shown in (b). Separate C (catalytic) and S (specificity) domains are denoted with yellow and light red backbones in both images. The blue dots in (a) indicate non-Watson-Crick $G–U$ base pairs (boxed inset). Note that $G–U$ base pairs are allowed only when presynthesized strands of RNA fold up or anneal with each other. [(a) Information from N. J. Reiter et al., *Nature* 468:784, 2010, Fig. 2a. (b) Data from PDB ID 3Q1R, N. J. Reiter et al., *Nature* 468:784, 2010.]

Part a shows a complex structure with two halves. There is a C domain on the left with many parallel pieces of double-stranded R N A connected by bars representing hydrogen bonds with a few loops, bulges, and hairpins present. Long unpaired individual segments connect the various paired segments and to the lower right, connect the C domain to the S domain. The S domain is smaller and somewhat less complex with several prominent hairpins. A close-up of a double-stranded piece at the top right of the C domain where there is a dot between the strands shows guanine hydrogen bonded to uracil. Guanine has a five-membered ring on the left bonded to a six-membered ring on the right. The five-membered ring has N in position 7, N in position 9 that is further bonded, a double bond between position 7 and position 8, and a double bond between position 4 and position 5 that is shared with the six-membered ring. The six-membered ring has N in position 1 bonded to H that is hydrogen bonded to O double bonded to C 2 in uracil; C 2 bonded to N H 2, N in position 3; a double bond between position 2 and position 3; and C 6 double bonded to O that is hydrogen bonded to H bonded to N at position 3 of uracil. Uracil is a six-membered ring with N in position 1 further bonded; C 2 double bonded to O that is hydrogen bonded to H bonded to N in position 1 of guanine; N in position 3 bonded to H that is hydrogen bonded to O that is double bonded to C 6 of guanine; C double bonded to O in position 4; and a double bond between C 5 and C 6. The detail of the R N A structures is as follows. The double-stranded pieces are connected by single lines and will be numbered here for clarity but are not numbered in the figure. At the lower left, there is a vertical piece, 1, with its left strand connected to another vertical piece above, 2, and its right strand connected to a horizontal line to the right. 
Piece 2 above ends with a rounded region, forming a loop, and has a bulge on the right with a dot between the strands. The bottom of the right strand connects to a line that extends horizontally, then bends vertically, then extends horizontally to a piece that goes upward, piece 3, and has X-shaped bonds beneath a horizontal bond connecting it with a longer piece on its left from which a strand runs horizontally back to the bottom right strand of piece 1. The longer left strand has three angled bars to the left to a long strand that is connected below by a strand that runs horizontally, has a short vertical piece, and then runs horizontally to join the top right of piece 1 at the far left. A strand extends from the bottom left side of piece 1 and runs horizontally across the bottom, then bends vertically before bending to meet the top right side of a vertical piece, piece 4. The right side of piece 4 runs lower than the left side and has a line above to vertical piece 5, which has a slight bump on its left side. Its left side has a horizontal strand below that connects to the small piece with a single bond and an X-shaped bond to piece 3, before running back to the bottom right side of piece 2. A strand from the top right half of piece 5 bends horizontally and then down to the left side of piece 3. A strand from the top right of piece 5 extends up to a short horizontal piece, piece 6, with three horizontal bars and then a dot at the top. A line runs horizontally from the bottom right side of piece 6 to a small vertical single piece before bending to the left and then up again to join the right side of a diagonal piece, piece 7. A strand extends down from the bottom right of piece 7 to a horizontal piece, then jogs downward to a stem loop with the curved part at the bottom, piece 8. 
A strand extends from the top right of piece 8 and runs horizontally for a short distance before bending down to the previously referenced long piece that has three angled bars from which a strand runs to the to right of piece 1. The diagonal piece 7 has four sets of bonds and then opens into a loop. Two strands extend from the top of the loop and run diagonally up to the right, then horizontally, before joining the top of piece 9, which runs diagonally down to an opening with a half-loop on the left and a strand on the right before forming a vertical piece, piece 10, with one bar, a dot, and then two bars. A close-up of the region with the dot is shown to the right. Beneath piece 10, a strand runs from the bottom of the right side to the bottom of the right side of vertical piece 11 below, from which a strand runs up from the top left to the bottom left of piece 10. A strand runs from the bottom left side of piece 11 to the bottom of the first horizontal piece of the S domain, piece 12, and a strand runs horizontally from the top left side of piece 11 to bend down to meet a short vertical piece, piece 13, with a dot between the strands at the top and three bars below. A strand runs horizontally from the top of piece 13 and then bends vertically to meet the left end of the first piece of the S domain, piece 12. A strand runs horizontally from the bottom right of piece 13 to bend up through a small piece that connects to a strand that bends left and then up to join the top strand of diagonal piece 7. A strand runs from the bottom left of piece 13 to the left side of piece 5, which has a small bump on its left side. The S domain begins with piece 12, which is a small horizontal piece with a dot between the strand on the left and then four bars. The top connects to a strand that bends up, right, and then up to join the left side of an opening in the middle of piece 14, which is a horizontal piece with loops on each end connected by a short strand at the top. 
A strand runs down from the right half of the opening at the bottom of piece 14 and bends left to join the top half of diagonal piece 15, which is a large piece that angles down to the right. A strand runs from the bottom of piece 12 to the bottom half of diagonal piece 15. This piece has a loop on the top and a loop on the bottom. The top strand bends upward to an irregular, large, bumpy region before ending with another diagonal horizontal piece that ends in a loop. The complementary side of the loop runs straight apart from a small bump just before the large, bumpy region above and then has a small bump underneath the middle of the bumpy region before a strand runs down vertically to a horizontal piece, piece 16, with loops on each side. There are many irregular bonds at different angles in the wide bumpy region. A pair of strands runs down from piece 15 to the horizontal loop structure below, one from each side of an opening in diagonal piece 15 beneath the left side of the bumpy region. The horizontal loop structure has a small bump on the top just to the left of where the vertical strands join and has a strand with a small larger piece on it across the bottom beneath the open region. Part b shows a t R N A with three different parts. The bottom part extends up vertically to meet the top two parts, one extending out to the left and one extending out to the right.

A three-part figure, a, b, and c, shows three-dimensional structure in a phenylalanine t R N A with unusual base-pairing, a hammerhead ribozyme, and an m R N A intron that is also a ribozyme. — FIGURE 8-25 Three-dimensional structure in RNA. (a) Three-dimensional structure of phenylalanine tRNA of yeast. Some unusual base-pairing patterns found in this tRNA are shown in the numbered insets. Note in inset 1 a hydrogen bond with a ribose $2'$-hydroxyl group and in inset 2 a hydrogen bond with the oxygen of a ribose phosphodiester (both shown in red). (b) A hammerhead ribozyme (so named because the secondary structure at the active site looks like the head of a hammer), derived from certain plant viruses. Ribozymes, or RNA enzymes, catalyze a variety of reactions, primarily in RNA metabolism and protein synthesis. The complex three-dimensional structures of these RNAs reflect the complexity inherent in catalysis, as described for protein enzymes in Chapter 6. (c) A segment of mRNA known as an intron, from the ciliated protozoan *Tetrahymena thermophila*. This intron (a ribozyme) catalyzes its own excision from between exons in an mRNA strand (discussed in Chapter 26). [Data from (a) PDB ID 1TRA, E. Westhof and M. Sundaralingam, *Biochemistry* 25:4868, 1986; (b) PDB ID 1MME, W. G. Scott et al., *Cell* 81:991, 1995; (c) PDB ID 1GRZ, B. L. Golden et al., *Science* 282:259, 1998.]

Part a shows a double-stranded molecule that has a vertical piece connected to the left side of a horizontal piece on top. Arrows numbered 1, 2, and 3 from top to bottom point to regions in the vertical piece. Part b shows a piece of R N A that forms roughly a V-shape with strands to left and right that join together at the bottom. Part c shows a rounded piece of irregularly arranged R N A surrounding many bases inside with a double stranded piece extending up diagonally to the right from near the left side of the larger piece below. Closeups show the chemical structure of the three numbered portions of the molecule in part a. Number 1 shows a small, curved piece of R N A backbone with an adenine base at its lower right forming a hydrogen bond to uracil of a vertical strand to its right. Another strand runs diagonally from lower left to upper right behind the top of the strand with a uracil and has an adenine that extends down to form two hydrogen bonds with uracil below. A close-up of the structures of the nitrogenous bases shows the bonding. Adenine on the left has a five-membered ring joined to a six-membered ring. The five-membered ring has N 7 at the bottom left vertex, N 9 at its top left vertex and bonded to ribose, a double bond between position 7 and position 8, and a double bond between C 4 and C 5 that is shared with the six-membered ring. The six-membered ring has N at position 1 that is hydrogen bonded to highlighted H that is further bonded to O that is bonded to 2 prime ribose; N at position 3; C 6 bonded to N H 2; and double bonds between C 1 and C 6 and C 2 and C 3. 
Uracil is shown to the right as a six-membered ring with N 1 bonded to ribose with highlighted 2 prime connected to highlighted O bonded to H that forms a hydrogen bond with N in position 1 of adenine; C 2 double bonded to O that has a hydrogen bond with H of N H 2 bonded to C 6 of a second adenine above; N in position 3 with H that is hydrogen bonded to N in position 7 of the adenine above; C 4 double bonded to O; and a double bond between C 5 and C 6. Adenine is above and to the left of uracil and has a six-membered ring on the left and a five-membered ring on the right. The six-membered ring has N in position 1, N in position 3; C 6 bonded to N H 2 of which one H forms a hydrogen bond with O double bonded to C 2 of uracil; double bonds between positions 2 and 3 and positions 1 and 6; and a double bond between C 4 and C 5 that is shared with the five-membered ring. The five-membered ring has N at position 7 with a hydrogen bond to H bonded to N at position 3 of uracil; N at position 9 bonded to ribose; and a double bond between position 7 and position 8. Number 2 shows a strand running toward the right toward the viewer on the left with 7-methylguanine above it, a strand running almost vertically across the bottom with a bond up to 7-methylguanine, an almost vertical strand on the right with cytosine extending on the left with three hydrogen bonds to guanine attached to an almost horizontal strand across the top, and two hydrogen bonds between guanine and 7-methylguanine. A close-up of the structures shows 7-methylguanine as a five-membered ring on the left joined with a six-membered ring on the right. The five-membered ring has N plus in position 7 and bonded to C H 3; N in position 9 bonded to ribose below; a double bond between position 7 and position 8; and a double bond between position 4 and position 5 that is shared with the six-membered ring.
The six-membered ring has N in position 1 bonded to H that is hydrogen bonded to N in position 7 of guanine; N in position 3; C double bonded to O in position 6; C in position 2 that is bonded to N H 2 from which one H is hydrogen bonded to O of a red-highlighted phosphate below and the other H is hydrogen bonded to O double bonded to C 6 of guanine; and a double bond between position 2 and position 3. The red highlighted phosphate is bonded to O on the right that is hydrogen bonded to H of N H bonded to C 2 of 7-methylguanine and also bonded to ribose, bonded to O minus below, bonded to O on the left that is further bonded to ribose, and double-bonded to O above. Cytosine is a six-membered ring with N in position 1 bonded to ribose, C 2 double bonded to O that has a hydrogen bond to H of N H 2 bonded to C 2 of guanine; N in position 3 that has a hydrogen bond to H bonded to N in position 1 of guanine; C 4 bonded to N H 2 of which one H has a hydrogen bond to O double bonded to C 6 in guanine; and double bonds between positions 3 and 4 and positions 5 and 6. Guanine has a five-membered ring on the left joined with a six-membered ring on the right. The five-membered ring has N in position 9 bonded to ribose above; N in position 7 hydrogen bonded to H bonded to N in position 1 of 7-methylguanine; a double bond between position 7 and position 8; and a double bond between position 4 and position 5 that is shared with the six-membered ring. The six-membered ring has N in position 1 bonded to H that is hydrogen bonded to N in position 3 of cytosine; C in position 2 bonded to N H 2 of which one H is hydrogen bonded to O double bonded to C 2 in cytosine; N in position 3; C 6 double bonded to O that is hydrogen bonded to H of N H 2 bonded to C 2 of 7-methylguanine and hydrogen bonded to H of N H 2 bonded to C 4 of cytosine; and a double bond between positions 2 and 3. 
Number 3 shows a C-shaped curved strand with adenine extending horizontally to the right to hydrogen bond with N superscript 2 end superscript dimethylguanine above a strand that curves away from the observer. A close-up of the structures shows adenine on the left and N superscript 2 end superscript dimethylguanine on the right. Adenine has a five-membered ring on the left joined with a six-membered ring on the right. The five-membered ring has N in position 7; N in position 9 bonded to ribose below; a double bond between positions 7 and 8; and a double bond between positions 4 and 5 that is shared with the six-membered ring. The six-membered ring has N in position 1 hydrogen bonded to H bonded to N in position 1 of N superscript 2 end superscript dimethylguanine; N in position 3; C 6 bonded to N H 2 of which H is hydrogen bonded to O double bonded to C 6 of N superscript 2 end superscript dimethylguanine; and double bonds between positions 2 and 3 and positions 1 and 6. N superscript 2 end superscript dimethylguanine has a six-membered ring on the left joined with a five-membered ring on the right. The six-membered ring has N in position 1 bonded to H that is hydrogen bonded to N in position 1 of adenine; C 2 bonded to N that is further bonded to 2 C H 3; N in position 3; C 6 double bonded to O that is hydrogen bonded to H bonded to N H bonded to C 6 of adenine; a double bond between position 2 and position 3; and a double bond between position 4 and position 5 that is shared with the five-membered ring. The five-membered ring has N in position 7; N in position 9 bonded to ribose; and a double bond between positions 7 and 8.

The analysis of RNA structure and the relationship between its structure and its function remains a robust field of inquiry that has many of the same complexities as the analysis of protein structure. The importance of understanding RNA structure grows as we become increasingly aware of the large number of functional roles for RNA molecules.

SUMMARY 8.2 Nucleic Acid Structure

Many lines of evidence show that DNA bears genetic information. Some of the earliest evidence came from the Avery-MacLeod-McCarty experiment, which showed that DNA isolated from one bacterial strain can enter and transform the cells of another strain, endowing the second strain with some of the inheritable characteristics of the donor. The Hershey-Chase experiment showed that the DNA of a bacterial virus, but not its protein coat, carries the genetic message for replication of the virus in a host cell.
Putting together the available data, Watson and Crick postulated that native DNA consists of two antiparallel chains in a right-handed double-helical arrangement. Complementary base pairs, $\mathrm{A{=}T}$ and $\mathrm{G{\equiv}C}$, are formed by hydrogen bonding between chains in the helix. The base pairs are stacked perpendicular to the long axis of the double helix, 3.4 Å apart, with 10.5 bp per turn.
DNA can exist in several structural forms. Two variations of the Watson-Crick form, or B-DNA, are A- and Z-DNA.
Some sequence-dependent structural variations cause bends in the DNA molecule. DNA strands with appropriate sequences can form hairpin or cruciform structures or triplex or tetraplex DNA.
Messenger RNA transfers genetic information from DNA to ribosomes for protein synthesis.
Transfer RNA and ribosomal RNA are also involved in protein synthesis. RNA can be structurally complex; single RNA strands can fold into hairpins, double-stranded regions, or complex loops. Additional noncoding RNAs have a variety of special functions.
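
The quantitative points in this summary can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not a tool described in the text: the function names are hypothetical, the sequence is assumed to contain only the four standard DNA bases, and the geometry uses the idealized B-DNA values given above (3.4 Å per base pair, 10.5 bp per turn). The palindrome check reflects the idea that self-complementary (inverted repeat) sequences are the kind that can, in principle, form hairpin or cruciform structures.

```python
# Hypothetical helper names chosen for illustration; values taken from the summary.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}  # Watson-Crick pairs

RISE_PER_BP_ANGSTROM = 3.4  # spacing between stacked base pairs in B-DNA
BP_PER_TURN = 10.5          # base pairs per helical turn in B-DNA


def reverse_complement(strand: str) -> str:
    """Return the antiparallel partner strand, written 5'->3'."""
    # Reversing accounts for antiparallel orientation; the lookup applies A=T, G≡C.
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def helix_geometry(n_bp: int) -> tuple[float, float]:
    """Return (length in angstroms, number of helical turns) for n_bp of ideal B-DNA."""
    return n_bp * RISE_PER_BP_ANGSTROM, n_bp / BP_PER_TURN


def is_palindromic(strand: str) -> bool:
    """True if the strand reads the same as its own reverse complement."""
    return strand.upper() == reverse_complement(strand)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))   # GCAT
    print(is_palindromic("GAATTC"))     # True
    length, turns = helix_geometry(105)
    print(f"{length:.0f} angstroms over {turns:.0f} turns")
```

Real DNA deviates from these idealized numbers depending on sequence and conditions, so the geometry function is best read as a back-of-the-envelope estimate.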