Proteins are polymers of amino acids, with each amino acid residue joined to its neighbor by a specific type of covalent bond. (The term “residue” reflects the loss of the elements of water when one amino acid is joined to another.) Proteins can be broken down (hydrolyzed) to their constituent amino acids by a variety of methods, and the earliest studies of proteins naturally focused on the free amino acids derived from them. Twenty different amino acids are commonly found in proteins. The first to be discovered was asparagine, in 1806. The last of the 20 to be found, threonine, was not identified until 1938. All the amino acids have trivial or common names, in some cases derived from the source from which they were first isolated. Asparagine was first found in asparagus, and glutamate in wheat gluten; tyrosine was first isolated from cheese (its name is derived from the Greek tyros, “cheese”); glycine (Greek glykos, “sweet”) was so named because of its sweet taste.
Learning the names, structures, and chemical properties of the 20 common amino acids found in proteins is one of the key memorization trials of every beginning biochemistry student. The necessity rapidly becomes apparent in succeeding chapters. It is impossible to discuss protein structure, protein function, ligand-binding sites, enzyme active sites, and most other biochemical topics without this foundation. The amino acids are part of the biochemistry vocabulary.
All 20 of the common amino acids are α-amino acids. They have a carboxyl group and an amino group bonded to the same carbon atom (the α carbon) (Fig. 3-2). They differ from each other in their side chains, or R groups, which vary in structure, size, and electric charge, and which influence the solubility of the amino acids in water. In addition to these 20 amino acids, there are many less common ones. Some are residues modified after a protein has been synthesized, others are amino acids present in living organisms but not as constituents of proteins, and two are special cases found in just a few proteins. The common amino acids of proteins have been assigned three-letter abbreviations and one-letter symbols (see Table 3-1), which are used as shorthand to indicate the composition and sequence of amino acids polymerized in proteins.
The three-letter code is easily understood, the abbreviations generally consisting of the first three letters of the amino acid name. The one-letter code was devised by Margaret Oakley Dayhoff, considered by many to be the founder of the field of bioinformatics. The one-letter code reflects an attempt to reduce the size of the data files (in an era of limited computer memory) used to describe amino acid sequences. It was designed to be easily memorized, and understanding its origin can help students do just that. For six amino acids (CHIMSV), the first letter of the amino acid name is unique and thus is used as the symbol. For five others (AGLPT), the first letter of the name is not unique but is assigned to the amino acid that is most common in proteins (for example, leucine is more common than lysine). For another four, the letter used is phonetically suggestive (RFYW: aRginine, Fenylalanine, tYrosine, tWiptophan). The rest were harder to assign. Four (DNEQ) were assigned letters found within or suggested by their names (asparDic, asparagiNe, glutamEke, Q-tamine). That left lysine. Only a few letters were left, and K was chosen because it was the closest to L.
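Because the one-letter code was devised to keep sequence data compact, a lookup table is a natural way to move between the two notations. The following is a minimal Python sketch, not part of the original text. The dictionary is transcribed from the abbreviations in Table 3-1, and the helper names `to_one_letter` and `to_three_letter` are hypothetical.

```python
# Three-letter -> one-letter codes for the 20 common amino acids (Table 3-1).
THREE_TO_ONE = {
    "Gly": "G", "Ala": "A", "Pro": "P", "Val": "V", "Leu": "L",
    "Ile": "I", "Met": "M", "Phe": "F", "Tyr": "Y", "Trp": "W",
    "Ser": "S", "Thr": "T", "Cys": "C", "Asn": "N", "Gln": "Q",
    "Lys": "K", "His": "H", "Arg": "R", "Asp": "D", "Glu": "E",
}
# Reverse mapping, built from the table above.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(three_letter_seq: str, sep: str = "-") -> str:
    """Convert a separator-delimited three-letter sequence to one-letter symbols."""
    return "".join(THREE_TO_ONE[res.capitalize()] for res in three_letter_seq.split(sep))


def to_three_letter(one_letter_seq: str, sep: str = "-") -> str:
    """Convert a string of one-letter symbols to a three-letter sequence."""
    return sep.join(ONE_TO_THREE[res] for res in one_letter_seq.upper())


print(to_one_letter("Gly-Ala-Lys"))  # GAK
print(to_three_letter("WYF"))        # Trp-Tyr-Phe
```

A residue outside the 20 common amino acids raises a `KeyError` in this sketch, which is a deliberate choice to surface unexpected symbols rather than guess at them.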
For all the common amino acids except glycine, the α carbon is bonded to four different groups: a carboxyl group, an amino group, an R group, and a hydrogen atom (Fig. 3-2; in glycine, the R group is another hydrogen atom). The α-carbon atom is thus a chiral center (p. 61). Because of the tetrahedral arrangement of the bonding orbitals around the α-carbon atom, the four different groups can occupy two unique spatial arrangements, and thus amino acids have two possible stereoisomers. Since they are nonsuperposable mirror images of each other (Fig. 3-3), the two forms represent a class of stereoisomers called enantiomers (see Fig. 1-21). All molecules with a chiral center are also optically active — that is, they rotate the plane of plane-polarized light (see Box 1-2).
Two conventions are used to identify the carbons in an amino acid — a practice that can be confusing. The additional carbons in an R group are commonly designated β, γ, δ, ε, and so forth, proceeding out from the α carbon. For most other organic molecules, carbon atoms are simply numbered from one end, giving highest priority (C-1) to the carbon with the substituent containing the atom of highest atomic number. Within this latter convention, the carboxyl carbon of an amino acid would be C-1 and the α carbon would be C-2.
In cases such as amino acids with heterocyclic R groups (e.g., histidine), where the Greek lettering system is ambiguous, the numbering system is used. For branched amino acid side chains, equivalent carbons are given numbers after the Greek letters. Leucine thus has δ1 and δ2 carbons (see the structure in Fig. 3-5).
Special nomenclature has been developed to specify the absolute configuration of the four substituents of asymmetric carbon atoms. The absolute configurations of simple sugars and amino acids are specified by the D, L system (Fig. 3-4), based on the absolute configuration of the three-carbon sugar glyceraldehyde, a convention proposed by Emil Fischer in 1891. (Fischer knew what groups surrounded the asymmetric carbon of glyceraldehyde but had to guess at their absolute configuration; he guessed right, as was later confirmed by x-ray diffraction analysis.) For all chiral compounds, stereoisomers having a configuration related to that of L-glyceraldehyde are designated L, and stereoisomers related to D-glyceraldehyde are designated D. The functional groups of L-alanine are matched with those of L-glyceraldehyde by aligning those that can be interconverted by simple, one-step chemical reactions. Thus the carboxyl group of L-alanine occupies the same position about the chiral carbon as does the aldehyde group of L-glyceraldehyde, because an aldehyde is readily converted to a carboxyl group via a one-step oxidation. Historically, the similar lowercase l and d designations were used for levorotatory (rotating plane-polarized light to the left) and dextrorotatory (rotating light to the right). However, not all L-amino acids are levorotatory, and the convention shown in Figure 3-4 was needed to avoid potential ambiguities about absolute configuration. By Fischer’s convention, L and D refer only to the absolute configuration of the four substituents around the chiral carbon, not to optical properties of the molecule.
Another system of specifying configuration around a chiral center is the RS system, which is used in the systematic nomenclature of organic chemistry and describes more precisely the configuration of molecules with more than one chiral center (p. 17).
Nearly all biological compounds with a chiral center occur naturally in only one stereoisomeric form, either D or L. The amino acid residues in protein molecules are almost all L stereoisomers, with less than 1% being found in the D configuration. The rare D-amino acid residues generally have a precise structural purpose, and they are introduced to a protein by enzyme-catalyzed reactions that occur after the proteins are synthesized on a ribosome.
It is remarkable that virtually all amino acid residues in proteins are L stereoisomers. When chiral compounds are formed by ordinary chemical reactions, the result is a racemic mixture of D and L isomers, which are difficult for a chemist to distinguish and separate. But to a living system, D and L isomers are as different as the right hand and the left. The formation of stable, repeating substructures in proteins (Chapter 4) requires that their constituent amino acids be of one stereochemical series. Cells are able to specifically synthesize the L isomers of amino acids because the active sites of enzymes are asymmetric, causing the reactions they catalyze to be stereospecific.
Knowledge of the chemical properties of the common amino acids is central to an understanding of biochemistry. The topic can be simplified by grouping the amino acids into five main classes based on the properties of their R groups (Table 3-1), particularly their polarity, or tendency to interact with water at biological pH (near pH 7.0). The polarity of the R groups varies widely, from nonpolar and hydrophobic (water-insoluble) to highly polar and hydrophilic (water-soluble). A few amino acids — especially glycine, histidine, and cysteine — are somewhat difficult to characterize or do not fit perfectly in any one group. They are assigned to particular groupings based on considered judgments rather than absolutes.
| Amino acid | Abbreviation/symbol | $M_r$ᵃ | $\mathrm{p}K_1$ ($\mathrm{-COOH}$) | $\mathrm{p}K_2$ ($\mathrm{-NH_3^+}$) | $\mathrm{p}K_R$ (R group) | pI | Hydropathy indexᵇ | Occurrence in proteins (%)ᶜ | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Nonpolar, aliphatic R groups | | | | | | | | | | |
| Glycine | Gly G | 75 | 2.34 | 9.60 | | 5.97 | −0.4 | 7.2 | 7.3 | 7.3 |
| Alanine | Ala A | 89 | 2.34 | 9.69 | | 6.01 | 1.8 | 7.8 | 9.4 | 7.2 |
| Proline | Pro P | 115 | 1.99 | 10.96 | | 6.48 | −1.6ᵈ | 5.2 | 4.4 | 4.2 |
| Valine | Val V | 117 | 2.32 | 9.62 | | 5.97 | 4.2 | 6.6 | 7.1 | 8.2 |
| Leucine | Leu L | 131 | 2.36 | 9.60 | | 5.98 | 3.8 | 9.1 | 10.6 | 9.9 |
| Isoleucine | Ile I | 131 | 2.36 | 9.68 | | 6.02 | 4.5 | 5.3 | 6.0 | 7.6 |
| Methionine | Met M | 149 | 2.28 | 9.21 | | 5.74 | 1.9 | 2.3 | 2.2 | 2.2 |
| Aromatic R groups | | | | | | | | | | |
| Phenylalanine | Phe F | 165 | 1.83 | 9.13 | | 5.48 | 2.8 | 3.9 | 4.0 | 4.5 |
| Tyrosine | Tyr Y | 181 | 2.20 | 9.11 | 10.07 | 5.66 | −1.3 | 3.2 | 3.0 | 3.9 |
| Tryptophan | Trp W | 204 | 2.38 | 9.39 | | 5.89 | −0.9 | 1.4 | 1.3 | 1.1 |
| Polar, uncharged R groups | | | | | | | | | | |
| Serine | Ser S | 105 | 2.21 | 9.15 | | 5.68 | −0.8 | 6.8 | 6.1 | 5.7 |
| Threonine | Thr T | 119 | 2.11 | 9.62 | | 5.87 | −0.7 | 5.9 | 5.4 | 4.5 |
| Cysteineᵉ | Cys C | 121 | 1.96 | 10.28 | 8.18 | 5.07 | 2.5 | 1.9 | 1.2 | 0.8 |
| Asparagine | Asn N | 132 | 2.02 | 8.80 | | 5.41 | −3.5 | 4.3 | 3.7 | 3.4 |
| Glutamine | Gln Q | 146 | 2.17 | 9.13 | | 5.65 | −3.5 | 4.2 | 4.5 | 2.0 |
| Positively charged R groups | | | | | | | | | | |
| Lysine | Lys K | 146 | 2.18 | 8.95 | 10.53 | 9.74 | −3.9 | 5.9 | 4.7 | 6.8 |
| Histidine | His H | 155 | 1.82 | 9.17 | 6.00 | 7.59 | −3.2 | 2.3 | 2.4 | 1.6 |
| Arginine | Arg R | 174 | 2.17 | 9.04 | 12.48 | 10.76 | −4.5 | 5.1 | 5.6 | 5.9 |
| Negatively charged R groups | | | | | | | | | | |
| Aspartate | Asp D | 133 | 1.88 | 9.60 | 3.65 | 2.77 | −3.5 | 5.3 | 5.1 | 5.0 |
| Glutamate | Glu E | 147 | 2.19 | 9.67 | 4.25 | 3.22 | −3.5 | 6.3 | 6.0 | 8.2 |

The structures of the 20 common amino acids are shown in Figure 3-5, and some of their properties are listed in Table 3-1. Within each class there are gradations of polarity, size, and shape of the R groups.
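The hydropathy index column of Table 3-1 lends itself to a simple calculation: averaging the index over the residues of a peptide gives a rough sense of how hydrophobic that peptide is overall. The Python sketch below is illustrative only. The values are transcribed from Table 3-1, the function name `mean_hydropathy` is hypothetical, and the reading that more positive values indicate more hydrophobic side chains is an assumption here, because the table footnote defining the scale is not reproduced in this section.

```python
# Hydropathy index per residue, transcribed from Table 3-1 (one-letter symbols).
HYDROPATHY = {
    "G": -0.4, "A": 1.8, "P": -1.6, "V": 4.2, "L": 3.8, "I": 4.5, "M": 1.9,
    "F": 2.8, "Y": -1.3, "W": -0.9,
    "S": -0.8, "T": -0.7, "C": 2.5, "N": -3.5, "Q": -3.5,
    "K": -3.9, "H": -3.2, "R": -4.5,
    "D": -3.5, "E": -3.5,
}


def mean_hydropathy(seq: str) -> float:
    """Average hydropathy index over all residues of a one-letter sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must contain at least one residue")
    return sum(HYDROPATHY[res] for res in seq) / len(seq)


# A stretch of aliphatic residues scores high; a stretch of charged residues scores low.
print(mean_hydropathy("AVLI"))
print(mean_hydropathy("KRDE"))
```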
Nonpolar, Aliphatic R Groups The R groups in this class of amino acids are nonpolar and hydrophobic. The side chains of alanine, valine, leucine, and isoleucine tend to cluster together within proteins, stabilizing protein structure through the hydrophobic effect. Glycine has the simplest structure. Although it is most easily grouped with the nonpolar amino acids, its very small side chain makes no real contribution to interactions driven by the hydrophobic effect. Methionine, one of the two sulfur-containing amino acids, has a slightly nonpolar thioether group in its side chain. Proline has an aliphatic side chain with a distinctive cyclic structure. The secondary amino (imino) group of proline residues is held in a rigid conformation that reduces the structural flexibility of polypeptide regions containing proline.
Aromatic R Groups Phenylalanine, tyrosine, and tryptophan, with their aromatic side chains, are relatively nonpolar (hydrophobic). All can contribute to the hydrophobic effect. The hydroxyl group of tyrosine can form hydrogen bonds, and it is an important functional group in some enzymes. Tyrosine and tryptophan are significantly more polar than phenylalanine because of the tyrosine hydroxyl group and the nitrogen of the tryptophan indole ring.
Tryptophan and tyrosine, and to a much lesser extent phenylalanine, absorb ultraviolet light (Fig. 3-6; see also Box 3-1). This accounts for the characteristic strong absorbance of light by most proteins at a wavelength of 280 nm, a property exploited by researchers in the characterization of proteins.
Polar, Uncharged R Groups The R groups of these amino acids are more soluble in water, or more hydrophilic, than those of the nonpolar amino acids because they contain functional groups that form hydrogen bonds with water. This class of amino acids includes serine, threonine, cysteine, asparagine, and glutamine. The polarity of serine and threonine is contributed by their hydroxyl groups, and the polarity of asparagine and glutamine is contributed by their amide groups. Cysteine is an outlier here because its polarity, contributed by its sulfhydryl group, is quite modest. Cysteine is a weak acid and can make weak hydrogen bonds with oxygen or nitrogen.
Asparagine and glutamine are the amides of two other amino acids also found in proteins — aspartate and glutamate, respectively — to which asparagine and glutamine are easily hydrolyzed by acid or base. Cysteine is readily oxidized to form a covalently linked dimeric amino acid called cystine, in which two cysteine molecules or residues are joined by a disulfide bond (Fig. 3-7). The disulfide-linked residues are strongly hydrophobic (nonpolar). Disulfide bonds play a special role in the structures of many proteins by forming covalent links between parts of a polypeptide molecule or between two different polypeptide chains.
[Figure 3-7: Reversible formation of a disulfide bond. Two cysteine molecules, each with a free $\mathrm{-SH}$ group, are oxidized to cystine, in which the two sulfur atoms are bonded to each other, with release of $2\mathrm{H^+} + 2e^-$. The reverse reaction requires the addition of $2\mathrm{H^+} + 2e^-$.]
Positively Charged (Basic) R Groups The most hydrophilic R groups are those that are either positively charged or negatively charged. The amino acids in which the R groups have significant positive charge at pH 7.0 are lysine, which has a second primary amino group at the ε position on its aliphatic chain; arginine, which has a positively charged guanidinium group; and histidine, which has an aromatic imidazole group. As the only common amino acid having an ionizable side chain with a $\mathrm{p}K_a$ near neutrality, histidine may be positively charged (protonated form) or uncharged at pH 7.0. His residues facilitate many enzyme-catalyzed reactions by serving as proton donors/acceptors.
Negatively Charged (Acidic) R Groups The two amino acids having R groups with a net negative charge at pH 7.0 are aspartate and glutamate, each of which has a second carboxyl group.
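The five-class grouping described above can be captured as a small lookup, which makes it easy to summarize the composition of a sequence by R-group class. This is a minimal Python sketch under the class assignments of Table 3-1. As the text notes, the placement of glycine, histidine, and cysteine is a judgment call, and the names `R_GROUP_CLASS` and `class_composition` are hypothetical.

```python
from collections import Counter

# One-letter symbols grouped by R-group class, following Table 3-1.
R_GROUP_CLASS = {
    **dict.fromkeys("GAPVLIM", "nonpolar, aliphatic"),
    **dict.fromkeys("FYW", "aromatic"),
    **dict.fromkeys("STCNQ", "polar, uncharged"),
    **dict.fromkeys("KHR", "positively charged"),
    **dict.fromkeys("DE", "negatively charged"),
}


def class_composition(seq: str) -> Counter:
    """Count the residues of a one-letter sequence that fall in each R-group class."""
    return Counter(R_GROUP_CLASS[res] for res in seq.upper())


composition = class_composition("GAKDE")
for group, count in composition.items():
    print(f"{group}: {count}")
```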
In addition to the 20 common amino acids, proteins may contain residues created by modification of common residues already incorporated into a polypeptide, that is, through postsynthetic modification (Fig. 3-8a). Among these uncommon amino acids are 4-hydroxyproline, a derivative of proline found in the fibrous protein collagen, and γ-carboxyglutamate, found in the blood-clotting protein prothrombin and in certain other proteins that bind $\mathrm{Ca^{2+}}$ as part of their biological function. More complex is desmosine, a derivative of four Lys residues, which is found in the fibrous protein elastin.
[Figure 3-8: Uncommon amino acids. (a) 4-Hydroxyproline, γ-carboxyglutamate, desmosine, selenocysteine, and pyrrolysine. (b) Phosphoserine, phosphothreonine, phosphotyrosine, ω-N-methylarginine, 6-N-acetyllysine, glutamate γ-methyl ester, and adenylyltyrosine; the added groups are highlighted. (c) Ornithine and citrulline.]
Selenocysteine and pyrrolysine are special cases. These rare amino acid residues are not created through a postsynthetic modification. Instead, they are introduced during protein synthesis through an unusual adaptation of the genetic code, which we describe in Chapter 27. Selenocysteine contains selenium rather than the sulfur of cysteine. Actually derived from serine, selenocysteine is a constituent of just a few known proteins. Pyrrolysine is found in a few proteins in several methanogenic (methane-producing) archaea and in one known bacterium; it plays a role in methane biosynthesis.
Some amino acid residues in a protein may be modified transiently to alter the protein’s function. The addition of phosphoryl, methyl, acetyl, adenylyl, ADP-ribosyl, or other groups to particular amino acid residues can increase or decrease a protein’s activity (Fig. 3-8b). Phosphorylation is a particularly common regulatory modification. Covalent modification as a protein regulatory strategy is discussed in more detail in Chapter 6.
Some 300 additional amino acids have been found in cells. They have a variety of functions, but not all are constituents of proteins. Ornithine and citrulline (Fig. 3-8c) deserve special note because they are key intermediates (metabolites) in the biosynthesis of arginine (Chapter 22) and in the urea cycle (Chapter 18).
The amino and carboxyl groups of amino acids, along with the ionizable R groups of some amino acids, function as weak acids and bases. When an amino acid lacking an ionizable R group is dissolved in water at neutral pH, the α-amino and carboxyl groups create a dipolar ion, or zwitterion (German for “hybrid ion”), which can act as either an acid or a base (Fig. 3-9). Substances having this dual (acid-base) nature are amphoteric and are often called ampholytes (from “amphoteric electrolytes”). A simple monoamino monocarboxylic α-amino acid, such as alanine, is a diprotic acid when fully protonated; it has two groups, the $\mathrm{-COOH}$ group and the $\mathrm{-NH_3^+}$ group, that can yield protons:
$$\underset{\text{net charge: } +1}{\mathrm{R{-}CH(NH_3^+){-}COOH}} \;\rightleftharpoons\; \underset{0}{\mathrm{R{-}CH(NH_3^+){-}COO^-}} + \mathrm{H^+} \;\rightleftharpoons\; \underset{-1}{\mathrm{R{-}CH(NH_2){-}COO^-}} + 2\,\mathrm{H^+}$$
[Figure 3-9: Nonionic and zwitterionic forms of an amino acid. In the nonionic form the α carbon carries $\mathrm{-COOH}$ and $\mathrm{-NH_2}$; in the zwitterionic form it carries $\mathrm{-COO^-}$ and $\mathrm{-NH_3^+}$. Zwitterion as acid: the $\mathrm{-NH_3^+}$ group reversibly releases a proton to give $\mathrm{-NH_2}$. Zwitterion as base: the $\mathrm{-COO^-}$ group reversibly accepts a proton to give $\mathrm{-COOH}$.]
Acid-base titration involves the gradual addition or removal of protons (Chapter 2). Figure 3-10 shows the titration curve of the diprotic form of glycine. The two ionizable groups of glycine, the carboxyl group and the amino group, are titrated with a strong base such as NaOH. The plot has two distinct stages, corresponding to deprotonation of two different groups on glycine. Each of the two stages resembles in shape the titration curve of a monoprotic acid, such as acetic acid (see Fig. 2-16), and can be analyzed in the same way. At very low pH, the predominant ionic species of glycine is the fully protonated form, ${}^{+}\mathrm{H_3N{-}CH_2{-}COOH}$.
[Figure 3-10: Titration of glycine. pH (vertical axis, 0 to 13) is plotted against $\mathrm{OH^-}$ equivalents added (horizontal axis, 0 to 2). The fully protonated form ${}^{+}\mathrm{H_3N{-}CH_2{-}COOH}$ is converted to ${}^{+}\mathrm{H_3N{-}CH_2{-}COO^-}$ ($\mathrm{p}K_1$) and then to $\mathrm{H_2N{-}CH_2{-}COO^-}$ ($\mathrm{p}K_2$). Shaded buffering regions are centered on $\mathrm{p}K_1 = 2.34$ at 0.5 equivalent and $\mathrm{p}K_2 = 9.60$ at 1.5 equivalents; the steep section at 1 equivalent is marked pI = 5.97. All data are approximate.]
In the first stage of the titration, the $\mathrm{-COOH}$ group of glycine (with its lower $\mathrm{p}K_a$) loses its proton. At the midpoint of this stage, equimolar concentrations of the proton-donor (${}^{+}\mathrm{H_3N{-}CH_2{-}COOH}$) and the proton-acceptor (${}^{+}\mathrm{H_3N{-}CH_2{-}COO^-}$) species are present. As in the titration of any weak acid, a point of inflection is reached at this midpoint where the pH is equal to the $\mathrm{p}K_a$ of the protonated group that is being titrated (see Fig. 2-17). For glycine, the pH at the midpoint is 2.34; thus its $\mathrm{-COOH}$ group has a $\mathrm{p}K_a$ (labeled $\mathrm{p}K_1$ in Fig. 3-10) of 2.34. (Recall from Chapter 2 that pH and $\mathrm{p}K_a$ are simply convenient notations for proton concentration and the equilibrium constant for ionization, respectively. The $\mathrm{p}K_a$ is a measure of the tendency of a group to give up a proton, with that tendency decreasing 10-fold as the $\mathrm{p}K_a$ increases by one unit.) As the titration of glycine proceeds, another point of inflection is reached at pH 5.97; at this point, removal of the first proton is essentially complete and removal of the second has just begun. At this pH, glycine is present largely as the dipolar ion (zwitterion) ${}^{+}\mathrm{H_3N{-}CH_2{-}COO^-}$. We shall return to the significance of this inflection point in the titration curve (labeled pI in Fig. 3-10) shortly.
The second stage of the titration corresponds to the removal of a proton from the $\mathrm{-NH_3^+}$ group of glycine. The pH at the midpoint of this stage is 9.60, equal to the $\mathrm{p}K_a$ (labeled $\mathrm{p}K_2$ in Fig. 3-10) for the $\mathrm{-NH_3^+}$ group. The titration is essentially complete at a pH of about 12, at which point the predominant form of glycine is $\mathrm{H_2N{-}CH_2{-}COO^-}$.
From the titration curve of glycine we can derive several important pieces of information. First, it gives a quantitative measure of the $\mathrm{p}K_a$ of each of the two ionizing groups: 2.34 for the $\mathrm{-COOH}$ group and 9.60 for the $\mathrm{-NH_3^+}$ group. Note that the carboxyl group of glycine is over 100 times more acidic (more easily ionized) than the carboxyl group of acetic acid, which, as we saw in Chapter 2, has a $\mathrm{p}K_a$ of 4.76, about average for a carboxyl group attached to an otherwise unsubstituted aliphatic hydrocarbon. The perturbed $\mathrm{p}K_a$ of glycine is caused primarily by the nearby positively charged amino group on the α-carbon atom, an electronegative group that tends to pull electrons toward it (a process called electron withdrawal), as described in Figure 3-11. The opposite charges on the resulting zwitterion are also somewhat stabilizing. Similarly, the $\mathrm{p}K_a$ of the amino group in glycine is perturbed downward relative to the average $\mathrm{p}K_a$ of an amino group. This effect is due largely to electron withdrawal by the electronegative oxygen atoms in the carboxyl groups, increasing the tendency of the amino group to give up a proton. Hence, the α-amino group has a $\mathrm{p}K_a$ that is lower than that of an aliphatic amine such as methylamine (Fig. 3-11). In short, the $\mathrm{p}K_a$ of any functional group is greatly affected by its chemical environment, a phenomenon sometimes exploited in the active sites of enzymes to promote exquisitely adapted reaction mechanisms that depend on the perturbed $\mathrm{p}K_a$ values of proton donor/acceptor groups of specific residues.
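The "over 100 times more acidic" comparison follows directly from the two $\mathrm{p}K_a$ values quoted above, because each $\mathrm{p}K_a$ unit corresponds to a 10-fold change in the ionization constant. A short worked version, using only those two numbers and the definition $\mathrm{p}K_a = -\log K_a$ from Chapter 2:

$$\frac{K_a(\text{glycine carboxyl})}{K_a(\text{acetic acid})} = \frac{10^{-2.34}}{10^{-4.76}} = 10^{\,4.76 - 2.34} = 10^{2.42} \approx 2.6 \times 10^{2}$$

On this estimate the glycine carboxyl group ionizes roughly 260 times more readily than that of acetic acid, consistent with the statement in the text.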
[Figure 3-11: Effect of the chemical environment on $\mathrm{p}K_a$, shown along a $\mathrm{p}K_a$ scale from 0 to 12. Methyl-substituted carboxyl and amino groups: in acetic acid, the normal $\mathrm{p}K_a$ for a carboxyl group is about 4.8; in methylamine, the normal $\mathrm{p}K_a$ for an amino group is about 10.6. Carboxyl and amino groups in glycine: the carboxyl group has $\mathrm{p}K_a = 2.34$, because the protonated amino group withdraws electrons from the carboxyl group, lowering its $\mathrm{p}K_a$; the amino group has $\mathrm{p}K_a = 9.60$, because electronegative oxygen atoms in the carboxyl group withdraw electrons from the amino group, lowering its $\mathrm{p}K_a$.]
The second piece of information provided by the titration curve of glycine is that this amino acid has two regions of buffering power. One of these is the relatively flat portion of the curve, extending for approximately 1 pH unit on either side of the first $\mathrm{p}K_a$ of 2.34, indicating that glycine is a good buffer near this pH. The other buffering zone is centered around pH 9.60. (Note that glycine is not a good buffer at the pH of intracellular fluid or blood, about 7.4.) Within the buffering ranges of glycine, the Henderson-Hasselbalch equation (p. 60) can be used to calculate the proportions of proton-donor and proton-acceptor species of glycine required to make a buffer at a given pH.
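As an illustration of that calculation, the sketch below applies the Henderson-Hasselbalch relation, $\mathrm{pH} = \mathrm{p}K_a + \log([\text{acceptor}]/[\text{donor}])$, to the second buffering zone of glycine. It is a minimal Python example rather than a prescribed procedure; the function names are hypothetical, and the target pH of 9.0 is an arbitrary choice for demonstration.

```python
def acceptor_to_donor_ratio(ph: float, pka: float) -> float:
    """Ratio [proton acceptor]/[proton donor] from the Henderson-Hasselbalch equation."""
    return 10.0 ** (ph - pka)


def acceptor_fraction(ph: float, pka: float) -> float:
    """Fraction of the titratable group present as the proton-acceptor form."""
    ratio = acceptor_to_donor_ratio(ph, pka)
    return ratio / (1.0 + ratio)


PK2_GLYCINE = 9.60  # pKa of the amino group of glycine (Table 3-1)
TARGET_PH = 9.0     # arbitrary example pH inside the second buffering zone

ratio = acceptor_to_donor_ratio(TARGET_PH, PK2_GLYCINE)
print(f"[acceptor]/[donor] at pH {TARGET_PH}: {ratio:.3f}")
print(f"fraction as acceptor: {acceptor_fraction(TARGET_PH, PK2_GLYCINE):.3f}")
```

At a pH equal to the $\mathrm{p}K_a$ the ratio is 1 and the two species are equimolar, which matches the midpoint behavior described for the titration curve.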
A final important piece of information derived from the titration curve of an amino acid is the relationship between its net charge and the pH of the solution. At pH 5.97, the point of inflection between the two stages in its titration curve, glycine is present predominantly as its dipolar form, fully ionized but with no net electric charge (Fig. 3-10). The characteristic pH at which the net electric charge is zero is called the isoelectric point or isoelectric pH, designated pI. For glycine, which has no ionizable group in its side chain, the isoelectric point is simply the arithmetic mean of the two $\mathrm{p}K_a$ values:

$$\mathrm{pI} = \tfrac{1}{2}\,(\mathrm{p}K_1 + \mathrm{p}K_2) = \tfrac{1}{2}\,(2.34 + 9.60) = 5.97$$
As is evident in Figure 3-10, glycine has a net negative charge at any pH above its pI and thus will move toward the positive electrode (the anode) when placed in an electric field. At any pH below its pI, glycine has a net positive charge and will move toward the negative electrode (the cathode). The farther the pH of a glycine solution is from its isoelectric point, the greater the net electric charge of the population of glycine molecules. At pH 1.0, for example, glycine exists almost entirely as the form +H3N-CH2-COOH, with a net positive charge of 1.0. At pH 2.34, where there is an equal mixture of +H3N-CH2-COOH and +H3N-CH2-COO−, the average or net positive charge is 0.5. The sign and the magnitude of the net charge of any amino acid at any pH can be predicted in the same way.
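The prediction can be sketched numerically. The Python function below estimates the average net charge of glycine at a given pH by treating the carboxyl and amino groups as independently ionizing, each obeying the Henderson-Hasselbalch relation. That independence is a simplifying assumption, and the function name `glycine_net_charge` is hypothetical.

```python
def glycine_net_charge(ph: float, pk1: float = 2.34, pk2: float = 9.60) -> float:
    """Average net charge of glycine at a given pH.

    Assumes the two groups ionize independently (a simplification):
      - the carboxyl group contributes -1 when deprotonated (pKa = pk1)
      - the amino group contributes +1 when protonated (pKa = pk2)
    """
    carboxyl = -1.0 / (1.0 + 10 ** (pk1 - ph))   # fraction deprotonated, times -1
    amino = 1.0 / (1.0 + 10 ** (ph - pk2))       # fraction protonated, times +1
    return carboxyl + amino


# pH values discussed in the text, plus one above the pI.
for ph in (1.0, 2.34, 5.97, 11.0):
    print(f"pH {ph:5.2f}: net charge {glycine_net_charge(ph):+.2f}")
```

Under these assumptions the charge is close to +1 at pH 1.0, +0.5 at pH 2.34, essentially zero at the pI of 5.97, and negative above the pI, matching the qualitative behavior described above.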
The shared properties of many amino acids permit some simplifying generalizations about their acid-base behaviors. First, all amino acids with a single α-amino group, a single α-carboxyl group, and an R group that does not ionize have titration curves resembling that of glycine (Fig. 3-10). These amino acids have very similar, although not identical, pKa values: the pKa of the -COOH group is in the range of 1.8 to 2.4, and the pKa of the -NH3+ group is in the range of 8.8 to 11.0 (Table 3-1). The differences in these pKa values reflect the chemical environments imposed by their R groups.
Second, amino acids with an ionizable R group have more complex titration curves, with three stages corresponding to the three possible ionization steps; thus, they have three pKa values. The additional stage for the titration of the ionizable R group merges to some extent with that for the titration of the α-carboxyl group, the titration of the α-amino group, or both. The titration curves for two amino acids of this type, glutamate and histidine, are shown in Figure 3-12. The isoelectric points reflect the nature of the ionizing R groups that are present. For example, glutamate has a pI of 3.22, considerably lower than that of glycine. This is due to the presence of two carboxyl groups, which, at the average of their pKa values (3.22), contribute a net charge of −1 that balances the +1 contributed by the amino group. Similarly, the pI of histidine, with two groups that are positively charged when protonated, is 7.59 (the average of the pKa values of the amino and imidazole groups), much higher than that of glycine.
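The averaging rule used for glutamate and histidine can be written as a short routine: sort the pKa values, find the species with zero net charge, and average the two pKa values on either side of it. The sketch below is illustrative only. The helper name `isoelectric_point` is hypothetical, and the glutamate side-chain pKa of 4.25 is inferred from the stated pI and the α-carboxyl pKa (2 × 3.22 − 2.19 = 4.25) rather than quoted from this passage.

```python
def isoelectric_point(pkas, fully_protonated_charge):
    """Average the two pKa values that flank the zero-net-charge species.

    pkas: pKa values of all ionizable groups.
    fully_protonated_charge: net charge when every group is protonated
        (+1 for glycine and glutamate, +2 for histidine).
    Each proton lost lowers the net charge by 1, so the neutral species
    appears after `fully_protonated_charge` deprotonation steps.
    """
    ordered = sorted(pkas)
    k = fully_protonated_charge
    if not 1 <= k < len(ordered):
        raise ValueError("no neutral species flanked by two pKa values")
    return (ordered[k - 1] + ordered[k]) / 2


examples = {
    "glycine":   ([2.34, 9.60], 1),
    "glutamate": ([2.19, 4.25, 9.67], 1),   # 4.25 inferred, see note above
    "histidine": ([1.82, 6.0, 9.17], 2),
}

for name, (pkas, charge) in examples.items():
    print(f"{name}: pI = {isoelectric_point(pkas, charge):.2f}")
```

For glutamate the neutral species lies between the two carboxyl ionizations, so the amino pKa does not enter the average. For histidine it lies between the imidazole and amino ionizations, so the carboxyl pKa does not.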
[Figure 3-12: Titration curves for (a) glutamate and (b) histidine. Horizontal axis: OH− added (equivalents), 0 to 3.0; vertical axis: pH. (a) Glutamate: pK1 = 2.19, pKR = 4.25, pK2 = 9.67; pI = 3.22; net charge runs from +1 (fully protonated) to −2. (b) Histidine: pK1 = 1.82, pKR = 6.0, pK2 = 9.17; pI = 7.59; net charge runs from +2 (fully protonated) to −1. All data are approximate.]
Finally, in an aqueous environment, only histidine has an R group providing significant buffering power near the neutral pH usually found in the intracellular and extracellular fluids of most animals and bacteria (Table 3-1).