Biological information is expensive. We have discussed the extraordinary energetic cost of replication, transcription, and translation in previous chapters. The ATP requirements of converting information from a gene in a chromosome to protein impose a need for efficiency, explaining the pervasive and often complex regulation of the expression of every gene.
Of the 4,000 or so genes in the typical bacterial genome, or the 20,000 genes in the human genome, only a fraction are expressed in a cell at any given time. Some gene products are present in very large amounts: the elongation factors required for protein synthesis, for example, are among the most abundant proteins in bacteria, and ribulose 1,5-bisphosphate carboxylase/oxygenase (rubisco) of plants and photosynthetic bacteria is one of the most abundant enzymes in the biosphere. Other gene products occur in much smaller amounts; for instance, a cell may contain only a few molecules of the enzymes that repair rare DNA lesions. Requirements for some gene products change over time. The need for enzymes in certain metabolic pathways may wax and wane as food sources change or are depleted. During development of a multicellular organism, some proteins that influence cellular differentiation are present for just a brief time in only a few cells. Specialization of cellular function can greatly affect the need for various gene products; an example is the uniquely high concentration of a single protein — hemoglobin — in erythrocytes. It is clear from these examples that the appearance of gene products must be regulated. Our exploration of the regulation of gene expression is once again guided by multiple principles:
The cellular concentration of a protein is determined by a delicate balance of at least seven processes, each having several potential points of regulation. These processes include synthesis of the primary RNA transcript (transcription); posttranscriptional modification of mRNA; degradation of mRNA; protein synthesis (translation); posttranslational modification of proteins; protein targeting and transport; and protein degradation.
Regulation is achieved by specialized proteins and RNAs. The proteins are usually ligand-binding proteins with no other function. They bind to specific sequences in DNA or RNA. They respond to molecular signals that can be any kind of biological molecule. The RNAs either interact with other RNAs or serve as protein cofactors.
Regulated gene expression may bring about increases or decreases in the amount of a gene product. Gene products that increase in concentration under particular molecular circumstances are referred to as inducible; the process of increasing their expression is induction. Conversely, gene products that decrease in concentration in response to a molecular signal are referred to as repressible, and the process is called repression.
The default transcriptional state of a gene, on or off, is dictated in part by the size and complexity of the genome. In bacteria, where genomes are relatively small and DNA is readily accessible, the default state of genes is generally “on.” Transcription of each gene or gene cluster is usually limited by a specific protein repressor. In eukaryotes, where genomes are larger and genes are encapsulated in chromatin, the default state of most genes is “off.” Gene transcription requires chromatin modification followed by the action of transcription activators.
Regulation is expensive. For many genes, especially in eukaryotes, the regulatory processes can require a considerable investment of chemical energy. That expenditure is nevertheless small when compared to the cost of RNA and protein synthesis when the gene is expressed.
The steps required to generate and then remove an active protein or RNA, all of which may be regulated, are summarized in Figure 28-1. We have examined several of the relevant regulatory mechanisms in previous chapters. Posttranscriptional modification of mRNA, by processes such as alternative splicing (see Fig. 26-20) or RNA editing (see Figs. 27-10 and 27-12), can affect which proteins are produced from an mRNA transcript and in what amounts. A variety of nucleotide sequences in an mRNA can affect the rate of its degradation (p. 986). Many factors affect the rate at which an mRNA is translated into a protein, as well as the posttranslational modification, targeting, and eventual degradation of that protein (Chapter 27).
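The idea that a protein's concentration reflects a balance between synthesis and degradation can be made concrete with a deliberately simplified sketch. The Python example below assumes a two-stage model in which mRNA is made at a constant rate and degraded in proportion to its amount, and protein is made in proportion to mRNA and degraded in proportion to its own amount. The function names, the rate constants, and their numerical values are hypothetical and chosen only for illustration; the model ignores posttranscriptional and posttranslational modification, targeting, and every other step summarized in Figure 28-1.

```python
def steady_state_levels(k_transcription, k_translation,
                        k_mrna_decay, k_protein_decay):
    """Steady-state (mRNA, protein) amounts for an assumed two-stage model.

    Assumed rate equations (illustrative, not taken from the text):
        d[mRNA]/dt    = k_transcription - k_mrna_decay * [mRNA]
        d[protein]/dt = k_translation * [mRNA] - k_protein_decay * [protein]
    Setting both derivatives to zero gives the expressions below.
    """
    mrna = k_transcription / k_mrna_decay
    protein = k_translation * mrna / k_protein_decay
    return mrna, protein


def simulate(k_transcription, k_translation,
             k_mrna_decay, k_protein_decay, t_end, dt=0.01):
    """Integrate the same equations with a simple forward Euler scheme,
    starting from a cell that contains no mRNA and no protein."""
    mrna = 0.0
    protein = 0.0
    for _ in range(round(t_end / dt)):
        d_mrna = k_transcription - k_mrna_decay * mrna
        d_protein = k_translation * mrna - k_protein_decay * protein
        mrna += d_mrna * dt
        protein += d_protein * dt
    return mrna, protein


if __name__ == "__main__":
    # Hypothetical rate constants in arbitrary units.
    baseline = steady_state_levels(2.0, 10.0, 0.5, 0.25)
    more_transcription = steady_state_levels(4.0, 10.0, 0.5, 0.25)
    slower_protein_decay = steady_state_levels(2.0, 10.0, 0.5, 0.125)
    print("baseline protein level:        ", baseline[1])
    print("doubled transcription rate:    ", more_transcription[1])
    print("halved protein degradation:    ", slower_protein_decay[1])
    print("simulated levels at t = 100:   ", simulate(2.0, 10.0, 0.5, 0.25, 100.0))
```

In this sketch, doubling the rate of transcription and halving the rate of protein degradation raise the steady-state protein level by the same factor, which illustrates why each step between gene and degraded protein is a potential point of regulation.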
Of the regulatory processes illustrated in Figure 28-1, those operating at the level of transcription initiation are particularly well documented. These processes are a major focus of this chapter, although we also consider other mechanisms. As noted in earlier chapters, the complexity of an organism is not reflected in the number of its protein-coding genes. Instead, as complexity increases from bacteria to mammals, mechanisms of gene regulation become more elaborate, and posttranscriptional and translational regulation play greater roles.
Control of transcription initiation permits the synchronized regulation of multiple genes encoding products with interdependent activities. For example, when their DNA is heavily damaged, bacterial cells require a coordinated increase in the levels of the many DNA repair enzymes. And perhaps the most sophisticated form of coordination occurs in the complex regulatory circuits that guide the development of multicellular eukaryotes, which can include many types of regulatory mechanisms.
We begin by examining the interactions between proteins and DNA that are the key to transcriptional regulation. We next discuss the specific proteins that influence the expression of specific genes, first in bacterial and then in eukaryotic cells. Information about posttranscriptional and translational controls is included in the discussion, where relevant, to provide a more complete overview of the rich complexity of cellular regulation.