Nothing in biology makes sense except in the light of evolution.
— Theodosius Dobzhansky, in The American Biology Teacher, March 1973
Great progress in biochemistry and molecular biology in recent decades has amply confirmed the validity of Dobzhansky’s striking generalization. The remarkable similarity of metabolic pathways and gene sequences across the three domains of life argues strongly that all modern organisms are derived from a common evolutionary progenitor by a series of small changes (mutations), each of which conferred a selective advantage to some organism in some ecological niche.
Despite the near-perfect fidelity of genetic replication, infrequent unrepaired mistakes in the DNA replication process lead to changes in the nucleotide sequence of DNA, producing a genetic mutation and changing the instructions for a cellular component. Incorrectly repaired damage to one of the DNA strands has the same effect. Mutations in the DNA handed down to offspring — that is, mutations carried in the reproductive cells — may be harmful or even lethal to the new organism or cell; they may, for example, cause the synthesis of a defective enzyme that is not able to catalyze an essential metabolic reaction. Occasionally, however, a mutation better equips an organism or cell to survive in its environment (Fig. 1-32). The mutant enzyme might have acquired a slightly different specificity, for example, so that it is now able to use some compound that the cell was previously unable to metabolize. If a population of cells were to find itself in an environment where that compound was the only or the most abundant available source of fuel, the mutant cell would have a selective advantage over the other, unmutated (wild-type) cells in the population. The mutant cell and its progeny would survive and prosper in the new environment, whereas wild-type cells would starve and be eliminated. This is what Charles Darwin meant by natural selection — what is sometimes summarized as “survival of the fittest.”
Occasionally, a second copy of a whole gene is introduced into the chromosome as a result of defective replication of the chromosome. The second copy is superfluous, and mutations in this gene will not be deleterious; it becomes a means by which the cell may evolve, by producing a new gene with a new function while retaining the original gene and gene function. Seen in this light, the DNA molecules of modern organisms are historical documents, records of the long journey from the earliest cells to modern organisms. The historical accounts in DNA are not complete, however; in the course of evolution, many mutations must have been erased or written over. But DNA molecules are the best source of biological history we have. The frequency of errors in DNA replication represents a balance between too many errors, which would yield nonviable daughter cells, and too few errors, which would prevent the genetic variation that allows survival of mutant cells in new ecological niches.
Several billion years of natural selection have refined cellular systems to take maximum advantage of the chemical and physical properties of available raw materials. Chance genetic mutations occurring in individuals in a population, combined with natural selection, have resulted in the evolution of the enormous variety of species we see today, each adapted to its particular ecological niche.
In our account thus far, we have passed over the first chapter of the story of evolution: the appearance of the first living cell. Apart from their occurrence in living organisms, organic compounds, including the basic biomolecules such as amino acids and carbohydrates, are found in only trace amounts in the Earth’s crust, the sea, and the atmosphere. How did the first living organisms acquire their characteristic organic building blocks? According to one hypothesis, these compounds were created by the effects of powerful environmental forces — ultraviolet irradiation, lightning, or volcanic eruptions — on the gases in the prebiotic Earth’s atmosphere and on inorganic solutes in superheated thermal vents deep in the ocean.
This hypothesis was tested in a classic experiment on the abiotic (nonbiological) origin of organic biomolecules carried out in 1953 by biochemist Stanley Miller in the laboratory of the physical chemist Harold Urey. Miller subjected gaseous mixtures such as those presumed to exist on the prebiotic Earth, including NH₃, CH₄, H₂O, and H₂, to electrical sparks produced across a pair of electrodes (to simulate lightning) for periods of a week or more, then analyzed the contents of the closed reaction vessel (Fig. 1-33). The gas phase of the resulting mixture contained CO and CO₂ as well as the starting materials. The water phase contained a variety of organic compounds, including some amino acids, hydroxy acids, aldehydes, and hydrogen cyanide (HCN). This experiment established the possibility of abiotic production of biomolecules in relatively short times under relatively mild conditions. When Miller’s carefully stored samples were rediscovered in 2010 and examined with much more sensitive and discriminating techniques (high-performance liquid chromatography and mass spectrometry), his original observations were confirmed and greatly broadened. Previously unpublished experiments by Miller that included H₂S in the gas mixture (mimicking the “smoking” volcanic plumes at the sea bottom; Fig. 1-34) showed the formation of 23 amino acids and 7 organosulfur compounds, as well as a large number of other simple compounds that might have served as building blocks in prebiotic evolution.
More-refined laboratory experiments have provided good evidence that many of the chemical components of living cells can form under these conditions. Polymers of the nucleic acid RNA (ribonucleic acid) can act as catalysts in biologically significant reactions (see Chapters 26 and 27), and RNA probably played a crucial role in prebiotic evolution, both as catalyst and as information repository.
In modern organisms, nucleic acids encode the genetic information that specifies the structure of enzymes, and enzymes catalyze the replication and repair of nucleic acids. The mutual dependence of these two classes of biomolecules brings up the perplexing question: which came first, DNA or protein?
The answer may be that they appeared about the same time, and RNA preceded them both. The discovery that RNA molecules can act as catalysts in their own formation suggests that RNA or a similar molecule may have been the first gene and the first catalyst. According to this scenario (Fig. 1-35), one of the earliest stages of biological evolution was the chance formation of an RNA molecule that could catalyze the formation of other RNA molecules of the same sequence — a self-replicating, self-perpetuating RNA. The concentration of a self-replicating RNA molecule would increase exponentially, as one molecule formed several, several formed many, and so on. The fidelity of self-replication was presumably less than perfect, so the process would generate variants of the RNA, some of which might be even better able to self-replicate. In the competition for nucleotides, the most efficient of the self-replicating sequences would win, and less efficient replicators would fade from the population.
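The competition among self-replicating RNAs described above can be illustrated with a toy calculation. The Python sketch below is not taken from any experiment; it assumes discrete generations, two hypothetical RNA variants with invented copying efficiencies, and a fixed nucleotide supply (`pool`) that caps the total number of copies made per generation. Under those assumptions, both variants first grow exponentially, and once nucleotides become limiting the more efficient replicator takes over the population.

```python
def compete(efficiencies, counts, pool, generations):
    """Toy model of RNA replicators sharing a limited nucleotide pool.

    efficiencies: copies each molecule could make per generation if
                  nucleotides were unlimited (hypothetical values)
    counts:       starting number of molecules of each variant
    pool:         maximum total copies the nucleotide supply allows
                  per generation (an assumption of this sketch)
    Returns a list of per-generation counts, starting with generation 0.
    """
    counts = [float(n) for n in counts]
    history = [counts[:]]
    for _ in range(generations):
        # Each variant claims nucleotides in proportion to efficiency x abundance.
        demand = [e * n for e, n in zip(efficiencies, counts)]
        total = sum(demand)
        if total == 0:
            break
        # Growth is exponential until total demand exceeds the pool.
        made = min(total, pool)
        counts = [made * d / total for d in demand]
        history.append(counts[:])
    return history


def fractions(counts):
    """Convert molecule counts to population fractions."""
    total = sum(counts)
    return [n / total for n in counts]


# Two hypothetical variants: one doubles, one triples each generation.
history = compete([2.0, 3.0], [1, 1], pool=1e6, generations=40)
for gen in (0, 5, 10, 20, 40):
    slow, fast = fractions(history[gen])
    print(f"generation {gen:2d}: slow {slow:.4f}, fast {fast:.4f}")
```

The model ignores mutation, so no new variants appear; adding a small copying error rate would be the natural next step toward the imperfect fidelity mentioned in the text.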
The division of function between DNA (genetic information storage) and protein (catalysis) was, according to the “RNA world” hypothesis, a later development. New variants of self-replicating RNA molecules developed that had the additional ability to catalyze the condensation of amino acids into peptides. Occasionally, the peptide(s) thus formed would reinforce the self-replicating ability of the RNA, and the pair — RNA molecule and helping peptide — could undergo further modifications in sequence, generating increasingly efficient self-replicating systems. The remarkable discovery that in the protein-synthesizing machinery of modern cells (ribosomes), RNA molecules, not proteins, catalyze the formation of peptide bonds is consistent with the RNA world hypothesis.
Some time after the evolution of this primitive protein-synthesizing system, there was a further development: DNA molecules with sequences complementary to the self-replicating RNA molecules took over the function of conserving the “genetic” information, and RNA molecules evolved to play roles in protein synthesis. (We explain in Chapter 8 why DNA is a more stable molecule than RNA and thus a better repository of inheritable information.) Proteins proved to be versatile catalysts and, over time, took over most of that function. Lipidlike compounds in the primordial mixture formed relatively impermeable layers around self-replicating collections of molecules. The concentration of proteins and nucleic acids within these lipid enclosures favored the molecular interactions required in self-replication.
The RNA world scenario is intellectually satisfying, but it leaves unanswered a vexing question: where did the nucleotides needed to make the initial RNA molecules come from? An alternative to this scenario supposes that simple metabolic pathways evolved first, perhaps at the hot vents in the ocean floor. A set of linked chemical reactions there might have produced precursors, including nucleotides, before the advent of lipid membranes or RNA. Without more experimental evidence, neither of these hypotheses can be disproved.
Earth was formed about 4.6 billion years ago, and the first evidence of life dates to more than 3.5 billion years ago (see the timeline in Figure 1-36). In 1996, scientists working in Greenland found chemical evidence of life (“fossil molecules”) from as far back as 3.85 billion years ago, forms of carbon embedded in rock that seem to have a distinctly biological origin. Somewhere on Earth during its first billion years the first simple organism arose, capable of replicating its own structure from a template (RNA?) that was the first genetic material. Because the terrestrial atmosphere at the dawn of life was nearly devoid of oxygen, and because there were few microorganisms to scavenge organic compounds formed by natural processes, these compounds were relatively stable. Given this stability and eons of time, the improbable became inevitable: lipid vesicles containing organic compounds and self-replicating RNA gave rise to the first cells, or protocells, and those protocells with the greatest capacity for self-replication became more numerous. The process of biological evolution had begun.
The earliest cells arose in a reducing atmosphere (there was no oxygen) and probably obtained energy from inorganic fuels such as ferrous sulfide and ferrous carbonate, both abundant on the early Earth. For example, the reaction
FeS + H₂S → FeS₂ + H₂
yields enough energy to drive the synthesis of ATP or similar compounds. The organic compounds these early cells required may have arisen by the nonbiological actions of lightning or of heat from volcanoes or thermal vents in the sea on components of the early atmosphere such as CO, CO₂, N₂, NH₃, and CH₄. An alternative source of organic compounds has been proposed: extraterrestrial space. Space missions in 2006 (the NASA Stardust space probe) and 2014 (the European Space Agency lander Philae) found particles of comet dust containing the simple amino acid glycine and 20 other organic compounds capable of reacting to form biomolecules.
Early unicellular organisms gradually acquired the ability to derive energy from compounds in their environment and to use that energy to synthesize more of their own precursor molecules, thereby becoming less dependent on outside sources. A very significant evolutionary event was the development of pigments capable of capturing the energy of light from the sun, which could be used to reduce, or “fix,” CO₂ to form more complex, organic compounds. The original electron donor for these photosynthetic processes was probably H₂S, yielding elemental sulfur or sulfate as the byproduct. Some hydrothermal vents in the sea bottom (black smokers; Fig. 1-34) emit significant amounts of H₂, which is another possible electron donor in the metabolism of the earliest organisms. Later cells developed the enzymatic capacity to use H₂O as the electron donor in photosynthetic reactions, producing O₂ as waste. Cyanobacteria are the modern descendants of these early photosynthetic oxygen-producers.
Because the atmosphere of Earth in the earliest stages of biological evolution was nearly devoid of oxygen, the earliest cells were anaerobic. Under these conditions, chemotrophs could oxidize organic compounds to CO₂ by passing electrons not to O₂ but to acceptors such as SO₄²⁻, in this case yielding H₂S as the product. With the rise of O₂-producing photosynthetic bacteria, the atmosphere became progressively richer in oxygen, a powerful oxidant and a deadly poison to anaerobes. Responding to the evolutionary pressure of what evolutionary theorist and biologist Lynn Margulis and science writer Dorion Sagan called the “oxygen holocaust,” some lineages of microorganisms gave rise to aerobes that obtained energy by passing electrons from fuel molecules to oxygen. Because the transfer of electrons from organic molecules to O₂ releases a great deal of energy, aerobic organisms had an energetic advantage over their anaerobic counterparts when both competed in an environment containing oxygen. This advantage translated into the predominance of aerobic organisms in O₂-rich environments.
Modern bacteria and archaea inhabit almost every ecological niche in the biosphere, and there are organisms capable of using virtually every type of organic compound as a source of carbon and energy. Photosynthetic microbes in both fresh and marine waters trap solar energy and use it to generate carbohydrates and all other cell constituents, which are in turn used as food by other forms of life. The process of evolution continues — and, in rapidly reproducing bacterial cells, on a time scale that allows us to witness it in the laboratory.
Starting about 1.5 billion years ago, the fossil record begins to show evidence of larger and more complex organisms, probably the earliest eukaryotic cells (see Fig. 1-36). Details of the evolutionary path from non-nucleated to nucleated cells cannot be deduced from the fossil record alone, but morphological and biochemical comparisons of modern organisms have suggested a sequence of events consistent with the fossil evidence.
Three major changes must have occurred. First, as cells acquired more DNA, the mechanisms required to fold it compactly into discrete complexes with specific proteins and to divide it equally between daughter cells at cell division became more elaborate. Specialized proteins were required to stabilize folded DNA and to pull the resulting DNA-protein complexes (chromosomes) apart during cell division. This was the evolution of the chromosome. Second, as cells became larger, a system of intracellular membranes developed, including a double membrane surrounding the DNA. This membrane segregated the nuclear process of RNA synthesis on a DNA template from the cytoplasmic process of protein synthesis on ribosomes. This was the evolution of the nucleus, a defining feature of eukaryotes. Third, early eukaryotic cells, which were incapable of photosynthesis or aerobic metabolism, enveloped aerobic bacteria or photosynthetic bacteria to form endosymbiotic associations that eventually became permanent (Fig. 1-37). Some aerobic bacteria evolved into the mitochondria of modern eukaryotes, and some photosynthetic cyanobacteria became the plastids, such as the chloroplasts of green algae, the likely ancestors of modern plant cells.
At some later stage of evolution, unicellular organisms found it advantageous to cluster together, thereby acquiring greater motility, efficiency, or reproductive success than their free-living single-celled competitors. Further evolution of such clustered organisms led to permanent associations among individual cells and eventually to specialization within the colony — to cellular differentiation.
The advantages of cellular specialization led to the evolution of increasingly complex and highly differentiated organisms, in which some cells carried out the sensory functions; others the digestive, photosynthetic, or reproductive functions; and so forth. Many modern multicellular organisms contain hundreds of different cell types, each specialized for a function that supports the entire organism. Fundamental mechanisms that evolved early have been further refined and embellished through evolution. The same basic structures and mechanisms that underlie the beating motion of cilia in Paramecium and of flagella in Chlamydomonas are employed by the highly differentiated vertebrate sperm cell, for example.
Now that genomes can be sequenced relatively quickly and inexpensively, biochemists have an enormously rich, ever-increasing treasury of information on the molecular anatomy of cells that they can use to analyze evolutionary relationships and refine evolutionary theory. Thus far, the molecular phylogeny derived from gene sequences is consistent with, but in many cases more precise than, the classical phylogeny based on macroscopic structures. Although organisms have continuously diverged at the level of gross anatomy, at the molecular level the basic unity of life is readily apparent; molecular structures and mechanisms are remarkably similar from the simplest to the most complex organisms. These similarities are most easily seen at the level of sequences, either the DNA sequences that encode proteins or the protein sequences themselves.
When two genes share readily detectable sequence similarities (nucleotide sequence in DNA or amino acid sequence in the proteins they encode), their sequences are said to be homologous and the proteins they encode are homologs. In the course of evolution, new structures, processes, or regulatory mechanisms are acquired, reflections of the changing genomes of the evolving organisms. The genome of a simple eukaryote such as yeast should have genes related to formation of the nuclear membrane, genes not present in bacteria or archaea. The genome of an insect should contain genes that encode proteins involved in specifying a characteristic segmented body plan, genes not present in yeast. The genomes of all vertebrate animals should share genes that specify the development of a spinal column, and those of mammals should have unique genes necessary for the development of the placenta, a characteristic of mammals — and so on. Comparisons of the whole genomes of species in each phylum are leading to the identification of genes critical to fundamental evolutionary changes in body plan and development.
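The idea of "readily detectable sequence similarity" can be made concrete with a simple measure, percent identity between two sequences that have already been aligned. The Python sketch below is only an illustration: the function name, the gap character `-`, and the example peptide sequences are all hypothetical, and the text gives no numerical threshold for calling two sequences homologous. In practice, similarity is assessed with dedicated alignment software and statistical scoring rather than a raw identity count.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned positions at which two sequences are identical.

    Both sequences must already be aligned (equal length, with `gap`
    marking insertions or deletions). Positions where either sequence
    has a gap are skipped. This is a simplification for illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned peptide fragments (one-letter amino acid codes).
print(percent_identity("MKTAYIAKQR", "MKTAHIAKQR"))  # 9 of 10 positions match
print(percent_identity("MKT-YIAKQR", "MKTAYIAKQK"))  # gapped column is skipped
```

The same function works on aligned nucleotide sequences, since it only compares characters position by position.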
When the sequence of a genome is fully determined and each gene is assigned a function, molecular geneticists can group genes according to the processes (DNA synthesis, protein synthesis, generation of ATP, and so forth) in which they function and thus find what fraction of the genome is allocated to each of a cell’s activities. The largest category of genes in E. coli, A. thaliana, and H. sapiens consists of those of (as yet) unknown function, which make up more than 40% of the genes in each species. The genes encoding the transporters that move ions and small molecules across plasma membranes make up a significant proportion of the genes in all three species, more in the bacterium and plant than in the mammal (10% of the ~4,400 genes of E. coli, ~8% of the ~27,000 genes of A. thaliana, and ~4% of the ~20,000 genes of H. sapiens). Genes that encode the proteins and RNA required for protein synthesis make up 3% to 4% of the E. coli genome; but in the more complex cells of A. thaliana, more genes are needed for targeting proteins to their final location in the cell than are needed to synthesize those proteins (about 6% and 2% of the genome, respectively). In general, the more complex the organism, the greater the proportion of its genome that encodes genes involved in the regulation of cellular processes and the smaller the proportion dedicated to basic processes, or “housekeeping” functions, such as ATP generation and protein synthesis. The housekeeping genes typically are expressed under all conditions and are not subject to much regulation.
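The transporter percentages quoted above are easier to compare once they are converted to approximate gene counts. The short Python sketch below does only that arithmetic, using the rounded totals and percentages given in the text; the dictionary and function names are illustrative, and because the inputs are approximate the results are too.

```python
# Approximate gene totals and transporter shares (percent) quoted in the text.
GENOMES = {
    "E. coli": (4_400, 10),
    "A. thaliana": (27_000, 8),
    "H. sapiens": (20_000, 4),
}


def transporter_gene_counts(genomes):
    """Estimate the number of transporter genes from total genes and percent share."""
    return {
        species: total * percent // 100
        for species, (total, percent) in genomes.items()
    }


counts = transporter_gene_counts(GENOMES)
for species, n in counts.items():
    print(f"{species}: about {n} transporter genes")
# E. coli: about 440, A. thaliana: about 2160, H. sapiens: about 800
```

On these figures the bacterium devotes the largest share of its genome to transporters, but the plant has the largest absolute number of transporter genes because its genome contains many more genes overall.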
Large-scale studies in which the entire genomic sequence has been determined for hundreds or thousands of people with cancer, type 2 diabetes, schizophrenia, or other diseases or conditions have allowed the identification of many genes in which mutations correlate with a medical condition. Typically, sequence differences are found in a number of different genes, each of which makes a partial contribution to the predisposition to a given condition or disease. Each of those genes codes for a protein that, in principle, might become the target for drugs to treat that condition. We may expect that for some genetic diseases, palliatives will be replaced by cures, and that for disease susceptibilities associated with particular genetic markers, forewarning and perhaps increased preventive measures will prevail. Today’s “medical history” may be replaced by a “medical forecast.”