The complexity of the molecules and systems revealed in this book can sometimes conceal a scientific reality: what we have learned is just a beginning. Novel proteins, lipids, carbohydrates, and nucleic acids are discovered every day, and we often have no clue as to their functions. How many have yet to be encountered, and what might they do? Even well-characterized biomolecules continue to challenge researchers with countless unresolved mechanistic and functional questions. A new era, defined by technologies that provide broad access to the genome, the entirety of a cell’s DNA, has accelerated progress.
This is a methods chapter, a necessary prelude to much of what comes later in this book. It is organized around just a few straightforward principles:
An organism’s DNA, its genome, is the ultimate source of biological information. Genomic information is a resource of unparalleled importance for investigators studying any aspect of biology. Genomes vary in size, but all are large enough to direct every aspect of an organism’s structure and function. Studying them often requires tools that break them into smaller parts that are experimentally digestible.
Genomic information is accessible. Advances in DNA sequencing (Chapter 8) are being matched by new approaches to understanding how chromosomal information is expressed and regulated on a genomic and cellular scale. Important clues to protein function are embedded in the sequences of the genes that encode them.
Genomic information is malleable. We can not only elucidate cellular genomic information; we can also change it. That capacity provides a path to altering any aspect of cellular metabolism, structure, or function.
The word “genome,” coined by German botanist Hans Winkler in 1920, was derived from the Greek words genesis and soma to describe a body of genes. A genome today is defined as the complete haploid genetic complement of an organism. In essence, a genome is one copy of the hereditary information required to specify the organism. For sexually reproducing organisms, the genome includes one set of autosomes and one of each type of sex chromosome. When cells have organelles that also contain DNA, the genetic content of the organelles is not considered part of the nuclear genome. Mitochondria, found in most eukaryotic cells, and chloroplasts, in the light-harvesting cells of photosynthetic organisms, each have their own distinct genome. For viruses, which can have genetic material composed of DNA or RNA, the genome is a complete copy of the nucleic acid required to specify the virus.
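The definition for sexually reproducing organisms can be made concrete with a short worked example. The sketch below is illustrative only: the function name `reference_genome_chromosomes` is hypothetical, and the human values used (22 autosome pairs, with X and Y as the two sex chromosome types) are supplied here as an example rather than taken from the text above. It lists the chromosomes in one genome copy: one representative of each autosome pair plus one of each sex chromosome type.

```python
def reference_genome_chromosomes(autosome_pairs, sex_chromosome_types):
    """Hypothetical helper: name the chromosomes in one genome copy.

    One representative of each autosome pair is counted, plus one of
    each type of sex chromosome. Organelle DNA is deliberately excluded,
    since it is not part of the nuclear genome.
    """
    autosomes = [str(i) for i in range(1, autosome_pairs + 1)]
    return autosomes + sorted(sex_chromosome_types)


# Assumed example values for humans: 22 autosome pairs, sex chromosomes X and Y.
human = reference_genome_chromosomes(22, {"X", "Y"})
print(len(human))  # 24 distinct chromosomes: 22 autosomes + X + Y
```

Note that the count is 24 rather than the 23 chromosomes found in any single human gamete, because the genome as defined here includes one of each type of sex chromosome.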
As objects of study, DNA molecules present a special problem: their size. Chromosomes are far and away the largest biomolecules in any cell. How do researchers find the information they seek when it is just a small part of a chromosome that can include millions or even billions of contiguous base pairs? Decades of advances by thousands of scientists working in genetics, biochemistry, cell biology, and physical chemistry came together in the laboratories of Paul Berg, Herbert Boyer, and Stanley Cohen to yield the first techniques for locating, isolating, preparing, and studying small segments of DNA derived from much larger chromosomes. The science of genomics is dedicated to the study of DNA on a cellular scale. In turn, genomics contributes to systems biology, the study of biochemistry on the scale of whole cells and organisms.
The methods described in this chapter were built on advances in our understanding of DNA and RNA metabolism that are not presented in this text until Part III. Fundamental concepts of DNA replication, RNA transcription, protein synthesis, and gene regulation are intrinsic to an appreciation for how these methods work. Yet all facets of modern biochemistry rely on these same methods to such an extent that a current treatment of any aspect of the discipline becomes very difficult without a proper introduction to them. By presenting these technologies early in the book, we acknowledge that they are inextricably interwoven with both the advances that gave rise to them and the newer discoveries they now make possible. The background we necessarily provide makes the discussion here not just an introduction to technology but also a preview of many of the fundamentals of DNA and RNA biochemistry encountered in later chapters.
We begin by outlining the principles of DNA cloning, then illustrate the range of applications and the potential of many newer technologies that support and accelerate the advance of biochemistry.