Genes
Each DNA molecule contains many genes the basic physical and functional units of heredity. A gene is a specific sequence of nucleotide bases, whose sequences carry the information required for constructing proteins, which provide the structural components of cells and tissues as well as enzymes for essential biochemical reactions. The human genome is estimated to comprise at least 100,000 genes.
Human genes vary widely in length, often extending over thousands of bases, but only about 10% of the genome is known to include the protein-
coding sequences (exons) of genes. Interspersed within many genes are intron sequences, which have no coding function. The balance of the genome is thought to consist of other noncoding regions (such as control sequences and intergenic regions), whose functions are obscure. All living organisms are composed largely of proteins; humans can synthesize at least 100,000 different kinds. Proteins are large, complex molecules made up of long chains of subunits called amino acids. Twenty different kinds of amino acids are usually found in proteins. Within the gene, each specific sequence of three DNA bases (codons) directs the cells protein-
synthesizing machinery to add specific amino acids. For example, the base sequence ATG codes for the amino acid methionine. Since 3 bases code for 1 amino acid, the protein coded by an average-
sized gene (3000 bp) will contain 1000 amino acids. The genetic code is thus a series of codons that specify which amino acids are required to make up specific proteins.
The protein-
coding instructions from the genes are transmitted indirectly through messenger ribonucleic acid (mRNA), a transient intermediary molecule similar to a single strand of DNA. For the information within a gene to be expressed, a complementary RNA strand is produced (a process called transcription) from the DNA template in the nucleus. This mRNA is moved from the nucleus to the cellular cytoplasm, where it serves as the template for protein synthesis. The cells protein-
synthesizing machinery then translates the codons into a string of amino acids that will constitute the protein molecule for which it codes (Fig. 5: Gene Expression). In the laboratory, the mRNA molecule can be isolated and used as a template to synthesize a complementary DNA (cDNA) strand, which can then be used to locate the corresponding genes on a chromosome map. The utility of this strategy is described in the section on physical mapping.
Chromosomes
The 3 billion bp in the human genome are organized into 24 distinct, physically separate microscopic units called chromosomes. All genes are arranged linearly along the chromosomes. The nucleus of most human cells contains 2 sets of chromosomes, 1 set given by each parent. Each set has 23 single chromosomes22 autosomes and an X or Y sex chromosome. (A normal female will have a pair of X chromosomes; a male will have an X and Y pair.) Chromosomes contain roughly equal parts of protein and DNA; chromosomal DNA contains an average of 150 million bases. DNA molecules are among the largest molecules now known.
Chromosomes can be seen under a light microscope and, when stained with certain dyes, reveal a
pattern of light and dark bands reflecting regional variations in the amounts of A and T vs G and C.
Differences in size and banding pattern allow the 24 chromosomes to be distinguished from each
other, an analysis called a karyotype. A few types of major chromosomal abnormalities, including
missing or extra copies of a chromosome or gross breaks and rejoinings (translocations), can be
detected by microscopic examination; Downs syndrome, in which an individual's cells contain a
third copy of chromosome 21, is diagnosed by karyotype
analysis (Fig. 6: Karyotype). Most changes in DNA, however, are too subtle
to be detected by this technique and require molecular analysis. These subtle DNA abnormalities
(mutations) are responsible for many inherited diseases such as cystic fibrosis and sickle cell
anemia or may predispose an individual to cancer, major psychiatric illnesses, and other complex
diseases.
Mapping and Sequencing the Human Genome
Table of Contents