Molecular Anatomy Reveals Evolutionary Relationships
The 18th-century naturalist Carolus Linnaeus [...] origin of different types from a common ancestor. Biochemical research in the 20th century revealed the molecular anatomy of cells of different species, the sequences of monomeric subunits, and the three-dimensional structures of individual nucleic acids and proteins. Biochemists today have an enormously rich and growing body of evidence with which to analyze evolutionary relationships and refine the theory of evolution. The sequence of the genome (the complete genetic makeup of an organism) has been completely determined for many eubacteria and for some archaebacteria; for the eukaryotic microorganisms Saccharomyces cerevisiae and Plasmodium sp.; for the plants Arabidopsis thaliana and rice; and for the multicellular animals Caenorhabditis elegans (a roundworm), Drosophila melanogaster (the fruit fly), mice, rats, and Homo sapiens. This list is regularly expanded as additional genomes are sequenced. With such sequences in hand, detailed and quantitative comparisons among species can provide deep insight into the evolutionary process.
Thus far, the molecular phylogeny derived from gene sequences is consistent with, but in many cases more precise than, the classical phylogeny based on macroscopic structures. Although organisms have continuously diverged at the level of gross anatomy, at the molecular level the basic unity of life is readily apparent; molecular structures and mechanisms are remarkably similar from the simplest to the most complex organisms. These similarities are most easily seen at the level of sequences, either the DNA sequences that encode proteins or the protein sequences themselves.
When two genes share readily detectable sequence similarities (nucleotide sequence in DNA or amino acid sequence in the proteins they encode), their sequences are said to be homologous and the proteins they encode are homologs. If two homologous genes occur in the same species, they are said to be paralogous and their protein products are paralogs. Paralogous genes are presumed to have been derived by gene duplication followed by gradual changes in the sequences of both copies. Typically, paralogous proteins are similar not only in sequence but also in three-dimensional structure, although they commonly have acquired different functions during their evolution.
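To make "readily detectable sequence similarity" concrete, one simple measure is percent identity between two sequences that have already been aligned. The Python sketch below is only an illustration under stated assumptions: the function name `percent_identity`, the gap character `-`, and the two short sequences are hypothetical and invented for this example. Real homology searches first compute the alignment and then assess its statistical significance; identity alone does not establish homology.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap positions at which two sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only alignment columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical, made-up aligned fragments (not real protein sequences).
seq_1 = "MKT-AYIAKQR"
seq_2 = "MKTLAYLAKQ-"
identity = percent_identity(seq_1, seq_2)
print(f"{identity:.1f}% identity over ungapped columns")
```

Here 8 of the 9 ungapped columns match, so the two fragments are about 89% identical. Whether a given level of identity counts as "readily detectable" depends on sequence length and on the statistical model used, which this sketch does not address.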
Two homologous genes (or proteins) found in different species are said to be orthologous, and their protein products are orthologs. Orthologs are commonly found to have the same function in both organisms, and when a newly sequenced gene in one species is found to be strongly orthologous with a gene in another, this gene is presumed to encode a protein with the same function in both species. By this means, the function of gene products can be deduced from the genomic sequence, without any biochemical characterization of the gene product. An annotated genome includes, in addition to the DNA sequence itself, a description of the likely function of each gene product, deduced from comparisons with other genomic sequences and established protein functions. In principle, by identifying the pathways (sets of enzymes) encoded in a genome, we can deduce from the genomic sequence alone the organism’s