Monday, Jul. 03, 2000
The Genome Is Mapped. Now What?
By MICHAEL D. LEMONICK
It was supposed to be like putting a man on the moon. Sequencing the entire human genome--spelling out the 3.1 billion chemical "letters" that make up human DNA--would be, scientists said, as challenging and rewarding as the Apollo mission that deposited Neil Armstrong on the lunar surface. But the comparison was never exact, and as the genome project approaches completion, it is becoming increasingly clear just how bad the analogy really is. Landing a human on our nearest cosmic neighbor was a straightforward achievement with no need for caveats or footnotes. As of July 20, 1969, nobody had set foot on another world. The next day, Armstrong had. Simple as that.
By contrast, when scientists from Craig Venter's Celera Genomics and the Human Genome Project announce that they're finished sequencing the genome--which they are scheduled to do this week--the milestone will be a lot murkier. That's because they're not really finished. What the scientists at Celera have done is sequence about 97% of the genome, and the remaining 150 million or so letters won't be deciphered anytime soon. The HGP is even further behind; unlike Celera, it hasn't put its strings of letters into proper order yet. This loose end should be cleared up in a year or two, but even then the so-called book of life will remain unreadable. That's because, explains Gerald Rubin, vice president for biomedical research at the Howard Hughes Medical Institute, "it's written in a foreign language. It's a very complicated problem. It's going to be a long time coming."
Molecular biologists still know so little about the human genome, in fact, that even with some 85% of the sequence published on the HGP's GenBank website for every scientist in the world to see, nobody has even a ballpark figure for how many genes humans have. Before this week, the betting ranged from as few as 28,000 to as many as 140,000. Now it looks more like 50,000.
Beyond that, knowing the code for a gene doesn't mean you know what protein it produces in the body, or what that protein does, or how it interacts with other proteins--vital information if you want to know how the genetic code locked in our cells ends up constructing and maintaining a fully functioning human being.
Given this seemingly overwhelming ignorance, why is everyone making such a fuss? Because laying out the biochemical code for all our genes, however many there turn out to be, and locating them within the 23 pairs of chromosomes in the human genome may turn out to be the necessary first step to solving all these mysteries. The hope is that the completed genome will enable scientists to lay bare the genetic triggers for hundreds of diseases--from Alzheimer's to diabetes to heart disease--and to devise exquisitely sensitive diagnostic tests. It will help pharmaceutical companies create drugs tailored to a patient's genetic profile, boosting effectiveness while drastically reducing side effects. It could change our very conception of what a disease is, replacing broad descriptive categories--breast cancer, for example--with precise genetic definitions that make diagnosis sure and treatment swift.
And while it's true that researchers can sequence individual genes, and have done so, they had to use a process that was expensive and terribly laborious--like writing your own reference book before you can start any real experiments. Having the sequences laid out in advance gives the scientific world a big head start.
Those sequences are so useful, in fact, that researchers started tapping into the data long before they were complete. Scientists at drug firms, biotech companies and university labs have taken literally hundreds of baby steps into the era of genomic medicine using an impressive array of powerful new tools: DNA chips and microarrays that let scientists see at a glance which of thousands of genes are active in a given tissue sample; sophisticated software that can organize gigabytes of genetic data; huge databases of genes, disease-tissue samples and mRNA--the molecules that initiate the actual construction of working proteins. "The announcement of finishing the genome is to us a mini-event," says Allen Roses, worldwide director of genetics for Glaxo Wellcome and a prominent Alzheimer's researcher. "We've been making use of the information as it has become available, and we've already done some proof of the concept that finding genes for disease and developing the right drug for the right patient will actually work."
One scientist whose work has been transformed by genomics is Dr. David Altshuler, an endocrinologist at Massachusetts General Hospital who does research at M.I.T.'s Whitehead Institute. A diabetes expert, he wanted to learn more about a gene known to be involved in adult-onset (Type II) diabetes and obesity. He knew that the gene was about 100,000 chemical letters--or base pairs--long, and that only about 2,000 of those directed the production of a protein.
Hidden somewhere in the remaining 98,000 base pairs are instructions that govern how much protein gets churned out--an essential clue for developing eventual treatments for diabetics. But before the public project's data began going up on GenBank, finding the hidden code would have been a daunting task. "To isolate the DNA and do all the sequencing would have taken a highly trained Ph.D. a year or two," says Altshuler, "an ungodly, unacceptable amount of work."
This spring Altshuler simply went to the public database, fed in the 2,000 base pairs he already knew about and asked the computer: Is the rest of the gene sequenced? "For four months," he says, "we went back every week, and the answer was 'nope, nope, nope.' Then one week, all of a sudden, there it was," he says, "all 100,000 base pairs in a row--a year, two years' work handed to us, all before lunch."
His next step was to look at the same gene in the mouse, taking advantage of the fact that the noncoding portions of the genome in man and mouse are 75% similar. Three weeks after pulling the gene's human sequence off GenBank, Altshuler lined up the mouse and man genes side by side and spotted five regions that were active in both. Now he's going to focus on these five regions as possible targets for drug design, figuring this is where the regulatory action is likely to be.
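The side-by-side comparison described above can be illustrated with a small sketch. It is not Altshuler's actual procedure, which the article does not spell out. It assumes the two sequences have already been aligned to equal length with no gaps, and the function name, window size and identity threshold are hypothetical choices made for illustration. The idea is simply to slide a window along both sequences and flag stretches where the letters agree far more often than elsewhere.

```python
def conserved_windows(human, mouse, window=10, min_identity=0.9):
    """Return merged (start, end) spans where two pre-aligned sequences agree.

    Assumption: `human` and `mouse` are already aligned, gap-free strings of
    equal length. Real comparisons rely on dedicated alignment software.
    """
    if len(human) != len(mouse):
        raise ValueError("sequences must be pre-aligned to equal length")

    spans = []
    for start in range(0, len(human) - window + 1):
        end = start + window
        # Count positions in this window where both species share a letter.
        matches = sum(h == m for h, m in zip(human[start:end], mouse[start:end]))
        if matches / window >= min_identity:
            if spans and start <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    human = "AAAAACCCCCGGGGGTTTTT"
    mouse = "AAAAATTTTTGGGGGAAAAA"
    print(conserved_windows(human, mouse, window=5, min_identity=1.0))
```

On the toy input, the two stretches that are identical in both sequences are reported as separate spans. A real analysis would choose its threshold relative to the background similarity between the two genomes.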
Another Whitehead scientist, oncologist Dr. Todd Golub, is trying to improve on the primitive techniques doctors use to guide their fight against cancer. Currently, pathologists use the location of a tumor in the patient's body and its appearance under a microscope to determine what sort of malignancy is involved. It works often--but not always. Melanoma, for example, starts out as a skin cancer but may end up in the lung or breast, where it can be much more damaging than primary lung or breast cancer.
Proof at the genetic level that origin means more than location is coming in. Researchers at Stanford have been studying liver, breast, prostate and lung cancers for clues to their telltale molecular fingerprints. Using microarrays to sense which genes are turned on in sample tissues, says geneticist Charles Perou, the Stanford team has discovered that most of the genes expressed by normal breast cells and by primary-breast-cancer cells are similar, and that the same holds for normal lung tissue and lung cancer, normal prostate and prostate cancer, and so on--which should ultimately give doctors biochemical identifiers to guide their treatments.
Golub and colleagues at Boston's Dana-Farber/Harvard Cancer Center, meanwhile, are learning that tumors from a specific category--lung, prostate, colon--can be divided into previously unsuspected subcategories. Golub says of his specialty, for example, "We're trying to understand why some men die with prostate cancer rather than of prostate cancer, whereas others have aggressive disease that kills them."
Researchers led by Louis Staudt, an oncologist at the National Cancer Institute, have been asking similar questions about lymphoma. In a paper published in the scientific journal Nature, they showed how lymphomas that look the same under the pathologist's microscope aren't necessarily identical. Staudt and his colleagues used DNA chips to see which genetic switches were being thrown in each of 40 different biopsy samples from lymphoma patients. By looking at specific genes involved in cell proliferation and immune-cell response, says Staudt, they determined that "two different kinds of tumors are hiding within the single diagnosis of diffuse large-cell lymphoma." That, he believes, may be why only 40% of patients respond to currently available treatments: the rest are getting the wrong kind of therapy.
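To see how samples that look alike under the microscope might still separate by gene activity, consider the rough sketch below. The article does not say which algorithm Staudt's team used, so this swaps in a plain two-group, k-means-style clustering as a stand-in. The sample names and expression values are invented toy data, and every function name is hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(rows):
    """Gene-by-gene average of several expression profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def two_groups(profiles, rounds=20):
    """Split samples into two groups by similarity of expression.

    `profiles` maps a sample name to a list of expression levels, one per
    gene. Starting centers are the first sample and the sample least like it.
    """
    names = list(profiles)
    first = profiles[names[0]]
    farthest = max(names, key=lambda n: sq_dist(profiles[n], first))
    centers = [first, profiles[farthest]]

    labels = {}
    for _ in range(rounds):
        # Assign each sample to whichever center it resembles more.
        new_labels = {
            n: min((0, 1), key=lambda k: sq_dist(profiles[n], centers[k]))
            for n in names
        }
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each center as the average of its current members.
        for k in (0, 1):
            members = [profiles[n] for n in names if labels[n] == k]
            if members:
                centers[k] = mean_profile(members)
    return labels


if __name__ == "__main__":
    # Toy data: four hypothetical biopsies, four hypothetical genes.
    samples = {
        "s1": [5.0, 4.8, 0.2, 0.1],
        "s2": [4.9, 5.1, 0.3, 0.2],
        "s3": [0.1, 0.3, 5.2, 4.9],
        "s4": [0.2, 0.1, 4.8, 5.0],
    }
    print(two_groups(samples))
```

In the toy data, two biopsies have the first pair of genes switched on and the other two have the second pair switched on, so they fall into separate groups even though nothing else distinguishes them.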
These sorts of tightly focused studies are already beginning to make cancer treatment more effective. Last year physicians approached the Maryland biotech company Gene Logic for guidance. They had a patient with esophageal cancer--an especially lethal type--so they wanted to find the best therapy in a hurry. Would radiation be appropriate? What about chemotherapy? And if so, which type? Or perhaps it made sense to go right to one of the new experimental antiangiogenesis medications that cut off a tumor's blood supply.
They went to Gene Logic because the company is one of a handful, along with California's Affymetrix and Incyte, that have developed DNA-chip and microarray technology--in this case, chips that can monitor some 42,000 genes in one shot--and software to analyze the results. Using these powerful tools, Gene Logic scientists tested the patient's cells alongside others from both healthy and sick people. In a few days, they completed the analysis.
The findings are now being prepared for scientific publication, and thus can't be revealed in detail. But Gene Logic will say that based on the genes active in this patient's cancer, antiangiogenesis drugs and most chemotherapy wouldn't work but three drugs would. Moreover, the scientists discovered that this cancer was producing enormous quantities of a particular enzyme that happens to be the target of yet another experimental drug--something to try if chemotherapy failed. The patient is now in remission.
Mounting this kind of operation for a single patient is hugely expensive--this case ran up a bill of $37,000--and the advice that works for one patient wouldn't apply to someone whose genetic makeup is different. That's why scientists at Gene Logic and other firms--Millennium Pharmaceuticals in Massachusetts and Glaxo Wellcome (both in England and the U.S.) are just two examples--are putting together databases of tissue samples to look for one-letter genetic differences. (These differences are formally known as single-nucleotide polymorphisms, or SNPs.) Fourteen drug companies and the philanthropic Wellcome Trust (not affiliated with Glaxo Wellcome) organized an SNPs consortium last year to begin building a publicly available SNP database. Both the Human Genome Project and Celera are currently sequencing the genomes of many different people, of both sexes and all sorts of ethnic backgrounds, to get a better sense of where the SNPs are.
The strategy clearly works. Last October scientists at Glaxo Wellcome announced that their patient database and SNP mapping information had yielded four genes that were promising drug targets: one each for Alzheimer's, diabetes, psoriasis and migraine. "We've already done what people are proposing to do in the future," says Roses. Glaxo isn't alone. Dozens of firms are concentrating on this subcategory of gene-based medicine, known as functional genomics.
Almost everyone agrees that the complete genome sequence is essential to functional genomics--everyone, that is, except William Haseltine, CEO of Human Genome Sciences, a firm he started with Craig Venter in 1992. "Human genome sequencing [of the entire genome] helps us understand the deep and interesting questions of how our genome relates to those of other species," says Haseltine dismissively. "But it isn't particularly practical."
Following his lead, HGS scientists are ignoring most of the genetic code and concentrating on the mRNA that puts the code into action. During the 1990s the company amassed a huge library of mRNA and used microarrays to see which of these molecular snippets was active in disease. Haseltine's scientists were able to isolate 10 proteins, made from strands of mRNA, that are active in the healing of intractable wounds. Of these, nine were discarded because they may promote cancer.
The remaining candidate, known as repifermin, is currently in FDA-approved clinical trials for patients with skin ulcers, ulcerative colitis and mucosal damage due to chemotherapy, and may be expanded to burn and smoke-inhalation sufferers. And while most scientists disagree with his iconoclastic views, Haseltine is getting results: besides repifermin, his firm has three drugs in clinical trials and expects to add two or three more next year. That puts HGS far ahead of any other company.
In any case, pinpointing the genetic basis for disease and an individual's genetic response to medication is only a small part of what genomics is about. Further along the biochemical cascade of cause and effect are the proteins, the ultimate products of genetic information. Understanding the nature of proteins and their complex interactions will give scientists an entirely different and perhaps even more valuable insight into disease--and again, nobody's waiting for the genome sequence to be done to start finding out. (See accompanying story.)
The actions of individual genes, moreover, make a lot more sense in the context of other genes. "Right now," says Stanford biochemist Patrick Brown, "it's like watching a movie on TV a few pixels at a time and trying to figure out the overall story. Having the complete genome sequence is something categorically different, like going from 100 scattered pixels on your screen to having the whole image. There will be a substantial increase in the rate at which discoveries are made."
Maybe too much of an increase, argues Tom Delbanco, chair of general medicine at Harvard Medical School. "Discovery is intoxicating," he says. "But the consequences of discovery are often complex, and instead of progress, it can lead to disaster." Delbanco is worried that the revolution in genetic medicine may further drain the limited amount of time that physicians have to spend with patients and add even more costs to the already expensive health-care system.
Yet while Delbanco's fears may be justified--and while the genetic revolution has raised plenty of other troubling issues (see "What We Should Worry About")--its promise is so huge that putting on the brakes may be impossible. The age of genomic medicine is here; the sequencing of the human genome just marks the ceremonial start.
And that's perhaps the most significant difference between the genome project and the first moon landing. The latter was a clean, well-defined achievement. But more than 30 years after Neil Armstrong's dusty first step, space travel has gone pretty much nowhere. Thirty years from now, our understanding of the human organism and its various ills is likely to be transformed beyond recognition.
--Reported by Dan Cray/Los Angeles, Alice Park and Sora Song/New York and Dick Thompson/Washington