Mathematical Sciences Research Institute

Genetics of Complex Disease February 09, 2004 - February 13, 2004
Registration Deadline: February 13, 2004
To apply for Funding you must register by: November 09, 2003
Parent Program: --
Organizers: Jun Liu, Mary Sara McPeek, Richard Olshen (chair), David O. Siegmund, and Wing Wong

This workshop is sponsored by MSRI and in part by Affymetrix, Aventis, Bristol-Myers Squibb and Pfizer. It will be held February 9-13, 2004 at the Mathematical Sciences Research Institute in Berkeley. Its topic is the genetics of complex human disease. The goals can be classified by subject matter according to a variety of criteria, discussed below; but in human terms the goal is simple. We will bring together individuals who work at the forefront of laboratory research (perhaps also with patients) and others whose related activities are also cutting edge, but for whom the emphases are algorithmic, probabilistic, or statistical. While there are many conferences on specific topics from among those we cite, there are very few that span "bench to computer to bedside" topics and at the same time spare nothing in mathematical sophistication when suitable applications of mathematics can shed light on important biomedical problems.
With respect to human diseases, the appropriate technologies, algorithms and clinical thinking are differentiated to a degree by the disease of interest. Important examples are cancer, autoimmune diseases, cardiovascular diseases, and lipid abnormalities. To some extent study of the human immune system holds these topics together, but there is much that is special to each topic. For example, much recent interest in cancer (including but not limited to retinoblastoma and cancers of the head and neck, breast, and lung) has focused on loss of heterozygosity and comparative genomic hybridization, topics we will explore in lectures and discussion. By now there are numerous chips that are brought to bear on these diseases and others. "Historically," meaning, perhaps, one to five years ago, chips were primarily of two types: cDNA chips, with 500-5000 base pairs studied at once, and oligonucleotide DNA chips, where attention is restricted to 20-80mers of DNA. Analysis of these and other chips and wafers will be topics of concern to us.
One could think of the study of complex human disease as being analogous to a triangle, where scientists in the corners have their own emphases, but meet in the middle and interact. One corner concerns finding DNA markers, i.e., polymorphic sites, preferably in coding or regulatory regions of genes, that bear upon disease. While some may see tension between studying allele sharing and identity by descent from linkage data on the one hand and association analysis of candidate genes on the other, we take it as a challenge to combine information from these different approaches. The use of animal models, where linkage analysis is easier than in humans, combined with identification of candidate genes in humans through homology searches, provides another important tool. Linkage analysis has stimulated interest in first passage problems for Ornstein-Uhlenbeck processes to deal with problems of multiple testing. Candidate gene studies have stimulated interest in supervised learning, which in statistics is often called "classification." An important example is the study of angiotensinogen and protein tyrosine phosphatase as they bear upon hypertension. Cytochrome P450 genes are part of the study of all cited processes of disease.
A second corner of the triangle concerns understanding genetic control and gene expression. There are now related data from beads and other technologies, too. Analyses of these data might be in terms of supervised learning (when the outcome/phenotype is given) and unsupervised learning, or clustering (when the outcome/phenotype is not). Clusters may be distinct or overlapping. Nearly always the clustering is of data that can be modeled as though they are points in Euclidean spaces, but where the cardinality of the sample pales by comparison with the dimension of the relevant space.
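As a rough illustration of the clustering setting just described, where the number of samples is far smaller than the dimension of the space, the sketch below runs plain k-means on synthetic data. Everything specific here is an assumption made for illustration: the choice of k-means, the farthest-point initialization, the function names, and the data themselves (6 samples with 1000 coordinates each, forming two groups that differ only in the first 50 coordinates). The workshop description does not name a particular clustering method.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Choose k starting centers: the first point, then repeatedly the point
    farthest from the centers chosen so far (a deterministic heuristic)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centers),
        )
        centers.append(list(farthest))
    return centers


def kmeans(points, k, iterations=10):
    """Plain k-means: alternate nearest-center assignment and mean update."""
    centers = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def make_samples(n_per_group=3, dimension=1000, shifted=50, seed=0):
    """Hypothetical data: two groups of noisy vectors; the second group is
    shifted by 2.0 in its first `shifted` coordinates."""
    rng = random.Random(seed)
    samples = []
    for group in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 0.1) for _ in range(dimension)]
            if group == 1:
                for i in range(shifted):
                    row[i] += 2.0
            samples.append(row)
    return samples


samples = make_samples()        # 6 samples, each of dimension 1000
labels = kmeans(samples, k=2)   # one cluster label per sample
print(labels)
```

The synthetic groups are deliberately well separated, so the two clusters should coincide with the two groups. With real expression data the signal is usually much weaker relative to the noise, and dimension reduction or feature selection would typically come first.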
With supervised learning there is, typically, a finite set of outcomes, the "covering diagnoses," and the goal is to classify, that is, to assign, each vector of expression values to a diagnosis. Some examples of interest in this area have been different flavors of hematopoietic malignancy. Most, but not all, classifiers of recent interest derive from "voting methods," such as the celebrated AdaBoost method.
The third corner of the triangle is concerned with the direct studies of proteins and their interactions, for example by time-of-flight mass spectrometry. Typical output here is a curve, or family of curves, with geometry (or geometries) that may apply to a particular genetic profile or disease. One approach of interest could be that of extracting a parsimonious set of basis functions for families of curves and representing a curve of interest to within specified discrepancy in a suitable norm. Perhaps low fractional Besov norms are relevant. Once a suitable basis and corresponding expansions are computed, we are back in the problem of supervised or unsupervised learning, as the case may be. Since what we get are the weights of proteins, it is imperative to be able to do the "inverse problem" of inferring the protein from the molecular weight. This could bring us to concerns of "fast table lookup," which have been important to streaming video over the Web and other problems.
None of the above is meant to preclude interest in problems of evolution, which bear upon our subject matter through the identification of regions in proteins that are conserved across organisms and in evolutionary analysis of various pathogens, nor in problems in more traditional genetic epidemiology and statistical genetics. The latter can bring us to models where unconditional distributions are mixtures of Gaussians or other smooth distributions, and where sometimes distinctions between inference conditional on some data and unconditionally are blurred. The resulting inferential and computational issues can be very subtle.
We on the committee that is organizing our workshop have contacted many individuals. Most are very interested in participating. Below please find a list of some of the individuals who will participate in our workshop. Although these are well known senior scientists, we are also committed to encouraging many young and creative, but less well known, individuals to join us.
Warren Ewens, Professor of Biology, University of Pennsylvania (Winner, Weldon Memorial Prize, Oxford University, 2002)
Joe W. Gray, Professor of Laboratory Medicine and Radiation Oncology; Principal Investigator, UCSF Comprehensive Cancer Center, University of California, San Francisco
Jun Liu, Professor of Statistics and of Biostatistics, Harvard University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 2002)
Mary Sara McPeek, Associate Professor, Departments of Statistics and Human Genetics; Member, Committee on Genetics, University of Chicago
Richard Olshen, Professor of Health Research and Policy (Biostatistics) and (by Courtesy) of Electrical Engineering and Statistics, Stanford University
Thomas Quertermous, William G. Irwin Professor in Cardiovascular Medicine Research; Chief, Division of Cardiovascular Medicine, Stanford University
Koustubh Ranade, Pharmaceutical Research Institute, Bristol-Myers Squibb, Princeton, New Jersey
Neil Risch, Professor of Genetics and (by Courtesy) of Health Research and Policy and of Statistics, Stanford University; Adjunct Investigator, Division of Research, Kaiser Permanente, Northern California
David Siegmund, John D. and Sigrid Banks Professor and Professor of Statistics, Stanford University
Mark Skolnick, Chief Scientific Officer, Myriad Genetics, Inc.
Terry Speed, Professor of Statistics, University of California, Berkeley; Head, Division of Bioinformatics, Walter & Eliza Hall Institute of Medical Research, Melbourne, Australia
Robert Tibshirani, Professor of Health Research and Policy (Biostatistics) and (by Courtesy) of Statistics, Stanford University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 1996)
Wing Hung Wong, Professor of Computational Biology, Department of Biostatistics, and Professor of Statistics, Harvard University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 1993)

Funding & Logistics

Funding

To apply for funding, you must register by the funding application deadline displayed above.

Students, recent Ph.D.'s, women, and members of underrepresented minorities are particularly encouraged to apply. Funding awards are typically made 6 weeks before the workshop begins. Requests received after the funding deadline are considered only if additional funds become available.

Lodging

A block of rooms has been reserved at the Hotel Durant. Reservations may be made by calling 1-800-238-7268. When making reservations, guests must request the MSRI preferred rate. If you are making your reservations online, enter the promo/corporate code MSRI123. Our preferred rate is $129 per night for a Deluxe Queen/King, based on availability.

Additional lodging options are listed on the Short Term Housing page.
