First published online April 20, 2007
Journal of Experimental Biology 210, 1507-1517 (2007)
Published by The Company of Biologists 2007
doi: 10.1242/jeb.004432
Extracting biology from high-dimensional biological data
John Quackenbush
Department of Biostatistics and Computational Biology and Department
of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA and
Department of Biostatistics, Harvard School of Public Health, Boston, MA,
USA

Fig. 1. Biological systems can be thought of as 'information management' systems
with multiple levels of organization that interact and influence each other.
The study of biological systems has a long history, and although
high-throughput 'omics approaches are expanding their range of applications,
integrating information from various levels can provide powerful insights for
the interpretation of high-throughput data.

Fig. 2. Genes identified as both differentially expressed and genetically
linked to the differential response to inhaled LPS in a mouse model of
environmentally induced asthma. Responses were measured by qPCR, comparing
exposed mice to matched controls from the same strain. Ordering the strains
by expression level for these genes closely mirrors the ordering of the
strains by phenotype.

Fig. 3. Heat map and hierarchical clustering dendrogram, in which the elements
being clustered are the GO term assignments and the value represented in each
cell is the log10(P-value) for that GO term being significantly different
from the null hypothesis, based on EASE analysis. From Larkin et al. (2004).
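
As a rough illustration of how a cell value in such a heat map could be obtained, the sketch below computes a one-sided hypergeometric (Fisher-type) P-value for over-representation of a GO term in a gene list and takes its log10. This is not the EASE calculation itself: EASE, as usually described, reports a more conservative adjustment of this test, which is not reproduced here. The function name, argument names and counts are hypothetical.

```python
from math import comb, log10


def enrichment_p_value(pop_total, pop_hits, list_total, list_hits):
    """One-sided hypergeometric P-value for GO term over-representation.

    pop_total  -- genes in the background population (hypothetical count)
    pop_hits   -- background genes annotated with the GO term
    list_total -- genes in the list of interest
    list_hits  -- genes in the list annotated with the GO term
    Returns the probability of seeing list_hits or more annotated genes
    when list_total genes are drawn at random from the population.
    """
    upper = min(list_total, pop_hits)
    favourable = sum(
        comb(pop_hits, i) * comb(pop_total - pop_hits, list_total - i)
        for i in range(list_hits, upper + 1)
    )
    return favourable / comb(pop_total, list_total)


# Made-up counts: 20 genes in total, 5 carry the term, 4 selected, 2 of them hits.
p = enrichment_p_value(pop_total=20, pop_hits=5, list_total=4, list_hits=2)
cell_value = log10(p)  # the quantity a heat-map cell would display
```

Repeating the calculation for each GO term and each gene list would give the matrix of values that is then clustered.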

Fig. 4. An example of a Bayesian Network model for four genes. If we assume that
Gene 1 activates Gene 2, then we can construct a conditional probability table
(shown at the right) that captures our observations of the state of Gene 1
when we observe Gene 2 to be upregulated. Here the values for Gene 1 of
-1, 0, and +1 represent states where Gene 1 is downregulated, unchanging
or upregulated, respectively.
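
A minimal sketch of how such a table might be estimated from discretized expression data: it tabulates the state of Gene 1 across the samples in which Gene 2 is upregulated. The function name and the sample values are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

STATES = (-1, 0, 1)  # downregulated, unchanging, upregulated


def gene1_given_gene2_up(gene1, gene2):
    """Estimate P(Gene 1 state | Gene 2 upregulated) from paired samples.

    gene1 and gene2 are equal-length sequences of discretized states
    (-1, 0 or +1), one entry per sample.
    """
    selected = [g1 for g1, g2 in zip(gene1, gene2) if g2 == 1]
    if not selected:
        raise ValueError("Gene 2 is never upregulated in these samples")
    counts = Counter(selected)
    return {state: counts[state] / len(selected) for state in STATES}


# Made-up discretized states for eight samples.
gene1 = [1, 1, 0, -1, 1, 0, 1, -1]
gene2 = [1, 1, 1, -1, 1, 0, 0, 1]
table = gene1_given_gene2_up(gene1, gene2)
```

In these made-up data, Gene 1 is upregulated in three of the five samples where Gene 2 is upregulated, so most of the probability mass falls on the +1 state, as would be expected if Gene 1 activates Gene 2.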

Fig. 5. Representations of the networks produced by a Bayesian Network analysis of
the top 40 genes selected as distinguishing ALL and AML in the microarray
dataset of Golub et al. (1999) for links with confidence greater than 0.7; links with
confidence greater than 0.9 are shown in bold. Networks represent the
consensus of 200 iterations for (A) the microarray data alone, (B) the
microarray data with constraints from protein-protein interaction (PPI) data,
(C) microarray data with constraints from literature networks, and (D)
microarray data with constraints from a combination of microarray and PPI
data.
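
The thresholding described in this caption can be sketched as follows, under the assumption (not stated in the caption) that a link's confidence is the fraction of iterations in which the link appears. The function name, gene labels and networks below are hypothetical.

```python
from collections import Counter


def consensus_links(networks, threshold=0.7, bold_threshold=0.9):
    """Summarize links across repeated network-learning iterations.

    networks is a list of iterables of (source, target) links, one per
    iteration. Confidence is taken here to be the fraction of iterations
    containing the link (an assumption made for illustration).
    Returns {link: (confidence, is_bold)} for links above threshold.
    """
    n_iterations = len(networks)
    counts = Counter(link for net in networks for link in set(net))
    consensus = {}
    for link, count in counts.items():
        confidence = count / n_iterations
        if confidence > threshold:
            consensus[link] = (confidence, confidence > bold_threshold)
    return consensus


# Ten made-up iterations: A->B appears in all, B->C in eight, C->D in five.
networks = (
    [[("A", "B"), ("B", "C"), ("C", "D")]] * 5
    + [[("A", "B"), ("B", "C")]] * 3
    + [[("A", "B")]] * 2
)
links = consensus_links(networks)
```

With these inputs, A->B is retained and marked bold, B->C is retained without bold, and C->D is dropped because it appears in only half of the iterations.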

© The Company of Biologists Ltd 2007