dChip: Compare samples
Another high-level analysis performed by dChip is the
comparison analysis. Given two samples or two groups of samples, we want to
identify genes that are reliably differentially expressed between the two
groups. dChip provides several filtering criterion to filter out interesting
genes. Select menu “Analysis/Compare Samples”:

Select one or
more arrays in the listbox of the Baseline group and the Experiment group. The
current “Array list file” specifies the
samples used in the comparison and “pooling replicate arrays”
will be performed before the comparison if needed. The pooled replicate arrays
are indicated by the ending *’s in the sample name.
If a “Sample information file” is specified
in “Analysis/Open Group/Other information” dialog, we can use the “Select by
category” button to specify samples with particular annotation:
Check one option among “Overwrite”, “Or”, “And” and
“Use inversion” to specify how the specified samples with a particular
annotation is combined with the existing selected sample in the “Baseline” and
“Experiment” sample list. For example, the “Use inversion” option is for
selecting samples not having a particular annotation.
Compute group
means and their standard errors
“E” and “B” in the dialog stand for the group means,
which are the mean expression level of samples in the groups. When a group
contains multiple samples, the group mean and its standard error (SE) is
computed by pooling arrays considering measurement accuracy (Li and Wong 2003, page 5, section 5.2.4), where
samples in the same group are regarded as “replicates”. When a group contains only
one sample, the group mean is the expression value in this sample, and its SE
is the attached standard error (if this sample corresponds to one array, the SE
is the model-based standard error). The SE is set to 0 if the expression values
are from “Analysis/Get External Data”, are truncated or log-transformed.
The standard errors (SE) attached to MBEI values are
array-specific measurement errors, and they are obtained using the model by
comparing the probe response pattern in a particular array to the pattern seen
in most arrays (Li and Wong 2001a). This
standard error is usually smaller than the variation between replicate arrays
or samples. For example, given two replicate arrays, the distribution of (MBEI1
– MBEI2) / sqrt(SE1^2 + SE^2) (of all probe sets) can have much heavier tail
(can reach 40 or larger) than the normal distribution and is more like the t
distribution with d.f. 2-3. Thus when comparing two single arrays with lower
bound fold change or t-test, the standard errors of MBEI help to prevent
unreliable MBEI (with large SE) from giving the erroneous result, but does not
substitute the use of true replication arrays or samples to obtain more
realistic lower bound fold change or t-test p-values, since the biological or
experimental variation can only be estimated through such replicates.
Pooling arrays is mainly for down-weighting the
unreliable expression values with large standard errors when computing group
means. As a result, the calculated means, standard errors of means, and
t-statistics (stored in the “compare result file”)
may be slightly different than those computed by the traditional t-test
software such as Excel. The general effect of considering measurement error is
to avoid under-estimation of the standard error of group means when the sample
size is small, and thus lead to less significant p-values than the standard
analysis. In V1.3+, uncheck “Tools/Options/Analysis/Consider measurement error
when average” to compute group means and standard errors without considering
measurement error. For example, the group mean is the simple average.
One note is that the replicate pooling procedure is
carried out by resampling. As a result, when one switch the order of arrays in
the baseline or comparison group, the random number generator yields random
numbers in a different order for each array and thus the groups means and
standard errors will be also slightly different. Another example is that when
the baseline and the comparison group are switched at paired t-test, the
difference values between paired samples are reversed and eventually this lead
to different t-statistics and p-values. In this case, the intersection (e.g. by
“Combine comparisons”) of the two gene lists when switching the B and E groups
are more reliable.
Comparison criteria
Next we can check to select the
filtering criterion to be applied when comparing the two groups. The compare
criterion (1) requires the fold change between the group means to exceed a
specified threshold. We can also check the “Use lower 90% confidence bound” to
specify to use the lower confidence bound of fold changes for filtering. The
computation of confidence intervals of fold changes and the application of
using lower bound as filtering criterion are described in Li and Wong 2001b (page 6, section “Confidence
interval for fold change”). Theta_1_hat and Theta_2_hat are the E and B
described above (expression values in one samples or the mean expression of
samples in the baseline or experiment group). The motivation is that we are
really interested in the fold change of real expression values r = Theta_1 /
Theta_2, rather than observed fold change Theta_1_hat / Theta_2_hat.
A negative expression value (or
group mean) for a probe set is truncated to 1 before calculating fold changes.
When one expression is large (say 1000) and the other is -10 (at noise level of
absent genes), it is helpful to bring -10 to a small positive number so a large
fold change is calculated and the gene gets selected. In the comparison output
file, fold changes are always larger than 1 with signs representing the
direction of changes, while confidence intervals are for the absolute value of
the fold change. We have found it useful to order genes by the lower confidence
bound ("Lower CB" columns), which is a conservative estimate of the
real underlying fold change. In some cases, one may get (0, infinity) as the
confidence interval, indicating very unreliable estimated fold change.
The compare criterion (2) specifies the threshold
for absolute difference between the two group means.
Since the down-regulated genes and the up-regulated genes may have different
change magnitude, once can specify different fold change criterion for E/B and
B/E and different mean difference criterion for E-B and B-E. If the expression
values have been log-transformed,
E-B and B-E refer to the log fold change (logE* – logB* = log (E*/B*) when E*
and B* are in original scale), and correspondingly the E-B and B-E threshold
should also be specified in log scale. The criterion (1) is generally not used
since they are more difficult to interpret when the expression value are in log
scale.
The compare criterion (3) tests whether the mean
difference in (2) equals to zero by the unpaired t-test. See t-statistic
and its p-value. The default p-value threshold 0.05 filters genes that
differ in group means with a two-tailed p-value < 0.05. When the sample size
is small, the t-statistics and p-values should better be used only as a ranking
or filtering device rather than the actual significance value, especially in
the context of applying t-test on thousands of genes. If no multiple
comparison adjustment is applied, one can specify very stringent p-value
threshold (e.g. 0.05 divided by the total number of genes on the array; this is
Bonferroni
correction assuming all genes are independent).
The compare criterion (4) requires that the
percentage of samples called “Present” in the samples used in both groups be
larger than a threshold. In V1.3+, this criterion can be specified for the
baseline and the experiment group separately. This can be used to obtain genes
called as P in 100% samples of the B group and called as A in 100% samples of
the E group. For example, if Comparison 1 is “P call of B >= 100% and P call
of E >= 0%” and Comparison 2 is “P call of B >= 0% and P call of E >=
1%”, in "Compare samples/Combine comparisons" one can use "And
not" to combine the two comparisons: Comparison 1 "and not"
Comparison 2.
The compare criterion (5) specifies that the p-value
threshold for the paired t-test. The “paired t-test” assumes the first baseline
sample to be compared with the first experiment sample, and so on. To ease the
pairing specification the samples can first be ordered at “Tools/Array list file”. When the baseline and
experimental arrays are paired, using paired t-test has larger power than the
unpaired t-test (detect more real changes).
Combing these statistics can help us focus attention
on a small set of interesting genes. Typically one may need to estimate the
number of differentially expressed genes beforehand, look at these statistics
for known differentially expressed genes, and experiment with different
parameters to finally filter a reasonable set of interesting genes.
t-statistic
and its p-value
The t-statistic is computed as (mean1 – mean2) /
sqrt(SE(mean1)2 + SE(mean2) 2) (see Compute
group means), and its p-value is computed based on the t-distribution, and
the degree of freedom is set according to Welch modified two-sample t-test (see
S-PLUS t.test function; the same is used in Excel TTEST()
function for unequal variance case). It is required to have at least two
samples in each group to perform unpaired and paired t-test.
When “Tools/Options/Analysis/Consider
measurement error when average” is unchecked, group means and standard errors will
be computed without considering measurement error, and thus agree with the
results from standard analysis such as Excel two-sample t-test. The effect of
considering measurement error is to avoid under-estimation of the standard
error of group means when the sample size is small, and thus lead to less
significant p-values than the standard analysis.
[Analysis example] We need to pay attention to the
degree of freedom of the t-test. If both groups have one sample (e.g. each is
the pooled result of N replicated
arrays in the “Array list file”), the degree of freedom for the t-test is
1+1-1 = 1 (in old dChip version, d.f. is N1+N2-1 to accommodate 1 vs. 1
comparison); however if we do not pool replicates but specify N vs. N
comparison in this dialog to compare the two groups of arrays, the degree of
freedom for t-test is N+N-1 > 1. The former gives less significant p-values
for the same t-statistics and results fewer filtered genes. Therefore the N vs
N comparison without pooling replicates is preferred to the 1 vs. 1 comparison.
In V1.3 5/21/03+, 1 vs 1 t-test is not permitted.
Compute false
discovery rate
Since tens of thousands of genes are compared, many
genes can be false positives. However, genes are not all independent and genes
in the same pathway could have similar t-statistics or p-values. Methods of
multiple-comparison adjusted p-values (Dudoit et al.
2002b), False Discovery Rate (FDR, Tusher et al. 2001; SAM software, Q-VALUE software) and
resampling-based multiple test (Pollard and van der Laan
2003) have been proposed to handle the multiple comparison issues in the
context of microarray data.
In dChip, the empirical false discovery rate can be estimated
by permutation. For more rigorous FDR computation, use the SAM or Q-VALUE
software. Check “Analysis/Compare samples/Combine
comparison/Permute samples to assess False Discovery Rate” to assess how
many genes would be obtained by the same comparison criteria when permuting
group labels randomly. Usually it is good to try different comparison criteria
to finally obtain a list of genes with median FDR < 5% or 10%. Such approach
has been pioneered by Golub et al. 1999 and Tusher et al. 2001. FDR can be applied only when
they are more than 1 sample in each group. Also it should be applied when
comparing on all genes, instead of only on the selected gene list such as an
ANOVA filtered gene list.
FDR in dChip is computed
by applying the original comparison criteria (e.g. fold change, p-value) to the
sample-wise permuted datasets and recording the number of obtained genes at each
such permutation. After a number of permutations (say 100), we get 100 values
of the number of genes in the obtained gene lists. The median of these 100
values is reported as the median FDR, and the 90-th percentile (90th
largest value) of these 100 values is reported as 90% FDR, a more conservative
FDR estimate. When the number of permutation is small (e.g. 100), the FDR
reported in different runs could be more variable.
Combine comparisons
After specifying these criterion a user can either
click “OK” to proceed with comparison or go to “Analysis/Compare
Samples/Combine comparisons” tab to specify other advanced options. For
example, one compares sample group A and B and get a gene list 1, and compare
sample group C and D and get a gene list 2. One may be interested in what genes
are common in both lists, or what genes are in list 1 but not list 2. “Combine
comparisons ” is for this purpose.

The comparison parameters in “Compare Samples” tab
are automatically added in the grid sheet. We can add multiple comparisons and
combine them using logical relations. To do this, click the last line of the
grid sheet (the line with a single right parenthesis ‘)’) to highlight this
line, then click “Compare Samples” tab on the top to specify other comparison
samples and filtering criterion, then click “Combine comparisons” tab on the
top to come back to this dialog, select “Combine Type” to be “AND” or “OR”, then click “Insert comparison” button to insert the new
comparison. Parentheses can also be inserted after the highlighted line in the
grid sheet using the “Insert parenthesis” button, to specify two-level logic
combination of comparisons.
In addition, “not” relation can be specified by the
“And not” or “Or not” combine type (“Insert complement” button for Version
1.0). This is useful, for example, when one wants a list of genes that are
up-regulated by growth factor A when comparing to the control but not the genes
that are also up-regulated by growth factor B.
To change a comparison condition, highlight the
corresponding line and click “Compare samples” tab to change comparing and
filtering parameters, then click “Combine comparisons” tab to come back and the
highlighted line will be changed; to insert a new comparison at a certain row,
first click to highlight the last line (the line with single right parenthesis
‘)‘), then click “Compare samples” tab to specify comparing and filtering
parameters, then click “Combine comparisons” tab and click the grid sheet to
highlight a line after which the new comparison is to be inserted, and click
“Insert comparison” button. The “Delete entry” button will delete a comparison,
and the logical relations associated with parentheses may be used in the
deleted row.
The comparison can be restricted to an existing gene
list or its complement set, if a “Gene list file” (a tab-delimited text file
with the first column of each row being the probe set name) is specified in the
“Compare on gene list” button. Click the button multiple times to switch
between “using all genes”, “using gene list” and “excluding gene list”.
Compare result file
Specify an output “compare result file”. This file
contains genes that pass the filtering criterion (marked by ‘*’ in the
“Filtered” columns) and the comparison statistics for these genes. A previously
saved “compare criterion file” can be loaded by specifying the file and
clicking the “Use” button (“Use the criteria in this file” in V1.3+). One may
also manually edit the "compare criteria file" (save in text format)
and then click the "Use" button to use the specified comparison. In
V1.3+, the compare criterion are saved in the beginning of the “Compare result
file” if “Output comparison criteria” is
checked, instead of a separate “Compare criterion file”.
Sometimes we may wonder why some known
differentially expressed genes are not selected by the comparison. In such case
it is helpful to check the button “Output all genes” to output the statistics
of all genes, and then check the statistics for known genes to suggest refined
filtering criterion.
To output the comparison statistics not used in the
filtering, check the corresponding compare criterion in the “Compare samples”
dialog but specify non-informative thresholds. For example, set the p-value to
1 for criterion (3) or (5), or E/B value to be 0 (since the lower confidence
bound of a > 1 fold change can be less than 1) for criterion (1), so that
these statistics are calculated and exported but do not influence the
comparison. Check the option “Tools/Options/Analysis/Treat outlier expression
as missing values” to treat “array-outlier” expression values as missing data
in the comparison analysis and compare result file (output as blanks).
After clicking “OK” a “Compare criterion file” will
be saved. Its file name is stored in the Windows clipboard automatically, so
one can use Control+V to paste the file name into other dialogs. The “compare
result file” can be used as the “Gene list file” in “Analysis/Hierarchical
clustering” dialog to visualize their expression changes and identify
functional groups enriched in the comparison gene lists. It is also useful to
specify this result file for classifing
genes by annotation terms.
[New]
[Analysis example] For a list of genes, we want to view their expression values
in clustering figure and annotate genes with fold changes, like this example figure (data courtesy of Jooeun
Bae). Use “Compare samples” to compare this list of genes and output all genes,
and then copy the probe set name, gene names and fold change columns into a new
Excel file and sort by gene names or other criteria. Then copy the probe set
names into a text file and save as a “gene list file”. Finally perform
hierarchical clustering using this gene list but uncheck “Cluster genes”. Use
“View/Export image” to copy and paste EMF file into the Excel file, and stretch
the figure to align with the rows.
[V2/26/07++] At "Analysis/Compare samples/Combine
comparisons", check "Make M-A plot" to plot log ratio against
log product of the two groups' mean expression level. The genes selected by "Compare
samples" will be highlighted in red circles. This plot is useful to check
the effect of using different comparison criteria. Genes selected by other
methods may also be highlighted, by unchecking all comparison criteria and specifying
a gene list file at "Combine comparisons/Compare on gene list".
