Mapping a list of genes generated by “Analysis/Compare samples” or “Analysis/Filter genes” onto the chromosomes may help to identify chromosomal translocations or duplications. Note that the chromosomal location enrichment of a gene list must be analyzed in the context of all the genes on the array. So if “Get external data” is used to read in the data file, the data file should contain all genes. Choose menu “Analysis/Map chromosome” (“Analysis/Genome” in V1.2+) and specify a “Genome information file” and an optional “Gene list file”. When a hierarchical clustering already exists, the genes used for clustering will be used for mapping. Also check “Tools/Options/Analysis/Mask redundant probe sets”. Click “OK” to enter the “Genome View”.

In the “Genome View”, genes in the gene list are drawn as small black vertical bars, and the other genes are drawn in light gray. The transcription start site is used as the gene position. A vertical bar above the horizontal line means the gene is on the forward strand; a bar below the line means it is on the reverse strand. If “Analysis/Hierarchical clustering” is performed before this step and the gene branches are colored using Control+Click (e.g. up-regulated in red and down-regulated in blue; check “Tools/Options/Clustering/Add new color for control-click” before doing this), the genes will be displayed in the same color in the “Genome View”. Genes on each chromosome are placed proportionally from chromosomal position 0 to the gene with the maximal chromosomal position in the “Genome information file”.
P-values are calculated for all stretches containing <=20 selected genes (genes specified in the “Gene list file” in “Analysis/Genome”) to assess the significance of “gene proximity”, and the significant P-values are reported in the “Analysis View”. For example, “Chromosome 6, the stretch of gene 1 to 9 has p-value 0.021799”. These significant gene stretches are also outlined in blue boxes. If a significant longer stretch contains a significant shorter stretch, only the longer one is reported and outlined. Since there is no correction for multiple hypothesis testing, the P-values here serve to draw attention to specific genes and should not be interpreted strictly. When there are too many significant stretches (> 150), one will get the “MAX_STRETCH limit is reached” error message. In that case, one may uncheck “Analysis/Genome/Outline stretch” to turn off the stretch highlighting, or set a smaller P-value threshold at “Analysis/Genome” (“Tools/Options/Genome” for V1.2+).
Click a gene to select it as the current gene, and use the Arrow keys to zoom in and out. Select the “View/Export Image” menu to export the chromosome image, and “View/Find gene” to find a specific gene in the highlighted gene set. Use the “Enter” key or “View/Next view” to go to the other Views.
[Version 9/23/05+] After the analysis, one can go back to the analysis output window to view the gene names in the significant stretches and the multiple comparison information, such as “24 significant stretches found at 0.05 level from 2494 p-value assessments”. If the number of nominally significant stretches is no larger than the number that this many assessments could produce by chance (as in the example above), one should try a smaller p-value threshold.
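The multiple comparison check above can be sketched as a short calculation. This is only an illustration and not the program's own procedure: the function names are hypothetical, and the Bonferroni-style threshold is shown as one conservative way of choosing a smaller threshold, not as a setting the program applies.

```python
def expected_chance_hits(num_assessments: int, threshold: float) -> float:
    """Expected number of nominally significant p-values when no stretch is
    truly clustered, assuming each null p-value is Uniform(0,1)."""
    return num_assessments * threshold


def bonferroni_threshold(num_assessments: int, family_level: float) -> float:
    """A conservative per-assessment threshold (hypothetical helper; this is
    not an option described in the manual)."""
    return family_level / num_assessments


# Numbers taken from the example message in the text.
found = 24
expected = expected_chance_hits(2494, 0.05)
stricter = bonferroni_threshold(2494, 0.05)

print(f"found {found}, expected by chance about {expected:.1f}")
print(f"a conservative threshold would be about {stricter:.1e}")
```

Here the 24 reported stretches are fewer than the roughly 125 expected by chance at the 0.05 level, which is why a smaller threshold is advisable in this example.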
We observe a stretch of n interesting genes along a chromosome and want to assess its significance. If the genes in the stretch have very close positions on the chromosome, we want to call the stretch significant. Define X_i to be the normalized rank of gene i on the chromosome (rank of gene i / number of all genes on the chromosome) and use X_i ~ Uniform(0,1) to approximate its distribution. The ranks are based on the genes' positions on the chromosome, counted from p-ter. A smaller rank difference between the first and the last genes in a stretch indicates tighter clustering of these selected genes (from the specified gene list).
Define Y = X(n) - X(1) as the normalized rank distance of this stretch, where X(k) denotes the k-th order statistic. We can then obtain the P-value P(Y <= observed) without deriving the density of Y:
P(X(i+1) - X(i) < y) = 1 - (1-y)^n = 1 - P(none of X_i < y) = P(X(1) < y)
P(X(i+2) - X(i) < y) = 1 - (1-y)^n - n*y*(1-y)^(n-1) = 1 - P(none of X_i < y) - P(exactly 1 of X_i < y) = P(X(2) < y)
and so on…
Thus all normalized rank distances are reduced to the order statistics of uniform distributions and we can calculate their P-values. In effect, we assess the “tightness” (the rank distance of the genes on the two ends of the stretch) of a stretch of n genes against that of n genes randomly put on the chromosome.
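The calculation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the program's actual code: the function names are hypothetical, and the sketch takes the reading of the last paragraph in which a stretch of n genes is compared with n genes placed at random, so that Y = X(n) - X(1) spans n - 1 gaps and has the same distribution as X(n-1). The general pattern behind “and so on” is that P(X(k) < y) equals 1 minus the binomial probabilities that fewer than k of the n values fall below y.

```python
from math import comb


def order_stat_cdf(k: int, n: int, y: float) -> float:
    """Return P(X(k) < y), X(k) being the k-th smallest of n Uniform(0,1) draws."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    y = min(max(y, 0.0), 1.0)  # clamp to the unit interval
    # X(k) < y exactly when at least k of the n draws fall below y,
    # so subtract the binomial probabilities of 0 .. k-1 draws below y.
    fewer_than_k = sum(comb(n, j) * y**j * (1.0 - y) ** (n - j) for j in range(k))
    return 1.0 - fewer_than_k


def stretch_p_value(ranks: list[int], genes_on_chromosome: int) -> float:
    """P-value for a stretch of selected genes, given their ranks on the chromosome.

    Assumption: the stretch of n genes is assessed against n uniformly placed
    genes, so the normalized rank distance is compared with X(n-1).
    """
    n = len(ranks)
    if n < 2:
        raise ValueError("a stretch needs at least two genes")
    y = (max(ranks) - min(ranks)) / genes_on_chromosome  # normalized rank distance
    return order_stat_cdf(n - 1, n, y)


# Made-up example: three selected genes ranked 10, 12 and 15 among 100 genes.
p = stretch_p_value([10, 12, 15], genes_on_chromosome=100)
print(f"{p:.5f}")
```

With k = 1 and k = 2, `order_stat_cdf` reproduces the two formulas written out above. In the made-up example the three genes span 5% of the chromosome's rank range, and the resulting P-value comes out near 0.007.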