dChip: Pathway Drawing and Analysis

 

Draw pathways


[Version 10/7/05+; this is an experimental function] In dChip, pathways can be drawn based on biological knowledge. Here is a partial representation of the Notch pathway involved in inner ear development (Sage et al. 2005; pathway suggested by Zheng-Yi Chen):

 

A pathway represented in such a computable format contains the transcriptional or functional relations between the known and unknown genes in the pathway, as well as the approximate time each effect takes. The lines between nodes (boxes) represent known or desired correlation relationships and the associated time lag in the time course experiment. An arrow means that one gene activates another in 1 (or X) time unit, “---|” means that one gene inhibits another in 1 (or X) time unit, and a simple line means that the two genes have coordinated expression (e.g. they are in the same protein complex or are activated by a transcription factor at the same time).
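The computable format described above can be pictured as a small graph structure. The following is a minimal sketch in Python, not dChip's actual file format or internal representation: the class names, the edge kind labels ("activate", "inhibit", "coordinate") and the gene names GeneA, GeneB and GeneC are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical labels for the three line styles: arrow, "---|", simple line.
EDGE_KINDS = ("activate", "inhibit", "coordinate")


@dataclass(frozen=True)
class Edge:
    source: str   # label of the upstream node
    target: str   # label of the downstream node
    kind: str     # one of EDGE_KINDS
    lag: int = 1  # time units (samples) between cause and effect


@dataclass
class Pathway:
    # Node label -> assigned gene name, or None for an unknown gene.
    nodes: Dict[str, Optional[str]] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, label: str, gene: Optional[str] = None) -> None:
        self.nodes[label] = gene

    def add_edge(self, source: str, target: str, kind: str, lag: int = 1) -> None:
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind!r}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both end nodes must be added before the edge")
        if lag < 0:
            raise ValueError("lag must be zero or positive")
        self.edges.append(Edge(source, target, kind, lag))

    def unknown_nodes(self) -> List[str]:
        # Nodes that still need a gene assigned to them.
        return [label for label, gene in self.nodes.items() if gene is None]


# Illustrative pathway: two known genes, one coordinated partner, one unknown node.
pathway = Pathway()
pathway.add_node("A", gene="GeneA")
pathway.add_node("B", gene="GeneB")
pathway.add_node("C", gene="GeneC")
pathway.add_node("X1")  # unknown gene
pathway.add_edge("A", "X1", "activate", lag=1)   # A activates X1 in 1 time unit
pathway.add_edge("X1", "B", "inhibit", lag=2)    # X1 inhibits B in 2 time units
pathway.add_edge("A", "C", "coordinate", lag=0)  # A and C are co-expressed
```

Storing the lag on each edge rather than on the nodes keeps the three relation types uniform, so a later search only has to look at one edge at a time.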

 

Before starting, a gene information file needs to be specified at “Analysis/Open group” and the expression values must already be computed. Select “Analysis/Pathway” to open the pathway window. Then select “Pathway/Edit node” to assign a gene to a node, and select “Pathway/Edit arrow” to assign arrows between gene nodes. Unknown genes are allowed in the pathway; their relationships to the known genes should be represented by arrows so that they can be used for filtering genes. If the samples form a time course, order the samples by time in an array list file; one unit along an arrow then represents a lag of one sample in the array list file. Use “Pathway/Open, Save” for pathway file access (example pathway file).

 

Filter genes using pathway

 

Given a partial pathway with both known and unknown genes, one can supply a gene list based on variation filtering or a gene ontology category, and search this list for the sets of genes whose expression profiles satisfy the correlation constraints when the genes are placed at the unknown nodes in the pathway. In this way, both the partial pathway information and the time course expression data are used to find more candidate genes involved in the pathway, taking the regulatory dynamics into account.

 

Select “Analysis/Pathway” and “Pathway/Open” to read in a pathway file. Then supply a gene list (<1000 genes; it can come from variation filtering or be based on known biology) and a correlation threshold (all the links in the pathway are required to satisfy the correlation or lag correlation). Selecting “Pathway/Filter genes” will search for all the sets of genes that satisfy the pathway when assigned to the unknown nodes. Only the unknown nodes in a pathway are checked against the input gene list for consistency. The known genes’ expression profiles constrain the pathway, and a lower correlation threshold tends to find more gene assignments for the unknown nodes that are consistent with the pathway. If the correlation threshold is too low, however, the gene sets found may not be biologically interesting.
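The search can be sketched as follows. This is an illustrative brute-force version in Python, not dChip's implementation: the function names, the tuple layout of an edge, and the gene names (K, c1, c2, c3) are hypothetical. It also assumes that an inhibition link is satisfied by a correlation at or below the negative threshold, which the text above does not state explicitly.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def edge_ok(src, dst, kind, lag, threshold):
    """Check one link: compare the source at time t with the target at time t + lag."""
    if lag > 0:
        src, dst = src[:-lag], dst[lag:]
    r = pearson(src, dst)
    # Assumption: inhibition means a strong negative correlation.
    return r <= -threshold if kind == "inhibit" else r >= threshold


def filter_genes(edges, known, unknown_nodes, candidates, profiles, threshold):
    """Return every assignment of candidate genes to unknown nodes that
    satisfies all links.

    edges:         list of (source_node, target_node, kind, lag)
    known:         dict node label -> gene name for the known nodes
    unknown_nodes: list of node labels without a gene
    candidates:    list of gene names to try at the unknown nodes
    profiles:      dict gene name -> expression values ordered by time
    """
    hits = []
    for genes in permutations(candidates, len(unknown_nodes)):
        assignment = dict(known)
        assignment.update(zip(unknown_nodes, genes))
        if all(
            edge_ok(profiles[assignment[s]], profiles[assignment[t]], kind, lag, threshold)
            for s, t, kind, lag in edges
        ):
            hits.append(dict(zip(unknown_nodes, genes)))
    return hits


# Made-up profiles: c1 repeats K one sample later, c3 mirrors K one sample later.
profiles = {
    "K": [1, 3, 2, 5, 4, 6],
    "c1": [0, 1, 3, 2, 5, 4],
    "c2": [6, 4, 5, 2, 3, 1],
    "c3": [0, -1, -3, -2, -5, -4],
}
activated = filter_genes(
    edges=[("A", "U", "activate", 1)],
    known={"A": "K"},
    unknown_nodes=["U"],
    candidates=["c1", "c2", "c3"],
    profiles=profiles,
    threshold=0.8,
)
print(activated)
```

Because every ordered combination of candidates is tried, the number of assignments grows quickly with the number of unknown nodes, which is one plausible reason to keep the input gene list small.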

 

The time unit along an arrow between two genes is the lag used for the lag correlation between the two genes along the time course. Setting it to 0 uses the ordinary correlation, which is appropriate for samples that do not form a time course. The more complex the pathway, the less likely it is that any gene set satisfying it will be found, so a smaller correlation threshold can be used in that case.
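As a worked illustration of lag correlation, the sketch below pairs the upstream gene's value at sample t with the downstream gene's value at sample t + lag and computes the Pearson correlation over the overlapping samples. It is one common way to define a lag correlation and is assumed here for illustration; the exact formula dChip uses is not given on this page, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def lag_correlation(upstream, downstream, lag=0):
    """Correlate upstream[t] with downstream[t + lag].

    lag=0 reduces to the ordinary correlation, which suits samples
    that do not form a time course.
    """
    if lag < 0:
        raise ValueError("lag must be zero or positive")
    if lag == 0:
        return pearson(upstream, downstream)
    # Drop the last `lag` upstream samples and the first `lag` downstream samples.
    return pearson(upstream[:-lag], downstream[lag:])


# Made-up time course: the downstream gene repeats the upstream gene one sample later.
upstream = [1, 3, 2, 5, 4, 6]
downstream = [0, 1, 3, 2, 5, 4]

r_lag1 = lag_correlation(upstream, downstream, lag=1)
r_lag0 = lag_correlation(upstream, downstream, lag=0)
print(r_lag1, r_lag0)
```

Each unit of lag removes one sample from the comparison, so short time courses leave few points for links with a large lag, and the resulting correlations are correspondingly less reliable.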