BADGE Report (November 13, 2002)

Molecular Classification


  Materials

The complete database comprises 102 expression measurements of 12625 genes, 50 in the first condition and 52 in the second condition. The recorded expression values range between 0.01 and 9910.90.

  Methods

Data were analyzed using BADGE (Bayesian Analysis of Differential Gene Expression) version 1.0 (1), a computer program implementing a Bayesian appraoch to identify differentially expressed genes across experimental conditions. For each gene, the method computes the posterior probability that the gene is expressed more than one fold in the first condition than in the second condition. This posterior probability is computed as the weighted average of the posterior probability computed under the assumption that the observations are generated by a Gamma distribution and a Log-normal distribution.

A predictive evaluation of the results obtained by this method was performed using leave-one-out cross validation. This techique consists of removing from the database one case at the time, estimating the model parameters from the remaining cases, and predicting the condition of the removed case on the basis of the selected genes and the parameters estimated from the remaining cases. If the condition predicted correponds to the condition of the predicted case, the prediction will be considred correct. If it does not, it will be taken as incorrect.
  Results

Using the approach described in the Methods section, we selected the genes with more than 99.9980% and less than 0.40% probability of being more expressed in the first condition, for a total of 37 and 24 genes in the first and the second case, respectively. Figure 1 displays the distribution of the probabilities of all genes in the dataset.

Figure 1.   Distribution of the posterior probability of being differentially expressed for each gene in the dataset.

Figure 2 displays a colormap of the selected genes.

Figure 2.   A colormap of the genes selected by the analysis. The intensity of each color denotes the standardized ratio between each value and the estimated average expression of each gene. click here to see an enlarged version.
Leave one out cross validation achived an accuracy of 96.08%.
  References

1. Sebastiani, P and Ramoni, M (2002). Bayesian Analysis of Differential Gene Expression. Under review.

This report was generated by BADGE v1.0.