Digital Biology Laboratory
EXCAVATOR Version 1.0:


EXCAVATOR is a software package for clustering gene expression data. It employs a set of clustering algorithms developed by the Protein Informatics Group of Oak Ridge National Laboratory. Unlike other gene expression analysis tools, EXCAVATOR represents a set of gene expression data as a minimum spanning tree (MST), a concept from graph theory. We have rigorously proved that the MST representation, though simple in structure, loses no information essential for data clustering. This representation reduces a multi-dimensional data clustering problem to a tree partitioning problem, which is computationally much simpler. The MST representation facilitates

  • efficient implementations of clustering algorithms with guaranteed mathematical properties,
    including global optimality;
  • strong capabilities in handling data clusters with complex cluster boundaries; and
  • strong capabilities in overcoming problems caused by background noise.
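
The reduction from clustering to tree partitioning can be illustrated with a minimal Python sketch. It is not EXCAVATOR's code: the function names, the use of Prim's algorithm, the Euclidean distance, and the rule of cutting the K-1 longest MST edges are all assumptions chosen for illustration. That rule is only one simple way to partition the tree (it minimizes the total edge length of the resulting subtrees); other objective functions would need a different partitioning step.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(profiles, dist=euclidean):
    """Build an MST over the profiles with Prim's algorithm.

    Returns a list of (weight, i, j) edges; n profiles give n - 1 edges.
    """
    n = len(profiles)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each vertex to the tree
    parent = [-1] * n       # tree endpoint of that cheapest edge
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the cheapest connecting edge for the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = dist(profiles[u], profiles[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(profiles, k, dist=euclidean):
    """Partition the MST into k subtrees by removing its k - 1 longest edges."""
    n = len(profiles)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of profiles")
    kept = sorted(minimum_spanning_tree(profiles, dist))[: n - k]

    # Union-find over the kept edges recovers the connected subtrees.
    root = list(range(n))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical expression profiles (one row per gene, one column per condition).
profiles = [
    [0.0, 0.1, 0.2],
    [0.1, 0.1, 0.3],
    [5.0, 5.1, 4.9],
    [5.2, 5.0, 5.1],
    [9.9, 0.0, 10.0],
]
print(mst_clusters(profiles, 3))  # [[0, 1], [2, 3], [4]]
```

Once the tree is built, the partitioning step works only on its n - 1 edges and never revisits the original multi-dimensional data, which is where the computational saving comes from.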


EXCAVATOR provides the following capabilities:

  • clustering gene expression profiles under various definitions of distances and clustering
  • data-constrained clustering;
  • automatic selection of the most plausible number of clusters in a data set;
  • removal of background noise;
  • identification of genes with expression profiles similar to a set of specified seed genes; and
  • comparison of clustering results using different clustering parameters and algorithms.
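
To show why the choice of distance definition matters, here is a small Python sketch comparing two common choices. The two functions and the sample profiles are illustrative assumptions, not EXCAVATOR's own options or interface: Euclidean distance compares absolute expression levels, while a correlation-based distance (one minus the Pearson correlation) compares the shape of the profiles.

```python
import math


def euclidean_distance(a, b):
    """Distance based on absolute expression levels."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """One minus the Pearson correlation: 0 for identical shape, 2 for opposite."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (norm_a * norm_b)


# Hypothetical profiles: gene_b follows gene_a at ten times the level,
# gene_c moves in the opposite direction.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [10.0, 20.0, 30.0, 40.0]
gene_c = [4.0, 3.0, 2.0, 1.0]

print(euclidean_distance(gene_a, gene_b))    # large: the levels differ
print(correlation_distance(gene_a, gene_b))  # about 0: the shapes match
print(correlation_distance(gene_a, gene_c))  # about 2: the shapes are opposite
```

Under the Euclidean definition gene_a and gene_b are far apart, while under the correlation-based definition they are nearly identical, so the same data can cluster differently depending on the distance chosen.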


If you use EXCAVATOR, please cite:

Ying Xu, Victor Olman, and Dong Xu. Clustering Gene Expression Data Using a
Graph-Theoretic Approach: An Application of Minimum Spanning Trees. Bioinformatics.
In press.

Bug Reports

Please email us about any bugs you find in the program.

Copyright / Availability

EXCAVATOR is copyrighted and is NOT in the public domain. Please contact us
for download information. EXCAVATOR is free of charge to academic users.
Commercial users should email us for more information.

The authors of EXCAVATOR make no representations about the suitability of the software
for any purpose. It is provided "as is" without express or implied warranty. The authors shall not
be liable for any damages suffered by Licensee from the use of this software.


The development of EXCAVATOR was sponsored by the Office of Biological and
Environmental Research, U.S. Department of Energy, under Contract
DE-AC05-00OR22725, managed by UT-Battelle, LLC.

Dept. of Computer Science College of Engineering University of Missouri-Columbia