Using Machine Learning to Design and Interpret Gene-Expression Microarrays

  • Michael Molla
  • Michael Waddell
  • David Page
  • Jude Shavlik


Gene-expression microarrays, commonly called gene chips, make it possible to simultaneously measure the rate at which a cell or tissue is expressing -- translating into a protein -- each of its thousands of genes. One can use these comprehensive snapshots of biological activity to infer regulatory pathways in cells; identify novel targets for drug design; and improve the diagnosis, prognosis, and treatment planning for those suffering from disease. However, the amount of data this new technology produces is more than one can manually analyze. Hence, the need for automated analysis of microarray data offers an opportunity for machine learning to have a significant impact on biology and medicine. This article describes microarray technology, the data it produces, and the types of machine learning tasks that naturally arise with these data. It also reviews some of the recent prominent applications of machine learning to gene-chip data, points to related tasks where machine learning might have a further impact on biology and medicine, and describes additional types of interesting data that recent advances in biotechnology allow biomedical researchers to collect.