Daniel Hennessy, Vanathi Gopalakrishnan, Bruce G. Buchanan, John M. Rosenberg, and Devika Subramanian
X-ray crystallography is the method of choice for determining the 3-D structure of large macromolecules at a high enough resolution. The rate-limiting step in structure determination is the crystallization itself. It takes anywhere from a few weeks to several years to obtain macromolecular crystals that yield good diffraction patterns. The theory of forces that promote and maintain crystal growth is preliminary, and crystallographers systematically search a large parameter space of experimental settings to grow good crystals. There is a wealth of experimental data on crystal growth, most of which is in paper laboratory notebooks. Some of the data has been gathered in electronic form, e.g., the Biological Macromolecule Crystallization Database (BMCD), which is a repository of successful experimental conditions for growing over 800 different macromolecules (Gilliland 1987). Crystallographers are in need of computational tools to gather and analyze past data to design new crystal growth trials. We are building the Crystallographer’s Assistant (CA) to help crystallographers record and maintain experimental context in electronic form, offer suggestions on experimental conditions that are likely to be successful, and provide explanations for failed experiments. As an initial step in this project, we have applied RL, an inductive learning program, to the BMCD. In this paper we report initial experiments and findings in applying RL to the BMCD. From the point of view of crystallography, we have discovered possibly significant new empirical relationships in crystal growth. From the point of view of machine learning, our work suggests refinements of existing methods for incorporating detailed domain knowledge into inductive analysis techniques.