Ross D. King, Dominic A. Clark, Jack Shirazi, and Michael J. E. Sternberg
This paper describes the application of the Inductive Logic Programming (ILP) program GOLEM to the discovery of constraints in the packing of beta-sheets in alpha/beta proteins. These constraints (rules) have a role in understanding the protein folding problem. Constraints were learnt for four features of beta-sheet packing: the winding direction of two sequential strands, whether two consecutive strands pack parallel or anti-parallel, whether two strands pack adjacently, and whether a beta-strand is at an edge. Investigation of the learnt constraints revealed interesting patterns, some of which were previously known and others of which were novel. Novel features include the discoveries that the relationship between pairs of sequential strands is in general one of decreasing size, and that more sequential pairs of strands wind in the direction out than in the direction in. We conclude that machine learning has a useful place in molecular biology as a pattern discovery tool.