Ramachandran Chandrasekar, Thanukrishnan Srinivasan
In this paper we present an improved version of the Probabilistic Ant based Clustering Algorithm for Distributed Databases (PACE). The most important feature of this algorithm is the formation of numerous zones in different sites based on the corresponding user queries to the distributed database. Keywords extracted from the queries are used to assign a range of values according to their probability of occurrence, or hit ratio, at each site. We propose introducing weights for individual data items or groups of data items in each zone according to their relevance to the queries. We also propose the concept of familial pheromone trails, as part of an Ant Odor Identification Model, to bias the movements of different types of ants towards members of their own family. The performance of the improved algorithm is compared against PACE and other known clustering algorithms on different evaluation measures, and an improvement is shown in terms of convergence speed and quality of the solution obtained.
Subjects: 12. Machine Learning and Discovery
Submitted: Oct 16, 2006