Empirical Determination of Lower Bounds on RP Embedding

Lili He and Ian R. Greenshields

Data analysis is critical to many (if not all) Homeland Security missions. Data fielded in this domain is typically immense both in cardinality (number of records) and in dimensionality (features per record). Random Projections (RP) have been proposed as an effective technique for embedding a metric space of dimension d into one of dimension k while retaining bounds on the distortion of the metric under embedding. More recently, Achlioptas has shown, building on the Johnson-Lindenstrauss lemma on embeddings, that a sparse random matrix can yield additional computational savings in random projection, and he also gives a theoretical embedding bound k0. However, empirical evidence shows that one can choose k << k0 and yet retain low distortion in the metric. In this paper we propose an experimental approach to finding an optimal lower bound on k, well below k0, for arbitrary data distributions, evaluated in terms of classification performance.
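As a point of reference for the abstract, the construction it describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are ours, and the sketch uses Achlioptas' published sparse matrix (entries sqrt(3) times +1, 0, -1 with probabilities 1/6, 2/3, 1/6) together with his theoretical bound k0 = (4 + 2*beta) * ln(n) / (eps^2/2 - eps^3/3), here with beta = 1.

```python
import numpy as np

def sparse_rp(X, k, seed=None):
    """Project the d-dimensional rows of X down to k dimensions using
    Achlioptas' sparse random matrix: entries are sqrt(3) * {+1 w.p. 1/6,
    0 w.p. 2/3, -1 w.p. 1/6}. Scaling by 1/sqrt(k) preserves expected
    squared distances. (Illustrative sketch, not the paper's code.)"""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = np.sqrt(3) * rng.choice([-1.0, 0.0, 1.0], size=(d, k),
                                p=[1/6, 2/3, 1/6])
    return (X @ R) / np.sqrt(k)

def jl_bound_k0(n, eps, beta=1.0):
    """Achlioptas' theoretical embedding dimension k0 for n points and
    distortion eps, with failure probability n**(-beta)."""
    return int(np.ceil((4 + 2 * beta) * np.log(n)
                       / (eps**2 / 2 - eps**3 / 3)))

# Example: the theoretical bound is far larger than dimensions that
# often work well empirically, which motivates the paper's question.
k0 = jl_bound_k0(n=1000, eps=0.5)
X = np.random.default_rng(0).standard_normal((20, 100))
Y = sparse_rp(X, k=10, seed=0)
```

The two-thirds zero entries mean roughly a threefold reduction in arithmetic relative to a dense Gaussian projection, which is the computational saving the abstract refers to; the empirical question the paper raises is how far below k0 one can push k in practice.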

Copyright © AAAI. All rights reserved.