A heuristic approach for finding similarity indexes of multivariate data sets