Predictive data mining based on similarity and clustering methods.