A Parallel-Model Speech Emotion Recognition Network Based on Feature Clustering